Help: trying to determine latency on movaps instructions on AthlonXP

#1, LoKI.GuZ, 2004-01-26, 18:44

Hi there

I have been working a lot with SSE optimizations, and I had the opportunity to notice something I had heard of before: AMD's SSE implementation suffers from serious bottlenecks.
So I decided to measure the number of clock cycles spent on a movaps instruction with the following code:

x is a pointer to a vector of 32-bit ints (allocated using _aligned_malloc); start and end are the unsigned int variables I use to store time stamp values.

On Visual C++ .NET 7 (managed extensions off):
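mov ecx, x            ; ecx = address of the aligned vector
rdtsc                 ; read time stamp counter into edx:eax
mov [start], eax      ; keep the low 32 bits
movaps xmm0, [ecx]    ; the instruction being timed
rdtsc
mov [end], eax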

Right now the difference end - start is reported as ~5000 clock cycles. What could be wrong?
#2, Prime95, 2004-01-26, 20:05

There is often no way to obtain simple answers to questions like yours with today's CPUs. Just a few things for you to consider:

There is a difference between latency (how long a dependent instruction must wait) and throughput (how many instructions can be executed per clock cycle). If your code has sufficient independent code paths, then often the latter measurement is more important.
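
To make the latency/throughput distinction concrete, here is a minimal sketch in C. It assumes GCC or Clang on an x86 machine (for `__rdtsc` from `<x86intrin.h>`); the function names `add_dependent` and `add_independent` are made up for illustration and do not come from the thread. Both functions perform the same number of float additions, but the first forms one dependency chain while the second spreads the work over four independent accumulators.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc on GCC/Clang; MSVC would use <intrin.h> */

/* Latency-bound: every add needs the result of the previous add. */
float add_dependent(uint32_t n, uint64_t *ticks) {
    volatile float start = 0.0f, sink;  /* volatile accesses discourage the
                                           compiler from moving the loop out
                                           of the timed region */
    uint64_t t0 = __rdtsc();
    float a = start;
    for (uint32_t i = 0; i < n; ++i) {
        a += 1.0f; a += 1.0f; a += 1.0f; a += 1.0f;
    }
    sink = a;
    *ticks = __rdtsc() - t0;
    return sink;
}

/* Throughput-bound: four chains that do not depend on each other. */
float add_independent(uint32_t n, uint64_t *ticks) {
    volatile float start = 0.0f, sink;
    uint64_t t0 = __rdtsc();
    float a = start, b = start, c = start, d = start;
    for (uint32_t i = 0; i < n; ++i) {
        a += 1.0f; b += 1.0f; c += 1.0f; d += 1.0f;
    }
    sink = (a + b) + (c + d);
    *ticks = __rdtsc() - t0;
    return sink;
}
```

Calling both with the same `n` and comparing the two tick counts gives a rough feel for the gap on a given machine. The absolute numbers depend on the compiler, its flags, and the CPU, so treat them as indicative only.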

RDTSC is real nasty to a modern CPU's out-of-order core. Do not use it to measure the timing of a single instruction. Instead use it to measure a loop executed several hundred or thousand times.
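
A rough sketch of the "time a loop, not a single instruction" advice, in C. It assumes GCC or Clang on x86 with SSE, uses GCC-style inline assembly (AT&T operand order) instead of the Visual C++ `__asm` syntax from the first post, and the function name `tsc_ticks_per_movaps` is hypothetical. The two RDTSC reads bracket many loads, so their own overhead is spread across the whole loop.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc on GCC/Clang */

/* Average time stamp counter ticks per aligned 16-byte load.
   p must be 16-byte aligned and point to at least 16 readable bytes;
   iters must be greater than zero.
   The result includes loop overhead, and because the loads do not depend
   on each other it is closer to a throughput figure than a latency figure. */
double tsc_ticks_per_movaps(const float *p, uint32_t iters) {
    uint64_t t0 = __rdtsc();
    for (uint32_t i = 0; i < iters; ++i) {
        /* volatile asm: the compiler must emit one movaps per iteration */
        __asm__ volatile("movaps (%0), %%xmm0"
                         :
                         : "r"(p)
                         : "xmm0", "memory");
    }
    uint64_t t1 = __rdtsc();
    return (double)(t1 - t0) / (double)iters;
}
```

With something like 100000 iterations on a buffer that is already in the L1 cache, the per-load figure should be far below the ~5000 reported for a single bracketed instruction. Note that on many newer x86 CPUs the time stamp counter runs at a fixed reference rate, so ticks are not necessarily core clock cycles.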

Memory and caching are critical. If you are timing memory reads, you will get vastly different results depending on whether the data is in the L1 cache, L2 cache, or main memory.
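
One way to see the cache effect is to repeat the same aligned load over working sets of different sizes. The sketch below makes several assumptions: GCC or Clang on x86 with SSE, a 64-byte cache line, and a hypothetical helper name `ticks_per_load_for_size`. It uses `_mm_malloc`/`_mm_free` in place of the `_aligned_malloc` mentioned in the first post.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>   /* _mm_malloc, _mm_free */
#include <x86intrin.h>   /* __rdtsc on GCC/Clang */

/* Average time stamp counter ticks per 16-byte aligned load when sweeping a
   buffer of `bytes` bytes `passes` times, touching one load per assumed
   64-byte cache line. Returns -1.0 if the allocation fails. */
double ticks_per_load_for_size(size_t bytes, uint32_t passes) {
    char *buf = (char *)_mm_malloc(bytes, 64);
    if (buf == NULL) return -1.0;
    memset(buf, 0, bytes);           /* touch pages so page faults are not timed */

    uint64_t loads = 0;
    uint64_t t0 = __rdtsc();
    for (uint32_t p = 0; p < passes; ++p) {
        for (size_t off = 0; off + 16 <= bytes; off += 64) {
            __asm__ volatile("movaps (%0), %%xmm0"
                             :
                             : "r"(buf + off)
                             : "xmm0", "memory");
            ++loads;
        }
    }
    uint64_t t1 = __rdtsc();

    _mm_free(buf);
    return loads ? (double)(t1 - t0) / (double)loads : 0.0;
}
```

Comparing a size that fits in L1 (a few KB) with one far larger than L2 (tens of MB) should show the per-load cost rising. A sequential sweep like this is friendly to hardware prefetchers, so on newer CPUs the gap may be smaller than a random access pattern would show.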



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.