Performance Questions
Sorry to bombard the forum with basic questions, but I have found little on the internet regarding how Prime95 (or any of the GIMPS engines) makes efficient use of the cache. I understand that without a cache the memory bandwidth requirements would be enormous. I know that an N-point FFT can be divided into several smaller FFTs by arranging the data in matrix form and performing FFTs on the rows and then (after a transpose) on the columns. Does GIMPS use this?
Also, could someone point me to a performance analysis of the GIMPS algorithm? I'm interested in how much time is spent doing the various parts of the LL Test. Thanks again! Nevarcds |
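For readers following along, the row/column decomposition described in the question above can be sketched as follows. This is only a minimal pure-Python illustration of the general "four-step" idea and not Prime95's actual code; the names `dft` and `four_step_fft` are made up for this example, and the small sub-transforms use a naive DFT for clarity. This variant transforms the columns first, applies twiddle factors, transforms the rows, and reads the result out transposed.

```python
import cmath

def dft(x):
    """Naive O(n^2) DFT, used here only as the small sub-transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def four_step_fft(x, rows, cols):
    """DFT of length rows*cols built from length-`rows` and length-`cols` DFTs."""
    n = rows * cols
    assert len(x) == n
    # View x as a rows x cols matrix in row-major order: element (r, c) is x[r*cols + c].
    # Step 1: length-`rows` DFT down each column.
    a = [[0j] * cols for _ in range(rows)]
    for c in range(cols):
        col = dft([x[r * cols + c] for r in range(rows)])
        for k1 in range(rows):
            a[k1][c] = col[k1]
    # Step 2: multiply element (k1, c) by the twiddle factor exp(-2*pi*i*c*k1/n).
    for k1 in range(rows):
        for c in range(cols):
            a[k1][c] *= cmath.exp(-2j * cmath.pi * c * k1 / n)
    # Step 3: length-`cols` DFT along each row.
    b = [dft(row) for row in a]
    # Step 4: transposed read-out, X[k1 + rows*k2] = b[k1][k2].
    out = [0j] * n
    for k1 in range(rows):
        for k2 in range(cols):
            out[k1 + rows * k2] = b[k1][k2]
    return out
```

The point for cache use is that each small transform touches only `rows` or `cols` elements at a time, so a sub-transform can fit in cache even when the full data set does not.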
Did anyone ever get back to him?
|
[quote=nevarcds]how Prime95 (or any of the GIMPS engines) makes efficient use of the cache.
< snip > Does GIMPS use this?[/quote]
See George Woltman's notes at [URL="http://www.mersenne.org/gimps/p4notes.doc"]http://www.mersenne.org/gimps/p4notes.doc[/URL]. But be aware that they're five years old and don't include his later findings on P4 cache and SSE2 instructions.
[quote]I'm interested in how much time is spent doing the various parts of the LL Test.[/quote]
99% in the FFT routines, IIRC. |
[QUOTE=cheesehead]99% in the FFT routines, IIRC.[/QUOTE]
No, it's more like 80-90% doing the FFTs, the remainder doing the dyadic-squaring and round-and-propagate-carries steps. |
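To make the three parts named above concrete (the FFTs, the dyadic squaring, and the round-and-propagate-carries step), here is a rough pure-Python sketch of one way to structure an LL test around FFT-based squaring. It is an assumption-laden toy and not how Prime95 does it: it uses a plain zero-padded complex FFT with 8-bit digits and lets Python integers do the reduction mod 2^p - 1. The function names are hypothetical, and it is only suitable for small odd p.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. The inverse is unscaled."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def square_via_fft(value, digit_bits=8):
    """Square a non-negative integer using forward FFT, pointwise squaring, inverse FFT."""
    base = 1 << digit_bits
    digits = []
    while value:
        digits.append(value & (base - 1))
        value >>= digit_bits
    if not digits:
        digits = [0]
    size = 1
    while size < 2 * len(digits):          # zero-pad so the convolution does not wrap
        size *= 2
    spectrum = fft(digits + [0] * (size - len(digits)))   # forward FFT
    spectrum = [z * z for z in spectrum]                  # dyadic (pointwise) squaring
    conv = fft(spectrum, invert=True)                     # inverse FFT, still unscaled
    result, carry = 0, 0
    for i, z in enumerate(conv):
        total = round(z.real / size) + carry              # scale and round to an integer
        carry, digit = divmod(total, base)                # propagate the carry upward
        result |= digit << (i * digit_bits)
    return result + (carry << (size * digit_bits))

def lucas_lehmer(p):
    """Lucas-Lehmer test for 2**p - 1, for odd prime p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (square_via_fft(s) - 2) % m
    return s == 0
```

For example, `lucas_lehmer(13)` exercises all three steps on every iteration. The sketch says nothing about the relative timings discussed above.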