Old 2012-02-19, 20:14   #1
Feb 2012
Athens, Greece

2F₁₆ Posts
How would you design a CPU/GPU for prime number crunching?

Suppose you could design a CPU specifically for running LL (Lucas-Lehmer) primality tests using mprime (and suppose mprime was customized for your processor). What would your ideal prime CPU look like?
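For readers new to the thread, here is a minimal sketch of the LL test the question refers to. It is a textbook version using Python's built-in big integers, not mprime's actual implementation; the function name `lucas_lehmer` is only illustrative. Nearly all the time goes into the repeated squaring modulo 2^p - 1, which is why the hardware questions below focus on multiplication throughput and memory.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime, for a prime exponent p."""
    if p == 2:
        return True       # 2**2 - 1 = 3 is prime; the loop below needs p > 2
    m = (1 << p) - 1      # the Mersenne number M_p = 2**p - 1
    s = 4                 # standard starting value of the LL sequence
    for _ in range(p - 2):
        # One big squaring per iteration: this is the step a
        # "prime CPU" would need to make fast.
        s = (s * s - 2) % m
    return s == 0


if __name__ == "__main__":
    exponents = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
    print([p for p in exponents if lucas_lehmer(p)])
    # prints [3, 5, 7, 13, 17, 19, 31]
```
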

What would be the optimal:
- number of cores
- frequency VS number of cores, would a very fast single-core be preferable?
- more integer cores than floating point (like Bulldozer) or the same?
- cache sizes and organization (L0/L1/L2/L3)
- would you share L2 cache between cores like Bulldozer does? or dedicate L2 to every core like Nehalem?
- faster low-latency L1/L2 or larger L3 cache?
- memory bandwidth? (more channels or higher MHz?)
- obviously it would include SSE/AVX, but what other SIMD features would you add? What would you change compared to current processors? AVX is floating-point only for now; would the integer support in AVX2 be useful?
- could MIMD be of any use?

Is there anything we could remove from current CPUs that isn't needed for prime crunching and only adds heat and power usage without any benefit to GIMPS? The integrated GPU, of course, but is there anything else?

And what about the ideal GPU? It would run double precision FP obviously, but what other design requirements would you have? Would you prefer higher frequency to more parallelization?

Out of all the CPUs and GPUs that have been manufactured, which architecture was the most efficient for GIMPS? (Not in actual performance, which depends on process technology etc., but from a design point of view.)
Posted by emily
Old 2012-02-20, 03:33   #2
Romulan Interpreter
LaurV
Jun 2011

2⁵·5·59 Posts

The CPU should have 4 memory registers and do a single fast operation (besides I/O for reading and writing the registers, and any other operations, which can all be slow). These registers should each hold 4 billion bits, and the operation should be hardware multiplication: mul Rx,Ry,Ra. This would multiply Rx by Ry with the fastest possible algorithm, putting the low part of the product in the specified register Ra and the high part in the implicit register Rb.

edit: well, I can live with 3 registers only, and do squaring... :P
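
To illustrate why a multiply that returns separate low and high parts fits this workload, here is a rough sketch. It assumes the registers are p bits wide for a modulus 2^p - 1; the names `mul_lo_hi` and `square_mod_mersenne` are hypothetical and simply model the `mul Rx,Ry,Ra` idea in software. Because 2^p ≡ 1 (mod 2^p - 1), reducing the square only takes one addition of the two halves and at most one subtraction.

```python
def mul_lo_hi(x: int, y: int, bits: int) -> tuple:
    """Model of 'mul Rx,Ry,Ra': return (low half, high half) of x * y."""
    mask = (1 << bits) - 1
    product = x * y
    return product & mask, product >> bits


def square_mod_mersenne(s: int, p: int) -> int:
    """Compute s*s mod (2**p - 1) from the lo/hi halves, for 0 <= s <= 2**p - 1."""
    m = (1 << p) - 1
    lo, hi = mul_lo_hi(s, s, p)   # the "squaring with 3 registers" case
    # s*s = hi * 2**p + lo, and 2**p ≡ 1 (mod m), so s*s ≡ hi + lo.
    r = lo + hi
    if r >= m:                    # lo + hi < 2*m, so one subtraction suffices
        r -= m
    return r


if __name__ == "__main__":
    p = 13
    m = (1 << p) - 1
    print(all(square_mod_mersenne(s, p) == (s * s) % m for s in range(m)))
    # prints True
```
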

Last fiddled with by LaurV on 2012-02-20 at 03:34
Old 2012-02-20, 03:38   #3
jasong
"Jason Goatcher"
Mar 2005

3507₁₀ Posts

Originally Posted by emily
And what about the ideal GPU? It would run double precision FP obviously...
No, actually, double precision FP is a kludge because we're dealing with the floating point registers. The optimal thing would be for all the registers to be integer-based; then you wouldn't have to deal with that extra rounding step and the possible errors.
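
As a rough illustration of the "extra rounding step" mentioned here, the toy sketch below squares an integer with a floating-point FFT and then rounds each convolution output back to the nearest integer. This is not mprime's code: the function names, the 8-bit digit size, and the plain recursive FFT are all assumptions chosen for readability. The distance of each output from the nearest integer is the rounding error; if it ever approached 0.5, the rounded result could be wrong, which is the risk an all-integer design would avoid.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two (unnormalized)."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def fft_square(x: int, digit_bits: int = 8):
    """Square x >= 0 via a floating-point FFT; return (result, max rounding error)."""
    base = 1 << digit_bits
    digits = []
    while x:
        digits.append(x & (base - 1))   # split x into small digits
        x >>= digit_bits
    n = 1
    while n < 2 * max(len(digits), 1):  # room for the full linear convolution
        n *= 2
    padded = [complex(d) for d in digits] + [0j] * (n - len(digits))
    spectrum = fft(padded)
    conv = fft([v * v for v in spectrum], invert=True)
    result, max_err = 0, 0.0
    for i, c in enumerate(conv):
        approx = c.real / n             # floating-point convolution output
        rounded = round(approx)         # the rounding step
        max_err = max(max_err, abs(approx - rounded))
        result += rounded * (1 << (i * digit_bits))  # also propagates carries
    return result, max_err


if __name__ == "__main__":
    x = 2**127 - 1
    value, err = fft_square(x)
    print(value == x * x, err)
```

With small digits and short transforms the error stays far below 0.5; it grows as the digits or the transform length grow, so real implementations have to budget how many bits they pack into each double.
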

Last fiddled with by jasong on 2012-02-20 at 03:39
Old 2012-02-20, 04:05   #4
Basketry That Evening!
Dubslow
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Please look for answers before starting a new thread. Specifically, there is a stickied FAQ for this question.
Old 2012-02-20, 18:46   #5
Feb 2012
Athens, Greece

47 Posts

Yeah, looks like I should read more before asking! Thanks :)
Posted by emily


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.