mersenneforum.org  

Old 2011-07-25, 22:30   #1
ixfd64
Bemusing Prompter

"Danny"
Dec 2002
California

2²·577 Posts
DSP hardware for number crunching?

Can a Cisco TelePresence codec be used to find Mersenne primes?

The hardware in question uses an array of 32 Blackfin ADSP-BF561 processors. I know that DSP applications take advantage of FFT, which is used in GIMPS.

So my question is, can one use this hardware for number crunching? Just curious.
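For readers wondering why FFTs come up at all: multiplying two huge integers is a convolution of their digit sequences, and an FFT turns that convolution into cheap pointwise products. The sketch below only illustrates that idea with a textbook radix-2 FFT in floating point. The function names `fft` and `fft_multiply` are made up for this example, and this is not the code GIMPS software actually runs, which uses far more refined transforms.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_multiply(a, b, base=10):
    """Multiply two non-negative integers by convolving their digit lists."""
    def digits(x):
        # Least-significant digit first.
        ds = []
        while x:
            x, d = divmod(x, base)
            ds.append(d)
        return ds or [0]

    da, db = digits(a), digits(b)
    size = 1
    while size < len(da) + len(db):
        size *= 2
    fa = fft(da + [0] * (size - len(da)))
    fb = fft(db + [0] * (size - len(db)))
    prod = fft([x * y for x, y in zip(fa, fb)], invert=True)
    # The inverse transform needs a 1/size scale; rounding removes
    # the small floating-point error in each coefficient.
    coeffs = [round(c.real / size) for c in prod]
    # Summing with positional weights also takes care of the carries.
    return sum(c * base**i for i, c in enumerate(coeffs))


if __name__ == "__main__":
    p = 2**61 - 1
    print(fft_multiply(p, p) == p * p)
```

The rounding step is where precision matters: the wider the floating-point format, the larger the numbers that can be multiplied before rounding errors corrupt the result.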
Old 2011-07-26, 02:38   #2
Christenson
 
Dec 2010
Monticello

6B0₁₆ Posts

*ANY* computer with sufficient memory can crunch for GIMPS... the question is whether it has enough performance to make it interesting. "P90 years forever!" (from P95 no less!)

Sure looks like it has enough processing horses to be interesting... how much memory does the codec have as a whole? I assume this thing has an Ethernet interface for programming and I/O.

The main obstacle I see is that you'd have to port something over yourself, as I don't see P95 or anyone here thinking there will be enough of these in GIMPS duty to justify re-targeting mprime or any other code. mfaktc or CUDAlucas might be easiest, as the P95 core is full of assembler optimisations.

With 32 processors, it might also be able to do matrix reduction for factoring/sieving jobs. If the price were right, I could see it being pressed into that use.

Now, at what cost might I get one? Can I get a developer's kit, or is it going to be like working on a PS/3?
Old 2011-07-26, 03:27   #3
bsquared
 
"Ben"
Feb 2007

6342₈ Posts

That device has 2 Blackfin cores. Each Blackfin core has two 16-bit multiplier/accumulators operating at 600 MHz, for a total of 2400 MMAC/s (mega multiply-accumulates per second). While the instruction throughput approaches that of a modern processor (assuming it could be fully utilized), I'm doubtful that a 16-bit MAC would be useful for GIMPS. The processor is obviously not optimized for this sort of thing.
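As a back-of-the-envelope check on that figure, here is the arithmetic spelled out. The helper name `peak_mmacs` is just for illustration, and the 32-chip total assumes the codec from the first post with every MAC unit busy on every cycle, which real code will not achieve.

```python
def peak_mmacs(cores, macs_per_core, clock_mhz):
    """Peak throughput in millions of 16-bit multiply-accumulates per second."""
    return cores * macs_per_core * clock_mhz


# One chip: 2 cores x 2 MAC units x 600 MHz.
per_chip = peak_mmacs(cores=2, macs_per_core=2, clock_mhz=600)

# The codec in the first post is described as an array of 32 such chips.
per_codec = 32 * per_chip

print(per_chip)   # 2400
print(per_codec)  # 76800
```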

The memory architecture is also lacking for this application (or, sadly, for any factoring step). 32k of L1 SRAM cache is good, 128k of L2 SRAM cache is not, and a PC133 bus to main memory is downright ugly.

Of course there are those who will say that a cycle is a cycle. The datasheet says
Quote:
The architecture has been optimized for use in conjunction with the VisualDSP C/C++ compiler
... so knock yourself out!
Old 2011-07-26, 03:36   #4
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

100100000100₂ Posts

I guess it would be safe to assume that programs like Mprime, glucas, etc. won't run on this thing?
Old 2011-07-26, 03:55   #5
bsquared
 
"Ben"
Feb 2007

2·17·97 Posts

Quote:
Originally Posted by ixfd64
I guess it would be safe to assume that programs like Mprime, glucas, etc. won't run on this thing?
I don't know for sure, but it's very doubtful since the device does not have an x86 architecture.
Old 2011-07-26, 04:16   #6
bsquared
 
"Ben"
Feb 2007

2·17·97 Posts

Quote:
Originally Posted by Christenson
Now, at what cost might I get one? Can I get a developer's kit, or is it going to be like working on a PS/3?
Eval kit.
Old 2011-07-26, 21:06   #7
Christenson
 
Dec 2010
Monticello

2⁴·107 Posts

Only 16-bit MACs? Gonna make it tough on the programmers here. I didn't see the SLOW (relatively) bus to main memory.

It can be done, but it will be a major hack... Do you have a source for a few thousand of these, cheap?
Old 2011-07-26, 21:17   #8
bsquared
 
"Ben"
Feb 2007

CE2₁₆ Posts

Quote:
Originally Posted by Christenson
Only 16-bit MACs? Gonna make it tough on the programmers here. I didn't see the SLOW (relatively) bus to main memory.

It can be done, but it will be a major hack... Do you have a source for a few thousand of these, cheap?
Me? No. There *is* a 1000-piece price point, but I imagine that a few thousand would be hard to come by given any definition of cheap. And of course you'll have to design and build boards for them, buy the rest of the BOM (stuff like memory, capacitors, and connectors), and have them assembled. BGA packages can't really be attached by hand; that takes specialized equipment, which you don't have.

By the way, don't take my posting of links as an indication that I'm an expert with these devices. I'm just browsing the website and data sheets the same as the rest of you.

Last fiddled with by bsquared on 2011-07-26 at 21:19 Reason: add the bit about assembly
Old 2011-07-27, 03:09   #9
Christenson
 
Dec 2010
Monticello

2⁴×107 Posts

Bsquared... thanks. I was thinking of these CPUs assembled into teleconferencing machines, scrapped or something due to no market and thus available for much less than the original manufacturing cost, like maybe at $10 apiece... so I could have a small farm of them instead of my next GPU. I've got some Pentium systems like that around my house, and they are all idle...

I'm still wrestling with the mathematician, who says, let's see, it's a large enough computer, and reasonably fast, so it CAN do the job (and leaves all the practical details as an exercise to the reader), and the engineer, who says, suppose the resources are committed to programming this beast, will the return be worth the effort? Will many machines run the code?

I think a reasonably skilled C coder could take CUDAlucas and/or P95 and make it run on the machine, but I think there are probably better-leveraged projects for GIMPS.

Oh, and divide the performance by approximately 5 if 32-bit operations have to be synthesized from 16-bit operations. (4 is the cost of doing four 16-bit multiplies to make a 32-bit multiply; add a fudge factor for the cost of managing the extra instructions to make it work.)
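To make the "four 16-bit multiplies" point concrete, here is a minimal sketch of the schoolbook split in Python. The function name `mul32_from_16` is made up for illustration, the operands are treated as unsigned, and nothing here is Blackfin code. The factor of 5 is the rough estimate given above, not a measurement.

```python
MASK16 = 0xFFFF


def mul32_from_16(a, b):
    """Multiply two unsigned 32-bit integers using only 16x16 -> 32-bit products."""
    a_lo, a_hi = a & MASK16, a >> 16
    b_lo, b_hi = b & MASK16, b >> 16

    # The four partial products a 16-bit MAC unit would have to compute.
    p_ll = a_lo * b_lo
    p_lh = a_lo * b_hi
    p_hl = a_hi * b_lo
    p_hh = a_hi * b_hi

    # The shifts and adds that recombine them are the "fudge factor" work.
    return p_ll + ((p_lh + p_hl) << 16) + (p_hh << 32)


# Rough effective rate for one chip if the divide-by-5 estimate holds.
peak_16bit_mmacs = 2400
effective_32bit = peak_16bit_mmacs / 5

print(mul32_from_16(0xFFFFFFFF, 0xFFFFFFFF) == 0xFFFFFFFF * 0xFFFFFFFF)
print(effective_32bit)  # 480.0
```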

We rather badly need an energy-per-operation cost for this machine to do any meaningful estimation.
Old 2011-08-02, 11:22   #10
jasonp
Tribal Bullet
 
Oct 2004

3·1,163 Posts

There is no free lunch. Compared to a high-performance general-purpose CPU costing hundreds of dollars, a $10 DSP is worth every penny.

If what you want is raw double-precision flops without regard to space constraints or power consumption, no embedded processor, or even an FPGA, is going to be an attractive option. It's not like 1/100 of an LL test for 1/100 of the dollars is valuable.
Old 2011-08-08, 04:41   #11
Christenson
 
Dec 2010
Monticello

2⁴·107 Posts

I would have thought that high-end GPUs offered the best available performance for under a grand or so, and that ultimate performance in a given technology would probably involve some kind of dedicated FFT hardware and something similar to an FPGA to route the results. (That is, take the FPGA concept, but the fundamental units aren't gates; they are floating-point register dataflows.)

In my mind, the question was whether there were enough of these DSPs available to make the programming effort worthwhile. A completed LL test with a residue is a completed LL test. If it takes half a year and $0.10 worth of electricity instead of $0.75, it's possibly a worthwhile bargain, provided there are enough (hundreds) of machines available at the right price.
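For anyone who wants to play with numbers like these, here is a small sketch of the estimate. The electricity price of $0.10 per kWh and the half-year length of 182.5 days are assumptions picked for illustration, and the function names are hypothetical. Nothing here is a measurement of the actual hardware.

```python
HOURS_PER_HALF_YEAR = 182.5 * 24  # assumed length of "half a year"


def electricity_cost(avg_watts, hours, dollars_per_kwh):
    """Dollars spent running a device at avg_watts for the given number of hours."""
    return avg_watts / 1000.0 * hours * dollars_per_kwh


def implied_avg_watts(budget_dollars, hours, dollars_per_kwh):
    """Average power draw that exactly uses up budget_dollars over the given hours."""
    return budget_dollars / dollars_per_kwh / hours * 1000.0


# Assumed price: $0.10 per kWh. How much power may a device draw if one
# half-year LL test is to cost only $0.10 in electricity?
watts = implied_avg_watts(0.10, HOURS_PER_HALF_YEAR, 0.10)
print(round(watts, 3))
```

Plugging in a measured power draw and test duration for a real board would turn this into the energy-per-test figure asked for earlier in the thread.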
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.