mersenneforum.org  

Go Back   mersenneforum.org > Other Stuff > Archived Projects > NFSNET Discussion

 
 
Thread Tools
Old 2004-06-01, 18:02   #12
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

103·113 Posts
Default

As I work for a Silicon Valley chip start-up that has developed a hybrid device that attempts to capture the best features of ASICs (high performance, low power consumption, low cost when unit volumes are high) and FPGAs (dynamic reprogrammability, low nonrecurring engineering costs), I can comment on the custome-hardware issue.

With the cost of a tyical 0.13 micron mask being around a million dollars, that sets an absolute minimum on the cost of developing an ASIC - figure at least double that once the costs of chip design/layout and manufacturing the silicon are accounted for. Also, industry data indicate that mask costs rise roughly exponentially with inverse feature size, e.g. a .09 micron mask will cost double or more of the same design at 0.13 micron. Thus, this is only a viable solution for someone with deep pockets, or for someone selling the end product in high volumes.

If the proposed custom circuit is implementable in reprogrammable logic, an FPGA-based solution might be attractive. However, consider that the cost of a typical high-end FPGA, say a Xilinx Virtex 2, is on the order of $1000 per unit. And the same-sized FPGA will have an order of magnitude less effective switching circuitry (i.e. gate count) and performance (i.e. operating frequency) than a similar-sized ASIC would.

And as already stated, even if someone went ahead and spent the money needed to prove this concept and came up with a viable solution, this does zilch for the relation-collecting phase of the algorithm, which (at least to my non-NFS-expert eyes) seems to be where the bulk of the labor is, anyway.

Bob and Paul: what are the asymptotic estimates for the sieving work (assuming optimal sieve parameters) vs. the Gaussian elimination step?
ewmayer is online now  
Old 2004-06-01, 18:46   #13
R.D. Silverman
 
R.D. Silverman's Avatar
 
Nov 2003

1D2416 Posts
Wink

Quote:
Originally Posted by ewmayer

<snip>

With the cost of a tyical 0.13 micron mask being around a million dollars, that sets an absolute minimum on the cost of developing an ASIC - figure at least double that once the costs of chip design/layout and manufacturing the silicon are accounted for. Also, industry data indicate that mask costs rise roughly exponentially with inverse feature size, e.g. a .09 micron mask will cost double or more of the same design at 0.13 micron. Thus, this is only a viable solution for someone with deep pockets, or for someone selling the end product in high volumes.

If the proposed custom circuit is implementable in reprogrammable logic, an FPGA-based solution might be attractive. However, consider that the cost of a typical high-end FPGA, say a Xilinx Virtex 2, is on the order of $1000 per unit. And the same-sized FPGA will have an order of magnitude less effective switching circuitry (i.e. gate count) and performance (i.e. operating frequency) than a similar-sized ASIC would.

<snip>

Bob and Paul: what are the asymptotic estimates for the sieving work (assuming optimal sieve parameters) vs. the Gaussian elimination step?

Asymptotically, they are identical. If parameters are optimally chosen, then
the work to collect the relations is the same as to solve the matrix. The work
to solve the matrix is O(N^(2+o(1))) via Block Lanczos. N is the number
of rows/columns which is typically a small multiple of the size of the factor
base. The size of the factor base is O(sqrt(sieving time)).

Note that there are two different machines involved. One to collect the
relations, one to solve the matrix. The paper on the matrix solver suggests
that FPGA's would be ineffective in building that machine. It really requires
custom VLSI. The latest paper was assuming .09 micron technology.

BTW, I thought you were teaching at Case-Western?

Bob
R.D. Silverman is offline  
Old 2004-06-01, 19:47   #14
xilman
Bamboozled!
 
xilman's Avatar
 
"𒉺𒌌𒇷𒆷𒀭"
May 2003
Down not across

101010000000012 Posts
Default

Quote:
Originally Posted by Bob Silverman
Asymptotically, they are identical. If parameters are optimally
There is an old joke which can be easily adapted to present purposes. "Asymptotically" means "as N tends to infinity" but no-one has ever observed N doing it.

In real life, due in large part to parallelism trade-offs, the linear algebra takes a few percent (between 1% and 10% say) of the sieving effort when measured by cpu cycle counts.

The trade-off arises because it is easy to program the sieving in such a way that the communication costs in a parallel machine are completely unimportant compared with the computation costs. No-one knows how to write an equally parallelizable linear algebra program. Until quite recently it was widely thought that clusters held together with ethernet would not be adequate for the job either. We now know better, but the algorithm appears not to scale too well. Measurements seem to indicate that performance on a cluster is roughly proportional to the square root of the number of processors.

Paul

Last fiddled with by xilman on 2004-06-02 at 17:54 Reason: Added an omitted "not", which completely changes the sense of a statement and, furthermore, corrects it!
xilman is offline  
Old 2004-06-01, 19:54   #15
R.D. Silverman
 
R.D. Silverman's Avatar
 
Nov 2003

22·5·373 Posts
Wink

Quote:
Originally Posted by xilman
There is an old joke which can be easily adapted to present purposes. "Asymptotically" means "as N tends to infinity" but no-one has ever observed N doing it

Paul
The joke is due to John Selfridge. I was there when he said it. The original
joke was "Although we know log log x goes to infinity, no-one has ever observed it doing so".

BTW, recent NFSNET work has shown that as our numbers get bigger, the
linear algebra takes a bigger fraction of the total elapsed time. We can
always reduce sieve elapsed time by throwing more machines at the problem.
Not so for the LA. I'm sure Paul will agree.

We now have 3 numbers sieved for which the LA still has not been done.
It will be 4 when 11,206+ finishes.
R.D. Silverman is offline  
Old 2004-06-01, 19:58   #16
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

103×113 Posts
Default

Quote:
Originally Posted by Bob Silverman
Asymptotically, they are identical. If parameters are optimally chosen, then
the work to collect the relations is the same as to solve the matrix. The work
to solve the matrix is O(N^(2+o(1))) via Block Lanczos. N is the number
of rows/columns which is typically a small multiple of the size of the factor
base. The size of the factor base is O(sqrt(sieving time)).
...So even if you can speedup the matrix solution phase by an arbitrarily large amount, you only get an overall 50% speedup.


Quote:
Note that there are two different machines involved. One to collect the
relations, one to solve the matrix. The paper on the matrix solver suggests
that FPGA's would be ineffective in building that machine. It really requires
custom VLSI. The latest paper was assuming .09 micron technology.
The one thing here that is most interesting to me personally is that the two "machines" seem to be evolving in different directions. The relation-collecting part is an ideal application for large-scale distributed computing, as it requires zero internode communication (each node just deposits its relations with a cetralized server), whereas the linear algebra seems to require (or at least benefit greatly from, in the sense of actually achieveing the asymptotic work estimate) a centralized machine with a large pool of shared memory and very high interprocessor bandwidth. But since general-purpose multiprocessor hardware appears to be more than up to the task (and unlike ASIC masks, is getting cheaper all the time), aside from its theoretical interest, I don't see why someone would want to spend boatloads of money building custom hardware to save (using RSA512 as a rough barometer) a few days of multiprocessor machine time per factorization. In other words, unless the relation-collecting capability starts to greatly outstrip the linear-algebra crunching capacity, this seems to me to be a solution in search of a problem.


Quote:
BTW, I thought you were teaching at Case-Western?
I was, from '93-'99. Midwest weather sucks, and I appear to have a low tolerance for repetition - and teaching, especially semi-motivated undergrads, is unavoidably repetitive. I admire people that can make a lifelong career of it, but at the same time I don't care to join them.
ewmayer is online now  
 

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
nfsnet.org updates Nichtwahr NFSNET Discussion 16 2007-04-27 14:45
nfsnet down fivemack NFSNET Discussion 9 2007-04-23 14:55
Questions about NFSNET... WraithX NFSNET Discussion 2 2007-03-03 23:56
Questions about NFSNET koders333 NFSNET Discussion 4 2005-09-26 13:19
http://www.nfsnet.org/downloads/nfsnet-04145-powerpc-MacOSX.tgz Death NFSNET Discussion 15 2004-06-22 07:35

All times are UTC. The time now is 00:05.


Sat Jul 17 00:05:40 UTC 2021 up 49 days, 21:52, 1 user, load averages: 1.42, 1.44, 1.45

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.