2021-07-10, 18:50  #23 
(loop (#_fork))
Feb 2006
Cambridge, England
2^{2}·1,613 Posts 
I'd be surprised if you could get anything much higher performance ... the board won't support the bizarre Platinum 92xx dual-chip packages, and the absolute highest-performance Platinum 82xx is 28 cores at 2.7 GHz, so only 40% faster. I'd expect it to be three or four years before those chips hit the second-hand market, and another two or three to get trivially cheap - the Ivy Bridge Xeons have only just hit the sub-£100 mark, and supply of the 12-core versions isn't there.

2021-07-13, 08:16  #24  
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2^{6}×7 Posts 
Quote:
As far as I can determine, the Platinum processors offer no advantage over Gold in dual-CPU systems - the advantage of Platinum CPUs is that they can be used in boards taking up to 8 processors. The 2nd-generation 24-core 3.0 GHz Gold 6248R is being sold by Intel at around $2700 in quantities of 1000. That's considerably less than the $10,000 of the 28-core 2.7 GHz Platinum 8280. For this reason a large number of 6248Rs are being bought from Intel, and there are plenty of 6248Rs on the second-hand market today. There should not be much difference in performance between 28 cores at 2.7 GHz and 24 cores at 3.0 GHz - in fact I would prefer the higher clock speed even at a reduced number of cores. Currently the 6248Rs are selling at around £1600 (GBP) each. When they fall to £500 each I will buy a couple.

Last fiddled with by drkirkby on 2021-07-13 at 08:20 
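As a rough sanity check on that claim, aggregate core-GHz (cores × base clock) puts the two parts within a few percent of each other. This is a crude metric of my own that ignores turbo, AVX-512 clock offsets and memory bandwidth:

```shell
# Crude aggregate compute comparison: cores x base clock (GHz).
awk 'BEGIN {
    gold     = 24 * 3.0   # Gold 6248R
    platinum = 28 * 2.7   # Platinum 8280
    printf "6248R: %.1f core-GHz, 8280: %.1f core-GHz (%.0f%% difference)\n",
           gold, platinum, 100 * (platinum - gold) / gold
}'
# prints: 6248R: 72.0 core-GHz, 8280: 75.6 core-GHz (5% difference)
```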

2021-07-14, 18:06  #25  
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2^{6}×7 Posts 
Quote:
1) One copy of mprime, 52 cores Code:
[Worker #1 Jul 14 16:25] Timing 5760K FFT, 52 cores, 2 workers. Average times: 2.07, 2.10 ms. Total throughput: 959.54 iter/sec.
[Worker #1 Jul 14 16:26] Timing 5760K FFT, 52 cores, 3 workers. Average times: 3.92, 3.92, 2.12 ms. Total throughput: 981.91 iter/sec.
[Worker #1 Jul 14 16:28] Timing 5760K FFT, 52 cores, 4 workers. Average times: 3.92, 3.91, 3.93, 3.94 ms. Total throughput: 1019.05 iter/sec.

2) One copy of mprime, 48 cores
Code:
[Worker #1 Jul 14 16:31] Timing 5760K FFT, 48 cores, 2 workers. Average times: 2.02, 2.02 ms. Total throughput: 991.00 iter/sec.
[Worker #1 Jul 14 16:32] Timing 5760K FFT, 48 cores, 3 workers. Average times: 4.00, 3.99, 2.03 ms. Total throughput: 992.50 iter/sec.
[Worker #1 Jul 14 16:33] Timing 5760K FFT, 48 cores, 4 workers. Average times: 4.00, 3.99, 3.98, 3.99 ms. Total throughput: 1002.56 iter/sec.

3) Two copies of mprime running, in two different directories. 26 cores for each process.
Code:
[Worker #1 Jul 14 15:56] Timing 5760K FFT, 26 cores, 2 workers. Average times: 6.27, 6.19 ms. Total throughput: 320.92 iter/sec.
[Worker #1 Jul 14 15:57] Timing 5760K FFT, 26 cores, 3 workers. Average times: 13.04, 12.49, 6.11 ms. Total throughput: 320.51 iter/sec.
[Worker #1 Jul 14 15:58] Timing 5760K FFT, 26 cores, 4 workers. Average times: 13.25, 12.72, 13.36, 12.85 ms. Total throughput: 306.80 iter/sec.

2 workers on each process: 320.92+321.10 = 642.02 iter/sec
3 workers on each process: 320.51+320.34 = 640.85 iter/sec
4 workers on each process: 306.80+305.99 = 612.79 iter/sec

The best result, using two workers, gives 642.02 iter/sec, which is significantly poorer than the configuration with the highest throughput (1 mprime process, 52 cores, 4 workers).

4) Two copies of mprime running, in two different directories. 24 cores for each process.
Code:
[Worker #1 Jul 14 16:08] Timing 5760K FFT, 24 cores, 2 workers. Average times: 6.44, 6.38 ms. Total throughput: 311.93 iter/sec.
[Worker #1 Jul 14 16:10] Timing 5760K FFT, 24 cores, 3 workers. Average times: 13.36, 12.82, 6.32 ms. Total throughput: 311.13 iter/sec.
[Worker #1 Jul 14 16:11] Timing 5760K FFT, 24 cores, 4 workers. Average times: 12.98, 13.07, 13.03, 12.97 ms. Total throughput: 307.36 iter/sec.

2 workers per process: 311.93+311.69 = 623.62 iter/sec
3 workers per process: 311.13+310.32 = 621.45 iter/sec
4 workers per process: 307.36+306.75 = 614.11 iter/sec

The best result (2 workers per process) is still far from optimal. Using two copies of mprime appears to kill the performance.

5) Two copies of mprime running, in two different directories. 52 cores for each process - trying to use twice as many cores as available. I did not mean to benchmark this - I got the result by accident, and would not have bothered reporting it except that it is surprising. Trying to use 104 cores, when only 52 are available, actually gives better throughput than the two-copy configurations above, although it is still not as good as one copy of mprime.
Code:
[Worker #1 Jul 14 15:43] Timing 5760K FFT, 52 cores, 2 workers. Average times: 4.89, 5.03 ms. Total throughput: 403.19 iter/sec.
[Worker #1 Jul 14 15:44] Timing 5760K FFT, 52 cores, 3 workers. Average times: 9.78, 9.78, 4.95 ms. Total throughput: 406.50 iter/sec.
[Worker #1 Jul 14 15:45] Timing 5760K FFT, 52 cores, 4 workers. Average times: 9.36, 9.25, 9.31, 9.29 ms. Total throughput: 430.14 iter/sec.
Code:
[Worker #1 Jul 14 15:43] Timing 5760K FFT, 52 cores, 2 workers. Average times: 4.82, 5.00 ms. Total throughput: 407.47 iter/sec.
[Worker #1 Jul 14 15:44] Timing 5760K FFT, 52 cores, 3 workers. Average times: 9.82, 9.74, 4.93 ms. Total throughput: 407.12 iter/sec.
[Worker #1 Jul 14 15:45] Timing 5760K FFT, 52 cores, 4 workers. Average times: 9.28, 9.18, 9.24, 9.33 ms. Total throughput: 431.98 iter/sec.

2 workers: 403.19 + 407.47 = 810.66 iter/sec
3 workers: 406.50 + 407.12 = 813.62 iter/sec
4 workers: 430.14 + 431.98 = 862.12 iter/sec

Conclusions. Based on just the one FFT size of 5760K, the best result (1019.05 iter/sec) is achieved using:
* One copy of mprime
* All 52 cores
* 4 workers

Running two copies of mprime does not seem effective; the combined throughput of both copies is much less than is obtainable with one copy. This supports kriesel's view that mprime handles multiple processors well.

A couple of results surprise me:
a) Using 48 cores gives higher throughput than 52 cores with 2 or 3 workers. With 4 workers, however, 52 cores is best.
b) Running two copies of mprime, each using 52 cores (twice as many as available), actually gives better results than two copies restricted to the available cores. Even so, the maximum throughput of that weird configuration is 862.12 iter/sec, much less than the 1019.05 iter/sec possible with one copy of mprime using all 52 cores.

Any comments - particularly on some of the weird results, like 48 cores outperforming 52 cores with 2 or 3 workers?

Last fiddled with by drkirkby on 2021-07-14 at 18:43 
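For what it's worth, the reported totals can be cross-checked from the per-worker average times, since each worker contributes roughly 1000/avg_ms iterations per second. A sketch; mprime's own totals will differ slightly because the printed averages are rounded:

```shell
# Sum per-worker iteration rates for the 52-core, 2-worker run above.
# Each worker does about 1000/avg_time_ms iterations per second.
awk 'BEGIN {
    split("2.07 2.10", t, " ")       # average times in ms
    for (i in t) total += 1000 / t[i]
    printf "%.2f iter/sec\n", total
}'
# prints: 959.28 iter/sec (mprime reported 959.54)
```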

2021-07-14, 18:34  #26 
"Curtis"
Feb 2005
Riverside, CA
5×1,033 Posts 
Three.

2021-07-15, 02:46  #27  
Jun 2003
5^{2}·211 Posts 
Quote:
Also, when I said (3x8)x2, that would mean 6 workers total on 48 cores. Do you have those results as well? 

2021-07-15, 08:00  #28  
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2^{6}·7 Posts 
Quote:
Code:
$ numactl --cpunodebind=0 --membind=0 mprime1
$ numactl --cpunodebind=1 --membind=1 mprime2
Quote:


2021-07-15, 08:37  #29 
Jun 2003
5^{2}·211 Posts 

2021-07-15, 08:42  #30  
Jun 2003
5^{2}·211 Posts 
Quote:
In fact, the whole point of doing this test was the inability of P95 to take advantage of the 4->12 improvement in memory channels, and George brought up the issue of mprime not being NUMA-aware. So you need to redo the test with numactl. But... you might also need to set the affinity flags inside each mprime copy so that the program doesn't try to fight the OS. 
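One possible way to follow that suggestion (a sketch of mine, assuming Linux with numactl and util-linux installed; `mprime0`/`mprime1` are placeholder names for the two copies): bind each copy to one socket's cores *and* memory with `--cpunodebind`/`--membind`, then confirm the kernel honoured it:

```shell
# Show the node -> core/memory layout first, if numactl is present.
command -v numactl >/dev/null && numactl --hardware

# One mprime copy per NUMA node, binding both CPUs and memory
# (commented out here; mprime0/mprime1 are hypothetical paths):
# numactl --cpunodebind=0 --membind=0 ./mprime0
# numactl --cpunodebind=1 --membind=1 ./mprime1

# Verify a running process's allowed cores (use its real PID):
pid=$$
taskset -cp "$pid"    # e.g. "pid 1234's current affinity list: 0-51"
```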

2021-07-15, 18:13  #31 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3^{2}×11×31 Posts 
Such a situation could be detected on Windows using the CPU pane of Task Manager (Win 7) or Resource Monitor (Win 10), with the caveat that each hyperthread is represented, so it becomes very unwieldy on Xeon Phi. (The 7250 is 68 cores x 4-way HT.)

2021-07-15, 21:04  #32  
"David Kirkby"
Jan 2021
Althorne, Essex, UK
1C0_{16} Posts 
Quote:
I'm not sure how to set the affinity properly - I have attached the benchmark results, as maybe someone can point out how to control this. I realise I need to put something in local.txt. numactl is not binding processes to particular CPUs - I infer this from the temperature of the cores. Perhaps I am using numactl incorrectly.

6) One process, tested with both 4 workers (12 cores each) and 6 workers (8 cores each). No attempt was made to control where workers run.
Code:
[Worker #1 Jul 15 21:14] Timing 5760K FFT, 48 cores, 4 workers. Average times: 3.98, 3.98, 4.07, 4.07 ms. Total throughput: 993.78 iter/sec.
[Worker #1 Jul 15 21:15] Timing 5760K FFT, 48 cores, 6 workers. Average times: 6.44, 6.38, 6.38, 6.61, 6.62, 6.62 ms. Total throughput: 921.81 iter/sec.

7) Two processes, with each process tested using 2, 3 and 4 workers. Attempting (UNSUCCESSFULLY) to tie processes to specific CPUs using numactl.
Code:
$ numactl --physcpubind=0 mprime0 -m
[Worker #1 Jul 15 15:32] Timing 5760K FFT, 24 cores, 2 workers. Average times: 6.24, 6.51 ms. Total throughput: 313.79 iter/sec.
[Worker #1 Jul 15 15:33] Timing 5760K FFT, 24 cores, 3 workers. Average times: 13.07, 13.00, 6.53 ms. Total throughput: 306.46 iter/sec.
[Worker #1 Jul 15 15:34] Timing 5760K FFT, 24 cores, 4 workers. Average times: 13.35, 13.36, 13.42, 13.57 ms. Total throughput: 298.06 iter/sec.
Code:
$ numactl --physcpubind=1 mprime1 -m
[Worker #1 Jul 15 15:32] Timing 5760K FFT, 24 cores, 2 workers. Average times: 6.24, 6.49 ms. Total throughput: 314.49 iter/sec.
[Worker #1 Jul 15 15:33] Timing 5760K FFT, 24 cores, 3 workers. Average times: 13.08, 12.98, 6.54 ms. Total throughput: 306.48 iter/sec.
[Worker #1 Jul 15 15:34] Timing 5760K FFT, 24 cores, 4 workers. Average times: 13.34, 13.35, 13.41, 13.55 ms. Total throughput: 298.19 iter/sec.

4 workers: 313.79+314.49 = 628.28 iter/sec
6 workers: 306.46+306.48 = 612.94 iter/sec
8 workers: 298.06+298.19 = 596.25 iter/sec

Using the "sensors" command I can determine the temperature of the CPU cores. Based on that, I'm convinced that running one process with 24 cores, and trying to bind it to CPU1 (the 2nd CPU) with
Code:
$ numactl --physcpubind=1 mprime
did not actually confine the work to that CPU.

8) Two processes. Each process tested using 2, 3 and 4 workers. (Like 7 above, but not attempting to tie processes to cores.)
Code:
[Worker #1 Jul 15 21:42] Timing 5760K FFT, 24 cores, 2 workers. Average times: 6.10, 6.18 ms. Total throughput: 325.85 iter/sec.
[Worker #1 Jul 15 21:43] Timing 5760K FFT, 24 cores, 3 workers. Average times: 12.67, 12.64, 6.22 ms. Total throughput: 318.77 iter/sec.
[Worker #1 Jul 15 21:45] Timing 5760K FFT, 24 cores, 4 workers. Average times: 12.82, 12.82, 12.82, 12.80 ms. Total throughput: 312.16 iter/sec.
Code:
[Worker #1 Jul 15 21:42] Timing 5760K FFT, 24 cores, 2 workers. Average times: 6.10, 6.17 ms. Total throughput: 325.81 iter/sec.
[Worker #1 Jul 15 21:44] Timing 5760K FFT, 24 cores, 3 workers. Average times: 12.65, 12.67, 6.24 ms. Total throughput: 318.35 iter/sec.
[Worker #1 Jul 15 21:45] Timing 5760K FFT, 24 cores, 4 workers. Average times: 12.75, 12.72, 12.72, 12.79 ms. Total throughput: 313.82 iter/sec.

4 workers: 325.85+325.81 = 651.66 iter/sec
6 workers: 318.77+318.35 = 637.12 iter/sec
8 workers: 312.16+313.82 = 625.98 iter/sec

The throughput with 2 processes is lower than with one process, but these results are higher than in test 7 above.

Conclusions.
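One possible reason the binding attempt failed (an assumption on my part, not confirmed in the thread): `--physcpubind` takes *core numbers*, so `--physcpubind=1` pins a process to logical CPU 1 only, whereas binding a whole socket needs `--cpunodebind` with a *node number*. The difference can be seen with a throwaway process:

```shell
# --physcpubind takes core numbers; --cpunodebind takes NUMA node numbers.
# With two 26-core CPUs, socket 1's cores are often numbered 26-51, but
# check with `lscpu` or `numactl --hardware` - that numbering is an
# assumption here.
if command -v numactl >/dev/null; then
    numactl --physcpubind=1 sleep 2 &   # pinned to core 1 only
    taskset -cp $!                      # affinity list: 1
    numactl --cpunodebind=1 sleep 2 &   # pinned to all of node 1's cores
    taskset -cp $!                      # affinity list: e.g. 26-51
    wait
else
    echo "numactl not installed"
fi
```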


2021-07-15, 21:29  #33  
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2^{6}×7 Posts 
Quote:
9) Temperature sensing - only 12 cores in use.
Here's a set of temperature readings taken while running mprime to test for primes - not benchmarking. I used only 12 cores, split across 4 workers. CPU0 is hotter than CPU1, despite the hot air from CPU0 blowing over CPU1. However, it is far from the case that 12 cores are hot and 14 cool on CPU0, so it seems the work is moving from core to core.
Code:
Package id 0: +48.0°C (high = +89.0°C, crit = +99.0°C)
Core 0: +42.0°C (high = +89.0°C, crit = +99.0°C)
Core 1: +43.0°C (high = +89.0°C, crit = +99.0°C)
Core 2: +43.0°C (high = +89.0°C, crit = +99.0°C)
Core 3: +42.0°C (high = +89.0°C, crit = +99.0°C)
Core 4: +42.0°C (high = +89.0°C, crit = +99.0°C)
Core 5: +42.0°C (high = +89.0°C, crit = +99.0°C)
Core 6: +47.0°C (high = +89.0°C, crit = +99.0°C)
Core 8: +43.0°C (high = +89.0°C, crit = +99.0°C)
Core 9: +42.0°C (high = +89.0°C, crit = +99.0°C)
Core 10: +43.0°C (high = +89.0°C, crit = +99.0°C)
Core 11: +43.0°C (high = +89.0°C, crit = +99.0°C)
Core 12: +43.0°C (high = +89.0°C, crit = +99.0°C)
Core 13: +41.0°C (high = +89.0°C, crit = +99.0°C)
Core 16: +40.0°C (high = +89.0°C, crit = +99.0°C)
Core 17: +41.0°C (high = +89.0°C, crit = +99.0°C)
Core 18: +41.0°C (high = +89.0°C, crit = +99.0°C)
Core 19: +41.0°C (high = +89.0°C, crit = +99.0°C)
Core 20: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 21: +41.0°C (high = +89.0°C, crit = +99.0°C)
Core 22: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 24: +41.0°C (high = +89.0°C, crit = +99.0°C)
Core 25: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 26: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 27: +41.0°C (high = +89.0°C, crit = +99.0°C)
Core 28: +41.0°C (high = +89.0°C, crit = +99.0°C)
Core 29: +46.0°C (high = +89.0°C, crit = +99.0°C)
dell_smm-virtual-0
Adapter: Virtual device
fan1: 0 RPM
fan2: 802 RPM
fan3: 808 RPM
nvme-pci-10200
Adapter: PCI adapter
Composite: +33.9°C (low = -20.1°C, high = +77.8°C) (crit = +81.8°C)
Sensor 1: +33.9°C (low = -273.1°C, high = +65261.8°C)
coretemp-isa-0001
Adapter: ISA adapter
Package id 1: +42.0°C (high = +89.0°C, crit = +99.0°C)
Core 0: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 1: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 2: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 3: +37.0°C (high = +89.0°C, crit = +99.0°C)
Core 4: +37.0°C (high = +89.0°C, crit = +99.0°C)
Core 5: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 6: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 8: +37.0°C (high = +89.0°C, crit = +99.0°C)
Core 9: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 10: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 11: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 12: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 13: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 16: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 17: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 18: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 19: +39.0°C (high = +89.0°C, crit = +99.0°C)
Core 20: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 21: +36.0°C (high = +89.0°C, crit = +99.0°C)
Core 22: +37.0°C (high = +89.0°C, crit = +99.0°C)
Core 24: +37.0°C (high = +89.0°C, crit = +99.0°C)
Core 25: +37.0°C (high = +89.0°C, crit = +99.0°C)
Core 26: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 27: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 28: +38.0°C (high = +89.0°C, crit = +99.0°C)
Core 29: +39.0°C (high = +89.0°C, crit = +99.0°C)
Code:
drkirkby@canary:~/Desktop$ sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +73.0°C (high = +89.0°C, crit = +99.0°C)
Core 0: +69.0°C (high = +89.0°C, crit = +99.0°C)
Core 1: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 2: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 3: +64.0°C (high = +89.0°C, crit = +99.0°C)
Core 4: +64.0°C (high = +89.0°C, crit = +99.0°C)
Core 5: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 6: +69.0°C (high = +89.0°C, crit = +99.0°C)
Core 8: +63.0°C (high = +89.0°C, crit = +99.0°C)
Core 9: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 10: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 11: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 12: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 13: +63.0°C (high = +89.0°C, crit = +99.0°C)
Core 16: +64.0°C (high = +89.0°C, crit = +99.0°C)
Core 17: +66.0°C (high = +89.0°C, crit = +99.0°C)
Core 18: +66.0°C (high = +89.0°C, crit = +99.0°C)
Core 19: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 20: +63.0°C (high = +89.0°C, crit = +99.0°C)
Core 21: +64.0°C (high = +89.0°C, crit = +99.0°C)
Core 22: +64.0°C (high = +89.0°C, crit = +99.0°C)
Core 24: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 25: +63.0°C (high = +89.0°C, crit = +99.0°C)
Core 26: +64.0°C (high = +89.0°C, crit = +99.0°C)
Core 27: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 28: +65.0°C (high = +89.0°C, crit = +99.0°C)
Core 29: +74.0°C (high = +89.0°C, crit = +99.0°C)
dell_smm-virtual-0
Adapter: Virtual device
fan1: 0 RPM
fan2: 1409 RPM
fan3: 806 RPM
nvme-pci-10200
Adapter: PCI adapter
Composite: +33.9°C (low = -20.1°C, high = +77.8°C) (crit = +81.8°C)
Sensor 1: +33.9°C (low = -273.1°C, high = +65261.8°C)
coretemp-isa-0001
Adapter: ISA adapter
Package id 1: +82.0°C (high = +89.0°C, crit = +99.0°C)
Core 0: +77.0°C (high = +89.0°C, crit = +99.0°C)
Core 1: +79.0°C (high = +89.0°C, crit = +99.0°C)
Core 2: +80.0°C (high = +89.0°C, crit = +99.0°C)
Core 3: +78.0°C (high = +89.0°C, crit = +99.0°C)
Core 4: +80.0°C (high = +89.0°C, crit = +99.0°C)
Core 5: +82.0°C (high = +89.0°C, crit = +99.0°C)
Core 6: +80.0°C (high = +89.0°C, crit = +99.0°C)
Core 8: +78.0°C (high = +89.0°C, crit = +99.0°C)
Core 9: +80.0°C (high = +89.0°C, crit = +99.0°C)
Core 10: +81.0°C (high = +89.0°C, crit = +99.0°C)
Core 11: +81.0°C (high = +89.0°C, crit = +99.0°C)
Core 12: +83.0°C (high = +89.0°C, crit = +99.0°C)
Core 13: +81.0°C (high = +89.0°C, crit = +99.0°C)
Core 16: +80.0°C (high = +89.0°C, crit = +99.0°C)
Core 17: +80.0°C (high = +89.0°C, crit = +99.0°C)
Core 18: +81.0°C (high = +89.0°C, crit = +99.0°C)
Core 19: +81.0°C (high = +89.0°C, crit = +99.0°C)
Core 20: +81.0°C (high = +89.0°C, crit = +99.0°C)
Core 21: +78.0°C (high = +89.0°C, crit = +99.0°C)
Core 22: +79.0°C (high = +89.0°C, crit = +99.0°C)
Core 24: +77.0°C (high = +89.0°C, crit = +99.0°C)
Core 25: +78.0°C (high = +89.0°C, crit = +99.0°C)
Core 26: +80.0°C (high = +89.0°C, crit = +99.0°C)
Core 27: +82.0°C (high = +89.0°C, crit = +99.0°C)
Core 28: +81.0°C (high = +89.0°C, crit = +99.0°C)
Core 29: +82.0°C (high = +89.0°C, crit = +99.0°C)
nouveau-pci-7300
Adapter: PCI adapter
fan1: 2473 RPM
temp1: +30.0°C (high = +95.0°C, hyst = +3.0°C) (crit = +105.0°C, hyst = +5.0°C) (emerg = +135.0°C, hyst = +5.0°C)
nvme-pci-0100
Adapter: PCI adapter
Composite: +31.9°C (low = -0.1°C, high = +85.8°C) (crit = +86.8°C)
Sensor 1: +31.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +29.9°C (low = -273.1°C, high = +65261.8°C)

