mersenneforum.org  

2009-06-22, 16:43   #1
stars10250
Jul 2008
San Francisco, CA

3×67 Posts
i7/prime95 memory question

Apologies if this has been discussed already.

I assembled my new i7 system over the weekend and got odd behavior when running Prime95. It's a 6 GB RAM build, but with my 32-bit Win XP OS the system reports just over 3 GB of available memory. This was expected. When I installed Prime95 (latest version), it limited my max memory to around 2.7 GB, also a reasonable number. But when I ran Prime95, I could watch it draw up to about 1.7 GB of memory in Task Manager and then it would say it ran out of memory. The 4 cores (hyperthreading disabled in BIOS) all fought for memory, sometimes to the point of 1 core quitting altogether because the others got to it all first. My questions are: 1) why did it stop 1 GB short of using the available memory, and 2) why doesn't the software divide up the memory fairly so that each core gets a reasonable amount? Can I force a memory limit per core in the worktodo? The machine is now doing 4 LLs, so the memory battle is over for now. But 3 of the cores will finish on the same day and then the fight will begin again.
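
To make the "fair share" idea in question 2 concrete, here is a minimal sketch of an even split of a memory limit across workers. The function name and the even-split policy are assumptions made for this example; it does not describe how Prime95 actually allocates stage 2 memory, and the 2.7 GB limit is approximated as 2700 MB.

```python
def split_memory_evenly(total_mb: int, workers: int) -> list[int]:
    """Divide total_mb among workers; any leftover MB go to the first workers."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    base, extra = divmod(total_mb, workers)
    # The first `extra` workers each receive one additional MB.
    return [base + (1 if i < extra else 0) for i in range(workers)]


# Hypothetical numbers: a limit of about 2.7 GB shared by four workers.
shares = split_memory_evenly(2700, 4)
print(shares)
```

With a split like this, no worker could be starved by the others, at the cost of capping each one well below the total.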

So far I'm happy with the build. It is currently OC'd to 3.3 GHz, all 4 core temps ~60 °C, 43 ms iteration times on 45M exponents.
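
As a rough cross-check of those timings, the sketch below estimates how long one LL test takes. It assumes the standard count of p − 2 squaring iterations for an exponent p and a constant iteration time; the function name is made up for the example.

```python
def ll_test_days(exponent: int, ms_per_iteration: float) -> float:
    """Estimate wall-clock days for one Lucas-Lehmer test."""
    iterations = exponent - 2  # an LL test of 2**p - 1 performs p - 2 squarings
    seconds = iterations * ms_per_iteration / 1000.0
    return seconds / 86400.0   # 86400 seconds per day


# Figures from the post: a 45M exponent at 43 ms per iteration.
print(f"{ll_test_days(45_000_000, 43.0):.1f} days")
```

This comes out at a little over three weeks per test, assuming the machine runs around the clock at that iteration time.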

2009-06-22, 22:36   #2
Uncwilly
6809 > 6502
Aug 2003
101×103 Posts

10101100111100₂ Posts

Were you doing P-1 at first?

2009-06-23, 00:44   #3
lavalamp
Oct 2007
Manchester, UK

2537₈ Posts

If your overclock is stable with HT enabled, you might want to experiment a little with hyperthreading. At 4032 MHz on the P95 bench I get 35.639 ms for 2560K FFT, but with 2 workers on 1 core I get 32.588 ms. With 8 workers on 4 cores I get 10.889 ms.

It's worth a little experimentation to make sure you're getting the absolute best out of the CPU; hyperthreading should squeeze a little more out of it.

2009-06-23, 01:03   #4
stars10250
Jul 2008
San Francisco, CA

3×67 Posts

Yes, it was doing P-1 at first and fighting for memory in stage 2. This system replaces a single-core system, so I had one assignment that was about 50% into an LL that transferred to the i7. This new machine therefore had 3 new assignments and one old LL.

My concern with hyperthreading is that I thought I remembered the discussion concluding that throughput is identical with or without HT. These jobs take about a month when 1 core is devoted, so if I use HT it will be closer to 2 months. That's a long time to wait for a single result. I like to see more regular progress. And if something breaks, I've (potentially) lost less work with 1-month assignments.

You're saying you see a reduction of about 3 ms per core with HT on? If so, that's worth doing.

Does anyone have answers to my questions of why it didn't use the full 2.7 GB that it allotted in the P-1 (stage 2) and why it doesn't share memory more evenly?

2009-06-23, 02:49   #5
Uncwilly
6809 > 6502
Aug 2003
101×103 Posts

2²×2,767 Posts

Quote:
Originally Posted by stars10250 View Post
Yes, it was doing P-1 at first and fighting for memory in stage 2. This system replaces a single-core system, so I had one assignment that was about 50% into an LL that transferred to the i7. This new machine therefore had 3 new assignments and one old LL.

Does anyone have answers to my questions of why it didn't use the full 2.7 GB that it allotted in the P-1 (stage 2) and why it doesn't share memory more evenly?

IIRC this is a known issue and on George's to-do list. Search the forum; you may find a similar thread with George's involvement.

2009-06-23, 03:19   #6
lavalamp
Oct 2007
Manchester, UK

5³·11 Posts

Quote:
Originally Posted by stars10250 View Post
My concern with hyperthreading is that I thought I remembered the discussion concluding that throughput is identical with or without HT. These jobs take about a month when 1 core is devoted, so if I use HT it will be closer to 2 months. That's a long time to wait for a single result. I like to see more regular progress. And if something breaks, I've (potentially) lost less work with 1-month assignments.

I didn't mean run two tests on the same core, I meant run one test with two threads on the same core. Or possibly one test with 8 threads on 4 cores, which could conceivably shorten each test to 1 week or thereabouts.

I performed several tests with a 20M FFT size for someone interested in acquiring one for LLing of 100 million digit candidates, and got many numbers and figures out of it:
http://www.mersenneforum.org/showpos...2&postcount=21

Also, if you take a look at the numbers in my first post in this thread, there's a 9% decrease in iteration time from using a helper thread on the same core. Whether that would remain the case with three other LL tests running on the other cores I do not know; other issues come into play, such as limited cache availability and memory bandwidth. That is why trying different combinations is necessary.
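
The figures quoted from the benchmark can be checked with a few lines of arithmetic. The sketch below uses only the three timings given in this thread for the 2560K FFT at 4032 MHz; the function name is made up for the example.

```python
def percent_decrease(before_ms: float, after_ms: float) -> float:
    """Relative drop in iteration time, as a percentage of the original."""
    return (before_ms - after_ms) / before_ms * 100.0


one_thread_ms = 35.639       # 1 worker on 1 core
helper_thread_ms = 32.588    # 2 workers on 1 core
eight_threads_ms = 10.889    # 8 workers on 4 cores

print(f"helper thread: {percent_decrease(one_thread_ms, helper_thread_ms):.1f}% faster per iteration")
print(f"8 threads on 4 cores: {one_thread_ms / eight_threads_ms:.2f}x the single-thread speed")
```

These are benchmark figures for a single test with nothing else running, so they say nothing yet about cache and memory-bandwidth contention between several concurrent tests.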

2009-06-23, 03:35   #7
petrw1
1976 Toyota Corona years forever!
"Wayne"
Nov 2006
Saskatchewan, Canada

14CD₁₆ Posts

How about a 64-bit OS? I am using Vista Business 64-bit. Though it is NOT recommended, and not really stable, for all the newest games and toys, it works just fine for Prime95 and it gets you all the memory.

2009-06-23, 03:58   #8
stars10250
Jul 2008
San Francisco, CA

3×67 Posts

I might try a 64-bit OS sometime, but this is my main home computer and I need it to be able to run all of my other software too (Photoshop, Office, TurboTax, etc.). I'm not sure about compatibility. It sure would be nice to take advantage of my 6 GB of RAM :)

Regarding HT, I want the maximum throughput while running LL on 2560K FFTs. I agree that experimentation is the best course of action as it is tailored to my particular setup. I will be doing this for quite a while, I'm sure. Tonight I switched back to HT (8 threads) and while my timing numbers basically doubled, my core temperatures all increased by 6 °C. I didn't expect this. It was readily apparent and nothing else in my system changed.

2009-06-23, 06:51   #9
lavalamp
Oct 2007
Manchester, UK

5³×11 Posts

I'm not sure how it's possible for your timing numbers to double unless you left the affinities set to 0,1,2,3 instead of changing to 0,2,4,6 or 1,3,5,7 (or 0,3,4,7 or any other combination spread over the 4 cores). Try setting the thread affinities to 0,2,4,6 and set one helper thread each. That will mean two threads per LL test, and the tests should be arranged like so: LL1{0,1} LL2{2,3} LL3{4,5} LL4{6,7}. You can easily check that they are running on the correct virtual cores by stopping the other three tests temporarily, and checking in Task Manager to see which two cores are topping out at 100% usage.

A small temp increase would be expected as you now have an additional I/O unit active per core, and also more work can be done per core per clock cycle.
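
The arrangement described in this post can be generated rather than typed by hand. The sketch below assumes the numbering described here for Windows, where logical CPUs 2k and 2k+1 are the two hyperthreads of physical core k; the function name and the LL1..LL4 labels are just for illustration and are not Prime95 settings.

```python
def paired_affinities(physical_cores: int) -> dict[str, tuple[int, int]]:
    """Map each test to the two logical CPUs assumed to share one physical core."""
    # Assumption: logical CPUs 2k and 2k+1 belong to physical core k.
    return {f"LL{k + 1}": (2 * k, 2 * k + 1) for k in range(physical_cores)}


for test, cpus in paired_affinities(4).items():
    print(test, cpus)
```

The first number of each pair gives the 0,2,4,6 affinity list, and the second is where the helper thread should end up.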

2009-06-23, 08:29   #10
fivemack
(loop (#_fork))
Feb 2006
Cambridge, England

2·7·461 Posts

Quote:
Originally Posted by lavalamp View Post
I'm not sure how it's possible for your timing numbers to double unless you left the affinities set to 0,1,2,3 instead of changing to 0,2,4,6 or 1,3,5,7 (or 0,3,4,7 or any other combination spread over the 4 cores).

Oh, joy; so on Windows the left-hand sides of the cores are 0246 and the right-hand sides 1357, whilst on Linux the left-hand sides of the cores are 0123 and the right-hand sides 7645 (!)
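
Since the numbering differs between systems, on Linux it is safer to read the pairing from the kernel than to assume it. The sketch below reads the `thread_siblings_list` files that Linux exposes under sysfs; the helper names are made up for this example, and on a system without that directory it simply returns an empty list.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Parse a kernel CPU list such as '0,4' or '0-1' into CPU numbers."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def sibling_groups(sysfs_root: str = "/sys/devices/system/cpu") -> list[tuple[int, ...]]:
    """Return the distinct groups of logical CPUs that share a physical core."""
    groups = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        try:
            groups.add(tuple(parse_cpu_list(path.read_text())))
        except OSError:
            continue  # skip entries that cannot be read
    return sorted(groups)


if __name__ == "__main__":
    print(sibling_groups())
```

Each tuple in the result is one physical core, so any affinity scheme can be built from whatever pairs the machine actually reports.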

2009-06-23, 09:04   #11
joblack
Oct 2008
n00bville

2⁵·23 Posts

Hyperthreading is very important with the i7 ... if you calculate only one number with 8 threads, it could accelerate your calculations ...

Last fiddled with by joblack on 2009-06-23 at 09:51
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
