[QUOTE=Prime95;263713]That is concerning. What hit rates are others getting for single bit levels of TF?[/QUOTE]
There's a confounding factor here....I've had P-1 find at least one factor in the middle of my TF range (not on the same exponent), so P-1 could be removing some candidates where xyzzy would otherwise find a factor. 1/70 is what would be expected without any pre-selection. |
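The 1-in-70 figure quoted throughout this thread comes from the standard GIMPS heuristic that a Mersenne number has a factor between 2^b and 2^(b+1) with probability roughly 1/b (absent pre-selection by P-1 or earlier TF). A minimal sketch of the arithmetic, with illustrative numbers:

```python
# Standard GIMPS heuristic (a rough rule, not an exact law): the chance
# that 2^p - 1 has a factor between 2^b and 2^(b+1) is roughly 1/b,
# absent pre-selection by P-1 or earlier TF.

def expected_factors(tests: int, bit_level: int) -> float:
    """Expected hits for `tests` single-bit-level TF runs near `bit_level`."""
    return tests / bit_level

# e.g. runs at the ~70-bit level: about 1 factor per 70 attempts
print(expected_factors(700, 70))  # 10.0
```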
[QUOTE=Christenson;263686]Not quite what I had in mind.....
when physically at the computer, launch a loop that says sleep 30; if (xyzzy in process table && mfaktc not in process table) mfaktc then, when away from the computer, launch xyzzy, which contains just a "sleep 30" while TRUE (or while mfaktc not in process table) command. that way, the mfaktc is local and launched locally, and the presence of xyzzy in the process table divorces the remoteness from the mfaktc process.[/QUOTE] Christenson, The method you propose fails too. (Sorry - I've tried it also.) The console remembers the last method of logging in. So if you logged in remotely with mstsc and set up a sleep-then-run-when-logged-out, it's as though you were still logged in via mstsc, and it fails. There are a lot of forum posts on the topic; the only workaround quoted (for a Win box) is VNC-ing in. The other alternative is hacking the Tesla drivers, which is a bit beyond my abilities (and besides, VNC is a lot easier). :) -- Craig |
[QUOTE=nucleon;263719]Christenson,
The method you propose fails too. (Sorry - I've tried it also.) The console remembers the last method of logging in. So if you logged in remotely with mstsc and set up a sleep-then-run-when-logged-out, it's as though you were still logged in via mstsc, and it fails. There are a lot of forum posts on the topic; the only workaround quoted (for a Win box) is VNC-ing in. The other alternative is hacking the Tesla drivers, which is a bit beyond my abilities (and besides, VNC is a lot easier). :) -- Craig[/QUOTE] I've said it before, and I'll say it again: What a broken operating system! What if we made mfaktc more like P95 and gave it a window and all, rather than running it from the console? Or are we still back to hacking the drivers? |
[QUOTE=Christenson;263717]There's a confounding factor here....I've had P-1 find at least one factor in the middle of my TF range (not on the same exponent), so P-1 could be removing some candidates where xyzzy would otherwise find a factor. 1/70 is what would be expected without any pre-selection.[/QUOTE]
And there's another confounding/confuzzling factor: Xyzzy, are you counting successes off your stats page, or from your "results.txt" file? They will be different, as my factors found and reported on manual results keep getting assigned by the server as P-1 results. I only know which are which by noting that all my TF is "manual testing" and looking through my results page, but that seems impractical for the volume you have! :hello: |
[QUOTE=Christenson;263685]My actual numbers (ignoring 10 or 15 NFs from the last 24 hours)
247 attempts, mostly in the 50-60M range, a few from the 80M range. 5 factors. So my original numbers are a bit optimistic, and I can expect to hit a dry spell here as my statistical experience increases. As for unfinished LLs, the issue is probably that positive feedback from the stats page takes at least a week, maybe two or three months, which is a *very* long time in the "instant" internet age. Almost anything else (LL-D, P-1, TF, ECM) is quicker. So is contributing to NSF@home, or, perhaps more practically, folding@home. The dedicated don't mind, but the casual don't have that much patience. I'm working on that patience part even for mfaktc, albeit slowly.[/QUOTE] That's more like it. Expect 247/70 factors and find 5. (Don't hold the front page!) I hope that "dry spell" shit was taking the piss:smile: I know you view GIMPS as "proving candidates composite as fast as possible", but after a certain bit level (and P-1), the most effective way of doing this is an LL test... and then who knows?! David |
[QUOTE=Christenson;263717]There's a confounding factor here....I've had P-1 find at least one factor in the middle of my TF range (not on the same exponent), so P-1 could be removing some candidates where xyzzy would otherwise find a factor. 1/70 is what would be expected without any pre-selection.[/QUOTE]
That's what I was talking about when I asked "to what extent do TF and P-1 tread on each other's toes?" David |
[QUOTE=Prime95;263713]That is concerning. What hit rates are others getting for single bit levels of TF?[/QUOTE]
As Michael Winner (director of Death Wish 1 to n and celebrity gourmet/advertiser/self-publicist) would say: "Calm down dear". Stats to follow. David |
[QUOTE=davieddy;263732]
<snip> (Don't hold the front page!) [Correct. The headline already on it is fine -- GPUs smashing through TF assignments; some manual effort still needed] I know you view GIMPS as "proving candidates composite as fast as possible", but after a certain bit level (and P-1), the most effective way of doing this is an LL test... and then who knows?! David[/QUOTE] Even in my extremely narrow :smile: view of GIMPS, a point arrives where one *has* to do lots of LL tests...each additional bit level of TF costs twice as much time, so my one-hour TF to 70 bits in the 50-60M range becomes 32 hours to 75 bits, becomes about a year to 83 bits...and the LL on the range only takes a month or two, then another month or two for the eventual LL-D. I do actually run some LL and LL-D and P-1, all CPU-based. The P-1 has been a little more effective (GHz-days per candidate found composite) than the first-time LL tests. But anything we can do to reduce the faffing around needed to run mfaktc increases the available TF testing. Then we can think about P-1, and about whether it's worth doing multiple LLs in parallel on GPUs. I read a bunch of P95 tonight, looking for how P95 does multitasking, and took notes. The first step of untangling P-1 and TF I began tonight by asking GW how to format mfaktc output when it finds factors so the server always decides it's a TF result on the manual results page. The second step is beyond me; we will need to adjust the P-1 so it doesn't search for candidates in the TF space, or determine that it is not cost-effective to do so. I'm OK if P-1 occasionally finds something in TF space, like 69-70 bits, but, at the moment, it would have been cheaper to finish TF (an hour) rather than spend 4 GHz-days of CPU on it. My experience is 1 in 10, but the variance on such a small sample is notoriously high. |
I'll let you off this time for not [/quote]Your Comment[quote]ing
in the middle of my original. The idea behind GIMPS is to find Mersenne primes. (Check out the acronym.) David PS "Credit" for work done is not any part of my incentive. Obviously $$$$ isn't either, for any of us. I would hope that some of my posts here have been constructive/entertaining though. PPS I think factors found by P-1 are typically a few bits bigger than "how far TFed". Bugs excepted, they couldn't be lower, of course. |
1 Attachment(s)
[QUOTE]And there's another confounding/confuzzling factor: Xyzzy, are you counting successes of of your stats page, or from your "results.txt" file?[/QUOTE]We keep track of all our work locally. Our results are attached. We plan to submit a very large batch of work in the next day or so, so it will be interesting to see if the trend continues.
:max: |
[QUOTE=Christenson;263721]I've said it before, and I'll say it again: What a broken operating system!
What if we made mfaktc more like P95 and gave it a window and all, rather than running it from the console? Or are we still back to hacking the drivers?[/QUOTE] It's got nothing to do with the OS. Conspiracy theories aside, it apparently works with Tesla products but not GT/GTX cards. It's a software call, regardless of how it's invoked (cmd-line app or Windows GUI app). With the current drivers there's _no_ (zero zilch nada) workaround other than VNC. Much more knowledgeable people than me have said this. Remoting in with VNC works (albeit slower than mstsc); I have a VNC server on 4x Win boxes (2x WinXP and 2x Win7) and can restart mfaktc remotely at a whim. No problemo. VNC is free for private use. Move on to another subtopic. Case closed. -- Craig |
[QUOTE=Batalov;263650]I've already tried launching "sleep 30; mfaktc" in a Cygwin shell and then disconnect. Some time later you connect and see the same result.
My guess is that the other driver is not initiated (lazily, which is I guess usually a good thing), until I would go and login there physically.[/QUOTE] Use TeamViewer [URL="http://www.teamviewer.com/"]http://www.teamviewer.com/[/URL] instead of remote desktop. It's free for private use, and I just connected from work to my home computer and started mfaktc. |
[QUOTE=Prime95;263713]That is concerning. What hit rates are others getting for single bit levels of TF?[/QUOTE]
Here are the stats from my last 1448 tests, 76M range:
[code]
From 66 to 67 -  16 tests, 0 factors
From 67 to 68 - 123 tests, 1 factor
From 68 to 69 - 124 tests, 2 factors
From 69 to 70 - 310 tests, 1 factor
From 70 to 71 - 303 tests, 5 factors
From 71 to 72 - 286 tests, 4 factors
From 72 to 73 - 272 tests, 1 factor
[/code]So that's 14 factors in 1448 tests, adjusted for mfaktc reporting TF's as P1's. Doug |
[QUOTE=drh;263776]So that's 14 factors in 1448 tests[/QUOTE]
Like Xyzzy 1 in 100 hit rate instead of the expected 1 in 70. It's way too soon to conclude there is a bug, keep the data coming. |
From the machine I have access to right now:
[code]
From 65 to 66 - 27 tests, 0 factors
From 66 to 67 - 14 tests, 0 factors
From 67 to 68 - 15 tests, 0 factors
From 68 to 69 - 45 tests, 1 factor
From 69 to 70 - 26 tests, 0 factors
From 70 to 71 - 15 tests, 0 factors
From 71 to 72 -  4 tests, 0 factors
[/code]Too few results to be useful as-is, but maybe helps in the aggregate? |
mfaktc, 0.16, 0.16p1, 0.17; Win7-x64 with cudart64_32_16.dll, i7-840QM.
[code]
From 65 to 66 - 136 tests, 2 factors found, 2.1 = 136/65 expected, 82457xxx to 82481xxx.
From 66 to 67 - 151 tests, 6 factors found, 2.3 = 151/66 expected, 78417xxx to 82481xxx.
From 67 to 68 - 146 tests, 1 factor found,  2.2 = 146/67 expected, 79014xxx to 82481xxx and one 26062xxx.
From 68 to 69 - 148 tests, 1 factor found,  2.2 = 148/68 expected, 79583xxx to 82481xxx and one 26062xxx.
From 69 to 70 - 130 tests, 1 factor found,  1.9 = 130/69 expected, 80086xxx to 82481xxx and one 26062xxx.
From 70 to 71 - 127 tests, 4 factors found, 1.8 = 127/70 expected, 81234xxx to 82481xxx and one 26062xxx.
From 71 to 72 - 123 tests, 0 factors found, 1.7 = 123/71 expected, 81234xxx to 82481xxx and one 26062xxx.
--------------------------------------------
total         - 961 tests, 15 factors found, 14.2 expected more-or-less.
[/code]Small sample size, but seems reasonable(?). |
[QUOTE=Prime95;263778]Like Xyzzy 1 in 100 hit rate instead of the expected 1 in 70. It's way too soon to conclude there is a bug, keep the data coming.[/QUOTE]
Yep. We "expect" 16 hits in 1120 tests. Standard deviation 4 hits. But isn't there much more data accessible now? |
[QUOTE=Christenson;263736]
But, anything we can do to reduce the faffing around needed to run mfaktc increases the available TF testing. Then we can think about P-1, and seeing if it's worth doing multiple LLs in parallel on GPUs. [/QUOTE] Oliver has modestly described mfaktc as a "proof of concept". Since it is obviously more than that, someone should surely ensure it is bug-free and is incorporated into GIMPS in a user-friendly way. David PS Or even "optimize" it? |
[QUOTE=davieddy;263790]Oliver has modestly described mfaktc as a "proof of concept".
Since it is obviously more than that, someone should surely ensure it is bug-free, and is incorporated into GIMPS in a user-friendly way. David PS Or even "optimize" it?[/QUOTE] I'm working on that user-friendliness, just not as quickly as we all might like. P95, how do we get the server to recognize a "factor found" result from mfaktc on the manual testing page? |
1 Attachment(s)
[QUOTE=Prime95;263778]keep the data coming.[/QUOTE]
M222xxxxxx from 2^64 to 2^71 mfaktc 0.17 Win7 64/gtx570x2 ca 7750 tests - 121 factors found |
[QUOTE=Ungelovende;263792]M222xxxxxx from 2^64 to 2^71
mfaktc 0.17 Win7 64/gtx570x2 ca 7750 tests - 121 factors found[/QUOTE] That is scary -- expected is 848 factors |
Hmmm.
Lost here. "Test" means? 7750/67 = 116 David |
[QUOTE=Prime95;263797]That is scary -- expected is 848 factors[/QUOTE]
I counted 2^64 to 2^65 as one 'test'. 2^65 to 2^66 on the same exponent as another 'test' :smile: |
[QUOTE=Ungelovende;263800]I counted 2^64 to 2^65 as one test. 2^65 to 2^66 on the same exponent as another test :smile:[/QUOTE]
Whew!! You did get the expected number of factors. Maybe Xyzzy is just an unlucky bloke. Let's see how his next batch does. |
[QUOTE=Prime95;263801]Whew!! You did get the expected number of factors. Maybe Xyzzy is just an unlucky bloke. Let's see how his next batch does.[/QUOTE]
I was half hoping for a compliment on my "fielding" skill there. David |
Ok, some stats from me. The current batch I'm looking at is 380M. I've broken the range down into three separate sections:
a) 2^65 to 2^74
b) 2^74 to 2^76
c) 2^76 to 2^77
I find this method gives the best compute-time efficiency with my hardware. So across my farm, no-factor/factor-found is:
a) 612/85
b) 170/1
c) 342/5
Why (c) is greater than (b): initially I did other combos before I settled on this plan (i.e. 75-76, and 73-76), but I always did 76-77. (c) seems to fit the 1-in-70 expectation, although it might be low sample size. I'll leave it to readers with more maths ability than me to work out whether (a) is within expectations. I have 11 instances of mfaktc running, so stats are a little hard to collate :) I've been running since April. If I have time I can get some stats on another range I looked at - 100M. -- Craig |
1 Attachment(s)
[QUOTE]Maybe Xyzzy is just an unlucky bloke. Let's see how his next batch does.[/QUOTE]The attached image summarizes our work to date, including today's dump. We will attach the raw data in the next post.
|
2 Attachment(s)
|
Some more data...
[code] Range start | Range end | Level | Tests | Factors | 1 / ... -------------+-------------+-------+-------+---------+--------- 27,270,697 | 28,129,921 | 67-68 | 3,702 | 38 | 97.42 29,132,821 | 29,687,591 | 66-67 | 1,419 | 12 | 118.25 29,132,821 | 30,757,057 | 67-68 | 1,790 | 17 | 105.29 - - - - - - + - - - - - - + - - - + - - - + - - - - + - - - - 27,270,697 | 30,757,057 | total | 6,911 | 67 | 103.15 -------------+-------------+-------+-------+---------+--------- 40,149,029 | 47,437,847 | 68-69 | 690 | 14 | 49.29 [/code]P-1 had not been done on any of the 690 exponents from the last range but on all from the other ranges. A test would be a single bit level, obviously. I have no idea why the last range yielded so many factors. :no: [QUOTE=Prime95;263801]Maybe Xyzzy is just an unlucky bloke.[/QUOTE] I've had a streak of 863-or-some-such NF results once. Chat hippens. :wink: |
[QUOTE=Xyzzy;263649]1:115 here.
53M range. 69 to 70 or 70-71. The sample size is > 9000 tested. :sad:[/QUOTE] Xyzzy: I took a quick look at your results.txt posted on June 14th. I focused on the M53.xxx.xxx exponents and checked the exponent status of the first 4 no-factor M53.xxx.xxx: all of them had P-1 done before your TF attempt to 2^70. [url]http://mersenne.org/report_exponent/?exp_lo=53332333&exp_hi=53332600&B1=Get+status[/url] What is the expected factor rate once P-1 has been done with reasonable B1/B2 bounds? Oliver |
Xyzzy, there are two things that I recommend to you if you think something is broken.[LIST=1][*]run the long selftest (mfaktc.exe -st) under the same circumstances as your daily work (number of instances, settings in mfaktc.ini, ...)[*]"rediscover" some already known factors which were initially [B]not[/B] discovered by mfaktc.
E.g. yesterday I ran some of my [B]recent P-1 factors in the M53.xxx.xxx range[/B] (23 factors from 2^68 to 2^71) with 0.17. I did "full length runs", not like the selftest just a small range in the class where the factor falls in. Everything worked fine. :smile:[/LIST] Oliver |
[QUOTE=TheJudger;263831]
What is the expected factor rate once P-1 has been done with reasonable B1/B2 bounds? Oliver[/QUOTE] That question has been exercising several of us. In particular, it was decided a few years ago that it was more efficient to do P-1 before the last worthwhile bit of TF. I wonder whether the P-1 would then render the "last bit" not worthwhile. Of course, you and GPUs have thrown a large spanner in the works:smile: David |
Two Netiquette crimes at once
[QUOTE=davieddy;263836]That question has been exercising several of us.
In particular, it was decided a few years ago that it was more efficient to do P-1 before the last worthwhile bit of TF. I wonder whether the P-1 would then render the "last bit" not worthwhile. Of course, you and GPUs have thrown a large spanner in the works:smile: David[/QUOTE] (Responding to your own post and partially answering the question therein) There is obviously little to be gained by refining the "bit level" for TF further. (Working near a Max/Min optimum.) But if the "worthwhileness" of the "last bit" was marginal, P-1 would obviously tip it over the edge. David |
[QUOTE=TheJudger;263831]...checked the exponent status of the first 4 no-factor M53.xxx.xxx: all of them had P-1 done before your TF attempt to 2^70.
What is the expected factor rate once P-1 has been done with reasonable B1/B2 bounds?[/QUOTE] Ah, mystery solved! I should have thought of that myself. IIRC, P-1 will find somewhere in the neighborhood of 30-40% of the factors. |
[QUOTE=Prime95;263843]Ah, mystery solved! I should have thought of that myself. IIRC, P-1 will find somewhere in the neighborhood of 30-40% of the factors.[/QUOTE]
Out of the hundreds of exponents that I've tested, I only ran a TF on exactly 1 that had a P-1 done on it beforehand, and that only consisted of 2 tests, so I'm not sure this solves the mystery. I'm now at 15 factors on 1495 tests, 1493 before a P-1. This is all in the 76M range, and there have only been 23 P-1s done in that range, IIRC. |
[QUOTE=drh;263845]I'm now at 15 factors on 1495 tests, 1493 before a P-1.[/QUOTE]
I think that's unlucky but not suspicious. Xyzzy's was worrisome because it involved 9000 tests. |
[QUOTE=Prime95;263851]I think that's unlucky but not suspicious. Xyzzy's was worrisome because it involved 9000 tests.[/QUOTE]
I agree, and just to clarify, a "test" is a single bit level on an exponent, not only counting individual exponents, right? I've got 1495 "tests" on 322 exponents. |
[QUOTE=Prime95;263843]Ah, mystery solved! I should have thought of that myself. IIRC, P-1 will find somewhere in the neighborhood of 30-40% of the factors.[/QUOTE]
From a "throughput" POV, has TF on GPUs (at least temporarily) made P-1 redundant? |
[QUOTE=davieddy;263864]From a "throughput" POV, has TF on GPUs (at least temporarily) made P-1 redundant?[/QUOTE]
No. Prime95 will refuse to do P-1 if a number has had so much TF done that P-1 will not be profitable. I think we'll see a slight reduction in the B1/B2 bounds selected now that more TF is being done. |
[QUOTE=Prime95;263866]No. Prime95 will refuse to do P-1 if a number has had so much TF done that P-1 will not be profitable.
I think we'll see a slight reduction in the B1/B2 bounds selected now that more TF is being done.[/QUOTE] Hmm. 1) I think "No" meant "Yes". 2) Without fully understanding P-1, I would have thought "reduction" meant "increase"! David |
[QUOTE=Prime95;263843]Ah, mystery solved! I should have thought of that myself. IIRC, P-1 will find somewhere in the neighborhood of 30-40% of the factors.[/QUOTE]
Or read it here!: [url]http://mersenneforum.org/showpost.php?p=263717&postcount=971[/url] :devil::uncwilly::pals::leaving: Sorry I didn't have any numbers....hoping that the extra TF will allow P-1 to look for larger factors... Anyway, calculation about my narrow view of GIMPS rolled through my head last night...it went like this: Decrease TF cost by factor of ~128, get ~7 extra bit levels....7*~1/70 = 1/10....so 10% more exponents will have factors found, or 10% fewer will need LL and LL-D tests...looks like the big performance increase will need to be in LL, possibly by getting the CPU out of the sieving path for TF and possibly P-1. Got to work on mfaktc this week! |
[QUOTE=Christenson;263872]looks like the big performance increase will need to be in LL[/QUOTE]
Just look at the [URL="http://primes.utm.edu/largest.html"]"Top Ten" primes[/URL] How do you think the top nine were discovered? David PS I was wondering for a bit why the latest discovery was attributed to "G12" when GIMPS has found 13. I think the explanation is that Cooper and Boone found 2. Lightning strikes... |
[QUOTE=davieddy;263868]Hmm.
1) I think "No" meant "Yes".[/QUOTE] "No" means "No". A quick test showed that the client will happily run P-1 on a 43M exponent even if the TF bit level is set as high as 80, albeit at much reduced limits. [QUOTE]2) Without fully understanding P-1, I would have thought "reduction" meant "increase"![/QUOTE] No. The limits are determined by the law of diminishing returns. Specifically, the program stops doing stage 1 when the benefit gained by doing additional stage 1 falls below the cost of doing it. Similarly with stage 2. Additional levels of TF reduce the benefit of P-1, which means that these cross-over points are reached earlier. |
[QUOTE=Mr. P-1;263875]"No" means "No". A quick test showed that the client will happily run P-1 on a 43M exponent even if the TF bit level is set as high as 80, albeit at much reduced limits.
No. The limits are determined by the law of diminishing returns. Specifically, the program stops doing stage 1 when the benefit gained by doing additional stage 1 falls below the cost of doing it. Similarly with stage 2. Additional levels of TF reduce the benefit of P-1, which means that these cross-over points are reached earlier.[/QUOTE] Many thanks. David PS are you the inventor of P-1? |
[QUOTE=Christenson;263872]Sorry I didn't have any numbers....hoping that the extra TF will allow P-1 to look for larger factors...[/QUOTE]
Unfortunately it doesn't work like that. Information about the TF level is used in the calculation of the P-1 probability of success, which in turn influences the optimal bounds - in a negative direction. What you want is for P-1 to find more large factors by somehow skipping small ones. Unfortunately there is no way for this algorithm to do this. It could work in the opposite direction: You could have the TF factoring algorithm skip "smooth" potential factors which P-1 would find. However the cost of testing each potential factor for smoothness would greatly exceed the cost of doing them all. |
[QUOTE=davieddy;263876]PS are you the inventor of P-1?[/QUOTE]
Good Heavens, no, just a fan of it. The algorithm was first publicly described by [url=http://en.wikipedia.org/wiki/John_Pollard_(mathematician)]John Pollard[/url] in 1974, though I have read (don't ask me where) that it was known to Selfridge and Lehmer before then. I started doing P-1 work near exclusively for GIMPS shortly after the function was included within the client. At that time it was not possible to obtain this kind of work from the server. I would take Test assignments and manually convert them to Pfactor, unreserving them when done. Few, if any, other GIMPS participants worked this way, so Mr. P-1 seemed a reasonable moniker when I joined the forum. It's far less justifiable now. |
[QUOTE=Mr. P-1;263878]Good Heavens, no, just a fan of it.
The algorithm was first publicly described by [URL="http://en.wikipedia.org/wiki/John_Pollard_(mathematician)"]John Pollard[/URL] in 1974, though I have read (don't ask me where) that it was known to Selfridge and Lehmer before then. I started doing P-1 work near exclusively for GIMPS shortly after the function was included within the client. At that time it was not possible to obtain this kind of work from the server. I would take Test assignments and manually convert them to Pfactor, unreserving them when done. Few, if any, other GIMPS participants worked this way, so Mr. P-1 seemed a reasonable moniker when I joined the forum. It's far less justifiable now.[/QUOTE] Modesty here is a sensible attitude. There are usually some folk who know more than you! Now what sort of music floats your boat? David OK [url=http://www.youtube.com/watch?v=pyj2qL-bQ4E]Silence is Golden[/url] |
Dumb question for everyone, especially P95.
If we fool with the format of results.txt...making it state that it's [mfaktc 0.17 barret79_mul32] or whatever on the factor found lines, should we also make it report the assignment key on the same line? Thanks |
Poisson distribution is a bee... /stings like a bee, anyway/
Just to see what people were obsessing about, I tried some mfaktc'ing - and [I]of course,[/I] there were no factors for more than a hundred tests (these were 69ers, slow and annoying). And then ...one, then three more within an hour (granted, I've made the search a bit faster by finding a 64-to-65-bit niche :-). [CODE]309927157  F  2011-06-16 06:06  0.2  22510517584353120737  0.0017
309926621  F  2011-06-16 06:06  0.2  30955062015569795249  0.0089
309926359  F  2011-06-16 06:06  0.2  47315537535696427297  0.0307
309932159  F  2011-06-16 03:49  0.0  22436461899374945833  0.0070[/CODE] Too easy. The program seems to work fine :rolleyes: |
Yep.
If any "conspiracy theorist" thinks it's missing factors, the proof of this couldn't be simpler to come up with! David |
Another point is that suggesting a TF program is "slightly broken"
is akin to saying a woman is "a bit pregnant"! David |
The arguments for being "slightly broken" are as follows:
a) doesn't quite tickle the server optimally when results are reported manually
b) requires manual care and feeding, instead of being able to be told to go get work from the server and have results show up on the server automagically
c) mfaktc uses the CPU to sieve, so you need a decent CPU core to feed a good GPU card
d) it can break if interrupted...needs to keep multiple checkpoint files for when working on large jobs
e) once those issues are fixed, I'd argue the program is perfect....all of these have to do with care and feeding. |
[QUOTE=Christenson;263907]c) mfaktc uses the CPU to sieve, so you need a decent CPU core to feed a good GPU card.
[/QUOTE] Can't quite understand why sieving is an issue. ~2005 I wrote TF and LL programs and proved that M29? was M29. Really fun excercise BTW. Can't remember precisely how I timed it, but optimal sieving took a negligible time compared with the division. As for all the "high level hassles" - aren't and never could be bothered with them. Tell the "OS" to shut up, invoke "protected mode", then get on with it. David PS have you heard that "8 bit tune" yet? |
Would it be possible to split mfaktc into two programs? One which makes the candidates and writes them to disk, and the other which passes the candidates to the GPU. This would completely remove being CPU-bound, as long as enough computers are available (in theory on mersenneforum, not just one person, depending on file size). |
|
[QUOTE=henryzz;263939]Would it be possible to split mfaktc into two programs? One which makes the candidates and writes them to disk and the other which pass the candidates to the gpu.[/QUOTE]
Possible but not feasible. :sad: Currently mfaktc needs[LIST][*]one 8-byte integer [B]per grid[/B] (k_base in src/tf_common.cu)[*]one 4-byte integer per [B]factor candidate[/B] (XXX_ktab[] in src/tf_common.cu)[/LIST]So 100M candidates per second need 400MB/sec written to / read from disk. If you manage to reduce the needed bandwidth per factor candidate to one 1-byte integer, you'll need 100MB/sec. 1 byte per FC is easy if you evaluate those FCs serially, but not so easy if you need them highly parallel and independent. But even if you get it down to 1 bit per FC, you'll need 12.5MB/sec for 100M candidates per second. Oliver |
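Oliver's bandwidth figures check out directly:

```python
# Streaming 100M factor candidates per second to/from disk at the
# three encodings Oliver mentions (4 bytes, 1 byte, 1 bit per FC).
rate = 100_000_000  # factor candidates per second

def mb_per_sec(bits_per_fc: int) -> float:
    return rate * bits_per_fc / 8 / 1_000_000

for bits in (32, 8, 1):
    print(f"{bits:>2} bits/FC -> {mb_per_sec(bits):5.1f} MB/sec")
# 32 -> 400.0, 8 -> 100.0, 1 -> 12.5
```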
[QUOTE=henryzz;263939]Would it be possible to split mfaktc into two programs? One which makes the candidates and writes them to disk and the other which pass the candidates to the gpu. This would remove completely being cpu-bound. As long as enough computers are available(in theory on mersenneforum not just one person depending on file size).[/QUOTE]
I think the number of candidates makes it sensible to make them "on the fly". I worked with batches of 15,015 x 8 bits for reasons anyone who has tried sieving (which I know includes you!) will understand. David |
[QUOTE=davieddy;263944]I think the number of candidates makes it sensible to make them
"on the fly". I worked with batches of 15,015 x 8 bits for reasons anyone who has tried sieving (which I know includes you!) will understand. David[/QUOTE] Good point, the volume is just too great. Even if the storage space were available, the disk drive would struggle to keep up. |
[QUOTE=Christenson;263907]The arguments for being "slightly broken" are as follows:
a) doesn't quite tickle the server optimally when reports are resulted manually..... b) requires manual care and feeding, instead of being able to be told to go get work from the server, and having results show up on the server automagically. c) mfaktc uses the CPU to sieve, so you need a decent CPU core to feed a good GPU card. d) It can break if interrupted...needs to keep multiple checkpoint files for when working on large jobs. e) Once those issues are fixed, I'd argue the program is perfect....all of these have to do with care and feeding.[/QUOTE] You're talking about the biggest improvement in speed for a given cost basis (both initial and ongoing) since project inception, and you're complaining. Sheesh, tough crowd.
a) Not sure what you exactly mean here; I take it you're annoyed when your video card does the equivalent of x GHz-days of work and you get x/100 credit because it found a 'cheap' factor. Seriously, deal with it. We're all in the same boat here. And I'm assuming it's the same as CPU TF work.
b) Write a lynx script if it annoys you (it's not difficult); I'm hardly a script guru and I did one in a weekend. I'm a fan of what I suggested previously - prime95 having a generic extensions option, rather than writing custom code to sit on top of mfaktc. As a general rule, custom code always costs more in the long run than using generic.
c) So it's using the best of both worlds. Anything else is a compromise. A decent video card requires a decent PC. Match appropriately and you shall be rewarded.
d) Do a breadth-first search rather than a depth-first search. Look at my stats above - more factors are found doing a large number of small TFs than doing large bit-depth TFs. Or do a periodic rsync to another medium if it really concerns you. If you're worried about GPU efficiency, combining some of the earlier stages with "Stages=0" gives similar efficiencies to doing larger bit depths.
e) All the issues above have workarounds if they concern you.
I have nothing but praise for mfaktc and the performance it's getting; it's awesome. I'm getting 10x the results for a similar cost basis. I'd buy Oliver a beer if we were near each other. :) I guess I'm getting defensive at the 'slightly broken' phrase. If you'd said "here's a list of suggested improvements" I probably wouldn't get so defensive. -- Craig |
Hi Craig,
[QUOTE=nucleon;263946] d) Do a breadth field search rather than a depth field search. Look at my stats above - more factors are found doing a large number of small TFs than doing large bit depth TFs. Or do a periodic rsync to another media if it really concerns you. If you're worried about GPU efficiency - combining some of the earlier stages with "Stages=0" gives similar efficiencies as doing larger bit depths. [/QUOTE] With Stages=1 mfaktc will combine "small" bit levels automatically; do you think I should increase the auto-combining limit a little bit? Oliver |
Nucleon, don't take offense...what you have to understand is that I code for a living, and code is never "perfect"....and I intend to address these opportunities by submitting the patches to Mr Oliver, "The Judger"...it's just taken me longer than I would like to find the "gwthread.c" routines in P95 to re-use, and I'm more distractible than I'd like. P95's code for communications is actually pretty straightforward, and indeed uses mutexes to keep lines, messages, and results in one piece between different threads. The only change I would make in P95 is to add a call to block further low-priority (communications thread) access to mutexes when a high-priority thread is within a few seconds of reporting a result and getting more work to do.
As for "splitting" mfaktc into sieving and factoring parts, in some sense it is split now. The issue is that the sieving is to some degree CPU-bound, since there are now several hundred parallel cores on the GPU that can consume the sieve's output to run TF tests. I would argue that it might be useful to move the sieving process onto the GPU. The underlying requirement is significant bandwidth, as described above, from the sieving process to the TF testing process. You wouldn't want the sieve output to cross the disk, just memory, which leads back to the single-process, multithreaded model now in use. It is a question of whether we can effectively (low-cost task switch, maybe staying off the main PCI bus) run heterogeneous threads on the GPU. To me the major problem with using up a significant part of a good CPU is that that CPU is taken away from other GIMPS work, particularly LL tests....as I calculated above, we might remove 10% additional candidates from the LL pool, so lots and lots of LL still has to be done. |
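For readers unfamiliar with the sieve being discussed: any factor q of M_p = 2^p - 1 has the form q = 2kp + 1 with q ≡ 1 or 7 (mod 8), and the CPU-side sieve discards candidates divisible by small primes before the GPU runs the expensive powmod test. A heavily simplified, hypothetical Python sketch (the real mfaktc sieve is C, organized into classes of k with a far larger small-prime table):

```python
# Hypothetical, heavily simplified version of mfaktc's CPU-side sieve.
# Any factor of M_p = 2^p - 1 has the form q = 2*k*p + 1 with
# q = 1 or 7 (mod 8); candidates failing either test never reach the GPU.

def candidate_factors(p, k_limit, small_primes=(3, 5, 7, 11, 13)):
    """Yield candidate factors q = 2*k*p + 1 that survive the sieve."""
    for k in range(1, k_limit):
        q = 2 * k * p + 1
        if q % 8 not in (1, 7):          # 2 must be a quadratic residue mod q
            continue
        if any(q % s == 0 for s in small_primes if q > s):
            continue
        yield q          # the GPU would now test whether pow(2, p, q) == 1

# The known factor 223 of M37 (k = 3) survives the sieve:
survivors = list(candidate_factors(37, 20))
print(223 in survivors, pow(2, 37, 223) == 1)  # True True
```

The bandwidth concern in the post is visible even here: the sieve is cheap per candidate but must emit a steady stream of surviving q values fast enough to keep hundreds of GPU threads busy with powmods.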
[QUOTE=nucleon;263946]
I guess I'm getting defensive at the 'slightly broken' phrase. If you said here's a list of suggested improvements. I probably wouldn't get so defensive. -- Craig[/QUOTE] It was I who compared "slightly broken" with being "slightly pregnant", thinking (narrow-mindedly) that a competent exhaustive search for factors was not likely to miss 30% of them. He (Christenson) was merely pointing out my naivety! Anyway, the mystery of the low factor discovery rate has been solved, with the realization that P-1 had already been done. David |
FWIW, we like mfaktc the way it is now. The UI is simple and getting work queued up is no problem. None of our computers used for this are networked externally, so we just load things up manually in two week chunks. Not having a GUI is a plus!
We have had only one issue overall, but that was due to user error. (The forum assistant responsible has been beaten mercilessly.) We have always liked the idea of programs doing one thing well, and chaining programs together to do what we want. |
The discussion and proposal has been to simply automate the fetching of work and reporting of results, optionally, just as mprime has an option to use or not use primenet.
I'm still feeding mfaktc manually, just that two weeks at a time seems like an awful lot of assignments to handle at once. |
[QUOTE=TheJudger;263948]Hi Craig,
With Stages=1 mfaktc will combine "small" bitlevels automatically, do you think I should increase the autocombining limit a little bit? Oliver[/QUOTE] Yeah, I noticed the auto-combine feature. It's easy for me to say "increase it", but I'm dealing with GTX 580s/460s. I'm guessing people with slower hardware would rather see frequent results than maximum efficiency. But by all means include something in the readme or in the ini file under Stages. The point of diminishing returns for combining seems to be around k-4 or k-3, where k is the last bit depth, for the exponents I looked at. -- Craig |
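The "k-4 or k-3" cutoff Craig observes falls out of the cost-doubles-per-bit assumption: each bitlevel below the last costs half as much as the one above it, so levels more than three or four bits down add almost nothing to the combined run. A small sketch of that arithmetic:

```python
# Why auto-combining stops paying off around k-3 or k-4 (assumption: TF cost
# for a bitlevel roughly doubles with each extra bit, cost(b) ~ 2^b).

def marginal_cost(b, k):
    """Runtime of bitlevel b as a fraction of the final bitlevel k's runtime."""
    return 2.0 ** (b - k)

# Each level below the last costs half as much as the one above it:
for b in range(70, 64, -1):
    print(b, marginal_cost(b, 70))
# By k-4 (here, 66) an extra combined level adds only ~6% of the final
# level's runtime, so combining deeper changes overall efficiency very little.
```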
IMHO, the problem with combining stages isn't bitlevels...it's time...that is, all of us like feedback that something has been completed. If it takes less than an hour, I'd tell you to combine it, otherwise, leave it separate.
Eric C. |
OK, I have a GTX480 with 1536 Meg of RAM...In a 25-30C room (probably 80F), per GPU-Z on Win7 64 bit, it's running no load (since I have to page back and find mfaktc0.17 to download to this machine), fans at 1575RPM, and 76C for the first GPU temperature and 70C for the second GPU temperature. Two monitors are connected, the primary through the DVI and the secondary through a DVI to VGA converter. The case is open; it's an ATX tower so the fan intakes point down right now.
Is this normal behavior for such a beast, or is it running hot? |
[QUOTE=Christenson;264034]OK, I have a GTX480 with 1536 Meg of RAM...In a 25-30C room (probably 80F), per GPU-Z on Win7 64 bit, it's running no load (since I have to page back and find mfaktc0.17 to download to this machine), fans at 1575RPM, and 76C for the first GPU temperature and 70C for the second GPU temperature. Two monitors are connected, the primary through the DVI and the secondary through a DVI to VGA converter. The case is open; it's an ATX tower so the fan intakes point down right now.
Is this normal behavior for such a beast, or is it running hot?[/QUOTE] I would say normal. Nvidia tries to minimize temperature variation on the GPU: no load => low fan speed, relatively high temperature. Once you increase the load, the temperature will rise to ~85-90°C (reference card/cooling) and the fan speed will be adjusted such that the temperature stays more or less constant. Oliver |
FWIW, under 95% load (average) our GTX570 GPUs never exceed 70°C. They typically run a little more than 65°C. The ambient temperature is (right now) about 84°F.
:max: |
[QUOTE=Prime95;263843]Ah, mystery solved! I should have thought of that myself. IIRC, P-1 will find somewhere in the neighborhood of 30-40% of the factors.[/QUOTE]
Assuming YRC this suggests that TF and P-1 don't "tread on each other's toes" too much. 3 extra bits of TF will reduce the factors found by P-1 by 3/70 * 35%. If P-1's hit rate is ~6%, does a reduction of 1.5% render it not worthwhile? David |
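David's arithmetic can be written out directly (using his stated assumptions: P-1 finds roughly 35% of factors, and one TF bitlevel near b = 70 has a ~1/70 chance of producing a factor):

```python
# Back-of-envelope for the TF / P-1 overlap described in the post
# (assumptions as stated there: P-1 would find ~35% of factors, and each
# TF bitlevel near b = 70 yields a factor with probability ~1/70).
extra_tf_bits = 3
tf_factor_chance_per_bit = 1 / 70     # ~1/b heuristic at b ~ 70
p1_share_of_factors = 0.35            # fraction of factors P-1 would catch

# Factors removed from P-1's pool by 3 extra bits of TF:
p1_loss = extra_tf_bits * tf_factor_chance_per_bit * p1_share_of_factors
print(round(p1_loss * 100, 1))  # ~1.5 percentage points off P-1's ~6% hit rate
```

So the question in the post is whether a P-1 hit rate cut from ~6% to ~4.5% still justifies the stage-1/stage-2 runtime; the overlap itself is small.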
I've started work on mfaktc-primenet. One question that has come up is as follows:
When working with Operation Billion Digits numbers, do I count them against DaysOfWorkToGet, since, strictly speaking, they aren't reportable to Primenet? I certainly don't count the OBD assignments when getting Primenet assignments that I intersperse in my OBD assignments.... When these get to the output spool file, they will have to be dropped, anyway. (They will still be in "results.txt") |
[QUOTE=ATH;263758]Use TeamViewer [URL="http://www.teamviewer.com/"]http://www.teamviewer.com/[/URL] instead of remote desktop. It's free for private use, and I just connected from work to my home computer and started mfaktc.[/QUOTE][url=http://www.logmein.com/]LogMeIn[/url] is also free for private use, and mfaktc starts fine.
|
crystal ball
[QUOTE=TheJudger;265202]M59715763 has a factor: 897894732457600805887
M59715763 has a factor: 947242006352100744497 found 2 factor(s) for M59715763 from 2^69 to 2^70 [mfaktc 0.18-pre1 barrett79_mul32] Oliver[/QUOTE] Can you already tell what we're going to expect from mfaktc 0.18? |
[QUOTE=Brain;265231]Can you already tell what we're going to expect from mfaktc 0.18?[/QUOTE]Hopefully I'm not breaking any NDAs :unsure:[quote]version 0.18-pre1 (2011-05-24)
- autoadjustment of SievePrimes is now less dependent on the gridsize and absolute speed. Instead of measuring the absolute (average) time waited per processing block (grid size), the relative time spent waiting for the GPU is now calculated. In the per-class output "avg. wait" is replaced by "CPU wait". - in all GPU kernels the functions cmp_72() and cmp_96() are replaced by cmp_ge_72() and cmp_ge_96(). Those cmp_ge_?? functions only check whether the first of two input numbers is greater than or equal to the second; cmp_?? checked whether it is smaller, equal or greater. A small performance improvement (< 1%) is possible for all GPU kernels. This was suggested by bdot on [url]www.mersenneforum.org[/url]. Thank you! - added even more debug code for CHECKS_MODBASECASE. The new code did not show any issues. :)[/quote] |
[QUOTE=James Heinrich;265232]Hopefully I'm not breaking any NDAs :unsure:[/QUOTE]
Not really.... My inspection of the diffs to 0.18-pre2 (I'm supposed to be developing the Primenet part) found:
1) Handles Ctrl-C reliably.
2) Slight improvement with the _ge functions mentioned above.
3) Minor code-style improvement: I told him we were scanning argv[0] (the name by which mfaktc is invoked) for arguments like -h, and he and I have both changed that in our tip revisions.
I don't know what Oliver has cooking; he seems busy ATM. I have cooking:
1) Automatic Primenet interaction using cURL and P95 code, controlled by the presence of Use_Primenet in the config (big; may not make this release except as a dummy).
2) Tacking [mfaktc-VERSION] onto everything in results.txt, including factors found. That makes it simple for the server to identify the work source. I'm also inclined to include the assignment key from worktodo.txt. This is straightforward and in reach.
Longer term: 3) Move sieving from the CPU to the GPU. It takes a lot of CPU to keep the GPU fully occupied; I'd prefer it only took a trivial amount so I can keep my CPU busy doing P-1 and LL tests.
Oliver might be thinking about multiple backup files for better reliability in the face of the inevitable system crashes. Oliver: don't hold off on making your changes; mine will be small enough to integrate via diff. |
Hi,
[QUOTE=Brain;265231]Can you already tell what we're going to expect from mfaktc 0.18?[/QUOTE] Evolution, not revolution. :smile: [QUOTE=James Heinrich;265232]Hopefully I'm not breaking any NDAs :unsure:[/QUOTE] You didn't sign an NDA, so you can't break one. As you know, I just asked you not to spread -pre versions around when I give them to you, and AFAIK you didn't. :smile: I like the open discussion here in this thread. [QUOTE=Christenson;265235]I don't know what Oliver has cooking, he seems busy ATM; I have cooking: 1) Automatic Primenet interaction using cURL and P95 code, controlled by the presence of a Use_Primenet in the config (big, may not make it this release except as a dummy)[/QUOTE] Busy? A little bit, but a little bit lazy, too. :wink: 0.18-pre3: added two functions to estimate the amount of work for the current assignment and for the whole worktodo file. This will become useful once automated primenet interaction is implemented. Eric, I'll send you this soon; I want to test it a little bit more. [QUOTE=Christenson;265235] Don't hold off on making your changes; mine will be small enough to integrate via diff.[/QUOTE] Yepp, and (as already said) putting your new functions into separate files makes it very easy. Eric: automated primenet interaction is planned for 0.1[B]9[/B], do you agree? Oliver P.S. Don't ask for any timeline. It's done when it's done! |
Oliver:
Exactly when the automated primenet interaction is released depends on me finishing it and how much other work you do. It happens when it happens. I hope to get the necessary changes into the main part of mfaktc on 0.18, but we shall see. I have to do them and Oliver has to fold them in along with the dummy version of mfaktc-primenet. My separate file needs a termination function so that we can ensure that libcurl is also informed to close connections. ********** As for estimating assignments, I was thinking that if we ensured there were 5 assignments in worktodo.txt, that would provide enough fodder for figuring out how much to get for next time. ******** |
Compute capability
I guess it does not happen often :-) but you probably should mention in the readme that mfaktc does not support compute capability 1.0. I went through getting the CUDA DLLs and installing new video drivers just to find out it does not support v1.0 :-(
BTW, is there a real problem with using v1.0, or has it just not been considered? Andriy |
Hi Andriy,
sorry, I'll add the information to the README.txt. I have just noticed that this kind of information is only available in the Changelog.txt and in the mfaktc article on mersennewiki.org ([url]http://mersennewiki.org/index.php/Mfaktc[/url]). Early versions ran fine on my 8800 GTX (G80 GPU, the only CC 1.0 GPU) but newer versions don't. There are (at least) two reasons why it doesn't work on CC 1.0 GPUs:[LIST][*]use of atomic instructions for access to the results array (this needs CC >=1.1)[*]I don't know what the problem is but at some point it stopped working on CC 1.0. :sad:[/LIST]From mfaktc Changelog.txt [CODE]- officially GPUs with compute capability 1.0 are not supported. AFAIK the only GPU affected is the G80 (8800 GTS 320, 8800 GTS 640, 8800 GTX, 8800 Ultra and their Quadro/Tesla variants (but not a 8800 GTS 512, this one is a G92 GPU)). The issue seems to be the synchronisation of the writes to *d_RES. _PERHAPS_ I'm able to fix this in future releases. BUT are there really many G80 GPUs out there? I think it is not worth the work (and yes, personally I own a 8800GTX). [/CODE] I won't place my money on a fix for CC 1.0 GPUs... Oliver |
[QUOTE=TheJudger;265318][LIST][*]use of atomic instructions for access to the results array (this needs CC >=1.1)[/LIST][/QUOTE] That is also the reason why my OpenCL version will not be able to run on ATI GPUs before HD5xxx - the older ones don't have atomics ...
I've given some thought to how to make this work without atomics, but implementing locks on GPUs is a little complicated. Without locks it will only work if we assume we find at most one factor per grid. However, I found hints on how to implement the locking. So if there's a big vote for supporting HD4xxx or maybe CUDA CC 1.0, I could add something to mfakto (and suggest it for mfaktc :smile: ). |
Oliver:
Is Bdot on the distribution list for pre-release mfaktc? Should he be? Just to let you know, Bdot: the plan for mfaktc is to add a few CPU-side calls to the current code for when factors are found (or not) or work is needed, and to use as much P95 code as possible (preferably whole source files, unmodified) to do the work. |
Hey, what's the preferred exchange rate for spaces to tabs?
And are we using the one true style or the other true style for braces? [I like mine to line up, but I'll follow whatever convention]. Add to list: More robust parsing of worktodo.txt....in parse.c. |
[QUOTE=TheJudger;265318]
Early versions ran fine on my 8800 GTX (G80 GPU, the only CC 1.0 GPU) but newer versions don't. [/QUOTE] Is it possible to get my hands on one of the earlier builds, or are they not good enough to use? I have two 8800 GTX cards and one HD4550. I wonder how much work they could do?.. |
[QUOTE=apsen;265336]I have two 8800 GTX and one HD4550. I wonder how much work could they do?..[/QUOTE]The 8800GTX should perform fairly close to my 8800GT; I get ~26GHz-days/day; assuming there was a stable version of mfaktc that performed similarly to recent builds, [i]and[/i] there's no performance penalty for Compute v1.0 implementation, I'd expect ~25-30 GHz-days/day from an 8800GTX. That was quite a few "ifs", however...
|
[QUOTE=apsen;265336]Is it possible to get my hands on one of the earlier builds or they are not good enough to use them?
I have two 8800 GTX and one HD4550. I wonder how much work could they do?..[/QUOTE] If you wade far enough back in this thread (it's 700 posts long!), you will find the early versions of mfaktc. You will want to visit the "Putting it All Together" thread, as the necessary windows DLL is very easy to find there. Finally, if you want to make a post on that thread with links to the posts with the early versions, I think Rodrigo will find that helpful. |
[QUOTE=Christenson;265360]If you wade far enough back in this thread (it's 700 posts long!), you will find the early versions of mfaktc. You will want to visit the "Putting it All Together" thread, as the necessary windows DLL is very easy to find there. Finally, if you want to make a post on that thread with links to the posts with the early versions, I think Rodrigo will find that helpful.[/QUOTE]
That would be great!! :tu::tu: Rodrigo |
[QUOTE=Christenson;265328]Oliver:
Should/is Bdot on the distribution list for pre-release mfaktc? Just to let you, Bdot, know, the plan for mfaktc is to add a few CPU-side calls to the current code for when factors are found or not or work is needed, and to use as much P95 code as possible (preferably, whole source files unmodified) to do the work.[/QUOTE] I received one pre-release version in order to get the signal handling into my stuff. However, mfakto is not yet ready to report results to primenet. Currently some GPUs do not correctly execute some kernels, and I may need a fix from AMD. Once it is ready, and George agrees, I can diff-merge your and Oliver's latest changes into my code so that we have the same type of communication with primenet. [QUOTE=apsen] ... and one HD4550. I wonder how much work could they do?.. [/QUOTE] With my current kernels that would be about 8-10 GHz-days/day. And the code compiles fine for HD4xxx when I just skip the atomic_inc ... |
[QUOTE=Christenson;265360]If you wade far enough back in this thread (it's 700 posts long!), you will find the early versions of mfaktc. [/QUOTE]
I've gone through the whole thread and the latest version that worked for me was 0.8. The later versions seem to be compiled for CC 1.1 or higher. But when I tried to compile 0.8 for Win64 on my own, I got "cudaStreamCreate() failed". What might I be doing wrong? |
[QUOTE=apsen;265613]I've went through the whole thread and the latest version that worked for me was 0.8. The later versions seem to be compiled for cc1.1 or higher. But when I tried to compile 0.8 for Win64on my own I'm getting "cudaStreamCreate() failed". What may I be doing wrong?[/QUOTE]
Hopefully you gave the locations for those versions to Rodrigo's thread...otherwise someone else in your shoes will end up walking the same long mile. At a guess, you need a copy of cudart.dll in the same directory as your executable and/or current working directory. See Rodrigo's thread for the pointers. At a second guess, your card can only support one stream at a time, so try telling it to use only one stream. To really know, you will need to get the error code from cudaStreamCreate, which will require you to program a bit. You will then have to go look it up in the Nvidia documentation. I can modify the code, but can't compile for Win32. Let me know if I need to do that. |
[QUOTE=Christenson;265628]Hopefully you gave the locations for those versions to Rodrigo's thread...otherwise someone else in your shoes will end up walking the same long mile.
[/QUOTE] No. I don't really have an answer yet. So far I could only tell that version 0.8 seems to be the best bet. [QUOTE=Christenson;265628] At a guess, you need a copy of cudart.dll in the same directory as your executable and/or current working directory. See Rodrigo's thread for the pointers. At a second guess, your card can only support one stream at a time, so try telling it to use only one stream. To really know, you will need to get the error code from cudaStreamCreate, which will require you to program a bit. You will then have to go look it up in the Nvidia documentation. I can modify the code, but can't compile for Win32. Let me know if I need to do that.[/QUOTE] There's problem with my compile. The downloaded 0.8 works - mine doesn't. The return code from cudaStreamCreate is 10200 and it's out of range of defined codes in cuda.h. To address your specific points: Yes I have cudart.dll - without it the program will not even start. The downloaded 0.8 works with 3 streams. It just occured to me that I may need to use even older CUDA toolkit... Maybe kjaget could chime in... BTW 0.8 is posted in post #280. |
10200 = 27D8....you sure you have the right return-type declared for cudaStreamCreate?
If you are just trying to run mfaktc, I'd be inclined to ignore the "I can't build it" problem. What do you hope to do with the modification? |
[QUOTE=Prime95;263851]I think that's unlucky but not suspicious. Xyzzy's was worrisome because it involved 9000 tests.[/QUOTE]
Some data from my latest two runs (regular TF runs as assigned from the primenet server in M58.xxx.xxx to M60.4xx.xxx) 1st batch[LIST][*]1956 assignments from 2^69 to 2^70[*]1932 no factor results[*]24 factor results ([B]25 factors[/B], one exponent has 2 factors between 2^69 and 2^70)[/LIST]Expected number of factors: 1956/69 = [B]~28.35[/B] 2nd batch[LIST][*]2089 assignments from 2^69 to 2^70[*]2051 no factor results[*]38 factor results ([B]38 factors[/B])[/LIST]Expected number of factors: 2089/69 = [B]~30.41[/B] I feel comfortable with these results. :smile: These runs included 300+ "no factor" results in a row as well as "5 factors from ~50 assignments" Oliver |
Putting some flesh on the bones
[QUOTE=TheJudger;265946]Some data from my latest two runs (regular TF runs as assigned from primenet server in M58.xxx.xxx to M60.4xx.xxx)
1st batch[LIST][*]1956 assignments from 2^69 to 2^70[*]1932 no factor results[*]24 factor results ([B]25 factors[/B], one exponent has 2 factors between 2^69 and 2^70)[/LIST]Expected number of factors: 1956/69 = [B]~28.35[/B] 2nd batch[LIST][*]2089 assignments from 2^69 to 2^70[*]2051 no factor results[*]38 factor results ([B]38 factors[/B])[/LIST]Expected number of factors: 2089/69 = [B]~30.41[/B] I feel comfortable with these results. :smile: These runs included 300+ "no factor" results in a row as well as "5 factors from ~50 assignments" Oliver[/QUOTE] I feel comfortable that your results in no way make one doubt the hypothesis that you conducted 4045 independent trials, the probability of a "success" being 1/69. (See the Kamasutra). 2 factors in one trial? 1/69[SUP]2[/SUP] = 1/4761. Found one. Tick. Expected 60 successes +/- 8. Found 62. Tick. Expected number of "gaps">300 = (68/69)[SUP]300[/SUP] *60 = 0.75 Probability of no such gaps e^-0.75 = 0.47. Found one. Tick. Probability of a gap <28 ~1/3. For 4 such gaps in succession, expected total gap ~50. Probability of 4 or more such gaps in a row =1/81 Found one in 60. Tick. Thoughtful comments on this analysis welcome. David |
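David's figures check out numerically; a quick sketch under his stated model (4045 independent trials, each succeeding with probability 1/69):

```python
import math

# Model from the post: 4045 independent assignments, each with a 1/69 chance
# of carrying a factor at the 69->70 bitlevel.
n, p = 4045, 1 / 69
mean = n * p                          # ~58.6, matching "expected 60 +/- 8"
sd = math.sqrt(n * p * (1 - p))       # ~7.6

# Expected number of 300+ "no factor" gaps among ~60 successes, and the
# chance of seeing none at all:
long_gaps = (68 / 69) ** 300 * 60     # ~0.75
p_no_long_gap = math.exp(-long_gaps)  # ~0.47

print(round(mean, 1), round(sd, 1), round(long_gaps, 2), round(p_no_long_gap, 2))
```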
[QUOTE=davieddy;265982]
2 factors in one trial? 1/69[SUP]2[/SUP] [/QUOTE] From experience, this might be off by a factor of 2 either way. Enough thinking for now! |
[QUOTE=davieddy;265982]
Probability of a gap <28 ~1/3. For 4 such gaps in succession, expected total gap ~50. Probability of 4 or more such gaps in a row =1/81 [/QUOTE] Expect 60*2/3=40 "long" gaps (>= 28). Each of them has a 1/81 chance of being followed by 4+ "short" gaps. So the expected number of runs of 4+ short gaps is 0.5. This might seem a strange way to approach the question of finding 5 factors in ~50 tests, but the Poisson distribution tells us that if we expect 50/69 factors in a randomly selected 50 tests, the probability of 5+ factors is 0.0009. 4045/50 = 81, so this way we would expect 0.0729 occurrences of 5 in 50. It is easy to see why this underestimates the likelihood of finding [B]some[/B] run of 5 factors in 50 tests, but [B]very hard to see how to adjust it.[/B] Note that this problem is the same as judging how lucky GIMPS has been to find 7 "short" gaps in a row between Mprimes. In this case, 50% of gaps are short and 50% are long (the boundary being an exponent ratio of ~1.3) My conclusion is that we expect 1 such run in 256 Mprimes: lucky yes, outrageous no. [QUOTE=davieddy;265984]From experience, 1/69[SUP]2[/SUP] might be off by a factor of 2[/QUOTE] Pretty sure it should be 1/2! * 1/69[SUP]2[/SUP] (Poisson again) David |
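The 0.0009 Poisson figure, and the 1/2! correction for two factors in one trial, can both be verified directly (same assumptions as the posts: lambda = 50/69 expected factors in a window of 50 tests, and 1/69 per single trial):

```python
import math

# Poisson check: with lam = 50/69 expected factors in a randomly chosen
# window of 50 tests, the chance of seeing 5 or more.
lam = 50 / 69
p_5_plus = 1 - sum(math.exp(-lam) * lam**k / math.factorial(k)
                   for k in range(5))
print(round(p_5_plus, 4))  # 0.0009, as stated in the post

# The "2 factors in one trial" term as a Poisson k=2 probability,
# confirming the 1/2! * (1/69)^2 correction:
p_two = math.exp(-1 / 69) * (1 / 69) ** 2 / 2
```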
:confused:
On my 4-core system, mfaktc 0.17 performance suffers if something is running on the other cores. I do not see that effect on the 2-core system with mfaktc 0.8. For example: if I run only mfaktc 0.17 (on core #4) I get about 93M/s. If I start one Prime95 worker on another core (say #1), it drops to about 75M/s. With two Prime95 workers (say #1 and #2), it drops to about 69M/s. With three Prime95 workers (say #1, #2 and #3), it drops to about 35M/s. A fourth worker has almost no effect, as mfaktc runs at higher priority. Win 7 x64, Q6600, GTX 465, mfaktc 0.17. The other computer (Win 7 x64, E5200, 8800 GTS, mfaktc 0.8) has a consistent output of about 26M/s no matter whether Prime95 is running or not. :confused: |
[QUOTE=apsen;266287]Q6600[/QUOTE]That's your problem. I also have a Q6600 and the multi-core performance is horrible. If you check your Prime95 performance on 1/2/3/4 cores you'll notice it will also scale badly -- as you load more cores the throughput of each drops.
|
[QUOTE=James Heinrich;266289]I also have a Q6600 and the multi-core performance is horrible. If you check your Prime95 performance on 1/2/3/4 cores you'll notice it will also scale badly -- as you load more cores the throughput of each drops.[/QUOTE]
What is the best overall performance? Use 3 cores and leave one idle? |