mersenneforum.org (https://www.mersenneforum.org/index.php)
-   News (https://www.mersenneforum.org/forumdisplay.php?f=151)
-   -   Oops - New Prime! (M49 related) (https://www.mersenneforum.org/showthread.php?t=20830)

Madpoo 2016-01-21 02:48

[QUOTE=cuBerBruce;423290]Yep. I tried running four M49-size exponents at the same time, each using one core. The ETAs were 29-30 days. When running fewer than four, the speed (per exponent) went up. With 3 exponents at 1 core each, I was getting 23-24 days for the ETAs. For 1 exponent on 1 core, the ETA was down to about 16.5 days.[/QUOTE]

Okay, that's probably what was going on.

Now that I actually have more time to look at it, I can see that the same computer that found M49 checked in another result on the same day in the 74M range, and checked in 2 more results just 5 days later in the 77M range. So I'm pretty sure it was working on 4 exponents at once.

That seems to have held true over the past few months... it would turn in 4 results at a time within days of each other.

Okay, mystery solved. :smile:
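
For anyone who wants to put numbers on the ETAs cuBerBruce quoted, here is a rough sketch that turns them into total throughput. It takes the midpoint of each reported range, which is my assumption, and the function name is made up for illustration.

```python
# ETAs from the quoted post, using range midpoints (an assumption).
runs = {
    4: 29.5,  # four workers, one core each: 29-30 days per exponent
    3: 23.5,  # three workers: 23-24 days per exponent
    1: 16.5,  # one worker on one core: about 16.5 days
}


def throughput(workers: int, eta_days: float) -> float:
    """Exponents completed per day, summed over all workers."""
    return workers / eta_days


# Baseline: what perfect scaling from a single worker would give per worker.
single = throughput(1, runs[1])

for workers, eta in runs.items():
    total = throughput(workers, eta)
    efficiency = total / (workers * single)
    print(f"{workers} worker(s): {total:.3f} exponents/day, "
          f"{efficiency:.0%} of perfect scaling")
```

Going from three workers to four adds very little total throughput on these figures, which fits the memory bottleneck explanation.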

Madpoo 2016-01-21 02:50

[QUOTE=Dubslow;423291]Massive, massive memory bottleneck. Quite a shame, really.[/QUOTE]

That reminds me of an experiment I did a couple weeks ago. I rounded up samples of different servers with different memory speeds. DDR3 at 1066 - 1600 and then DDR4 at 2100.

The memory speed has a big impact on how multiple workers of different sizes get along. I'll just say that DDR4 @ 2100 is really really awesome. Nuff said.

science_man_88 2016-01-21 03:02

[QUOTE=Mark Rose;423301][YOUTUBE]tlpYjrbujG0[/YOUTUBE][/QUOTE]

I hope the "how the search works" one goes into detail.

Madpoo 2016-01-21 03:05

[QUOTE=Dubslow;423298]It happens to me on far more things than just numbers :razz:[/QUOTE]

We'll just call 'em brain farts and be done with it. :smile:

Imagine how George and I felt when we figured out a prime was hiding in the data for nearly 4 months. Ugh.

I checked again today to make sure... nothing else. Phew.

retina 2016-01-21 03:29

[QUOTE=Madpoo;423309]We'll just call 'em brain farts and be done with it. :smile:

Imagine how George and I felt when we figured out a prime was hiding in the data for nearly 4 months. Ugh.

I checked again today to make sure... nothing else. Phew.[/QUOTE]

Why you no test the code?

[size=1]Whenever one of my minions comes to me and claims his latest stuff works I say "Show me it works, don't just tell me". So they go off and write a test suite for it and, shock, it doesn't work![/size]

cuBerBruce 2016-01-21 04:31

[QUOTE=Madpoo;423303]That reminds me of an experiment I did a couple weeks ago. I rounded up samples of different servers with different memory speeds. DDR3 at 1066 - 1600 and then DDR4 at 2100.

The memory speed has a big impact on how multiple workers of different sizes get along. I'll just say that DDR4 @ 2100 is really really awesome. Nuff said.[/QUOTE]

Yeah, I just have DDR3-1600. Is it at all likely that the "M49 machine" would have anything faster?

So far I haven't done much experimenting with running different size exponents together. Perhaps I'll look into that in the near future. Mostly I just put all 4 cores on one exponent on this machine.

Madpoo 2016-01-21 05:17

[QUOTE=retina;423313]Why you no test the code?[/QUOTE]

I wrote a quick and dirty sproc that looks for newly reported primes and will send an email. It tested okay when I told it to look back over the past couple years (so it would find the previous false positives from 2014/2015). Works good.

Now it's just a matter of a *real world* test when it's only looking at stuff in the past xx hours and running on a recurring schedule.

It should definitely work, but in my line of work where zero downtime is the goal, I'd add the typical caveats that it's not guaranteed... SQL agent may crash, the email services may poop out, etc.

Best bet for those is to have overlapping periods of checking... run daily but report on anything new in the past 3-4 days. Yeah, if one does show up, at worst you get *too many* emails for the same thing. Better that than missing a day because some other thing went wonky.
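
The overlapping-window idea can be sketched roughly like this. It is not the actual sproc: the `(exponent, received_at)` layout, the function name, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta


def results_to_report(results, now, lookback_days=4):
    """Return the prime results received within the lookback window.

    `results` is a list of (exponent, received_at) tuples. This layout is
    hypothetical and is not the real server schema.
    """
    cutoff = now - timedelta(days=lookback_days)
    return [(exp, ts) for exp, ts in results if cutoff <= ts <= now]


# Made-up example: a result arrives on Jan 9, and the Jan 10 run never happens.
results = [(12345, datetime(2016, 1, 9, 18, 0))]

# The Jan 11 run still sees it, because its window reaches back four days.
print(results_to_report(results, now=datetime(2016, 1, 11)))

# With a one-day window the same run would have missed it.
print(results_to_report(results, now=datetime(2016, 1, 11), lookback_days=1))
```

The cost is the duplicate emails mentioned above: the Jan 12 and Jan 13 runs would report the same result again unless something tracks what has already been sent.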

And of course there are the manual checks when I (or someone else) go through looking for weird things. I had previously done my own tests of all known false positives a year ago and I'm sure at some point I would have done it again, but maybe not for a while. It just happened that someone asked about it, so I checked sooner than I otherwise would have.

Nothing to say about that except like the subject of this thread says... "Oops".

The real trick is fixing the code that's supposed to send an email when the "is prime" result comes in. This other thing is a failsafe since that didn't seem to fire off.

I was afraid that was a result of the move to the new server in Aug 2014, like the PHP mail function didn't work... but it does, and I guess it's not a new thing after all. :( Nothing for that except put on the sleuth hat and trace the code.

Madpoo 2016-01-21 05:25

[QUOTE=cuBerBruce;423318]Yeah, I just have DDR3-1600. Is it at all likely that the "M49 machine" would have anything faster?

So far I haven't done much experimenting with running different size exponents together. Perhaps I'll look into that in the near future. Mostly I just put all 4 cores on one exponent on this machine.[/QUOTE]

If you were running multiple tests of 35M-37M exponents you'd probably be okay running 4 of them. Above 37M-38M and they just stop playing nice with each other. I think that has everything to do with the FFT size in question. 1920K FFT is okay, but when you get up to a 2M FFT things get hairy.

When testing larger exponents, my rule of thumb is to only run 2 tests at a time with exponents < 58M. A pair of 3M FFTs tends to do okay, but running more than 2 of those at once will flood memory.

If you run something larger than a 3M FFT on one worker, you can run another worker with an FFT smaller than 2M.

Is there a magic formula that ties those FFT sizes into the L1/L2/L3 cache sizes or the memory bandwidth? Probably, I just haven't gone beyond my empirical observations to work that out.

I do know that with DDR4 @ 2100 those rules are obsolete... I can run a pair of tests with FFT sizes of at least 8M on one and 4M on the other without any issues. That's as much as I've tested...haven't tried a combo larger than that.
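
As a starting point for that "magic formula" question, here is a back-of-the-envelope sketch of how much FFT data each combination above has to keep moving. It assumes one 8-byte double per FFT element and ignores twiddle tables and other overhead, so treat the numbers as lower bounds. The function names are made up for illustration.

```python
BYTES_PER_ELEMENT = 8  # assumption: one double-precision value per FFT element


def fft_data_mib(fft_length_k: int) -> float:
    """MiB of FFT data for a length given in K (1K = 1024 elements)."""
    return fft_length_k * 1024 * BYTES_PER_ELEMENT / 2**20


def combined_mib(fft_lengths_k: list[int]) -> float:
    """Total MiB of FFT data across all workers running at once."""
    return sum(fft_data_mib(k) for k in fft_lengths_k)


# Worker combinations mentioned in this post.
combos = {
    "4 x 1920K": [1920] * 4,
    "4 x 2M": [2048] * 4,
    "2 x 3M": [3072] * 2,
    "8M + 4M": [8192, 4096],
}

for label, lengths in combos.items():
    print(f"{label}: {combined_mib(lengths):.0f} MiB")
```

Every one of these totals is far larger than the L3 cache of a typical desktop CPU, so each iteration has to stream the data through main memory. The totals alone do not explain why 1920K and 2M behave so differently, though, so this is only a rough way to frame the question.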

LaurV 2016-01-21 05:43

[QUOTE=Dubslow;423294]A full marathon is more than 26 miles.[/QUOTE]
Whoops... I took it literally (I read that phrase as "half of the 13.2 miles length of a marathon"), without doing any calculation. Miles are not my native measuring sticks. I guess you (and Ernst later) are completely right about me being the idiot this time :blush:

S485122 2016-01-21 06:03

[QUOTE=Madpoo;423327]...
I do know that with DDR4 @ 2100 those rules are obsolete... I can run a pair of tests with FFT sizes of at least 8M on one and 4M on the other without any issues. That's as much as I've tested...haven't tried a combo larger than that.[/QUOTE]

You mean quad channel DDR4, the 6th generation i7 processors have only dual channel memory even if they support DDR4. There are some processors that support quad channel DDR3 (4820K, 4930K and 4960X) and their memory bandwidth is about 60 GB/s compared to that of 68 GB/s for the quad DDR4.

Jacob
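
The bandwidth figures above can be roughly reproduced from the usual theoretical peak formula: channels times 8 bytes per transfer times the transfer rate. The sketch below assumes DDR3-1866 and DDR4-2133 are the speeds behind the 60 and 68 GB/s numbers. That is an inference from the arithmetic and is not stated in the post, and the function name is made up.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers: float,
                        bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for a 64-bit (8-byte) channel."""
    return channels * bus_bytes * mega_transfers * 1e6 / 1e9


# Memory speeds here are assumptions chosen to match the quoted figures.
configs = {
    "dual DDR3-1600": (2, 1600),
    "quad DDR3-1866": (4, 1866),
    "dual DDR4-2133": (2, 2133),
    "quad DDR4-2133": (4, 2133),
}

for label, (channels, rate) in configs.items():
    print(f"{label}: {peak_bandwidth_gb_s(channels, rate):.1f} GB/s")
```

These are theoretical peaks; sustained bandwidth in practice is lower. On these numbers a dual channel DDR4-2133 setup has roughly half the peak bandwidth of the quad channel configurations, which is the point about channel count mattering as much as memory generation.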

kladner 2016-01-21 06:56

[QUOTE=S485122;423329]You mean quad channel DDR4, the 6th generation i7 processors have only dual channel memory even if they support DDR4. There are some processors that support quad channel DDR3 (4820K, 4930K and 4960X) and their memory bandwidth is about 60 GB/s compared to that of 68 GB/s for the quad DDR4.

Jacob[/QUOTE]

Gosh! You'll have to scrape by with a [URL="http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&IsNodeId=1&N=100007671%20600535697"]6 or 8 core Haswell[/URL]. :smile:

With something like [URL="http://www.newegg.com/Product/Product.aspx?Item=N82E16813132516"]this[/URL].


All times are UTC.