mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   mfakto: an OpenCL program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=15646)

KyleAskine 2012-01-30 00:30

[QUOTE=kracker;287595]+1 Here also.[/QUOTE]

Yes, it works fantastic on Win 7. But my Linux box bombed when I installed it. I had to roll back.

bcp19 2012-01-30 14:57

Is there a... dunno the right 'word'... a changeover in mfakto around 29.504-29.505M? I was just noticing that my GPU is taking now 51-55 minutes to complete 29.505M exp's when it had been taking 43 minutes to do 29.503M ones.

[code]Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4617/4620 | 271.32M | 2.718s | 99.82M/s | 5000 | 0m00s | 6169us
no factor for M29504119 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 43m 40.808s
got assignment: exp=29504159 bit_min=68 bit_max=69
tf(29504159, 68, 69, ...);
k_min = 5001801697200 - k_max = 10003603396367
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4612/4620 | 271.32M | 2.727s | 99.49M/s | 5000 | 0m00s | 6275us
no factor for M29504159 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 43m 41.343s
got assignment: exp=29504177 bit_min=68 bit_max=69
tf(29504177, 68, 69, ...);
k_min = 5001798643380 - k_max = 10003597293337
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4612/4620 | 271.32M | 2.727s | 99.49M/s | 5000 | 0m00s | 6264us
no factor for M29504177 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 43m 41.141s
got assignment: exp=29504227 bit_min=68 bit_max=69
tf(29504227, 68, 69, ...);
k_min = 5001790165680 - k_max = 10003580340517
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4617/4620 | 271.32M | 2.715s | 99.93M/s | 5000 | 0m00s | 6165us
no factor for M29504227 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 43m 41.050s
got assignment: exp=29504269 bit_min=68 bit_max=69
tf(29504269, 68, 69, ...);
k_min = 5001783046260 - k_max = 10003566100192
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4616/4620 | 271.32M | 2.727s | 99.49M/s | 5000 | 0m00s | 6235us
no factor for M29504269 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 43m 41.435s
got assignment: exp=29504351 bit_min=68 bit_max=69
tf(29504351, 68, 69, ...);
k_min = 5001769144680 - k_max = 10003538297770
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4609/4620 | 271.32M | 2.820s | 96.21M/s | 5000 | 0m00s | 6629us
no factor for M29504351 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 48m 29.711s
got assignment: exp=29504383 bit_min=68 bit_max=69
tf(29504383, 68, 69, ...);
k_min = 5001763720800 - k_max = 10003527448086
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4613/4620 | 271.32M | 2.787s | 97.35M/s | 5000 | 0m00s | 6448us
no factor for M29504383 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 52m 37.465s
got assignment: exp=29504399 bit_min=68 bit_max=69
tf(29504399, 68, 69, ...);
k_min = 5001761008860 - k_max = 10003522023253
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4617/4620 | 271.32M | 2.789s | 97.28M/s | 5000 | 0m00s | 6405us
no factor for M29504399 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 52m 11.283s
got assignment: exp=29504443 bit_min=68 bit_max=69
tf(29504443, 68, 69, ...);
k_min = 5001753552180 - k_max = 10003507104992
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4617/4620 | 271.32M | 2.832s | 95.80M/s | 5000 | 0m00s | 6659us
no factor for M29504443 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 56m 20.353s
got assignment: exp=29504507 bit_min=68 bit_max=69
tf(29504507, 68, 69, ...);
k_min = 5001742699800 - k_max = 10003485405784
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4617/4620 | 271.32M | 2.824s | 96.08M/s | 5000 | 0m00s | 6597us
no factor for M29504507 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 55m 39.322s
got assignment: exp=29504509 bit_min=68 bit_max=69
tf(29504509, 68, 69, ...);
k_min = 5001742362540 - k_max = 10003484727685
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4619/4620 | 271.32M | 2.823s | 96.11M/s | 5000 | 0m00s | 6633us
no factor for M29504509 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 51m 11.275s
got assignment: exp=29504569 bit_min=68 bit_max=69
tf(29504569, 68, 69, ...);
k_min = 5001732189300 - k_max = 10003464384765
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4616/4620 | 271.32M | 2.721s | 99.71M/s | 5000 | 0m00s | 6085us
no factor for M29504569 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 52m 12.215s
got assignment: exp=29504669 bit_min=68 bit_max=69
tf(29504669, 68, 69, ...);
k_min = 5001715238520 - k_max = 10003430480082
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4612/4620 | 271.32M | 2.818s | 96.28M/s | 5000 | 0m00s | 6580us
no factor for M29504669 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 54m 25.430s
got assignment: exp=29504677 bit_min=68 bit_max=69
tf(29504677, 68, 69, ...);
k_min = 5001713880240 - k_max = 10003427767718
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4619/4620 | 271.32M | 2.822s | 96.14M/s | 5000 | 0m00s | 6607us
no factor for M29504677 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 53m 29.768s
got assignment: exp=29504693 bit_min=68 bit_max=69
tf(29504693, 68, 69, ...);
k_min = 5001711168300 - k_max = 10003422342993
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4615/4620 | 271.32M | 2.846s | 95.33M/s | 5000 | 0m00s | 6649us
no factor for M29504693 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 51m 17.654s
got assignment: exp=29504773 bit_min=68 bit_max=69
tf(29504773, 68, 69, ...);
k_min = 5001697608600 - k_max = 10003395219456
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4616/4620 | 271.32M | 2.817s | 96.31M/s | 5000 | 0m00s | 6581us
no factor for M29504773 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 53m 33.822s
got assignment: exp=29504801 bit_min=68 bit_max=69
tf(29504801, 68, 69, ...);
k_min = 5001692859240 - k_max = 10003385726253
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4615/4620 | 271.32M | 2.757s | 98.41M/s | 5000 | 0m00s | 6250us
no factor for M29504801 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 53m 46.318s
got assignment: exp=29504863 bit_min=68 bit_max=69
tf(29504863, 68, 69, ...);
k_min = 5001682348740 - k_max = 10003364705653
Using GPU kernel "mfakto_cl_barrett79"
class | candidates | time | avg. rate | SievePrimes | ETA | avg. wait
4617/4620 | 271.32M | 2.831s | 95.84M/s | 5000 | 0m00s | 6653us
no factor for M29504863 from 2^68 to 2^69 [mfakto 0.09-Win mfakto_cl_barrett79]
tf(): total time spent: 51m 21.457s
[/code]
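
For readers wondering where the k_min / k_max lines in the log come from: any factor of M(p) has the form q = 2kp + 1, so a bit range 2^68 to 2^69 translates into a range of k. The sketch below is only an inference from the numbers printed above, not taken from the mfakto source. It assumes k_min is rounded down to a multiple of the 4620 classes and k_max is a plain floor division. The function name `k_range` is made up for illustration.

```python
# Hypothetical reconstruction of the k_min / k_max values seen in the log.
# Assumption: k_min is aligned down to a multiple of the class count (4620).

NUM_CLASSES = 4620  # class count shown in the log ("4617/4620")


def k_range(exponent, bit_min, bit_max, num_classes=NUM_CLASSES):
    """Return (k_min, k_max) so that q = 2*k*exponent + 1 covers 2^bit_min..2^bit_max."""
    step = 2 * exponent
    k_min = (1 << bit_min) // step
    k_min -= k_min % num_classes  # align to the start of a class block
    k_max = (1 << bit_max) // step
    return k_min, k_max


k_min, k_max = k_range(29504159, 68, 69)
print(f"k_min = {k_min} - k_max = {k_max}")
```

For exp=29504159 this reproduces the k_min and k_max printed in the log above, which suggests the k range is nearly identical for all of these exponents and is unlikely to explain the difference in runtime.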

bcp19 2012-01-31 15:45

mfakto just moved up to 29.507M exps and has dropped back down to 43 min per exp. Interesting/weird little bump in that small a range.

KyleAskine 2012-01-31 21:05

[QUOTE=KyleAskine;287696]Yes, it works fantastic on Win 7. But my Linux box bombed when I installed it. I had to roll back.[/QUOTE]

I think this was an installation issue. It now works.

Bdot 2012-02-01 00:09

[QUOTE=bcp19;287889]mfakto just moved up to 29.507M exps and has dropped back down to 43 min per exp. Interesting/weird little bump in that small a range.[/QUOTE]
I bet if you do the range again, it will be fast. I'd blame Windows and all the background tasks it is performing (indexer, backup, Wupdate, defender / virus scanning ...). Even if it is none of that I'd try blaming Windows ;-)

Bdot 2012-02-01 00:23

[QUOTE=KyleAskine;287930]I think this was an installation issue. It now works.[/QUOTE]
I have 12.1 running for 2 days on my Linux box now. No aborts nor other issues. So I think it was some old library still on the system, or old values in LD_LIBRARY_PATH.

bcp19 2012-02-01 00:56

[QUOTE=Bdot;287943]I bet if you do the range again, it will be fast. I'd blame Windows and all the background tasks it is performing (indexer, backup, Wupdate, defender / virus scanning ...). Even if it is none of that I'd try blaming Windows ;-)[/QUOTE]

I can't imagine what process would run for 16+ hours impacting core 4 without affecting other items. Core 1 running 332M LL stayed at .168ms/iter, Core 2 still averaged 20 min 41-46sec on mfaktc, core 3 running 45M LL stayed at .017 ms/iter. So, something impacted Core 4? M/s stayed at 97-99. Odd thing to me was I caught 2 exps ending, and after seeing the ~54min post, the first class printed said 43min to go.

Bdot 2012-02-01 12:49

[QUOTE=bcp19;287947]I can't imagine what process would run for 16+ hours impacting core 4 without affecting other items. Core 1 running 332M LL stayed at .168ms/iter, Core 2 still averaged 20 min 41-46sec on mfaktc, core 3 running 45M LL stayed at .017 ms/iter. So, something impacted Core 4? M/s stayed at 97-99. Odd thing to me was I caught 2 exps ending, and after seeing the ~54min post, the first class printed said 43min to go.[/QUOTE]
Even after thinking about this a little bit more, I have no explanation. I thought about the difference in the exponents. The barrett kernels need a tiny bit longer to process a "1" instead of a "0" in the binary representation of the exponent. Usually, the first 7 bits are preprocessed on the host, so they don't count. M29504269 has just 8 times "1", M29504399 has 10. I doubt this would really be measurable, and for sure it is not accountable for +25% runtime.
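
The bit counts quoted above can be checked with a few lines of Python. This is a minimal sketch, assuming "the first 7 bits" means the 7 most significant bits of the exponent; the helper name `ones_after_prefix` is made up for illustration.

```python
# Count the "1" bits of an exponent that remain after the leading bits
# which (per the post above) are preprocessed on the host.

def ones_after_prefix(exponent, prefix_bits=7):
    bits = bin(exponent)[2:]              # binary string without the "0b" prefix
    return bits[prefix_bits:].count("1")  # ignore the leading prefix_bits bits


for p in (29504269, 29504399):
    print(p, bin(p)[2:], ones_after_prefix(p))
```

Under that assumption the counts come out as 8 and 10, matching the numbers given for M29504269 and M29504399.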
mfakto 0.09 still wrote checkpoints after each class. You have 11min=660s more runtime. That is 660/960 ~ 0.7s more per class. As the reported times per class do not fluctuate by that much, it is quite likely that the delay is rather on the host code. If you don't have any task specifically pinned to core #4, and the other tasks are not affected, this really just leaves disk access as the culprit. Which mfaktc-version are you running? If that is before 0.18, then mfaktc would also write CPs after every class, so it should also be delayed by ~0.7s per class ... but if you did not switch to the latest mfakto-version, then I assume you also did not switch to the latest mfaktc-version. And if mfaktc < 0.18 was not affected I'm at the end of my knowledge/guesswork.
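
The 660/960 estimate can be reproduced from the posted log. The sketch below assumes that 960 is the number of classes out of 4620 that can actually contain a factor (candidates q = 2kp + 1 with q = 1 or 7 mod 8 and q not divisible by 3, 5, 7 or 11), and that the per-class time printed for the last class is representative of the whole run. The function names are made up for illustration.

```python
# Rough check of the per-class overhead, using values from the log above.

def surviving_classes(p, num_classes=4620):
    """Count classes k mod num_classes where q = 2*k*p + 1 can be a factor."""
    count = 0
    for k in range(num_classes):
        q = 2 * k * p + 1
        if q % 8 not in (1, 7):
            continue  # factors of Mersenne numbers are 1 or 7 mod 8
        if any(q % s == 0 for s in (3, 5, 7, 11)):
            continue  # q has a small prime factor, so the class is skipped
        count += 1
    return count


def overhead_per_class(total_seconds, class_seconds, classes):
    """Time per class that is not explained by the reported class time."""
    return (total_seconds - classes * class_seconds) / classes


classes = surviving_classes(29504443)
fast = overhead_per_class(43 * 60 + 40.808, 2.718, classes)  # M29504119
slow = overhead_per_class(56 * 60 + 20.353, 2.832, classes)  # M29504443
print(classes, round(fast, 3), round(slow, 3))
```

With these inputs the fast run shows almost no unexplained time per class, while the slow run shows roughly 0.7s per class, in line with the estimate above that the extra time is spent outside the reported per-class GPU time.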
BTW, both indexing and virus scan can take forever if you have lots of files. On a dev machine with some GB of subversion repositories (~500k files including the .svn ones), it did not finish within one day - I had to disable them.

bcp19 2012-02-01 19:13

[QUOTE=Bdot;287974]Even after thinking about this a little bit more, I have no explanation. I thought about the difference in the exponents. The barrett kernels need a tiny bit longer to process a "1" instead of a "0" in the binary representation of the exponent. Usually, the first 7 bits are preprocessed on the host, so they don't count. M29504269 has just 8 times "1", M29504399 has 10. I doubt this would really be measurable, and for sure it is not accountable for +25% runtime.
mfakto 0.09 still wrote checkpoints after each class. You have 11min=660s more runtime. That is 660/960 ~ 0.7s more per class. As the reported times per class do not fluctuate by that much, it is quite likely that the delay is rather on the host code. If you don't have any task specifically pinned to core #4, and the other tasks are not affected, this really just leaves disk access as the culprit. Which mfaktc-version are you running? If that is before 0.18, then mfaktc would also write CPs after every class, so it should also be delayed by ~0.7s per class ... but if you did not switch to the latest mfakto-version, then I assume you also did not switch to the latest mfaktc-version. And if mfaktc < 0.18 was not affected I'm at the end of my knowledge/guesswork.
BTW, both indexing and virus scan can take forever if you have lots of files. On a dev machine with some GB of subversion repositories (~500k files including the .svn ones), it did not finish within one day - I had to disable them.[/QUOTE]

I'm running .18 on mfaktc and .09 on mfakto. Disk access should not be a factor since it is being run from a ramdisk and I highly doubt my ram has a latency of .7s.

chalsall 2012-02-01 19:20

[QUOTE=bcp19;288004]I'm running .18 on mfaktc and .09 on mfakto. Disk access should not be a factor since it is being run from a ramdisk and I highly doubt my ram has a latency of .7s.[/QUOTE]

Things which make you go "Hmmmm... That's unusual..." are things which should be investigated.

Perhaps these exponents should be run again by another mfakto worker (using the same code) to see if the same behaviour is observed.

Bdot 2012-02-01 23:31

[QUOTE=chalsall;288005]Things which make you go "Hmmmm... That's unusual..." are things which should be investigated.

Perhaps these exponents should be run again by another mfakto worker (using the same code) to see if the same behaviour is observed.[/QUOTE]

Yes, you're right - if it is reproducible at all.

bcp19, could you please rerun one of the slow exponents, just to make sure it is something in mfakto? If it is slow again, then I'd like to know which Windows version you're running, and which Catalyst version, so I can set up the same ...


All times are UTC.