|
|
#23 |
|
"Curtis"
Feb 2005
Riverside, CA
4861₁₀ Posts |
If you didn't mess with the stage-1 norm, it picks a random subspace to search, so that multiple people can search the same coeff without (much) overlap. For this number, the default search region was broken into 7 pieces. If you watch the output fly by, on the line where a new coeff starts you'll see something like "searching #4 of 7 random sets".
I learned toward the end of this search that I was restricting stage1 norm so much that I was forcing it to search just a small part of the first of those subspaces. So, I left stage1 at default for my C163 search. |
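As a rough illustration of why random subsets keep overlap low, here is a small sketch. It assumes each run independently picks one of the 7 pieces uniformly at random, which the post above does not confirm; the function name is made up for this example.

```python
def expected_distinct_pieces(pieces, runs):
    """Expected number of different pieces hit by `runs` independent picks.

    Assumption (not confirmed by the thread): every run chooses one of
    `pieces` subsets uniformly at random, independently of the others.
    """
    miss_probability = (1.0 - 1.0 / pieces) ** runs  # a given piece is never picked
    return pieces * (1.0 - miss_probability)


PIECES = 7  # "searching #4 of 7 random sets"

for runs in (1, 2, 4, 7):
    covered = expected_distinct_pieces(PIECES, runs)
    print(f"{runs} run(s) on the same coeff: about {covered:.2f} of {PIECES} pieces covered")
```

Under that assumption two searchers land on the same piece only one time in seven, which fits the "without (much) overlap" description.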
|
|
|
|
|
#24 |
|
I moo ablest echo power!
May 2013
3351₈ Posts |
Ah, ok. I forgot about the random seeding, and I had my Stage 1 restricted slightly. Regardless, an excellent find!
|
|
|
|
|
|
#25 |
|
"Curtis"
Feb 2005
Riverside, CA
4,861 Posts |
C155 conclusions: I run np1, nps, npr all separately.
Tight stage-1 norms do not produce a higher quantity of quality stage 2 hits. I used a few choices from 1.5e22 to 8e22 for stage 1, finding that the rate of GPU hits varies quite a lot, but the fraction of these hits that produce an nps size of 3.5e20 or lower is no better; in fact, it was slightly worse for very tight stage 1 norms.
With stage1 set to 1.5e22, increasing the number of threads on my 460M resulted in faster data production: 2 threads was 30% faster than 1, 3 was 10% faster than 2, 4 was 5% faster than 3, and 5 and 6 matched 4 threads.
With stage1 set to 1.8e22, stage 1 hits were produced 45% faster than with 1.5e22. Again, -t 2 was 30% faster than 1 thread, while 3 and 4 threads matched 2 threads in hit rate.
I did not test threads for the default stage1 norm; I'll do that on the C157s posted in the other thread.
Summary: Use at least two threads for searches below 160 digits, and do not set a tight stage 1 bound.
Last fiddled with by VBCurtis on 2013-07-01 at 21:35 Reason: corrected percentages |
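To put the stage1 = 1.5e22 thread figures on one scale, the step-by-step gains can be compounded into a speedup relative to a single thread. This is only arithmetic on the percentages quoted above, read as "N threads versus N-1 threads"; the variable and function names are made up for the sketch.

```python
# Reported gain of N threads over N-1 threads (stage1 = 1.5e22, 460M).
# 5 and 6 threads "matched 4 threads", so their step gain is 1.00.
step_gains = {2: 1.30, 3: 1.10, 4: 1.05, 5: 1.00, 6: 1.00}


def cumulative_speedup(gains):
    """Return {threads: hit rate relative to 1 thread} by compounding the steps."""
    total = 1.0
    result = {1: total}
    for threads in sorted(gains):
        total *= gains[threads]
        result[threads] = total
    return result


speedups = cumulative_speedup(step_gains)
for threads, factor in speedups.items():
    print(f"-t {threads}: {factor:.2f}x the 1-thread hit rate")
```

Read this way, four threads give roughly 1.5x the single-thread rate, and most of that comes from the second thread.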
|
|
|
|
|
#26 |
|
I moo ablest echo power!
May 2013
29·61 Posts |
Interesting. Thanks for the information!
|
|
|
|
|
|
#27 |
|
"Curtis"
Feb 2005
Riverside, CA
4,861 Posts |
My best C163 find:
Code:
# norm 1.006380e-015 alpha -7.184964 e 1.015e-012 rroots 5
skew: 1125907.72
c0: 13552563320177965549083201722855195952
c1: 56748107658641416546177074853658
c2: -129936131458372038527876113
c3: -234012107772195153040
c4: 121278479274204
c5: 26142480
Y0: -8362756448659493213350626044999
Y1: 126296599858935253
I am done with these numbers, but willing to help with any C170 or larger searches. |
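For anyone who wants to handle a polynomial like the one above in a script, here is a minimal parsing sketch. The parser and helper names are hypothetical, not part of msieve. The last step uses the standard NFS property that the algebraic polynomial and the linear polynomial Y1*x + Y0 share a root modulo the number being factored, so their resultant should be a multiple of that number. The C163 itself is not quoted in this thread, so the sketch only prints the resultant's size.

```python
POLY_TEXT = """\
# norm 1.006380e-015 alpha -7.184964 e 1.015e-012 rroots 5
skew: 1125907.72
c0: 13552563320177965549083201722855195952
c1: 56748107658641416546177074853658
c2: -129936131458372038527876113
c3: -234012107772195153040
c4: 121278479274204
c5: 26142480
Y0: -8362756448659493213350626044999
Y1: 126296599858935253
"""


def parse_poly(text):
    """Split 'key: value' lines into algebraic coeffs, linear coeffs and skew."""
    coeffs, linear, skew = {}, {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# norm ...' comment line
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "skew":
            skew = float(value)
        elif key[0] == "c" and key[1:].isdigit():
            coeffs[int(key[1:])] = int(value)
        elif key[0] == "Y" and key[1:].isdigit():
            linear[int(key[1:])] = int(value)
    return coeffs, linear, skew


def resultant_with_linear(coeffs, linear):
    """Resultant of sum(c_i x^i) and Y1*x + Y0: sum of c_i * (-Y0)^i * Y1^(d-i)."""
    degree = max(coeffs)
    y0, y1 = linear[0], linear[1]
    return sum(c * (-y0) ** i * y1 ** (degree - i) for i, c in coeffs.items())


coeffs, linear, skew = parse_poly(POLY_TEXT)
degree = max(coeffs)
resultant = resultant_with_linear(coeffs, linear)
print(f"degree {degree}, skew {skew}")
print(f"resultant has {len(str(abs(resultant)))} digits")
```

If the input number is at hand, checking `resultant % N == 0` is a quick sanity test that the polynomial was copied without a transcription error.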
|
|
|
|
|
#28 |
|
"Frank <^>"
Dec 2004
CDP Janesville
100001001010₂ Posts |
Thank you all for the time spent on these. Unfortunately, it will probably take most of this summer to churn my way through the current queue, depending on the length of the current heat wave.
I had to throttle my hex core down to 2 or 3 active cores (depending on the job) to keep the system at a tolerable temp during the day.
|
|
|
|
|
|
#29 | |
|
Sep 2010
Scandinavia
3·5·41 Posts |
Quote:
I would probably subtract one from that number if I had a monitor connected to it at the same time. It worked something like this when I tried trial factoring Mersennes on my GPU. |
|
|
|
|
|
|
#30 |
|
Tribal Bullet
Oct 2004
DD5₁₆ Posts |
This problem is different from Mersenne trial factoring; we use a sorting library that automatically chooses how much work to give the card, and the amount chosen will nearly saturate the card every time because each block of work is large and a kernel launch has hundreds or thousands of blocks.
I think it's just a coincidence that best thread number = number of SMs here.
Last fiddled with by jasonp on 2013-07-03 at 12:10 |
|
|
|
|
|
#31 | |
|
Sep 2010
Scandinavia
267₁₆ Posts |
Quote:
The law of small numbers strikes again. |
|
|
|
|
|
|
#32 | |
|
"Frank <^>"
Dec 2004
CDP Janesville
2×1,061 Posts |
Quote:
Last fiddled with by schickel on 2013-10-05 at 03:08 Reason: adding note |
|
|
|
|
|
|
#33 | |
|
"Frank <^>"
Dec 2004
CDP Janesville
84A₁₆ Posts |
Quote:
Code:
Tue Nov 05 11:06:15 2013 matrix is 7330605 x 7330831 (2155.5 MB) with weight 554942282 (75.70/col)
Tue Nov 05 11:06:15 2013 sparse part has weight 491749598 (67.08/col)
Tue Nov 05 11:06:15 2013 using block size 65536 for processor cache size 6144 kB
Tue Nov 05 11:06:42 2013 commencing Lanczos iteration (4 threads)
Tue Nov 05 11:06:42 2013 memory use: 1909.6 MB
Tue Nov 05 11:07:40 2013 linear algebra at 0.0%, ETA 75h 3m
Tue Nov 05 11:07:59 2013 checkpointing every 100000 dimensions |
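The per-column figures in the log above can be cross-checked from the raw weights. The sketch below pulls the numbers out of the two matrix lines with regular expressions written for this example (they are not an msieve feature) and divides the weights by the column count. Dividing by columns rather than rows is an assumption taken from the "/col" label.

```python
import re

LOG_LINES = [
    "Tue Nov 05 11:06:15 2013 matrix is 7330605 x 7330831 (2155.5 MB) "
    "with weight 554942282 (75.70/col)",
    "Tue Nov 05 11:06:15 2013 sparse part has weight 491749598 (67.08/col)",
]

# Patterns written for these two log lines only.
MATRIX_RE = re.compile(
    r"matrix is (\d+) x (\d+) \(([\d.]+) MB\) with weight (\d+) \(([\d.]+)/col\)"
)
SPARSE_RE = re.compile(r"sparse part has weight (\d+) \(([\d.]+)/col\)")

matrix = MATRIX_RE.search(LOG_LINES[0])
sparse = SPARSE_RE.search(LOG_LINES[1])

rows, cols = int(matrix.group(1)), int(matrix.group(2))
total_weight = int(matrix.group(4))
sparse_weight = int(sparse.group(1))

# Assumption: "/col" means weight divided by the number of columns.
total_per_col = total_weight / cols
sparse_per_col = sparse_weight / cols

print(f"{rows} x {cols} matrix")
print(f"total weight per column:  {total_per_col:.2f} (log says {matrix.group(5)})")
print(f"sparse weight per column: {sparse_per_col:.2f} (log says {sparse.group(2)})")
```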
|
|
|
|