mersenneforum.org > Great Internet Mersenne Prime Search > Data > Marin's Mersenne-aries

Old 2016-10-08, 11:14   #1266
ATH
Einyen
 
Dec 2003
Denmark

7·11·41 Posts

Ok, the list was created on a Titan Black:

name GeForce GTX TITAN Black
Compatibility 3.5
clockRate (MHz) 1071
memClockRate (MHz) 3500
totalGlobalMem 6442450944
totalConstMem 65536
l2CacheSize 1572864
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 15
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment 512
deviceOverlap 1


I thought that the crossover points between FFTs were the same for different cards. Why would the crossover points depend on the hardware? They should depend only on the FFT size.

I know the <GPU> fft.txt is dependent on the GPU because it only shows the fastest FFTs for that card and ignores all the others. But my list is ALL the possible FFTs, so shouldn't the exponent limits be the same?

Last fiddled with by ATH on 2016-10-08 at 11:35
Old 2016-10-08, 12:03   #1267
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

I may be misrecalling. Maybe I'm thinking of -cufftbench.
Old 2016-10-08, 13:59   #1268
ATH
Einyen
 
Dec 2003
Denmark

7·11·41 Posts

Quote:
Originally Posted by Dubslow
I may be misrecalling. Maybe I'm thinking of -cufftbench.

Yes, I think so. -cufftbench generates the GPU-specific "<GPU> fft.txt" file, which keeps only the fastest FFTs, in increasing order, out of the long list of FFTs I posted. But I think that whenever an FFT, for example 2592K, makes it into your file because it is fast on your GPU, the "max exp" next to it should be the same every time: 2592 48471289
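
To make the "depends only on the FFT size" argument concrete, here is a minimal sketch that turns a "max exp" entry into average bits per FFT word. It assumes that "2592K" means 2592 * 1024 words; the function name is made up for illustration and is not part of CUDALucas.

```python
def bits_per_word(max_exponent: int, fft_length_k: int) -> float:
    """Average number of exponent bits carried by each FFT word.

    Assumption: an FFT length written as "2592K" means 2592 * 1024 words.
    """
    fft_words = fft_length_k * 1024
    return max_exponent / fft_words


# The "2592 48471289" entry quoted above works out to roughly 18.26 bits per word.
print(round(bits_per_word(48471289, 2592), 3))
```

If the limit really is set by how many bits each word can hold before roundoff gets too large, that ratio is a property of the FFT length, which would explain why the same number shows up on different cards.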

Last fiddled with by ATH on 2016-10-08 at 14:00
Old 2016-10-08, 15:02   #1269
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

7221₁₀ Posts

Could you run it a couple of times to confirm? It might be that the roundoff data changes randomly by a percent or two on each run, which could shift the max exponent by a percent or two.
Old 2016-10-08, 19:51   #1270
ATH
Einyen
 
Dec 2003
Denmark

3157₁₀ Posts

The numbers match a cufftbench run I did months ago, as well as benchmarks done with CUDALucas compiled with CUDA 6.5, 8.0 and several earlier versions.
Old 2016-10-09, 03:01   #1271
Madpoo
Serpentine Vermin Jar
 
Jul 2014

7·11·43 Posts

Quote:
Originally Posted by Dubslow
The FFT ranges are produced on a per card basis. What card was that file produced with, ATH?

Yes, the tests are shifted per GP2's comments.

Oh, well... phooey. Is it clLucas that doesn't do shift counts? One of those GPU programs doesn't do it, but now I'm not sure which one...
Old 2016-10-09, 04:37   #1272
Madpoo
Serpentine Vermin Jar
 
Jul 2014

7×11×43 Posts

Quote:
Originally Posted by Madpoo
Oh, well... phooey. Is it clLucas that doesn't do shift counts? One of those GPU programs doesn't do it but now I'm not sure which one...

The two recent bad results from AirSquirrels look like they were run by clLucas anyway, not CUDALucas.

Another sad thing I noticed... it doesn't look like either of those reports error codes. Every result from either clLucas or CUDALucas has an error code of zero (or none at all...)

Is that not a supported feature in those apps? Do they not keep track of rounding errors, repeatable or not, like Prime95 does?

My thought was that I could look for exponents done by whatever GPU app around certain FFT breakpoints, and if they showed evidence of rounding errors (by looking at the error code), maybe I could draw some data from that. But it looks like I hit a dead end. Bummer. If anyone can think of another way using available data, I can take a different tack, but I'm out of ideas there.

My best alternate approach was simply to look at all good/bad results from those GPU apps, broken down into 1e6 chunks, and see if there are any spikes in bad results. Sadly there aren't a lot of results to chew on, so the data is a bit choppy. I think I can only compare *known* good/bad results, not any of the unknown/unverified stuff. There just aren't that many in any given 1M range of exponents. For clLucas, 38M had the most, with 483 known results (3 of which were bad).

In other words, just a dearth of data, but here's what I have... the "range" is the 1e6 range, i.e. 40 = 40M-41M
Code:
clLucas
Range	Bad	Total	PercentBad
34	1	51	1.96078431372549
35	3	88	3.40909090909091
36	6	382	1.57068062827225
37	2	298	0.671140939597315
38	3	483	0.62111801242236
39	3	213	1.40845070422535
40	5	77	6.49350649350649
43	1	28	3.57142857142857
44	1	27	3.7037037037037
56	2	27	7.40740740740741
57	1	56	1.78571428571429
58	3	36	8.33333333333333
59	2	31	6.45161290322581
60	1	6	16.6666666666667
64	1	2	50
Code:
cudaLucas
Range	Bad	Total	PercentBad
1	1	9	11.1111111111111
22	1	2	50
25	4	180	2.22222222222222
26	59	621	9.50080515297907
27	18	651	2.76497695852535
28	4	165	2.42424242424242
29	22	511	4.30528375733855
30	23	300	7.66666666666667
31	20	468	4.27350427350427
32	7	609	1.14942528735632
33	41	775	5.29032258064516
34	22	1254	1.75438596491228
35	32	949	3.37197049525817
36	21	1159	1.81190681622088
37	25	856	2.92056074766355
38	22	655	3.3587786259542
39	8	358	2.23463687150838
40	2	361	0.554016620498615
41	4	196	2.04081632653061
42	2	127	1.5748031496063
44	1	77	1.2987012987013
45	11	185	5.94594594594595
46	3	77	3.8961038961039
47	2	145	1.37931034482759
48	1	216	0.462962962962963
49	1	56	1.78571428571429
50	1	22	4.54545454545455
51	2	16	12.5
52	2	16	12.5
53	1	17	5.88235294117647
55	1	75	1.33333333333333
56	6	63	9.52380952380952
57	3	65	4.61538461538462
58	5	56	8.92857142857143
59	6	70	8.57142857142857
60	2	33	6.06060606060606
61	4	16	25
62	3	8	37.5
63	1	14	7.14285714285714
64	2	27	7.40740740740741
66	5	15	33.3333333333333
67	4	23	17.3913043478261
68	6	32	18.75
69	24	38	63.1578947368421
71	1	16	6.25
72	3	28	10.7142857142857
73	56	133	42.1052631578947
74	3	45	6.66666666666667
75	2	19	10.5263157894737
76	1	17	5.88235294117647
77	2	12	16.6666666666667
78	27	43	62.7906976744186
83	1	5	20
100	1	2	50
Code:
Prime95 (all variations/versions)
Range	Bad	Total	PercentBad
0	71	42225	0.168146832445234
1	526	57402	0.91634437824466
2	1155	46091	2.50591221713567
3	1436	46194	3.10862882625449
4	1787	45727	3.90797559428784
5	1947	46544	4.18313853557924
6	1824	46107	3.95601535558592
7	2065	45154	4.57323825131771
8	1932	46182	4.18344809666104
9	1610	45176	3.5638392066584
10	1724	45538	3.78584918090386
11	1869	45772	4.08328235602552
12	2061	46275	4.45380875202593
13	2061	46221	4.45901213734017
14	2176	46500	4.67956989247312
15	2585	47213	5.47518691885709
16	2286	46760	4.88879384088965
17	2218	46189	4.80200913637446
18	2097	46386	4.52076057431121
19	1997	45327	4.40576256977078
20	1857	45831	4.05184263926164
21	1822	46603	3.90961955238933
22	1887	45706	4.1285608016453
23	1786	44944	3.97383410466358
24	1698	45154	3.76046418921912
25	1698	44697	3.79891267870327
26	1785	45222	3.94719384370439
27	1699	44475	3.82012366498033
28	1542	44977	3.42841896969562
29	1394	43756	3.18584879787915
30	1416	43883	3.22676207187294
31	1472	43617	3.37483091455167
32	1384	43091	3.21180757002622
33	2413	43253	5.5788037823966
34	1804	42355	4.25923739818203
35	1773	42215	4.1999289352126
36	1474	41028	3.59266842156576
37	1418	41343	3.4298430205839
38	1397	30705	4.54974759811106
39	990	22777	4.34648988014225
40	1106	21589	5.12297929501135
41	774	16129	4.7988095976192
42	466	15305	3.04475661548514
43	486	16052	3.02766010465986
44	394	12502	3.15149576067829
45	463	9587	4.82945655575258
46	459	3639	12.6133553173949
47	371	2039	18.1951937224129
48	239	1702	14.042303172738
49	131	2756	4.75326560232221
50	76	753	10.0929614873838
51	47	767	6.1277705345502
52	43	429	10.02331002331
53	37	685	5.4014598540146
54	28	567	4.93827160493827
55	100	778	12.853470437018
56	82	491	16.7006109979633
57	110	654	16.8195718654434
58	90	1288	6.98757763975155
59	86	487	17.6591375770021
60	95	624	15.224358974359
61	81	716	11.3128491620112
62	69	652	10.5828220858896
63	64	694	9.22190201729107
64	53	618	8.57605177993528
65	46	709	6.48801128349788
66	59	532	11.0902255639098
67	70	961	7.2840790842872
68	33	739	4.46549391069012
69	49	730	6.71232876712329
70	28	229	12.2270742358079
71	18	295	6.10169491525424
72	10	420	2.38095238095238
73	9	330	2.72727272727273
74	20	304	6.57894736842105
75	18	168	10.7142857142857
76	24	164	14.6341463414634
77	9	95	9.47368421052632
78	5	73	6.84931506849315
79	2	19	10.5263157894737
80	1	4	25
89	1	3	33.3333333333333
100	4	18	22.2222222222222
101	1	4	25
150	1	6	16.6666666666667
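
Since many of these ranges have only a handful of results, one way to judge whether a spike means anything is to put an interval around each percentage. The following is a rough sketch using the standard Wilson score interval at about 95% confidence; the function name and the choice of rows are only for illustration, with the counts copied from the tables above.

```python
import math


def wilson_interval(bad: int, total: int, z: float = 1.96):
    """Wilson score interval (as fractions) for a bad-result rate of bad/total."""
    p = bad / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# (program, range in millions, bad, total) taken from the tables above
rows = [
    ("clLucas", 38, 3, 483),
    ("clLucas", 64, 1, 2),
    ("cudaLucas", 69, 24, 38),
    ("cudaLucas", 73, 56, 133),
]
for program, rng, bad, total in rows:
    low, high = wilson_interval(bad, total)
    print(f"{program} {rng}M: {100 * bad / total:.1f}% bad "
          f"(interval {100 * low:.1f}% to {100 * high:.1f}%)")
```

With this, a 50% rate from 1 bad result out of 2 comes with a very wide interval and says little, while the 69M cudaLucas range (24 of 38) stays high even at the low end of its interval.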
Old 2016-10-09, 06:04   #1273
ATH
Einyen
 
Dec 2003
Denmark

7×11×41 Posts

CUDALucas 2.05 does have shift counts, but I heard that some earlier versions of CUDALucas did not. I'm not sure which version started having them.

It also has roundoff checking. From the ini file:

# ErrorIterations tells how often the roundoff error is checked. Larger values
# give shorter iteration times, but introduce some uncertainty as to the actual
# maximum roundoff error that occurs during the test. Default is 100.
ErrorIterations=100
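
The trade-off described in that comment can be shown with a toy sketch. It assumes the setting means one check every ErrorIterations iterations; the per-iteration error values are invented for illustration, and this is not how CUDALucas itself is implemented.

```python
def sampled_max_roundoff(errors, error_iterations):
    """Return (true_max, sampled_max) for a list of per-iteration roundoff errors.

    sampled_max only looks at every error_iterations-th iteration
    (iterations are counted from 1), so it can miss the real maximum.
    """
    true_max = max(errors)
    sampled = errors[error_iterations - 1::error_iterations]
    sampled_max = max(sampled) if sampled else 0.0
    return true_max, sampled_max


# Hypothetical run: 1000 iterations at 0.10, with one spike at iteration 251.
errors = [0.10] * 1000
errors[250] = 0.40

print(sampled_max_roundoff(errors, 100))  # the spike falls between checks
print(sampled_max_roundoff(errors, 1))    # checking every iteration catches it
```

So a larger interval saves time per iteration, but a single bad iteration between checks goes unnoticed.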


It does not, however, report any error codes:

M( 43728863 )C, 0xc5686ccd71b894__, offset = 21872935, n = 2592K, CUDALucas v2.05.1, AID: 5D5719BA469FA6F7CBC2FD6BE6087FDA
M( 43883923 )C, 0x53ad50181724cb__, offset = 1282, n = 2592K, CUDALucas v2.05.1, AID: A809A786CBC695E6A5D02F4764E00769
M( 50152231 )C, 0x8d3a32c3ceffa6__, offset = 25077601, n = 2744K, CUDALucas v2.05.1, AID: C1D57ECE79CCA42481E0B18EE969D425
M( 76092673 )C, 0xb094499608396f__, offset = 38056420, n = 4320K, CUDALucas v2.05.1
M( 50198279 )C, 0x9b5c83ca3b592c__, offset = 2702, n = 2744K, CUDALucas v2.05.1, AID: 586AC98F222EF9EC7CE0F7E6FED2AE6D
M( 43538569 )C, 0x7d762f3dac0893__, offset = 12158, n = 2592K, CUDALucas v2.05.1
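
For anyone who wants to pull the shift count and FFT size out of result lines like these, here is a small parsing sketch. The regular expression is only fitted to the six lines shown above and may not cover other CUDALucas versions or other programs; the field names are my own.

```python
import re

# Pattern fitted to the sample lines above; other formats are not covered.
RESULT_LINE = re.compile(
    r"M\(\s*(?P<exponent>\d+)\s*\)C,\s*"
    r"0x(?P<residue>[0-9a-fA-F]+)_*,\s*"
    r"offset = (?P<offset>\d+),\s*"
    r"n = (?P<fft_k>\d+)K,\s*"
    r"(?P<program>\S+) v(?P<version>[\d.]+)"
    r"(?:,\s*AID:\s*(?P<aid>[0-9A-Fa-f]+))?"
)


def parse_result(line: str):
    """Return a dict of fields from one result line, or None if it does not match."""
    match = RESULT_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("exponent", "offset", "fft_k"):
        fields[key] = int(fields[key])
    return fields


line = ("M( 43728863 )C, 0xc5686ccd71b894__, offset = 21872935, n = 2592K, "
        "CUDALucas v2.05.1, AID: 5D5719BA469FA6F7CBC2FD6BE6087FDA")
result = parse_result(line)
print(result["exponent"], result["offset"], result["fft_k"])
```

A nonzero offset shows the test was shifted, and the exponent can be compared against the "max exp" listed for that FFT size (43728863 is below the 48471289 limit quoted earlier for 2592K).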

Last fiddled with by ATH on 2016-10-09 at 06:07
Old 2016-10-09, 12:00   #1274
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

2³×419 Posts

If I am not mistaken, Owftheevil put the shifts into cudaLucas starting after v2.03.
Old 2016-10-12, 23:37   #1275
airsquirrels
 
"David"
Jul 2015
Ohio

11·47 Posts

Could someone quad-check 72356909? I purchased two more Titan Blacks from eBay, but they appear to be flashed with an overclocked firmware. TDP is set at 300W and the clock is running at 967.
Old 2016-10-12, 23:46   #1276
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2·3·1,693 Posts

Quote:
Originally Posted by airsquirrels
If someone could quad-check 72356909, I purchased two more Titan Blacks from ebay, but they appear to be flashed with an overclocked firmware. TDP is set at 300W and the clock is running at 967.
Won't MSI Afterburner deal with the overclocking?

I'm afraid a 72M LL run would probably take me a couple of weeks on the GTX460. It takes about 3.5 days on a 40M to 41M DC. I imagine there are bigger guns around here who would answer sooner.
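
As a rough sanity check on that guess, here is a back-of-the-envelope sketch. It assumes run time grows like p² · log p (about p iterations, each costing roughly n log n with the FFT length n proportional to p) and takes 40.5M as the midpoint of the 40M to 41M range; both are my assumptions, and real timings depend on the FFT lengths actually chosen.

```python
import math


def estimate_ll_days(known_days: float, known_exponent: int, target_exponent: int) -> float:
    """Scale a known LL run time to another exponent, assuming time ~ p^2 * log(p)."""
    ratio = target_exponent / known_exponent
    log_ratio = math.log(target_exponent) / math.log(known_exponent)
    return known_days * ratio * ratio * log_ratio


# 3.5 days at an assumed 40.5M, scaled up to 72356909
days = estimate_ll_days(3.5, 40_500_000, 72_356_909)
print(f"about {days:.1f} days")
```

Under those assumptions the estimate lands somewhat over 11 days of uninterrupted running, which fits "a couple of weeks" in practice.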

Last fiddled with by kladner on 2016-10-12 at 23:49

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.