mersenneforum.org

LL with OpenCL (https://www.mersenneforum.org/showthread.php?t=18297)

kracker 2013-09-11 19:59

[QUOTE=VictordeHolland;352766]Thanks, now it's working!
Windows 7 x64
Catalyst 13.4
AMD HD7950 @800MHz (downclocked it to reference clocks)

Nice GPU load on 36,666,666!
See attached screenshot[/QUOTE]

Nice, that's around the speed of a GTX 680. I would be curious to see it at "stock clocks (OC)" as well, if you want. :smile:

VictordeHolland 2013-09-11 20:33

1 Attachment(s)
[QUOTE=kracker;352770]Nice, around speed of GTX 680. I would be curious to see at "stock clocks(OC)" as well, if you want. :smile:[/QUOTE]
Indeed, not bad for a card that costs €50 less!
Sure!
HD7950 @900MHz

VictordeHolland 2013-09-11 21:03

LL territory test
 
I tested the first 20000 iterations of M63300049 to gauge the speed in LL territory, since it has a factor anyway :smile:.
[code]
clLucas_x64_test-1.0>clLucas_x64.exe 63300049
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti

start M63300049 fft length = 3276800
err = 0.355469, increasing n from 3276800

start M63300049 fft length = 3538944
Iteration 10000 M( 63300049 )C, 0x3c0d0358c14611b9, n = 3538944, clLucas v1.00 err = 0.1953 (3:09 real, 18.9353 ms/iter, ETA 332:53:33)
Iteration 20000 M( 63300049 )C, 0xfe91586583ff04b5, n = 3538944, clLucas v1.00 err = 0.1953 (3:09 real, 18.9264 ms/iter, ETA 332:41:02)[/code]
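For anyone wondering where the ETA figure comes from, here is a minimal Python sketch. It assumes the ETA is simply the remaining iterations times the reported ms/iter, and that a full LL test of M(p) takes p - 2 squarings. How clLucas actually computes it internally is an assumption on my part, and the function names are made up for illustration.

```python
def eta_seconds(exponent: int, iteration: int, ms_per_iter: float) -> float:
    """Estimate the remaining wall time of a Lucas-Lehmer run, in seconds.

    Assumes the whole test takes (exponent - 2) squarings and that the
    current ms/iter stays constant for the rest of the run.
    """
    remaining_iterations = (exponent - 2) - iteration
    return remaining_iterations * ms_per_iter / 1000.0


def format_hms(seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Figures taken from the first iteration line of the log above.
    print(format_hms(eta_seconds(63300049, 10000, 18.9353)))
```

With the numbers from the log this lands within a few seconds of the reported 332:53:33, so a full test at this speed would take roughly two weeks on this card.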

kracker 2013-09-12 02:49

[QUOTE=VictordeHolland;352777]Tested the first 20000 iter of M63300049 as a test for LL territory speed , since it has a factor anyway :smile:.
[code]
clLucas_x64_test-1.0>clLucas_x64.exe 63300049
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti

start M63300049 fft length = 3276800
err = 0.355469, increasing n from 3276800

start M63300049 fft length = 3538944
Iteration 10000 M( 63300049 )C, 0x3c0d0358c14611b9, n = 3538944, clLucas v1.00 err = 0.1953 (3:09 real, 18.9353 ms/iter, ETA 332:53:33)
Iteration 20000 M( 63300049 )C, 0xfe91586583ff04b5, n = 3538944, clLucas v1.00 err = 0.1953 (3:09 real, 18.9264 ms/iter, ETA 332:41:02)[/code][/QUOTE]
I believe that is probably not the full "possible" speed; ask msft.
Also,
[code]
M( 86243 )P, n = 4608, clLucas v1.00
M( 110503 )P, n = 6144, clLucas v1.00
M( 132049 )P, n = 6912, clLucas v1.00
M( 216091 )P, n = 12288, clLucas v1.00
M( 756839 )P, n = 40960, clLucas v1.00
M( 859433 )P, n = 49152, clLucas v1.00
M( 1398269 )P, n = 73728, clLucas v1.00
M( 1257787 )P, n = 65536, clLucas v1.00
M( 2976221 )P, n = 163840, clLucas v1.00
M( 3021377 )P, n = 163840, clLucas v1.00
[/code]
:smile:
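For reference, the test behind those "P" lines is the Lucas-Lehmer test. A minimal Python sketch follows, using plain big-integer arithmetic; the log's "fft length" suggests clLucas does the squaring with an FFT on the GPU instead, but that is not shown here. The function name is mine, and the exponent p is assumed to be prime.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number 2**p - 1 is prime.

    Assumes p itself is prime; the test is not meaningful otherwise.
    """
    if p == 2:
        return True  # M2 = 3 is prime; the recurrence below needs odd p
    m = (1 << p) - 1
    s = 4
    # p - 2 squarings modulo M(p); M(p) is prime exactly when s ends at 0.
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


if __name__ == "__main__":
    small_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
    print([p for p in small_primes if lucas_lehmer(p)])
```

This is fine for small exponents, but each step squares a p-bit number, which is why exponents in the tens of millions need FFT multiplication and a GPU.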

LaurV 2013-09-14 05:25

1 Attachment(s)
OK, I played with it, but the result is very odd! When the CPU is fully loaded, clLucas is... faster! No joke, and the red text in the attached photo is not a mistake.

The 7990 is even slower, at a lower occupancy. I might need to launch two instances to max out the card... I will try this later.

To draw an analogy with the GTX 580 (I don't think it is an appropriate one, but for the sake of example): when the GPU occupancy is 99%, the speed is almost double that at 84% occupancy. Should we expect something like that on Radeons too? (Assuming I am able, and clever enough, to supply the right parameters: number of threads, FFT size, etc.)

(P.S. @kracker, related to the PM: don't push me, you don't know what you are up against! :razz::smile: I haven't done DCs for ages, but just wait a few weeks till I am back from a biz trip)

LaurV 2013-09-14 06:05

FWIW:

[CODE]

e:\-99-Prime\clLucas>clLucas_x64.exe -c 10000 86243
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti


start M86243 fft length = 4608
Iteration 10000 M( 86243 )C, 0x23992ccd735a03d9, n = 4608, clLucas v1.00 err = 0.01758 (0:04 real, 0.3755 ms/iter, ETA 0:26)
Iteration 20000 M( 86243 )C, 0x89c58d63ebee7ad1, n = 4608, clLucas v1.00 err = 0.01758 (0:03 real, 0.3673 ms/iter, ETA 0:22)
Iteration 30000 M( 86243 )C, 0x8ad9ad7af5b51d09, n = 4608, clLucas v1.00 err = 0.01758 (0:04 real, 0.3673 ms/iter, ETA 0:18)
Iteration 40000 M( 86243 )C, 0xeed70124ff3b4f5a, n = 4608, clLucas v1.00 err = 0.01758 (0:04 real, 0.3675 ms/iter, ETA 0:14)
Iteration 50000 M( 86243 )C, 0x6ef44d2b23c538e1, n = 4608, clLucas v1.00 err = 0.01758 (0:03 real, 0.3670 ms/iter, ETA 0:11)
Iteration 60000 M( 86243 )C, 0x76f20516c9858691, n = 4608, clLucas v1.00 err = 0.01758 (0:04 real, 0.3666 ms/iter, ETA 0:07)
Iteration 70000 M( 86243 )C, 0x1c98576ef37a22df, n = 4608, clLucas v1.00 err = 0.01758 (0:04 real, 0.3678 ms/iter, ETA 0:03)
Iteration 80000 M( 86243 )C, 0x6809f7b9c9f1e33d, n = 4608, clLucas v1.00 err = 0.01758 (0:03 real, 0.3674 ms/iter, ETA 0:00)
M( 86243 )P, n = 4608, clLucas v1.00

e:\-99-Prime\clLucas>clLucas_x64.exe -c 10000 110503
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti


start M110503 fft length = 6144
Iteration 10000 M( 110503 )C, 0xacb29fc05973d0a8, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3988 ms/iter, ETA 0:39)
Iteration 20000 M( 110503 )C, 0x9cd7ca8aa594b33c, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3898 ms/iter, ETA 0:35)
Iteration 30000 M( 110503 )C, 0xba1ef4f09a7c955a, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3887 ms/iter, ETA 0:31)
Iteration 40000 M( 110503 )C, 0x827b27dad4e98554, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3895 ms/iter, ETA 0:27)
Iteration 50000 M( 110503 )C, 0x9e6c039053cc2c17, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3898 ms/iter, ETA 0:23)
Iteration 60000 M( 110503 )C, 0xdb48afced9ebd397, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3871 ms/iter, ETA 0:19)
Iteration 70000 M( 110503 )C, 0xd650094b406761ed, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3894 ms/iter, ETA 0:15)
Iteration 80000 M( 110503 )C, 0xa4d69c031cb0caa2, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3905 ms/iter, ETA 0:11)
Iteration 90000 M( 110503 )C, 0xf1427358e52c1458, n = 6144, clLucas v1.00 err = 0.006348 (0:03 real, 0.3887 ms/iter, ETA 0:07)
Iteration 100000 M( 110503 )C, 0x0f4385fec05eb193, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3909 ms/iter, ETA 0:03)
Iteration 110000 M( 110503 )C, 0xc5bb3186236db9db, n = 6144, clLucas v1.00 err = 0.006348 (0:04 real, 0.3921 ms/iter, ETA 0:00)
M( 110503 )P, n = 6144, clLucas v1.00

e:\-99-Prime\clLucas>clLucas_x64.exe -c 10000 132049
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti


start M132049 fft length = 6912
Iteration 10000 M( 132049 )C, 0x4c52a92b54635f9e, n = 6912, clLucas v1.00 err = 0.03125 (0:04 real, 0.3618 ms/iter, ETA 0:43)
Iteration 20000 M( 132049 )C, 0x535b0883c22f11d5, n = 6912, clLucas v1.00 err = 0.03125 (0:03 real, 0.3552 ms/iter, ETA 0:39)
Iteration 30000 M( 132049 )C, 0xbcd4392925c8b6c9, n = 6912, clLucas v1.00 err = 0.03125 (0:04 real, 0.3569 ms/iter, ETA 0:35)
Iteration 40000 M( 132049 )C, 0x0472d7bbf21e9336, n = 6912, clLucas v1.00 err = 0.03711 (0:03 real, 0.3558 ms/iter, ETA 0:32)
Iteration 50000 M( 132049 )C, 0x67956f8d0a7e8aa0, n = 6912, clLucas v1.00 err = 0.03711 (0:04 real, 0.3536 ms/iter, ETA 0:28)
Iteration 60000 M( 132049 )C, 0x1a3c4b80c267f04f, n = 6912, clLucas v1.00 err = 0.03711 (0:03 real, 0.3546 ms/iter, ETA 0:24)
Iteration 70000 M( 132049 )C, 0xb3820e7e950dceb0, n = 6912, clLucas v1.00 err = 0.03711 (0:04 real, 0.3573 ms/iter, ETA 0:21)
Iteration 80000 M( 132049 )C, 0xf405e899fbc5b54a, n = 6912, clLucas v1.00 err = 0.03711 (0:04 real, 0.3553 ms/iter, ETA 0:17)
Iteration 90000 M( 132049 )C, 0x28ecbb0541f5ec16, n = 6912, clLucas v1.00 err = 0.03711 (0:03 real, 0.3569 ms/iter, ETA 0:14)
Iteration 100000 M( 132049 )C, 0xbf8e44f1880d99ab, n = 6912, clLucas v1.00 err = 0.03711 (0:04 real, 0.3579 ms/iter, ETA 0:10)
Iteration 110000 M( 132049 )C, 0xe7376c59b084a454, n = 6912, clLucas v1.00 err = 0.03711 (0:03 real, 0.3536 ms/iter, ETA 0:07)
Iteration 120000 M( 132049 )C, 0x816902f6d3a9764a, n = 6912, clLucas v1.00 err = 0.03711 (0:04 real, 0.3547 ms/iter, ETA 0:03)
Iteration 130000 M( 132049 )C, 0x1fc8bd2ec9e48966, n = 6912, clLucas v1.00 err = 0.03711 (0:03 real, 0.3542 ms/iter, ETA 0:00)
M( 132049 )P, n = 6912, clLucas v1.00

e:\-99-Prime\clLucas>clLucas_x64.exe -c 20000 216091
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti


start M216091 fft length = 12288
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 12288, clLucas v1.00 err = 0.005127 (0:09 real, 0.4326 ms/iter, ETA 1:17)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 12288, clLucas v1.00 err = 0.005127 (0:08 real, 0.4252 ms/iter, ETA 1:08)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 12288, clLucas v1.00 err = 0.005127 (0:09 real, 0.4281 ms/iter, ETA 0:59)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 12288, clLucas v1.00 err = 0.005127 (0:08 real, 0.4255 ms/iter, ETA 0:51)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 12288, clLucas v1.00 err = 0.005127 (0:09 real, 0.4299 ms/iter, ETA 0:42)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 12288, clLucas v1.00 err = 0.005127 (0:08 real, 0.4276 ms/iter, ETA 0:34)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 12288, clLucas v1.00 err = 0.005127 (0:09 real, 0.4281 ms/iter, ETA 0:25)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 12288, clLucas v1.00 err = 0.005127 (0:09 real, 0.4306 ms/iter, ETA 0:17)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 12288, clLucas v1.00 err = 0.005127 (0:08 real, 0.4267 ms/iter, ETA 0:08)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 12288, clLucas v1.00 err = 0.005127 (0:09 real, 0.4275 ms/iter, ETA 0:00)
M( 216091 )P, n = 12288, clLucas v1.00

e:\-99-Prime\clLucas>clLucas_x64.exe -c 50000 756839
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti


start M756839 fft length = 40960
Iteration 50000 M( 756839 )C, 0x87894b9e220b332e, n = 40960, clLucas v1.00 err = 0.03143 (0:26 real, 0.5053 ms/iter, ETA 5:53)
Iteration 100000 M( 756839 )C, 0x1e7a84272964fe1b, n = 40960, clLucas v1.00 err = 0.03143 (0:24 real, 0.4937 ms/iter, ETA 5:20)
Iteration 150000 M( 756839 )C, 0x6c059e45c36d246c, n = 40960, clLucas v1.00 err = 0.03516 (0:26 real, 0.5159 ms/iter, ETA 5:09)
Iteration 200000 M( 756839 )C, 0x46ed44a3439e400e, n = 40960, clLucas v1.00 err = 0.03564 (0:24 real, 0.4761 ms/iter, ETA 4:21)
Iteration 250000 M( 756839 )C, 0x18aa9cf222cf3fab, n = 40960, clLucas v1.00 err = 0.03564 (0:26 real, 0.5157 ms/iter, ETA 4:17)
Iteration 300000 M( 756839 )C, 0x02bb26eca6f1a78e, n = 40960, clLucas v1.00 err = 0.03564 (0:24 real, 0.4917 ms/iter, ETA 3:41)
Iteration 350000 M( 756839 )C, 0x0b9b20d0a7329e18, n = 40960, clLucas v1.00 err = 0.03564 (0:24 real, 0.4793 ms/iter, ETA 3:11)
Iteration 400000 M( 756839 )C, 0xeebed8a846e5f766, n = 40960, clLucas v1.00 err = 0.03564 (0:25 real, 0.4874 ms/iter, ETA 2:50)
Iteration 450000 M( 756839 )C, 0x702ffa6abef4031f, n = 40960, clLucas v1.00 err = 0.03564 (0:25 real, 0.5044 ms/iter, ETA 2:31)
Iteration 500000 M( 756839 )C, 0xd2efeb578776b57e, n = 40960, clLucas v1.00 err = 0.03564 (0:27 real, 0.5387 ms/iter, ETA 2:14)
Iteration 550000 M( 756839 )C, 0x5c8f899854f90417, n = 40960, clLucas v1.00 err = 0.03564 (0:26 real, 0.5225 ms/iter, ETA 1:44)
Iteration 600000 M( 756839 )C, 0x2e3604d47123128e, n = 40960, clLucas v1.00 err = 0.03564 (0:26 real, 0.5127 ms/iter, ETA 1:16)
Iteration 650000 M( 756839 )C, 0xae28f531bed5c3e7, n = 40960, clLucas v1.00 err = 0.03564 (0:25 real, 0.5086 ms/iter, ETA 0:50)
Iteration 700000 M( 756839 )C, 0x0964087d213296ad, n = 40960, clLucas v1.00 err = 0.03564 (0:26 real, 0.5126 ms/iter, ETA 0:25)
Iteration 750000 M( 756839 )C, 0x6430c5b5289c202e, n = 40960, clLucas v1.00 err = 0.03564 (0:26 real, 0.5279 ms/iter, ETA 0:00)
M( 756839 )P, n = 40960, clLucas v1.00

e:\-99-Prime\clLucas>clLucas_x64.exe -c 50000 859433
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti


start M859433 fft length = 49152
Iteration 50000 M( 859433 )C, 0x01e8d2509f34fe9f, n = 49152, clLucas v1.00 err = 0.01172 (0:21 real, 0.4156 ms/iter, ETA 5:32)
Iteration 100000 M( 859433 )C, 0xf749ad0737517aa7, n = 49152, clLucas v1.00 err = 0.01172 (0:21 real, 0.4146 ms/iter, ETA 5:10)
Iteration 150000 M( 859433 )C, 0x39c3a73f16044dd4, n = 49152, clLucas v1.00 err = 0.01172 (0:21 real, 0.4208 ms/iter, ETA 4:54)
[/CODE]
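If you want to compare runs without eyeballing the output, the iteration lines are easy to parse. Below is a rough Python sketch; the regular expression is written only against the line format shown above and is an assumption, not anything taken from clLucas itself. It reports, per exponent, the FFT length, the mean ms/iter, the largest err, and the exponent divided by the FFT length (average bits per FFT element, my own label for that ratio).

```python
import re
from collections import defaultdict
from statistics import mean

# Matches iteration lines in the format shown in the log above.
LINE_RE = re.compile(
    r"Iteration\s+(?P<iter>\d+)\s+M\(\s*(?P<exp>\d+)\s*\)C,"
    r"\s+0x(?P<res>[0-9a-f]{16}),\s+n\s*=\s*(?P<n>\d+),"
    r".*?err\s*=\s*(?P<err>[\d.]+)\s+\(.*?(?P<ms>[\d.]+)\s+ms/iter"
)

# Two lines copied from the M86243 run above.
SAMPLE = """\
Iteration 10000 M( 86243 )C, 0x23992ccd735a03d9, n = 4608, clLucas v1.00 err = 0.01758 (0:04 real, 0.3755 ms/iter, ETA 0:26)
Iteration 20000 M( 86243 )C, 0x89c58d63ebee7ad1, n = 4608, clLucas v1.00 err = 0.01758 (0:03 real, 0.3673 ms/iter, ETA 0:22)
"""


def summarize(log_text: str) -> dict:
    """Group iteration lines by exponent and compute simple statistics."""
    runs = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            key = (int(match["exp"]), int(match["n"]))
            runs[key].append((float(match["ms"]), float(match["err"])))
    summary = {}
    for (exponent, fft_length), rows in runs.items():
        summary[exponent] = {
            "fft_length": fft_length,
            "bits_per_element": exponent / fft_length,
            "mean_ms_per_iter": mean(ms for ms, _ in rows),
            "max_err": max(err for _, err in rows),
        }
    return summary


if __name__ == "__main__":
    for exponent, stats in summarize(SAMPLE).items():
        print(exponent, stats)
```

The bits-per-element figure is the interesting one: in the M63300049 run posted earlier, n = 3276800 works out to about 19.3 bits per element and gave err = 0.355, while n = 3538944 is about 17.9 bits and gave err = 0.195.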

LaurV 2013-09-14 06:07

1 Attachment(s)
Trying to squeeze some more juice out of the Tahiti. I believe this is what msft is after, isn't it? :smile:

kladner 2013-09-14 06:42

[QUOTE=LaurV;352974](P.S. @kracker related to PM: don't push me, you don't know what you are up against! :razz::smile: I didn't do DCs for ages, but just wait few weeks till I am back from a biz trip)[/QUOTE]

At this point, I think most people know that you are a maniac, and not to be trifled with. Next thing we know, you would be setting up a liquid He[SUP]2[/SUP] loop, or something. :doh!:

Robish 2013-09-14 12:12

[QUOTE=kracker;352654]Ok, so if anyone has a 58xx+,68xx+,77xx+ and want to test out a few things, PM me. :smile: (Windows x64, 32 bit if needed)[/QUOTE]

7870 any good to you?

Robish 2013-09-14 17:10

Windows version
 
[QUOTE=msft;352392]Please wait until the windows version available.[/QUOTE]

will do

cheers

Rob

kracker 2013-09-14 20:27

[QUOTE=kladner;352980]At this point, I think most people know that you are a maniac, and not to be trifled with. Next thing we know, you would be setting up a liquid He[SUP]2[/SUP] loop, or something. :doh!:[/QUOTE]
He is to be trifled. :edit: I don't know his power, but I am getting a head start, I only have a third of my potential power working now. :devil:

...

OK, well, frankly, 3-4 DCs (all CPU) a day is my max at the moment. So not much.

[QUOTE=Robish;352997]will do

cheers

Rob[/QUOTE]

I have it, but only for testing for now.. Windows x64?

