mersenneforum.org  

Old 2013-10-31, 19:37   #276
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

100001111000₂ Posts

12 DCs completed and all matched, including one run with the GPU memory overclocked from 1125 to 1250 MHz.

This leads me to conclude that either:

AMD's memory is more robust than nVidia's.
CUDALucas is harder on the card than clLucas.

Or both.
Old 2013-10-31, 19:49   #277
firejuggler
 
Apr 2010
Over the rainbow

2×1,303 Posts

From what I have read, non-stock 290X coolers are to be released for Christmas, not before.
However, I've seen separate (reportedly compatible) watercoolers for the 290X at around $100.

Edit:
It seems that, around the same time, a new low-level API (Mantle) will be released.
It is essentially for gaming, but might be useful for computation.

Last fiddled with by firejuggler on 2013-10-31 at 19:57
Old 2013-12-06, 13:39   #278
msft
 
Jul 2009
Tokyo

1001100010₂ Posts

For AMD APP SDK v2.9 on Ubuntu:
Code:
$ pwd
/opt/AMDAPP/samples/opencl/cpp_cl
$ tar -xvf clLucas.1.02.tar.bz2
$ cd clLucas.1.02/
$ sh -x ./run.sh
+ cmake .
+ make
+ export LD_LIBRARY_PATH=:/opt/clFFT-2.0/library/
+ time ./clLucas 1398269
Platform :Advanced Micro Devices, Inc.
Device 0 : Capeverde

Build Options are : -D KHR_DP_EXTENSION

start M1398269 fft length = 73728
Iteration 10000 0xa4a6d2f0e34629db, n = 73728 err = 0.07031 (0:16 real, 1.5492 ms/iter, ETA 35:37)
...
Iteration 1390000 0x554ae339bfea8fae, n = 73728 err = 0.07812 (0:08 real, 0.8408 ms/iter, ETA 0:00)
M( 1398269 )P, n = 73728, clLucas v1.02
Attached Files
File Type: bz2 clLucas.1.02.tar.bz2 (19.9 KB, 110 views)
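
For readers new to the thread: the final log line, "M( 1398269 )P", reports that the Lucas-Lehmer test found M1398269 to be prime. The sketch below shows that test in plain C for tiny exponents only. It is not taken from the clLucas source: clLucas performs the squaring with an FFT (the "fft length" in the log), while this version relies on 64-bit integers, so the function name `lucas_lehmer` and the limit of p <= 31 are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Lucas-Lehmer test for M(p) = 2^p - 1, where p is an odd prime.
 * Illustrative only: restricted to 3 <= p <= 31 so that s * s fits
 * in 64 bits. Returns 1 if M(p) is prime, 0 otherwise. */
int lucas_lehmer(unsigned p)
{
    assert(p >= 3 && p <= 31);

    const uint64_t m = (UINT64_C(1) << p) - 1;  /* the Mersenne number */
    uint64_t s = 4;                             /* starting value S_0 */

    /* p - 2 iterations of s <- s^2 - 2 (mod m). Adding m before
     * subtracting 2 keeps the unsigned arithmetic from wrapping. */
    for (unsigned i = 0; i < p - 2; ++i)
        s = (s * s + m - 2) % m;

    return s == 0;
}
```

Calling `lucas_lehmer(7)` walks through the residues 14, 67, 42, 111, 0 and returns 1, matching the fact that M7 = 127 is prime. Every "Iteration" line in the log above corresponds to one pass of this loop, just at a vastly larger word size.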
Old 2013-12-06, 16:15   #279
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

One thing I've noticed is that increasing -c (checkpoints) makes it slightly faster for me (probably more so on faster cards). The GPU slows or stops during a checkpoint, so I push the value up a lot when running.
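
A rough way to reason about this, assuming -c sets the number of iterations between checkpoints and that each checkpoint stalls the GPU for a roughly fixed time. The function name and the cost model are hypothetical, not taken from clLucas; the numbers you feed in would have to be measured on your own card.

```c
#include <assert.h>

/* Fraction of wall time lost to checkpoints under a simple model:
 * after every `interval` iterations of `iter_ms` each, the GPU
 * stalls for `ckpt_ms` while the checkpoint is written. */
double checkpoint_overhead(double iter_ms, double ckpt_ms, unsigned interval)
{
    assert(iter_ms > 0.0 && ckpt_ms >= 0.0 && interval > 0);

    double work_ms = iter_ms * (double)interval;  /* useful work per cycle */
    return ckpt_ms / (work_ms + ckpt_ms);
}
```

In this model a faster card (smaller `iter_ms`) loses a larger share of its time for the same interval, which would fit the guess that raising -c helps more on faster cards.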
Old 2013-12-07, 01:08   #280
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

-st2 passed on:
Code:
Select device - Get device info - Device 1/1: BeaverCreek (Advanced Micro Devices, Inc.),
device version: OpenCL 1.2 AMD-APP (1268.1), driver version: 1268.1 (VM)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_dx9_media_sharing cl_amd_image2d_from_buffer_read_only
Global memory:799014912, Global memory cache: 0, local memory: 32768, workgroup size: 256, Work dimensions: 3[256, 256, 256, 0, 0] , Max clock speed:444, compute units:4
Compiling kernels (build options: "-I. -DVECTOR_SIZE=4 -O3 -DMORE_CLASSES -DCL_GPU_SIEVE").................
 OpenCL device info
  name                      BeaverCreek (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 1.2 AMD-APP (1268.1) (1268.1 (VM))
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 4 (320 compute elements)
  clock rate                444MHz
 Automatic parameters
  threads per grid          256
  optimizing kernels for    APU
Attached Thumbnails
Passed.jpg (73.8 KB, 131 views)
Old 2013-12-07, 12:03   #281
Bdot
 
Nov 2010
Germany

3·199 Posts

Quote:
Originally Posted by flashjh View Post
-st2 passed on:
Code:
...
  name                      BeaverCreek (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 1.2 AMD-APP (1268.1) (1268.1 (VM))
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 4 (320 compute elements)
  clock rate                444MHz
 Automatic parameters
  threads per grid          256
  optimizing kernels for    APU

Thank you, flashjh. I assume this was meant to go to the mfakto thread; could a mod please move these two posts there?

It must have been running for half a day for this extended test ;-)
Old 2013-12-07, 12:11   #282
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

463₁₆ Posts

Whoops, yes it was. And it actually took over 24 hours.
Old 2013-12-09, 02:07   #283
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

878₁₆ Posts

For anyone who wants it, 1.02 is up for Windows at:

http://mersenneforum.org/cllucas
Old 2014-01-23, 20:40   #284
sanaris
 
"Yury Vorobyov"
Jul 2013
Chelyabinsk

23₈ Posts

Guys. Empty condition - really?
Code:
      if (fpi != NULL)
        {
          if (fgets (str, 132, fpi) == NULL);//line 1340
          currentLine = atoi (str);
          fclose (fpi);
          printf ("Continue test of file '%s' at line %d\n",
              input_filename, currentLine);
        }
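
For reference, the stray semicolon on line 1340 turns the `if` into a no-op, so `atoi` runs on `str` even when `fgets` read nothing. One possible fix is sketched below as a small helper. The name `read_resume_line` and the `fallback` parameter are hypothetical and not part of the original source.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads the saved line number from an already opened resume file.
 * Returns `fallback` if the file is empty or cannot be read, rather
 * than parsing an uninitialized buffer. The caller still owns fpi
 * and is responsible for fclose(). */
int read_resume_line(FILE *fpi, int fallback)
{
    char str[132];

    assert(fpi != NULL);
    if (fgets(str, sizeof str, fpi) == NULL)
        return fallback;          /* nothing read: do not touch str */
    return atoi(str);
}
```

With a helper like this, the quoted block would reduce to `currentLine = read_resume_line(fpi, 0);` followed by the existing `fclose` and `printf`. Whether 0 is the right fallback depends on how the surrounding code treats line numbers.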
Old 2014-01-23, 21:04   #285
sanaris
 
"Yury Vorobyov"
Jul 2013
Chelyabinsk

19 Posts

I know someone used Emacs on this. Please set the tab size in Emacs to the standard 4 and use the "linux" style or another conventional one, because the "gnu" style produces "zero-size" or "virtual" tabs instead of "real" tabs. In gnu style the code looks like "one indent for all" on many machines.

As for me, I suppose
Code:
(setq c-default-style "linux"
c-basic-offset 4)
is the best choice, because linux is the default style in Vim.

Last fiddled with by sanaris on 2014-01-23 at 21:05
Old 2014-02-11, 18:49   #286
Shirik
 
Feb 2014

8₁₀ Posts

I'm trying this out, and keep getting the following error:

Code:
X:\cllucas>clLucas_x64.exe -f 41943040 332233123
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti

Build Options are : -D KHR_DP_EXTENSION

start M332233123 fft length = 41943040
OPENCL_V_THROWERROR< CLFFT_NOTIMPLEMENTED > (772): Failed to clfftBakePlan.
terminate called after throwing an instance of 'std::runtime_error'
  what():  OPENCL_V_THROWERROR< CLFFT_NOTIMPLEMENTED > (772): Failed to clfftBakePlan.

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
Running on a 7970.

The problem goes away if I don't use the -f argument, but based on this thread I was under the impression that I have to use it, or it will either be slow or wrong. (If I can get away without the -f argument, then I have nothing to worry about.)

I'm running a double-check on 20000003 now without the -f argument just to see what the result is, but I was wondering if anyone knows about this.
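
One quick sanity check when picking -f by hand is whether the length factors entirely into small radices. The sketch below assumes the FFT library accepts lengths built from 2, 3 and 5 only. That radix set is an assumption for illustration and should be checked against the clFFT version in use, and the function name `is_235_smooth` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if n has no prime factors other than 2, 3 and 5. */
int is_235_smooth(uint64_t n)
{
    static const uint64_t radices[] = {2, 3, 5};

    assert(n > 0);
    for (int i = 0; i < 3; ++i)
        while (n % radices[i] == 0)
            n /= radices[i];      /* strip every factor of this radix */
    return n == 1;
}
```

Both 73728 (2^13 x 3^2, the length from the Ubuntu log earlier in the thread) and 41943040 (2^23 x 5) pass this check, so under that assumption the radix mix alone would not explain CLFFT_NOTIMPLEMENTED here. Something else, possibly a limit on the overall transform size, may be involved.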
All times are UTC.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.