mersenneforum.org  

Old 2019-02-27, 03:25   #1
dcheuk
 
Jan 2019
Pittsburgh, PA

3×7×11 Posts
Round off error problems

Hello guys! Hopefully everyone is enjoying or enduring the weather. It's bad weather here in Iowa.

I was running CUDALucas 2.06beta on an RTX 2080, but I kept getting this error:

Code:
Round off error at iteration = 22746300, err = 0.375 > 0.35, fft = 4704K.

Most of the time it goes back to normal:

Code:
Round off error at iteration = 22746300, err = 0.375 > 0.35, fft = 4704K.
Restarting from last checkpoint to see if the error is repeatable.

Using threads: square 256, splice 128.

Continuing M87515717 @ iteration 22740001 with fft length 4704K, 25.98% done

|  Feb 26  21:05:49  |  M87515717  22750000  0x600e2a5aac882d3a  |  4704K  0.34375   4.2361   42.35s  |   3:04:14:02  25.99%  |
Looks like the error went away, continuing.

But sometimes the error repeats:

Code:
Round off error at iteration = 5632400, err = 0.375 > 0.35, fft = 4704K.
Restarting from last checkpoint to see if the error is repeatable.

Using threads: square 256, splice 128.

Continuing M87515717 @ iteration 5630001 with fft length 4704K,  6.43% done

Round off error at iteration = 5632400, err = 0.35938 > 0.35, fft = 4704K.
The error persists.
Trying a larger fft until the next checkpoint.

Using threads: square 256, splice 128.

Continuing M87515717 @ iteration 5630001 with fft length 5120K,  6.43% done

|  Feb 26  00:11:57  |  M87515717   5640000  0x8692fae95ad89471  |  5120K  0.03906   4.0596   40.59s  |   3:20:19:49   6.44%  |
Resettng fft.

I understand the error is not a big issue, but it occurs alarmingly often, and that concerns me. This GPU finished a DC of M50153029 right before this assignment. Is the residue from an LL test like this still trustworthy once it completes? Any suggestions on how to resolve this problem? Even if reliability is not an issue, these errors waste computation time, since each one forces a rollback to the previous checkpoint.

I have attached a partial log for the LL test of M87515717, from 6.43% to 26.09%.
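
To put a number on the time lost to rollbacks, a log like this can be tallied with a short script. This is only a sketch in Python: the function name `summarize_rollbacks` is made up for illustration, and the patterns assume the log lines are worded exactly as in the excerpts above (other CUDALucas versions may differ). It assumes that a restart at iteration N+1 means the checkpoint was at iteration N.

```python
import re

# Patterns copied from the log excerpts above; treat them as assumptions.
ERROR_RE = re.compile(r"Round off error at iteration = (\d+)")
RESUME_RE = re.compile(r"Continuing M\d+ @ iteration (\d+)")

def summarize_rollbacks(log_text):
    """Return (number of round off errors, iterations redone after restarts)."""
    errors = 0
    redone = 0
    last_error_iter = None
    for line in log_text.splitlines():
        m = ERROR_RE.search(line)
        if m:
            errors += 1
            last_error_iter = int(m.group(1))
            continue
        m = RESUME_RE.search(line)
        if m and last_error_iter is not None:
            # Work between the checkpoint and the failing iteration is repeated.
            checkpoint = int(m.group(1)) - 1
            redone += last_error_iter - checkpoint
            last_error_iter = None
    return errors, redone

SAMPLE = """\
Round off error at iteration = 5632400, err = 0.375 > 0.35, fft = 4704K.
Continuing M87515717 @ iteration 5630001 with fft length 4704K,  6.43% done
Round off error at iteration = 5632400, err = 0.35938 > 0.35, fft = 4704K.
Continuing M87515717 @ iteration 5630001 with fft length 5120K,  6.43% done
"""

errors, redone = summarize_rollbacks(SAMPLE)
print(errors, redone)  # 2 4800
```

At roughly 4.2 ms per iteration (the rate shown in the log), 4800 redone iterations cost about 20 seconds, so each individual rollback is cheap and the total depends on how often the error fires.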

Much thanks!

log.txt

Last fiddled with by dcheuk on 2019-02-27 at 03:28
Old 2019-02-27, 03:37   #2
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

3³·5·53 Posts

A roundoff error < 0.4 simply means you are testing an exponent near the limits of what that FFT size can support. Your hardware and end result are just fine.

Your choices are to endure the rollbacks or to force a larger FFT size (I don't know how to do that).
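
A rough way to see what "near the limits" means is to look at how many bits of the Mersenne number each FFT element has to carry on average; the more bits per element, the larger the round off error tends to be. The sketch below is a back-of-the-envelope calculation in Python, not how CUDALucas picks its lengths. It assumes "K" means 1024 elements, and `bits_per_word` is a hypothetical helper name.

```python
def bits_per_word(exponent, fft_k):
    """Average bits of the number held per FFT element.

    Assumption: an FFT length of "4704K" means 4704 * 1024 elements.
    """
    return exponent / (fft_k * 1024)

for fft_k in (4704, 5120):
    print(f"{fft_k}K: {bits_per_word(87515717, fft_k):.2f} bits per word")
# 4704K: 18.17 bits per word
# 5120K: 16.69 bits per word
```

Moving M87515717 from 4704K to 5120K takes about 1.5 bits off each element, which fits the drop in reported error from around 0.34 to 0.04 in the log above.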
Old 2019-02-27, 03:39   #3
dcheuk
 
Jan 2019
Pittsburgh, PA

3·7·11 Posts

Quote:
Originally Posted by Prime95 View Post
A roundoff error < 0.4 simply means you are testing an exponent near the limits of what that FFT size can support. Your hardware and end result are just fine.

Your choices are to endure the rollbacks or to force a larger FFT size (I don't know how to do that).
Okay, thanks for the clarification. Good to know that the roundoff is fine. Now I'm going to figure out how to force a larger FFT size.

Thanks again!
Old 2019-02-27, 03:43   #4
dcheuk
 
Jan 2019
Pittsburgh, PA

E7₁₆ Posts

Quote:
Originally Posted by dcheuk View Post
Okay, thanks for the clarification. Good to know that the roundoff is fine. Now I'm going to figure out how to force a larger FFT size.

Thanks again!
Oh duh, all I have to do to force an FFT length increase is type F into the console and hit Enter. lol stupid me

UPDATE: increasing the FFT length seems to have solved the problem. No more errors! yay.

Thanks.

Last fiddled with by dcheuk on 2019-02-27 at 04:30
Old 2019-02-27, 05:53   #5
xx005fs
 
"Eric"
Jan 2018
USA

3²×23 Posts

Quote:
Originally Posted by dcheuk View Post
I was running CUDALucas 2.06beta on an RTX 2080, but I kept getting this error:
[...]

I would genuinely advise you to run an FFT benchmark and a thread benchmark, as 5120K doesn't seem to be the fastest FFT in my case. (I'm using Pascal/Volta, so I am not sure about Turing optimization and how it deals with FFT.) For me, 5184K is near the speed of 4608K and much faster than 5120K, and it can also handle higher exponents than 5120K, so it should speed up your work and increase efficiency. Then go to the CUDALucas.ini file and change the FFT setting at the very bottom, as well as the thread values, to match the benchmark list. Refer to the instructions in the ini file for how to enter the values.
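
Once the benchmark numbers are in hand, choosing the length is a small lookup: take the fastest FFT that is at least as large as the smallest one that runs without round off errors. A sketch in Python, where `pick_fft` is a hypothetical helper and the only timings used are the two ms/iter figures that appear in the log in post #1 (a real benchmark would give many more):

```python
def pick_fft(ms_per_iter, min_fft_k):
    """Return the fastest benchmarked FFT length (in K) that is >= min_fft_k.

    ms_per_iter maps FFT length in K to milliseconds per iteration.
    """
    candidates = {k: t for k, t in ms_per_iter.items() if k >= min_fft_k}
    if not candidates:
        raise ValueError("no benchmarked FFT length is large enough")
    return min(candidates, key=candidates.get)

# ms/iter as printed in the log from post #1 for M87515717.
timings = {4704: 4.2361, 5120: 4.0596}
print(pick_fft(timings, 4704))  # 5120
```

A larger FFT is not always slower, which is why benchmarking on the actual card is worth the few minutes.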
Old 2019-02-27, 15:36   #6
dcheuk
 
Jan 2019
Pittsburgh, PA

3·7·11 Posts

Quote:
Originally Posted by xx005fs View Post
I would genuinely advise you to run an FFT benchmark and a thread benchmark, as 5120K doesn't seem to be the fastest FFT in my case. (I'm using Pascal/Volta, so I am not sure about Turing optimization and how it deals with FFT.) For me, 5184K is near the speed of 4608K and much faster than 5120K, and it can also handle higher exponents than 5120K, so it should speed up your work and increase efficiency. Then go to the CUDALucas.ini file and change the FFT setting at the very bottom, as well as the thread values, to match the benchmark list. Refer to the instructions in the ini file for how to enter the values.
Alright, understood. Going to read the readme and run the benchmark.

You're right. Surprisingly, I noticed that the time per iteration decreased after increasing the FFT size lol
Old 2019-02-27, 16:01   #7
tServo
 
"Marv"
May 2009
near the Tannhäuser Gate

11·47 Posts

Quote:
Originally Posted by dcheuk View Post
Alright, understood. Going to read the readme and run the benchmark.

You're right. Surprisingly, I noticed that the time per iteration decreased after increasing the FFT size lol
I suggest you read kriesel's newer, revised readme:
https://www.mersenneforum.org/showpo...84&postcount=6
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.