2018-08-12, 15:04   #570
kriesel
Mar 2017
US midwest

19·311 Posts
RX480 timings in V3.3 and V3.5 OpenOwL

See the attachment. With rare exceptions, the default fft lengths seem fastest in V3.5. V3.5 is faster than V3.3 in each corresponding test, by small amounts that vary with fft length. All tests were performed on the same system: Win 7 X64, same patch state, same driver version, same gpu. Initial iterations are faster than later ones, so the timings ignore the speed of the first 800 to 10,000 iterations and average the rest. Most tests were 160k iterations or more in length.
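
For anyone wanting to repeat the averaging on their own logs, here is a rough Python sketch of the idea: drop the early (faster) samples, then average what is left. The function name, the (iteration, ms per iteration) tuple layout, and the sample numbers are all made up for illustration; they are not taken from OpenOwL's log format or from the attached timings.

```python
from statistics import mean

def steady_state_ms_per_iter(samples, warmup_iters):
    """Average ms/iteration after discarding warm-up samples.

    samples: list of (iteration_count, ms_per_iteration) tuples in run order
             (hypothetical layout, not a real OpenOwL log format).
    warmup_iters: samples reported at or before this iteration are ignored.
    """
    kept = [ms for iteration, ms in samples if iteration > warmup_iters]
    if not kept:
        raise ValueError("no samples left after discarding warm-up")
    return mean(kept)

# Made-up numbers: the first sample is faster than the steady state.
samples = [(10_000, 2.0), (20_000, 2.5), (30_000, 2.5), (40_000, 2.5)]
print(steady_state_ms_per_iter(samples, warmup_iters=10_000))
```

Including the first sample would pull the average down, which is why the early iterations are left out.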

In a nutshell, use the default fft lengths, except for 10M and 18M; for those, use -fft +2 on the command line or in a batch file.
It's possible some further gains could be found by testing carry choices.

A couple of V3.5 test exponents produced errors, indicating the maximum exponent guidance may be set a bit too high.
Attached Files
File Type: pdf openowl v33 and v35 timings.pdf (20.1 KB, 114 views)