mersenneforum.org


kriesel 2018-05-29 02:56

gpuOwL-specific reference material
 
This thread is intended to hold only reference material specifically for GpuOwL.
(Suggestions are welcome. Discussion posts in this thread are not encouraged. Please use the reference material discussion thread [URL]http://www.mersenneforum.org/showthread.php?t=23383[/URL].)


To get started in gpuowl, on linux, see Ernst Mayer's directions: [URL]https://mersenneforum.org/showthread.php?t=25601[/URL]
For Windows, see below.
In either case, note that the computation types, hardware supported, fft size limits, file formats, etc. have varied greatly and rapidly over the course of the hundreds of versions. Choose a version according to what you want to run and what each offers.

On a completed install of Windows (may as well have done Windows updates to current also):
Enable or install and configure any [B]remote desktop software[/B] you plan to use; Windows Remote Desktop, TightVNC, etc.
[B]Create a working folder[/B]. (I create one for each instance for each gpu.) Do create it as a subdirectory of the user's default folder. Permissions will be ok there.[B] DO NOT[/B] place it in system directories, Program Files, etc. Permission problems are common if attempting to run there.

[B]Get and install the gpuowl software[/B]
Either[INDENT] download a current build from the end of the Gpuowl Windows build thread [URL]https://www.mersenneforum.org/showthread.php?t=25624[/URL] or for an earlier version [URL]https://www.mersenneforum.org/showpost.php?p=488539&postcount=4[/URL], or from the download mirror [URL]https://download.mersenne.ca/[/URL] which has many Windows versions and a few Linux versions. Unzip it into a working folder under the user's home directory (NOT in Program Files or other restricted areas; some have attempted that and run into permissions problems).
[/INDENT]Or[INDENT] follow the Windows build instructions at [URL]https://www.mersenneforum.org/showpost.php?p=532454&postcount=21[/URL] to create a build environment (once) and follow the compile and link section there as needed.
[/INDENT]Next, decide whether you will run it manually or use primenet.py.

I recommend running one instance of gpuowl manually at first, to learn what normal operation looks like, so that if/when issues with operation appear, you're familiar with the program. (Add complexity later, after learning the basics.) It will also give you a greater appreciation for the automation built into primenet.py and other programs such as prime95 and mprime.

If using gpuowl's primenet.py or certain other tools provided for gpuowl, you'll need a Python 3 installation. To install Python, follow a good set of instructions, such as [URL]https://docs.python.org/3/using/windows.html#windows-full[/URL]. (I've been exploring compiling primenet.py into a standalone executable, but haven't yet worked out how to get one small enough to post on the forum as an attachment.)

Note, not all gpuowl versions include all the features described below. Some are rather recent additions.

[B]Create a config.txt[/B]
Suggested contents:
-user your-primenet-uid -cpu systemname-gpumodel-number-winstance -device n -maxAlloc gpuram-delta
For example, my primenet-uid is kriesel, and this is for system asr2, second Radeon VII gpu, instance 2. The gpu ram is 16GB total, but if I have 2 instances running P-1 on the same gpu and their stage 2s might coincide, I might give each instance half the gpu ram less a margin: maxAlloc = 8000 - 500 = 7500.
Then the config.txt line for the second gpu, second instance would be
-user kriesel -cpu asr2-radeonvii-2-w2 -device 1 -maxAlloc 7500
(Device numbers start at 0 in gpuowl and some other GIMPS gpu applications.)
Other -options are given in help output. -yield or -proof 9 may be useful (the sample P-1 log further below has -proof 8 in its config line). Whoever runs the cert will appreciate the reduced effort required for the cert and the overall efficiency. (And occasionally that may be you!)
The format of command line options and of config.txt is the same.
However, config.txt must be one line and followed by a return.
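
The maxAlloc arithmetic and the one-line rule can be sketched in Python. This is only an illustration: build_config_line and its parameter names are made up here (not part of gpuowl), it assumes gpu ram is split evenly among the instances sharing a gpu, and it emits only the options shown above.

```python
def build_config_line(user, cpu, device, gpu_ram_mb, instances, delta_mb):
    """Return a one-line config.txt body ending in a newline.

    Hypothetical helper, not part of gpuowl. It splits gpu ram evenly
    among the instances sharing a gpu and subtracts a safety margin.
    """
    max_alloc = gpu_ram_mb // instances - delta_mb
    if max_alloc <= 0:
        raise ValueError("delta leaves no gpu ram for this instance")
    options = [
        "-user", user,
        "-cpu", cpu,
        "-device", str(device),
        "-maxAlloc", str(max_alloc),
    ]
    # config.txt must be a single line followed by a return.
    return " ".join(options) + "\n"


if __name__ == "__main__":
    # Second Radeon VII (device 1), second instance, 16000 MB shared by 2.
    line = build_config_line("kriesel", "asr2-radeonvii-2-w2", 1, 16000, 2, 500)
    print(line, end="")
```

Writing the returned string to config.txt in the instance's working folder would reproduce the example line above.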

[B]Batch file or shell script[/B]
I find it useful to create a short batch file also, and desktop shortcut to it. The batch file can be as simple as g6.bat:[CODE]title %cd%
gpuowl-win[/CODE]Adding distinctive colors for text and background can be useful. Background color associated with gpu, and text color associated with instance number for example.

The shortcut command should not be a direct invocation of the batch file. Use a cmd /k prefix so the window lingers in case of problems, to give you long enough to see and read any error message.

Having the help.txt and use-flags.txt files in the working directory for ready reference is also convenient.

[B]Create a worktodo.txt[/B]
Go to [URL]https://www.mersenne.org/manual_assignment/[/URL] to get a single PRP assignment or PRP DC assignment. The excellent Gerbicz error check (GEC) on PRP work will determine whether the gpu is producing reliable interim results. Verify the gpu and system combination is reliable in PRP/GEC before attempting any P-1 factoring or LL DC, which have less error detection.
Open worktodo.txt for editing.
Paste the assignment into worktodo.txt and follow it with a return.
Save the modified file.
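
For those who would rather script the edit, here is a minimal Python sketch of the same steps. The helper name append_assignment is hypothetical, and the assignment text is treated as an opaque string to be pasted exactly as the manual assignment page gives it.

```python
from pathlib import Path


def append_assignment(worktodo_path, assignment_line):
    """Append one assignment line to worktodo.txt, newline-terminated.

    Hypothetical helper, not part of gpuowl. The assignment text is not
    parsed or validated; it is written as given.
    """
    path = Path(worktodo_path)
    line = assignment_line.strip()
    if not line:
        raise ValueError("empty assignment line")
    existing = path.read_text() if path.exists() else ""
    # Make sure earlier content ends with a return before appending.
    if existing and not existing.endswith("\n"):
        existing += "\n"
    # The new assignment is followed by a return, as described above.
    path.write_text(existing + line + "\n")
```

Calling append_assignment("worktodo.txt", pasted_line) from the working folder leaves any earlier assignments in place and adds the new one at the end.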

[B]Try running gpuowl[/B] via the batch file at a Windows command prompt: [B]g6[/B]
If it works, it should look something like the following, allowing for differences in parameters entered in config.txt, gpuowl version, exponent, work type, etc. If not, fix and retry.

A PRP start follows. Progress lines should contain "OK". "EE" instead means errors: perhaps clocks set too high, unreliable system ram, a failed fan, or an fft length specified too short for the exponent and computation type.
[CODE]2020-05-29 15:03:44 config: -device 1 -user kriesel -cpu roa/rx550 -use NO_ASM -maxAlloc 1500
2020-05-29 15:03:44 device 1, unique id ''
2020-05-29 15:03:44 roa/rx550 94955299 FFT: 5M 1K:10:256 (18.11 bpw)
2020-05-29 15:03:44 roa/rx550 Expected maximum carry32: 48210000
2020-05-29 15:03:45 roa/rx550 OpenCL args "-DEXP=94955299u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xe.cfec567b14fd8p-3 -DIWEIGHT_STEP=0x8.a43aff8beae48p-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DPM1=0 -DAMDGPU=1 -DNO_ASM=1 -cl-fast-relaxed-math -cl-std=CL2.0 "
2020-05-29 15:03:52 roa/rx550 OpenCL compilation in 6.31 s
2020-05-29 15:03:58 roa/rx550 94955299 OK 0 loaded: blockSize 400, 0000000000000003
2020-05-29 15:04:16 roa/rx550 94955299 OK 800 0.00%; 14232 us/it; ETA 15d 15:24; 69f923b24568ac18 (check 5.88s)
2020-05-29 15:51:37 roa/rx550 94955299 OK 200000 0.21%; 14233 us/it; ETA 15d 14:38; 986d9b55f22ac736 (check 5.88s)[/CODE] A P-1 run start:
[CODE]2020-08-15 11:30:03 gpuowl v6.11-340-g41d435f
2020-08-15 11:30:04 config: -device 0 -user kriesel -cpu condorella/rx480 -yield -maxAlloc 7500 -proof 8
2020-08-15 11:30:04 device 0, unique id ''
2020-08-15 11:30:04 condorella/rx480 183000023 FFT: 10M 1K:10:512 (17.45 bpw)
2020-08-15 11:30:04 condorella/rx480 Expected maximum carry32: 43400000
2020-08-15 11:30:07 condorella/rx480 OpenCL args "-DEXP=183000023u -DWIDTH=1024u -DSMALL_HEIGHT=512u -DMIDDLE=10u -DPM1=1 -DAMDGPU=1 -DCARRYM64=1 -DWEIGHT_STEP_MINUS_1=0xe.c72a0862a91p-5 -DIWEIGHT_STEP_MINUS_1=-0xa.1bff0fe0af57p-5 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-08-15 11:30:07 condorella/rx480 ASM compilation failed, retrying compilation using NO_ASM
2020-08-15 11:30:15 condorella/rx480 OpenCL compilation in 8.00 s
2020-08-15 11:30:15 condorella/rx480 183000023 P1 B1=700000, B2=26000000; 1009635 bits; starting at 0
2020-08-15 11:31:24 condorella/rx480 183000023 P1 10000 0.99%; 6840 us/it; ETA 0d 01:54; 58a6331a302132cb
2020-08-15 11:32:32 condorella/rx480 183000023 P1 20000 1.98%; 6877 us/it; ETA 0d 01:53; 38e0ea00e8d3dfcd
2020-08-15 11:33:41 condorella/rx480 183000023 P1 30000 2.97%; 6897 us/it; ETA 0d 01:53; 52d0881a817b7a2d
2020-08-15 11:34:50 condorella/rx480 183000023 P1 40000 3.96%; 6904 us/it; ETA 0d 01:52; 5cc6128ee3cf27dd
2020-08-15 11:35:16 condorella/rx480 saved
2020-08-15 11:36:00 condorella/rx480 183000023 P1 50000 4.95%; 6928 us/it; ETA 0d 01:51; 4980d669e97a532f[/CODE]
[B]Manually report the result first (BEFORE uploading the proof file)[/B]
Open [URL]https://www.mersenne.org/manual_result/[/URL] in a web browser. Verify you're logged in (see upper right of the web page.)
Open the results.txt file in an editor. Copy the result. Paste it into the results field of the web page. Click on "Submit".
Optionally, enter a note in the results.txt file that the results line has been reported. I usually place "reported mm/dd/yy" filling in the date of report, on a separate line after the last reported line. This helps avoid duplicate reports, and scans easily.

[B]Upload the proof file (AFTER reporting the result record)[/B]
There's only a proof file for PRP runs begun with proof-capable versions of gpuowl. It's found in subfolder proof, and has a name composed of the exponent and proof power. For example, exponent 1234567 power 8 would be 1234567-8.proof in folder workingdirectory\proof. Several possible upload methods are listed at [URL]https://www.mersenneforum.org/showpost.php?p=553120&postcount=26[/URL]; the steps are described there for some of them.
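
The naming rule can be written down as a short Python sketch. The function name proof_file_path is hypothetical; only the folder name and the exponent-power.proof pattern come from the description above.

```python
from pathlib import Path


def proof_file_path(working_dir, exponent, power):
    """Return the expected proof file location for a PRP run.

    Hypothetical helper based on the naming described above:
    <working_dir>/proof/<exponent>-<power>.proof
    """
    return Path(working_dir) / "proof" / f"{exponent}-{power}.proof"


if __name__ == "__main__":
    path = proof_file_path("workingdirectory", 1234567, 8)
    # Checking existence before an upload attempt avoids a wasted step.
    print(path.name, "exists" if path.exists() else "not found")
```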

[B]Using multiple instances[/B]
There are two ways that running multiple instances on the same gpu at the same time may increase throughput.
1. Aggregate throughput may be higher running two instances on the same gpu at the same time. Any wait time that occurs for one instance on the gpu while the cpu performs the GEC, or reads from or writes to files or the console, or moves data over the PCIe bus between gpu ram and system ram, may be usable by the other instance.
2. Emptying the worktodo file or halting on an error condition by one instance does not completely stop progress; any other running instance then can use resources the stopped instance is no longer using. A single instance if halted does not leave the gpu idle for hours or days if at least one other instance is still running on it.
Multiple instances are not guaranteed to improve sustained throughput. Throughput seems to be better if the two instances are running similar code and parameters; two 5M fft PRPs for example (not a 5M and an 8M, or P-1 and LLDC).

The conceptually simplest way to run multiple instances is to create a separate working folder for each, with its own set of files including the gpuowl executable, same as for the first instance. A more compact way is for all instances on all gpus to share one executable in a common folder, IF updating the version on all instances simultaneously is ok.
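
A minimal Python sketch of the separate-folder approach follows. The helper name make_instance_folders and the gpu<device>-w<instance> folder naming are assumptions made for illustration; any naming that keeps instances apart would do.

```python
import shutil
from pathlib import Path


def make_instance_folders(base_dir, gpu_count, instances_per_gpu, executable=None):
    """Create one working folder per gpu instance under base_dir.

    Hypothetical layout: <base_dir>/gpu<device>-w<instance>. If an
    executable path is given, each folder gets its own copy of it.
    """
    created = []
    for device in range(gpu_count):  # device numbers start at 0
        for instance in range(1, instances_per_gpu + 1):
            folder = Path(base_dir) / f"gpu{device}-w{instance}"
            folder.mkdir(parents=True, exist_ok=True)
            if executable is not None:
                shutil.copy2(executable, folder)
            created.append(folder)
    return created
```

Each folder still needs its own config.txt and worktodo.txt. Leaving executable as None corresponds to the more compact arrangement where all instances share one executable kept elsewhere.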

[B]Backups[/B]
Now might be a good time to refresh your Windows backup, system restore point, etc.
Also verify that your gpuowl folders, containing assignments, results, interim save files, configuration, etc. will be backed up.

[B]Using gpuowl's primenet.py[/B]
There's a separate post detailing primenet.py setup and use at [URL]https://www.mersenneforum.org/showpost.php?p=553072&postcount=25[/URL]. It's probably best to rename the results file containing already-reported results before starting to use primenet.py for reporting.

[B]Pool[/B]
I haven't tried it yet, but particularly with multiple instances, multiple gpus of the same type, or both, it appears it could make manual work assignment and result reporting easier. The help output says:
[CODE]-pool <dir> : specify a directory with the shared (pooled) worktodo.txt and results.txt
Multiple GpuOwl instances, each in its own directory, can share a pool of assignments[/CODE]So presumably putting the -pool option in each config.txt does it.
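
If that presumption holds, adding the option to each instance's config line could look like the Python sketch below. It is untested against gpuowl itself; add_pool_option is a hypothetical helper, and the pool directory name in the usage line is a placeholder.

```python
def add_pool_option(config_line, pool_dir):
    """Return config_line with a -pool option appended, kept as one line.

    Hypothetical helper, untested against gpuowl. A pool_dir containing
    spaces is not handled.
    """
    options = config_line.split()
    if "-pool" in options:
        # Already present: leave the options alone, just normalize the ending.
        return " ".join(options) + "\n"
    return " ".join(options + ["-pool", pool_dir]) + "\n"


if __name__ == "__main__":
    # "shared-pool" is a placeholder directory name.
    print(add_pool_option("-user kriesel -device 1 -maxAlloc 7500\n", "shared-pool"), end="")
```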


Table of contents
[LIST=1]
[*]This post
[*]Run time versus exponent or fft length for the RX550 or RX480 or Radeon VII of gpuOwL from V1.9 up. Currently up to v7.2-69 on RX480 or Radeon VII [URL]http://www.mersenneforum.org/showpost.php?p=488535&postcount=2[/URL]
[*]gpuOwL bug and wish list [URL]http://www.mersenneforum.org/showpost.php?p=488537&postcount=3[/URL]
[*]Getting started with gpuOwL [URL]http://www.mersenneforum.org/showpost.php?p=488539&postcount=4[/URL]
[*]gpuOwL requirements [URL]http://www.mersenneforum.org/showpost.php?p=489076&postcount=5[/URL]
[*]Features and requests [URL]http://www.mersenneforum.org/showpost.php?p=489077&postcount=6[/URL]
[*]Feature / version announcements [URL]http://www.mersenneforum.org/showpost.php?p=489083&postcount=7[/URL]
[*]Determining upper exponent limit for a transform type and fft length [URL]https://www.mersenneforum.org/showpost.php?p=498231&postcount=8[/URL]
[*]FFT lengths [URL]https://www.mersenneforum.org/showpost.php?p=499636&postcount=9[/URL]
[*]PRP-3 run time scaling in V5.0-9c13870 (no P-1) [URL]https://www.mersenneforum.org/showpost.php?p=502776&postcount=10[/URL]
[*]Gpuowl .owl file header style versus gpuowl version samples [URL]https://www.mersenneforum.org/showpost.php?p=508397&postcount=11[/URL]
[*]Gpuowl PRP3 continuation compatibility [URL]https://www.mersenneforum.org/showpost.php?p=508637&postcount=12[/URL]
[*]Validation and verification runs [URL]https://www.mersenneforum.org/showpost.php?p=509291&postcount=13[/URL]
[*]gpuowl-win V6.5-c48d46f run times on AMD and NVIDIA vs. CUDALucas [URL]https://www.mersenneforum.org/showpost.php?p=517837&postcount=14[/URL]
[*]Gpuowl residue type etc versus version [URL]https://www.mersenneforum.org/showpost.php?p=519603&postcount=15[/URL]
[*]Gpuowl v6.5-84-g30c0508 -h help output [URL]https://www.mersenneforum.org/showpost.php?p=523599&postcount=16[/URL]
[*]Gpuowl P-1 run time scaling on AMD and NVIDIA [URL]https://www.mersenneforum.org/showpost.php?p=525955&postcount=17[/URL]
[*]Gerbicz error check detection rate [URL]https://www.mersenneforum.org/showpost.php?p=529794&postcount=18[/URL]
[*]Increased throughput with simultaneous runs [URL]https://www.mersenneforum.org/showpost.php?p=530106&postcount=19[/URL]
[*]What's a good P-1 factoring strategy? Best? [URL]https://www.mersenneforum.org/showpost.php?p=531129&postcount=20[/URL]
[*]Compiling Gpuowl [URL]https://www.mersenneforum.org/showpost.php?p=532454&postcount=21[/URL]
[*]Save file size versus exponent or fft length [URL]https://www.mersenneforum.org/showpost.php?p=546697&postcount=22[/URL]
[*]Setting up in Linux [URL]https://www.mersenneforum.org/showpost.php?p=546945&postcount=23[/URL]
[*]Gpuowl gpu ram use scaling with exponent [URL]https://www.mersenneforum.org/showpost.php?p=552023&postcount=24[/URL]
[*]Using gpuowl's primenet.py on Windows [URL]https://www.mersenneforum.org/showpost.php?p=553072&postcount=25[/URL]
[*]Methods of uploading proof files [URL]https://www.mersenneforum.org/showpost.php?p=553120&postcount=26[/URL]
[*]User base OS mix indications [URL]https://www.mersenneforum.org/showpost.php?p=565443&postcount=27[/URL]
[*]Performance variables for gpuowl [URL]https://www.mersenneforum.org/showpost.php?p=569226&postcount=28[/URL]
[*]P-1 speed, 103M, V6.11-380 vs. v7.2-53 [URL]https://www.mersenneforum.org/showpost.php?p=574629&postcount=29[/URL]
[*]Gpuowl automatic P-1 bounds selection vs. version and exponent [URL]https://www.mersenneforum.org/showpost.php?p=582966&postcount=30[/URL]
[*]etc tbd
[/LIST]
Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL]

kriesel 2018-05-29 03:10

GpuOwL run time vs exponent or fft length or version
 
5 Attachment(s)
[B]RX550[/B] data

gpuOwL [B]v2[/B] timing at 5000K fft on an RX550 gpu, MSI 18.2.3 driver (Feb 26 2018), in a quick test (~40,000 iterations each), was:
short carry 17.3 ms/iter,
medium 17.6,
long 17.4,
compared to [B]V1.9[/B] gpuOwL on the same gpu, same pcie physical connection, April 2017 MSI driver,
10.9 ms/iter for -fft DP -legacy -size 4M;
18.9 ms/iter -fft M61 -size 4M;
21.4 ms/iter -fft DP -legacy -size 8M.

The driver change coincided with an increase by about 5% of iteration time, on the same gpu, in [B]V1.9[/B] gpuOwL. [URL]http://www.mersenneforum.org/showpost.php?p=484694&postcount=370[/URL]
See the first attachment below for V1.9 on an RX550.

See also the 4-program speed comparison in the general reference thread. [URL]http://www.mersenneforum.org/showpost.php?p=488476&postcount=8[/URL]

For an [B]RX480[/B] my data indicates 3.4-3.6 times faster than RX550,
on the same exponents and gpuOwL versions, at [URL]http://www.mersenneforum.org/showpost.php?p=488173&postcount=386[/URL] and subsequently

An [B]Intel IGP HD620[/B] could run [B]V0.5[/B] or [B]v1.9[/B] but it was not worth doing. On mine, running gpuOwL on the IGP reduced prime95 throughput by more than the gpuOwL throughput gained. More detail on the [B]V0.5[/B] try (LL)[B]:[/B] [URL]http://www.mersenneforum.org/showpost.php?p=463570&postcount=176[/URL] (I discontinued running gpuOwL on the IGP. The tradeoff with mfakto there was much better.)
Detail on the [B]V1.9[/B] try (PRP): [URL]http://www.mersenneforum.org/showpost.php?p=478313&postcount=285[/URL]
A listing of [B]V3.5[/B] [B]OpenOwL[/B] command line options and fft lengths can be found at [URL]http://www.mersenneforum.org/showpost.php?p=493684&postcount=565[/URL]

Detail on benchmarking [B]V3.3[/B] and [B]V3.5 OpenOwL[/B] fft lengths on [B]RX480[/B] can be found at [URL]http://www.mersenneforum.org/showpost.php?p=493717&postcount=570[/URL]

Second attachment below tabulates ms/iteration timings for various versions, [B]V3.x - V3.9, V4.6, and V5.0[/B], and fft lengths, on an RX480, and includes some graphs and ratios.

Third attachment compares V6.2, 5.0, 3.8, 2.0, and 1.9. Each is fastest for some fft length / exponent ranges, except v2.0. The trend line fit for asymptotic scaling of the fastest version versus fft length or exponent is iteration time ~ p[SUP]1.078[/SUP], so run time ~ p[SUP]2.078[/SUP] (the iteration count is proportional to p), for exponents 100M<p<~2520M (6M to 144M fft length).
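
As a rough illustration of what that scaling implies, the Python sketch below compares estimated run times for two exponents. It assumes the fitted exponent 1.078 applies across the whole stated range and that the iteration count grows linearly with p; the function name run_time_ratio is made up here.

```python
ITERATION_TIME_EXPONENT = 1.078  # fitted exponent quoted above
# Iteration count grows about linearly with p, adding 1 to the exponent.
RUN_TIME_EXPONENT = ITERATION_TIME_EXPONENT + 1


def run_time_ratio(p_new, p_old):
    """Estimated run time of exponent p_new relative to exponent p_old."""
    return (p_new / p_old) ** RUN_TIME_EXPONENT


if __name__ == "__main__":
    # Doubling the exponent costs a bit over 4 times the run time.
    print(f"{run_time_ratio(200_000_000, 100_000_000):.2f}")
```

This is a trend estimate only; actual timings step with fft length and vary by version and gpu, as the attachments show.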

Updated timings for RX480 and [B]Radeon VII [/B]under Windows 7 and 10 respectively, up to Gpuowl v7.2-69 are included in the fourth and fifth attachments. These are works in progress currently. (Lots of data points, so reading glasses and zoom.)


Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL]

kriesel 2018-05-29 04:34

gpuOwL bug and wish list
 
2 Attachment(s)
Here is the latest posted version of the list I am maintaining for gpuOwL. As always, this is in appreciation of the authors' past contributions. Users may want to browse this for workarounds included in some of the descriptions, or for an awareness of some known pitfalls. Please respond with any comments, additions or suggestions you may have, preferably by PM to kriesel.


(last updated 2018-12-31)


Top of reference tree: [url]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/url]

kriesel 2018-05-29 04:44

Getting started with gpuOwL
 
This is an old post, but kept in place for its documentation of what can be done with the very old builds, and the long list of (mostly Windows, plus rarely Linux) builds available.


See the Available Software guide portion for gpuOwL, for where to get code, a brief summary of capability, and a discussion thread for it. Or scroll to the bottom of this post. Note this was originally written for very early versions, and that has been left in place here for those occasions when an old version is the best tool for a particular task.

It's pretty simple to get started with gpuOwL. Get the version that supports the fft length corresponding to the exponents you want to run, and build it for your operating system, or find a suitable executable someone else has already built. Kracker was kind enough to post build directions for Windows (including setting up a free open source build environment) at [URL]http://www.mersenneforum.org/showpost.php?p=483209&postcount=356[/URL] and a Windows build or two in the past in that same thread.

Install the OpenCL drivers on your system and confirm function with a separate OpenCL query utility.

Make sure you have gpuowl.cl in the working directory. For V1.9, depending on the transform type used, you may want nttshared.h in there too, such as if using -fft M61. (See [URL]http://www.mersenneforum.org/showpost.php?p=471318&postcount=224[/URL])

No ini file. Very little setup.

Manually check out some exponents for PRP test or PRP double check (or, if you're using an old version that does LL, get that type of assignment instead) and put those records in a file called worktodo.txt, just as mersenne.org's manual checkout gives them.

You may want to use a small shell script or batch file depending on which OS you're using.

Syntax and options change with gpuowl version.
[URL]https://www.mersenneforum.org/showpost.php?p=482877&postcount=353[/URL]
V0.6 syntax example: [CODE]gpuowl -logstep 5000 -savestep 2000000 -checkstep 250000 -uid kriesel/condorella-rx550[/CODE]For V1.9, which has multiple power-of-two fft lengths, I use a simple batch file as follows (allowing switching options with a couple keystrokes and cutting way down on typos):
[CODE]:set opts=-fft M61 -size 4M
set opts=-legacy
set dev=2

gpuowl -user kriesel -cpu condorella-rx550 -device %dev% -verbosity 2 %opts%[/CODE]Observed memory requirements for 8M fft, 150M exponent, V1.9 gpuOwL: ~475-490MB on the gpu, and about 100MB on the cpu side (380MB peak). For FFT sizes, transforms, and RX550 speeds, see [URL]http://www.mersenneforum.org/showpost.php?p=479272&postcount=313[/URL]

For V2 it's also simple, and somewhat differs:
[CODE]gpuowl -device 0 -user kriesel -cpu condorella-rx480 -carry long[/CODE]In my case I needed to update the MSI display adapter driver from April 2017 to Feb 2018 version to get V2.0 gpuOwL to run on an RX550 on Windows 7.
[URL]http://www.mersenneforum.org/showpost.php?p=484694&postcount=370[/URL]

In V2 there's a -step option; see [URL]http://www.mersenneforum.org/showpost.php?p=482877&postcount=353[/URL]

V3.x is different yet. See for example [URL]http://www.mersenneforum.org/showpost.php?p=493684&postcount=565[/URL]
As is V4.x. As is V5.

[B]Code (for Windows unless otherwise indicated)[/B]
For gpuOwL Windows code, and source see [URL]http://www.mersenneforum.org/showthread.php?t=22204[/URL]
An early guide for compiling 0.x on windows with msys64+mingw64 [URL]http://www.mersenneforum.org/showpost.php?p=457343&postcount=26[/URL]
Windows in current versions includes the ability to handle .zip files but does not include support for some other compressed archive forms. IZArc is available for free download. It supports many formats, popular with/for Windows or Linux. [URL]https://www.izarc.org/[/URL]
May 2017 v0.1 version Windows build (kracker) .zip [URL]http://www.mersenneforum.org/showpost.php?p=458596&postcount=112[/URL]
May 2017 V0.3 Windows binary (kracker) .zip [URL]http://www.mersenneforum.org/showpost.php?p=459950&postcount=168[/URL]
Jun 2017 V0.5 Windows binary (kracker) .zip [URL]http://www.mersenneforum.org/showpost.php?p=461034&postcount=170[/URL]
(LL discontinued, PRP with Gerbicz block error check beginning V0.7)

Sep 2017 V1.0 binaries for Windows (kracker) .zip [URL]http://www.mersenneforum.org/showpost.php?p=467125&postcount=190[/URL]
Nov 2017 V1.9 binaries for Windows (kracker) .zip [URL]http://www.mersenneforum.org/showpost.php?p=471663&postcount=226[/URL]
Jan 2018 V1.9 binaries updated for Windows (kracker) .zip [URL]http://www.mersenneforum.org/showpost.php?p=478125&postcount=272[/URL]
Aug 2018 V2.0 binary for Windows 64 bit .exe [URL]http://www.mersenneforum.org/showpost.php?p=493623&postcount=556[/URL]
Aug 2018 V3.3 binary for Windows 64 bit .7z [URL]http://www.mersenneforum.org/showpost.php?p=493635&postcount=558[/URL]
Aug 2018 V3.5 binary for Windows 64 bit .7z [URL]http://www.mersenneforum.org/showpost.php?p=493646&postcount=560[/URL]
Aug 2018 V3.6 binary for Windows 64-bit .7z [URL]http://www.mersenneforum.org/showpost.php?p=493835&postcount=581[/URL]
Aug 2018 V3.8 binary for Windows 64-bit (this and all the above are for OpenCl) .7z [URL="http://www.mersenneforum.org/showpost.php?p=494138&postcount=614"]http://www.mersenneforum.org/showpost.php?p=494169&postcount=615[/URL]
Aug 2018 V3.9 binary for Windows 64 bit .7z [URL]http://www.mersenneforum.org/showpost.php?p=494759&postcount=666[/URL]

Nov 2018 V4.3 binary for Windows 64 bit .7z [URL]https://www.mersenneforum.org/showpost.php?p=499346&postcount=832[/URL]
Nov 2018 V4.6 binary for Windows 64 bit .7z [URL]https://www.mersenneforum.org/showpost.php?p=499301&postcount=828[/URL]
Oct 2018 V4.7 binary for Windows 64 bit .7z [URL]https://www.mersenneforum.org/showpost.php?p=499187&postcount=792[/URL] (not recommended, fails for me)

Nov 2018 V5.0 binary for Windows 64 bit .7z [URL]https://www.mersenneforum.org/showpost.php?p=499313&postcount=831[/URL]
and with some fixes and new shorter fft lengths, .7z [URL]https://www.mersenneforum.org/showpost.php?p=499569&postcount=867[/URL]
v5.0-9c13870 .7z [URL]https://www.mersenneforum.org/showpost.php?p=499592&postcount=869[/URL]

Feb 2019 V6.0 binary for Windows 64 bit .7z [URL]https://www.mersenneforum.org/showpost.php?p=507684&postcount=967[/URL]
V6.1: do not use the posted binary for V6.1 or for an early commit of V6.2. A bug in both caused primes to be indicated as composite.
Feb 2019 V6.2 binary for Windows 64 bit .zip [URL]https://www.mersenneforum.org/showpost.php?p=507838&postcount=983[/URL]
Apr 2019 V6.4 binary for Windows 64 bit .zip [URL]https://www.mersenneforum.org/showpost.php?p=513293&postcount=1057[/URL]
May 2019 V6.5 binary for Windows 64 bit ([B]AMD or NVIDIA![/B]) .7z [URL]https://www.mersenneforum.org/showpost.php?p=516704&postcount=1171[/URL]
July 2019 V6.5-84-30c0508 for Windows 64 bit residue type 1 .7z [URL]https://www.mersenneforum.org/showpost.php?p=521225&postcount=1274[/URL]
(V6.6)
V6.7-4-g278407a Windows build .7z [URL]https://www.mersenneforum.org/showpost.php?p=525357&postcount=1343[/URL]
(V6.8)
version uncertain, Woltman's test version .zip file of source suitable for Linux building [URL]https://mersenneforum.org/showpost.php?p=525793&postcount=1364[/URL]
(V6.9)
V6.10-9-g54cba1d Windows build .zip [URL]https://mersenneforum.org/showpost.php?p=526137&postcount=1385[/URL]
V6.11-9-ga9e3189 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=526331&postcount=1403[/URL]
Woltman's dropbox Windows build .exe [URL]https://mersenneforum.org/showpost.php?p=532379&postcount=1510[/URL]
Another Woltman dropbox version .exe [URL]https://mersenneforum.org/showpost.php?p=532475&postcount=1539[/URL]
V6.11-83-ge270393 Windows build .7z [URL]https://www.mersenneforum.org/showpost.php?p=532695&postcount=1584[/URL]
v6.11-88 build for Windows .7z [URL]https://mersenneforum.org/showpost.php?p=533046&postcount=1629[/URL]
gpuowl v6.11-99-gdd8527b Windows build .7z [URL]https://www.mersenneforum.org/showpost.php?p=533340&postcount=1652[/URL]
v6.11-104-g91ef9a8 .zip [URL]https://mersenneforum.org/showpost.php?p=533641&postcount=1664[/URL]
v6.11-112-gf1b00d1 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=534222&postcount=1682[/URL]
January 2020 V6.11-116-g5ca090d P-1 PRP assignment split rewrite Windows build .7z [URL]https://www.mersenneforum.org/showpost.php?p=534721&postcount=1740[/URL]
v6.11-132-gfd01ee5 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=535254&postcount=1787[/URL]
January 2020 V6.11-134-g1e0ce1d Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=535465&postcount=1796[/URL]
February 2020 V6.11-142-gf54af2e Windows build .zip [URL]https://mersenneforum.org/showpost.php?p=537171&postcount=1829[/URL]
v6.11-145-g6146b6d Windows build .zip [URL]https://mersenneforum.org/showpost.php?p=537445&postcount=1840[/URL]
v6.11-147-g3b8b00e Windows build .zip [URL]https://mersenneforum.org/showpost.php?p=537776&postcount=1866[/URL]
v6.11-148-gfc93773 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=538067&postcount=1877[/URL]
March 2020 v6.11-163-gec98bfe Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=538858&postcount=1903[/URL]
v6.11-198-g628f3cd Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=539991&postcount=1959[/URL]
v6.11-219-ge70ec99 ffts up to 192M Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=540983&postcount=1984[/URL]
v6.11-?-af403e2 (by kracker) the return of LL? Windows build .zip [URL]https://mersenneforum.org/showpost.php?p=542090&postcount=2047[/URL]
v6.11-255-g81fa7c3 max fft 96M Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=542296&postcount=2063[/URL]
v6.11-257-g39fc002 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=542366&postcount=2073[/URL]
v6.11-259-g83434d8 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=542830&postcount=2089[/URL]
April 2020 v6.11-264-g5c977d4-dirty Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=542984&postcount=2095[/URL]
v6.11-268-g0d07d21 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=543181&postcount=2106[/URL]
v6.11-270-gf1fd1f7 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=543592&postcount=2124[/URL]
v6.11-272-g07718b9 Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=544373&postcount=2139[/URL]
May 2020 v6.11-278-ga39cc1a Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=544891&postcount=2161[/URL]
v6.11-285-gf25ecbd Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=545896&postcount=2179[/URL]
v6.11-288-g20c4213 Jacobi check returns! .7z [URL]https://mersenneforum.org/showpost.php?p=545993&postcount=2202[/URL]
v6.11-292-gecab9ae Windows build .7z [URL]https://mersenneforum.org/showpost.php?p=546211&postcount=2220[/URL]
June 2020 v6.11-295-gaecf041 (the last I could build until ~ -316) .7z [URL]https://mersenneforum.org/showpost.php?p=547253&postcount=2274[/URL]
v6.11-318-g3109989 Windows build, max fft 120M, includes [B]PRP proof[/B] capability .7z [URL]https://mersenneforum.org/showpost.php?p=547899&postcount=1[/URL]
gpuowl for Windows 7 or up 64-bit v6.11-325-g7c09e38 .7z [URL]https://mersenneforum.org/showpost.php?p=548729&postcount=2[/URL]
gpuowl for Windows v611-327-g43cdf1c by Dylan14 .7z [URL]https://mersenneforum.org/showpost.php?p=549325&postcount=3[/URL]
gpuowl commit e5a8f2c for Google Colaboratory Linux environment built by Fan Ming .zip [URL]https://mersenneforum.org/showpost.php?p=549346&postcount=958[/URL]
gpuowl for Windows v6.11-330-ge5a8f2c .7z [URL]https://www.mersenneforum.org/showpost.php?p=549430&postcount=4[/URL]
July 2020 Gpuowl-win v6.11-335-gff60b08 .7z [URL]https://mersenneforum.org/showpost.php?p=549831&postcount=5[/URL]
Gpuowl-win v6.11-340-g41d435f .7z [URL]https://www.mersenneforum.org/showpost.php?p=549987&postcount=6[/URL]
Gpuowl-win v6.11-357-g1f41292 build .7z [URL]https://mersenneforum.org/showpost.php?p=551237&postcount=7[/URL]
Gpuowl-win v6.11-364-g36f4e2a .7z [URL]https://mersenneforum.org/showpost.php?p=551594&postcount=8[/URL]
August 2020 Gpuowl v6.11-366-gf887d6e for Linux Google Colab .7z [URL]https://www.mersenneforum.org/showpost.php?p=555882&postcount=1020[/URL]
(Note, August development focused more on primenet.py and less on the gpuowl executable.)
September 2020 Gpuowl for Linux v6.11-380-g79ea0cc .7z [URL]https://mersenneforum.org/showpost.php?p=556311&postcount=40[/URL]
Gpuowl for Windows v6.11-380-g79ea0cc .7z [URL]https://mersenneforum.org/showpost.php?p=556392&postcount=9[/URL]

October 2020 Gpuowl for Windows v7.0-18-g69c2b85 .7z (LL and standalone P-1 removed, joint P-1/PRP introduced) [URL]https://www.mersenneforum.org/showpost.php?p=559227&postcount=10[/URL]
Gpuowl-win v7.0-26-g8e6a1d1 .7z [URL]https://www.mersenneforum.org/showpost.php?p=559301&postcount=11[/URL]
gpuowl-win v7.0-35-gf06bc5b .7z [URL]https://www.mersenneforum.org/showpost.php?p=559577&postcount=12[/URL]
gpuowl-win v7.0-40-gb62d4fd .7z [URL]https://www.mersenneforum.org/showpost.php?p=559998&postcount=13[/URL]
gpuowl-win v7.0-47-ga8664fe .7z [URL]https://www.mersenneforum.org/showpost.php?p=560162&postcount=14[/URL]
gpuowl-win v7.0-66-gebe49cc .7z [URL]https://www.mersenneforum.org/showpost.php?p=560761&postcount=15[/URL]

Note, do not use the self-verify option with v7.1, or the resulting proof files will be bad.
gpuowl-win v7.1-1-g0f73d04 .7z [URL]https://www.mersenneforum.org/showpost.php?p=560832&postcount=16[/URL]
(Ethan EO multiple vendors' OpenCL flavors) gpuowl-win v7.1-7 .7z [URL]https://www.mersenneforum.org/showpost.php?p=561759&postcount=2558[/URL]
GpuOwl-win v7.1-11-g97cfbd2 2xSP fft experimentation .7z [URL]https://www.mersenneforum.org/showpost.php?p=561349&postcount=17[/URL]

November 2020 GpuOwl-win v7.2-2-ga135d8d .7z [URL]https://www.mersenneforum.org/showpost.php?p=561835&postcount=18[/URL] or [URL="https://www.mersenneforum.org/showpost.php?p=561950&postcount=20"].zip[/URL]
gpuowl-win v7.2-13-g266aed4 .7z [URL]https://www.mersenneforum.org/showpost.php?p=562207&postcount=23[/URL]
gpuowl-win v7.2-21-g28dbf88 .zip [URL]https://www.mersenneforum.org/showpost.php?p=564301&postcount=24[/URL]
February 2021 gpuowl-win v7.2-39-ga87a679 .zip [URL]https://mersenneforum.org/showpost.php?p=571683&postcount=25[/URL]
gpuowl-win v7.2-53-ge27846f [URL]https://mersenneforum.org/showpost.php?p=572345&postcount=26[/URL]
gpuowl-win v7.2-63-ge47361b [url]https://mersenneforum.org/showpost.php?p=572675&postcount=27[/url]
March 2021 gpuowl-win v7.2-69-g23c14a1 [url]https://mersenneforum.org/showpost.php?p=572889&postcount=28[/url]
gpuowl 7.2-70 for Linux [URL]https://mersenneforum.org/showpost.php?p=575273&postcount=3[/URL]


For the current version source (and previous too) [URL]https://github.com/preda/gpuowl[/URL]
A separate forum thread was created for Windows gpuowl build posting. It is [URL="https://www.mersenneforum.org/showthread.php?t=25624"]here[/URL]


Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL]

kriesel 2018-06-03 20:29

gpuOwL requirements
 
My current understanding of the requirements, from the Windows point of view

OpenCL installed, at least version 1.2 if not 2.0. (Some of the more recent versions require OpenCL 2.0)
One or more units of OpenCL compatible hardware, with corresponding driver(s) supporting OpenCL of the required level, such as certain AMD GPUs, Intel IGPs, or CPUs. NVIDIA gpus with compute capability somewhere above 3.0 began to be supported at gpuowl v6.5. Currently GTX10xx and newer NVIDIA model gpus work, and somewhat older too.

gpuOwL below v6.5 does not currently run on NVIDIA gpus, on Linux or Windows, and to my knowledge Preda's releases before that did not. [URL]http://www.mersenneforum.org/showpost.php?p=478186&postcount=277[/URL] Someone reported porting a long ago version for his own use. [URL]http://www.mersenneforum.org/showpost.php?p=458485&postcount=107[/URL]

Per instance, gpuOwL v1.9 on 8M fft length running exponents ~150M, exhibited in Task Manager, ~115MB private working set, 145MB working set, 382MB peak working set on Windows 7 64-bit. Meanwhile GPU occupancy was ~475-490MB each.

Discrete (add-in card) GPUs give better performance because of their dedicated memory. Integrated graphics processors use memory and TDP budget shared with the CPU core(s) and will affect performance of CPU applications. IGPs may lack DP support or otherwise lack compatibility with the AMD-oriented gpuOwL DP code, and so require running V1.9 -fft M61, which is slower.

In case of difficulty, it's recommended to verify the successful installation of OpenCL and compatible drivers with a utility, such as clinfo, oclDeviceQuery.exe, or the advanced tab of GPU-Z (a Windows gpu status graphical utility).

Memory requirements are modest. I'm seeing only about 290MB occupied during 4M fft length -DP transform on an RX550. That may scale to roughly 1.3GB for a future 16M fft implementation, 2.7GB? for 32M, which would not fit on that 2GB card. (It would probably also run way too slowly for that card to be practical, at roughly estimated 2-3 years per exponent.) It illustrates that for primality testing, 4GB is probably enough for a long time.

[B]gpuOwL v2[/B] on Windows 7 sp1 with current updates failed with an MSI-sourced driver dated April 2017 on an MSI RX550.
It worked with the MSI driver 18.2.3 dated Feb 26 2018.
There's a March 23 2018 driver available from AMD, v18.3.4, or probably more recent by now, that I have not tried on v2.0. [URL]https://support.amd.com/en-us/download[/URL]


Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL]

kriesel 2018-06-03 20:38

gpuOwL features and requests
 
[B](caution, some of this is outdated. gpuowl is up to v6.5 as of 2019-05-09.)[/B]

[B]Nice features[/B]
The Gerbicz check, of course, detecting errors and allowing timely rollback to ensure an accurate result.

Full time logging in addition to console display.

The -step argument in gpuOwl
From a quick look at the source, -step appears to let the user override the automatic program behavior with a specified constant step count between Gerbicz checks. I interpret this to mean -step requires a step count parameter in the range 1000 to 500000 that is 1, 2, or 5 times a power of ten: 1000, 2000, 5000, 10000, ... 500000, or perhaps larger also.
It may set both the output interval and the Gerbicz check interval.
Two use cases I've run into are:
1) Hardware and software are very stable, exponent is far from an fft length limit, overhead of starting at small step sizes is not necessary, run a large step size from the start. (This case might benefit from adaptive step size after starting large if an error occurs during the run. Also user settable number of consecutive retries if an error occurs)
2) Repeatable error has occurred, such as the exponent is slightly too large for the fft length, I'd like to determine as finely as possible, at what iteration it occurs, with a rerun from last known good save file, using minimum step size until encountering the error again. (This case might benefit from a user set limit of retries (0 - ~9) before giving up on the exponent and starting or resuming the next worktodo entry.) Source fragments supporting the opinion are at [URL]http://www.mersenneforum.org/showpost.php?p=482877&postcount=353[/URL]
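
As a minimal sketch of the interpretation above (not taken from the gpuOwL source), the candidate -step values can be enumerated as a 1-2-5 series. The function names and the 1000 to 500000 bounds are my assumptions, based only on the reading of the source described in this post.

```python
def step_candidates(lo=1000, hi=500000):
    """Return values that are 1, 2 or 5 times a power of ten within [lo, hi].

    The bounds are assumptions taken from the interpretation in this post,
    not from gpuOwL documentation.
    """
    values = []
    power = 1
    while power <= hi:
        for mult in (1, 2, 5):
            candidate = mult * power
            if lo <= candidate <= hi:
                values.append(candidate)
        power *= 10
    return values


def is_valid_step(step, lo=1000, hi=500000):
    """True if step is one of the 1-2-5 series values in [lo, hi]."""
    return step in step_candidates(lo, hi)


if __name__ == "__main__":
    print(step_candidates())
```

For use case 2 above, the smallest value in the list would be the one to try; for use case 1, the largest.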

The ability to switch transform midstream
Per Preda the author of gpuOwL, the save file is in compacted bits format (independent of the transform). see [URL]http://www.mersenneforum.org/showpost.php?p=479270&postcount=312[/URL]

[B]Feature requests[/B]
Save frequency option
Is there a command line option to control how often a disaster-mitigating interim file is saved? In V1.9 one seems to be produced at intervals of 10^7 iterations. I would like to try running at 5M iteration intervals for safety files. I don't see any such option in the V1.10 source. I suppose I could run some little batch file.

Gpu operation priority lower for gpuOwL computation or periodic yielding by gpuOwL
When running gpuOwL on the same card running the display, and using the local display rather than remote access, the screen seemed sluggish; I'm not aware of any option in gpuOwL equivalent to the -polite option in CUDALucas, which gives display operations a turn now and then (with frequency user settable).

A port to NVIDIA!

More fft lengths where useful, integrated into one application; 4M and 8M DP and M61 fft; 5000K DP, and anything new.


Top of reference tree: [url]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/url]

kriesel 2018-06-03 22:08

Feature / version announcements
 
2 Attachment(s)
Gpuowl began as LL on AMD: initial github commit d5c48dd 2017-04-11
Introducing gpuOwL [URL]http://www.mersenneforum.org/showpost.php?p=457032&postcount=1[/URL] 2017-04-19
V0.2 [URL]http://www.mersenneforum.org/showpost.php?p=459484&postcount=135[/URL] 2017-05-21
V0.3 (offsets) [URL]http://www.mersenneforum.org/showpost.php?p=459777&postcount=147[/URL] 2017-05-26
V0.4 [URL]http://www.mersenneforum.org/showpost.php?p=460586&postcount=169[/URL] 2017-06-05
V0.5 [URL]http://www.mersenneforum.org/showpost.php?p=461098&postcount=171[/URL] 2017-06-12
V0.6 Addition of Jacobi check to LL flavor of gpuOwL, zero offset in my test [URL]http://www.mersenneforum.org/showpost.php?p=465145&postcount=46[/URL] 2017-08-08
Nonzero offset dropped and -supersafe option added [URL]http://www.mersenneforum.org/showpost.php?p=465255&postcount=61[/URL] 2017-08-10

switch from LL to PRP occurs. See also [URL]https://www.mersenneforum.org/showpost.php?p=519603&postcount=15[/URL] for residue type versus version

V0.7 commit ccb7ed2 2017-08-27
V1.0 [URL]http://www.mersenneforum.org/showpost.php?p=466649&postcount=186[/URL] 2017-08-30 PRP residue type 4
V1.1 [URL]http://www.mersenneforum.org/showpost.php?p=467130&postcount=191[/URL]
V1.2-1.4 ?
V1.5 [URL]http://www.mersenneforum.org/showpost.php?p=468932&postcount=223[/URL] 2017-09-30 PRP residue type 1
V1.7 f5198fc 2017-10-26
V1.8 [URL]http://www.mersenneforum.org/showpost.php?p=471318&postcount=224[/URL] 2017-11-08
V1.8 help [URL]http://www.mersenneforum.org/showpost.php?p=471320&postcount=225[/URL] 2017-11-08
V1.9 ?
V1.10 commit 83001d4 2018-01-27 (seen on github [URL]https://github.com/preda/gpuowl/blob/NTT/README.md[/URL])

V2.0 [URL]http://www.mersenneforum.org/showpost.php?p=479585&postcount=320[/URL] 2018-02-07
perf tune and -time option [URL]http://www.mersenneforum.org/showpost.php?p=479644&postcount=331[/URL]
V2.1-2.3 ?

V3.0 ?
V3.1 commit 5495ecf 2018-07-07
V3.2 ?
V3.3 fft lengths 4, 5, 8, 10, 16, 20M [URL]http://www.mersenneforum.org/showpost.php?p=491680&postcount=468[/URL] 2018-07-13
V3.4 ?
V3.5 "Moar fft" A lot more lengths, from 0.5M to 144M (up to ~2.5x10[SUP]9[/SUP] exponent) [URL]http://www.mersenneforum.org/showpost.php?p=491835&postcount=505[/URL] 2018-07-15
V3.6 2018-08-11 commit f7c3865 see [URL]http://www.mersenneforum.org/showpost.php?p=493835&postcount=581[/URL]
V3.7 TF integrated, TF works on OpenCL Linux ROCm 1.8.2 only [URL]http://www.mersenneforum.org/showpost.php?p=494005&postcount=586[/URL] 2018-08-16
V3.8 commit a7ef0e5 2018-08-17
V3.8 fixes [URL]http://www.mersenneforum.org/showpost.php?p=494107&postcount=612[/URL] 2018-08-17
V3.9 commit 4c4e034 2018-08-21

V4.0 commit fe7cd08 2018-09-10
V4.1 commit d77c6f0 2018-09-18
V4.3 PRP & P-1 combined [URL]https://www.mersenneforum.org/showpost.php?p=496446&postcount=694[/URL] 2018-09-20 PRP residue type 4
V4.6 commit bb691cb 2018-10-20
V4.7 commit 12c6b75 2018-10-23 [URL]https://www.mersenneforum.org/showpost.php?p=498605&postcount=765[/URL] and see also post 766

V5.0 commit 1339429 2018-10-24 PRP & two stages of P-1 [URL]https://www.mersenneforum.org/showpost.php?p=499198&postcount=796[/URL]
and see also [URL]https://www.mersenneforum.org/showpost.php?p=499202&postcount=798[/URL] 2018-10-31

V6.0 PRP, a primenet.py script added for getting and queuing work and reporting results, and P-1 has been removed. [URL]https://www.mersenneforum.org/showpost.php?p=504857&postcount=912[/URL] 2019-01-03
[URL]https://www.mersenneforum.org/showpost.php?p=504858&postcount=913[/URL]
V6.1, commit c02a6ce, support for standalone P-1 has been added [URL]https://www.mersenneforum.org/showpost.php?p=506748&postcount=945[/URL]
[URL]https://www.mersenneforum.org/showpost.php?p=506749&postcount=946[/URL]
v6.2, commit 5b26497 2019-01-27, fft lengths up to 160M, some speedups [URL]https://www.mersenneforum.org/showpost.php?p=507015&postcount=956[/URL]
v6.4 commit f6d3153 2019-04-09, added command line options -prp -pm1 [URL]https://www.mersenneforum.org/showpost.php?p=513288&postcount=1056[/URL]
v6.5 added command line option -dir for working directory; max fft length 192M [URL]https://www.mersenneforum.org/showpost.php?p=513506&postcount=1062[/URL]
V6.5-30c0508 switched back from prp residue type 4 to type 1 [URL]https://www.mersenneforum.org/showpost.php?p=521194&postcount=1273[/URL] 2019-07-10
V6.7-4, P-1 on NVIDIA [URL]https://www.mersenneforum.org/showpost.php?p=525357&postcount=1343[/URL] 2019-09-05
v6.8 per-exponent savefile folders [URL]https://www.mersenneforum.org/showpost.php?p=525336&postcount=1335[/URL] 2019-09-06
v6.9 [URL]https://www.mersenneforum.org/showpost.php?p=525696&postcount=1361[/URL]
v6.10-9-g54cba1d P-1 savefiles added [URL]https://www.mersenneforum.org/showpost.php?p=526134&postcount=1384[/URL]
v6.11-9-g9ae3189 NVIDIA CPU yield [URL]https://www.mersenneforum.org/showpost.php?p=526331&postcount=1403[/URL]
v6.11-83-ge270393 increased performance with various -use options [URL]https://www.mersenneforum.org/showpost.php?p=532695&postcount=1584[/URL]

V7.0-18 drops LLDC, merges P-1 into PRP [URL]https://www.mersenneforum.org/showthread.php?t=26007[/URL] 2020-10-07
V7.1 proof V2 [url]https://www.mersenneforum.org/showpost.php?p=560786&postcount=110[/url] 2020-10-22

Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL]

kriesel 2018-10-18 14:30

Determining upper exponent limit for a transform type and fft length
 
1 Attachment(s)
In gpuOwL V1.9, a lengthy experiment was conducted on how to efficiently determine the upper exponent limit for PRP with Gerbicz check for a given fft length and transform type. This was conducted on the M61 transform type and 4M fft length, for which the program author did not know the limit. (It was not possible to do it on 8M fft length, since maximum exponent is higher for the M61 transform than the corresponding DP transform, and the program's maximum exponent was capped at the 8M length DP transform approximate limit. It would also have taken more than 4 times as long.) The calculations were performed on a relatively slow RX550, which contributed significantly to the experiment's calendar duration.

By approaching the limit from above, generating error failures quickly in relatively few iterations, convergence to an approximate limit is achieved much faster than by approaching the limit from below with exponents run fully to completion. It might seem that a lot of iterations would be wasted on runs that generate errors. However, gpuOwL in v1.9 and later had the useful property of storing interim results in a form that could be continued by a different program version and transform type. So a run that produces errors at a few million or even tens of millions of iterations with one transform type and fft length can be continued to completion by a different program version, transform type, and fft length. Many of the exponents that generated errors as M61 4M have been run to completion with newer, faster DP fft lengths as PRP tests or PRP DC tests on an RX480, as will be a few more still remaining.

Tabulating the exponents tried, their success or failure, and the number of iterations completed, along with separate fits through the failure data and the success data, produces a good picture of a limit estimate. In this way, it can be determined fairly accurately where the limit of completion lies, while doing work useful to the GIMPS project's progress in first-time or double-check effort. Tabulating along the way, with spreadsheet-generated regression fits, helped guide the selection of the next trial exponent. When practical, avoiding overlap with existing assignments was also considered in trial exponent selection.

In the example attachment, about 1.06 exponents' equivalent of failed run iterations, plus 5 completed exponents, were used, to determine a limit value around 83869400, within a span of about 190 out of ~84 million, or 2.24ppm. Approaching strictly from above, stopping when two exponents are completed, and using less closely spaced test exponents, one could reduce the work to less than the equivalent of 3 completed exponent runs.

At a cost of at most one full run per trial, the experiment could be extended to give about one bit more precision in the limit per additional trial. The practical utility of adding more bits precision to the limit is low, since there are only 5 currently unfactored candidates between 83869319 and 83869507, all of which have a LL or PRP result reported currently, and there are considerably faster fft length/transform combinations available now for performing primality or pseudoprimality tests in that exponent range. [URL="https://www.mersenne.org/report_exponent/?exp_lo=83869223&exp_hi=83869507&full=1"]https://www.mersenne.org/report_exponent/?exp_lo=83869319&exp_hi=83869507&full=1[/URL]
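
The "one bit more precision per additional trial" idea is ordinary interval halving. Here is a rough sketch of it, not the spreadsheet procedure actually used. The function names are hypothetical, and treating 83869319 as the highest completed exponent and 83869507 as the lowest failed one is my assumption for illustration. A real trial exponent would also need to be prime, which this sketch does not check.

```python
def bracket(successes, failures):
    """Return (low, high): largest completed exponent, smallest failed exponent."""
    low = max(successes)
    high = min(failures)
    if low >= high:
        # Errors behave statistically, so the two sets can overlap.
        raise ValueError("success and failure ranges overlap; no clean bracket")
    return low, high


def next_trial(successes, failures):
    """Midpoint of the current bracket (not adjusted to a prime exponent)."""
    low, high = bracket(successes, failures)
    return (low + high) // 2


def trials_to_resolve(span, target=1):
    """Count halvings needed to shrink a bracket of width span to <= target."""
    count = 0
    while span > target:
        span = (span + 1) // 2
        count += 1
    return count


if __name__ == "__main__":
    done = [83869319]      # assumed: highest exponent that completed
    failed = [83869507]    # assumed: lowest exponent that failed
    low, high = bracket(done, failed)
    print("bracket width:", high - low)
    print("next trial near:", next_trial(done, failed))
    print("halvings to reach width 1:", trials_to_resolve(high - low))
```

Each halving costs at most one full run, which is why further refinement is hardly worth it for so narrow a bracket.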

It's worthwhile to note that the limit value determined is not a guarantee that an exponent below that value would be certain to run to completion without error. It's merely a limit below which no error was seen, in the hundreds of millions of iterations required for the 4 exponents that completed. The error occurrences are not predictable within an exponent span of 15,000, and seem to behave statistically.

The limit of the M61 4M transform appears to occur at about 19.996 bits/word, or approximately 20 - 1/256, somewhere between 83869319 and 83869507.
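
The bits/word figure is just the exponent divided by the fft length in words. A small sketch for checking the arithmetic, assuming a 4M fft means 4 x 1024^2 words (the function name is mine):

```python
FFT_4M = 4 * 1024 ** 2  # 4M words, with M = 1024^2


def bits_per_word(exponent, fft_words):
    """Average number of exponent bits carried per fft word."""
    return exponent / fft_words


if __name__ == "__main__":
    for p in (83869319, 83869400, 83869507):
        print(p, round(bits_per_word(p, FFT_4M), 5))
    print("20 - 1/256 =", 20 - 1 / 256)
```

All three exponents come out within about 0.0001 bits/word of 20 - 1/256.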


Top of reference tree: [url]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/url]

kriesel 2018-11-05 13:32

gpuowl V6.11, 6.7, 6.5, 6.2, 6.0, V5.0-9c13870 fft lengths, and earlier
 
For gpuowl fft lengths, K=1024, M=1024[SUP]2[/SUP]. Prior to the V5.0-9c13870 commit, the M=3 in the V5.0 list were not available.
Prior to V5.0-df2bdf2, <0.5M were not available.
Prior to ~V3.5, the M=5 and M=9 were not available.
V3.3 supported 4, 5, 8, 10, 16, 20M.
V2.0 supported 5000K only.
V1.9 supported 2, 4, and 8M.
V1.0 and earlier were 4M only. (I think the earliest PRP was at V0.7, prior to that it was LL)
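
In the lists that follow, the fft length appears to equal 2 x W x H x M (for example 512 512 1 gives 0.5M, and 4K-2K-9 gives 144M). That relation is inferred from the tabulated values, not taken from the gpuowl source. A small sketch, with hypothetical helper names, for converting a configuration string to a length:

```python
K = 1024
M = 1024 ** 2


def parse_dim(token):
    """Convert a dimension such as '512', '1K' or '4K' to an integer."""
    if token.upper().endswith("K"):
        return int(token[:-1]) * K
    return int(token)


def fft_size(config):
    """Return the fft length in words for a 'W-H' or 'W-H-Middle' string.

    Uses length = 2 * W * H * Middle, a relation inferred from the
    tables in this post (an assumption, not gpuowl documentation).
    """
    parts = [parse_dim(token) for token in config.split("-")]
    width, height = parts[0], parts[1]
    middle = parts[2] if len(parts) > 2 else 1
    return 2 * width * height * middle


def format_size(words):
    """Format a length the way the lists do: whole M if exact, else K."""
    if words % M == 0:
        return f"{words // M}M"
    return f"{words // K}K"


if __name__ == "__main__":
    for cfg in ("512-512", "64-64-6", "1K-256-9", "4K-2K-9", "4K-2K-12"):
        print(cfg, "->", format_size(fft_size(cfg)))
```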

V4.3:[CODE]openowl -list fft
2021-03-30 19:51:07 gpuowl 4.3-537c681
2021-03-30 19:51:07 FFT maxExp W H M
2021-03-30 19:51:07 0.5M 10.3M 512 512 1
2021-03-30 19:51:07 1.0M 20.3M 1024 512 1
2021-03-30 19:51:07 1.0M 20.3M 512 1024 1
2021-03-30 19:51:07 2.0M 39.8M 1024 1024 1
2021-03-30 19:51:07 2.0M 39.8M 512 2048 1
2021-03-30 19:51:07 2.0M 39.8M 2048 512 1
2021-03-30 19:51:07 2.5M 49.4M 512 512 5
2021-03-30 19:51:07 4.0M 78.0M 1024 2048 1
2021-03-30 19:51:07 4.0M 78.0M 2048 1024 1
2021-03-30 19:51:07 4.0M 78.0M 4096 512 1
2021-03-30 19:51:07 4.5M 87.5M 512 512 9
2021-03-30 19:51:07 5.0M 96.9M 1024 512 5
2021-03-30 19:51:07 5.0M 96.9M 512 1024 5
2021-03-30 19:51:07 8.0M 153.0M 2048 2048 1
2021-03-30 19:51:07 8.0M 153.0M 4096 1024 1
2021-03-30 19:51:07 9.0M 171.6M 1024 512 9
2021-03-30 19:51:07 9.0M 171.6M 512 1024 9
2021-03-30 19:51:07 10.0M 190.0M 1024 1024 5
2021-03-30 19:51:07 10.0M 190.0M 512 2048 5
2021-03-30 19:51:07 10.0M 190.0M 2048 512 5
2021-03-30 19:51:07 16.0M 300.0M 4096 2048 1
2021-03-30 19:51:07 18.0M 336.3M 1024 1024 9
2021-03-30 19:51:07 18.0M 336.3M 512 2048 9
2021-03-30 19:51:07 18.0M 336.3M 2048 512 9
2021-03-30 19:51:07 20.0M 372.5M 1024 2048 5
2021-03-30 19:51:07 20.0M 372.5M 2048 1024 5
2021-03-30 19:51:07 20.0M 372.5M 4096 512 5
2021-03-30 19:51:07 36.0M 659.0M 1024 2048 9
2021-03-30 19:51:07 36.0M 659.0M 2048 1024 9
2021-03-30 19:51:07 36.0M 659.0M 4096 512 9
2021-03-30 19:51:07 40.0M 730.0M 2048 2048 5
2021-03-30 19:51:07 40.0M 730.0M 4096 1024 5
2021-03-30 19:51:07 72.0M 1290.9M 2048 2048 9
2021-03-30 19:51:07 72.0M 1290.9M 4096 1024 9
2021-03-30 19:51:07 80.0M 1429.8M 4096 2048 5
2021-03-30 19:51:07 144.0M 2527.5M 4096 2048 9[/CODE] V5.0:[CODE]gpuowl 5.0-9c13870
-list fft
FFT maxExp W H M
0.1M 2.6M 256 256 1
0.2M 5.2M 256 512 1
0.2M 5.2M 512 256 1
0.4M 7.7M 256 256 3
0.5M 10.2M 1024 256 1
0.5M 10.2M 256 1024 1
0.5M 10.2M 512 512 1
0.6M 12.7M 256 256 5
0.8M 15.1M 256 512 3
0.8M 15.1M 512 256 3
1.0M 20.0M 1024 512 1
1.0M 20.0M 256 2048 1
1.0M 20.0M 512 1024 1
1.0M 20.0M 2048 256 1
1.1M 22.5M 256 256 9
1.2M 24.9M 256 512 5
1.2M 24.9M 512 256 5
1.5M 29.7M 1024 256 3
1.5M 29.7M 256 1024 3
1.5M 29.7M 512 512 3
2.0M 39.3M 1024 1024 1
2.0M 39.3M 512 2048 1
2.0M 39.3M 2048 512 1
2.0M 39.3M 4096 256 1
2.2M 44.1M 256 512 9
2.2M 44.1M 512 256 9
2.5M 48.9M 1024 256 5
2.5M 48.9M 256 1024 5
2.5M 48.9M 512 512 5
3.0M 58.4M 1024 512 3
3.0M 58.4M 256 2048 3
3.0M 58.4M 512 1024 3
3.0M 58.4M 2048 256 3
4.0M 77.3M 1024 2048 1
4.0M 77.3M 2048 1024 1
4.0M 77.3M 4096 512 1
4.5M 86.7M 1024 256 9
4.5M 86.7M 256 1024 9
4.5M 86.7M 512 512 9
5.0M 96.1M 1024 512 5
5.0M 96.1M 256 2048 5
5.0M 96.1M 512 1024 5
5.0M 96.1M 2048 256 5
6.0M 114.7M 1024 1024 3
6.0M 114.7M 512 2048 3
6.0M 114.7M 2048 512 3
6.0M 114.7M 4096 256 3
8.0M 151.8M 2048 2048 1
8.0M 151.8M 4096 1024 1
9.0M 170.3M 1024 512 9
9.0M 170.3M 256 2048 9
9.0M 170.3M 512 1024 9
9.0M 170.3M 2048 256 9
10.0M 188.7M 1024 1024 5
10.0M 188.7M 512 2048 5
10.0M 188.7M 2048 512 5
10.0M 188.7M 4096 256 5
12.0M 225.3M 1024 2048 3
12.0M 225.3M 2048 1024 3
12.0M 225.3M 4096 512 3
16.0M 298.1M 4096 2048 1
18.0M 334.3M 1024 1024 9
18.0M 334.3M 512 2048 9
18.0M 334.3M 2048 512 9
18.0M 334.3M 4096 256 9
20.0M 370.4M 1024 2048 5
20.0M 370.4M 2048 1024 5
20.0M 370.4M 4096 512 5
24.0M 442.3M 2048 2048 3
24.0M 442.3M 4096 1024 3
36.0M 656.2M 1024 2048 9
36.0M 656.2M 2048 1024 9
36.0M 656.2M 4096 512 9
40.0M 727.0M 2048 2048 5
40.0M 727.0M 4096 1024 5
48.0M 868.1M 4096 2048 3
72.0M 1287.5M 2048 2048 9
72.0M 1287.5M 4096 1024 9
80.0M 1426.4M 4096 2048 5
144.0M 2525.2M 4096 2048 9[/CODE]V6.0:[CODE]C:\msys64\home\ken\gpuowl-compile\v6.0-b7bb1c3>openowl -list fft
2019-02-04 23:05:21 gpuowl 6.0-b7bb1c3
2019-02-04 23:05:21 -list fft
2019-02-04 23:05:21 FFT 8K [ 0.01M - 0.18M] 64-64
2019-02-04 23:05:21 FFT 24K [ 0.04M - 0.51M] 64-64-3
2019-02-04 23:05:21 FFT 32K [ 0.05M - 0.68M] 64-256 256-64
2019-02-04 23:05:21 FFT 40K [ 0.06M - 0.85M] 64-64-5
2019-02-04 23:05:21 FFT 64K [ 0.10M - 1.34M] 64-512 512-64
2019-02-04 23:05:21 FFT 72K [ 0.11M - 1.50M] 64-64-9
2019-02-04 23:05:21 FFT 96K [ 0.15M - 1.99M] 64-256-3 256-64-3
2019-02-04 23:05:21 FFT 128K [ 0.20M - 2.63M] 1K-64 64-1K 256-256
2019-02-04 23:05:21 FFT 160K [ 0.25M - 3.27M] 64-256-5 256-64-5
2019-02-04 23:05:21 FFT 192K [ 0.29M - 3.91M] 64-512-3 512-64-3
2019-02-04 23:05:21 FFT 256K [ 0.39M - 5.18M] 64-2K 256-512 512-256 2K-64
2019-02-04 23:05:21 FFT 288K [ 0.44M - 5.81M] 64-256-9 256-64-9
2019-02-04 23:05:21 FFT 320K [ 0.49M - 6.44M] 64-512-5 512-64-5
2019-02-04 23:05:21 FFT 384K [ 0.59M - 7.69M] 1K-64-3 64-1K-3 256-256-3
2019-02-04 23:05:21 FFT 512K [ 0.79M - 10.18M] 1K-256 256-1K 512-512 4K-64
2019-02-04 23:05:21 FFT 576K [ 0.88M - 11.42M] 64-512-9 512-64-9
2019-02-04 23:05:21 FFT 640K [ 0.98M - 12.66M] 1K-64-5 64-1K-5 256-256-5
2019-02-04 23:05:21 FFT 768K [ 1.18M - 15.12M] 64-2K-3 256-512-3 512-256-3 2K-64-3
2019-02-04 23:05:21 FFT 1M [ 1.57M - 20.02M] 1K-512 256-2K 512-1K 2K-256
2019-02-04 23:05:21 FFT 1152K [ 1.77M - 22.45M] 1K-64-9 64-1K-9 256-256-9
2019-02-04 23:05:21 FFT 1280K [ 1.97M - 24.88M] 64-2K-5 256-512-5 512-256-5 2K-64-5
2019-02-04 23:05:21 FFT 1536K [ 2.36M - 29.72M] 1K-256-3 256-1K-3 512-512-3 4K-64-3
2019-02-04 23:05:21 FFT 2M [ 3.15M - 39.34M] 1K-1K 512-2K 2K-512 4K-256
2019-02-04 23:05:21 FFT 2304K [ 3.54M - 44.13M] 64-2K-9 256-512-9 512-256-9 2K-64-9
2019-02-04 23:05:21 FFT 2560K [ 3.93M - 48.90M] 1K-256-5 256-1K-5 512-512-5 4K-64-5
2019-02-04 23:05:21 FFT 3M [ 4.72M - 58.41M] 1K-512-3 256-2K-3 512-1K-3 2K-256-3
2019-02-04 23:05:21 FFT 4M [ 6.29M - 77.30M] 1K-2K 2K-1K 4K-512
2019-02-04 23:05:21 FFT 4608K [ 7.08M - 86.70M] 1K-256-9 256-1K-9 512-512-9 4K-64-9
2019-02-04 23:05:21 FFT 5M [ 7.86M - 96.07M] 1K-512-5 256-2K-5 512-1K-5 2K-256-5
2019-02-04 23:05:21 FFT 6M [ 9.44M - 114.74M] 1K-1K-3 512-2K-3 2K-512-3 4K-256-3
2019-02-04 23:05:21 FFT 8M [ 12.58M - 151.83M] 2K-2K 4K-1K
2019-02-04 23:05:21 FFT 9M [ 14.16M - 170.28M] 1K-512-9 256-2K-9 512-1K-9 2K-256-9
2019-02-04 23:05:21 FFT 10M [ 15.73M - 188.68M] 1K-1K-5 512-2K-5 2K-512-5 4K-256-5
2019-02-04 23:05:21 FFT 12M [ 18.87M - 225.32M] 1K-2K-3 2K-1K-3 4K-512-3
2019-02-04 23:05:21 FFT 16M [ 25.17M - 298.13M] 4K-2K
2019-02-04 23:05:21 FFT 18M [ 28.31M - 334.34M] 1K-1K-9 512-2K-9 2K-512-9 4K-256-9
2019-02-04 23:05:21 FFT 20M [ 31.46M - 370.44M] 1K-2K-5 2K-1K-5 4K-512-5
2019-02-04 23:05:21 FFT 24M [ 37.75M - 442.34M] 2K-2K-3 4K-1K-3
2019-02-04 23:05:21 FFT 36M [ 56.62M - 656.22M] 1K-2K-9 2K-1K-9 4K-512-9
2019-02-04 23:05:21 FFT 40M [ 62.91M - 727.03M] 2K-2K-5 4K-1K-5
2019-02-04 23:05:21 FFT 48M [ 75.50M - 868.07M] 4K-2K-3
2019-02-04 23:05:21 FFT 72M [113.25M - 1287.53M] 2K-2K-9 4K-1K-9
2019-02-04 23:05:21 FFT 80M [125.83M - 1426.38M] 4K-2K-5
2019-02-04 23:05:21 FFT 144M [226.49M - 2525.23M] 4K-2K-9[/CODE]V6.2-4a213af:[CODE]FFT Configurations:
FFT 8K [ 0.01M - 0.18M] 64-64
FFT 32K [ 0.05M - 0.68M] 64-256 256-64
FFT 48K [ 0.07M - 1.01M] 64-64-6
FFT 64K [ 0.10M - 1.34M] 64-512 512-64
FFT 72K [ 0.11M - 1.50M] 64-64-9
FFT 80K [ 0.12M - 1.66M] 64-64-10
FFT 128K [ 0.20M - 2.63M] 1K-64 64-1K 256-256
FFT 192K [ 0.29M - 3.91M] 64-256-6 256-64-6
FFT 256K [ 0.39M - 5.18M] 64-2K 256-512 512-256 2K-64
FFT 288K [ 0.44M - 5.81M] 64-256-9 256-64-9
FFT 320K [ 0.49M - 6.44M] 64-256-10 256-64-10
FFT 384K [ 0.59M - 7.69M] 64-512-6 512-64-6
FFT 512K [ 0.79M - 10.18M] 1K-256 256-1K 512-512 4K-64
FFT 576K [ 0.88M - 11.42M] 64-512-9 512-64-9
FFT 640K [ 0.98M - 12.66M] 64-512-10 512-64-10
FFT 768K [ 1.18M - 15.12M] 1K-64-6 64-1K-6 256-256-6
FFT 1M [ 1.57M - 20.02M] 1K-512 256-2K 512-1K 2K-256
FFT 1152K [ 1.77M - 22.45M] 1K-64-9 64-1K-9 256-256-9
FFT 1280K [ 1.97M - 24.88M] 1K-64-10 64-1K-10 256-256-10
FFT 1536K [ 2.36M - 29.72M] 64-2K-6 256-512-6 512-256-6 2K-64-6
FFT 2M [ 3.15M - 39.34M] 1K-1K 512-2K 2K-512 4K-256
FFT 2304K [ 3.54M - 44.13M] 64-2K-9 256-512-9 512-256-9 2K-64-9
FFT 2560K [ 3.93M - 48.90M] 64-2K-10 256-512-10 512-256-10 2K-64-10
FFT 3M [ 4.72M - 58.41M] 1K-256-6 256-1K-6 512-512-6 4K-64-6
FFT 4M [ 6.29M - 77.30M] 1K-2K 2K-1K 4K-512
FFT 4608K [ 7.08M - 86.70M] 1K-256-9 256-1K-9 512-512-9 4K-64-9
FFT 5M [ 7.86M - 96.07M] 1K-256-10 256-1K-10 512-512-10 4K-64-10
FFT 6M [ 9.44M - 114.74M] 1K-512-6 256-2K-6 512-1K-6 2K-256-6
FFT 8M [ 12.58M - 151.83M] 2K-2K 4K-1K
FFT 9M [ 14.16M - 170.28M] 1K-512-9 256-2K-9 512-1K-9 2K-256-9
FFT 10M [ 15.73M - 188.68M] 1K-512-10 256-2K-10 512-1K-10 2K-256-10
FFT 12M [ 18.87M - 225.32M] 1K-1K-6 512-2K-6 2K-512-6 4K-256-6
FFT 16M [ 25.17M - 298.13M] 4K-2K
FFT 18M [ 28.31M - 334.34M] 1K-1K-9 512-2K-9 2K-512-9 4K-256-9
FFT 20M [ 31.46M - 370.44M] 1K-1K-10 512-2K-10 2K-512-10 4K-256-10
FFT 24M [ 37.75M - 442.34M] 1K-2K-6 2K-1K-6 4K-512-6
FFT 36M [ 56.62M - 656.22M] 1K-2K-9 2K-1K-9 4K-512-9
FFT 40M [ 62.91M - 727.03M] 1K-2K-10 2K-1K-10 4K-512-10
FFT 48M [ 75.50M - 868.07M] 2K-2K-6 4K-1K-6
FFT 72M [113.25M - 1287.53M] 2K-2K-9 4K-1K-9
FFT 80M [125.83M - 1426.38M] 2K-2K-10 4K-1K-10
FFT 96M [150.99M - 1702.92M] 4K-2K-6
FFT 144M [226.49M - 2525.23M] 4K-2K-9
FFT 160M [251.66M - 2797.39M] 4K-2K-10[/CODE]As far as I know, up to and including some commit of V6.5 the fft list is the same as for V6.2.
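
For picking an fft length for a given exponent from a saved copy of such a list, something like the following sketch may help. It assumes lines in the V6.x "FFT size [min - max] configs" layout shown above, that the list is in ascending size order, and that the M in the exponent bounds means 10^6. The function names are hypothetical and this is not part of gpuowl.

```python
import re

# Matches e.g. "FFT 4608K [ 7.08M - 86.70M] 1K-256-9 256-1K-9"
LINE_RE = re.compile(
    r"FFT\s+(\d+[KM])\s+\[\s*([\d.]+)M\s*-\s*([\d.]+)M\]\s+(.*)"
)


def parse_fft_line(line):
    """Parse one list line into a dict, or return None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    size, lo, hi, configs = match.groups()
    return {
        "size": size,
        "min_exp": float(lo) * 1e6,   # assumption: M here means 10^6
        "max_exp": float(hi) * 1e6,
        "configs": configs.split(),
    }


def smallest_fft_for(exponent, lines):
    """Return the first (smallest) entry whose range contains exponent."""
    for line in lines:
        entry = parse_fft_line(line)
        if entry and entry["min_exp"] <= exponent <= entry["max_exp"]:
            return entry
    return None


SAMPLE = [
    "FFT 4M [ 6.29M - 77.30M] 1K-2K 2K-1K 4K-512",
    "FFT 4608K [ 7.08M - 86.70M] 1K-256-9 256-1K-9 512-512-9 4K-64-9",
    "FFT 5M [ 7.86M - 96.07M] 1K-256-10 256-1K-10 512-512-10 4K-64-10",
]

if __name__ == "__main__":
    entry = smallest_fft_for(83869400, SAMPLE)
    print(entry["size"], entry["configs"])
```

The tabulated upper bounds are the program's own estimates, so an exponent very close to a bound may still be safer on the next length up.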
For v6.5-c48d46f (but note, don't use combinations with height 64 and a middle step, per [URL]https://www.mersenneforum.org/showpost.php?p=517774&postcount=1204[/URL]; assuming the fft list is again W H Middle, those are the ones below in bold):[CODE]
FFT Configurations:
FFT 8K [ 0.01M - 0.18M] 64-64
FFT 32K [ 0.05M - 0.68M] 64-256 256-64
FFT 48K [ 0.07M - 1.01M] [B]64-64-6[/B]
FFT 64K [ 0.10M - 1.34M] 64-512 512-64
FFT 72K [ 0.11M - 1.50M] [B]64-64-9[/B]
FFT 80K [ 0.12M - 1.66M] [B]64-64-10[/B]
FFT 128K [ 0.20M - 2.63M] 1K-64 64-1K 256-256
FFT 192K [ 0.29M - 3.91M] 64-256-6 [B]256-64-6[/B]
FFT 256K [ 0.39M - 5.18M] 64-2K 256-512 512-256 2K-64
FFT 288K [ 0.44M - 5.81M] 64-256-9 [B]256-64-9[/B]
FFT 320K [ 0.49M - 6.44M] 64-256-10 [B]256-64-10[/B]
FFT 384K [ 0.59M - 7.69M] 64-512-6 [B]512-64-6[/B]
FFT 512K [ 0.79M - 10.18M] 1K-256 256-1K 512-512 4K-64
FFT 576K [ 0.88M - 11.42M] 64-512-9 [B]512-64-9[/B]
FFT 640K [ 0.98M - 12.66M] 64-512-10 [B]512-64-10[/B]
FFT 768K [ 1.18M - 15.12M] [B]1K-64-6[/B] 64-1K-6 256-256-6
FFT 1M [ 1.57M - 20.02M] 1K-512 256-2K 512-1K 2K-256
FFT 1152K [ 1.77M - 22.45M] [B]1K-64-9[/B] 64-1K-9 256-256-9
FFT 1280K [ 1.97M - 24.88M] [B]1K-64-10[/B] 64-1K-10 256-256-10
FFT 1536K [ 2.36M - 29.72M] 64-2K-6 256-512-6 512-256-6 [B]2K-64-6 [/B]
FFT 2M [ 3.15M - 39.34M] 1K-1K 512-2K 2K-512 4K-256
FFT 2304K [ 3.54M - 44.13M] 64-2K-9 256-512-9 512-256-9 [B]2K-64-9[/B]
FFT 2560K [ 3.93M - 48.90M] 64-2K-10 256-512-10 512-256-10 [B]2K-64-10[/B]
FFT 3M [ 4.72M - 58.41M] 1K-256-6 256-1K-6 512-512-6 [B]4K-64-6[/B]
FFT 4M [ 6.29M - 77.30M] 1K-2K 2K-1K 4K-512
FFT 4608K [ 7.08M - 86.70M] 1K-256-9 256-1K-9 512-512-9 [B]4K-64-9[/B]
FFT 5M [ 7.86M - 96.07M] 1K-256-10 256-1K-10 512-512-10 [B]4K-64-10[/B]
FFT 6M [ 9.44M - 114.74M] 1K-512-6 256-2K-6 512-1K-6 2K-256-6
FFT 8M [ 12.58M - 151.83M] 2K-2K 4K-1K
FFT 9M [ 14.16M - 170.28M] 1K-512-9 256-2K-9 512-1K-9 2K-256-9
FFT 10M [ 15.73M - 188.68M] 1K-512-10 256-2K-10 512-1K-10 2K-256-10
FFT 12M [ 18.87M - 225.32M] 1K-1K-6 512-2K-6 2K-512-6 4K-256-6
FFT 16M [ 25.17M - 298.13M] 4K-2K
FFT 18M [ 28.31M - 334.34M] 1K-1K-9 512-2K-9 2K-512-9 4K-256-9
FFT 20M [ 31.46M - 370.44M] 1K-1K-10 512-2K-10 2K-512-10 4K-256-10
FFT 24M [ 37.75M - 442.34M] 1K-2K-6 2K-1K-6 4K-512-6
FFT 36M [ 56.62M - 656.22M] 1K-2K-9 2K-1K-9 4K-512-9
FFT 40M [ 62.91M - 727.03M] 1K-2K-10 2K-1K-10 4K-512-10
FFT 48M [ 75.50M - 868.07M] 2K-2K-6 4K-1K-6
FFT 72M [113.25M - 1287.53M] 2K-2K-9 4K-1K-9
FFT 80M [125.83M - 1426.38M] 2K-2K-10 4K-1K-10
FFT 96M [150.99M - 1702.92M] 4K-2K-6
FFT 144M [226.49M - 2525.23M] 4K-2K-9
FFT 160M [251.66M - 2797.39M] 4K-2K-10[/CODE]At some point between v6.5-c48d46f (2019-05-07) and v6.5-76-g1ca08e2 (2019-05-27), the maximum fft size was raised to 192M, allowing approximately gigadigit tests. V6.5-84-g30c0508 is the last build I have indicating FFT 192M [301.99M - [B]3339.40M[/B]] 4K-2K-12 (upper limit 1.005 gigadigits). I think that, due to experience with error rates, the limits specific to fft lengths were later revised downward slightly.
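
The gigadigit figure follows from the usual digit count for a Mersenne number: 2^p - 1 has floor(p x log10(2)) + 1 decimal digits. A quick sketch to check it, with a function name of my own choosing and taking 3339.40M to mean 3339.40 x 10^6:

```python
import math


def mersenne_digits(p):
    """Number of decimal digits of 2**p - 1, for p >= 1."""
    return math.floor(p * math.log10(2)) + 1


if __name__ == "__main__":
    upper = 3339400000  # 3339.40M, taking M as 10^6
    print(mersenne_digits(upper) / 1e9, "gigadigits")
```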

Following are for V6.7-4.[CODE]FFT Configurations:
FFT 8K [ 0.01M - 0.17M] 64-64
FFT 32K [ 0.05M - 0.68M] 64-256 256-64
FFT 64K [ 0.10M - 1.33M] 64-512 512-64
FFT 128K [ 0.20M - 2.62M] 1K-64 64-1K 256-256
FFT 192K [ 0.29M - 3.89M] 64-256-6
FFT 224K [ 0.34M - 4.52M] 64-256-7
FFT 256K [ 0.39M - 5.15M] 64-2K 256-512 512-256 2K-64
FFT 288K [ 0.44M - 5.77M] 64-256-9
FFT 320K [ 0.49M - 6.40M] 64-256-10
FFT 352K [ 0.54M - 7.02M] 64-256-11
FFT 384K [ 0.59M - 7.64M] 64-256-12 64-512-6
FFT 448K [ 0.69M - 8.88M] 64-512-7
FFT 512K [ 0.79M - 10.12M] 1K-256 256-1K 512-512 4K-64
FFT 576K [ 0.88M - 11.35M] 64-512-9
FFT 640K [ 0.98M - 12.58M] 64-512-10
FFT 704K [ 1.08M - 13.81M] 64-512-11
FFT 768K [ 1.18M - 15.03M] 64-512-12 64-1K-6 256-256-6
FFT 896K [ 1.38M - 17.47M] 64-1K-7 256-256-7
FFT 1M [ 1.57M - 19.89M] 1K-512 256-2K 512-1K 2K-256
FFT 1152K [ 1.77M - 22.32M] 64-1K-9 256-256-9
FFT 1280K [ 1.97M - 24.73M] 64-1K-10 256-256-10
FFT 1408K [ 2.16M - 27.14M] 64-1K-11 256-256-11
FFT 1536K [ 2.36M - 29.54M] 64-1K-12 64-2K-6 256-256-12 256-512-6 512-256-6
FFT 1792K [ 2.75M - 34.33M] 64-2K-7 256-512-7 512-256-7
FFT 2M [ 3.15M - 39.10M] 1K-1K 512-2K 2K-512 4K-256
FFT 2304K [ 3.54M - 43.85M] 64-2K-9 256-512-9 512-256-9
FFT 2560K [ 3.93M - 48.59M] 64-2K-10 256-512-10 512-256-10
FFT 2816K [ 4.33M - 53.32M] 64-2K-11 256-512-11 512-256-11
FFT 3M [ 4.72M - 58.04M] 1K-256-6 64-2K-12 256-512-12 256-1K-6 512-256-12 512-512-6
FFT 3584K [ 5.51M - 67.44M] 1K-256-7 256-1K-7 512-512-7
FFT 4M [ 6.29M - 76.81M] 1K-2K 2K-1K 4K-512
FFT 4608K [ 7.08M - 86.15M] 1K-256-9 256-1K-9 512-512-9
FFT 5M [ 7.86M - 95.46M] 1K-256-10 256-1K-10 512-512-10
FFT 5632K [ 8.65M - 104.74M] 1K-256-11 256-1K-11 512-512-11
FFT 6M [ 9.44M - 114.00M] 1K-256-12 1K-512-6 256-1K-12 256-2K-6 512-512-12 512-1K-6 2K-256-6
FFT 7M [ 11.01M - 132.46M] 1K-512-7 256-2K-7 512-1K-7 2K-256-7
FFT 8M [ 12.58M - 150.85M] 2K-2K 4K-1K
FFT 9M [ 14.16M - 169.18M] 1K-512-9 256-2K-9 512-1K-9 2K-256-9
FFT 10M [ 15.73M - 187.45M] 1K-512-10 256-2K-10 512-1K-10 2K-256-10
FFT 11M [ 17.30M - 205.67M] 1K-512-11 256-2K-11 512-1K-11 2K-256-11
FFT 12M [ 18.87M - 223.85M] 1K-512-12 1K-1K-6 256-2K-12 512-1K-12 512-2K-6 2K-256-12 2K-512-6 4K-256-6
FFT 14M [ 22.02M - 260.08M] 1K-1K-7 512-2K-7 2K-512-7 4K-256-7
FFT 16M [ 25.17M - 296.17M] 4K-2K
FFT 18M [ 28.31M - 332.13M] 1K-1K-9 512-2K-9 2K-512-9 4K-256-9
FFT 20M [ 31.46M - 367.98M] 1K-1K-10 512-2K-10 2K-512-10 4K-256-10
FFT 22M [ 34.60M - 403.74M] 1K-1K-11 512-2K-11 2K-512-11 4K-256-11
FFT 24M [ 37.75M - 439.40M] 1K-1K-12 1K-2K-6 512-2K-12 2K-512-12 2K-1K-6 4K-256-12 4K-512-6
FFT 28M [ 44.04M - 510.47M] 1K-2K-7 2K-1K-7 4K-512-7
FFT 36M [ 56.62M - 651.81M] 1K-2K-9 2K-1K-9 4K-512-9
FFT 40M [ 62.91M - 722.13M] 1K-2K-10 2K-1K-10 4K-512-10
FFT 44M [ 69.21M - 792.25M] 1K-2K-11 2K-1K-11 4K-512-11
FFT 48M [ 75.50M - 862.18M] 1K-2K-12 2K-1K-12 2K-2K-6 4K-512-12 4K-1K-6
FFT 56M [ 88.08M - 1001.57M] 2K-2K-7 4K-1K-7
FFT 72M [113.25M - 1278.70M] 2K-2K-9 4K-1K-9
FFT 80M [125.83M - 1416.57M] 2K-2K-10 4K-1K-10
FFT 88M [138.41M - 1554.04M] 2K-2K-11 4K-1K-11
FFT 96M [150.99M - 1691.15M] 2K-2K-12 4K-1K-12 4K-2K-6
FFT 112M [176.16M - 1964.39M] 4K-2K-7
FFT 144M [226.49M - 2507.57M] 4K-2K-9
FFT 160M [251.66M - 2777.78M] 4K-2K-10
FFT 176M [276.82M - 3047.18M] 4K-2K-11
FFT 192M [301.99M - 3315.86M] 4K-2K-12[/CODE]As of V6.11-219, the list was:[CODE]FFT Configurations:
FFT 128K [ 0.20M - 2.62M] 256-256
FFT 256K [ 0.39M - 5.15M] 256-512 512-256
FFT 512K [ 0.79M - 10.12M] 1K-256 256-256-4 256-1K 512-512
FFT 768K [ 1.18M - 15.03M] 256-256-6
FFT 896K [ 1.38M - 17.47M] 256-256-7
FFT 1M [ 1.57M - 19.89M] 1K-512 256-256-8 256-512-4 256-2K 512-256-4 512-1K 2K-256
FFT 1152K [ 1.77M - 22.32M] 256-256-9
FFT 1280K [ 1.97M - 24.73M] 256-256-10
FFT 1408K [ 2.16M - 27.14M] 256-256-11
FFT 1536K [ 2.36M - 29.54M] 256-256-12 256-512-6 512-256-6
FFT 1792K [ 2.75M - 34.33M] 256-512-7 512-256-7
FFT 2M [ 3.15M - 39.10M] 1K-256-4 1K-1K 256-512-8 256-1K-4 512-256-8 512-512-4 512-2K 2K-512 4K-256
FFT 2304K [ 3.54M - 43.85M] 256-512-9 512-256-9
FFT 2560K [ 3.93M - 48.59M] 256-512-10 512-256-10
FFT 2816K [ 4.33M - 53.32M] 256-512-11 512-256-11
FFT 3M [ 4.72M - 58.04M] 1K-256-6 256-512-12 256-1K-6 512-256-12 512-512-6
FFT 3584K [ 5.51M - 67.44M] 1K-256-7 256-1K-7 512-512-7
FFT 4M [ 6.29M - 76.81M] 1K-256-8 1K-512-4 1K-2K 256-1K-8 256-2K-4 512-512-8 512-1K-4 2K-256-4 2K-1K 4K-512
FFT 4608K [ 7.08M - 86.15M] 1K-256-9 256-1K-9 512-512-9
FFT 5M [ 7.86M - 95.46M] 1K-256-10 256-1K-10 512-512-10
FFT 5632K [ 8.65M - 104.74M] 1K-256-11 256-1K-11 512-512-11
FFT 6M [ 9.44M - 114.00M] 1K-256-12 1K-512-6 256-1K-12 256-2K-6 512-512-12 512-1K-6 2K-256-6
FFT 7M [ 11.01M - 132.46M] 1K-512-7 256-2K-7 512-1K-7 2K-256-7
FFT 8M [ 12.58M - 150.85M] 1K-512-8 1K-1K-4 256-2K-8 512-1K-8 512-2K-4 2K-256-8 2K-512-4 2K-2K 4K-256-4 4K-1K
FFT 9M [ 14.16M - 169.18M] 1K-512-9 256-2K-9 512-1K-9 2K-256-9
FFT 10M [ 15.73M - 187.45M] 1K-512-10 256-2K-10 512-1K-10 2K-256-10
FFT 11M [ 17.30M - 205.67M] 1K-512-11 256-2K-11 512-1K-11 2K-256-11
FFT 12M [ 18.87M - 223.85M] 1K-512-12 1K-1K-6 256-2K-12 512-1K-12 512-2K-6 2K-256-12 2K-512-6 4K-256-6
FFT 14M [ 22.02M - 260.08M] 1K-1K-7 512-2K-7 2K-512-7 4K-256-7
FFT 16M [ 25.17M - 296.17M] 1K-1K-8 1K-2K-4 512-2K-8 2K-512-8 2K-1K-4 4K-256-8 4K-512-4 4K-2K
FFT 18M [ 28.31M - 332.13M] 1K-1K-9 512-2K-9 2K-512-9 4K-256-9
FFT 20M [ 31.46M - 367.98M] 1K-1K-10 512-2K-10 2K-512-10 4K-256-10
FFT 22M [ 34.60M - 403.74M] 1K-1K-11 512-2K-11 2K-512-11 4K-256-11
FFT 24M [ 37.75M - 439.40M] 1K-1K-12 1K-2K-6 512-2K-12 2K-512-12 2K-1K-6 4K-256-12 4K-512-6
FFT 28M [ 44.04M - 510.47M] 1K-2K-7 2K-1K-7 4K-512-7
FFT 32M [ 50.33M - 581.27M] 1K-2K-8 2K-1K-8 2K-2K-4 4K-512-8 4K-1K-4
FFT 36M [ 56.62M - 651.81M] 1K-2K-9 2K-1K-9 4K-512-9
FFT 40M [ 62.91M - 722.13M] 1K-2K-10 2K-1K-10 4K-512-10
FFT 44M [ 69.21M - 792.25M] 1K-2K-11 2K-1K-11 4K-512-11
FFT 48M [ 75.50M - 862.18M] 1K-2K-12 2K-1K-12 2K-2K-6 4K-512-12 4K-1K-6
FFT 56M [ 88.08M - 1001.57M] 2K-2K-7 4K-1K-7
FFT 64M [100.66M - 1140.39M] 2K-2K-8 4K-1K-8 4K-2K-4
FFT 72M [113.25M - 1278.70M] 2K-2K-9 4K-1K-9
FFT 80M [125.83M - 1416.57M] 2K-2K-10 4K-1K-10
FFT 88M [138.41M - 1554.04M] 2K-2K-11 4K-1K-11
FFT 96M [150.99M - 1691.15M] 2K-2K-12 4K-1K-12 4K-2K-6
FFT 112M [176.16M - 1964.39M] 4K-2K-7
FFT 128M [201.33M - 2236.48M] 4K-2K-8
FFT 144M [226.49M - 2507.57M] 4K-2K-9
FFT 160M [251.66M - 2777.78M] 4K-2K-10
FFT 176M [276.82M - 3047.18M] 4K-2K-11
FFT 192M [301.99M - 3315.86M] 4K-2K-12[/CODE]Subsequently the maximum fft length was pared back to 96M by v6.11-255-g81fa7c3. By v6.11-318-g3109989 the maximum fft length had become 120M, allowing a maximum exponent of ~2[SUP]31[/SUP]. As of v6.11-330-ge5a8f2c, the fft lengths are:[CODE]FFT Configurations (specify with -fft <width>:<middle>:<height> from the set below):
FFT 128K [ 0.20M - 2.63M] 256:1:256
FFT 256K [ 0.39M - 5.18M] 256:1:512 512:1:256
FFT 384K [ 0.59M - 7.72M] 256:3:256
FFT 512K [ 0.79M - 10.25M] 256:4:256
FFT 640K [ 0.98M - 12.72M] 256:5:256
FFT 768K [ 1.18M - 15.22M] 256:6:256 256:3:512 512:3:256
FFT 896K [ 1.38M - 17.68M] 256:7:256
FFT 1M [ 1.57M - 20.20M] 256:8:256 256:4:512 512:4:256
FFT 1152K [ 1.77M - 22.62M] 256:9:256
FFT 1.25M [ 1.97M - 25.07M] 256:10:256 256:5:512 512:5:256
FFT 1408K [ 2.16M - 27.52M] 256:11:256
FFT 1.50M [ 2.36M - 30.00M] 1K:3:256 256:12:256 256:6:512 256:3:1K 512:6:256 512:3:512
FFT 1664K [ 2.56M - 32.44M] 256:13:256
FFT 1.75M [ 2.75M - 34.85M] 256:14:256 256:7:512 512:7:256
FFT 1920K [ 2.95M - 37.23M] 256:15:256
FFT 2M [ 3.15M - 39.82M] 1K:4:256 256:8:512 256:4:1K 512:8:256 512:4:512
FFT 2.25M [ 3.54M - 44.57M] 256:9:512 512:9:256
FFT 2.50M [ 3.93M - 49.41M] 1K:5:256 256:10:512 256:5:1K 512:10:256 512:5:512
FFT 2.75M [ 4.33M - 54.24M] 256:11:512 512:11:256
FFT 3M [ 4.72M - 59.13M] 1K:6:256 1K:3:512 256:12:512 256:6:1K 512:12:256 512:6:512 512:3:1K
FFT 3.25M [ 5.11M - 63.93M] 256:13:512 512:13:256
FFT 3.50M [ 5.51M - 68.67M] 1K:7:256 256:14:512 256:7:1K 512:14:256 512:7:512
FFT 3.75M [ 5.90M - 73.37M] 256:15:512 512:15:256
FFT 4M [ 6.29M - 78.46M] 1K:8:256 1K:4:512 256:8:1K 512:8:512 512:4:1K
FFT 4.50M [ 7.08M - 87.83M] 1K:9:256 256:9:1K 512:9:512
FFT 5M [ 7.86M - 97.36M] 1K:10:256 1K:5:512 256:10:1K 512:10:512 512:5:1K
FFT 5.50M [ 8.65M - 106.88M] 1K:11:256 256:11:1K 512:11:512
FFT 6M [ 9.44M - 116.51M] 1K:12:256 1K:6:512 1K:3:1K 256:12:1K 512:12:512 512:6:1K 4K:3:256
FFT 6.50M [ 10.22M - 125.95M] 1K:13:256 256:13:1K 512:13:512
FFT 7M [ 11.01M - 135.29M] 1K:14:256 1K:7:512 256:14:1K 512:14:512 512:7:1K
FFT 7.50M [ 11.80M - 144.55M] 1K:15:256 256:15:1K 512:15:512
FFT 8M [ 12.58M - 154.59M] 1K:8:512 1K:4:1K 512:8:1K 4K:4:256
FFT 9M [ 14.16M - 173.03M] 1K:9:512 512:9:1K
FFT 10M [ 15.73M - 191.79M] 1K:10:512 1K:5:1K 512:10:1K 4K:5:256
FFT 11M [ 17.30M - 210.53M] 1K:11:512 512:11:1K
FFT 12M [ 18.87M - 229.51M] 1K:12:512 1K:6:1K 512:12:1K 4K:6:256 4K:3:512
FFT 13M [ 20.45M - 248.10M] 1K:13:512 512:13:1K
FFT 14M [ 22.02M - 266.49M] 1K:14:512 1K:7:1K 512:14:1K 4K:7:256
FFT 15M [ 23.59M - 284.71M] 1K:15:512 512:15:1K
FFT 16M [ 25.17M - 304.49M] 1K:8:1K 4K:8:256 4K:4:512
FFT 18M [ 28.31M - 340.79M] 1K:9:1K 4K:9:256
FFT 20M [ 31.46M - 377.72M] 1K:10:1K 4K:10:256 4K:5:512
FFT 22M [ 34.60M - 414.63M] 1K:11:1K 4K:11:256
FFT 24M [ 37.75M - 451.99M] 1K:12:1K 4K:12:256 4K:6:512 4K:3:1K
FFT 26M [ 40.89M - 488.59M] 1K:13:1K 4K:13:256
FFT 28M [ 44.04M - 524.79M] 1K:14:1K 4K:14:256 4K:7:512
FFT 30M [ 47.19M - 560.64M] 1K:15:1K 4K:15:256
FFT 32M [ 50.33M - 599.62M] 4K:8:512 4K:4:1K
FFT 36M [ 56.62M - 671.04M] 4K:9:512
FFT 40M [ 62.91M - 743.74M] 4K:10:512 4K:5:1K
FFT 44M [ 69.21M - 816.39M] 4K:11:512
FFT 48M [ 75.50M - 889.11M] 4K:12:512 4K:6:1K
FFT 52M [ 81.79M - 961.97M] 4K:13:512
FFT 56M [ 88.08M - 1033.20M] 4K:14:512 4K:7:1K
FFT 60M [ 94.37M - 1103.74M] 4K:15:512
FFT 64M [100.66M - 1177.31M] 4K:8:1K
FFT 72M [113.25M - 1321.02M] 4K:9:1K
FFT 80M [125.83M - 1464.31M] 4K:10:1K
FFT 88M [138.41M - 1607.03M] 4K:11:1K
FFT 96M [150.99M - 1751.79M] 4K:12:1K
FFT 104M [163.58M - 1893.52M] 4K:13:1K
FFT 112M [176.16M - 2035.14M] 4K:14:1K
FFT 120M [188.74M - 2172.36M] 4K:15:1K
[/CODE] Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL]
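
In the v6.11-330 list above, each configuration is written width:middle:height, and the listed fft length appears to equal 2 x width x middle x height (256:1:256 gives 128K, 4K:15:1K gives 120M). The lower exponent bound of each row also works out to about 1.5 bits per fft word. The short Python sketch below checks that arithmetic against a few rows of the list. The function names are illustrative only, and the factor of 2 is an inference from the listed values, not something taken from the gpuowl source.

```python
def parse_size(token: str) -> int:
    """Convert a list token such as '256', '512', '1K' or '4K' to an integer."""
    token = token.strip().upper()
    if token.endswith("K"):
        return int(token[:-1]) * 1024
    return int(token)


def fft_length(config: str) -> int:
    """FFT length in words for a 'width:middle:height' string.

    Assumption inferred from the list: length = 2 * width * middle * height.
    """
    width, middle, height = (parse_size(t) for t in config.split(":"))
    return 2 * width * middle * height


def bits_per_word(exponent: float, fft_len: int) -> float:
    """Average number of exponent bits carried per FFT word."""
    return exponent / fft_len


if __name__ == "__main__":
    for config in ("256:1:256", "256:3:256", "1K:13:512", "4K:15:1K"):
        n = fft_length(config)
        # Lower bound implied by 1.5 bits/word, in millions, for comparison
        # with the first number in the bracket on each list row.
        print(f"{config:>10}  {n // 1024}K words  min exponent ~{1.5 * n / 1e6:.2f}M")
```

Comparing the printed minimum exponents with the bracketed ranges in the list is a quick sanity check when choosing a -fft value for a given exponent.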

kriesel 2018-12-14 19:46

PRP3 run time scaling in V5.0-9c13870 (no P-1)
 
1 Attachment(s)
Gpuowl PRP3 has been run on all known Mersenne prime exponents feasible on its currently available fft lengths, mostly in ascending order. From a single run set, this provides run time scaling, a reliability check on the hardware, and a check for any false negatives or detected errors. The test is being run on an RX480 under Windows 7 x64, alongside a running instance of prime95 and an instance of mfakto running on an RX550 in the same system.

For the exponents below 216091, the minimum available fft length, 128K, is too large, giving bits/word below 1.5 and, in most cases, immediate fatal errors. p=132049 runs briefly, at 1.01 bits/word, but repeatably detects Gerbicz check errors in the initial 800-iteration block and exits after 3 rounds of that.
For exponents 216091 to 1398269, the run time is very nearly linear in the exponent, scaling as p[SUP]0.99[/SUP], since they are all run at fft length 128K.
For exponents above 1398269, the fft length is chosen approximately proportional to the exponent, so it seems reasonable to expect the run time to approximate a power law with a power slightly above 2. The fft multiplication time is, per Knuth and other sources, proportional to n ln n ln ln n, and a full PRP3 test takes n-1 iterations, so the total is approximately n[SUP]2[/SUP] ln n ln ln n for large n.
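
The figures in the three cases above can be checked with a little arithmetic. The Python sketch below computes bits/word at the 128K fft length for the exponents mentioned, and the local power-law slope d ln T / d ln n that the model T(n) = n[SUP]2[/SUP] ln n ln ln n would predict, which is 2 + 1/ln n + 1/(ln n ln ln n). It is only a sketch of the expected asymptotic behavior; the function names are illustrative and nothing here is measured data.

```python
import math

FFT_128K = 128 * 1024  # smallest fft length available in the run described above


def bits_per_word(p: int, fft_len: int = FFT_128K) -> float:
    """Average number of exponent bits carried per FFT word."""
    return p / fft_len


def local_power(n: float) -> float:
    """Slope d ln T / d ln n for the model T(n) = n^2 * ln n * ln ln n."""
    ln_n = math.log(n)
    return 2.0 + 1.0 / ln_n + 1.0 / (ln_n * math.log(ln_n))


if __name__ == "__main__":
    for p in (132049, 216091, 1398269):
        print(f"p={p}: {bits_per_word(p):.2f} bits/word at 128K")
    for n in (1e7, 1e8, 1e9):
        print(f"n={n:.0e}: model predicts run time ~ n^{local_power(n):.3f}")
```

Under this model the predicted power stays a little above 2 (below 2.1) for exponents from 10[SUP]7[/SUP] upward and decreases slowly toward 2 as n grows.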

In the attachment for CUDALucas run time scaling at [URL]https://www.mersenneforum.org/showpost.php?p=488523&postcount=2[/URL] there is scaling to p[SUP]1.85[/SUP] for 10[SUP]6[/SUP]<p<10[SUP]7[/SUP], and to p[SUP]2.095[/SUP] for 10[SUP]7[/SUP]<p<10[SUP]8[/SUP].
Run time scaling for prime95 for 86243<=p<=2976221 was p[SUP]2.094[/SUP]; see [URL]https://www.mersenneforum.org/showpost.php?p=502778&postcount=2[/URL]

The scaling for gpuowl appears to be lower than expected and lower than seen for other applications. For 1398269<p<10[SUP]7[/SUP], run time scales as p[SUP]1.518[/SUP]; for 10[SUP]7[/SUP]<p<10[SUP]8[/SUP] it is p[SUP]1.72 to 1.88[/SUP], which implies that the fft multiplication time per iteration scales less than linearly with the exponent, similar to a lower exponent range in CUDALucas. Perhaps gpuowl does not reach asymptotic scaling until higher exponents. From 100M exponent to 100Mdigit, the gpuowl scaling was p[SUP]2.04[/SUP], consistent with that. Low n runs appear to be affected by setup overhead in CUDALucas and clLucas also, reducing the power seen in scaling fits. For gpuOwL, the OpenCL compilation at each start contributes 2 to 3 seconds of overhead. Frequent console or log output may also be contributing.
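
To illustrate how a fixed startup cost can pull down the fitted power, here is a minimal Python sketch of a log-log least-squares fit. The data are synthetic, generated from an assumed t = c p[SUP]2[/SUP] plus a hypothetical 3 second overhead; they are not the measurements in the attachment, and the constant c and the function name are made up for the example.

```python
import math


def fit_power_law(exponents, times):
    """Least-squares fit of ln(t) = k * ln(p) + ln(a); returns (k, a)."""
    xs = [math.log(p) for p in exponents]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    a = math.exp(mean_y - k * mean_x)
    return k, a


# Synthetic data only (NOT measurements from this thread).
C = 1e-12        # arbitrary constant: seconds per p^2
OVERHEAD = 3.0   # hypothetical fixed startup cost in seconds
ps = [10**6, 2 * 10**6, 5 * 10**6, 10**7]
pure = [C * p**2 for p in ps]                # exact p^2 scaling
with_overhead = [t + OVERHEAD for t in pure] # same runs plus fixed startup cost

k_pure, _ = fit_power_law(ps, pure)
k_over, _ = fit_power_law(ps, with_overhead)

if __name__ == "__main__":
    print(f"fitted power without overhead: {k_pure:.3f}")
    print(f"fitted power with overhead:    {k_over:.3f}")
```

With the overhead added, the fitted power comes out below 2 even though the underlying model is exactly quadratic, because the shortest runs are dominated by the fixed cost.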

Finally, and importantly, no false negatives and no detected errors were observed.


Top of reference tree: [URL]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/URL]

kriesel 2019-02-13 01:55

gpuowl .owl file header style versus gpuowl version samples
 
The headers below were viewed with more <n>.owl or head <n>.owl.

Z:\sources\mersennes\gpuowl\ken\v10test>more c77500079.ll
(none)[CODE]Pí² d gÜ?
?g¦ s¢?[/CODE]gpuowl V1.9 line 1 of 38000009.owl
[CODE]OWL 3 38000009 103000 0 500[/CODE]C:\msys64\home\ken\gpuowl-compile\v1.10>head 77230663.owl
[CODE]OWL 3 77230663 1414500 0 500[/CODE]C:\msys64\home\ken\gpuowl-compile\v2.0>more 89000167.owl
[CODE]OWL 3 89000167 41500 0 500[/CODE]-------------------didn't see a file format 4------------
C:\msys64\home\ken\gpuowl-compile\v3.3>more 89000167.owl
[CODE]OWL 5
Comment: gpuOwL v3.3-bc4a29f; 2018-11-04 03:07:53 UTC
Type: PRP
Exponent: 89000167
Iteration: 22000
PRP-block-size: 400
Residue-64: 0xb90013de9a857278
Errors: 0
End-of-header:[/CODE]gpuowl 3.8 .owl file header
[CODE]OWL 5
Comment: gpuOwL v3.8-91c52fa; 2019-02-12 23:14:59 UTC
Type: PRP
Exponent: 299000059
Iteration: 93760000
PRP-block-size: 400
Residue-64: 0x95d3c1aae6883a8b
Errors: 0
End-of-header:[/CODE]C:\msys64\home\ken\gpuowl-compile\v3.9>type 89000167.owl | more
[CODE]OWL 5
Comment: gpuOwL v3.9-da61ebd; 2018-11-04 03:43:17 UTC
Type: PRP
Exponent: 89000167
Iteration: 26400
PRP-block-size: 400
Residue-64: 0x0a03f10ca11565dc
Errors: 0
End-of-header:[/CODE]-----------------------didn't see a file format 6------

C:\msys64\home\ken\gpuowl-compile\v4.3>head 89000167.owl
[CODE]OWL PRP 7 89000167 144000 0 400 624cac006596e5bb
[/CODE]C:\msys64\home\ken\gpuowl-compile\v4.6>more 89000167.owl
[CODE]OWL PRP 7 89000167 22000 0 400 b90013de9a857278[/CODE]C:\msys64\home\ken\gpuowl-compile\v4.7>more 89000167.owl
[CODE]OWL PRP 7 89000167 0 0 400 0000000000000003[/CODE]C:\msys64\home\ken\gpuowl-compile\v5.0>more 89000167.owl
[CODE]OWL PRP 8 89000167 44000 0 400 57049b5adf2df847 1 0
[/CODE]C:\msys64\home\ken\gpuowl-compile\v5.0-9c13870>more 81885841.owl
[CODE]OWL PRP 8 81885841 81760000 860000 400 35d1c3b4bd099ce1 1 0
[/CODE]PRP-1 (PRP & P-1 combined): "OWL PRP" file version 8; the fields are exponent, iteration, B1, blocksize, res64, and two unidentified values (? ?).
A PRP-only run has B1=0.


C:\msys64\home\ken\gpuowl-compile\v6.2-e2ffe65>more 86243.owl
[CODE]OWL PRP 9 86243 800 400 47fcdf05631f4989
[/CODE]Caveats:
Does not include the old 0.1-0.6 LL file formats.
Does not include header info for any version above v6.2.
Didn't find a way to compile the TF-capable versions on Windows, so no such files to look into.
Haven't attempted any P-1-only.
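
For anyone wanting to read these headers programmatically, here is a minimal Python sketch based only on the samples above. It handles the multi-line "OWL 5" header and the one-line "OWL PRP 8" and "OWL PRP 9" headers. The field names for version 8 follow the description above; the names for version 9 are a guess by analogy, and the function name is illustrative. It assumes the caller has already read the leading text part of the file into a string.

```python
def parse_owl_header(text: str) -> dict:
    """Parse the text header of a .owl file into a dict of strings.

    Covers only formats shown in the samples: 'OWL 5' (key: value lines)
    and one-line 'OWL PRP 8' / 'OWL PRP 9' headers.
    """
    lines = text.splitlines()
    first = lines[0].split()

    if first[:2] == ["OWL", "5"]:
        fields = {"format": "5"}
        for line in lines[1:]:
            if line.startswith("End-of-header:"):
                break
            # Split on the first colon only; the Comment value contains colons.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        return fields

    if first[:2] == ["OWL", "PRP"] and len(first) > 2:
        version = first[2]
        if version == "8":
            # Last two fields are unidentified in the description above.
            names = ["exponent", "iteration", "B1", "block_size",
                     "res64", "unknown1", "unknown2"]
        elif version == "9":
            # Assumed by analogy with version 8; not confirmed.
            names = ["exponent", "iteration", "block_size", "res64"]
        else:
            raise ValueError(f"unhandled OWL PRP version {version}")
        values = first[3:]
        if len(values) != len(names):
            raise ValueError("unexpected number of header fields")
        fields = {"format": f"PRP {version}"}
        fields.update(zip(names, values))
        return fields

    raise ValueError("unrecognized .owl header")


if __name__ == "__main__":
    sample = "OWL PRP 8 81885841 81760000 860000 400 35d1c3b4bd099ce1 1 0"
    print(parse_owl_header(sample))
```

Values are left as strings so that nothing is lost; convert exponent, iteration and B1 with int() as needed.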


Top of reference tree: [url]https://www.mersenneforum.org/showpost.php?p=521922&postcount=1[/url]


All times are UTC.
