#12
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
31·173 Posts |
Whether a gpuowl PRP3 primality test can be continued in a given version, versus the version used to save the interim file, is tabulated along with the file version number as given in the file header. All PRP tests for the attachment were made on an RX480 on Win7, many with exponent 77230663.
Early versions, before v0.7, do only LL. To my knowledge there is no continuation compatibility between any LL-only version and any PRP version of gpuowl. Not all LL-only versions are mutually compatible; v0.5 does a random offset but v0.6 requires a zero offset. V7.1 is not compatible with v7.0 save files; finish ongoing work in v7.0 (or presumably earlier) before switching to v7.1 (or presumably higher). https://www.mersenneforum.org/showpo...&postcount=110 There appear to be some boundaries across which work in progress cannot be migrated.
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-03-23 at 14:01. Reason: updated attachment; clarified LL to LL-only versions.
#13
One way to test a new version of prime finding software is to rerun tests on known primes.
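The idea can be mimicked at toy scale: the Lucas-Lehmer recurrence is only a few lines, and rerunning it against small known Mersenne prime exponents validates a toy implementation the same way full runs validate gpuowl. A minimal Python sketch (my own illustration, not gpuowl code):

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test for Mp = 2^p - 1 (p an odd prime):
    Mp is prime iff s_(p-2) == 0, where s_0 = 4 and
    s_(k+1) = s_k^2 - 2 (mod Mp)."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# Rerun against known results: these exponents give Mersenne primes...
assert all(lucas_lehmer(p) for p in [3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127])
# ...and these give composite Mersenne numbers.
assert not any(lucas_lehmer(p) for p in [11, 23, 29, 37, 41])
```

The same principle scales up: a new gpuowl build that reproduces zero residues for the known prime exponents, and matching res64s for known composites, is behaving as expected.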
https://www.mersenneforum.org/showpo...0&postcount=44 lists several such validation runs made for various versions of gpuowl. See also the attachment at https://www.mersenneforum.org/showpo...83&postcount=8 for verifications/validations of all GIMPS-found Mersenne primes with gpuowl V5.0-9c13870.
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2020-02-20 at 20:39
#14
Exponents were selected for the 4608K fft length (current first-test wavefront) and the 18432K fft length (100Mdigit) timing runs, on the available fft options in gpuowl and CUDALucas. Timings should be representative, since for a given fft length they were run as much as possible on the identical gpu and in the fastest possible succession, to undergo the same system and environmental influences (temperature in system and ambient temperature). (CUDALucas can't run on AMD, so there is no comparison data for the RX550 or RX480.) Timings tabulated are averages of multiple 20,000-iteration console outputs, taken after they appeared to have stabilized. These are of consecutive iterations, not the same iteration interval for every timing obtained; the save file was moved from gpu to gpu to obtain cumulative progress toward completion of the test exponents. Due to the large number of iterations timed, and the observed stability of timing versus iteration number, there is little timing variation from iteration number, believed to be <0.1% based on an overnight run.
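The averaging described above reduces to a mean of per-interval ms/iteration figures and a percent difference between programs. A sketch with made-up timing numbers (illustrative only, not the measured data):

```python
# Hypothetical ms/iter values from successive 20,000-iteration console
# lines, taken after timings stabilized (numbers invented for illustration).
gpuowl_ms    = [5.81, 5.79, 5.80, 5.82]
cudalucas_ms = [5.97, 5.95, 5.96, 5.98]

def mean(xs):
    return sum(xs) / len(xs)

# Positive means gpuowl is faster on this fft length and gpu.
speedup_pct = (mean(cudalucas_ms) / mean(gpuowl_ms) - 1.0) * 100.0
print(f"gpuowl speed advantage: {speedup_pct:.1f}%")
```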
The observed speed advantage of gpuowl over CUDALucas on the same gpu unit ranged from 0.2% to 7.5%, averaging about 2.7%. In this limited set of exponent cases, no case was observed of CUDALucas being faster on the same fft length and gpu model, except for a very slight 0.3% CUDALucas advantage on a laptop gtx1050Ti. Note that some previous versions of gpuowl have shown slightly better timings than these on the AMD gpus testable in those versions; possibly those were due to environmental differences. Gpuowl has since undergone considerable optimization and is now much faster, with v6.11-380 typically fastest in my incomplete speed sampling; v7.2-53 also does well. CUDALucas is no longer being maintained and has not kept pace with gpuowl on the same hardware.
Top of this thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-03-17 at 14:45. Reason: added gpuowl optimization comment
#15
There are several PRP residue types (see https://www.mersenneforum.org/showpo...32&postcount=8). For the res64 to match between two primality tests of the same exponent, the residue types must match, unless perhaps the corresponding Mersenne number is prime. (Even if the Mersenne number were prime, the residues would still not match if one primality test was PRP3 and the other PRP-1.)
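As I read the linked residue-type definitions, type 1 is 3^(Mp-1) mod Mp and type 4 is 3^((Mp+1)/2) mod Mp; treat those formulas as my assumption. Since (Mp+1)/2 = 2^(p-1), a type-4 run is just p-1 squarings of 3. A toy Python sketch showing that the two types yield different residues even for a prime Mp:

```python
def prp3_residues(p):
    """Base-3 PRP residues for Mp = 2^p - 1.
    Assumed definitions (from the linked post, as I read it):
      type 1 = 3^(Mp-1) mod Mp
      type 4 = 3^((Mp+1)/2) = 3^(2^(p-1)) mod Mp  (p-1 squarings of 3)."""
    m = (1 << p) - 1
    x = 3
    for _ in range(p - 1):
        x = x * x % m              # repeated squaring, as a PRP run does
    type4 = x
    type1 = pow(3, m - 1, m)       # == 1 when Mp is prime (Fermat)
    return type1, type4

t1, t4 = prp3_residues(13)         # M13 = 8191 is prime
print(t1, t4)                      # the two residue types differ
```

A double check comparing a type-1 res64 against a type-4 res64 would spuriously mismatch, which is why the version table below matters.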
The PRP residue type(s) supported vary by gpuowl version. If attempting to double-check an existing primality test made by prime95 or other software, a gpuowl version must be used that matches the residue type of the first test. Early versions (up through v0.6) implemented the Lucas-Lehmer test instead. A table of residue type(s) versus gpuowl version and some additional info follows in the pdf attachment. In a nutshell: if you want a versatile selection of fft lengths and PRP type-4 residues, choose from v4.3 to v6.5-f34ad18. If you want PRP type-1 residues, choose v6.5-84-30c0508 or more recent, or possibly, if they're faster and have suitable fft lengths, v1.1-v3.9. Available fft lengths versus gpuowl versions are listed at https://www.mersenneforum.org/showpo...36&postcount=9 and some of these also indicate the maximum exponent per fft length.
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2020-04-02 at 18:44. Reason: updated/expanded table; added LL early versions; added v6.5 type 1 commit
#16
For the help output of a different version of gpuowl, run gpuowl -h or gpuowl-win -h, or see the post at the relevant link reachable from http://www.mersenneforum.org/showpos...39&postcount=4, which may include help output.
Built for Windows, tried on an RX480: -h works; -? doesn't without an existing worktodo.txt. Code:
>gpuowl-win -h
2019-07-10 10:29:22 gpuowl v6.5-84-g30c0508
Command line options:
-dir <folder> : specify work directory (containing worktodo.txt, results.txt, config.txt, gpuowl.log)
-user <name> : specify the user name.
-cpu <name> : specify the hardware name.
-time : display kernel profiling information.
-fft <size> : specify FFT size, such as: 5000K, 4M, +2, -1.
-block <value> : PRP GEC block size. Default 1000. Smaller block is slower but detects errors sooner.
-log <step> : log every <step> iterations, default 20000. Multiple of 10000.
-carry long|short : force carry type. Short carry may be faster, but requires high bits/word.
-B1 : P-1 B1 bound, default 500000
-B2 : P-1 B2 bound, default B1 * 30
-rB2 : ratio of B2 to B1. Default 30, used only if B2 is not explicitly set
-prp <exponent> : run a single PRP test and exit, ignoring worktodo.txt
-pm1 <exponent> : run a single P-1 test and exit, ignoring worktodo.txt
-results <file> : name of results file, default 'results.txt'
-iters <N> : run next PRP test for <N> iterations and exit. Multiple of 10000.
-use NEW_FFT8,OLD_FFT5,NEW_FFT10 : comma separated list of defines, see the #if tests in gpuowl.cl (used for perf tuning).
-device <N> : select a specific device:
  0 : Ellesmere-36x1266-@28:0.0 Radeon (TM) RX 480 Graphics
  1 : gfx804-8x1203-@3:0.0 Radeon 550 Series
FFT Configurations:
FFT 8K [ 0.01M - 0.18M] 64-64
FFT 32K [ 0.05M - 0.68M] 64-256 256-64
FFT 64K [ 0.10M - 1.34M] 64-512 512-64
FFT 128K [ 0.20M - 2.63M] 1K-64 64-1K 256-256
FFT 192K [ 0.29M - 3.91M] 64-256-6
FFT 224K [ 0.34M - 4.54M] 64-256-7
FFT 256K [ 0.39M - 5.18M] 64-2K 256-512 512-256 2K-64
FFT 288K [ 0.44M - 5.81M] 64-256-9
FFT 320K [ 0.49M - 6.44M] 64-256-10
FFT 352K [ 0.54M - 7.06M] 64-256-11
FFT 384K [ 0.59M - 7.69M] 64-256-12 64-512-6
FFT 448K [ 0.69M - 8.94M] 64-512-7
FFT 512K [ 0.79M - 10.18M] 1K-256 256-1K 512-512 4K-64
FFT 576K [ 0.88M - 11.42M] 64-512-9
FFT 640K [ 0.98M - 12.66M] 64-512-10
FFT 704K [ 1.08M - 13.89M] 64-512-11
FFT 768K [ 1.18M - 15.12M] 64-512-12 64-1K-6 256-256-6
FFT 896K [ 1.38M - 17.57M] 64-1K-7 256-256-7
FFT 1M [ 1.57M - 20.02M] 1K-512 256-2K 512-1K 2K-256
FFT 1152K [ 1.77M - 22.45M] 64-1K-9 256-256-9
FFT 1280K [ 1.97M - 24.88M] 64-1K-10 256-256-10
FFT 1408K [ 2.16M - 27.31M] 64-1K-11 256-256-11
FFT 1536K [ 2.36M - 29.72M] 64-1K-12 64-2K-6 256-256-12 256-512-6 512-256-6
FFT 1792K [ 2.75M - 34.54M] 64-2K-7 256-512-7 512-256-7
FFT 2M [ 3.15M - 39.34M] 1K-1K 512-2K 2K-512 4K-256
FFT 2304K [ 3.54M - 44.13M] 64-2K-9 256-512-9 512-256-9
FFT 2560K [ 3.93M - 48.90M] 64-2K-10 256-512-10 512-256-10
FFT 2816K [ 4.33M - 53.66M] 64-2K-11 256-512-11 512-256-11
FFT 3M [ 4.72M - 58.41M] 1K-256-6 64-2K-12 256-512-12 256-1K-6 512-256-12 512-512-6
FFT 3584K [ 5.51M - 67.87M] 1K-256-7 256-1K-7 512-512-7
FFT 4M [ 6.29M - 77.30M] 1K-2K 2K-1K 4K-512
FFT 4608K [ 7.08M - 86.70M] 1K-256-9 256-1K-9 512-512-9
FFT 5M [ 7.86M - 96.07M] 1K-256-10 256-1K-10 512-512-10
FFT 5632K [ 8.65M - 105.41M] 1K-256-11 256-1K-11 512-512-11
FFT 6M [ 9.44M - 114.74M] 1K-256-12 1K-512-6 256-1K-12 256-2K-6 512-512-12 512-1K-6 2K-256-6
FFT 7M [ 11.01M - 133.32M] 1K-512-7 256-2K-7 512-1K-7 2K-256-7
FFT 8M [ 12.58M - 151.83M] 2K-2K 4K-1K
FFT 9M [ 14.16M - 170.28M] 1K-512-9 256-2K-9 512-1K-9 2K-256-9
FFT 10M [ 15.73M - 188.68M] 1K-512-10 256-2K-10 512-1K-10 2K-256-10
FFT 11M [ 17.30M - 207.02M] 1K-512-11 256-2K-11 512-1K-11 2K-256-11
FFT 12M [ 18.87M - 225.32M] 1K-512-12 1K-1K-6 256-2K-12 512-1K-12 512-2K-6 2K-256-12 2K-512-6 4K-256-6
FFT 14M [ 22.02M - 261.80M] 1K-1K-7 512-2K-7 2K-512-7 4K-256-7
FFT 16M [ 25.17M - 298.13M] 4K-2K
FFT 18M [ 28.31M - 334.34M] 1K-1K-9 512-2K-9 2K-512-9 4K-256-9
FFT 20M [ 31.46M - 370.44M] 1K-1K-10 512-2K-10 2K-512-10 4K-256-10
FFT 22M [ 34.60M - 406.43M] 1K-1K-11 512-2K-11 2K-512-11 4K-256-11
FFT 24M [ 37.75M - 442.34M] 1K-1K-12 1K-2K-6 512-2K-12 2K-512-12 2K-1K-6 4K-256-12 4K-512-6
FFT 28M [ 44.04M - 513.91M] 1K-2K-7 2K-1K-7 4K-512-7
FFT 36M [ 56.62M - 656.22M] 1K-2K-9 2K-1K-9 4K-512-9
FFT 40M [ 62.91M - 727.03M] 1K-2K-10 2K-1K-10 4K-512-10
FFT 44M [ 69.21M - 797.64M] 1K-2K-11 2K-1K-11 4K-512-11
FFT 48M [ 75.50M - 868.07M] 1K-2K-12 2K-1K-12 2K-2K-6 4K-512-12 4K-1K-6
FFT 56M [ 88.08M - 1008.44M] 2K-2K-7 4K-1K-7
FFT 72M [113.25M - 1287.53M] 2K-2K-9 4K-1K-9
FFT 80M [125.83M - 1426.38M] 2K-2K-10 4K-1K-10
FFT 88M [138.41M - 1564.83M] 2K-2K-11 4K-1K-11
FFT 96M [150.99M - 1702.92M] 2K-2K-12 4K-1K-12 4K-2K-6
FFT 112M [176.16M - 1978.12M] 4K-2K-7
FFT 144M [226.49M - 2525.23M] 4K-2K-9
FFT 160M [251.66M - 2797.39M] 4K-2K-10
FFT 176M [276.82M - 3068.76M] 4K-2K-11
FFT 192M [301.99M - 3339.40M] 4K-2K-12
2019-07-10 10:29:30 Exiting because "help"
2019-07-10 10:29:30 Bye
>gpuowl-win -?
2019-07-10 10:29:43 gpuowl v6.5-84-g30c0508
2019-07-10 10:29:43 Note: no config.txt file found
2019-07-10 10:29:43 config: -?
2019-07-10 10:29:43 Can't open 'worktodo.txt' (mode 'rb')
2019-07-10 10:29:43 Bye
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2020-02-20 at 20:38
#17
Various versions v6.6 and up have been run on a variety of exponents, up to or approaching limits that are gpu-specific and probably somewhat gpuowl-version-specific also. These runs were mostly on Windows 7 Pro x64 systems, some with as little as 12 GB system ram. The Tesla P100 and K80 sets were done on Google Colaboratory Linux VMs with a gpuowl build Fan Ming had posted.
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-01-13 at 19:40. Reason: Radeon VII data updated with GHzD/day column, graphs vs. fft length & exponent, note
#18
In a post at the following link, Robert Gerbicz indicates a rate of ~12 ppm of errors missed in a PRP3 computation. That was at a very small p=17. If that rate held constant over the mersenne.org range of 2 to 10^9, and the error occurrence rate continued at 2% per test, it would correspond to about 12 undetected wrong residues among the approximately 50 million prime exponents (12 ppm × 50 million exponents × 2% error occurrence). The rate of undetected error with GEC drops rapidly with p (x < 2/(2^p − 1)), so it is nonzero but very nearly zero in the range of p of interest to GIMPS. It's very good, far better than the Jacobi symbol check's 50% rate for LL, and either is much better than nothing, but it's not perfect. https://www.mersenneforum.org/showpo...1&postcount=88
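The check being quantified here can be illustrated at toy scale. For the PRP squaring sequence with block size L and checkpoints d_i taken every L squarings, the running product P_t of the checkpoints satisfies P_t^(2^L) · d_0 ≡ P_{t+1} (mod N) when the sequence is error-free, and an injected bit flip breaks the identity. A small Python sketch (my own illustration of the principle, not gpuowl's implementation):

```python
p, L, blocks = 61, 100, 5
N = (1 << p) - 1          # toy modulus; real runs use wavefront-size p

def run(inject_error=False):
    """PRP squaring with a Gerbicz-style checkpoint verification."""
    s = 3
    d = [s]                               # checkpoints d_i = s_(i*L)
    for t in range(blocks):
        for _ in range(L):
            s = s * s % N
        if inject_error and t == 2:
            s ^= 1 << 10                  # simulate a one-bit hardware error
        d.append(s)
    # Since d_(i+1) = d_i^(2^L) when error-free, the running product P of
    # checkpoints must satisfy P_t^(2^L) * d_0 == P_(t+1) (mod N).
    P = d[0]
    for t in range(blocks):
        expected = pow(P, 1 << L, N) * d[0] % N
        P = P * d[t + 1] % N
        if P != expected:
            return False                  # error detected
    return True                           # all blocks consistent

assert run() and not run(inject_error=True)
```

The verification costs roughly one extra block's worth of squarings per check, which is why the overhead is small relative to the protection gained.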
And at https://www.mersenneforum.org/showpo...&postcount=104, R. Gerbicz indicates a probability of multiple errors in a block of ~1/Mp. Also note that errors in code or hardware function outside the Gerbicz check's protection add to the overall error rate, as prime95 found. (The possibilities of human error or deceit add additional possible error in residues reported to PrimeNet, and these probabilities are hard to quantify and control, other than by independent checking.) GIMPS policy appears to be that for both LL and PRP tests, double checking will remain required, except for PRP tests with proof generation and verification (Cert) performed.
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2020-10-04 at 16:37. Reason: qualified DC, for PRP proof/cert exception
#19
See https://www.mersenneforum.org/showpo...8&postcount=99 for results of some experimentation with simultaneous PRP runs or PRP & P-1 tandem runs.
In a nutshell: 107.5 to 108% of the throughput of a single run, at 106% power consumption. The tradeoff is a little more throughput and a little more power efficiency, but nearly double the latency. Note that this performance boost may be specific to the Linux ROCm drivers, and much reduced or absent with amdgpu-pro or Windows; test in your environment. Not tested on the Radeon VII and gpuowl, but likely practical, is phasing of two P-1 runs, so that instance A is running stage 1, with its low memory requirements, while instance B is running stage 2, with its high memory requirements. At modest exponents it may be possible to run two stage 2s in parallel (simultaneously), as is possible in CUDAPm1, on gpus with less ram than the Radeon VII's 16GB. In the months since, considerable effort has been spent on alternate code paths in gpuowl to optimize speed, so the performance of a single instance has been enhanced and the dual-instance throughput advantage has been reduced. Also, memlock was added to help make the tandem P-1 more practical/automated. In v7, stage 1 now uses more gpu ram than previously.
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-03-17 at 14:47
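The quoted percentages translate directly into efficiency and latency figures. A quick back-of-envelope, normalizing a single run to 1.0 (using 107.5% throughput and 106% power for the dual-instance case):

```python
# Figures as quoted for two simultaneous instances on one gpu,
# relative to a single instance = 1.0.
dual_throughput = 1.075
dual_power      = 1.06

# Work per unit energy improves slightly...
efficiency_gain_pct = (dual_throughput / dual_power - 1.0) * 100.0   # ~1.4%
# ...but each individual exponent takes nearly twice as long wall-clock.
latency_factor = 2.0 / dual_throughput                               # ~1.86x
```

Whether that trade is worth it depends on whether total throughput or per-exponent turnaround matters more to you.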
#20
Multiple P-1 applications offer the user the option of controlling the P-1 factoring bounds, or require their input, on the command line or in worktodo records or both. What inputs a user should choose depends on the goal. For the purpose of this post, the goal is to maximize the time saved on average, over many exponents, in searching for a new Mersenne prime. Finding factors is nice, but it is not the goal.
Several strategies have been discussed in some forum threads. Some advocate testing with 1-test-saved bounds. Others advocate quickly testing many exponents with very small bounds, claiming this saves time by eliminating some exponents quickly before running lengthier factoring with larger bounds. There are multiple bounds given on the mersenne.ca page for a single exponent. What saves the most time in the long run has been unclear and, to my knowledge, not well tested and documented.
I chose two exponents to repeatedly P-1 factor with different bounds. The exponents were chosen to represent near-future GIMPS wavefront P-1 factoring, and 100Mdigit exponents. Several sets of bounds were used to determine approximately where the number of exponents factored per day is maximized, under the condition that B2 is twice B1; this fits well with gpuowl's minimum bounds having that relationship. Additionally, the tabulated gpu72 bounds, PrimeNet bounds, and 1-test-saved and 2-tests-saved bounds were run. The odds of finding a factor were computed for each exponent and bounds combination. The probable time expenditure in P-1 factoring was computed from the actual logged run times and the odds of finding factors in each run. All cases were premised on trial factoring to gpu72 limits first. The total number of runs was 12 for the 102M exponent and 14 for the 333M exponent. This took an RX480 almost ten days to complete, using the then-current commit of gpuowl. The actual run times and odds of finding a factor for the various cases were combined for 7 different scenarios and 3 GIMPS cases as applicable. The scenarios are:
A: No further LL is done, and a single PRP returning composite is regarded as definitive (e.g. with proof generation and certification), so future work would consist entirely of a single primality test per exponent. This is a hypothetical or future case; there is still a lot of LL first-test, or PRP without proof, being performed. A single LL or PRP (without proof generation) does not address the issue of honest error or false reports.
B: All future primality tests will be double checked, whether LL or PRP, so future work would consist entirely of two primality tests per exponent. This is, I think, the closest to the 2020 situation for GIMPS. (Only about 1/4 of primality test results were PRP with proof in September 2020.)
C: An equal probability of single or double primality tests going forward, the midpoint case between A and B. This would correspond to PRP being single-tested and LL double-tested, each occurring at about the 2020 rate.
There are a few scenario and case combinations that don't make sense. This leaves 17 combinations for each exponent. The optimal time savings for the GIMPS cases were computed to be:
A: All single primality tests: use the 1-test-saved bounds.
B: The 2020 situation, 2 primality tests: use the PrimeNet bounds.
C: Equal probability of single or double primality tests: use the 1-test-saved bounds.
The max-factors/day-first scenario was observed to be less effective in all cases, for both exponents. Possibly it would do somewhat better if run with a different B2/B1 ratio. The other extreme is the ratio of bounds for the 1-test-saved, 2-tests-saved, gpu72, or PrimeNet scenarios, which is typically about 20 to 30. I concluded that the proper P-1 bounds to use now in gpuowl are the larger mersenne.ca (formerly PrimeNet, now GPU72 row) bounds, immediately (only): without any prior P-1; no 1-test-saved first, no max-factors/day low bounds first.
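The comparison methodology above reduces to an expected-value calculation: a P-1 run costs its own run time, and pays off the avoided primality-test time with the probability of finding a factor. A sketch with placeholder numbers (the real odds and run times are in the attached pdf; the values below are invented for illustration):

```python
def expected_net_saving(prob_factor, p1_hours, test_hours, tests_saved):
    """Expected hours saved by running P-1 with given bounds before
    primality testing. tests_saved is 1 or 2 per the cases above."""
    return prob_factor * tests_saved * test_hours - p1_hours

# Placeholder values for one wavefront exponent, case B (2 tests saved):
low_bounds = expected_net_saving(0.03, 1.0, 60.0, tests_saved=2)  # quick run, low odds
primenet   = expected_net_saving(0.05, 2.5, 60.0, tests_saved=2)  # slower run, better odds
# With these numbers the larger bounds yield the greater expected saving,
# matching the case-B conclusion above; real inputs may differ.
```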
For example: https://www.mersenne.ca/exponent/104399917 Note that these test runs were made before the latest performance advances in gpuowl. The actual bounds, odds, and calculations are documented in the attached pdf. After these tests were done, the PrimeNet and gpu72 bounds on mersenne.ca were revised; formerly the PrimeNet bounds were higher for a given exponent, while after the revision the gpu72-line bounds are higher. After PRP-with-proof adoption is substantially complete, downward adjustment of bounds is likely. See also M344587487's post re efficiently and effectively performing P-1: https://mersenneforum.org/showpost.p...2&postcount=10
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-06-17 at 18:43. Reason: added proof generation consideration, updated for dates
#21
First ask yourself, is this compile really necessary? See https://www.mersenneforum.org/showpo...83&postcount=7 or https://download.mersenne.ca/gpuowl which also have older versions precompiled for Windows. These usually have the executable, readme file, and help output bundled into a zip file. And there is occasionally a Linux version available, such as in the Google Colab thread.
Requirements for recent github commits of gpuowl:
I've taken the conservative approach of using git clone and saving every build in a separate folder, named for the version and commit, so I can run any version at any time. If you only want the latest, substitute git pull. There's a handy basic intro to git at https://www.mersenneforum.org/showpo...postcount=1076 The following procedures are for relatively recent commits, since Preda et al incorporated multiple build targets into the gpuowl makefile, addressing both Linux builds and msys2/mingw Windows builds, around v6.7 as I recall. There are two steps: preparing an adequate build environment, and performing the build. On a local system
See also https://www.mersenneforum.org/showpo...6&postcount=40 and https://www.mersenneforum.org/showth...t=msys2&page=4. I generally create a folder for each version and commit, e.g. gpuowl-v6.11-9-g9ae3189. After an executable is produced, I can drag the executable and the readme up to that folder and have a relatively empty working folder for test, while all the source and .o files sit in the .\gpuowl subfolder below. I follow a fresh build with gpuowl-win -help and save that output. A nice feature is that it shows which OpenCL GPUs it detected. On cloud computing, there are at least three approaches
If there's a failure to build, use git bisect to determine at which commit the issue began, and include that info in any issue report provided to Preda. https://git-scm.com/docs/git-bisect Thanks to kracker, Preda, tServo, and others who helped get me started or fixed the occasional broken build environment.
Edit 2021 March 09: None of the above should be mistaken for advocacy of blind adherence to "latest rev is always best". When it is necessary or useful to build or revert to an older commit that is not available precompiled for the desired OS and version/commit from existing locations (for a successful build, stability, features no longer available in the latest commit(s), avoiding a speed regression, comparison testing, or other reasons), a review of https://stackoverflow.com/questions/...ote-repository may be useful in determining how to modify the above build processes to build the desired commit of the desired branch.
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2021-03-09 at 23:05. Reason: added "Edit 2021 March 09" paragraph re building other than latest commit
#22
I think file sizes will be very nearly the same for PRP3, LL, and stage 1 P-1, since they are all going to be storing residues mod Mp.
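Since a residue mod Mp occupies about p bits, the expected file size is roughly p/8 bytes plus header overhead. A quick estimator (the header size below is a placeholder assumption, not a measured value; actual gpuowl headers and any extra stored data will shift the constant):

```python
def approx_savefile_bytes(p, header_bytes=512):
    """Rough save-file size for a residue mod Mp = 2^p - 1:
    about p bits of residue plus a small header (size assumed)."""
    return p // 8 + header_bytes

# A wavefront-size exponent implies a file of roughly 9 MB.
mb = approx_savefile_bytes(77232917) / 2**20
print(f"{mb:.1f} MB")
```

This linear-in-p behavior is what the fits in the attachment reflect.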
Attached is the observed PRP3 file size except as noted, plus some linear fits and extrapolations.
Top of this reference thread: https://www.mersenneforum.org/showthread.php?t=23391
Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Last fiddled with by kriesel on 2020-05-28 at 22:08