mfakto: an OpenCL program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=15646)

Viliam Furik 2021-06-30 19:58

[QUOTE=birtwistlecaleb;582335]They both instantly close when they find that, so I can't see that. Am I supposed to manually add the file?[/QUOTE]

Yes. Add the file and fill it with work. If it's empty but present, the program will still quit on start.

James Heinrich 2021-06-30 20:01

[QUOTE=birtwistlecaleb;582335]They both instantly close when they find that[/QUOTE]Especially when debugging, it's very useful to open a command prompt in the folder that holds the executable and run it from there rather than double-clicking it. That way you can see what output it gives.

kriesel 2021-06-30 20:24

[QUOTE=birtwistlecaleb;582324]They both do not have a worktodo.txt, and it seems like [B]they are broken [/B]because of that.[/QUOTE]Something's malfunctioning, but it's clearly happening elsewhere than the software. Check between the furniture (chair or whatever) and the keyboard; do whatever maintenance needed there.

Use the readme that comes included with the software. Read it. Understand it. Follow its plain directions.

Mfaktc:[CODE]################################
# 3.1 Running mfaktc (Windows) #
################################

Similar to Linux (read above!).
[B]Open a command window[/B] and run 'mfaktc.exe -h'.


####################################################################
# 4 How to get work and report results from/to the Primenet server #
####################################################################

Getting work:
Step 1) go to http://www.mersenne.org/ and login with your username and
password
Step 2) on the menu on the left click "Manual Testing" and then
"Assignments"
Step 3) choose the number of assignments by choosing
"Number of CPUs (cores) you need assignments for (maximum 12)"
and "Number of assignments you want for each core"
Step 4) Change "Preferred work type" to "Trial factoring"
Step 5) click the button "Get Assignments"
Step 6) [B]copy&paste the "Factor=..." lines directly into the worktodo.txt
 in your mfaktc directory[/B][/CODE]
Mfakto is similar:
[CODE]Open a terminal window and run 'mfakto -h' for possible parameters. You may
also want to check mfakto.ini for additional settings. mfakto typically fetches
work from worktodo.txt as specified in the INI file. See section 3 on how to
obtain assignments and report results.

A typical worktodo.txt file looks like this:
-- begin example --
Factor=[assignment ID],66362159,64,68
Factor=[assignment ID],3321932899,76,77
-- end example --[/CODE]
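
A quick pre-flight check along these lines can catch a missing or empty worktodo.txt before the program exits on it. This is only a sketch in Python and not part of mfakto or mfaktc. The function name `read_factor_lines` is made up for illustration, and the line format (`Factor=[assignment ID],exponent,bit_min,bit_max`) is assumed from the readme example above.

```python
from pathlib import Path


def read_factor_lines(path):
    """Return (exponent, bit_min, bit_max) for each 'Factor=' line in the file."""
    work = []
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line.startswith("Factor="):
            continue  # skip blank lines and anything that is not a TF assignment
        fields = line[len("Factor="):].split(",")
        if len(fields) < 3:
            continue  # too few fields to hold exponent, bit_min and bit_max
        try:
            # the last three fields are the numbers; anything before is the assignment ID
            exponent, bit_min, bit_max = (int(f) for f in fields[-3:])
        except ValueError:
            continue
        work.append((exponent, bit_min, bit_max))
    return work


if __name__ == "__main__":
    path = Path("worktodo.txt")
    if not path.exists():
        print("worktodo.txt is missing: create it in the program directory")
    elif not read_factor_lines(path):
        print("worktodo.txt has no usable Factor= lines: add assignments")
    else:
        for exponent, bit_min, bit_max in read_factor_lines(path):
            print(f"M{exponent}: 2^{bit_min} to 2^{bit_max}")
```

Run it from the same directory as the executable; it only reads the file and never changes it.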

[CODE]
########################################
# 3 Getting work and reporting results #
########################################

You must have a PrimeNet account to participate. Simply visit the GIMPS website
at https://mersenne.org to create one. Once you've signed up, you can get
assignments in several ways.

From the GIMPS website:
Step 1) log in to the GIMPS website with your username and password
Step 2) on the menu bar, select Manual Testing > Assignments
Step 3) open the link to the manual GPU assignment request form
Step 4) enter the number of assignments or GHz-days you want
Step 5) click "Get Assignments"[/CODE]...[CODE] Once you have your assignments, copy the "Factor=..." lines directly into
your worktodo.txt file. Start mfakto, sit back and let it do its job.
Running mfakto is also a great way to stress test your GPU. ;-)[/CODE]


Use the [URL="https://www.mersenneforum.org/showthread.php?t=23394"]mfakto[/URL] or mfaktc [URL="https://www.mersenneforum.org/showpost.php?p=488518&postcount=1"]reference info[/URL]. "Create a worktodo file and put some assignments in there. Start with few, in case your gpu or igp does not work out. Get the type you plan to run the most. Get them from [URL]https://www.mersenne.org/manual_gpu_assignment/[/URL]"
"Create a Windows batch file or Linux shell script with a short name.
Set the device number there.
Consider redirecting console output to a file or employing a good tee program."
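
Where no tee program is at hand, the same effect can be had with a small wrapper. The following is a sketch in Python, not something shipped with mfakto or mfaktc. `run_and_tee` and the log file name are invented for illustration, and the command list is whatever you would otherwise type, with the device option taken from the program's own `-h` output.

```python
import subprocess
import sys


def run_and_tee(command, log_path):
    """Run command, echoing each output line to the console and appending it to log_path."""
    with open(log_path, "a", encoding="utf-8", errors="replace") as log:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold error messages into the same stream
            text=True,
            errors="replace",
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
            log.flush()  # keep the log current if the run is interrupted
        return proc.wait()
```

A call such as `run_and_tee(["mfakto"], "mfakto.log")` keeps the output on screen and leaves a copy on disk, so the window closing no longer loses the error message.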

Stop wasting other people's time, and apply some of your own intelligence and time.

Uncwilly 2021-06-30 21:10

And [URL="https://www.mersenneforum.org/showthread.php?t=22029"]MISFIT[/URL] can handle all of the work filling a worktodo and submitting results.
[url]https://www.mersenneforum.org/forumdisplay.php?f=103[/url]

birtwistlecaleb 2021-06-30 21:23

[QUOTE=Viliam Furik;582338]Yes. Add the file and fill it with work. If it's empty but present, the program will still quit on start.[/QUOTE]
Thanks! I got it working now.
This is what I got, in case you spot anything else that went wrong.
[CODE]got assignment: exp=219633173 bit_min=73 bit_max=74 (8.71 GHz-days)
Starting trial factoring M219633173 from 2^73 to 2^74 (8.71Ghz-days)
Using GPU kernel "[Just to be safe, censored.]"[/CODE]
No further lines appeared for a couple of minutes, and none have since.
Edit: No lines for around 30 minutes. Does it only print a line when it finishes a TF assignment or one bit level?

DrobinsonPE 2021-08-24 04:44

Ryzen 5 5600G
 
[CODE]MFAKTO, AMD Ryzen 5 5600G, ASROCK B450-HDV R4.0, DDR4-3600 RAM
----------------------------------------------------------------
Selftest statistics
number of tests 34026
successful tests 34026

selftest PASSED!


C:\Users\User\mfakto>mfakto
mfakto 0.15pre7-MGW (64bit build)


Runtime options
Inifile mfakto.ini
Verbosity 1
SieveOnGPU yes
MoreClasses yes
GPUSievePrimes 81157
GPUSieveProcessSize 24 Kib
GPUSieveSize 96 Mib
FlushInterval 0
WorkFile worktodo.txt
ResultsFile results.txt
Checkpoints enabled
CheckpointDelay 300 s
Stages enabled
StopAfterFactor class
PrintMode compact
V5UserID none
ComputerID none
TimeStampInResults yes
VectorSize 2
GPUType AUTO
SmallExp no
UseBinfile mfakto_Kernels.elf
Compiletime options

Select device - Get device info:
WARNING: Unknown GPU name, assuming GCN. Please post the device name "gfx90c (Advanced Micro Devices, Inc.)" to http://www.mersenneforum.org/showthread.php?t=15646 to have it added to mfakto. Set GPUType in mfakto.ini to select a GPU type yourself to avoid this warning.

OpenCL device info
name gfx90c (Advanced Micro Devices, Inc.)
device (driver) version OpenCL 2.0 AMD-APP (3276.6) (3276.6 (PAL,HSAIL))
maximum threads per block 1024
maximum threads per grid 1073741824
number of multiprocessors 7 (448 compute elements)
clock rate 1900 MHz

Automatic parameters
threads per grid 0
optimizing kernels for GCN

Loading binary kernel file mfakto_Kernels.elf
Compiling kernels.
GPUSievePrimes (adjusted) 81206
GPUsieve minimum exponent 1037054
Started a simple selftest ...
Selftest statistics
number of tests 30
successful tests 30

selftest PASSED!

got assignment: exp=103399837 bit_min=75 bit_max=76 (74.00 GHz-days)
Starting trial factoring M103399837 from 2^75 to 2^76 (74.00 GHz-days)
Using GPU kernel "cl_barrett15_82_gs_2"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Aug 23 20:51 | 24 0.6% | 40.947 10h51m | 162.66 81206 0.00%
mfakto will exit once the current class is finished.
press ^C again to exit immediately
Aug 23 20:52 | 27 0.7% | 40.947 10h50m | 162.66 81206 0.00%[/CODE]
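
The GHz-d/day and ETA columns in a status line like the one above can be cross-checked from the per-class time. A sketch in Python, assuming 960 classes per test (the figure comes from the "960 times per test" note in the perftest output posted in this thread) and assuming every class takes about the same time. The function names are invented for illustration.

```python
CLASSES_PER_TEST = 960  # assumption: from the "960 times per test" perftest note


def ghz_days_per_day(assignment_ghz_days, seconds_per_class, classes=CLASSES_PER_TEST):
    """Throughput implied by the time spent on one class."""
    days_per_assignment = classes * seconds_per_class / 86400
    return assignment_ghz_days / days_per_assignment


def eta_seconds(percent_done, seconds_per_class, classes=CLASSES_PER_TEST):
    """Remaining run time if every remaining class takes seconds_per_class."""
    return (1 - percent_done / 100) * classes * seconds_per_class


# Values from the log above: 74.00 GHz-days, 40.947 s per class, 0.6% done
rate = ghz_days_per_day(74.00, 40.947)
remaining = eta_seconds(0.6, 40.947)
hours, rest = divmod(int(remaining), 3600)
print(f"{rate:.2f} GHz-d/day, ETA {hours}h{rest // 60:02d}m")
```

The result lands close to the 162.66 GHz-d/day and 10h51m that the program printed; small differences are expected because the log rounds the percentage and the class time.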

kracker 2021-08-28 22:57

Not sure if numbers for RDNA2 have been posted here... haven't kept up.

RX 6700 XT (probably thermal throttling; the owner/tester said parts of the card hit 99 °C)
[code]
Resulting speed for M78000071:
bit_min - bit_max GHz-days/day kernelname
60 - 69 1573.192 cl_barrett15_69_gs
69 - 70 1491.949 cl_barrett15_71_gs
70 - 73 1311.655 cl_barrett15_73_gs
73 - 74 1283.831 cl_barrett15_74_gs
74 - 76 1254.737 cl_barrett32_76_gs
76 - 77 1251.052 cl_barrett32_77_gs
77 - 81 1180.985 cl_barrett15_82_gs
81 - 87 1105.831 cl_barrett32_87_gs
87 - 88 1100.490 cl_barrett32_88_gs
88 - 92 957.069 cl_barrett32_92_gs
[/code]

RX 6600 XT
[code]
Resulting speed for M78000071:
bit_min - bit_max GHz-days/day kernelname
60 - 69 1251.068 cl_barrett15_69_gs
69 - 70 1185.451 cl_barrett15_71_gs
70 - 73 1040.123 cl_barrett15_73_gs
73 - 74 1018.516 cl_barrett15_74_gs
74 - 76 1015.305 cl_barrett32_76_gs
76 - 77 1012.099 cl_barrett32_77_gs
77 - 81 939.008 cl_barrett15_82_gs
81 - 87 893.350 cl_barrett32_87_gs
87 - 88 891.557 cl_barrett32_88_gs
88 - 92 776.045 cl_barrett32_92_gs
[/code]
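
To compare two cards row by row, the tables can be parsed and divided. A sketch in Python using three rows from each table above; `parse_perftest` is a made-up helper, and the row layout (`bit_min - bit_max rate kernel`) is assumed from the output as posted.

```python
def parse_perftest(text):
    """Map (bit_min, bit_max) to GHz-days/day from 'min - max rate kernel' rows."""
    rates = {}
    for line in text.splitlines():
        parts = line.split()
        # data rows look like: 60 - 69 1573.192 cl_barrett15_69_gs
        if len(parts) == 5 and parts[1] == "-" and parts[0].isdigit() and parts[2].isdigit():
            rates[(int(parts[0]), int(parts[2]))] = float(parts[3])
    return rates


rx6700xt = parse_perftest("""
bit_min - bit_max GHz-days/day kernelname
60 - 69 1573.192 cl_barrett15_69_gs
73 - 74 1283.831 cl_barrett15_74_gs
88 - 92 957.069 cl_barrett32_92_gs
""")
rx6600xt = parse_perftest("""
bit_min - bit_max GHz-days/day kernelname
60 - 69 1251.068 cl_barrett15_69_gs
73 - 74 1018.516 cl_barrett15_74_gs
88 - 92 776.045 cl_barrett32_92_gs
""")

for bits in sorted(rx6700xt):
    ratio = rx6600xt[bits] / rx6700xt[bits]
    print(f"2^{bits[0]} to 2^{bits[1]}: 6600 XT at {ratio:.0%} of the 6700 XT")
```

On these rows the 6600 XT comes out at roughly 80% of the 6700 XT figure, bearing in mind that the 6700 XT run was probably throttled.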

James Heinrich 2021-08-28 23:27

[QUOTE=kracker;586766]Not sure if numbers for RDNA2 have been posted here...[/QUOTE]I have not seen a single mfakto benchmark for any RX 6xxx yet. If someone who has one (of any kind) would like to submit a benchmark I would be most grateful:
[url]https://www.mersenne.ca/mfaktc.php#benchmark[/url]

kriesel 2021-09-13 17:01

Has anyone gotten an Iris Xe IGP working with mfakto on Windows?
 
Not me despite [URL="https://www.mersenneforum.org/showpost.php?p=587816&postcount=9"]several tries[/URL]. Ideas?

Ethan (EO) 2022-01-21 03:39

1770 GHz-d/day from a Radeon VII with mfakto 0.14:

[CODE]got assignment: exp=5340017 bit_min=69 bit_max=70 (22.39 GHz-days)
Starting trial factoring M5340017 from 2^69 to 2^70 (22.39GHz-days)
k_min = 55270967334660 - k_max = 110541934671501
Using GPU kernel "cl_barrett32_77_gs_4"


Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Jan 20 18:13 | 2163 46.8% | 1.139 9m43s | 1769.20 30005 0.00%[/CODE]
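
The k_min and k_max values in this log can be reproduced with integer arithmetic. A sketch in Python under two assumptions that are not stated in the thread: candidate factors of M_p have the form q = 2kp + 1, and k_min is rounded down to a multiple of 4620, taken here to be the class count with MoreClasses enabled. The function name `k_range` is invented for illustration.

```python
NUM_CLASSES = 4620  # assumption: class count with MoreClasses enabled


def k_range(exponent, bit_min, bit_max, num_classes=NUM_CLASSES):
    """Range of k such that q = 2*k*exponent + 1 lies between 2^bit_min and 2^bit_max."""
    # q ~ 2^bits  =>  k ~ 2^(bits - 1) / exponent
    k_min = (1 << (bit_min - 1)) // exponent
    k_max = (1 << (bit_max - 1)) // exponent
    k_min -= k_min % num_classes  # align the start to a class boundary
    return k_min, k_max


# exp=5340017, bit_min=69, bit_max=70 as in the log above
print(k_range(5340017, 69, 70))
```

For this assignment the sketch returns the same pair as the log, k_min = 55270967334660 and k_max = 110541934671501.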

[CODE]
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
718 root 30 10 719240 292672 5736 S 100.3 1.9 29:39.14 mprime
22047 root 20 0 32.425g 134184 86248 S 1.0 0.9 0:02.11 mfakto-x64
[/CODE]

All default parameters except:

[CODE]
GPUType = APU
GPUSievePrimes = 30000
GPUSieveSize = 9
FlushInterval = 0
[/CODE]

FlushInterval = 0 and GPUSieveSize = 9 had major impacts on speed (~80% combined increase).
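
When comparing runs like this, it helps to list exactly which settings differ. A sketch in Python that reads `Key = Value` lines, the form shown in the snippet above, into a dictionary. Treating `#` as a comment marker is an assumption about the file format, and `parse_settings` and `changed` are made-up names.

```python
def parse_settings(text):
    """Read 'Key = Value' lines into a dict, ignoring blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def changed(before, after):
    """Keys whose value differs between two settings dicts, as (old, new) pairs."""
    return {k: (before.get(k), after[k]) for k in after if before.get(k) != after[k]}


tuned = parse_settings("""
GPUType = APU
GPUSievePrimes = 30000
GPUSieveSize = 9
FlushInterval = 0
""")
print(tuned)
```

Passing the parsed contents of an untouched mfakto.ini as `before` and the tuned file as `after` to `changed` gives the list of overrides to post alongside a benchmark.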

CLinfo:
[CODE]
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3180.7)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices


Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: AMD Radeon VII
Device Topology: PCI[ B#7, D#0, F#0 ]
Max compute units: 60
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1840Mhz
Address bits: 64
Max memory allocation: 14360458035
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 64
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 17163091968
Constant buffer size: 14360458035
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 1475556147
Max global variable size: 12924412160
Max global variable preferred total size: 17163091968
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 8388608
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x7fae26339f30
Name: gfx906
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 2.0
Driver version: 3180.7 (PAL,HSAIL)
Profile: FULL_PROFILE
Version: OpenCL 2.0 AMD-APP (3180.7)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_copy_buffer_p2p
[/CODE]

amdinfo:
[CODE]
Thu Jan 20 18:21:28 PST 2022

=== GPU 0, 07:00.0 Radeon VII 16368 MB ===
Bios: 113-D3600200-106, UUID: T2CY67L113160501
Core: 1806 MHz 981mV, Mem: 1100 MHz, REF: 7500
PerfCtrl: manual, Load: 100%, MemLoad: 0%, Power: 176.0 W, Cap: 275 W
Core: 68°C, HotSpot: 93°C, Mem: 67°C, Fan: 26%, RPM: 1072
Core state: 8, clocks: 700 808 1146 1394 1574 1717 1785 1810 1840*
Mem state: 2, clocks: 350 800 1100*
SOC state: 7, clocks: 309 523 566 618 680 755 850 971*
DCEF state: 0, clocks: 357* 453 566 680 755 850 971 1133
F clocks: 550 610 690 760 870 960 1080 1225
PCIE Link speed: GEN2 (5.0GT/s), PCIE Link width: x4
Memory total: 16368.00 MB, used: 89.24 MB, free: 16278.76 MB, type: Hynix HBM2
VDDGfx: 1000mV, VDDCR_SOC: 931mV, VDDCI_MEM: 850mV, VDDIO_MEM: 1218mV, VDDCR_HBM: 1218mV
[/CODE]

--perftest:
[CODE]

Runtime options
Inifile mfakto.ini
Verbosity 1
SieveOnGPU yes
MoreClasses yes
GPUSievePrimes 30000
GPUSieveProcessSize 24Ki bits
GPUSieveSize 9Mi bits
FlushInterval 0
WorkFile worktodo.txt
ResultsFile results.txt
Checkpoints enabled
CheckpointDelay 300s
Stages enabled
StopAfterFactor class
PrintMode compact
V5UserID none
ComputerID none
TimeStampInResults yes
VectorSize 4
GPUType APU
SmallExp no
UseBinfile mfakto_Kernels.elf
Select device - Get device info - Compiling kernels.


Perftest

Generate list of the first 1000000 primes: 1807.24 ms

Generate list of the first 1075766 primes for GPU sieving: 266.84 ms

1. CPU-Sieve-Init (once per class, 960 times per test, avg. for 3 iterations)
Init_class(sieveprimes= 5000): 0.83 ms
Init_class(sieveprimes= 20000): 3.50 ms
Init_class(sieveprimes= 80000): 15.30 ms
Init_class(sieveprimes= 200000): 40.62 ms
Init_class(sieveprimes= 500000): 107.34 ms
Init_class(sieveprimes=1000000): 224.55 ms

2. CPU-Sieve (output rate M/s)
Sieve size is fixed at compile time, cannot test with variable sizes. Just running 3 fixed tests.

SievePrimes: 254 396 611 945 1460 2257 3487 5389 8328 12871 19890 30738 47503 73411 113449 175323 270944 418716 647083 1000000
SieveSizeLimit
36 kiB 480.4 432.8 390.9 354.5 320.1 291.8 266.6 242.9 220.2 198.9 175.7 158.6 132.9 106.5 85.5 66.8 55.8 44.7 34.6 25.7
36 kiB 480.4 434.2 391.4 351.9 320.9 292.0 266.1 243.0 220.4 198.6 176.0 158.7 133.0 106.4 86.0 69.6 55.7 43.4 33.3 25.5
36 kiB 480.5 433.4 391.8 353.8 320.3 292.1 264.3 242.5 220.7 198.7 175.7 158.3 133.0 106.4 86.2 69.7 55.9 44.6 34.5 25.3

Best SieveSizeLimit for
SievePrimes: 254 396 611 945 1460 2257 3487 5389 8328 12871 19890 30738 47503 73411 113449 175323 270944 418716 647083 1000000
at kiB: 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36
max M/s: 480.5 434.2 391.8 354.5 320.9 292.1 266.6 243.0 220.7 198.9 176.0 158.7 133.0 106.5 86.2 69.7 55.9 44.7 34.6 25.7
Survivors: 36.41% 34.07% 32.06% 30.27% 28.69% 27.28% 26.00% 24.83% 23.78% 22.82% 21.93% 21.11% 20.37% 19.67% 19.01% 18.40% 17.83% 17.30% 16.78% 16.32%
removal rate 839.3 840.4 830.3 816.5 797.5 778.7 758.9 735.5 707.2 672.7 626.7 593.1 520.2 434.8 367.4 309.1 257.7 213.6 171.5 132.0


3. Memory copy to GPU (blocks of 8388608 bytes)

Standard copy, standard queue:
240 MB in 0.1 ms (3355443.2 MB/s) (real)

Standard copy, profiled queue:
240 MB in 0.1 ms (4125544.9 MB/s) (real)
240 MB in 0.0 ms ( inf MB/s) (profiled data)
8 MB in 0.0 ms ( inf MB/s) (profiled data, peak)

Standard copy, two queues:
240 MB in 0.2 ms (1553445.9 MB/s) (real)

Reinitializing with gpu_sieving enabled.
Select device - Get device info - Compiling kernels.

4. GPU sieve, 3 iterations each

gpusieve_init: 16.524000 ms (CPU work)
gpusieve_init_exponent: 1.352000 ms (CalcModularInverses)
gpusieve_init_class: 0.345333 ms (CalcBitToClear)
gpusieve: 11.457333 ms (SegSieve)
tf: 11.666667 ms = 11324.620800 M/s (raw rate, cl_barrett15_69_gs)

GPU sieve raw rate (input rate M/s)
SievePrimes: 54 396 611 945 1460 2257 3487 5389 8328 12871 19890 30738 47503 73411 113449 175323 270944 418716 647083 1075766
GPUSieveSize
4 MBit 67144.7 55897.9 46651.4 47567.2 47719.0 44864.2 44350.6 33783.1 31516.4 28207.1 30356.8 21151.7 16961.0 13197.5 7731.4 3728.7 1846.4 797.4 409.8 183.5
5 MBit 80432.8 58118.3 56611.8 52614.5 57999.1 54424.4 47021.3 49629.2 45024.7 41184.3 35061.6 28766.1 20894.9 16010.0 7669.4 4100.9 2163.0 984.1 511.4 228.5
6 MBit 94965.4 76790.8 66693.9 69571.6 67049.3 63493.1 61039.8 56956.1 52881.7 46405.5 41147.5 32349.9 24912.5 15195.0 8105.8 4596.3 2482.9 1166.9 607.3 274.5
7 MBit 98523.9 76711.0 78596.4 69931.2 75540.6 71109.0 60317.2 61720.8 52180.3 52202.8 46387.4 38451.9 25545.4 13105.0 8407.8 4872.6 2713.3 1309.7 709.0 318.8
8 MBit 87351.0 75656.7 74923.1 66174.6 70852.2 68212.4 64932.1 57345.4 54967.2 48345.1 40175.3 38532.2 26071.8 14798.5 9123.0 5256.2 2973.4 1499.4 799.4 362.2
9 MBit 94576.8 78769.9 69504.6 73223.2 68832.2 62923.9 66840.1 57760.5 58839.4 52060.7 47742.9 31783.0 22332.1 13762.6 8640.8 5569.4 3192.6 1670.4 893.1 409.5
10 MBit 79487.8 72676.1 71520.9 70812.2 63945.7 59375.8 57689.5 51111.7 49184.0 41028.0 36663.5 25067.8 19034.7 11769.2 8343.4 5551.4 3344.9 1813.0 990.5 452.5
12 MBit 47313.1 42530.0 39160.7 39331.2 37001.8 36336.5 37946.6 34166.5 30534.9 27125.4 24789.0 20286.5 16061.6 12119.5 8604.0 5930.3 3745.8 2101.3 1176.1 537.8
16 MBit 22587.5 21808.5 21509.8 21477.9 21030.7 20776.5 19966.1 19902.4 19072.8 17849.3 16723.7 14571.1 12884.5 10701.1 8758.7 6492.2 4458.7 2578.7 1412.5 649.9
20 MBit 15453.9 15185.0 15132.2 15221.4 15066.6 14882.8 14787.3 14682.8 14482.5 13661.7 13035.0 12573.4 11687.9 10430.4 9021.7 6990.8 4982.8 3089.2 1744.9 807.9
24 MBit 12942.5 12791.0 12866.7 12805.2 12851.6 12730.2 12594.8 12655.6 12564.2 12325.6 12147.8 11651.2 11087.9 10117.6 9182.0 7465.4 5375.6 3427.5 1894.0 855.3
36 MBit 11300.3 11422.1 11510.5 11520.6 11526.4 11503.4 11558.1 11513.8 11577.9 11554.3 11500.8 11152.0 10877.9 10332.0 9545.8 8228.9 6584.6 4368.2 2543.6 1223.9
48 MBit 11570.8 11544.3 11555.6 11530.2 11582.2 11596.1 11618.3 11600.9 11642.0 11562.4 11576.5 11385.3 11221.2 10702.4 10117.7 8969.0 7350.4 5160.2 3032.2 1505.7
96 MBit 11680.3 11685.6 11701.2 11464.8 11719.0 11770.2 11788.6 11806.0 11853.3 11784.6 11820.7 11733.9 11665.5 11400.5 11138.8 10382.2 9262.4 7109.1 4370.9 2269.4
101 MBit 11708.2 11692.3 11713.3 11682.9 11745.7 11755.0 11780.7 11730.7 11770.4 11825.7 11861.9 11757.6 11718.3 11432.5 11189.0 10466.5 9412.2 7270.0 4564.3 2382.8
102 MBit 11678.8 11687.7 11711.7 11705.0 11763.0 11769.1 11789.8 11784.5 11828.4 11809.3 11857.6 11683.7 11520.7 11409.1 11110.6 10543.9 9416.0 7304.5 4596.0 2406.8
103 MBit 11706.2 11687.2 11703.6 11679.3 11744.7 11768.8 11775.1 11816.2 11823.8 11820.9 11860.6 11716.9 11739.0 11450.1 11170.4 10546.0 9420.6 7284.3 4435.2 2422.8
104 MBit 11714.7 11703.0 11718.9 11709.3 11747.5 11777.2 11738.6 11821.8 11855.0 11839.9 11856.1 11751.1 11746.2 11430.0 11191.3 10524.9 9413.5 7247.2 4945.0 2689.0
105 MBit 11489.0 11682.7 11718.2 11706.5 11750.7 11754.8 11781.6 11790.3 11850.9 11811.9 11827.1 11745.4 11719.1 11417.7 11171.5 10490.0 9425.0 7249.4 4989.1 2712.0
106 MBit 11708.1 11692.9 11661.2 11520.2 11743.3 11766.2 11786.9 11805.0 11832.2 11795.0 11831.3 11734.9 11719.8 11414.8 11158.9 10547.9 9429.5 7206.8 4800.5 2595.6
120 MBit 11746.6 11727.7 11711.1 11723.0 11718.8 11664.1 11821.2 11839.7 11870.8 11816.1 11877.3 11794.6 11767.6 11433.1 11319.4 10693.6 9725.4 7620.0 5295.4 2887.6
121 MBit 11752.6 11740.9 11719.6 11724.0 11777.3 11539.4 11787.7 11840.9 11874.5 11857.7 11866.9 11779.4 11793.1 11513.9 11317.9 10738.1 9695.5 7585.6 5118.6 2781.0
123 MBit 11728.8 11728.1 11723.2 11713.4 11755.0 11741.9 11599.1 11843.5 11884.1 11846.1 11886.1 11769.9 11786.1 11535.4 11336.3 10748.9 9791.2 7664.4 5145.4 2821.8
124 MBit 11730.7 11714.8 11723.3 11716.5 11749.9 11738.0 11551.4 11820.3 11874.5 11843.5 11882.5 11774.2 11798.5 11477.8 11331.5 10763.4 9794.0 7646.6 5189.6 2829.4
125 MBit 11740.0 11713.1 11726.4 11717.8 11672.1 11742.7 11807.8 11847.5 11870.3 11841.3 11870.7 11781.5 11807.0 11490.3 11310.9 10786.8 9820.4 7656.5 5224.7 2861.0
126 MBit 11725.1 11709.6 11717.2 11731.4 11705.2 11688.9 11819.9 11851.9 11878.1 11835.8 11879.2 11787.2 11807.5 11542.7 11299.6 10796.9 9820.8 7731.2 5254.1 2874.5
127 MBit 11725.7 11725.4 11711.1 11705.3 11602.0 11797.7 11817.2 11845.8 11857.8 11847.2 11879.4 11795.2 11793.5 11531.3 11354.6 10811.1 9795.0 7701.9 5240.1 2883.2
128 MBit 11739.6 11725.8 11733.2 11648.0 11703.5 11797.5 11832.7 11858.5 11885.6 11843.9 11911.3 11792.9 11800.4 11565.2 11393.2 10789.1 9871.8 7716.1 5325.8 2925.0

Best GPUSieveSize for
SievePrimes: 54 310 1078 1078 1846 2614 3382 5686 8502 13622 19766 31030 47414 74038 113206 175670 270902 419382 647734 1075766
at MiB: 7 9 7 9 7 7 9 7 9 7 9 8 8 5 128 127 128 126 128 128
max M/s: 98523.9 78769.9 78596.4 73223.2 75540.6 71109.0 66840.1 61720.8 58839.4 52202.8 47742.9 38532.2 26071.8 16010.0 11393.2 10811.1 9871.8 7731.2 5325.8 2925.0
Survivors: 48.30% 35.57% 30.05% 30.05% 28.19% 27.10% 26.36% 24.98% 24.00% 22.95% 22.19% 21.33% 20.58% 19.85% 19.21% 18.58% 18.00% 17.46% 16.96% 16.42%
removal rate
average: 50941.7 50750.5 54978.7 51219.1 54243.3 51836.2 49222.0 46304.9 44719.3 40223.8 37150.6 30312.8 20705.0 12831.3 9204.8 8802.2 8094.6 6381.0 4422.4 2444.7
incremental: n/a 49988.1 1970775.4 -15.2 -44339.6 13211.7 8290.9 11134.5 12340.0 4863.2 4252.2 1707.2 601.9 303.2 255.5 1323.5 657.6 192.2 85.7 35.3

5. mfakto_cl_71 kernel
soon
6. barrett_79 kernel
soon
7. barrett_92 kernel
soon
[/CODE]

Ethan (EO) 2022-01-21 06:39

[QUOTE=Ethan (EO);598463]1770 GHz-d/day from a Radeon VII with mfakto 0.14:
[/QUOTE]

Okay, took the time to build 0.15pre8 and it's much faster:

[CODE]
got assignment: exp=5340017 bit_min=69 bit_max=70 (22.39 GHz-days)
Starting trial factoring M5340017 from 2^69 to 2^70 (22.39 GHz-days)
Using GPU kernel "cl_barrett32_76_gs_2"

Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Jan 20 22:21 | 3780 82.0% | 0.795 2m18s | 2534.74 99894 0.00%[/CODE]

None of the mfakto.ini changes I made for 0.14 were needed; this is with the default mfakto.ini values in 0.15pre8.

With a bit of exponent cherry-picking and some overclocking, we can hit 3200 GHz-d/day:

[CODE]
got assignment: exp=1140863 bit_min=69 bit_max=70 (104.80 GHz-days)
Starting trial factoring M1140863 from 2^69 to 2^70 (104.80 GHz-days)
Using GPU kernel "cl_barrett32_76_gs_2"

Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Jan 20 22:46 | 61 1.5% | 2.937 46m18s | 3211.48 81206 0.00%
[/CODE]

perftest snippets [Default Clocks]:
[CODE]
Resulting speed for M2000093:
bit_min - bit_max GHz-days/day kernelname
60 - 64 2424.561 cl_barrett15_69_gs
64 - 76 2779.709 cl_barrett32_76_gs
76 - 77 2575.103 cl_barrett32_77_gs
77 - 87 2454.118 cl_barrett32_87_gs
87 - 88 2271.072 cl_barrett32_88_gs
88 - 92 2152.565 cl_barrett32_92_gs

Resulting speed for M39000037:
bit_min - bit_max GHz-days/day kernelname
60 - 64 1918.657 cl_barrett15_69_gs
64 - 76 2226.954 cl_barrett32_76_gs
76 - 77 2054.627 cl_barrett32_77_gs
77 - 87 1952.592 cl_barrett32_87_gs
87 - 88 1799.562 cl_barrett32_88_gs
88 - 92 1695.062 cl_barrett32_92_gs

Resulting speed for M66362159:
bit_min - bit_max GHz-days/day kernelname
60 - 64 1919.665 cl_barrett15_69_gs
64 - 76 2227.070 cl_barrett32_76_gs
76 - 77 2018.324 cl_barrett32_77_gs
77 - 87 1952.919 cl_barrett32_87_gs
87 - 88 1800.070 cl_barrett32_88_gs
88 - 92 1695.632 cl_barrett32_92_gs

Resulting speed for M74000077:
bit_min - bit_max GHz-days/day kernelname
60 - 64 1858.822 cl_barrett15_69_gs
64 - 76 2150.142 cl_barrett32_76_gs
76 - 77 1977.225 cl_barrett32_77_gs
77 - 87 1882.914 cl_barrett32_87_gs
87 - 88 1731.277 cl_barrett32_88_gs
88 - 92 1632.379 cl_barrett32_92_gs

Resulting speed for M78000071:
bit_min - bit_max GHz-days/day kernelname
60 - 64 1845.433 cl_barrett15_69_gs
64 - 76 2143.972 cl_barrett32_76_gs
76 - 77 1977.783 cl_barrett32_77_gs
77 - 87 1878.918 cl_barrett32_87_gs
87 - 88 1732.445 cl_barrett32_88_gs
88 - 92 1629.175 cl_barrett32_92_gs

Resulting speed for M332900047:
bit_min - bit_max GHz-days/day kernelname
60 - 64 1706.641 cl_barrett15_69_gs
64 - 76 1990.833 cl_barrett32_76_gs
76 - 77 1835.323 cl_barrett32_77_gs
77 - 87 1740.026 cl_barrett32_87_gs
87 - 88 1603.641 cl_barrett32_88_gs
88 - 92 1505.604 cl_barrett32_92_gs

Resulting speed for M999900079:
bit_min - bit_max GHz-days/day kernelname
60 - 64 1662.079 cl_barrett15_69_gs
64 - 76 1928.562 cl_barrett32_76_gs
76 - 77 1771.161 cl_barrett32_77_gs
77 - 87 1684.428 cl_barrett32_87_gs
87 - 88 1546.434 cl_barrett32_88_gs
88 - 92 1454.844 cl_barrett32_92_gs

Resulting speed for M2001862367:
bit_min - bit_max GHz-days/day kernelname
60 - 64 1582.528 cl_barrett15_69_gs
64 - 76 1855.355 cl_barrett32_76_gs
76 - 77 1711.931 cl_barrett32_77_gs
77 - 87 1619.642 cl_barrett32_87_gs
87 - 88 1493.767 cl_barrett32_88_gs
88 - 92 1398.778 cl_barrett32_92_gs

Resulting speed for M4201971233:
bit_min - bit_max GHz-days/day kernelname
60 - 64 1554.085 cl_barrett15_69_gs
64 - 76 1805.688 cl_barrett32_76_gs
76 - 77 1656.103 cl_barrett32_77_gs
77 - 87 1574.842 cl_barrett32_87_gs
87 - 88 1444.049 cl_barrett32_88_gs
88 - 92 1357.261 cl_barrett32_92_gs
[/CODE]

