mersenneforum.org

gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

kriesel 2019-12-16 09:25

gpuowl 6.11-88 build for Windows
 
Lots of warnings again:[CODE]$ make gpuowl-win.exe
cat head.txt gpuowl.cl tail.txt > gpuowl-wrap.cpp
echo \"`git describe --long --dirty --always`\" > version.new
diff -q -N version.new version.inc >/dev/null || mv version.new version.inc
echo Version: `cat version.inc`
Version: "v6.11-88-gb9f0be7"
g++ -MT Pm1Plan.o -MMD -MP -MF .d/Pm1Plan.Td -Wall -O2 -std=c++17 -c -o Pm1Plan.o Pm1Plan.cpp
g++ -MT GmpUtil.o -MMD -MP -MF .d/GmpUtil.Td -Wall -O2 -std=c++17 -c -o GmpUtil.o GmpUtil.cpp
g++ -MT Worktodo.o -MMD -MP -MF .d/Worktodo.Td -Wall -O2 -std=c++17 -c -o Worktodo.o Worktodo.cpp
In file included from Worktodo.cpp:6:
File.h: In static member function 'static File File::open(const std::filesystem::__cxx11::path&, const char*, bool)':
File.h:31:11: warning: format '%s' expects argument of type 'char*', but argument 2 has type 'const value_type*' {aka 'const wchar_t*'} [-Wformat=]
log("Can't open '%s' (mode '%s')\n", name.c_str(), mode);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
g++ -MT common.o -MMD -MP -MF .d/common.Td -Wall -O2 -std=c++17 -c -o common.o common.cpp
In file included from common.cpp:4:
File.h: In static member function 'static File File::open(const std::filesystem::__cxx11::path&, const char*, bool)':
File.h:31:11: warning: format '%s' expects argument of type 'char*', but argument 2 has type 'const value_type*' {aka 'const wchar_t*'} [-Wformat=]
log("Can't open '%s' (mode '%s')\n", name.c_str(), mode);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
g++ -MT main.o -MMD -MP -MF .d/main.Td -Wall -O2 -std=c++17 -c -o main.o main.cpp
In file included from main.cpp:8:
File.h: In static member function 'static File File::open(const std::filesystem::__cxx11::path&, const char*, bool)':
File.h:31:11: warning: format '%s' expects argument of type 'char*', but argument 2 has type 'const value_type*' {aka 'const wchar_t*'} [-Wformat=]
log("Can't open '%s' (mode '%s')\n", name.c_str(), mode);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
g++ -MT Gpu.o -MMD -MP -MF .d/Gpu.Td -Wall -O2 -std=c++17 -c -o Gpu.o Gpu.cpp
In file included from ProofSet.h:6,
from Gpu.cpp:4:
File.h: In static member function 'static File File::open(const std::filesystem::__cxx11::path&, const char*, bool)':
File.h:31:11: warning: format '%s' expects argument of type 'char*', but argument 2 has type 'const value_type*' {aka 'const wchar_t*'} [-Wformat=]
log("Can't open '%s' (mode '%s')\n", name.c_str(), mode);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
g++ -MT clwrap.o -MMD -MP -MF .d/clwrap.Td -Wall -O2 -std=c++17 -c -o clwrap.o clwrap.cpp
In file included from clwrap.cpp:4:
File.h: In static member function 'static File File::open(const std::filesystem::__cxx11::path&, const char*, bool)':
File.h:31:11: warning: format '%s' expects argument of type 'char*', but argument 2 has type 'const value_type*' {aka 'const wchar_t*'} [-Wformat=]
log("Can't open '%s' (mode '%s')\n", name.c_str(), mode);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
g++ -MT Task.o -MMD -MP -MF .d/Task.Td -Wall -O2 -std=c++17 -c -o Task.o Task.cpp
In file included from Task.cpp:7:
File.h: In static member function 'static File File::open(const std::filesystem::__cxx11::path&, const char*, bool)':
File.h:31:11: warning: format '%s' expects argument of type 'char*', but argument 2 has type 'const value_type*' {aka 'const wchar_t*'} [-Wformat=]
log("Can't open '%s' (mode '%s')\n", name.c_str(), mode);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
g++ -MT checkpoint.o -MMD -MP -MF .d/checkpoint.Td -Wall -O2 -std=c++17 -c -o checkpoint.o checkpoint.cpp
In file included from checkpoint.h:5,
from checkpoint.cpp:3:
File.h: In static member function 'static File File::open(const std::filesystem::__cxx11::path&, const char*, bool)':
File.h:31:11: warning: format '%s' expects argument of type 'char*', but argument 2 has type 'const value_type*' {aka 'const wchar_t*'} [-Wformat=]
log("Can't open '%s' (mode '%s')\n", name.c_str(), mode);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
g++ -MT timeutil.o -MMD -MP -MF .d/timeutil.Td -Wall -O2 -std=c++17 -c -o timeutil.o timeutil.cpp
g++ -MT Args.o -MMD -MP -MF .d/Args.Td -Wall -O2 -std=c++17 -c -o Args.o Args.cpp
In file included from Args.cpp:4:
File.h: In static member function 'static File File::open(const std::filesystem::__cxx11::path&, const char*, bool)':
File.h:31:11: warning: format '%s' expects argument of type 'char*', but argument 2 has type 'const value_type*' {aka 'const wchar_t*'} [-Wformat=]
log("Can't open '%s' (mode '%s')\n", name.c_str(), mode);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~
g++ -MT state.o -MMD -MP -MF .d/state.Td -Wall -O2 -std=c++17 -c -o state.o state.cpp
g++ -MT Signal.o -MMD -MP -MF .d/Signal.Td -Wall -O2 -std=c++17 -c -o Signal.o Signal.cpp
g++ -MT FFTConfig.o -MMD -MP -MF .d/FFTConfig.Td -Wall -O2 -std=c++17 -c -o FFTConfig.o FFTConfig.cpp
g++ -MT AllocTrac.o -MMD -MP -MF .d/AllocTrac.Td -Wall -O2 -std=c++17 -c -o AllocTrac.o AllocTrac.cpp
g++ -MT gpuowl-wrap.o -MMD -MP -MF .d/gpuowl-wrap.Td -Wall -O2 -std=c++17 -c -o gpuowl-wrap.o gpuowl-wrap.cpp
g++ -o gpuowl-win.exe Pm1Plan.o GmpUtil.o Worktodo.o common.o main.o Gpu.o clwrap.o Task.o checkpoint.o timeutil.o Args.o state.o Signal.o FFTConfig.o AllocTrac.o gpuowl-wrap.o -lstdc++fs -lOpenCL -lgmp -pthread -L/opt/rocm/opencl/lib/x86_64 -L/opt/amdgpu-pro/lib/x86_64-linux-gnu -L/c/Windows/System32 -L. -static
strip gpuowl-win.exe
[/CODE]

preda 2019-12-16 12:27

[QUOTE=paulunderwood;533030]How do I under-volt with rocm-smi?[/QUOTE]

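I use a small bash script, along the lines of:

rocm=/home/preda/ROC-smi/rocm-smi

# Usage: pp <gpu-id> <mem-MHz> <mid-mV> <end-mV> <sclk-level>
pp() {
    echo "$@"

    cd "/sys/class/drm/card$1/device" || return
    echo "m 1 $2" > pp_od_clk_voltage          # memory clock (MHz)
    echo "vc 1 1304 $3" > pp_od_clk_voltage    # midpoint: 1304 MHz at $3 mV
    echo "vc 2 1801 $4" > pp_od_clk_voltage    # end point: 1801 MHz at $4 mV
    echo c > pp_od_clk_voltage                 # commit the new table
    "$rocm" -d"$1" --setsclk "$5"
}

pp 2 1180 785 1050 3

The arguments are: GPU id (2), memory frequency (1180), voltage at the midpoint (785), voltage at the end point (1050), and the desired setsclk level (3).

Run "cat pp_od_clk_voltage" before changing anything, to see the current values.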
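
A dry run can help when adapting those values to another card. The sketch below is not preda's script: pp_lines is a hypothetical helper that only prints the lines the function would write to pp_od_clk_voltage, after checking that the arguments are plain integers. The check that the midpoint voltage does not exceed the end voltage is an assumption about what a sane curve looks like. The 1304 and 1801 MHz points are copied from the script above and may not suit other cards.

```shell
# Hypothetical dry-run helper: print what would be written, touch nothing.
# Usage: pp_lines <mem-MHz> <mid-mV> <end-mV>
pp_lines() {
    local mem_mhz=$1 mid_mv=$2 end_mv=$3
    local v

    # Reject anything that is not a plain non-negative integer.
    for v in "$mem_mhz" "$mid_mv" "$end_mv"; do
        case $v in
            ''|*[!0-9]*) echo "not a number: '$v'" >&2; return 1 ;;
        esac
    done

    # Assumption: the midpoint voltage should not exceed the end voltage.
    if [ "$mid_mv" -gt "$end_mv" ]; then
        echo "midpoint $mid_mv mV exceeds end point $end_mv mV" >&2
        return 1
    fi

    printf 'm 1 %s\n' "$mem_mhz"
    printf 'vc 1 1304 %s\n' "$mid_mv"
    printf 'vc 2 1801 %s\n' "$end_mv"
    printf 'c\n'
}

pp_lines 1180 785 1050
```

Once the printed lines look right, they can be written one at a time to pp_od_clk_voltage as in the script above.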

paulunderwood 2019-12-16 12:54

[QUOTE=preda;533050]I use a small bash script, along the lines of:

rocm=/home/preda/ROC-smi/rocm-smi

pp() {
echo $*

cd /sys/class/drm/card$1/device
echo "m 1 $2" > pp_od_clk_voltage
echo "vc 1 1304 $3" > pp_od_clk_voltage
echo "vc 2 1801 $4" > pp_od_clk_voltage
echo c > pp_od_clk_voltage
$rocm -d$1 --setsclk $5
}

pp 2 1180 785 1050 3

The arguments are: GPU id (2), memory frequency (1180), voltage at the midpoint (785), voltage at the end point (1050), and the desired setsclk level (3).

Run "cat pp_od_clk_voltage" before changing anything, to see the current values.[/QUOTE]

I don't have pp_od_clk_voltage. This is what I have:

[CODE]/sys/class/drm/card1/device# ls
aer_dev_correctable driver_override mem_info_gtt_total pp_dpm_dcefclk resource
aer_dev_fatal drm mem_info_gtt_used pp_dpm_fclk resource0
aer_dev_nonfatal enable mem_info_vis_vram_total pp_dpm_mclk resource0_wc
ari_enabled fw_version mem_info_vis_vram_used pp_dpm_pcie resource2
boot_vga gpu_busy_percent mem_info_vram_total pp_dpm_sclk resource2_wc
broken_parity_status hwmon mem_info_vram_used pp_dpm_socclk resource4
class i2c-10 modalias pp_features resource5
config i2c-4 msi_bus pp_force_state revision
consistent_dma_mask_bits i2c-6 msi_irqs pp_mclk_od rom
current_link_speed i2c-8 numa_node pp_num_states subsystem
current_link_width irq pcie_bw pp_power_profile_mode subsystem_device
d3cold_allowed local_cpulist pcie_replay_count pp_sclk_od subsystem_vendor
device local_cpus power pp_table uevent
df_cntr_avail max_link_speed power_dpm_force_performance_level remove unique_id
dma_mask_bits max_link_width power_dpm_state rescan vbios_version
driver mem_busy_percent pp_cur_state reset vendor
[/CODE]

Is it okay to cat the file pp_od_clk_voltage?

preda 2019-12-16 19:40

[QUOTE=paulunderwood;533052]I don't have pp_od_clk_voltage. This is what I have:
[/QUOTE]

Add amdgpu.ppfeaturemask=0xffffffff to the kernel command line in your /etc/default/grub to enable PowerPlay (and reboot). After that you should have the file pp_od_clk_voltage:

GRUB_CMDLINE_LINUX_DEFAULT="amdgpu.ppfeaturemask=0xffffffff"
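
To confirm the setting took effect after the reboot, something like the following can be run. It is only a sketch: has_ppfeaturemask is a hypothetical helper name, and it merely checks that the parameter appears on the kernel command line, not that the mask value is right for a given card.

```shell
# Hypothetical helper: succeed if a kernel command line string sets
# amdgpu.ppfeaturemask to any value.
has_ppfeaturemask() {
    case " $1 " in
        *" amdgpu.ppfeaturemask="*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the command line the running kernel was booted with.
if [ -r /proc/cmdline ] && has_ppfeaturemask "$(cat /proc/cmdline)"; then
    echo "ppfeaturemask is set on the running kernel"
else
    echo "ppfeaturemask is not set (or /proc/cmdline is unreadable)"
fi

# List any cards that expose the overdrive table.
ls /sys/class/drm/card*/device/pp_od_clk_voltage 2>/dev/null \
    || echo "no pp_od_clk_voltage found"
```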

dcheuk 2019-12-16 22:36

[QUOTE=ATH;532657]The folder:

C:\Users\[your user name]\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup

is for programs that should start only when that specific user logs in, while the folder:

C:\ProgramData\Microsoft\Windows\Start Menu\Programs\StartUp\

is for programs that should start for all users logging in[/QUOTE]

[QUOTE=kriesel;532661]So, the second is also not at system startup.[/QUOTE]

Sorry it's been a busy week.

This is exactly what I did on 2 computers running Win 10; both the misfit and mfaktc programs started before logging in (due to a Windows update). My apologies for the bad information. That is what I remembered doing, and it worked, so I thought it would be helpful. :sad:

paulunderwood 2019-12-17 03:35

[QUOTE=preda;533072]Add amdgpu.ppfeaturemask=0xffffffff to the kernel command line in your /etc/default/grub to enable PowerPlay (and reboot). After that you should have the file pp_od_clk_voltage:

GRUB_CMDLINE_LINUX_DEFAULT="amdgpu.ppfeaturemask=0xffffffff"[/QUOTE]

That worked after an update-grub. The undervolts in your script were too tight for my card and caused my first error. Here is what I have now:

[CODE]sh pp.sh
1 1160 820 1050 5[/CODE]

With the fan at 150, sensors shows:

[CODE]amdgpu-pci-0300
Adapter: PCI adapter
vddgfx: +0.96 V
fan1: 2937 RPM (min = 0 RPM, max = 3850 RPM)
edge: +68.0°C (crit = +100.0°C, hyst = -273.1°C)
(emerg = +105.0°C)
junction: +92.0°C (crit = +110.0°C, hyst = -273.1°C)
(emerg = +115.0°C)
mem: +74.0°C (crit = +94.0°C, hyst = -273.1°C)
(emerg = +99.0°C)
power1: 253.00 W (cap = 250.00 W)
[/CODE]

I am getting 752 us/it for FFT 5632K :smile:

EDIT: After 3 hours I got another error. I have relaxed the undervolt to:

[CODE]sh pp.sh
1 1160 830 1050 5
[/CODE]

Now 755 us/it.

kriesel 2019-12-17 17:31

Unusual output, showing factor truncation at 3 differing lengths
 
I recently started running widely separated exponents in P-1 to determine run time scaling on Radeon VII in gpuowl-win v6.11-83-ge270393 and feasible exponent limit. On a >500M test exponent I got unusual results, producing a mammoth alleged factor, or at least outputting some of what it claims to be one.

It appears the console, log, and results output truncate mammoth alleged factors at 3 different lengths. The longest output length appears to be 2029 digits, per Windows File Manager.

No warnings or errors are output when the truncation occurs.
In the results record, the truncation also drops the JSON punctuation that should follow the factor: [B]"]}[/B]

I request that the output fields be lengthened where possible, and that checks for truncation be included if possible.
Also, in P-1 stage one I was observing (min 0 0) at every output. If that is an error condition, it could be trapped. It appears to be OK, just odd looking.

I have reduced clock rates and, after successfully running a smallish test exponent with a known factor, I am rerunning the large test exponent, whose factorization is unknown, from the beginning. I have saved the first run's save files in a separate folder.

A few theories of what may have happened:
1) A hardware error caused by clock rates too high for the accuracy P-1 requires, given its inherent relative lack of computation error checks (most likely)
2) A software issue
3) A mammoth factor found that exceeds the allowed lengths of gpuowl's output formats, perhaps a composite of several factors (least likely)
4) Something else I haven't thought of
5) Some combination

I'll post an update after either the retest, or the availability of a new commit with longer output limits that could be run on the old saved files.
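
Until the program itself checks for truncation, a results file can be scanned from the outside. The sketch below is a heuristic only, built on the observation above that a truncated record is missing its closing "]}. It assumes one JSON record per line, and record_is_complete and find_truncated are hypothetical helper names, not part of gpuowl.

```shell
# Hypothetical helper: treat a record as complete if it ends with '}'.
record_is_complete() {
    local line
    # Drop carriage returns so files written on Windows also work.
    line=$(printf '%s' "$1" | tr -d '\r')
    case $line in
        *'}') return 0 ;;
        *)    return 1 ;;
    esac
}

# Print the line number and length of every record that looks truncated.
# Usage: find_truncated <results-file>
find_truncated() {
    local n=0 line
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        [ -z "$line" ] && continue
        record_is_complete "$line" \
            || echo "line $n looks truncated (${#line} chars)"
    done < "$1"
}
```

Calling find_truncated with the path to the results file prints nothing when every record is closed properly. A reported length close to the fixed limits mentioned above would point at the output buffer and not at a disk or editor problem.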

mrh 2019-12-17 17:44

The same happened to me last night on [M]133331333[/M]. I wanted one that would be quick if I deleted the state, so I tried [M]95531[/M]: same issue. Same with my nvidia card. Baffled, I pulled the latest from github and now it works fine. I was also increasing the output buffer, and had to increase it to over 40MB, since it was trying to print 2^133331333-1, which has a lot of digits lol

[URL="https://github.com/preda/gpuowl/issues/87"]github issue[/URL]

[QUOTE=kriesel;533104]I recently started running widely separated exponents in P-1 to determine run time scaling on Radeon VII in gpuowl-win v6.11-83-ge270393 and feasible exponent limit. On a >500M test exponent I got unusual results, producing a mammoth alleged factor, or at least outputting some of what it claims to be one.

It appears the console, log, and results output truncate mammoth alleged factors at 3 different lengths. The longest output length appears to be 2029 digits, per Windows File Manager.

No warnings or errors are output when the truncation occurs.
In the results record, the truncation also drops the JSON punctuation that should follow the factor: [B]"]}[/B]

I request that the output fields be lengthened where possible, and that checks for truncation be included if possible.
Also, in P-1 stage one I was observing (min 0 0) at every output. If that is an error condition, it could be trapped. It appears to be OK, just odd looking.

I have reduced clock rates and, after successfully running a smallish test exponent with a known factor, I am rerunning the large test exponent, whose factorization is unknown, from the beginning. I have saved the first run's save files in a separate folder.

A few theories of what may have happened:
1) A hardware error caused by clock rates too high for the accuracy P-1 requires, given its inherent relative lack of computation error checks (most likely)
2) A software issue
3) A mammoth factor found that exceeds the allowed lengths of gpuowl's output formats, perhaps a composite of several factors (least likely)
4) Something else I haven't thought of
5) Some combination

I'll post an update after the retest or the availability of a new commit with longer output limits.[/QUOTE]

kriesel 2019-12-17 20:02

[QUOTE=mrh;533107]The same happened to me last night on [M]133331333[/M]. I wanted one that would be quick if I deleted the state, so I tried [M]95531[/M]: same issue. Same with my nvidia card. Baffled, I pulled the latest from github and now it works fine. I was also increasing the output buffer, and had to increase it to over 40MB, since it was trying to print 2^133331333-1, which has a lot of digits lol

[URL="https://github.com/preda/gpuowl/issues/87"]github issue[/URL][/QUOTE]
Thanks, I'm giving gpuowl-v6.11-90-g2f94ace a chance at the saved old files.
Note, though, that v6.11-83 was able to perform a 10M with known factor correctly.
Which version(s) did you see the issue on?

And Preda, please put a safety net of some sort there for mammoth factors, whether legitimate or due to error. Perhaps check whether the factor fits in the available output buffers and, if not, print a message to that effect along with its length in bits or digits or whatever. A little more info that the re-run quickly gave:
[CODE]2019-12-17 14:08:58 roa/radeonvii-f2 xxxxxxxxx P2 2880/2880: 54806 primes; setup 2.13 s, 11.341 ms/prime
2019-12-17 14:09:00 roa/radeonvii-f2 yyyyyyyyy FFT 40960K: Width 256x4, Height 256x8, Middle 10; 16.69 bits/word
terminate called after throwing an instance of 'std::domain_error'
what(): GCD invalid input[/CODE]

mrh 2019-12-17 20:57

It was v6.11-77-g1af5378

[QUOTE=kriesel;533117]Thanks, I'm giving gpuowl-v6.11-90-g2f94ace a chance at the saved old files.
Note, though, that v6.11-83 was able to perform a 10M with known factor correctly.
Which version(s) did you see the issue on?

[/QUOTE]

mrh 2019-12-17 22:52

FWIW, I went back to v6.11-84-geda9b17 which is a lot faster than v6.11-90-g2f94ace for me:

1008 us/it vs 1524 us/it

Both using -use FMA_X2,MERGED_MIDDLE with --setsclk 3.

I didn't actually notice until I started getting text messages that my card was running hot.
Without MERGED_MIDDLE, 6.11-90 is 1068 us/it, but power draw is 12W more than 6.11-84, and temp is much higher.
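
For comparing builds, the relative slowdown is easier to read than the raw timings. A small sketch, with slowdown_pct as a hypothetical helper name and the us/it figures taken from the numbers above:

```shell
# Hypothetical helper: percentage by which `new` us/it is slower than `old`.
# Usage: slowdown_pct <old-us-per-it> <new-us-per-it>
slowdown_pct() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

slowdown_pct 1008 1524   # 6.11-84 vs 6.11-90, both with MERGED_MIDDLE
slowdown_pct 1008 1068   # 6.11-84 vs 6.11-90 without MERGED_MIDDLE
```

This prints 51.2 and 6.0, so 6.11-90 is about 51% slower with MERGED_MIDDLE and about 6% slower without it on this card.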


All times are UTC.