mersenneforum.org  

Old 2021-09-01, 22:46   #56
SethTro
 

Quote:
Originally Posted by frmky
cudacommon.h is missing from the git repository.

Fixed along with another issue.
Old 2021-09-02, 00:44   #57
SethTro
 

Quote:
Originally Posted by bsquared
1280: (~31 ms/curve)
2560: (~21 ms/curve)
640: (~63 ms/curve)
1792: (~36 ms/curve)

So we have a winner! -gpucurves 2560 beats all the others, and anything the old build could do as well (best on the old build was 5120 at ~25 ms/curve).

With the smaller kernel (running (2^499-1) / 20959), -gpucurves 5120 is fastest at about 6 ms/curve on both new and old builds.

I added `gpu_throughput_test.sh`, which runs different sized inputs and measures throughput.

On my system, maximum throughput is achieved at:

256 bits: 2x default curves (or 3584 curves), same speed at 4x default too
512 bits: 2x and 4x default curves
1024 bits: only at default curves
extra testing at 2048 bits: 1.5x and 3x outperform 2x and 4x slightly
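
For comparing settings, the per-curve timings quoted above convert directly into throughput. The Python sketch below only illustrates that arithmetic and is not the contents of gpu_throughput_test.sh. The dictionary restates bsquared's figures (from a different system than mine), and the default of 1792 curves is inferred from "2x default curves (or 3584 curves)".

```python
# ms per curve reported by bsquared, keyed by the -gpucurves value
ms_per_curve = {640: 63, 1280: 31, 1792: 36, 2560: 21}


def curves_per_second(ms):
    """Convert a per-curve time in milliseconds into curves per second."""
    return 1000.0 / ms


# The fastest setting is the one with the smallest time per curve.
best = min(ms_per_curve, key=ms_per_curve.get)

# Default curve count inferred from "2x default curves (or 3584 curves)".
default_curves = 3584 // 2
multiples = {m: int(m * default_curves) for m in (1, 1.5, 2, 3, 4)}

for n in sorted(ms_per_curve):
    rate = curves_per_second(ms_per_curve[n])
    print(f"-gpucurves {n}: {rate:.1f} curves/s")
print("fastest setting:", best)
print("curve counts for 1x, 1.5x, 2x, 3x, 4x default:", list(multiples.values()))
```

Working in curves per second makes it easier to compare runs that use different curve counts.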
Old 2021-09-02, 00:45   #58
SethTro
 

Quote:
Originally Posted by SethTro
I added `gpu_throughput_test.sh`, which runs different sized inputs and measures throughput.

On my system, maximum throughput is achieved at:

256 bits: 2x default curves (or 3584 curves), same speed at 4x default too
512 bits: 2x and 4x default curves
1024 bits: only at default curves
extra testing at 2048 bits: 1.5x and 3x outperform 2x and 4x slightly

Maybe this relates to the number of registers used by the kernel, or to the maximum threads per block? Any insight from CUDA experts would be appreciated.
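
One way to reason about the register question is a rough occupancy estimate. The Python sketch below is a simplified model, not a measurement of the ECM kernel. The 65536 registers and 2048 threads per SM are limits found on many NVIDIA GPUs but they vary by compute capability, the 128 registers per thread is a hypothetical figure, and real hardware also rounds register allocations and caps the number of blocks per SM.

```python
def blocks_per_sm(regs_per_thread, threads_per_block,
                  regs_per_sm=65536, max_threads_per_sm=2048):
    """Upper bound on resident blocks per SM from registers and the thread cap.

    Simplified: ignores allocation granularity, shared memory and the
    hardware limit on blocks per SM. Defaults are illustrative assumptions.
    """
    by_registers = regs_per_sm // (regs_per_thread * threads_per_block)
    by_threads = max_threads_per_sm // threads_per_block
    return min(by_registers, by_threads)


def resident_threads(regs_per_thread, threads_per_block):
    """Threads that can be resident on one SM under the same simple model."""
    return blocks_per_sm(regs_per_thread, threads_per_block) * threads_per_block


# Hypothetical kernel using 128 registers per thread.
for tpb in (128, 512):
    print(tpb, blocks_per_sm(128, tpb), resident_threads(128, tpb))
```

The register count actually used by a kernel can be printed at compile time by passing -Xptxas -v to nvcc, which would replace the hypothetical 128 above.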
Old 2021-09-02, 09:12   #59
SethTro
 

I halved compile time by adding cgbn_swap and avoiding inlining double_add_v2 twice.

Sadly, I have already pushed the branch, so it will probably fail to compile for everyone until https://github.com/NVlabs/CGBN/pull/17 gets merged.

---

@bsquared, you might try changing TPB_DEFAULT from 128 to 512. In some initial testing with ./gpu_throughput_test.sh, it looks like larger gpucurves values no longer slow things down. More testing to follow tomorrow.
Old 2021-09-02, 15:39   #60
chris2be8
 

Quote:
Originally Posted by henryzz
My guess is that your gcc version may be too old. I would try the most recent version you can get your hands on. The easiest way may be to update your OS to a version that isn't end of life.

I've installed gcc-6 (the latest in the repositories) and that gets past that error, but fails a bit further on:
Code:
 gcc-6 --version
gcc-6 (SUSE Linux) 6.2.1 20160826 [gcc-6-branch revision 239773]
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

./configure --enable-gpu=30 --with-cuda=/usr/local/cuda CC=gcc-6 -with-cgbn-include=/home/chris/CGBN/include/cgbn
...
configure: Using cuda.h from /usr/local/cuda/include
checking cuda.h usability... yes
checking cuda.h presence... yes
checking for cuda.h... yes
checking that CUDA Toolkit version is at least 3.0... (9.0) yes
configure: Using CUDA dynamic library from /usr/local/cuda/lib64
checking for cuInit in -lcuda... yes
checking that CUDA Toolkit version and runtime version are the same... no
configure: error: 'cuda.h' and 'cudart' library have different versions, you have to reinstall CUDA properly, or use the --with-cuda parameter to tell configure the path to the CUDA library and header you want to use
That error message doesn't make much sense because I only have one version of CUDA installed on the system. So it's probably failing to compile a test program.

So I'll try upgrading the OS next. Then install later versions of CUDA and gcc.
Old 2021-09-02, 20:06   #61
SethTro
 

Quote:
Originally Posted by chris2be8
I've installed gcc-6 (the latest in the repositories) and that gets past that error, but fails a bit further on:
Code:
configure: error: 'cuda.h' and 'cudart' library have different versions, you have to reinstall CUDA properly, or use the --with-cuda parameter to tell configure the path to the CUDA library and header you want to use
That error message doesn't make much sense because I only have one version of CUDA installed on the system. So it's probably failing to compile a test program.

You can find the literal program it failed to compile in config.log, or its general shape in acinclude.m4 (basically wrap the 2nd block in int main() { ... }).

Code:
    AC_RUN_IFELSE([AC_LANG_PROGRAM([
      [
        #include <stdio.h>
        #include <string.h>
        #include <cuda.h>
        #include <cuda_runtime.h>
      ]],[[
        int libversion;
        cudaError_t err;
        err = cudaRuntimeGetVersion (&libversion);
        if (err != cudaSuccess)
        {
          printf ("Could not get runtime version\n");
          printf ("Error msg: %s\n", cudaGetErrorString(err));
          return -1;
        }
        printf("(%d.%d/", CUDA_VERSION/1000, (CUDA_VERSION/10) % 10);
        printf("%d.%d) ", libversion/1000, (libversion/10) % 10);
        if (CUDA_VERSION == libversion)
          return 0;
        else
          return 1;
      ]])],
And you can find the command line it tried to compile this with in config.log too (my guess is something like gcc-9 -o conftest -I/usr/local/cuda/include -g -O2 -I/usr/local/cuda/include -Wl,-rpath,/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 conftest.c -lcudart -lstdc++ -lcuda -lrt).
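
To make the version arithmetic in that test program concrete, here is a small Python sketch of the same decoding. It assumes the usual CUDA encoding of major*1000 + minor*10, and the value 11040 is only an illustrative input, not something taken from chris2be8's system. Note that configure reports "no" whenever the test program fails to compile or exits with a non-zero status, so a "no" does not by itself prove the versions differ.

```python
def decode_cuda_version(v):
    """Mirror the conftest arithmetic: major = v/1000, minor = (v/10) % 10."""
    return v // 1000, (v // 10) % 10


def describe(header_version, runtime_version):
    """Format both versions in the same style as the conftest printf calls."""
    header = "%d.%d" % decode_cuda_version(header_version)
    runtime = "%d.%d" % decode_cuda_version(runtime_version)
    status = "yes" if header_version == runtime_version else "no"
    return f"({header}/{runtime}) {status}"


print(describe(9000, 9000))    # matching header and library
print(describe(9000, 11040))   # hypothetical mismatch
```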
Old 2021-09-02, 22:28   #62
frmky
 

I think this can be triggered if the version of CUDA supported by the driver doesn't match the toolkit version. But this is usually ok as long as the driver is a little newer. I think both this and the lack of cuInit() in the CUDA lib should be warnings, not errors. Both of these are ok in some circumstances.
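
As a sketch of what a more forgiving check could look like, the Python below classifies a toolkit/driver pair instead of failing on any mismatch. This is a hypothetical policy, not what configure does today. The function name is made up, versions are assumed to be encoded as major*1000 + minor*10, and treating an older driver as a hard error is my assumption.

```python
def classify_cuda_versions(toolkit_version, driver_version):
    """Return 'ok', 'warning' or 'error' for a toolkit/driver version pair.

    Hypothetical policy: equal versions are fine, a newer driver only
    warrants a warning, and an older driver is treated as an error.
    """
    if driver_version == toolkit_version:
        return "ok"
    if driver_version > toolkit_version:
        return "warning"
    return "error"


for toolkit, driver in ((9000, 9000), (9000, 11040), (11040, 9000)):
    print(toolkit, driver, classify_cuda_versions(toolkit, driver))
```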
Old 2021-09-02, 23:57   #63
SethTro
 

Happy me!

I found a 36-digit and a 35-digit factor of a C303 today (from Factoring for a Publication):

Code:
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 1796 (-sigma 3:1850760857)
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 2049 (-sigma 3:1850761110)
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 2449 (-sigma 3:1850761510)
Computing 3584 Step 1 took 2294ms of CPU time / 1816867ms of GPU time
********** Factor found in step 1: 404157820975138535541421971085010741
Found prime factor of 36 digits: 404157820975138535541421971085010741
Then
Code:
Thu 2021/09/02 23:25:50 UTC Step 1 took 0ms
Thu 2021/09/02 23:25:50 UTC Step 2 took 9668ms
Thu 2021/09/02 23:25:50 UTC ********** Factor found in step 2: 51858345311243630596653971633910169
Thu 2021/09/02 23:25:50 UTC Found prime factor of 35 digits: 51858345311243630596653971633910169
Feels good that this code is being useful :)
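
For anyone reading the Step 1 log above, here is a small Python sketch that pulls out two things: the time per curve, and whether the reported sigma values are consistent with each curve using a consecutive sigma. The regular expressions are written for exactly these log lines and are my own, not part of GMP-ECM.

```python
import re

LOG = """\
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 1796 (-sigma 3:1850760857)
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 2049 (-sigma 3:1850761110)
GPU: factor 404157820975138535541421971085010741 found in Step 1 with curve 2449 (-sigma 3:1850761510)
Computing 3584 Step 1 took 2294ms of CPU time / 1816867ms of GPU time
"""

HIT = re.compile(r"with curve (\d+) \(-sigma (\d+):(\d+)\)")
TIMING = re.compile(
    r"Computing (\d+) Step 1 took (\d+)ms of CPU time / (\d+)ms of GPU time")

# (curve index, sigma) for every curve that found the factor
hits = [(int(c), int(s)) for c, _, s in HIT.findall(LOG)]

# If sigma = first_sigma + curve index, every hit gives the same difference.
first_sigmas = {sigma - curve for curve, sigma in hits}

curves, cpu_ms, gpu_ms = map(int, TIMING.search(LOG).groups())
ms_per_curve = gpu_ms / curves

print("sigma of curve 0:", first_sigmas)
print(f"{ms_per_curve:.1f} ms of GPU time per curve")
print("digits in factor:", len("404157820975138535541421971085010741"))
```

All three hits give the same starting sigma, so any of them can be rerun on the CPU with the -sigma value shown in the log.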

Last fiddled with by SethTro on 2021-09-02 at 23:58
Old 2021-09-03, 07:02   #64
frmky
 

Quote:
Originally Posted by SethTro
Feels good that this code is being useful :)

Nearly all of the factors that I found for Factoring for a Publication 2 used this code.
Old 2021-09-04, 16:00   #65
chris2be8
 

I'm still puzzling over it. I've upgraded the system to openSUSE Leap 15.3 and installed CUDA 11.4. But no matter what I do, lspci -v still says Kernel modules: nouveau.

I've tried everything I can find in the CUDA Installation Guide for Linux. And everything I can find on the web. But it still loads the nouveau kernel module, not the one shipped with CUDA. Has anyone any idea how to get it to use the Nvidia drivers?

NB. On the system with the GTX 970:
Code:
4core:~ # lspci -v -s 01:00
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: eVga.com. Corp. Device 3978
	Flags: fast devsel, IRQ 11
	Memory at f6000000 (32-bit, non-prefetchable) [disabled] [size=16M]
	Memory at e0000000 (64-bit, prefetchable) [disabled] [size=256M]
	Memory at f0000000 (64-bit, prefetchable) [disabled] [size=32M]
	I/O ports at e000 [disabled] [size=128]
	Expansion ROM at f7000000 [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Legacy Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Kernel modules: nouveau
Compare with the output on the system with a CC 3.0 card:
Code:
root@sirius:~# lspci -v -s 07:00
07:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 760] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Micro-Star International Co., Ltd. [MSI] GK104 [GeForce GTX 760]
	Flags: bus master, fast devsel, latency 0, IRQ 76
	Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
	Memory at e8000000 (64-bit, prefetchable) [size=128M]
	Memory at f0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at e000 [size=128]
	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [b4] Vendor Specific Information: Len=14 <?>
	Capabilities: [100] Virtual Channel
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Kernel driver in use: nvidia
	Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
Compare the last line of output in each case.

If it's because CUDA 11.4 doesn't support this card, I could try removing CUDA 11.4 and installing CUDA 10.x. But would that work?
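
The difference between the two listings can be pulled out mechanically. The Python sketch below is only an illustration (the function name is made up) and uses the first and last lines of each listing above. "Kernel driver in use" names the driver actually bound to the device, while "Kernel modules" only lists modules that could handle it. The GTX 970 listing has no "Kernel driver in use" line at all, which suggests that no driver is bound there and that no nvidia module is available to the running kernel.

```python
def kernel_driver_info(lspci_text):
    """Extract the bound driver and candidate modules from `lspci -v` output."""
    info = {"driver_in_use": None, "modules": []}
    for line in lspci_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        if key == "Kernel driver in use":
            info["driver_in_use"] = value.strip()
        elif key == "Kernel modules":
            info["modules"] = [m.strip() for m in value.split(",")]
    return info


GTX_970 = (
    "01:00.0 VGA compatible controller: NVIDIA Corporation GM204 "
    "[GeForce GTX 970] (rev a1) (prog-if 00 [VGA controller])\n"
    "\tKernel modules: nouveau\n"
)
GTX_760 = (
    "07:00.0 VGA compatible controller: NVIDIA Corporation GK104 "
    "[GeForce GTX 760] (rev a1) (prog-if 00 [VGA controller])\n"
    "\tKernel driver in use: nvidia\n"
    "\tKernel modules: nvidiafb, nouveau, nvidia_drm, nvidia\n"
)

print(kernel_driver_info(GTX_970))
print(kernel_driver_info(GTX_760))
```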
Old 2021-09-04, 16:58   #66
paulunderwood
 

Try following this solution (although it was written for Ubuntu):

Quote:
Boot to Ubuntu, but before you log in to Ubuntu, press Ctrl+Alt+F2

run the following command:

sudo nano /etc/modprobe.d/blacklist-nouveau.conf

add the 2 following lines, save & exit

blacklist nouveau
options nouveau modeset=0

run the following command

sudo update-initramfs -u
reboot.

run lsmod | grep nvidia

HTH
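
A quick way to confirm the blacklist file contains both lines from the quoted steps is sketched below in Python. The function name is made up, and the check only covers the file contents, not whether the initramfs was rebuilt. Note that update-initramfs is a Debian/Ubuntu tool. On openSUSE the initramfs is usually rebuilt with dracut instead, which is an assumption worth checking on your system.

```python
REQUIRED = ("blacklist nouveau", "options nouveau modeset=0")


def missing_directives(conf_text):
    """Return the required lines that are absent from a modprobe.d file."""
    present = set()
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            # Collapse repeated whitespace so spacing differences do not matter.
            present.add(" ".join(stripped.split()))
    return [d for d in REQUIRED if d not in present]


complete = "blacklist nouveau\noptions nouveau modeset=0\n"
partial = "# nouveau disabled for CUDA\nblacklist   nouveau\n"

print(missing_directives(complete))
print(missing_directives(partial))
```

To check a real file, pass in the contents of /etc/modprobe.d/blacklist-nouveau.conf read with open(...).read().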

Last fiddled with by paulunderwood on 2021-09-04 at 17:08
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.