mersenneforum.org  

Old 2010-01-20, 12:23   #78
BigBrother
 
Feb 2005
The Netherlands

218₁₀ Posts

Quote:
Originally Posted by TheJudger View Post
Thank you, BigBrother.
Actually, I have no CUDA-capable compiler environment installed under Windows. Is MSVC the first choice for CUDA on Windows? Is it available for free for non-commercial use?
I don't know if MSVC is the first choice for CUDA on Windows; it's the only one I've got here. I'm using a student license for MSVC.
Old 2010-01-20, 15:55   #79
moebius
 
Jul 2009
Germany

2C2₁₆ Posts

Microsoft Visual Studio Express and the MSVC++ Express Edition are free of charge; MFC, ATL and OpenMP require the Standard Edition or higher.

Last fiddled with by moebius on 2010-01-20 at 15:57
Old 2010-01-20, 19:44   #80
moebius
 
Jul 2009
Germany

2·353 Posts

E:\mfactc>mfaktc-hack 65255629 68 69 0
mfaktc v0.02 Copyright (C) 2009, 2010 Oliver Weihe
THREADS_PER_GRID 1048576
THREADS_PER_BLOCK 256
SIEVE_SIZE_LIMIT 32kiB
SIEVE_SIZE 230945bits
SIEVE_PRIMES 50000
USE_PINNED_MEMORY enabled
USE_ASYNC_COPY enabled
VERBOSE_TIMING disabled
SELFTEST disabled
MORE_CLASSES disabled
sieve_init(): sieving factor candidates with small primes up to 611957
tf(65255629, 68, 69);
k_min = 2261474677980

no factor for M65255629 from 2^68 to 2^69
tf(): total time spent: 12785031msec

Too bad that I can't upload the results to PrimeNet.
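
The numbers in the log above can be sanity-checked with a few lines of Python. This is only a sketch and not mfaktc code: it relies on the known fact that any factor q of M_p has the form q = 2·k·p + 1, so the range 2^68 to 2^69 corresponds to a range of k values. The helper name `k_range` is made up for illustration. The observation that the printed `k_min` is rounded down to a multiple of 420 (presumably the number of classes with MORE_CLASSES disabled) is an assumption read off the log, not something stated in this thread.

```python
def k_range(p, bit_min, bit_max):
    """Approximate k range such that q = 2*k*p + 1 lies between 2^bit_min and 2^bit_max."""
    k_lo = (2 ** bit_min) // (2 * p)
    k_hi = (2 ** bit_max) // (2 * p)
    return k_lo, k_hi

p = 65255629                      # exponent from the log above
printed_k_min = 2261474677980     # "k_min" line from the log above

k_lo, k_hi = k_range(p, 68, 69)
print("first k at 2^68:       ", k_lo)
print("offset to printed k_min:", k_lo - printed_k_min)  # assumed: class alignment
print("number of k to cover:  ", k_hi - k_lo)

# "total time spent: 12785031msec" converted to hours
hours = 12785031 / 1000 / 3600
print("run time in hours:     ", round(hours, 2))
```

Most of those k values never reach the GPU, since the sieve removes candidates that are divisible by small primes.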
Old 2010-01-21, 11:25   #81
TheJudger
 
"Oliver"
Mar 2005
Germany

10001011011₂ Posts

Hello!

moebius: If you want to help, rerun already-factored exponents and help prove that my code finds the factor(s) as well.

False positives are not the problem (if there are any).
But we need to be sure that my code doesn't miss any factors.

On your hardware "SIEVE_PRIMES 200000" might be beneficial; too bad that you can't check easily (it needs a recompile).

Can you add one or two lines starting with "class" as well, so I can see the actual speed?
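
For anyone rerunning known factors: a reported factor can be checked independently of the GPU code, which is why false positives are cheap to catch. A minimal sketch in Python, assuming nothing about mfaktc's internals: q divides M_p = 2^p - 1 exactly when 2^p mod q equals 1, and every factor of M_p (p an odd prime) satisfies q ≡ 1 (mod 2p) and q ≡ ±1 (mod 8). The function names and the small test pairs below are illustrative and do not come from this thread.

```python
def divides_mersenne(p, q):
    """True if q divides 2^p - 1 (three-argument pow does modular exponentiation)."""
    return pow(2, p, q) == 1

def has_factor_form(p, q):
    """Necessary conditions for a factor q of M_p with p an odd prime."""
    return q % (2 * p) == 1 and q % 8 in (1, 7)

# Small textbook examples (hypothetical stand-ins for a list of known factors):
# M11 = 23 * 89, M23 = 47 * 178481, M29 = 233 * 1103 * 2089
known = [(11, 23), (11, 89), (23, 47), (29, 233)]

for p, q in known:
    ok = divides_mersenne(p, q) and has_factor_form(p, q)
    print(f"M{p}: factor {q} {'verified' if ok else 'NOT verified'}")
```

A missed factor is the harder case: the only way to detect it is to rerun ranges where a factor is already known and confirm that the program reports it.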

-----
Find attached version 0.03

- allow exponents up to 2^32 - 1
  (tested with some exponents around M3321xxxxxx)
- siever: improved the loop which creates the candidate list (again)
  - loop unrolled
  - use a lookup table to parse 8 bits at once
- added 40 known factors from ElevenSmooth "Operation Billion Digits" in the
  M3321xxxxxx range to the selftest
- added another timer which helps to adjust SIEVE_PRIMES (needs to be
  enabled with VERBOSE_TIMING)
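
To illustrate the "lookup table to parse 8 bits at once" item: the idea, as far as it can be inferred from the changelog, is to turn the sieve bitmap into a list of surviving candidate offsets one byte at a time instead of testing each bit separately. The sketch below is in Python rather than the C of the real code, and the bit order (bit 0 = first candidate in the byte) and the convention that a set bit means "candidate survived" are assumptions made for the example.

```python
# BIT_POSITIONS[b] holds the indices of the set bits of byte value b,
# precomputed once for all 256 possible byte values.
BIT_POSITIONS = [tuple(i for i in range(8) if (b >> i) & 1) for b in range(256)]

def candidates_from_sieve(sieve):
    """Return offsets of all set bits in the sieve, handling one byte per table lookup."""
    out = []
    for byte_index, b in enumerate(sieve):
        base = byte_index * 8
        for bit in BIT_POSITIONS[b]:
            out.append(base + bit)
    return out

# Bits 0 and 2 of the first byte and bit 7 of the second byte are set.
print(candidates_from_sieve(bytes([0b00000101, 0b10000000])))
```

The table removes the per-bit test and branch from the inner loop, which matters because this step runs on the CPU and has to keep up with the GPU.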

The bigger exponents need further testing!

You'll see the speedups only in configurations where the CPU was the limiting factor!

-----
Uncwilly: M3312xxxxxx from 2^1 to 2^71 needs ~5m 35s on my system :)


Oliver
Attached Files
File Type: gz mfaktc-0.03.tar.gz (26.8 KB, 289 views)

Last fiddled with by TheJudger on 2010-01-21 at 11:26
Old 2010-01-21, 11:58   #82
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

5·7·139 Posts

As an owner of an Nvidia card on a Linux OS and a manager of Operation Billion Digits, I will soon ask for help testing the range 3321xxxxxx.

Luigi
Old 2010-01-21, 12:09   #83
TheJudger
 
"Oliver"
Mar 2005
Germany

5×223 Posts

Hi Luigi,

keep in mind that it is still limited to 71 bits factor size.

Oliver
Old 2010-01-21, 12:29   #84
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

5·7·139 Posts

Quote:
Originally Posted by TheJudger View Post
Hi Luigi,

keep in mind that it is still limited to 71 bits factor size.

Oliver
Keep in mind that I still need to install Linux and learn how to use CUDA...

We may test our factors below 71 bits in less than a day (note that we didn't use Prime95...)

I'll keep waiting for an 80-bit version to put on the OBD page.

Luigi
Old 2010-01-21, 19:44   #85
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

4710₈ Posts

Wow, this could be a turning point for GIMPS. I hope your code gets implemented in Prime95!
Old 2010-01-21, 21:54   #86
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

3·7·17·31 Posts

Quote:
Originally Posted by TheJudger View Post
Uncwilly: M3312xxxxxx from 2^1 to 2^71 needs ~5m 35s on my system :)
I may have to invest in a box for home then. I currently only have a laptop.
Old 2010-01-21, 22:49   #87
nucleon
 
Mar 2003
Melbourne

5·103 Posts

Quote:
Originally Posted by TheJudger View Post
Uncwilly: M3312xxxxxx from 2^1 to 2^71 needs ~5m 35s on my system :)
How does that compare with CPUs (Core i7 etc.)?


-- Craig
Old 2010-01-22, 00:38   #88
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

3×7×17×31 Posts

Quote:
Originally Posted by nucleon View Post
How does that compare with CPUs (Core i7 etc.)?
Going from 2^70 to 2^71 on a single core takes ~8 h on my laptop, for 0.72 GHz-days of credit.

Last fiddled with by Uncwilly on 2010-01-22 at 00:38
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.