Forum: Software
2021-04-18, 18:12
|
Replies: 51
Views: 6,397
So, does this mean that p=M31 has all the required roots of two for the IBDWT for a Z/pZ NTT? If so, is NTT(M31) a viable alternative to FGT?
Or, the problem with M31 is that it doesn't have the...
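A quick way to check the roots-of-two part of that question, as a minimal sketch only: since 2^31 ≡ 1 (mod M31), 2 has odd order 31, so for any N coprime to 31 (in particular any power of two) an N-th root of 2 is again a power of 2. The helper name `root_of_two` is hypothetical and is not taken from the thread or from any existing program.

```python
M31 = (1 << 31) - 1  # the Mersenne prime 2^31 - 1

def root_of_two(n):
    """Return r with r**n == 2 (mod M31), for n coprime to 31.

    Because 2 has multiplicative order 31 mod M31, picking e with
    e * n == 1 (mod 31) gives (2**e)**n == 2**(1 + 31*k) == 2.
    """
    e = pow(n, -1, 31)  # modular inverse (Python 3.8+)
    return pow(2, e, M31)

# Example: a square root of 2 is 2**16, since 2**32 == 2 * 2**31 == 2 (mod M31).
r = root_of_two(2)
```

This only addresses the existence of roots of two; it says nothing about which roots of unity are available for the transform itself.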
|
Forum: GPU Computing
2021-04-15, 17:48
|
Replies: 1
Views: 174
|
Forum: GpuOwl
2021-04-03, 17:12
|
Replies: 16
Views: 832
|
Forum: Information & Answers
2021-03-31, 11:55
|
Replies: 7
Views: 341
|
Forum: Hardware
2021-03-29, 07:43
|
Replies: 16
Views: 741
The need for the general-MUL vs. MUL-3 only appears when changing the "L" step dynamically during a test. This is something GpuOwl does not support (and thus gets away with using MUL-3), but prime95...
|
Forum: GpuOwl
2021-03-28, 18:58
|
Replies: 82
Views: 10,400
|
Forum: Hardware
2021-03-21, 20:02
|
Replies: 16
Views: 741
|
Forum: GpuOwl
2021-03-12, 20:31
|
Replies: 16
Views: 832
|
Forum: GpuOwl
2021-03-12, 08:56
|
Replies: 82
Views: 10,400
I recently got a Radeon VII with Samsung memory (as an RMA replacement). Even without any RAM overclock, and without any undervolt, that memory consistently generates errors. This is in contrast with...
|
Forum: GpuOwl
2021-03-12, 08:52
|
Replies: 16
Views: 832
|
Forum: GpuOwl
2021-03-10, 19:43
|
Replies: 82
Views: 10,400
|
Forum: GpuOwl
2021-03-10, 17:28
|
Replies: 82
Views: 10,400
|
Forum: mersenne.ca
2021-03-02, 19:04
|
Replies: 599
Views: 67,986
|
Forum: Software
2021-02-20, 08:48
|
Replies: 39
Views: 6,428
3xSP sum()
Unfortunately, the sum() I have so far is a beast: 54 ADDs.
This seems like a rather expensive sum().
To see some corner-cases that sum() must handle, here is one example: given "x", we'd...
|
Forum: GPU Computing
2021-02-20, 08:42
|
Replies: 10
Views: 1,160
|
Forum: Software
2021-02-12, 21:02
|
Replies: 39
Views: 6,428
Figure 10 seems to indicate:
c0,e0 = twoSum(a0, b0)
d1,e11 = twoSum(a1, b1)
c1,e12 = twoSum(d1, e0)
c2 = a2 + b2 + e11 + e12
which looks pretty good (i.e. simpler than I was expecting)
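The four steps quoted above can be sketched as follows. This is a sketch under assumptions, not the poster's implementation: `twoSum` is taken to be Knuth's branch-free error-free addition, the helper names `add3` and `exact` are hypothetical, Python floats are double precision rather than the SP values the thread is about, and the result is left unnormalized.

```python
from fractions import Fraction

def twoSum(a, b):
    """Error-free addition: s is the rounded sum, e the rounding error,
    so that s + e equals a + b exactly (barring overflow)."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def add3(a, b):
    """Add two 3-component values (most significant component first),
    following the quoted steps line by line."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    c0, e0 = twoSum(a0, b0)
    d1, e11 = twoSum(a1, b1)
    c1, e12 = twoSum(d1, e0)
    c2 = a2 + b2 + e11 + e12  # plain adds: the lowest component absorbs the errors
    return c0, c1, c2

def exact(t):
    """Exact rational value of a multi-component number (for checking only)."""
    return sum(Fraction(x) for x in t)

x = (1.0, 2.0**-60, 2.0**-120)
y = (3.0, 2.0**-58, 0.0)
z = add3(x, y)
```

Comparing `exact(z)` with `exact(x) + exact(y)` is one way to check the result on inputs where the final plain additions in `c2` do not round.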
|
Forum: Software
2021-02-12, 19:45
|
Replies: 39
Views: 6,428
|
Forum: Software
2021-02-12, 08:42
|
Replies: 39
Views: 6,428
SP plan
I've been thinking some more about a practical SP FFT implementation on GPUs, and here are some problems/ideas:
1. FFT twiddles, i.e. the trigonometric constants (sin+cos) used in the FFT.
...
|
Forum: GpuOwl
2021-02-06, 20:04
|
Replies: 48
Views: 7,012
|
Forum: GpuOwl
2021-02-04, 06:16
|
Replies: 48
Views: 7,012
|
Forum: GpuOwl
2021-02-03, 21:40
|
Replies: 48
Views: 7,012
GpuOwl updated P-1 calculator
Hi, recently I revisited the P-1 calculator that's included with GpuOwl's source code https://github.com/preda/gpuowl/blob/master/pm1/pm1.cpp
The calculator is a small standalone C++ program; to...
|
Forum: GPU Computing
2021-01-17, 16:42
|
Replies: 21
Views: 1,718
|
Forum: Hardware
2021-01-05, 10:12
|
Replies: 128
Views: 12,211
The cache (L1/L2/L3) is used transparently for the *global* memory operations. It is managed automatically by the cache control (probably a variant of LRU), not explicitly by the software. So yes,...
|
Forum: GpuOwl
2020-12-13, 04:31
|
Replies: 199
Views: 18,053
|
Forum: GpuOwl
2020-12-06, 22:58
|
Replies: 2,696
Views: 244,194
|