#2146 | ||
|
Romulan Interpreter
Jun 2011
Thailand
3·3,221 Posts |
Quote:
Quote:
). You know I was really missing it, and I was very upset when I found out (hey! I am the dude who argued hard about having the same naming scheme for checkpoints in cudaPM1, remember? and I got a few punches even from Batalov for that ) but I didn't go back to 2.04 because I also like the new keyboard options in 2.05 (increase, decrease things interactively - brilliant!) and the size/threads tuning mechanism; it saves me tons of manual work which I used to do with 2.04 to tune the ranges. So, don't get me wrong, 2.05 is a wonderful upgrade! kotgw guys! |
||
|
|
|
|
|
#2147 |
|
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²·5·7 Posts |
LaurV, your issue was fixed in r59. However, I did find a minor bug in threadbench which is up with r65. I'm sure Jerry will have Windows versions soon.
|
|
|
|
|
|
#2148 |
|
"Jerry"
Nov 2011
Vancouver, WA
1123₁₀ Posts |
r65 Windows binaries posted to SourceForge
|
|
|
|
|
|
#2149 |
|
Romulan Interpreter
Jun 2011
Thailand
3·3,221 Posts |
Downloaded. Thanks. Some things could be better...
- The interactive t and T look very nice, but don't work. The iteration counter refuses to go under 10000, for example. This was working in the former version.
- The file names are missing the ".txt" part, so some browsers will not show the residues, because they now treat the residue part of the name as the "file extension".
- The "tune" mechanism is gone, or I am stupid enough to have failed for almost an hour to convince it to do some tuning. I tried -cufftbench x y z with x=y (it only does the fft test, overwriting the file, c'mon! I am going berserk over this; luckily I have a backup in the other folder, I worked one full day to make that file with the older version!) or with y<x as specified in the ini (it crashes completely!). |
|
|
|
|
|
#2150 |
|
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²×5×7 Posts |
Most of your problems are because I need to get off my ass and write some documentation.
-threadbench 1 8192 5 1, for example, benchmarks fft lengths from 1 to 8192 found in <gpu> fft.txt. I have debated back and forth with myself whether to overwrite or append when doing a new cufftbench. The checkpoint interval cannot be less than the screen report interval. I'm open to suggestions for better ways to do these things. |
|
|
|
|
|
#2151 |
|
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts |
I've started an updated README. I'll email it to you later so you can use it.
|
|
|
|
|
|
#2152 |
|
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
13B₁₆ Posts |
Isn't there a way to have all the files in the directory shown? I think it's silly to call something a text file when it is not a file with text in it. Maybe use extension .cls?
|
|
|
|
|
|
#2153 | |
|
Romulan Interpreter
Jun 2011
Thailand
3×3,221 Posts |
Quote:
Agree that files should not be saved too often. In fact, what would work perfectly for me would be to have files saved every 200k, 500k, or 1M iterations, but to be able to print on screen every 2k, 5k, or 10k. That is because it gives you a feeling that the program is doing something, that the code is not "in the woods", while at the same time saving files is a compromise between speed and hard disk space. Ideally we would save every iteration, so we could properly resume the DCs without wasting time redoing the last 1M iterations every time it crashes or misses the partial residues (if you have them). But of course, this idea is not only stupid, but also absurd. Saving every 1M or 500k iterations (or 100k for huge expos) should be quite OK, and it should be quite OK to have this "non changeable": you put it in the ini file, or on the command line, and it stays there! If you need a new interval, you can change the ini file and restart (or restart with different command line parameters). This should be perfect.

So, the t/T key should not affect the number of iterations after which the checkpoint is saved. The t/T is only for the screen, and my complaint was exactly that: pressing t/T has no effect on screen, the output is every 10k iterations, no matter what. In the beginning I want to evaluate an ETA for an expo, I can't wait till it does 100k iterations, and I press "t" a few times. Nothing happens, besides the screen messages that the number is decreased. In the older versions, a checkpoint file was written every time a screen line was written, and this is still ok, with the observation that it is limiting: you can't go too low with the number without filling your HDD fast with garbage checkpoints and slowing things down (writing to disk is slow). But it was still ok; if there is no other way, we'd better have it as it was.

It would be wonderful to have the syntax of the thread benching switch (why are there 4 numbers? what is the fourth?); this I am going to experiment with immediately and urgently when I reach home in the evening (~6 hours left, lunch break now).

And don't get angry with me when I am grumpy, it is not disrespect. I respect you all for the good things you do here. BTW, I went last night to the cudaLucas home page and saw the list of "contributors", starting with Dubslow. All of you did a wonderful job, but people's memory is short, and many, reading that page, may not remember that those changes, from Dubslow until today, are more or less "cosmetic", and it is actually msft who did cudaLucas...

(Sorry if I look a bit grumpy, I mean I am usually grumpy, but today more than other days; as I told a few in private, I had a small car accident, nothing serious, only a few scratches on my car; a guy with a truck hit me from the back-right - in Thailand "right" is the driver's side. My car insurance expired on March 11, I didn't have time to renew it because I was too busy at work, and of course the truck guy didn't stop - which is quite common in Thailand! I have the truck number, but people say that is not very useful: squeezing money from transportation companies, usually owned by some local lords, is impossible if you don't have insurance - in the other case it is not your business anymore, the insurance company will take care of it).

Last fiddled with by LaurV on 2014-03-21 at 07:14 |
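The separation asked for above (frequent screen lines, infrequent checkpoint files) can be sketched roughly as follows. This is only an illustration in Python, not CUDALucas code; `run`, `report_every` and `checkpoint_every` are hypothetical names, and the loop body stands in for the real iteration.

```python
def run(total_iterations, report_every, checkpoint_every):
    """Simulate an iteration loop with independent report and checkpoint intervals.

    All names here are hypothetical; this is not how CUDALucas is implemented.
    """
    reports, checkpoints = [], []
    for i in range(1, total_iterations + 1):
        # ... one iteration of the actual test would run here ...
        if i % report_every == 0:
            reports.append(i)       # cheap: one line on the screen
        if i % checkpoint_every == 0:
            checkpoints.append(i)   # expensive: one write to disk
    return reports, checkpoints


# Screen output every 2k iterations, checkpoint files every 500k.
reports, checkpoints = run(1_000_000, report_every=2_000, checkpoint_every=500_000)
print(len(reports), "screen lines,", len(checkpoints), "checkpoint files")
```

With the two intervals kept apart, lowering the screen interval to get a quick ETA costs no extra disk writes.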
|
|
|
|
|
|
#2154 |
|
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²·5·7 Posts |
I am not at all angry with you; in fact I value your input highly.
Y and y switches in interactive mode increase or decrease the screen report interval. The fourth parameter in the -threadbench option affects which fft lengths are tested and what is printed on the screen. 1 tests every fft length in the <gpu> fft.txt file, 0 tests all reasonable fft lengths (greatest prime factor <= 7), higher numbers affect screen output and some exclude 32 and 1024 as tested thread values. I can't remember the exact details at the moment. Not all changes since msft quit working on CUDALucas have been cosmetic only. In fact many of the changes most users won't even notice.
1. Random bit shift.
2. Change fft length during the test.
3. Better fft and threads optimization.
4. Less memory usage, on device, host, and disk.
5. Smaller data transfers between device and host, and between host and disk.
6. Faster kernels (thanks again George)
7. Ability to change most basic settings without restarting. |
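To make the "greatest prime factor <= 7" rule mentioned above concrete, here is a minimal Python sketch that lists such lengths in a range. It treats the bounds as plain integers (the thread does not spell out the units), and `is_7_smooth` and `smooth_lengths` are made-up names; CUDALucas may well apply further criteria for what counts as "reasonable".

```python
def is_7_smooth(n):
    """Return True if n has no prime factor greater than 7."""
    if n < 1:
        return False
    for p in (2, 3, 5, 7):
        while n % p == 0:
            n //= p
    return n == 1  # nothing left over means every prime factor was <= 7


def smooth_lengths(lo, hi):
    """List all 7-smooth integers in the inclusive range [lo, hi]."""
    return [n for n in range(max(lo, 1), hi + 1) if is_7_smooth(n)]


lengths = smooth_lengths(1, 8192)
print(smooth_lengths(1, 20))
```

Lengths such as 11 or 13 are skipped because they have a prime factor above 7, while powers of two such as 8192 always qualify.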
|
|
|
|
|
#2155 | |
|
Romulan Interpreter
Jun 2011
Thailand
25BF₁₆ Posts |
Quote:
(you know I am teasing you, right? hehe) I have added these lines to the ini file:
Code:
# Y -- increase screen output interval.
# y -- decrease screen output interval.
I have tested different screen output formats - working properly, no bug found. I am currently testing the thread optimization; up to now it looks ok. I have changed the ini file to say "-threadbench" instead of "-cufftbench" where appropriate. I could attach the ini file to replace the one on sourceforge (unchanged for years! I have added various comments to it) but I have the feeling that Jerry will go through it anyhow, so I leave him the pleasure.
Thanks a billion!
Please do not overwrite the fft file. Adding to it would be better. Or give an option. I like to collect all the info, and eventually sort it (manually) in increasing order; I have a file with all sizes for "now I am not overclocking" or "now P95 is not running, but Aliqueit is running instead" and so on. The results are different, and to get maximum performance, I have to use the right size. I know this sounds like nitpicking; well, it comes with age...
|
|
|
|
|
|
|
#2156 |
|
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²·5·7 Posts |
How about backing up the old fft.txt file? (I mean instead of overwriting it) Other routines depend on fft.txt being in increasing order.
Last fiddled with by owftheevil on 2014-03-21 at 15:24 |
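The backup idea suggested above could look roughly like this. It is a sketch in Python rather than the program's own code; the helper name `backup_then_write`, the `.bak` naming scheme, and the plain `fft.txt` file name are all assumptions made for illustration.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_then_write(path, new_text):
    """Copy an existing file to a timestamped .bak file, then overwrite it.

    Returns the backup path, or None if there was nothing to back up.
    The naming scheme is an assumption, not what CUDALucas does.
    """
    path = Path(path)
    backup = None
    if path.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = path.with_name(f"{path.name}.{stamp}.bak")
        n = 0
        while backup.exists():  # avoid clobbering a backup made in the same second
            n += 1
            backup = path.with_name(f"{path.name}.{stamp}.{n}.bak")
        shutil.copy2(path, backup)
    path.write_text(new_text)
    return backup


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "fft.txt"
    target.write_text("old benchmark results\n")
    saved = backup_then_write(target, "new benchmark results\n")
    old_kept = saved.read_text()
    new_written = target.read_text()
```

The new file can then stay in increasing order for the routines that depend on it, while older benchmark runs remain available for manual comparison.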
|
|
|