mersenneforum.org

mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

LaurV 2011-11-29 06:47

You have them a bit backwards. Each line contains two commands, each of which opens its own window by default. Without the switches you could end up with two windows for each command line, six in total for your example, and they would close and disappear when the innermost task finishes. Use the help:

[CODE]
c:\>help cmd
Starts a [B][COLOR=Red]new instance[/COLOR][/B] of the Windows XP command interpreter

CMD [/A | /U] [/Q] [/D] [/E:ON | /E:OFF] [/F:ON | /F:OFF] [/V:ON | /V:OFF]
[[/S] [/C | /K] string]

/C Carries out the command specified by string and then terminates
/K [COLOR=Red]Carries out the command specified by string[B] but remains[/B][/COLOR]
/S Modifies the treatment of string after /C or /K (see below)
/Q Turns echo off
/D Disable execution of AutoRun commands from registry (see below)
/A Causes the output of internal commands to a pipe or file to be ANSI
/U Causes the output of internal commands to a pipe or file to be
Unicode
/T:fg Sets the foreground/background colors (see COLOR /? for more info)
/E:ON Enable command extensions (see below)
/E:OFF Disable command extensions (see below)
/F:ON Enable file and directory name completion characters (see below)
/F:OFF Disable file and directory name completion characters (see below)
/V:ON Enable delayed environment variable expansion using ! as the
delimiter. For example, /V:ON would allow !var! to expand the
variable var at execution time. The var syntax expands variables
at input time, which is quite a different thing when inside of a FOR
loop.
/V:OFF Disable delayed environment expansion.

Note that multiple commands separated by the command separator '&&'
are accepted for string if surrounded by quotes. Also, for compatibility
reasons, /X is the same as /E:ON, /Y is the same as /E:OFF and /R is the
same as /C. Any other switches are ignored.

[B][COLOR=Red]If /C or /K is specified, then the remainder of the command line after
the switch is processed as a command line[/COLOR][/B], where the following logic is
used to process quote (") characters:

1. If all of the following conditions are met, then quote characters
on the command line are preserved:

- no /S switch
- exactly two quote characters
- no special characters between the two quote characters,
where special is one of: &<>()@^|
- there are one or more whitespace characters between the
the two quote characters
- the string between the two quote characters is the name
of an executable file.

2. Otherwise, old behavior is to see if the first character is
a quote character and if so, strip the leading character and
remove the last quote character on the command line, preserving
any text after the last quote character.

If /D was NOT specified on the command line, then when CMD.EXE starts, it
Press any key to continue . . .[/CODE][CODE]
c:\>help start
Starts a [B][COLOR=Red]separate[/COLOR][/B] window to run a specified program or command.

START ["title"] [/Dpath] [/I] [/MIN] [/MAX] [/SEPARATE | /SHARED]
[/LOW | /NORMAL | /HIGH | /REALTIME | /ABOVENORMAL | /BELOWNORMAL]
[/WAIT] [/B] [command/program]
[parameters]

"title" Title to display in window title bar.
path Starting directory
B Start application [B][COLOR=Red]without creating a new window[/COLOR][/B]. The
application has ^C handling ignored. Unless the application
enables ^C processing, ^Break is the only way to interrupt
the application
I The new environment will be the original environment passed
to the cmd.exe and not the current environment.
MIN Start window minimized
MAX Start window maximized
SEPARATE Start 16-bit Windows program in separate memory space
SHARED Start 16-bit Windows program in shared memory space
LOW Start application in the IDLE priority class
NORMAL Start application in the NORMAL priority class
HIGH Start application in the HIGH priority class
REALTIME Start application in the REALTIME priority class
ABOVENORMAL Start application in the ABOVENORMAL priority class
BELOWNORMAL Start application in the BELOWNORMAL priority class
WAIT Start application and wait for it to terminate
command/program
If it is an internal cmd command or a batch file then
the command processor is run with the /K switch to cmd.exe.
This means that the window will remain after the command
has been run.

If it is not an internal cmd command or batch file then
it is a program and [B][COLOR=Red]will run as either a windowed application
or a console application.
[/COLOR] [/B]
parameters These are the parameters passed to the command/program


If Command Extensions are enabled, external command invocation
through the command line or the START command changes as follows:

non-executable files may be invoked through their file association just
by typing the name of the file as a command. (e.g. WORD.DOC would
launch the application associated with the .DOC file extension).
See the ASSOC and FTYPE commands for how to create these
associations from within a command script.

When executing an application that is a 32-bit GUI application, CMD.EXE
does not wait for the application to terminate before returning to
the command prompt. This new behavior does NOT occur if executing
within a command script.
Press any key to continue . . .
[/CODE]cmd.exe is the Windows command prompt. On a standard installation (that is, if you have not changed the path variables) it is visible from anywhere. Click "start/command prompt", or "start/run" and type "cmd", then type help and experiment with the "help" command. It is quite easy to understand. My objection to this method (as I said before) is that you end up running two console applications for each line; they share the same window, so you cannot see it. You can use either the "start" command or "cmd.exe". Using both is redundant. But it costs only 64k of memory or so for each, so it does not matter if you feel comfortable with it and want to specify affinities and priorities in an "easy to understand" format.

kladner 2011-11-29 16:06

[QUOTE=Dubslow;280351]Hey kladner,
First off, I heard Chicago's already gotten snow. True? (If so, not fair!)
Now: You said the /k makes sure each prompt gets its own window, and /b pauses the prompt if mfaktc stops running?

If I'm only running one instance, I can drop the /k, right?

And is cmd.exe recognized even if I'm running this command from the mfaktc folder (or in my case shortcut in the mfaktc folder)?[/QUOTE]

I think /k causes the persistent window, and /b prevents more than one window per instance (visibly, at least).

I use that string to run three instances most of the time. But each is launched independently. That is, I don't think there's any difference between running one, or more than one.

Without the /k I think the window will close on termination.

kladner 2011-11-30 17:36

Batch file to collect results
 
[QUOTE=LaurV;279922]<SNIP> The disadvantage of my method is that the results files will be in the subfolders. This can not be (yet) customized in the ini files. But for that I made a batch file to collect all the result.txt from subfolders, so I don't need to walk on each subfolder and look for them.[/QUOTE]

I put this together to copy the contents of 3 different results.txt files to a single file. It then effectively removes the contents by copying an empty results.txt to all three sub-directories, with no prompt on the overwrite.
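[CODE]rem Switch to the drive and folder that holds the three mfaktc instances.
e:
cd \mfaktc_32-64
rem Concatenate the three results.txt files into a single file.
copy "mfaktc-0.17_32-64\results.txt"+"mfaktc-0.17_32-64_b\results.txt"+"mfaktc-0.17_32-64_c\results.txt" "Results-kladner.txt"
rem Overwrite each instance's results.txt with the empty one kept in this folder.
copy results.txt "mfaktc-0.17_32-64" /y
copy results.txt "mfaktc-0.17_32-64_b" /y
copy results.txt "mfaktc-0.17_32-64_c" /y[/CODE]This is probably not the most elegant solution, but it does work.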
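
For anyone who would rather not maintain a batch file, the same collect-then-empty idea can be sketched in Python. This is only a rough sketch under assumptions: the function name `collect_results` and its parameters are made up for illustration, and unlike the batch file it appends to the combined file instead of overwriting it. Like the batch file, it does nothing to guard against mfaktc writing a result between the read and the reset.

```python
from pathlib import Path


def collect_results(base_dir, instance_dirs, output_name, results_name="results.txt"):
    """Gather results from each instance folder into one file, then empty the originals.

    All names here are illustrative; adjust them to your own folder layout.
    """
    base = Path(base_dir)
    collected = []
    for name in instance_dirs:
        results_file = base / name / results_name
        if not results_file.exists():
            continue  # this instance has not produced any results yet
        collected.append(results_file.read_text())
        results_file.write_text("")  # leave an empty results file behind
    combined = "".join(collected)
    # Append, so results gathered by an earlier run are kept (assumption:
    # this is what you want; the batch file overwrites instead).
    with open(base / output_name, "a") as out:
        out.write(combined)
    return combined


if __name__ == "__main__":
    collect_results(
        r"e:\mfaktc_32-64",
        ["mfaktc-0.17_32-64", "mfaktc-0.17_32-64_b", "mfaktc-0.17_32-64_c"],
        "Results-kladner.txt",
    )
```

Missing results files are skipped, so the script can be run even when one instance has found nothing new.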

TheJudger 2011-12-03 00:34

[QUOTE=TheJudger;279285]
So for the mfaktc 0.18 release[LIST][*]I want to rework the barrett92 kernel (CUDA 4.1 optimizations)[*]I want to wait for official CUDA 4.1 release[*]ask Eric which of his new code should be included[/LIST][/QUOTE]

OK, OBD users will like the CUDA 4.1 optimized barrett92 kernel. :smile:
Preliminary data from my stock GTX 470 for M3321932839 from 2[SUP]79[/SUP] to 2[SUP]80[/SUP] - [B]raw GPU speed[/B]:
[CODE]                  | CUDA 3.2  | CUDA 4.0  | CUDA 4.1-RC1
mfaktc 0.17       | 177.59M/s | 185.38M/s | 181.21M/s
mfaktc 0.18-pre8  | 177.91M/s | 185.65M/s | 181.48M/s
mfaktc 0.18-pre10 | 183.77M/s | 191.96M/s | 211.03M/s[/CODE]

Up to mfaktc 0.18-pre8 there are only minimal changes in the GPU code compared to mfaktc 0.17 (e.g. barrett92 now has a minimum factor size of 2[SUP]79[/SUP]). 0.18-pre9 adds CUDA 4.1 specific optimizations for the barrett79 kernel; 0.18-pre10 adds CUDA 4.1 specific optimizations and a rework of the squaring function for the barrett92 kernel. I guess 3-4% improvement for CC 1.x GPUs, too.
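
To put the GTX 470 numbers into relative terms, here is a small Python sketch that turns the raw speeds into percentage gains over mfaktc 0.17. The speeds are copied from the benchmark figures in this post; the names `SPEEDS`, `speedup_percent` and `speedups_vs_baseline` are just illustrative.

```python
# Raw GPU speeds in M/s for the stock GTX 470, copied from the post.
SPEEDS = {
    "mfaktc 0.17": {"CUDA 3.2": 177.59, "CUDA 4.0": 185.38, "CUDA 4.1-RC1": 181.21},
    "mfaktc 0.18-pre8": {"CUDA 3.2": 177.91, "CUDA 4.0": 185.65, "CUDA 4.1-RC1": 181.48},
    "mfaktc 0.18-pre10": {"CUDA 3.2": 183.77, "CUDA 4.0": 191.96, "CUDA 4.1-RC1": 211.03},
}


def speedup_percent(old, new):
    """Relative gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


def speedups_vs_baseline(speeds, baseline="mfaktc 0.17"):
    """Per-toolkit percentage gain of every other version over the baseline version."""
    base = speeds[baseline]
    return {
        version: {toolkit: speedup_percent(base[toolkit], value) for toolkit, value in row.items()}
        for version, row in speeds.items()
        if version != baseline
    }


if __name__ == "__main__":
    for version, row in speedups_vs_baseline(SPEEDS).items():
        for toolkit, gain in row.items():
            print(f"{version:18s} {toolkit:13s} {gain:+6.2f}%")
```

With these figures, 0.18-pre10 gains roughly 3.5% on CUDA 3.2 and 4.0 and roughly 16.5% on CUDA 4.1-RC1, which is where the reworked barrett92 kernel pays off.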


[QUOTE=TheJudger;279979][QUOTE=James Heinrich;279886]:surprised
No perhaps, please(!) add in some code to limit the frequency of writing checkpoint files![/QUOTE]

OK, added to my todo-list.[/QUOTE]

Removed from my todo-list and added to changelog.
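
For readers wondering what limiting the checkpoint frequency can look like, one common approach is a minimum time interval between writes. The Python sketch below illustrates that idea only; the thread does not show how mfaktc 0.18 actually does it, and the class name `CheckpointThrottle` and the 30 second interval are assumptions for illustration.

```python
import time


class CheckpointThrottle:
    """Allow a checkpoint write at most once every `min_interval_s` seconds.

    Illustrative only; this is not taken from the mfaktc source.
    """

    def __init__(self, min_interval_s, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_write = None

    def should_write(self):
        """Return True (and record the time) if enough time has passed since the last write."""
        now = self._clock()
        if self._last_write is None or now - self._last_write >= self.min_interval_s:
            self._last_write = now
            return True
        return False


if __name__ == "__main__":
    throttle = CheckpointThrottle(min_interval_s=30.0)  # hypothetical interval
    for block in range(5):
        # ... process one block of factor candidates here ...
        if throttle.should_write():
            print(f"writing checkpoint after block {block}")
```

The first call always writes, so a short run still leaves a checkpoint behind; later calls are skipped until the interval has elapsed.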


Oliver

TheJudger 2011-12-03 13:10

[QUOTE=TheJudger;280853]
...
and a rework of the squaring function for the barrett92 kernel. I guess 3-4% improvement for CC 1.x GPUs, too.[/QUOTE]
Tests on my GTX 275 showed that my initial guess was wrong, only 1% improvement for barrett92 on CC 1.x. :sad:
GTX 275, M3321932839 from 2[SUP]79[/SUP] to 2[SUP]80[/SUP] - [B]raw GPU speed[/B]:
[CODE]mfaktc 0.17 45.96M/s
mfaktc 0.18-pre10 46.40M/s
[/CODE]
No matter if using CUDA 3.2, 4.0 or 4.1-RC1.

Oliver

ET_ 2011-12-03 17:53

[QUOTE=TheJudger;280899]Tests on my GTX 275 showed that my initial guess was wrong, only 1% improvement for barrett92 on CC 1.x. :sad:
GTX 275, M3321932839 from 2[SUP]79[/SUP] to 2[SUP]80[/SUP] - [B]raw GPU speed[/B]:
[CODE]mfaktc 0.17 45.96M/s
mfaktc 0.18-pre10 46.40M/s
[/CODE]
No matter if using CUDA 3.2, 4.0 or 4.1-RC1.

Oliver[/QUOTE]

Every single bit matters... :smile:

Luigi

Dubslow 2011-12-04 23:20

May I also request that the Windows and Linux versions be able to use each other's save files?

Edit: Yeah, I still don't get what's happening here. Remember when I reported that for some reason in Windows my GTX 460 would randomly lose half its throughput? And after a restart it would get it back. Now in Linux it's only getting half the throughput rate, like in Windows, except it [u]starts[/u] at half rather than randomly dropping to a half. To be fair, I'm using fairly old drivers, but updating those is what caused me to lose my GUI in the first place a couple of months ago. Will try again though, and hopefully it won't fail this time.

Edit2: Also does anybody know how to monitor GPU load in Linux?

Edit3: I can confirm that the nVidia .run file to update drivers does not work for me. I have to run [code]sudo apt-get install --reinstall nvidia-current[/code] to fix mah gui. Note: When I tried running the .run, it reported my drivers as 285.x, whereas nvidia-settings reports driver version 270.41.06.

Dubslow 2011-12-05 07:24

And now it's running at 80% throughput, not 50%. That's the first time it's ever done that and I have no idea why, and I can't think of anything that's different.

Now it's down to 50% again?!?!? All I did was stop and restart MPrime?!?!?

Wait a minute. MPrime affinities seem to play a role in it, despite the fact that the affinities shouldn't play a role... this is so confusing. See [URL="http://mersenneforum.org/showthread.php?t=16289"]here[/URL]. I'll have to come back to this tomorrow.

TheJudger 2011-12-05 11:23

[QUOTE=Dubslow;281006]May I also request that the Windows and Linux versions be able to use each other's save files?
[/QUOTE]

Changelog for mfaktc 0.18:[CODE]
version 0.18-pre7 (2011-10-18)
...
- mfaktc no longer refuses to load a checkpoint file from a Linux version
with a Windows version of mfaktc and vice versa. Of course mfaktc still
refuses to load checkpoint files from other versions than itself
(identical version string!)[/CODE]

All you have to do is wait for mfaktc 0.18 (which depends mainly on the public release of CUDA 4.1)
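
The rule from the changelog (ignore the platform, insist on an identical version string) can be sketched in a few lines of Python. This is a hypothetical illustration only: the real mfaktc checkpoint layout is not shown in this thread, so the `CheckpointHeader` fields and the `can_load` function are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointHeader:
    # Hypothetical fields; the actual checkpoint file format may differ.
    version: str   # e.g. "0.18-pre7"
    platform: str  # e.g. "Linux" or "Windows"


def can_load(checkpoint, running_version):
    """Accept a checkpoint only if its version string is identical to ours.

    The platform is deliberately not compared, so a file written on Linux
    can be resumed on Windows and vice versa.
    """
    return checkpoint.version == running_version


if __name__ == "__main__":
    saved = CheckpointHeader(version="0.18-pre7", platform="Linux")
    print(can_load(saved, "0.18-pre7"))  # same version, other platform
    print(can_load(saved, "0.17"))       # different version
```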

Oliver

James Heinrich 2011-12-05 17:26

I've thrown together a rough chart of CUDA GPU performance comparison:
[url]http://mersenne-aries.sili.net/mfaktc.php[/url]

It is not yet properly calibrated. It currently translates GFLOPS (from Wikipedia) into GHz-days/day based on timing of a single test on my 8800GT. It does not (yet) take into account performance differences of different mfaktc cores etc. But I need some more data to fine-tune it: Please send me some timing info for a [i]single instance[/i] of mfaktc, including assignment (exponent, from/to bits), GPU model, time to complete the assignment, and GPU usage for that single instance.
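
As a rough illustration of the kind of translation described above, the Python sketch below scales GFLOPS linearly into GHz-days/day using one measured reference card. The linear model is an assumption, the function names are made up, and the numbers in the example are placeholders, not measurements from the chart or from any real GPU.

```python
def calibrate(reference_gflops, reference_ghz_days_per_day):
    """Derive a GHz-days/day per GFLOPS factor from one measured reference card."""
    if reference_gflops <= 0:
        raise ValueError("reference_gflops must be positive")
    return reference_ghz_days_per_day / reference_gflops


def estimate_ghz_days_per_day(gflops, factor):
    """Estimate throughput for another card, assuming it scales linearly with GFLOPS."""
    return gflops * factor


if __name__ == "__main__":
    # Placeholder values for illustration only.
    factor = calibrate(reference_gflops=500.0, reference_ghz_days_per_day=25.0)
    print(estimate_ghz_days_per_day(1000.0, factor))
```

A single calibration point cannot capture differences between mfaktc kernels or compute capabilities, which is why timing data from more cards is needed to fine-tune the factor.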

kladner 2011-12-05 17:37

[QUOTE=James Heinrich;281105].....Please send me some timing info for a [I]single instance[/I] of mfaktc, including assignment (exponent, from/to bits), GPU model, time to complete the assignment, and GPU usage for that single instance.[/QUOTE]

I'm setting up such a run on a GTX 460. I'll launch it as soon as the current assignment finishes in 7 minutes.

