mersenneforum.org

Sr2sieve on PPC/Linux (https://www.mersenneforum.org/showthread.php?t=6669)

BlisteringSheep 2007-03-13 05:02

[QUOTE=geoff;100658]Are the sizes in bytes or kilobytes?

Do you know which compiler symbols I should test to decide whether this code should be included? I assume __linux__ and __powerpc64__ and one other for the CPU type.[/QUOTE]

The sizes are in bytes. You can check for either [FONT="Lucida Console"]__PPC64__[/FONT] or [FONT="Lucida Console"]__powerpc64__[/FONT] along with [FONT="Lucida Console"]__linux__[/FONT]. There is no compiler symbol you can check for the specific CPU. The way that [FONT="Lucida Console"]lshw[/FONT] does it is to look for directories in the [FONT="Lucida Console"]/proc/device-tree/cpus/[/FONT] directory. Any directories in there will be the CPUs. For example, my dual PowerMac has [FONT="Lucida Console"]PowerPC,970@0[/FONT] and [FONT="Lucida Console"]PowerPC,970@1[/FONT] entries. Unfortunately, I won't be able to check this on other machines until at least next week. I also don't know if a uniprocessor would have [FONT="Lucida Console"]PowerPC,970@0[/FONT] or [FONT="Lucida Console"]PowerPC,970[/FONT].
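
As a rough illustration of the two checks described above, here is a minimal sketch. The preprocessor test uses only the symbols named in this post. The helper name count_cpu_nodes and the macro IS_PPC64_LINUX are hypothetical, and the directory is passed in as a parameter so the code can be tried on a machine that has no /proc/device-tree. It simply treats every subdirectory as a CPU node, which is an assumption based on how lshw is described here.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Compile-time test for ppc64/Linux using the symbols named in the post. */
#if defined(__linux__) && (defined(__powerpc64__) || defined(__PPC64__))
#define IS_PPC64_LINUX 1
#else
#define IS_PPC64_LINUX 0
#endif

/* Hypothetical helper: count the subdirectories of cpus_dir, printing each
   name. Returns -1 if the directory cannot be opened. */
static int count_cpu_nodes(const char *cpus_dir)
{
  DIR *dir = opendir(cpus_dir);
  struct dirent *ent;
  int count = 0;

  if (dir == NULL)
    return -1;

  while ((ent = readdir(dir)) != NULL) {
    char path[4096];
    struct stat st;

    /* Skip the directory's own "." and ".." entries. */
    if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
      continue;

    snprintf(path, sizeof(path), "%s/%s", cpus_dir, ent->d_name);
    if (stat(path, &st) == 0 && S_ISDIR(st.st_mode)) {
      printf("CPU node: %s\n", ent->d_name);
      count++;
    }
  }

  closedir(dir);
  return count;
}
```

On the dual PowerMac mentioned above, calling count_cpu_nodes("/proc/device-tree/cpus") would be expected to report the two PowerPC,970@N entries, though I have not verified that no other subdirectories appear there.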

Greenbank 2007-03-14 18:57

Yes, the Linux man page for sysctl() does warn you away from using it, in favour of scraping the info from text files in /proc.

geoff 2007-03-15 00:46

[QUOTE=BlisteringSheep;100663]The way that [FONT="Lucida Console"]lshw[/FONT] does it is to look for directories in the [FONT="Lucida Console"]/proc/device-tree/cpus/[/FONT] directory. Any directories in there will be the CPUs.[/QUOTE]
Thanks. I'll make it walk the directory tree using ftw() in a future version. For now (sr2sieve 1.4.32) it just looks in /proc/device-tree/cpus/PowerPC,970@0/ and uses the defaults if that directory doesn't exist.

geoff 2007-03-18 00:40

sr2sieve 1.4.34 looks for the cache size files in any directories under /proc/device-tree/cpus/, so it should hopefully be able to detect the cache size for any CPU under ppc64/Linux.
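
For anyone curious what reading one of these files might look like, here is a minimal sketch. It is not the code in cpu.c. It assumes each cache size file holds a single 32-bit big-endian value, as device-tree properties usually do, and the helper name read_be32_property is hypothetical. If the file is missing or too short it returns the supplied default, mirroring the fall-back-to-defaults behaviour described earlier in the thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read one property file holding a single 32-bit
   big-endian cell (an assumption about the file format) and return its
   value, or default_value if the file is missing or too short. */
static uint32_t read_be32_property(const char *path, uint32_t default_value)
{
  unsigned char cell[4];
  FILE *f = fopen(path, "rb");
  size_t n;

  if (f == NULL)
    return default_value;

  n = fread(cell, 1, sizeof(cell), f);
  fclose(f);
  if (n != sizeof(cell))
    return default_value;

  /* Assemble the value byte by byte so the result does not depend on the
     endianness of the machine running the code. */
  return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
         ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
}
```

A call might look like read_be32_property("/proc/device-tree/cpus/PowerPC,970@0/d-cache-size", 32768), where both the file name d-cache-size and the default of 32768 bytes are assumptions for illustration only.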

BlisteringSheep 2007-03-18 07:02

geoff, a couple of problems:

In the [FONT="Lucida Console"]Makefile[/FONT], with the comments on the [FONT="Lucida Console"]ARCH=[/FONT] lines, just uncommenting them doesn't work. The variable gets the value of everything up to the middle-of-line #, so [FONT="Lucida Console"]ARCH[/FONT] is "[FONT="Lucida Console"]ppc64-linux [/FONT]". To have inline comments on variable declaration lines you have to use the strip function à la [FONT="Lucida Console"]$(strip $(ARCH))[/FONT]. You can do this just once with [FONT="Lucida Console"]ARCH:=$(strip $(ARCH))[/FONT] to get out of putting it in each if test.

[FONT="Lucida Console"]cpu.c[/FONT] is referencing [FONT="Lucida Console"]CPU_DIR_NAME[/FONT] in [FONT="Lucida Console"]set_cache_sizes()[/FONT] which isn't defined anywhere. Up in [FONT="Lucida Console"]read_cache_size_file()[/FONT] it has the hardcoded string path.

I'm sure it was just an oversight, but the tarball includes the [FONT="Lucida Console"]factors.o[/FONT] object file.

geoff 2007-03-20 02:20

[QUOTE=BlisteringSheep;101213]geoff, a couple of problems:

In the [FONT="Lucida Console"]Makefile[/FONT], with the comments on the [FONT="Lucida Console"]ARCH=[/FONT] lines, just uncommenting them doesn't work. The variable gets the value of everything up to the middle-of-line #, so [FONT="Lucida Console"]ARCH[/FONT] is "[FONT="Lucida Console"]ppc64-linux [/FONT]". To have inline comments on variable declaration lines you have to use the strip function à la [FONT="Lucida Console"]$(strip $(ARCH))[/FONT]. You can do this just once with [FONT="Lucida Console"]ARCH:=$(strip $(ARCH))[/FONT] to get out of putting it in each if test.

[FONT="Lucida Console"]cpu.c[/FONT] is referencing [FONT="Lucida Console"]CPU_DIR_NAME[/FONT] in [FONT="Lucida Console"]set_cache_sizes()[/FONT] which isn't defined anywhere. Up in [FONT="Lucida Console"]read_cache_size_file()[/FONT] it has the hardcoded string path.

I'm sure it was just an oversight, but the tarball includes the [FONT="Lucida Console"]factors.o[/FONT] object file.[/QUOTE]

Thanks Ed, I'll fix these in version 1.4.35. The stray factors.o file might be the cause of some of the problems others have had with linking.

BlisteringSheep 2007-03-23 03:21

[QUOTE=geoff;101460]Thanks Ed, I'll fix these in version 1.4.35. The stray factors.o file might be the cause of some of the problems others have had with linking.[/QUOTE]

Geoff, I got a chance to compile 1.4.35 tonight & it's still failing with the undefined CPU_DIR_NAME on line 30 of cpu.c

BlisteringSheep 2007-03-23 04:13

[QUOTE=BlisteringSheep;101820]Geoff, I got a chance to compile 1.4.35 tonight & it's still failing with the undefined CPU_DIR_NAME on line 30 of cpu.c[/QUOTE]

I changed CPU_DIR_NAME to "/proc/device-tree/cpus" & it is running fine. It detected the cache sizes correctly.

geoff 2007-03-24 01:51

[QUOTE=BlisteringSheep;101824]I changed CPU_DIR_NAME to "/proc/device-tree/cpus" & it is running fine. It detected the cache sizes correctly.[/QUOTE]

Thanks. I didn't read your earlier message properly, sorry. I have added the definition of CPU_DIR_NAME in the 1.4.36 source.

geoff 2007-05-28 02:04

1 Attachment(s)
The changes in versions 1.5.x shouldn't affect the PPC version; they are really aimed at getting around a problem with using the floating point and integer instructions together, which the PPC64 code doesn't do.

However, a side effect is that the x86 and x86-64 versions now do the critical loops in two passes instead of one, and it occurred to me that this could be of some benefit to the PPC even without any changes to the assembler routines.

If the attached file is appended to asm-ppc64.h in version 1.5.5, it will cause the critical loops to be done in two passes on the PPC64.

BlisteringSheep 2007-06-15 05:21

geoff,
I just wanted to let you know that this did cause a good speedup, about 10% in my testing. That is enough to make the 1.5.x series faster than the later 1.4.x versions (without this addition, 1.5.x runs slightly slower than 1.4.39 or 1.4.42).


All times are UTC.