[QUOTE=geoff;100658]Are the sizes in bytes or kilobytes?
Do you know which compiler symbols I should test to decide whether this code should be included? I assume __linux__ and __powerpc64__ and one other for the CPU type.[/QUOTE] The sizes are in bytes. You can check for either [FONT="Lucida Console"]__PPC64__[/FONT] or [FONT="Lucida Console"]__powerpc64__[/FONT], along with [FONT="Lucida Console"]__linux__[/FONT]. There is no compiler symbol you can check for the specific CPU. The way that [FONT="Lucida Console"]lshw[/FONT] does it is to look for directories in the [FONT="Lucida Console"]/proc/device-tree/cpus/[/FONT] directory. Any directories in there will be the CPUs. For example, my dual PowerMac has [FONT="Lucida Console"]PowerPC,970@0[/FONT] and [FONT="Lucida Console"]PowerPC,970@1[/FONT] entries. Unfortunately, I won't be able to check this on other machines until at least next week. I also don't know whether a uniprocessor would have [FONT="Lucida Console"]PowerPC,970@0[/FONT] or [FONT="Lucida Console"]PowerPC,970[/FONT]. |
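For anyone following along, the compile-time test described above could be sketched roughly like this. The macro name HAVE_PPC64_LINUX and the function have_ppc64_linux() are made up for illustration and are not taken from the sr2sieve source; only the three predefined compiler symbols come from the post.

```c
/* Compile-time guard for ppc64/Linux-only code.
   __linux__, __powerpc64__ and __PPC64__ are predefined by the compiler;
   HAVE_PPC64_LINUX is a hypothetical name used only in this sketch. */
#if defined(__linux__) && (defined(__powerpc64__) || defined(__PPC64__))
# define HAVE_PPC64_LINUX 1
#else
# define HAVE_PPC64_LINUX 0
#endif

/* Returns 1 if the ppc64/Linux code path was compiled in, 0 otherwise. */
int have_ppc64_linux(void)
{
  return HAVE_PPC64_LINUX;
}
```

The specific CPU model still has to be found at run time, since no predefined symbol identifies it.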
Yes, Linux man pages for sysctl() do warn you away from using sysctl() in favour of scraping the info from text files in /proc.
|
[QUOTE=BlisteringSheep;100663]The way that [FONT="Lucida Console"]lshw[/FONT] does it is to look for directories in the [FONT="Lucida Console"]/proc/device-tree/cpus/[/FONT] directory. Any directories in there will be the CPUs.[/QUOTE]
Thanks. I'll make it walk the directory tree using ftw() in a future version. For now (sr2sieve 1.4.32) it just looks in /proc/device-tree/cpus/PowerPC,970@0/ and uses the defaults if that directory doesn't exist. |
sr2sieve 1.4.34 looks for the cache size files in any directories under /proc/device-tree/cpus/, so hopefully should be able to detect the cache size for any CPU under ppc64/Linux.
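A rough sketch of that kind of search, not the actual sr2sieve code: it tries every entry under a given CPU directory and returns the first cache size it can read. It assumes the size is stored as a 4-byte big-endian device-tree cell in a property file such as d-cache-size; that file name and the helper names read_be32_file() and find_cache_size() are assumptions for illustration.

```c
#include <dirent.h>
#include <stdio.h>

/* Read a 4-byte big-endian value from a property file.
   Returns 0 if the file is missing or too short. */
static unsigned long read_be32_file(const char *path)
{
  unsigned char b[4];
  size_t n;
  FILE *f = fopen(path, "rb");

  if (f == NULL)
    return 0;
  n = fread(b, 1, sizeof b, f);
  fclose(f);
  if (n != sizeof b)
    return 0;
  return ((unsigned long)b[0] << 24) | ((unsigned long)b[1] << 16)
       | ((unsigned long)b[2] << 8) | (unsigned long)b[3];
}

/* Look for <cpus_dir>/<any entry>/<prop> and return the first non-zero
   size found, or 0 so that the caller can fall back to its defaults. */
unsigned long find_cache_size(const char *cpus_dir, const char *prop)
{
  char path[512];
  struct dirent *ent;
  unsigned long size = 0;
  DIR *dir = opendir(cpus_dir);

  if (dir == NULL)
    return 0;
  while (size == 0 && (ent = readdir(dir)) != NULL) {
    int n;
    if (ent->d_name[0] == '.')
      continue; /* skip ".", ".." and hidden entries */
    n = snprintf(path, sizeof path, "%s/%s/%s", cpus_dir, ent->d_name, prop);
    if (n < 0 || (size_t)n >= sizeof path)
      continue; /* path would not fit, try the next entry */
    size = read_be32_file(path); /* stays 0 if the entry has no such file */
  }
  closedir(dir);
  return size;
}
```

A call might look like find_cache_size("/proc/device-tree/cpus", "d-cache-size"). Because it does not depend on the entry name, it should not matter whether a uniprocessor uses PowerPC,970@0 or PowerPC,970.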
|
geoff, a couple of problems:
In the [FONT="Lucida Console"]Makefile[/FONT], with the comments on the [FONT="Lucida Console"]ARCH=[/FONT] lines, just uncommenting them doesn't work. The variable gets the value of everything up to the middle-of-line #, including the trailing space, so [FONT="Lucida Console"]ARCH[/FONT] is "[FONT="Lucida Console"]ppc64-linux [/FONT]". To have inline comments on variable declaration lines you have to use the strip function, à la [FONT="Lucida Console"]$(strip $(ARCH))[/FONT]. You can do this just once with [FONT="Lucida Console"]ARCH:=$(strip $(ARCH))[/FONT] to avoid putting it in each if test.
[FONT="Lucida Console"]cpu.c[/FONT] references [FONT="Lucida Console"]CPU_DIR_NAME[/FONT] in [FONT="Lucida Console"]set_cache_sizes()[/FONT], but that macro isn't defined anywhere. Up in [FONT="Lucida Console"]read_cache_size_file()[/FONT] it has the hardcoded string path.
I'm sure it was just an oversight, but the tarball includes the [FONT="Lucida Console"]factors.o[/FONT] object file. |
[QUOTE=BlisteringSheep;101213]geoff, a couple of problems:
In the [FONT="Lucida Console"]Makefile[/FONT], with the comments on the [FONT="Lucida Console"]ARCH=[/FONT] lines, just uncommenting them doesn't work. The variable gets the value of everything up to the middle-of-line #, so [FONT="Lucida Console"]ARCH[/FONT] is "[FONT="Lucida Console"]ppc64-linux [/FONT]". To have inline comments on variable declaration lines you have to use the strip function ala [FONT="Lucida Console"]$(strip $(ARCH))[/FONT]. You can do this just once with [FONT="Lucida Console"]ARCH:=$(strip $(ARCH))[/FONT] to get out of putting it in each if test. [FONT="Lucida Console"]cpu.c[/FONT] is referencing [FONT="Lucida Console"]CPU_DIR_NAME[/FONT] in [FONT="Lucida Console"]set_cache_sizes()[/FONT] which isn't defined anywhere. Up in [FONT="Lucida Console"]read_cache_size_file()[/FONT] it has the hardcoded string path. I'm sure it was just an oversight, but the tarball includes the [FONT="Lucida Console"]factors.o[/FONT] object file.[/QUOTE] Thanks Ed, I'll fix these in version 1.4.35. The stray factors.o file might be the cause of some of the problems others have had with linking. |
[QUOTE=geoff;101460]Thanks Ed, I'll fix these in version 1.4.35. The stray factors.o file might be the cause of some of the problems others have had with linking.[/QUOTE]
Geoff, I got a chance to compile 1.4.35 tonight & it's still failing with the undefined CPU_DIR_NAME on line 30 of cpu.c |
[QUOTE=BlisteringSheep;101820]Geoff, I got a chance to compile 1.4.35 tonight & it's still failing with the undefined CPU_DIR_NAME on line 30 of cpu.c[/QUOTE]
I changed CPU_DIR_NAME to "/proc/device-tree/cpus" & it is running fine. It detected the cache sizes correctly. |
[QUOTE=BlisteringSheep;101824]I changed CPU_DIR_NAME to "/proc/device-tree/cpus" & it is running fine. It detected the cache sizes correctly.[/QUOTE]
Thanks. I didn't read your earlier message properly, sorry. I have added the definition of CPU_DIR_NAME in the 1.4.36 source. |
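One way to keep the macro and the hardcoded path from drifting apart again is to define the directory once and build every path from it. This is only a sketch: the value is the one reported to work above, while cache_file_path() and the file name in the usage note are hypothetical and not part of cpu.c.

```c
#include <stdio.h>

/* Single definition shared by every function that needs the directory.
   The #ifndef lets a build override it with -DCPU_DIR_NAME=... if needed. */
#ifndef CPU_DIR_NAME
# define CPU_DIR_NAME "/proc/device-tree/cpus"
#endif

/* Build "<CPU_DIR_NAME>/<cpu>/<file>" in buf.
   Returns 0 on success, -1 if the result would not fit in len bytes. */
int cache_file_path(char *buf, size_t len, const char *cpu, const char *file)
{
  int n = snprintf(buf, len, "%s/%s/%s", CPU_DIR_NAME, cpu, file);
  return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For example, cache_file_path(buf, sizeof buf, "PowerPC,970@0", "d-cache-size") would fill buf with the full path under /proc/device-tree/cpus.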
1 Attachment(s)
The changes in versions 1.5.x shouldn't affect the PPC version. They are really aimed at getting around a problem with using the floating point and integer instructions together, which the PPC64 code doesn't do.
However, a side effect is that the x86 and x86-64 versions now do the critical loops in two passes instead of one, and it occurred to me that this could be of some benefit to the PPC even without any changes to the assembler routines. If the attached file is appended to asm-ppc64.h in version 1.5.5, it will cause the critical loops to be done in two passes on the PPC64. |
geoff,
I just wanted to let you know that this did cause a good speedup, about 10% in my testing. That is enough to make the 1.5.x series faster than the later 1.4.x versions (without this addition, 1.5.x runs slightly slower than 1.4.39 or 1.4.42). :alex: |