mersenneforum.org  

Old 2003-12-18, 21:39   #221
gbvalor
 
Aug 2002

3×37 Posts

Quote:
After the holidays I will be configuring my dual Opteron 240 system as my main workstation on Windows 2000. I am willing to help with testing.

Is there a site to download these alpha/beta AMD64 clients?
Glucas is designed mainly for Unix/Linux/Mac OSes. Some people have compiled and tried it on Windows with Cygwin; I haven't.

Prime95 is the proper client for this OS, but I don't know the state of its development/optimization.

Guillermo
Old 2003-12-18, 21:45   #222
gbvalor
 
Aug 2002

3×37 Posts

Quote:
Linux opteron 2.6.0 #2 Thu Dec 18 01:22:39 EST 2003 x86_64 unknown unknown GNU/Linux
I don't know whether it's because of the compiler or the kernel, but my timings have gotten about 4% worse with this new kernel.

Has anyone else noticed the same with mprime?

Guillermo.
Old 2003-12-19, 05:18   #223
Xyzzy
 
"Mike"
Aug 2002

2×23×179 Posts

Quote:
Originally posted by gbvalor
I don't know whether it's because of the compiler or the kernel, but my timings have gotten about 4% worse with this new kernel.

Has anyone else noticed the same with mprime?
Yes, I saw a slight speed loss as well... Maybe I compiled the kernel wrong? I used the default menuconfig settings, minus SMP...
Old 2003-12-19, 06:18   #224
aaronl
 
Aug 2003

2^4×3 Posts

Quote:
Originally posted by Xyzzy
Yes, I saw a slight speed loss as well... Maybe I compiled the kernel wrong? I used the default menuconfig settings, minus SMP...
Did you enable kernel preemption (CONFIG_PREEMPT)? In my slightly outdated kernel source it looks like it's enabled by default on x86 but not x86-64.

Linux 2.6 runs the scheduler at a much higher frequency by default. This adds some overhead and could be part of the cause. I'd be personally interested in what changing HZ in include/asm-x86_64/param.h from 1000 to 100 does to the timings.

I'm tempted to raise this issue in the Linux forum so that people who run crunching farms can adjust their kernel settings optimally. I'm confident that lower HZ would improve mprime timings slightly (HZ=25 would probably be even better on dedicated GIMPS boxen). I also think that turning off kernel preemption would be a win, but this is harder to predict. Someone who has a spare machine to test with (not me) should run some benchmarks so we could verify this. It would also be interesting to see how 2.6 compares to 2.4 after factors like HZ and preemption are taken into account.
Old 2003-12-20, 21:07   #225
PrimeCruncher
 
Sep 2003
Borg HQ, Delta Quadrant

2×3^3×13 Posts

From this CNet news report:

Quote:
The new kernel also monitors for new events more frequently--1,000 times per second instead of 100--a fact that slows down the system about 1 percent, Morton said in an October presentation about the kernel.

In addition, 2.6 requires somewhat more memory to run and shows worse performance when it has to use hard drives as extra memory under heavy loads, Morton said.
Don't know much about Linux but that sounds like the culprit right there to me...
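As a rough back-of-the-envelope sketch (not a measurement): if the roughly 1 percent slowdown quoted above came entirely from timer ticks at 1,000 per second, and each tick is assumed to cost the same fixed amount of CPU time, the implied per-tick cost and the expected overhead at other HZ values can be estimated. The function names and the linear cost model are assumptions made for illustration only.

```python
def implied_tick_cost_us(hz: int, overhead_fraction: float) -> float:
    """Per-tick cost in microseconds implied by a given overhead fraction.

    Assumes (hypothetically) that all of the overhead is spent in timer ticks.
    """
    return overhead_fraction / hz * 1_000_000


def overhead_fraction(hz: int, tick_cost_us: float) -> float:
    """Fraction of each second spent in timer ticks under the linear model."""
    return hz * tick_cost_us / 1_000_000


# Start from the figure in the quote: about 1% at 1,000 ticks per second.
cost_us = implied_tick_cost_us(1000, 0.01)
print(f"Implied cost per tick: {cost_us:.1f} us")

# HZ=100 is the old default; HZ=25 was suggested earlier in the thread.
for hz in (1000, 100, 25):
    pct = overhead_fraction(hz, cost_us) * 100
    print(f"HZ={hz:4d}: ~{pct:.3f}% of CPU time in timer ticks")
```

Under this simplified model, dropping to HZ=100 would recover a bit under 1 percent, so the timer frequency on its own would not account for a 4% slowdown.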
Old 2003-12-21, 00:24   #226
Xyzzy
 
"Mike"
Aug 2002

2×23×179 Posts

A long time ago I changed the time slice on my old Sun from 10ms to 1ms and it made a huge difference too... I'm not sure what the setting here is...

Here is a blurb from menuconfig...
Quote:
This option reduces the latency of the kernel when reacting to real-time or interactive events by allowing a low priority process to be preempted even if it is in kernel mode executing a system call. This allows applications to run more reliably even when the system is under load. On contrary it may also break your drivers and add priority inheritance problems to your system. Don't select it if you rely on a stable system or have slightly obscure hardware. It's also not very well tested on x86-64 currently. You have been warned. Say Y here if you are feeling brave and building a kernel for a desktop, embedded or real-time system. Say N if you are unsure.
It is set right now to "N"...
Old 2003-12-21, 00:27   #227
Xyzzy
 
"Mike"
Aug 2002

2×23×179 Posts

Here is a listing of param.h
Code:
#ifndef _ASMx86_64_PARAM_H
#define _ASMx86_64_PARAM_H

#ifdef __KERNEL__
# define HZ            1000            /* Internal kernel timer frequency */
# define USER_HZ       100          /* .. some user interfaces are in "ticks" */
#define CLOCKS_PER_SEC        (USER_HZ)       /* like times() */
#endif

#ifndef HZ
#define HZ 100
#endif

#define EXEC_PAGESIZE   4096

#ifndef NGROUPS
#define NGROUPS         32
#endif

#ifndef NOGROUP
#define NOGROUP         (-1)
#endif

#define MAXHOSTNAMELEN  64      /* max length of hostname */

#endif
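
To check which timer frequency a given kernel tree is set up with, the `HZ` and `USER_HZ` definitions can be pulled out of the header text. The following is a minimal Python sketch; the function name and the embedded excerpt (comments trimmed) are illustrative, and in practice the text would be read from include/asm-x86_64/param.h.

```python
import re

# Excerpt of the listing above, with comments trimmed for brevity.
HEADER_EXCERPT = """\
#ifdef __KERNEL__
# define HZ            1000
# define USER_HZ       100
#define CLOCKS_PER_SEC        (USER_HZ)
#endif

#ifndef HZ
#define HZ 100
#endif
"""

# Matches "#define HZ <n>" and "# define USER_HZ <n>" at the start of a line.
DEFINE_RE = re.compile(r"^[ \t]*#[ \t]*define[ \t]+(HZ|USER_HZ)[ \t]+(\d+)",
                       re.MULTILINE)


def find_tick_defines(header_text: str) -> dict:
    """Return every numeric value assigned to HZ and USER_HZ, in file order."""
    found = {}
    for name, value in DEFINE_RE.findall(header_text):
        found.setdefault(name, []).append(int(value))
    return found


defines = find_tick_defines(HEADER_EXCERPT)
print(defines)
```

In this header the first `HZ` value applies when `__KERNEL__` is defined, and the second is the fallback used otherwise, so changing the kernel timer frequency means editing the first one.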
Old 2003-12-21, 00:40   #228
Xyzzy
 
"Mike"
Aug 2002

2×23×179 Posts

Our current dmesg...
Code:
Bootdata ok (command line is root=/dev/hda2 vga=773  acpi=off splash=silent splash=silent)
Linux version 2.6.0 (root@opteron) (gcc version 3.3 (SuSE Linux)) #2 Thu Dec 18 01:22:39 EST 2003
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
 BIOS-e820: 000000001fff0000 - 000000001ffff000 (ACPI data)
 BIOS-e820: 000000001ffff000 - 0000000020000000 (ACPI NVS)
 BIOS-e820: 00000000ff7c0000 - 0000000100000000 (reserved)
ACPI: have wakeup address 0x10000001000
found SMP MP-table at 000ff780
hm, page 000ff000 reserved twice.
hm, page 00100000 reserved twice.
hm, page 000f9000 reserved twice.
hm, page 000fa000 reserved twice.
On node 0 totalpages: 131056
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 126960 pages, LIFO batch:16
  HighMem zone: 0 pages, LIFO batch:1
Intel MultiProcessor Specification v1.1
    Virtual Wire compatibility mode.
OEM ID: TYAN     <6>Product ID: S2880        <6>APIC at: 0xFEE00000
Processor #0 15:5 APIC version 16
I/O APIC #1 Version 17 at 0xFEC00000.
Processors: 1
Checking aperture...
CPU 0: aperture @ 4000000 size 32 MB
Aperture from northbridge cpu 0 too small (32 MB)
No AGP bridge found
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 4000000
Building zonelist for node : 0
Kernel command line: root=/dev/hda2 vga=773  acpi=off splash=silent splash=silent console=tty0
Initializing CPU#0
PID hash table entries: 16 (order 4: 256 bytes)
time.c: Using 1.193182 MHz PIT timer.
time.c: Detected 1396.036 MHz processor.
Console: colour VGA+ 80x25
Memory: 443828k/524224k available (2242k kernel code, 79656k reserved, 906k data, 160k init)
Calibrating delay loop... 2744.32 BogoMIPS
Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
Mount-cache hash table entries: 256 (order: 0, 4096 bytes)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU: AMD Opteron(tm) Processor 140 stepping 01
POSIX conformance testing by UNIFIX
watchdog: setting K7_PERFCTR0 to ffeab2e0
testing NMI watchdog ... OK.
ENABLING IO-APIC IRQs
Using IO-APIC 1
...changing IO-APIC physical APIC ID to 1 ... ok.
init IO_APIC IRQs
 IO-APIC (apicid-pin) 1-0, 1-9, 1-10, 1-11, 1-17, 1-20, 1-21, 1-22, 1-23 not connected.
..TIMER: vector=0x31 pin1=2 pin2=0
number of MP IRQ sources: 18.
number of IO-APIC #1 registers: 24.
testing the IO APIC.......................

IO APIC #1......
.... register #00: 01000000
.......    : physical APIC id: 01
.... register #01: 00170011
.......     : max redirection entries: 0017
.......     : PRQ implemented: 0
.......     : IO APIC version: 0011
.... register #02: 01000000
.......     : arbitration: 01
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 000 00  1    0    0   0   0    0    0    00
 01 001 01  0    0    0   0   0    1    1    39
 02 001 01  0    0    0   0   0    1    1    31
 03 001 01  0    0    0   0   0    1    1    41
 04 001 01  0    0    0   0   0    1    1    49
 05 001 01  0    0    0   0   0    1    1    51
 06 001 01  0    0    0   0   0    1    1    59
 07 001 01  0    0    0   0   0    1    1    61
 08 001 01  0    0    0   0   0    1    1    69
 09 000 00  1    0    0   0   0    0    0    00
 0a 000 00  1    0    0   0   0    0    0    00
 0b 000 00  1    0    0   0   0    0    0    00
 0c 001 01  0    0    0   0   0    1    1    71
 0d 001 01  0    0    0   0   0    1    1    79
 0e 001 01  0    0    0   0   0    1    1    81
 0f 001 01  0    0    0   0   0    1    1    89
 10 001 01  1    1    0   1   0    1    1    91
 11 000 00  1    0    0   0   0    0    0    00
 12 001 01  1    1    0   1   0    1    1    99
 13 001 01  1    1    0   1   0    1    1    A1
 14 000 00  1    0    0   0   0    0    0    00
 15 000 00  1    0    0   0   0    0    0    00
 16 000 00  1    0    0   0   0    0    0    00
 17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
IRQ16 -> 0:16
IRQ18 -> 0:18
IRQ19 -> 0:19
.................................... done.
Using local APIC timer interrupts.
Detected 12.464 MHz APIC timer.
NET: Registered protocol family 16
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20031002
ACPI: Interpreter disabled.
SCSI subsystem initialized
ACPI: ACPI tables contain no PCI IRQ routing entries
PCI: Invalid ACPI-PCI IRQ routing table
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Using IRQ router default [1022/746b] at 0000:00:07.3
PCI->APIC IRQ transform: (B0,I7,P3) -> 19
PCI->APIC IRQ transform: (B1,I0,P3) -> 19
PCI->APIC IRQ transform: (B1,I0,P3) -> 19
PCI->APIC IRQ transform: (B1,I11,P0) -> 18
PCI->APIC IRQ transform: (B1,I13,P0) -> 19
PCI->APIC IRQ transform: (B1,I14,P0) -> 16
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 4000000 size 65536 KB
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
  utmisc-0745 [03] ut_acquire_mutex      : Thread 1 could not acquire Mutex [ACPI_MTX_Memory] AE_BAD_PARAMETER
pty: 256 Unix98 ptys configured
Real Time Clock Driver v1.12
hw_random: AMD768 system management I/O registers at 0x5000.
hw_random hardware driver 1.0.0 loaded
Linux agpgart interface v0.100 (c) Dave Jones
Hangcheck: starting hangcheck timer 0.5.0 (tick is 180 seconds, margin is 60 seconds).
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
Using anticipatory io scheduler
FDC 0 is a post-1991 82077
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
loop: loaded (max 8 devices)
tg3.c:v2.3 (November 5, 2003)
0000:01:0d.0: Force SAC with mask ffffffffffffffff
eth0: Tigon3 [partno(BCM95705C3) rev 3003 PHY(5705)] (PCI:33MHz:32-bit) 10/100/1000BaseT Ethernet 00:e0:81:60:06:46
0000:01:0e.0: Force SAC with mask ffffffffffffffff
eth1: Tigon3 [partno(BCM95705C3) rev 3003 PHY(5705)] (PCI:33MHz:32-bit) 10/100/1000BaseT Ethernet 00:e0:81:60:06:47
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: IDE controller at PCI slot 0000:00:07.1
AMD8111: chipset revision 3
AMD8111: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: 0000:00:07.1 (rev 03) UDMA133 controller
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
hda: WDC WD400JB-00ENA0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: max request size: 128KiB
hda: 78165360 sectors (40020 MB) w/8192KiB Cache, CHS=65535/16/63, UDMA(100)
 hda: hda1 hda2 hda3
Fusion MPT base driver 2.05.00.03
Copyright (c) 1999-2002 LSI Logic Corporation
mptbase: 0 MPT adapters found, 0 installed.
Fusion MPT SCSI Host driver 2.05.00.03
mice: PS/2 mouse device common for all mice
serio: i8042 AUX port at 0x60,0x64 irq 12
input: AT Translated Set 2 keyboard on isa0060/serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
Intel 810 + AC97 Audio, version 0.24, 01:21:42 Dec 18 2003
oprofile: using NMI interrupt.
NET: Registered protocol family 2
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 32768)
NET: Registered protocol family 1
NET: Registered protocol family 17
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device hda2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
reiserfs: checking transaction log (hda2) for (hda2)
Using r5 hash to sort names
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 160k freed
Adding 1012084k swap on /dev/hda3.  Priority:42 extents:1
tg3: eth0: Link is up at 100 Mbps, full duplex.
tg3: eth0: Flow control is off for TX and off for RX.
Old 2003-12-21, 00:58   #229
Xyzzy
 
"Mike"
Aug 2002

2×23×179 Posts

An experiment:

I disabled scsi, apm, acpi, sound, profiling support, kernel debugging, force IOMMU to on, K8 Machine check debugging mode and IOMMU support...

Timings before:
Code:
AMD Opteron(tm) Processor 140
CPU speed: 1396.12 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 32.888 ms.
Best time for 448K FFT length: 39.325 ms.
Best time for 512K FFT length: 44.658 ms.
Best time for 640K FFT length: 55.494 ms.
Best time for 768K FFT length: 67.630 ms.
Best time for 896K FFT length: 81.642 ms.
Best time for 1024K FFT length: 92.040 ms.
Best time for 1280K FFT length: 123.933 ms.
Best time for 1536K FFT length: 151.438 ms.
Best time for 1792K FFT length: 181.381 ms.
Best time for 2048K FFT length: 204.414 ms.
Best time for 2560K FFT length: 262.163 ms.
Best time for 3072K FFT length: 325.316 ms.
Best time for 3584K FFT length: 393.170 ms.
Best time for 4096K FFT length: 441.877 ms.
Timings after:
Code:
AMD Opteron(tm) Processor 140
CPU speed: 1396.48 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 32.883 ms.
Best time for 448K FFT length: 39.403 ms.
Best time for 512K FFT length: 44.433 ms.
Best time for 640K FFT length: 55.312 ms.
Best time for 768K FFT length: 67.480 ms.
Best time for 896K FFT length: 81.606 ms.
Best time for 1024K FFT length: 92.092 ms.
Best time for 1280K FFT length: 123.841 ms.
Best time for 1536K FFT length: 151.233 ms.
Best time for 1792K FFT length: 181.395 ms.
Best time for 2048K FFT length: 204.373 ms.
Best time for 2560K FFT length: 262.083 ms.
Best time for 3072K FFT length: 325.414 ms.
Best time for 3584K FFT length: 393.316 ms.
Best time for 4096K FFT length: 441.792 ms.
It looks like all of these options have little or no effect...
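
For a quick check of that impression, the two listings above can be compared per FFT length. This is a minimal Python sketch: the dictionary simply transcribes the "Best time" values from the two runs, and the helper name is hypothetical.

```python
# FFT length (K) -> (best time before, best time after), in milliseconds,
# transcribed from the two benchmark listings above.
TIMINGS_MS = {
    384: (32.888, 32.883),
    448: (39.325, 39.403),
    512: (44.658, 44.433),
    640: (55.494, 55.312),
    768: (67.630, 67.480),
    896: (81.642, 81.606),
    1024: (92.040, 92.092),
    1280: (123.933, 123.841),
    1536: (151.438, 151.233),
    1792: (181.381, 181.395),
    2048: (204.414, 204.373),
    2560: (262.163, 262.083),
    3072: (325.316, 325.414),
    3584: (393.170, 393.316),
    4096: (441.877, 441.792),
}


def percent_change(before: float, after: float) -> float:
    """Relative change in percent; negative means the 'after' run was faster."""
    return (after - before) / before * 100.0


changes = {fft: percent_change(b, a) for fft, (b, a) in TIMINGS_MS.items()}
for fft, pct in changes.items():
    print(f"{fft:5d}K: {pct:+.3f}%")

# FFT length with the largest change in either direction.
worst = max(changes, key=lambda k: abs(changes[k]))
print(f"Largest change: {worst}K at {changes[worst]:+.3f}%")
```

The largest difference is at the 512K FFT length, at about half a percent, and every other length differs by less than that.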
Old 2003-12-21, 02:10   #230
aaronl
 
Aug 2003

2^4×3 Posts

Quote:
Originally posted by Xyzzy
An experiment:

I disabled scsi, apm, acpi, sound, profiling support, kernel debugging, force IOMMU to on, K8 Machine check debugging mode and IOMMU support...
Again, I suggest changing HZ ("Internal kernel timer frequency") from 1000 to 100 and seeing if you regain any of the performance.
Old 2003-12-21, 02:16   #231
Xyzzy
 
"Mike"
Aug 2002

2×23×179 Posts

I'm working on it...

Is the value 1/x? For example, 1s / 1000 = 1ms timeslice therefore 1s / 100 = 10ms timeslice?
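
As a sketch of that arithmetic: HZ is the number of timer interrupts per second, so the interval between ticks is 1/HZ seconds. One caveat, stated as an assumption rather than a fact about this kernel: this gives the timer tick interval, which is not necessarily the same thing as the scheduler's timeslice. The function name is made up for illustration.

```python
def tick_period_ms(hz: int) -> float:
    """Interval between timer ticks in milliseconds for a given HZ value."""
    if hz <= 0:
        raise ValueError("HZ must be a positive integer")
    return 1000.0 / hz


# The two values discussed in this thread.
for hz in (1000, 100):
    print(f"HZ={hz}: one tick every {tick_period_ms(hz):g} ms")
```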

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.