mersenneforum.org  

2003-04-25, 11:10   #1
Dresdenboy
Apr 2003
Berlin, Germany

19² Posts
interesting tools and compilers (for P4, Athlon, Opteron)

Hello,

Since I've run into some interesting software and sites, here's an overview of what could be useful for Prime95 development:

A low-overhead profiler for Linux that uses the performance counters of Intel/AMD CPUs and can go down to instruction level:
http://oprofile.sourceforge.net/

AMD CodeAnalyst 2.1 beta (like VTune, but free; has a pipeline simulator for various AMD CPUs, plus performance-counter and timer-based profiling):
http://www.amd.com/us-en/Processors/...2_3604,00.html
(I used version 1.2 before to optimize my SSE/MMX Game of Life code.)

Portland Group compiler 5.0 beta, with support for the Opteron's 16 SSE2 and 16 integer registers, which surely optimizes better for the Opteron's schedulers and latencies than Intel C++ does. You can download a free version, with Opteron binaries, for Windows and Linux here:
http://www.pgroup.com/AMD64.htm

GCC 3.3 has better x86-64 support. A complete package (MingW versions) can be found here:
http://www.thisiscool.com/gcc33_mingw.htm

The compilers surely won't be useful for the handcrafted assembler itself, but one could look at what code they produce for key parts of the algorithm.
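
A minimal sketch of that idea, assuming a made-up kernel: the `butterfly` function below is a hypothetical stand-in for a hot inner loop and is not taken from Prime95. Compiling it with `-S` makes the compiler emit assembly instead of an object file, so the output of different compilers can be read side by side.

```c
/* butterfly.c: hypothetical stand-in for a hot inner loop (not Prime95 code).
 * Emit assembly for inspection, e.g. with GCC:
 *   gcc -O2 -msse2 -mfpmath=sse -S butterfly.c
 */
#include <stddef.h>

/* In-place radix-2 butterflies on complex data held in separate
 * real/imaginary arrays: (a, b) -> (a + w*b, a - w*b). */
void butterfly(double *ar, double *ai, double *br, double *bi,
               const double *wr, const double *wi, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* t = w * b (complex multiply) */
        double tr = wr[i] * br[i] - wi[i] * bi[i];
        double ti = wr[i] * bi[i] + wi[i] * br[i];
        /* b = a - t, then a = a + t */
        br[i] = ar[i] - tr;
        bi[i] = ai[i] - ti;
        ar[i] += tr;
        ai[i] += ti;
    }
}
```

The resulting `butterfly.s` shows which registers and SSE2 instructions the compiler chose for the multiplies and adds; other compilers have their own equivalent of `-S`.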

More interesting would be a pipeline analysis of the hotspots. I don't know if such a thing is possible for Intel-based systems right now, but simulating for Athlon/Opteron could give some clues.

EDIT: CodeAnalyst 2.1 doesn't have the nice graphical pipeline analyzer that was present in version 1.2; it's now a command-line tool. It's still very useful (because of its SSE2 and x86-64 support). To give an idea of what it looked like in 1.2, here's a screenshot (you can get version 1.2 and some info by searching Google for "AMD Codeanalyst" and opening the cached page). The colored boxes have meanings like "dispatch" (yellow), "execute" (green), "retiring" (grey), some stall (red border) and so on. Moving the mouse pointer over a box gives detailed info about what's happening with that instruction during that particular cycle.
http://www.informatik.uni-rostock.de...odeAnalyst.jpg

2003-04-28, 15:13   #2
Dresdenboy
Apr 2003
Berlin, Germany

551₈ Posts

Screenshot and more info added (I'm writing this because there is no notification about my edit).

2003-04-28, 20:45   #3
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL

2²×3×647 Posts

Dresdenboy, your ID makes me think you work for AMD! In which case I'm almost ashamed to ask this question: "Does CodeAnalyst work on Intel CPUs - especially the pipeline analyzer?" I know it won't tell me anything about Intel's pipelines but it would be cool if it would help an Intel owner optimize his code for an AMD CPU.

2003-04-28, 22:01   #4
gbvalor
Aug 2002

1101111₂ Posts
Re: interesting tools and compilers (for P4, Athlon, Opteron

Hello,

Nice review! I'm still waiting for something like CodeAnalyst ... but for Linux :-(

Quote:
Originally Posted by Dresdenboy
GCC 3.3 has better x86-64 support. A complete package (MingW versions) can be found here:
http://www.thisiscool.com/gcc33_mingw.htm

I've had bad experiences with GCC 3.3 and its SSE2 support. It is indeed very buggy when handling SSE2 code. Currently I have to work with unstable CVS snapshots of GCC 3.4, which work fine for Glucas, with no problems in the Pentium 4 and Opteron support.

Guillermo.

2003-04-29, 06:25   #5
Dresdenboy
Apr 2003
Berlin, Germany

19² Posts

Quote:
Originally Posted by Prime95
Dresdenboy, your ID makes me think you work for AMD! In which case I'm almost ashamed to ask this question: "Does CodeAnalyst work on Intel CPUs - especially the pipeline analyzer?" I know it won't tell me anything about Intel's pipelines but it would be cool if it would help an Intel owner optimize his code for an AMD CPU.

That's not even far-fetched ;) But it's not me who works there; it's one of my friends, who is a fab worker in Dresden. I'm just someone who has liked optimizing code since the 6502, 68k and 386 days. But since this work is often already being done for Intel CPUs these days, someone needs to do it (or help to do it) for AMD.

CodeAnalyst works on all x86 CPUs, except that event-based profiling isn't possible on Intel ones: CA checks for the availability of certain events. Here at work (not AMD) there are some PIII and P4 systems. On the PIII, the CodeAnalyst 1.2 pipeline analyzer works perfectly. You can select a CPU (K6-2, Athlon, Duron, Athlon XP/MP) and the multiplier for the simulation.

That was a nice way to study SSE behaviour (and to find out that MMX is faster for AND, NOR etc.). I saw that the scheduler was often simply full of ops, because there were no integer ops in a run of about 30 SSE instructions, and sometimes it had to wait for the load/store unit because a just-stored value was being reused.

In CodeAnalyst 2.1b, Athlon 64/Opteron code can be simulated as well. Maybe I'll write a GUI some day if AMD isn't planning one.

Opteron also has better options for doing SSE2 than Athlon XP has for SSE, because many of the important ops are now DirectPath or double (mOP) decoded and no longer fill up the issue ports as before, which would allow us to start a 64-bit mul or so alongside.

DDB

2003-04-29, 06:48   #6
Dresdenboy
Apr 2003
Berlin, Germany

361₁₀ Posts

An interesting fact about the Intel compiler:
http://www.aceshardware.com/forum?read=95033881 quotes the Inquirer as reporting that SSE2 code compiled on a Xeon runs much faster on an Opteron than the same code compiled on the Opteron itself (using the same options and target).

A comment (http://www.aceshardware.com/forum?read=95033983):
Quote:
If you compile on Opteron platform, the compiler detects the chip and produces poor performing code, so you have to compile on a P4.

There is an IFDEF that basically says "If AMD chip detected, disable optimizations".

The opportunity to insert that function is why Intel has been spending a ton of money buying out any compiler producers it can.

Well, for me at home (Athlon XP) the generated code is really fast. But I'll run a test and also compile it on a P4 to try at home.
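
For reference, a minimal sketch of how any program can tell which vendor's CPU it is running on. This only illustrates the mechanism the quoted comment alludes to; it is not code from the Intel compiler, and it says nothing about whether the quoted claim is true. It assumes GCC or Clang, whose `<cpuid.h>` provides `__get_cpuid`; the helper names `cpu_vendor` and `is_amd` are made up for this example.

```c
/* vendor.c: read the CPUID vendor string (illustration only). */
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang header, assumed available */
#endif

/* Writes the vendor ID (e.g. "GenuineIntel", "AuthenticAMD") into out[13].
 * Returns 1 on success, 0 if CPUID leaf 0 is unavailable (or not x86). */
int cpu_vendor(char out[13])
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(0, &eax, &ebx, &ecx, &edx)) {
        memcpy(out + 0, &ebx, 4);   /* vendor string order is EBX, EDX, ECX */
        memcpy(out + 4, &edx, 4);
        memcpy(out + 8, &ecx, 4);
        out[12] = '\0';
        return 1;
    }
#endif
    strcpy(out, "unknown");
    return 0;
}

/* A check keyed on the vendor name rather than on feature bits. */
int is_amd(void)
{
    char v[13];
    return cpu_vendor(v) && strcmp(v, "AuthenticAMD") == 0;
}
```

Code that branches on the vendor string like this, instead of on the CPUID feature bits (SSE, SSE2 and so on), would behave differently on two CPUs that support the same instructions.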

DDB

2003-04-29, 18:30   #7
Xyzzy
Aug 2002

2²×11×191 Posts
Re: interesting tools and compilers (for P4, Athlon, Opteron

Quote:
Originally Posted by gbvalor
Nice review!. I'm still waiting for something like CodeAnalyst ... but for linux :-(

Is VTune the same thing?

Quote:
To all Software Developers:
INTEL INTRODUCES UPDATED VERSION OF VTUNE(TM) PERFORMANCE ANALYZER FOR LINUX

INTEL(R) VTUNE(TM) PERFORMANCE ANALYZER V1.1 FOR LINUX
We are excited to inform you that the VTune(TM) Performance Analyzer 1.1 for Linux* is available now for purchase, by download and on CD-ROM directly from Intel and also from Intel Software Development Product resellers worldwide.

FEATURES
The VTune analyzer for Linux* provides a fully native-Linux solution that allows you to reach higher levels of software performance on the latest 32-bit Intel processors, including the new Pentium(r) M processor component of Intel Centrino(TM) mobile technology, Intel(r) Xeon(TM) and Intel(r) Pentium(r) 4 processors.

This new product provides a command-line capability that allows you to collect, analyze, and display performance data for your 32-bit Linux* applications, kernels and drivers. This product version highlights include:

· Powerful, flexible native-Linux command line interface
· Low intrusion system-wide profiling capability
· Local event based sampling and call graph support
· Support for multiple Red Hat* and SuSE* Linux distributions

VTune(TM) Performance Analyzer 1.1 for Linux
http://intel.m0.net/m/s.asp?HB8872253498X2397262X183625X

EVALUATION AND PURCHASE
Please visit us at http://www.intel.com/software/products to learn more about evaluation and purchase of the VTune Performance Analyzer 1.1 for Linux*.

SUPPORT
Every purchase of an Intel Software Development Product includes one year of support services, which provides access to Intel Premier Support and all product updates during that time. Premier Support includes online access to technical and application notes and documentation.

INTEL SOFTWARE COLLEGE
Also check out the Intel Software College course selections for application developers. Intel Software College offers high-quality training worldwide on Intel processors, platforms, tools and technologies.
http://intel.m0.net/m/s.asp?HB8872253498X2397263X183625X

Regards,
Intel VTune(TM) Performance Analyzer Product Team

2003-04-29, 20:33   #8
gbvalor
Aug 2002

157₈ Posts
Re: interesting tools and compilers (for P4, Athlon, Opteron

Hello,

Quote:
Originally Posted by Xyzzy
Quote:
Originally Posted by gbvalor
Nice review!. I'm still waiting for something like CodeAnalyst ... but for linux :-(
Is VTune the same thing?

It's almost the same thing. The free evaluation is only valid for 7 days, so by the time someone (not very skilled, like me) has learnt how to use VTune, the license has expired :-( . OTOH, I can't spend $699 on a VTune license. This is why I'm anxious to see something free for non-commercial purposes (like the Intel compilers).

AMD people! Linux has been the OS with the best and fastest support for your new processors. It's time to offer developers performance tuners and compilers for your hardware ;) isn't it?

Guillermo.

2003-04-30, 09:21   #9
Dresdenboy
Apr 2003
Berlin, Germany

551₈ Posts
Re: interesting tools and compilers (for P4, Athlon, Opteron

Quote:
Originally Posted by gbvalor
Hello,
Nice review!. I'm still waiting for something like CodeAnalyst ... but for linux :-(

Have a look at the already mentioned OProfile for Linux, which offers counter-based profiling down to instruction level for both Intel and AMD.

If some pipeline analysis is necessary, one could put a few hotspot functions into a small Cygwin/MinGW app built with the same compiler (GCC/ICC) and analyse it under Windows.
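
A rough sketch of such a small app, under stated assumptions: `hotspot` is a placeholder (squaring an array in place) standing in for whatever function is really of interest, and the names `run_hotspot` and `last_result` are invented for this example. The point is only that the process does little besides run the code under study.

```c
/* harness.c: hypothetical driver that isolates one hot function so a
 * profiler or pipeline simulator sees little besides the code of interest. */
#include <stddef.h>
#include <time.h>

#define N 1024

static double data[N];

/* Placeholder hotspot (an assumption): square every element in place. */
static void hotspot(double *x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        x[i] *= x[i];
}

/* Run the hotspot `reps` times on fresh inputs; return CPU seconds used. */
double run_hotspot(unsigned reps)
{
    clock_t start = clock();
    for (unsigned r = 0; r < reps; r++) {
        for (size_t i = 0; i < N; i++)
            data[i] = 1.0 + (double)i / N;  /* reset so values stay bounded */
        hotspot(data, N);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Expose one result so the computation has an observable effect. */
double last_result(void)
{
    return data[N - 1];
}
```

A `main` would just call `run_hotspot` with a large repetition count and print the time and `last_result()`; the analyzer is then pointed at `hotspot`.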

CodeAnalyst 2.1 doesn't even need a program database or debug info, but it's good to know the start address for a trace. I used CodeAnalyst's graphical interface to find a suitable start address by going down to disassembly level. That's possible without source code or a program database, but then you don't see the associated source lines or function names.

While profiling my Game of Life here, I also profiled Prime95, the Win2k kernel and other running code, all of which you can also look at in the disassembled view.

Pipeline analysis of this Game of Life also shows that it runs ~40% faster on Opteron than on Athlon XP, clock for clock.

Regards,
DDB

2003-04-30, 14:07   #10
Dresdenboy
Apr 2003
Berlin, Germany

19² Posts

A short info about CodeAnalyst 2.1b:

As I found out, the command-line tools for pipeline analysis produce a binary file, which could be used to create a graphical visualisation like the one in CodeAnalyst 1.2. Currently it's only used to produce simulation reports.

DDB

2003-05-05, 16:24   #11
gbvalor
Aug 2002

3×37 Posts
Re: interesting tools and compilers (for P4, Athlon, Opteron

Quote:
Originally Posted by Dresdenboy
Have a look at the also mentioned Oprofile for Linux, which has counter based profiling down to instruction level both for Intel/AMD.

OProfile looks very interesting, but I need to compile a new kernel for it, so I must do things carefully.

Thanks, Dresdenboy. Your suggestions have been useful to me.

If you like, you can get and test Glucas code at:

http://sourceforge.net/projects/glucas

There is still no release with the SSE2 code. It is currently only partially implemented, and you can download it from the CVS repository.

Guillermo.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
