mersenneforum.org  

Go Back   mersenneforum.org > Extra Stuff > Programming

Reply
 
Thread Tools
Old 2012-09-15, 17:35   #1
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

173116 Posts
Default Program optimization

I while back I came across http://critticall.com/
This program optimizes code by making random changes, checking whether they work and their speed.
Would this sort of approach work with optimizing assembler? Currently a compiler outputs a particular version for each type of cpu but there are potentially more optimizations that can be done for a particular system. Experimentation would surely give better results.
henryzz is offline   Reply With Quote
Old 2012-09-15, 18:50   #2
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts
Default

Quote:
Originally Posted by henryzz View Post
I while back I came across http://critticall.com/
This program optimizes code by making random changes, checking whether they work and their speed.
Would this sort of approach work with optimizing assembler? Currently a compiler outputs a particular version for each type of cpu but there are potentially more optimizations that can be done for a particular system. Experimentation would surely give better results.
http://critticall.com/mersenne.html
Dubslow is offline   Reply With Quote
Old 2012-09-15, 18:54   #3
Batalov
 
Batalov's Avatar
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

259116 Posts
Default

It would have been funny, if it wasn't so sad to watch...
Batalov is offline   Reply With Quote
Old 2012-09-15, 18:59   #4
Batalov
 
Batalov's Avatar
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

961710 Posts
Default

Here's how Critticall would solve the following problem.

One of the lights in the bathroom just went out.
Solve it.
<...a few hours later...>
All walls are drilled in random places, the toilet is taken off and broken to pieces, all the fuses are blown by shorts, the water pipes were disconnected (but not shut)... you get the idea. The light is still off.
_______________

(Genetic algorithms were around for more than a decade. I do agree that there are some problems for them. I've exaggerated a little bit with this example.)

Last fiddled with by Batalov on 2012-09-15 at 19:05
Batalov is offline   Reply With Quote
Old 2012-09-15, 20:01   #5
cheesehead
 
cheesehead's Avatar
 
"Richard B. Woods"
Aug 2002
Wisconsin USA

22·3·641 Posts
Default

The basic problem is that the random genetic evolution algorithm needs 4 billion years to work properly.
cheesehead is offline   Reply With Quote
Old 2012-09-16, 03:05   #6
CRGreathouse
 
CRGreathouse's Avatar
 
Aug 2006

3·1,993 Posts
Default

Quote:
Originally Posted by Batalov View Post
Here's how Critticall would solve the following problem.

One of the lights in the bathroom just went out.
Solve it.
<...a few hours later...>
All walls are drilled in random places, the toilet is taken off and broken to pieces, all the fuses are blown by shorts, the water pipes were disconnected (but not shut)... you get the idea. The light is still off.
_______________

(Genetic algorithms were around for more than a decade. I do agree that there are some problems for them. I've exaggerated a little bit with this example.)
That's not the real problem here, though. The problem is the fitness function: it's very hard to test if a program correctly tests primality. So instead it pokes a hole in the wall and waits a billion years seeing if that made the lights go on.
CRGreathouse is offline   Reply With Quote
Old 2012-09-22, 07:25   #7
Brian Gladman
 
Brian Gladman's Avatar
 
May 2008
Worcester, United Kingdom

21B16 Posts
Default

Quote:
Originally Posted by henryzz View Post
I while back I came across http://critticall.com/
This program optimizes code by making random changes, checking whether they work and their speed.
Would this sort of approach work with optimizing assembler? Currently a compiler outputs a particular version for each type of cpu but there are potentially more optimizations that can be done for a particular system. Experimentation would surely give better results.
The x64 assembler code in MPIR is produced using a method that is not random but is based on a related idea (developed by another MPIR contributor).

The code is first designed and it is then analysed and annotated with information that specifies how far each individual instruction can be moved relative to the others around it without changing the final answer (using knowledge of the target micro-architecture). The code is then run in a simulator that tests the different instruction orders to find the order that gives the highest speed.
Brian Gladman is offline   Reply With Quote
Old 2012-09-22, 07:41   #8
retina
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

2·23·137 Posts
Default

The problem I foresee here is:
Quote:
Originally Posted by Brian Gladman View Post
The code is then run in a simulator that tests the different instruction orders to find the order that gives the highest speed.
... under what conditions?

The modern CPUs are very complex things. There are both internal and external states and conditions that occur under real application usage that a simulator cannot hope to duplicate, or even model closely.

Also with the sheer number of movable instruction in any non-trivial program I suspect that finding the optimal order is a rather hard problem. Is there a travelling salesman in the house?
retina is offline   Reply With Quote
Old 2012-09-22, 08:16   #9
Brian Gladman
 
Brian Gladman's Avatar
 
May 2008
Worcester, United Kingdom

72×11 Posts
Default

Quote:
Originally Posted by retina View Post
The problem I foresee here is:... under what conditions?

The modern CPUs are very complex things. There are both internal and external states and conditions that occur under real application usage that a simulator cannot hope to duplicate, or even model closely.

Also with the sheer number of movable instruction in any non-trivial program I suspect that finding the optimal order is a rather hard problem. Is there a travelling salesman in the house?
Most of the time a relatively short piece of assembler code is running, it is possible to predict with reasonable accuracy how the micro-architecture will behave. Moreover, the low level code that implements many of the primitives on which multiple precision arithmetic depends is actually quite simple so the complexity is well contained.
Brian Gladman is offline   Reply With Quote
Old 2012-09-22, 08:59   #10
retina
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

2·23·137 Posts
Default

Quote:
Originally Posted by Batalov View Post
Here's how Critticall would solve the following problem.

One of the lights in the bathroom just went out.
Solve it.
<...a few hours later...>
All walls are drilled in random places, the toilet is taken off and broken to pieces, all the fuses are blown by shorts, the water pipes were disconnected (but not shut)... you get the idea. The light is still off.
The main difference from your example is that computer code can be easily restored to the original condition if it becomes obvious that things are going haywire.
Quote:
Originally Posted by cheesehead View Post
The basic problem is that the random genetic evolution algorithm needs 4 billion years to work properly.
Are you sure 4Gy is enough, we've yet to see the outcome "work properly". But we had better hurry up since we've only got about another 800My-1Gy left to get things right.
Quote:
Originally Posted by Brian Gladman View Post
Most of the time a relatively short piece of assembler code is running, it is possible to predict with reasonable accuracy how the micro-architecture will behave. Moreover, the low level code that implements many of the primitives on which multiple precision arithmetic depends is actually quite simple so the complexity is well contained.
There are many different micro-architectures around; they all respond differently to the input code. Plus not just the micro-architecture is at play here: there is also the various myriad of buffers, caches, cores, SMT, temperature sensors, pipelines, etc. that tend to mess up any predictions depending upon what was executed immediately in the past or might be executed shortly in the future.
retina is offline   Reply With Quote
Old 2012-09-22, 09:41   #11
Brian Gladman
 
Brian Gladman's Avatar
 
May 2008
Worcester, United Kingdom

72·11 Posts
Default

Quote:
Originally Posted by retina View Post
There are many different micro-architectures around; they all respond differently to the input code. Plus not just the micro-architecture is at play here: there is also the various myriad of buffers, caches, cores, SMT, temperature sensors, pipelines, etc. that tend to mess up any predictions depending upon what was executed immediately in the past or might be executed shortly in the future.
Much of the MPIR assembler code is optimised on an architecture specific basis with the various quirks taken into account. The impact of cache effects can be reasonably accounted for and we have included techniques to account for the run-in and run-out effects you mention (many of which are weird and not well documented but can be seen in the results and hence built into the simulator).

Nobody pretends that any of this is perfect - it doesn't have to be. The overall speed improvements are both measurable and significant.
Brian Gladman is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
gcc optimization/intrinsics R.D. Silverman Factoring 12 2015-09-15 08:51
Possible optimization for GIMPS alpertron Math 3 2012-08-13 16:08
NFS Optimization R.D. Silverman Factoring 91 2010-01-24 20:48
Program Optimization for dual processor machines Altrus Software 4 2005-09-26 15:19
ASM Optimization Cyclamen Persicum Hardware 4 2004-05-26 07:51

All times are UTC. The time now is 21:30.


Thu Dec 2 21:30:36 UTC 2021 up 132 days, 15:59, 0 users, load averages: 0.97, 1.10, 1.19

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.