mersenneforum.org  

Old 2003-03-30, 08:59   #1
Xyzzy
 
Aug 2002

3²·929 Posts
Possible optimization...

I heard about this today... Although it is designed to benefit multi-threaded processes, it is said to work on single-threaded processes too... Preliminary benchmarks show it might be worth using... It is free for individual use...

http://www.winheap.com/

If you try it please post your benchmarks here...
Old 2003-03-30, 16:06   #2
cperciva
 
Oct 2002

43 Posts

It's snake oil. Sure, there are cases where it will be 10 times faster... IF all you do is allocate and release chunks of memory. In other cases, it will be considerably slower than the standard memory allocator.

This is a well known fact about memory allocation: Different allocators can perform dramatically better than each other depending upon the artificial benchmark, but if you're not making large numbers of malloc calls to begin with, it doesn't matter which allocator you use.
Old 2003-03-30, 21:49   #3
IanB
 
Mar 2003

4₁₆ Posts

Quote:
Originally Posted by cperciva
It's snake oil. Sure, there are cases where it will be 10 times faster... IF all you do is allocate and release chunks of memory. In other cases, it will be considerably slower than the standard memory allocator.

This is a well known fact about memory allocation: Different allocators can perform dramatically better than each other depending upon the artificial benchmark, but if you're not making large numbers of malloc calls to begin with, it doesn't matter which allocator you use.
Right, let's address each of your points.

1. "Snake oil". Absolute nonsense. So you're an experienced enterprise C++ programmer are you ? Used to dealing with multiprocessor machines and significantly multithreaded applications ? No, thought not. There are any number of products that serve the same market, such as hoard (www.hoard.org) and SmartHeap (www.microquill.com). You think these are all snake oil ? Don't be ridiculous.
2. "In other cases it will be considerably slower...". Again, total nonsense. Why should it be "considerably slower" ? The "standard" allocator does nothing more than take out a process-wide synchronisation lock, then call the HeapXXXX functions of the Win32 API - which is exactly what Microsoft say is "broken" about SP1. Most compiler vendors' heap managers work the same way. WinHeap and similar products perform any number of tricks to (a) reduce kernel time and (b) remove synchronisation points, thus dramatically improving performance and scalability. Try reading the source code of some runtime libraries before you comment.
3. "This is a well known fact about memory allocation: Different allocators can perform dramatically better than each other depending upon the artificial benchmark, but if you're not making large numbers of malloc calls to begin with, it doesn't matter which allocator you use."
Yes, of course if you are not performing many heap functions you will not see much of an improvement. However, most server side applications, particularly those accepting client connections, are very heap intensive. They typically create object instances to deal with the client request, plus memory for retrieving information from databases or for generating HTML/XML/PDF files etc. Such applications benefit greatly from a scalable heap manager - without one those applications are typically reduced to effectively being single threaded.

WinHeap is a new player on the block. Established products, such as SmartHeap and Hoard, claim a considerable number of customers (just look at the client lists on their websites). You think these people parted with good money (in the case of SmartHeap) for nothing ? Actually, what they do is employ the product, benchmark it *with their own software*, then see the difference it makes *in real world applications*. WinHeap has a version available for free download to allow *you* to do the exact same thing - download it and try it with your own application(s).

But hey, don't take my word for it. Look at what Microsoft say at:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vcsample98/html/vcsmpmpheapsamplemultithreadedheapmanager.asp

"MpHeap Sample: Multithreaded Heap Manager -
Many multithreaded applications that use the standard memory allocation routines pay a significant performance penalty when running on a multiprocessor machine. This is due to the serialization used by the default heap. On a multiprocessor machine, more than one thread may try to allocate memory simultaneously. One thread will block on the critical section guarding the heap. The other thread must then signal the critical section when it is finished to release the waiting thread. This adds significant overhead."
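
For anyone who wants to see the idea in code: a common way around the single heap lock described above is to keep several independent heaps, each with its own lock, and send each request to whichever one is not busy. What follows is only a minimal sketch under stated assumptions, not the code of MpHeap, WinHeap, Hoard or any other product. The class name `ShardedHeap`, the shard count of 4, and the choice to delegate the real allocation to `std::malloc` are all hypothetical; a real heap manager would own separate memory regions per shard rather than just a counter.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Hypothetical illustration only: several "heaps", each guarded by its own
// lock, so two threads rarely wait on the same critical section.
class ShardedHeap {
public:
    static constexpr std::size_t kShards = 4;  // assumed shard count

    void* allocate(std::size_t size) {
        // First pass: take the first shard whose lock is free, without blocking.
        for (std::size_t i = 0; i < kShards; ++i) {
            std::unique_lock<std::mutex> lock(shards_[i].mutex, std::try_to_lock);
            if (lock.owns_lock()) {
                return allocate_locked(i, size);
            }
        }
        // Every shard was busy: block on shard 0 as a last resort.
        std::lock_guard<std::mutex> lock(shards_[0].mutex);
        return allocate_locked(0, size);
    }

    void release(void* p) {
        if (p == nullptr) return;
        // The header in front of the block records which shard it came from.
        Header* header = static_cast<Header*>(p) - 1;
        assert(header->shard < kShards);
        Shard& shard = shards_[header->shard];
        std::lock_guard<std::mutex> lock(shard.mutex);
        --shard.live_blocks;
        std::free(header);
    }

    // Number of blocks currently allocated across all shards.
    std::size_t total_live() {
        std::size_t total = 0;
        for (Shard& shard : shards_) {
            std::lock_guard<std::mutex> lock(shard.mutex);
            total += shard.live_blocks;
        }
        return total;
    }

private:
    // Aligned so the user pointer that follows keeps malloc's alignment.
    struct alignas(std::max_align_t) Header {
        std::size_t shard;
    };
    struct Shard {
        std::mutex mutex;
        std::size_t live_blocks = 0;
    };

    // Caller must already hold shards_[index].mutex.
    void* allocate_locked(std::size_t index, std::size_t size) {
        void* raw = std::malloc(sizeof(Header) + size);
        if (raw == nullptr) return nullptr;
        Header* header = static_cast<Header*>(raw);
        header->shard = index;
        ++shards_[index].live_blocks;
        return header + 1;
    }

    std::array<Shard, kShards> shards_;
};
```

Note that `release` still takes the owning shard's lock, so a block freed by a different thread than the one that allocated it can contend; how to handle that case cheaply is one of the places where real heap managers differ.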

So, before stating that things are "well known facts", try understanding what the "well known facts" actually are.
Old 2003-03-31, 01:11   #4
trif
 
Aug 2002

202₁₀ Posts

IanB, may I ask what the logic is in not including Win ME/98/95 support in the Standard (free for personal/nonprofit) version? I'd love to be able to at least test it on my home machines, but they are all running Win98SE.
Old 2003-03-31, 01:52   #5
IanB
 
Mar 2003

2² Posts

Quote:
Originally Posted by trif
IanB, may I ask what the logic is in not including Win ME/98/95 support in the Standard (free for personal/nonprofit) version? I'd love to be able to at least test it on my home machines, but they are all running Win98SE.
We've actually knocked back support for Win98/ME for all versions (Standard, Developer, Professional), at least until we finish the current beta programme. WinHeap will *load* on 98/ME, but it does nothing. The reason for this is that many of the key functions that we rely upon internally for improved performance are not available on Win98/ME (InterlockedCompareExchangePointer, EnterCriticalSection etc). Although we've put in place workarounds, we found the final speed improvements to be negligible. And we still had some niggly compatibility issues outstanding even with the workarounds in place.

Our target market is really multithreaded/multiprocessor environments. Since the 98/ME family does not support multiple processors there's little benefit in our spending the (probably large) amount of time needed to get any real performance benefit out of WinHeap on those platforms. Ultimately, we need to ensure that our target platforms (NT, 2000, XP) are 100% going before looking at 98/ME again.

One question I did not address, that was really implicit in the original poster's message, is whether or not prime could benefit from WinHeap. The answer is, unfortunately, likely to be no. As the generally misinformed second poster stated, unless your application is performing heap operations regularly, there's little to be gained by using a different heap manager. My guess (although I have not tried it) is that the prime software is computationally expensive, but unlikely to perform very much in the way of malloc(), free() and other heap management functions.
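
The reasoning in the paragraph above can be put into numbers with Amdahl's law: if only a fraction of the run time is spent in the heap, only that fraction gets faster. The helper below is a rough sketch, not a measurement. The function name `overall_speedup` and both parameters are hypothetical, and the fraction of time a given program spends in malloc()/free() is something you would have to get from a profiler.

```cpp
#include <cassert>

// Amdahl's law applied to a replacement allocator.
//   heap_fraction:     assumed share of run time spent in heap calls (0..1)
//   allocator_speedup: how many times faster the new allocator is at them
// Returns the expected speedup of the whole program.
double overall_speedup(double heap_fraction, double allocator_speedup) {
    assert(heap_fraction >= 0.0 && heap_fraction <= 1.0);
    assert(allocator_speedup > 0.0);
    return 1.0 / ((1.0 - heap_fraction) + heap_fraction / allocator_speedup);
}
```

For example, a program that spends 1% of its time in the heap gains under 1% overall from an allocator that is 10 times faster (`overall_speedup(0.01, 10.0)` is about 1.009), while one that spends half its time there gets roughly 1.8 times faster.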

On the other hand, if you're a user of big Excel spreadsheets or Word documents, there's some evidence to suggest that WinHeap could help. We're looking into this at the moment.

Cheers

Ian
Old 2003-03-31, 04:44   #6
trif
 
Aug 2002

202₁₀ Posts

I would not have expected much if any improvement in Prime95 time since it does not do a lot of memory allocation. But people are seeing improvement nonetheless. I am wondering if it is due to lower overhead from the OS itself if Windows is using the single threaded heap manager and WinHeap takes its place.

Thank you for addressing the Win98 situation. I had read in the FAQ that Me/98/95 support was in the Pro and Developer versions, and I couldn't see the logic in not putting it in the Standard version too.
Old 2003-03-31, 04:54   #7
IanB
 
Mar 2003

4₁₆ Posts

Quote:
Originally Posted by trif
I would not have expected much if any improvement in Prime95 time since it does not do a lot of memory allocation. But people are seeing improvement nonetheless. I am wondering if it is due to lower overhead from the OS itself if Windows is using the single threaded heap manager and WinHeap takes its place.

Thank you for addressing the Win98 situation. I had read in the FAQ that Me/98/95 support was in the Pro and Developer versions, and I couldn't see the logic in not putting it in the Standard version too.
I had a read through the prime source code. I *did* find a bunch of heap management stuff, but it was not clear how often it would be called. I suspect it would be rarely, but I'll try to find some time to run it with our profiler to see. If people are seeing an improvement, well that's great :-) We'd be grateful if they would register that they are using WinHeap by visiting http://www.winheap.com/myaccount/register.php. Hey I wish it would make a difference to my seti times :)

I too noticed the FAQ still says that Win98/ME are not supported by WinHeap Standard - must get that fixed up.

Cheers

Ian
Old 2003-03-31, 06:28   #8
cperciva
 
Oct 2002

43 Posts

Quote:
Originally Posted by IanB
1. "Snake oil". Absolute nonsense. So you're an experienced enterprise C++ programmer are you ? Used to dealing with multiprocessor machines and significantly multithreaded applications ?
No, I'm not an enterprise C++ programmer; I'm a DPhil student at Oxford University, doing my thesis work in parallel computing. I consider the claim that "WinHeap can make multithreaded programs run ten times faster" to qualify it as snake oil given that even your completely artificial benchmarks fail to reach that.

Quote:
2. "In other cases it will be considerably slower...". Again, total nonsense. Why should it be "considerably slower" ? The "standard" allocator does nothing more than take out a process-wide synchronisation lock, then call the HeapXXXX functions of the Win32 API - which is exactly what Microsoft say is "broken" about SP1. Most compiler vendors' heap managers work the same way. WinHeap and similar products perform any number of tricks to (a) reduce kernel time and (b) remove synchronisation points, thus dramatically improving performance and scalability. Try reading the source code of some runtime libraries before you comment.
How well do you handle memory fragmentation? How much overhead do you introduce by avoiding synchronization locks in cases where there is no contention?

Quote:
...well known facts...
I meant "well known facts within the academic community", of course.
Old 2003-03-31, 07:15   #9
IanB
 
Mar 2003

4₁₆ Posts

Quote:
Originally Posted by cperciva
No, I'm not an enterprise C++ programmer; I'm a DPhil student at Oxford University, doing my thesis work in parallel computing.
Ahhh excellent, a uni student. Then I would have thought you would attempt to provide some evidence of your assertions rather than just masquerade as somebody knowledgeable about a subject, which quite clearly you are not.

Quote:
Originally Posted by cperciva
I consider the claim that "WinHeap can make multithreaded programs run ten times faster" to qualify it as snake oil given that even your completely artificial benchmarks fail to reach that.
Riiiight. So you failed to notice the benchmarks showing iteration 5 performing at about 13x the speed of the Microsoft heap manager. Of course the benchmarks are artificial; the whole point is to demonstrate what the software is capable of. The same type of benchmark (algorithm wise) is also used by Hoard, which has considerable academic expertise behind it. Exactly how would you expect to benchmark a product like this in a way that is reproducible by a developer ? Get real. The benchmark has to be simple, easily understood and reproducible - and our benchmark is.

In fact, those benchmarks are old. Currently, under a dual processor Windows 2000 server, WinHeap is on *average* 17 times faster than the Microsoft heap manager for that benchmark. That number improves as you add either more processors or more threads.

We make *absolutely no* assertions that WinHeap will speed up all software. That would be ridiculous. That's why you get to download it and try it yourself with your own software. No other commercially available heap manager offers that option. Moreover, we have a number of clients who are using it successfully in our beta test programme, and they most certainly are getting better performance and better throughput for their applications.

Quote:
How well do you handle memory fragmentation? How much overhead do you introduce by avoiding synchronization locks in cases where there is no contention?
Two sensible questions, well done. Completely off the point of course, but sensible questions nonetheless.

Fragmentation handling depends upon the size of the memory block. WinHeap provides three different allocation/deallocation strategies depending upon block size: <256 bytes, <16384 bytes and everything else. These strategies are hardly capable of being described in a few lines here, but we've found that our overall memory usage is usually slightly under what would be allocated by the Visual C++ heap manager for the same program running over the same time period. Long term running of various programs hasn't shown any significant overhead due to fragmentation generally, although WinHeap will generally allocate another block of memory (from its internal store) rather than wait for another thread (which often involves a kernel context switch), so it is possible for WinHeap to require more memory than the Microsoft heap manager. However, in a production environment memory is cheap and processors are not. Nobody has told us that they are concerned with WinHeap's memory usage, nor would I expect them to.
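
To make the three-way split concrete, here is a minimal sketch of how a request could be routed by size. Only the two thresholds (256 and 16384 bytes) come from the post above; the names `SizeClass`, `classify`, `kSmallLimit` and `kMediumLimit` are hypothetical, and what each strategy actually does inside WinHeap is not described here, so nothing about that is implied.

```cpp
#include <cassert>  // for assert-based usage checks
#include <cstddef>

// Hypothetical labels for the three strategies; the real ones are not documented.
enum class SizeClass { Small, Medium, Large };

constexpr std::size_t kSmallLimit = 256;     // requests below this are "small"
constexpr std::size_t kMediumLimit = 16384;  // requests below this are "medium"

// Pick a strategy from the requested block size.
constexpr SizeClass classify(std::size_t bytes) {
    return bytes < kSmallLimit    ? SizeClass::Small
           : bytes < kMediumLimit ? SizeClass::Medium
                                  : SizeClass::Large;
}

// Boundary values land in the next class up, matching the "<" in the post.
static_assert(classify(255) == SizeClass::Small, "just under the small limit");
static_assert(classify(256) == SizeClass::Medium, "exactly at the small limit");
```

Splitting by size like this is a common allocator design because small blocks are requested far more often than large ones, so each class can use a scheme tuned to its own trade-off between speed and wasted space.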

Contention is dealt with by always attempting to use user mode synchronisation primitives, plus a small-ish spin count. Judicious use of InterlockedCompareExchangePointer, which uses bus level locking offered by the x86 architecture, plus EnterCriticalSection, gives us great flexibility. Kernel mode locks are required sometimes, it's unavoidable. Of course there are other algorithmic "tricks" that we have employed to try to reduce contention, but I'm not going to document them here.
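
The "spin a little in user mode before giving up" pattern described above can be sketched in portable C++. This is a hedged illustration, not WinHeap's code: `std::atomic` compare-exchange stands in for the Win32 interlocked call, `std::this_thread::yield()` stands in for the kernel mode wait, and the class name `SpinThenYieldLock` and the default spin count of 4000 are assumptions chosen for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical sketch of a lock that spins in user mode for a bounded number
// of attempts and only then gives up its time slice.
class SpinThenYieldLock {
public:
    explicit SpinThenYieldLock(int spin_count = 4000) : spin_count_(spin_count) {
        assert(spin_count > 0);  // zero would mean lock() never tries at all
    }

    // One atomic compare-and-swap: succeeds only if the lock was free.
    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed);
    }

    void lock() {
        for (;;) {
            // Cheap path: retry in user mode, no system call involved.
            for (int i = 0; i < spin_count_; ++i) {
                if (try_lock()) return;
            }
            // Still contended: let another thread run before spinning again.
            std::this_thread::yield();
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
    const int spin_count_;
};
```

Because it provides `lock()`, `unlock()` and `try_lock()`, the class can be used with `std::lock_guard<SpinThenYieldLock>`. Spinning only pays off when the lock is held briefly and the holder is running on another processor; on a single processor the spin is wasted time, which is one reason the spin count is kept small.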

Quote:
I meant "well known facts within the academic community", of course.
Oh yes, of course that's what you meant, how could I possibly not have known that ? As smart as you probably are, I suggest you do more research about a subject before posting in a public forum like this. You completely fail to justify your comment that "other heap managers will be slower than the standard allocator". Indeed such a comment is quite unjustifiable - it's just plain wrong. And to back that up, on a single processor machine running a single thread, the Thrash benchmark runs about 10% faster with WinHeap than with the Microsoft heap manager. Don't take my word for it though, download it and try it yourself. If you want to try other benchmarks of your own, go ahead. I'm keen to see the results. For your assertion to be correct implies that somehow Microsoft has implemented the perfect single threaded heap manager. Do you really believe that ?

You could always simply try the software yourself, or the Microsoft sample I posted the link to, or Hoard, or whatever. Or would you simply prefer to point fingers and say "that's crap" ?

WinHeap, SmartHeap, Hoard, MtHeap etc etc are all solving *real* problems that exist in *real* development projects in *real* companies. We believe WinHeap is one of the fastest and certainly the easiest to use. It's not a magic cure for a poorly written multithreaded program, but where heap management is causing contention - as I have already demonstrated it often does in server applications - WinHeap will make a significant difference, thus reducing implementation costs and development time for applications.

Ian

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.