Cheap fast disc
I'm trying to write some code which needs to do lots of passes which read a ~60G file and write another one (what do you mean, not everybody dreams of a size 2^33 mod-{2^61-1} NTT?). It's streaming rather than seeking (read 4GB, write 1GB in each of four places, repeat ...), so basically I want fast transfer rates. I've also run out of space on my current discs.
I'm tempted to get two of the smallest-capacity models of a modern range of hard drives (eg something like [url]http://www.scan.co.uk/Products/320GB-Hitachi-HDT721032SLA360-Deskstar-7K1000B-SATA-3Gb-s-7200-rpm-16MB-Cache-0-ms[/url]) and use them under Linux software RAID0. Is this sensible? IIRC modern drives read from one platter at a time, so I should expect this one-platter model to have the same 90MB/s read-and-write rates as the terabyte model measured at [url]http://www.tomshardware.com/reviews/hitachi-western-digital-terabyte,2017-6.html[/url], and two of them together ought to be able to read or write ten gigabytes a minute. It's an X58 motherboard so I assume it can handle two independent fast SATA channels - the SATA controllers are at the end of a PCIe x4 link which is 2GB/sec so not a bottleneck. SSD would be four times the price for a 120G drive, which is not quite enough capacity. 150G VelociRaptor is twice the price, and sort of just about big enough, though might be a bit faster. |
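As a rough check on the arithmetic in the post above, here is a minimal Python sketch. The 90MB/s per-drive figure is the one quoted from the review; the perfectly linear RAID0 scaling, the decimal units and the function names are assumptions made for illustration, not measurements.

```python
def raid0_rate_mb_s(per_drive_mb_s, drives):
    """Ideal striped throughput, ASSUMING perfect linear scaling across drives."""
    return per_drive_mb_s * drives


def gb_per_minute(rate_mb_s):
    """Convert MB/s to GB/minute using decimal units (1 GB = 1000 MB)."""
    return rate_mb_s * 60 / 1000


def seconds_to_stream(size_gb, rate_mb_s):
    """Time to read (or write) size_gb once at a steady sequential rate."""
    return size_gb * 1000 / rate_mb_s


# 90 MB/s per drive is the figure from the post; two drives in RAID0.
rate = raid0_rate_mb_s(90, 2)
print("GB per minute:", gb_per_minute(rate))
print("seconds for one 60 GB read pass:", seconds_to_stream(60, rate))
```

Under those assumptions the pair comes out a little over ten gigabytes a minute, and one read pass over the 60G file takes between five and six minutes. Real drives slow down towards the inner tracks, so treat this as an upper bound.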
[QUOTE=fivemack;166325]It's streaming rather than seeking (read 4GB, write 1GB in each of four places, repeat ...), so basically I want fast transfer rates. I've also run out of space on my current discs.[/QUOTE]Best thing to do would be to spread the data over 5 disks. No RAID: read on disk 1, write on 2, 3, 4 and 5. One busy file per disk.[QUOTE=fivemack;166325]SSD would be four times the price for a 120G drive, which is not quite enough capacity. 150G VelociRaptor is twice the price, and sort of just about big enough, though might be a bit faster.[/QUOTE]Be sure to read the AnandTech article about SSDs before going that way: [url=www.anandtech.com/storage/showdoc.aspx?i=3531]The SSD Anthology: Understanding SSDs and New Drives from OCZ[/url]. There is a limit on the number of writes to SSDs, along with other limitations, but it may be interesting for you depending on exactly what the disk usage will be.
Jacob |
The latest generation of hard disk drives fits 500GB on each platter. So I'd try for a pair of 500GB one-platter drives at 7200rpm; the higher data density should give better linear read performance than a 320GB drive at the same rpm. I have a Samsung HD502HI now, which is a 500GB one-platter drive at only 5400rpm (very quiet drive though!), but afaik Samsung and Seagate offer 500GB one-platter drives at 7200rpm as well.
Alex |
[QUOTE=S485122;166326]Best thing to do would be to spread the data over 5 disks. No RAID : read on disk 1 write on 2, 3, 4 and 5.... One busy file per disk[/QUOTE]
I don't think that can be right; it seems very constrained by the read rate from a single disc. Four discs, two partitions per disc, one set of partitions bonded as RAID0 for reading and the others left alone for writing, ought to get full performance. But I don't want the cost and the electricity use of four discs; I'm not really prepared to spend 30% more to get 10% more speed by using 500G/platter rather than 333G/platter models. |
[QUOTE=fivemack;166367]I don't think that can be right; it seems very constrained by the read rate from a single disc. Four discs, two partitions per disc, one set of partitions bonded as RAID0 for reading and the others left alone for writing, ought to get full performance. But I don't want the cost and the electricity use of four discs; I'm not really prepared to spend 30% more to get 10% more speed by using 500G/platter rather than 333G/platter models.[/QUOTE]
If you're only willing to spring for two drives then you can't get more bandwidth without buying more drives. You could just read from one drive and write to another and you probably won't improve much with other partitioning schemes. |
[QUOTE=lfm;166454]If you're only willing to spring for two drives then you can't get more bandwidth without buying more drives. You could just read from one drive and write to another and you probably won't improve much with other partitioning schemes.[/QUOTE]
If he is reading and then writing but not doing both at the same time, it should be faster to have those two drives in RAID-0. You basically double your read and write speeds because it reads/writes half the file to one disc and half to the other simultaneously. So if he is reading in a 4GB file, then doing something and writing four 1 GB files, using RAID-0 would halve the read time and halve the write time compared to just using 1 drive for reading and 1 drive for writing. |
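The two positions above can be compared with a small back-of-the-envelope model. This is only a sketch: it assumes 90MB/s per drive (the figure from the first post), ideal RAID0 scaling, and no seek penalty when a drive switches between files. The layout names are made up for illustration.

```python
READ_GB = 4.0      # one 4GB read per cycle, as described in the thread
WRITE_GB = 4.0     # four 1GB writes per cycle
RATE_MB_S = 90.0   # ASSUMED per-drive sequential rate


def secs(gb, rate_mb_s):
    """Seconds to transfer gb gigabytes at rate_mb_s (decimal units)."""
    return gb * 1000 / rate_mb_s


def separate_drives_sequential():
    """Read from drive A, then write to drive B, with no overlap."""
    return secs(READ_GB, RATE_MB_S) + secs(WRITE_GB, RATE_MB_S)


def raid0_sequential():
    """Both phases run on a two-drive stripe at twice the single-drive rate."""
    return secs(READ_GB, 2 * RATE_MB_S) + secs(WRITE_GB, 2 * RATE_MB_S)


def separate_drives_overlapped():
    """Read from drive A while writing to drive B at the same time."""
    return max(secs(READ_GB, RATE_MB_S), secs(WRITE_GB, RATE_MB_S))


for layout in (separate_drives_sequential, raid0_sequential,
               separate_drives_overlapped):
    print(f"{layout.__name__}: {layout():.1f} s per cycle")
```

In this idealised model RAID-0 halves the cycle time when reads and writes alternate, while two separate drives match it only if the program overlaps its reading and writing. Which case applies depends on how the code is structured.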
If you often read/write such big files, I think the most important thing is the disk cache.
Nowadays you can get 64GB of RAM for about 1200-1600 EUR (plus a motherboard which can handle that). You will get 50-100 times more throughput, provided your threads are not all reading/writing at the same time and you can actually produce the data that fast. |
Yes, if I'm willing to spend £400 on a Tyan S2937 board, £500 on two CPUs for it and a further £1600 on sixteen 4GB memory modules, I could put 64GB on a single machine, and it would be extremely fast.
But spending a thirtieth of that on disc drives is getting me something which is, I am reasonably sure, adequate for my purposes. |
Besides the hardware aspect of the question, take a look at ext4. I've been using it for 2 weeks and it looks pretty good. Interesting benchmark here: [url]http://www.phoronix.com/scan.php?page=article&item=ext4_benchmarks&num=1[/url]
|
[quote=fivemack;167171]Yes, if I'm willing to spend £400 on a Tyan S2937 board, £500 on two CPUs for it and a further £1600 on sixteen 4GB memory modules, I could put 64GB on a single machine, and it would be extremely fast.
But spending a thirtieth of that on disc drives is getting me something which is, I am reasonably sure, adequate for my purposes.[/quote] How about $400 US for this ramdisk - [URL]http://www.acard.com.tw/english/fb01-product.jsp?idno_no=270&prod_no=ANS-9010&type1_title=%20Solid%20State%20Drive&type1_idno=13[/URL] and then the expense of 8x 8GB DDR2 DIMMs? I can provide feedback on its performance circa late May - I am ordering one at the end of this month [those dreaded paycheck vs bills vs child's needs tradeoffs]. Its sustained data transfer rate "should be" approximately 4x that of 10,000 rpm SATA drives. C++ya, xkey [who apparently had to re-register after umpteen years of silently reading] |
First question is whether the reads/writes will be concurrent or interleaved. If they are concurrent, you'll want separate drives so you can do both at the same time.
Are you going to have >4GB RAM (leaving room for the OS)? Maybe you should read 2GB at a time? Or possibly write out 2 GB while also reading in the next 2 GB? Check to make sure there are Intel RAID drivers available for *nix as well. I think there are some serious issues regarding Intel MatrixRAID and *nix. |
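The "write out one chunk while reading in the next" idea can be sketched with a reader thread and a bounded queue. This is a minimal illustration, not the original poster's code: `pipelined_copy`, its parameters and the small default chunk size are hypothetical, and the "processing" step is a plain copy.

```python
import queue
import threading


def pipelined_copy(src_path, dst_path, chunk_bytes=64 * 1024 * 1024, depth=1):
    """Copy src to dst, reading the next chunk while the current one is written.

    Up to depth + 2 chunks can be in memory at once: one just read, `depth`
    waiting in the queue, and one being written.
    """
    chunks = queue.Queue(maxsize=depth)

    def reader():
        with open(src_path, "rb") as src:
            while True:
                data = src.read(chunk_bytes)
                chunks.put(data)   # blocks while the queue is full
                if not data:       # empty bytes object marks end of file
                    return

    # daemon=True so a failure in the writer cannot leave the process hanging
    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    total = 0
    with open(dst_path, "wb") as dst:
        while True:
            data = chunks.get()
            if not data:
                break
            dst.write(data)        # a real program would transform data first
            total += len(data)
    worker.join()
    return total
```

With 2GB chunks, as suggested above, `chunk_bytes` would be passed explicitly and memory use would be roughly three chunks at the default depth. The overlap only helps if source and destination are on different drives, or on an array fast enough to serve both streams.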