Let's Set Stuff Up
In my previous post I outlined a procedure for transferring disk images over the network as image files. Along the way I noticed some interesting behavior when I tried to save time by compressing the data, both in-flight and pre-compressed.
I thought it would be worth running some transfer tests to see what conclusions could be drawn from the results.
First things first. Let's create a few test files. First, a 1 gigabyte file of near-nothings. This would be like a drive image of a new hard disk that is mostly unallocated sectors, therefore highly compressible. On my OS X system, I run:
time dd if=/dev/zero of=./1_gig_zeroed_blocks_fw.img bs=1g count=1
...and let it go. time will time the file creation, and dd reads from the zero (nothing?) device and writes a single one-gigabyte block (bs is the block size, count is the number of blocks) to my external FireWire disk drive. Note that I'm using OS X; on Linux, dd's options are a little different, such as bs=1G instead of the lowercase g.
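For example, a rough Linux equivalent of the command above would look like this (a sketch; I didn't run it for these tests):
time dd if=/dev/zero of=./1_gig_zeroed_blocks.img bs=1G count=1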
...looks good. Let's create one on my machine's internal SSD drive:
time dd if=/dev/zero of=./1_gig_zeroed_blocks_ssd.img bs=1g count=1
Next, let's create a file of random data. This would be like a hard disk image full of software and other non-zero content on the filesystem.
On my FireWire drive:
time dd if=/dev/random of=./1_gig_random_blocks_fw.img bs=1g count=1
On the SSD drive:
time dd if=/dev/random of=./1_gig_random_blocks_ssd.img bs=1g count=1
So far so good.
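For reference, something like the following gathers the sizes and hashes shown in the tables (md5 is the OS X command; on Linux it's md5sum):
ls -al 1_gig_*.img
md5 1_gig_*.img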
File | FireWire, zeroed | FireWire, random |
---|---|---|
Time to create (or compress) | 0m16s | 2m11s |
Size (ls -al) | 1073741824 | 1073741824 |
Size (ls -alh) | 1.0G | 1.0G |
MD5 | cd573cfaace07e7949bc0c46028904ff | 9e58b0cbae41c2ad7a91cb6b8f2cd6a0 |
File | SSD, zeroed | SSD, random |
---|---|---|
Time to create (or compress) | 0m4s | 1m24s |
Size (ls -al) | 1073741824 | 1073741824 |
Size (ls -alh) | 1.0G | 1.0G |
MD5 | cd573cfaace07e7949bc0c46028904ff | 0aa62e9b1922cb8a34623afff6648981 |
There's something to note here: using /dev/random created genuinely random files, and the MD5 hashes show that the two random files are indeed different. But the two separately created files of output from /dev/zero have the same hash, because /dev/zero always gives the same data: nothing. These are the two extremes, from a drive full of unused space to one totally filled with a mishmash of information. Your drive contents will fall somewhere between them, of course, and as the drive fills and has data "deleted" and overwritten it will gradually move toward the mishmash side.
This also illustrates how much faster the SSD is than the external FireWire drive, although I can't say I'm too surprised at that.
Second, let's create some compressed files. I expect the compressed zeroed-blocks files to be significantly smaller than the random-blocks versions. On the FireWire drive:
time bzip2 -c 1_gig_random_blocks_fw.img > 1_gig_random_blocks_fw.img.bz2
time gzip -c 1_gig_random_blocks_fw.img > 1_gig_random_blocks_fw.img.gz
File | FireWire, random, bzip2 | FireWire, random, gzip |
---|---|---|
Time to create (or compress) | 3m46s | 0m39s |
Size (ls -al) | 1078480063 | 1074069404 |
Size (ls -alh) | 1.0G | 1.0G |
MD5 | 1584ac641ae4989ef6439000e9a591b9 | 7e41488b6f4ffbb1b87f4d49dbc7ea02 |
Holy cow!
Compression works, at the highest level of abstraction, by finding patterns in the data and substituting a shorthand for them. Without forcing any "ultra" compression levels, the files barely shrank. In fact, they grew, with the overhead of the compression metadata tacked on. The random data simply couldn't be compressed, much like trying to compress already-compressed data.
time bzip2 -c 1_gig_zeroed_blocks_fw.img > 1_gig_zeroed_blocks_fw.img.bz2
time gzip -c 1_gig_zeroed_blocks_fw.img > 1_gig_zeroed_blocks_fw.img.gz
File | FireWire, zeroed, bzip2 | FireWire, zeroed, gzip |
---|---|---|
Time to create (or compress) | 0m33s | 0m6s |
Size (ls -al) | 785 | 1043683 |
Size (ls -alh) | 785B | 1.0M |
MD5 | 9192d766e556ac3c470bff28a0af7b04 | eaee7c163450c9f739eb16599f1633ea |
And another wow. Remember how compression substitutes shorthand for patterns? It would appear that "nothing" compresses extremely well.
(If you want some fun, you can really freak someone out by emailing them a 1 megabyte file with instructions to unzip it...no, don't do that. It's not nice.)
Again, these are extreme examples. You can't tell what the real-world performance is for the two compressors from these files.
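(As an aside, you can see the "compressing compressed data" effect directly by running one of these archives back through a compressor; it should come out about the same size or a bit larger. The output file name here is just for illustration:)
gzip -c 1_gig_random_blocks_fw.img.bz2 > recompressed_test.bz2.gz
ls -al 1_gig_random_blocks_fw.img.bz2 recompressed_test.bz2.gz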
On the SSD drive:
time bzip2 -c 1_gig_random_blocks_ssd.img > 1_gig_random_blocks_ssd.img.bz2
time gzip -c 1_gig_random_blocks_ssd.img > 1_gig_random_blocks_ssd.img.gz
File | SSD, random, bzip2 | SSD, random, gzip |
---|---|---|
Time to create (or compress) | 3m41s | 0m39s |
Size (ls -al) | 1078489250 | 1074069405 |
Size (ls -alh) | 1.0G | 1.0G |
MD5 | ccfca3a4aafd0bf32b1a0643097af1b4 | e9d84661c44c00828610b4908ec33dd1 |
Well, this is kind of interesting...the times to compress the files are close to the times for the random files on the FireWire drive. That suggests the compressors, not the drives, were the bottleneck.
But how could compressing a file take less time than creating it in the first place? I'd guess it's filesystem caching; the data hadn't all been flushed to the drive yet, but as far as the applications were concerned, it was done. Don't turn off the computer without flushing buffers first or you'll get a whoopsie.
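(Speaking of flushing: if you want to be sure everything has actually hit the disk before timing something or powering off, the standard sync command forces the buffers out to disk; it works the same on OS X and Linux.)
sync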
time bzip2 -c 1_gig_zeroed_blocks_ssd.img > 1_gig_zeroed_blocks_ssd.img.bz2
time gzip -c 1_gig_zeroed_blocks_ssd.img > 1_gig_zeroed_blocks_ssd.img.gz
File | SSD, zeroed, bzip2 | SSD, zeroed, gzip |
---|---|---|
Time to create (or compress) | 0m12s | 0m6s |
Size (ls -al) | 785 | 1043684 |
Size (ls -alh) | 785B | 1.0M |
MD5 | 9192d766e556ac3c470bff28a0af7b04 | dd03953bfcd54f3c5a8457978079ec36 |
Still extremely small for the zeroed files. Also notice that the MD5s for the zeroed bzip2 files match between the SSD and the FireWire drives, but the gzip files do not. That's kind of interesting...something in the metadata must be different. The likely culprit is gzip's header, which records a modification timestamp (and, by default, the original file name); the bzip2 format carries no such metadata, so identical input produces identical output.
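If you want to test that theory, gzip's -n flag tells it not to store the name and timestamp, so compressing the two zeroed files that way should, in theory, produce matching hashes (the output file names here are just for illustration):
gzip -nc 1_gig_zeroed_blocks_fw.img > zeroed_fw_noname.img.gz
gzip -nc 1_gig_zeroed_blocks_ssd.img > zeroed_ssd_noname.img.gz
md5 zeroed_fw_noname.img.gz zeroed_ssd_noname.img.gz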
Does It Matter If I Cat Instead Of DD?
Let's test a 1-gig file transfer and see what time it takes.
On the Linux machine, I tell it to listen for the incoming file. Straight copy with dd, and my working directory is a mounted internal 500GB drive.
nc -l 19000 | dd of=./test.img
On my sending machine, I send the file from my internal SSD drive:
time cat 1_gig_random_blocks_ssd.img | nc <target machine ip> 19000
The result: 1m37s.
MD5Sum: 0aa62e9b1922cb8a34623afff6648981 test.img
I had to use time from the Mac (sending) system because timing the open, "listening" side would include the time spent waiting for me to actually start sending data. Timing from the sending system measures from the start of sending to the closing of the connection.
Now I'll try again, using cat on the receiving side. On the target, after deleting the image:
nc -l 19000 | cat > ./test.img
On the sending machine, same as before:
time cat 1_gig_random_blocks_ssd.img | nc <target machine ip> 19000
The result: 0m13s
MD5Sum: 0aa62e9b1922cb8a34623afff6648981 test.img
Conclusion: Yes. dd and cat behave differently on the receiving end, and the choice has a dramatic effect on the speed with which the file is transferred.
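My guess, untested here, is that dd's default 512-byte block size is the culprit when it's fed from a pipe; giving the receiving side a bigger block size would probably close most of the gap (uppercase M, since the receiver is the Linux box):
nc -l 19000 | dd of=./test.img bs=1M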
Does It Matter If The File Is Mostly "Empty" With Cat?
Let's try transferring the zeroed file versus the random file. Several runs will give a good idea of how much variation in times there are.
On the target machine:
nc -l 19000 | cat > ./test.img
While on the source machine:
time cat 1_gig_random_blocks_ssd.img | nc <target machine ip> 19000
The result: 0m11s, 0m12s, 0m12s
MD5Sum: 0aa62e9b1922cb8a34623afff6648981 test.img
Now, I transfer the mostly empty file. On the target:
nc -l 19000 | cat > ./test.img
On the source:
time cat 1_gig_zeroed_blocks_ssd.img | nc <target machine ip> 19000
The result: 0m13s, 0m12s, 0m11s
MD5Sum: cd573cfaace07e7949bc0c46028904ff test.img
Conclusion: Not a significant difference. A big file is a big file. Period.
What If I Repeat It With DD? Is a "Zeroed" File Faster?
Let's find out. On the target system:
nc -l 19000 | dd of=./test.img
On the source:
time cat 1_gig_random_blocks_ssd.img | nc <target machine ip> 19000
The result: 1m38s, 1m36s, 1m37s
MD5Sum: 0aa62e9b1922cb8a34623afff6648981 test.img
Now with the zeroed file. On the target:
nc -l 19000 | dd of=./test.img
And the source:
time cat 1_gig_zeroed_blocks_ssd.img | nc <target machine ip> 19000
The result: 1m37s, 1m37s, 1m37s
MD5Sum: cd573cfaace07e7949bc0c46028904ff test.img
Conclusion: No, the zeroed file doesn't make any difference with dd compared to writing the random-bits file.
Does Transferring, Using Cat, From My External FireWire Drive Affect The Transfer Time?
Maybe? This would test if the FireWire drive is a bigger bottleneck than the network.
When I was doing the previously-blogged drive image transfer, I had a 500 gig file at the time, and the only drive with enough space to hold an image that size was the external FireWire drive. I wondered whether the internal SSD drive would have significantly sped up the process, and hoped that I hadn't wasted much time, i.e. that the network, not the drive's transfer rate, was the real bottleneck.
So to test this I once again set up the target to listen for the transfer:
nc -l 19000 | cat > test.img
And on my source, I cd to my FireWire drive with the set of FireWire-hosted test images and send one over the network:
time cat 1_gig_random_blocks_fw.img | nc <target machine ip> 19000
The result: 0m43s, 0m12s, 0m11s
MD5Sum: 9e58b0cbae41c2ad7a91cb6b8f2cd6a0 test.img
...What? It fell from 43 seconds down to 12 seconds? The only explanation I have is file caching...
Let's first compare to a transfer from the SSD; I use the same command on the target:
nc -l 19000 | cat > test.img
Then from the SSD drive on the source:
time cat 1_gig_random_blocks_ssd.img | nc <target machine ip> 19000
Result: 0m12s, 0m11s, 0m12s
MD5Sum: 0aa62e9b1922cb8a34623afff6648981 test.img
Now let's try that FireWire again...
Target:
nc -l 19000 | cat > test.img
Source:
time cat 1_gig_random_blocks_fw.img | nc <target machine ip> 19000
Results: 0m22s, 0m12s, 0m12s
And if I try a 1 gig file from the FireWire drive that wasn't transferred before?
On the source:
time cat 1_gig_zeroed_blocks_fw.img | nc <target machine ip> 19000
Results: 0m43s, 0m12s, 0m12s
The same giant time drop after the first iteration. I wonder if running
sudo purge
...between tests would affect it? The purge command is supposed to flush disk cache.
Let's see; repeating the above, with the random block img file as before, but running purge between runs...
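In other words, each pass on the source side looked roughly like this (the nc listener on the target has to be restarted for every run, since it exits when the connection closes):
sudo purge
time cat 1_gig_random_blocks_fw.img | nc <target machine ip> 19000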
The results: 0m22s, 0m44s, 0m43s
Yup, it's a cache effect. The first run still benefits from a partially warm cache left over from the previous tests (I didn't purge the cache before that first run).
Conclusion: Yes, there is a difference, but with a 1 gig file the memory caching in OS X will hide it after the first run, as long as you're repeating the process; the cache levels the field. On the first run, though, the SSD spanked the FireWire transfer quite soundly, and with files too large to fit in the cache, subsequent runs probably won't speed up either.
Transferring the Compressed Files
When imaging systems, I would think that storing the drive image as a compressed file on the source system would speed up the process compared to transferring an image that was 1:1 in size. But does it really? And if it does have an effect, how much does the compression affect the transfer time?
Here I have a couple of files that are tiny relative to their decompressed size. If I do the transfer, will the tiny file dump across the network and the connection close, leaving the remote system to decompress and write the file to disk? Or is there some mechanism that keeps the network connection from closing, making any savings in imaging time dependent on how fast the decompression runs?
Let's find out. On the target system:
nc -l 19000 | bzip2 -d | cat > test.img
On the source:
time cat 1_gig_zeroed_blocks_ssd.img.bz2 | nc <target machine ip> 19000
The bzip2 version of the zeroed file is 785 bytes and expands to a gig, making the ratio of compressed to decompressed size absolutely huge. Dumping the file over the network should take only a moment; I'm fairly sure decompressing and writing the file will take longer than sending 785 bytes over the wire. So how long did the above commands take? (And I ran purge between attempts...)
Results: 0.007s, 0.007s, 0.007s
MD5Sum of the decompressed file: cd573cfaace07e7949bc0c46028904ff test.img
The MD5 of the decompressed file shows that it did turn back into the original file. And the speed on the source side implies that the sender does indeed dump the compressed file and close the network connection once the data is transferred, leaving the target system to decompress and write the file out from a buffer. So what is the time difference between the two sides, the sender finishing its send versus the target finishing the decompress-and-write?
There are a few ways to get a rough idea, but I'm going to use iTerm's ability to mirror keystrokes to multiple sessions (it calls this "broadcast input") so I can set up the two commands on the two systems, both wrapped in time, and hit Enter on both at the same moment. The result?
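Concretely, the two timed commands were these. On the target:
time nc -l 19000 | bzip2 -d | cat > test.img
And on the source:
time cat 1_gig_zeroed_blocks_ssd.img.bz2 | nc <target machine ip> 19000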
Source: 0.032s
Target: 9.6s
Definite discrepancy.
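Another way to watch the lag on the target side, if pv (pipe viewer) happens to be installed there, would be to drop it into the receiving pipeline and watch the throughput in real time (a sketch, not something I ran here):
nc -l 19000 | bzip2 -d | pv > test.img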
What if I do the same thing with the gzipped version? On the target:
time nc -l 19000 | gzip -d | cat > test.img
On the source:
time cat 1_gig_zeroed_blocks_ssd.img.gz | nc <target machine ip> 19000
And the results:
Target: 7.9s
Source: 7.4s
How long does it take to transfer a random 1 megabyte file, similar in size to the gzipped file? First I create a file to transfer:
dd if=/dev/random of=./1_meg_random_blocks_ssd.img bs=1m count=1
Just for comparison:
1043684 Mar 25 16:55 1_gig_zeroed_blocks_ssd.img.gz
1048576 Apr 6 17:11 1_meg_random_blocks_ssd.img
Then transfer; on the target:
time nc -l 19000 | cat > test.img
And send from the source:
time cat 1_meg_random_blocks_ssd.img | nc <target machine ip> 19000
And the results:
Target: 0.019s
Source: 0.018s
Strange. I don't know how to explain it, at least not without digging in with Wireshark to see whether anything flows back the other way. My best guess involves buffers: the 785-byte bzip2 file fits entirely inside the pipe and socket buffers, so the sender can dump it and close almost instantly, while the roughly 1 megabyte gzip file is bigger than those buffers, so the sender has to wait while gzip on the target drains them as it decompresses and writes.
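If I ever do dig in, a capture on the sending side along these lines would show what's flowing in each direction (en0 is an assumption for the Mac's network interface):
sudo tcpdump -i en0 -w nc_transfer.pcap host <target machine ip> and port 19000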
Conclusion: There does appear to be some influence from buffers, but the sending side will close the socket as soon as it has handed the whole file off to the target. Depending on the decompressor implementation and the file sizes involved, sending a compressed file can help.
My Takeaways: After running this series of rough benchmarks, I found that several of my preconceived notions were inaccurate. You shouldn't simply count on suppositions that should make sense to actually make sense.
There are several questions brought up by the results, and I don't have definitive answers to all of them. Ironically I have speculation that makes sense...but that's what I was testing in the first place.