[Cyberduck-trac] [Cyberduck] #10278: Optimize Checksum Calculation
Cyberduck
trac at cyberduck.io
Thu Mar 15 09:14:57 UTC 2018
#10278: Optimize Checksum Calculation
----------------------------+-------------------------
Reporter: allklier | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: core | Version: 6.4.1
Severity: normal | Keywords:
Architecture: | Platform: macOS 10.12
----------------------------+-------------------------
Two suggestions to optimize checksum calculation while uploading to S3.
I frequently upload very large files (75-100GB) to S3, and the checksum
calculation adds a significant delay in a time-sensitive workflow. I was
just uploading a 75GB file, and the checksum calculation took 10 minutes
before the actual upload started. The actual upload took 32 minutes, so
the checksum adds roughly a 33% time penalty, which is significant and
very unfortunate.
- Compute the checksum during the upload, rather than in a separate
pre-calc pass. Yes, that weakens the checksum's redundancy because the
file is only read once, but errors are more likely during the upload
than during a local disk read.
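The single-pass idea above can be sketched as follows. This is a minimal illustration, not Cyberduck's actual transfer code: the `send` callback stands in for whatever writes a chunk to the network, and the digest is updated on the same buffer as it is sent, so the file is read from disk only once.

```python
import hashlib
import io

def upload_with_inline_checksum(src, send, chunk_size=1 << 20):
    """Stream src to send() while updating a SHA-256 digest,
    so no separate pre-calculation pass over the file is needed."""
    digest = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)  # checksum advances in step with the upload
        send(chunk)           # hypothetical transfer callback
    return digest.hexdigest()

# Usage: simulate an upload from an in-memory "file"
data = b"example payload" * 1000
sent = []
checksum = upload_with_inline_checksum(io.BytesIO(data), sent.append)
assert b"".join(sent) == data
assert checksum == hashlib.sha256(data).hexdigest()
```

Note that for S3 multipart uploads the same pattern can be applied per part, hashing each part's buffer just before it is transmitted.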
- The algorithm for reading the file during checksum calculation seems
slow. My primary storage (RAID5) supports read bandwidth in excess of
400MB/s, yet during checksum calculation the read speed never exceeds
120MB/s, so the calculation is limited by the code, not by I/O
bandwidth.
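One common cause of the gap described above is a small read buffer, which multiplies per-read overhead (syscalls, digest-update calls) without changing the result. A minimal sketch, assuming the bottleneck is buffer size rather than the hash implementation itself:

```python
import hashlib
import io

def checksum(stream, chunk_size):
    """Hash a stream in fixed-size chunks; the digest is
    independent of the chunk size used to read it."""
    d = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        d.update(chunk)
    return d.hexdigest()

data = bytes(range(256)) * 4096  # 1 MiB sample payload
small = checksum(io.BytesIO(data), 4 * 1024)      # 4 KiB reads: many calls
large = checksum(io.BytesIO(data), 1024 * 1024)   # 1 MiB reads: far fewer
assert small == large  # identical result; only per-call overhead differs
```

Raising the buffer size is therefore a safe optimization to try: the computed checksum is unchanged, and on fast storage the larger reads keep the disk closer to its sequential bandwidth.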
--
Ticket URL: <https://trac.cyberduck.io/ticket/10278>
Cyberduck <https://cyberduck.io>
Libre FTP, SFTP, WebDAV, S3 & OpenStack Swift browser for Mac and Windows