[Cyberduck-trac] [Cyberduck] #11833: Extremely slow S3 downloads of large files
Cyberduck
trac at cyberduck.io
Sat Oct 2 16:29:57 UTC 2021
#11833: Extremely slow S3 downloads of large files
------------------------+-----------------------
Reporter: allklier | Owner:
Type: defect | Status: new
Priority: normal | Milestone:
Component: core | Version: 7.10.2
Severity: normal | Resolution:
Keywords: | Architecture: Intel
Platform: Windows 10 |
------------------------+-----------------------
Description changed by allklier:
Old description:
> For the second time in a few weeks I've faced massive delays in
> downloading large files from an S3 bucket.
>
> In the latest case it's a single 220GB video file. In another case it
> was a number of larger video clips totaling 150GB.
>
> The actual download proceeds at network speed (Gigabit Internet);
> however, after the last segment has been fetched, Cyberduck sits for
> hours at 100% while it assembles the segments into a single file,
> despite a very fast RAID (800MB/s transfer rates).
>
> It seems it shouldn't take that long if the code were to simply read
> each segment and concatenate them. I'm assuming there must be some
> inefficiency, such as rewriting the entire file with each segment,
> since that is the only reason this would slow down so sharply with
> file size.
>
> In this last download it created 104 2GB segments. The last segment
> (104) completed downloading at 10:37AM. It's now 12:19PM and the new
> combined file is still only 109GB (about 50% of the total).
>
> That practically makes Cyberduck unusable for files like this.
New description:
For the second time in a few weeks I've faced massive delays downloading
large files from an S3 bucket.
In the latest case it's a single 220GB video file. In another case it
was a number of larger video clips totaling 150GB.
The actual download proceeds at network speed (Gigabit Internet);
however, after the last segment has been fetched, Cyberduck sits for
hours at 100% while it assembles the segments into a single file,
despite a very fast RAID (800MB/s transfer rates).
It seems it shouldn't take that long if the code were simply reading
each segment and concatenating it to the target through file I/O. I'm
assuming there must be some inefficiency, since that is the only reason
this would slow down so disproportionately with file size. Maybe the
target file gets rewritten with each segment? Or the copy buffer is
very small, which penalizes a spinning disk with lots of seeks, since
it's copying from and to the same disk (the segments are located in a
temp subfolder).
In this last download it created 104 2GB segments. The last segment
(104) completed downloading at 10:37AM. It's now 12:19PM and the new
combined file is still only 109GB (about 50% of the total).
That practically makes Cyberduck unusable for files like this.
--
Ticket URL: <https://trac.cyberduck.io/ticket/11833#comment:1>
Cyberduck <https://cyberduck.io>
Libre FTP, SFTP, WebDAV, S3 & OpenStack Swift browser for Mac and Windows
More information about the Cyberduck-trac
mailing list