S3 sync: s3 -> local redownloads unchanged files
Problem
We store a pile of files in S3, and it's handy to have a local copy of our S3 buckets for development and backup. At first glance `aws s3 sync` looks like it'll work. I ran sync on our entire bucket and it completed successfully, downloading the whole bucket to local disk. The second time I ran the command, it redownloaded some files that hadn't changed (on S3 or locally) alongside the new ones. [code block] These files had just been downloaded by the first `sync`. The local modified time and size match S3's values. While I never rule out the possibility of user error, I don't see an obvious cause. The first S3->Local sync completed normally; when I run it again, it redownloads _some_ unchanged files every time. Not all, just some, and it's the same files every time. My CLI version is `aws-cli/1.2.13 Python/2.7.6 Darwin/10.8.0`. This may or may not be related to issue #599, but I won't personally make that call.
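To double-check the claim that the local modified time and size match S3's values, a small script can compare them file by file. The following is a minimal sketch, not the comparison that `aws s3 sync` itself performs: the function name `looks_unchanged`, the `tolerance_s` parameter, and the rule "same size and mtime within a tolerance means unchanged" are all assumptions made for illustration. The remote size and timestamp would have to come from your own S3 listing.

```python
import os
from datetime import datetime, timezone


def looks_unchanged(local_path, remote_size, remote_mtime, tolerance_s=1.0):
    """Hypothetical helper: True if the local file has the same size as the
    remote object and its mtime is within tolerance_s seconds of remote_mtime.

    remote_mtime must be a timezone-aware datetime.
    This is an assumed rule, not the aws-cli implementation.
    """
    if not os.path.isfile(local_path):
        return False  # a missing file can never count as unchanged
    st = os.stat(local_path)
    local_mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    same_size = st.st_size == remote_size
    same_time = abs((local_mtime - remote_mtime).total_seconds()) <= tolerance_s
    return same_size and same_time
```

If this returns `True` for a file that `sync` still redownloads, the mismatch is probably not in the size or timestamp themselves.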
1 Fix
Solution: S3 sync: s3 -> local redownloads unchanged files
Output: [code block] These files exist locally: the first `sync` downloaded them: [code block] Additional `sync`s certainly grab new files that _don't_ exist locally, as expected, but `sync` also reports on each run that the same files don't exist and then tries to redownload them. The files in my log example are large movie files, but the issue doesn't appear to be isolated to one file size or type.
- 1
These files exist locally: the first `sync` downloaded them:
```
sh-3.2# stat /Volumes/RAID/home/backup/s3backup/serc/files/NAGTWorkshops/rtop/introductory_meteorology.mov
234881029 53464380 -rw-r--r-- 1 root com.apple.local.ard_admin 0 644979152 "Nov 22 12:54:03 2013" "Feb 13 12:08:57 2014" "Feb 13 12:08:57 2014" "Nov 22 12:54:03 2013" 4096 1259728 0 /Volumes/RAID/home/backup/s3backup/serc/files/NAGTWorkshops/rtop/introductory_meteorology.mov
```
- 2
Additional `sync`s certainly grab new files that _don't_ exist locally, as expected, but `sync` also reports on each run that the same files don't exist and then tries to redownload them.
- 3
The files in my log example are large movie files, but the issue doesn't appear to be isolated to one file size or type. There are plenty of small thumbnail images and other file types, too.
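Since `sync` reports the same files as missing on every run, it can help to confirm independently which of those keys really are absent on disk. The sketch below assumes that each S3 key maps to a local path by joining its `/`-separated parts under the sync destination. The function name `missing_locally` and its parameters are hypothetical, and the list of keys would be copied by hand from the `sync` output.

```python
import os


def missing_locally(local_root, keys):
    """Hypothetical helper: return the keys whose mapped local path
    is not an existing regular file under local_root."""
    missing = []
    for key in keys:
        # Assumption: the key's "/"-separated parts mirror the local layout.
        local_path = os.path.join(local_root, *key.split("/"))
        if not os.path.isfile(local_path):
            missing.append(key)
    return missing
```

An empty result for keys that `sync` keeps redownloading would suggest the files are present and that the "does not exist" decision is being made on something other than the path as it appears on disk.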
Validation
Resolved in aws/aws-cli GitHub issue #648. Community reactions: 1 upvote.
Submitted by
Alex Chen
2450 rep