---
layout: post
title: "Amazon S3, Ruby and Rails slides"
---

The slides from the talk are "here":http://spatten_presentations.s3.amazonaws.com/s3-on-rails.pdf. (Yes, they're hosted on S3.)

There are two points in the presentation where I switched to a different window.

At the 'S3SH DEMO' slide, I did some live coding showing how you can work with S3 using s3sh. It basically followed the script shown in 's3sh demo script' below, so read that part when you see the 'S3SH DEMO' slide.

At the 'Example: S3Syncer' slide, I switched over to TextMate and showed the code for a simple script to synchronize a single directory to S3. I then demoed the script to show it working. So, when you see the 'Example: S3Syncer' slide, read the S3Syncer code and S3Syncer demo sections below.

h2. s3sh demo script

Start up s3sh
$> s3sh
Create a bucket. Show that you can create a bucket multiple times if you own it, but trying to create a bucket that somebody else owns raises an error.
>> Bucket.create('spatten_s3demo')
=> true
>> Bucket.create('spatten_s3demo')
=> true
>> Bucket.create('test')
AWS::S3::BucketAlreadyExists: The requested bucket name is not available. The bucket namespace is shared by all users of the system. Please select a different name and try again.
        from /usr/local/lib/ruby/gems/1.8/gems/aws-s3-0.4.0/bin/../lib/aws/s3/error.rb:38:in `raise'
        from /usr/local/lib/ruby/gems/1.8/gems/aws-s3-0.4.0/bin/../lib/aws/s3/base.rb:72:in `request'
        from /usr/local/lib/ruby/gems/1.8/gems/aws-s3-0.4.0/bin/../lib/aws/s3/base.rb:83:in `put'
        from /usr/local/lib/ruby/gems/1.8/gems/aws-s3-0.4.0/bin/../lib/aws/s3/bucket.rb:79:in `create'
        from (irb):3
You can save a bucket in a variable using @Bucket.find@
>> b = Bucket.find('spatten_s3demo')
=> #nil, "name"=>"spatten_s3demo", "marker"=>nil, "max_keys"=>1000, "is_truncated"=>false, "xmlns"=>"http://s3.amazonaws.com/doc/2006-03-01/"}, @object_cache=[]>
Create a text object
>> S3Object.store('test.txt', 'This is a test', 'spatten_s3demo')
=> #
>> b.objects
=> [#]
>> pp b.objects[0].about
{"last-modified"=>"Wed, 05 Dec 2007 19:56:49 GMT",
 "x-amz-id-2"=>
  "JACm9T+m9CgZhmj4q6q00OSGHgSyBVAbQ1cgRWGydYZLTKdhLc/IUZ+K7b/1snOc",
 "content-type"=>"text/plain",
 "etag"=>"\"ce114e4501d2f4e2dcea3e17b546f339\"",
 "date"=>"Wed, 05 Dec 2007 19:57:03 GMT",
 "x-amz-request-id"=>"CA170D2AA5DEB0C9",
 "server"=>"AmazonS3",
 "content-length"=>"14"}
=> nil
>> b.objects[0].key
=> "test.txt"
>> b.objects[0].value
=> "This is a test"
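The etag in the @about@ output is worth a second look. For an object stored with a single PUT, like this one, it is the MD5 hex digest of the object's body wrapped in double quotes. Here is a small local check using only Ruby's standard library, so no S3 connection is needed. The helper name @etag_matches?@ is just for illustration and is not part of aws-s3.

```ruby
require 'digest/md5'

# Compare an etag header value (quotes included, as S3 returns it)
# with the MD5 hex digest of a body held locally.
# etag_matches? is an illustrative helper, not an aws-s3 method.
def etag_matches?(header_etag, body)
  header_etag.delete('"') == Digest::MD5.hexdigest(body)
end

# The etag header and body from the test.txt object above.
header_etag = "\"ce114e4501d2f4e2dcea3e17b546f339\""
body = 'This is a test'

puts "local MD5:  #{Digest::MD5.hexdigest(body)}"
puts "etag match: #{etag_matches?(header_etag, body)}"
```

This is the same comparison the S3Syncer script further down relies on to decide whether a file has changed.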
Create a binary object and show it in a browser
>> S3Object.store('vampire.jpg', File.open('vampire.jpg'), 'spatten_s3demo')
=> #
Show the photo in a browser. This doesn't work, as the file is only readable by me. Make it publicly readable and try again.
>> S3Object.store('vampire.jpg', File.open('vampire.jpg'), 'spatten_s3demo', 
     :access => :public_read)
=> #
Show it in a browser again. It works this time. Look at bucket.objects. We have to reload the bucket to show the new object.
>> b.objects
=> [#]
>> b.objects(:reload)
=> [#, #]
Hash access to bucket objects
>> b['vampire.jpg']
=> #
>> vamp = b['vampire.jpg']
=> #
A look at metadata
>> vamp.content_type
=> "image/jpeg"
>> vamp.size
=> 10817
>> vamp.metadata
=> {}
>> vamp.metadata['subject'] = 'Claire'
=> "Claire"
>> vamp.metadata['photographer'] = 'Nadine Inkster'
=> "Nadine Inkster"
>> vamp.store
=> true
Storing the picture data in a variable
>> picdata = vamp.value
=> "\377\330\377\340\000\020JFIF\000\001\002\000.......
Downloading the picture by writing its data to a local file.
>> File.open('vampire_downloaded.jpg', 'w') {|file| file.write(vamp.value)}
=> 10817
>> exit
s3demo $>ls
flowers.jpg             vampire.jpg
test.txt                vampire_downloaded.jpg
s3demo $>open vampire_downloaded.jpg 
s3demo $>
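The download above pulls the whole object into memory with @vamp.value@ before writing it out, which is fine for a 10 KB photo. For bigger files you would want to copy chunk by chunk instead. The aws-s3 README describes an @S3Object.stream@ call that yields chunks for this purpose. The sketch below only imitates that pattern with standard library objects: @copy_in_chunks@ and @chunk_size@ are made-up names, and @StringIO@ stands in for both the remote object and the local file.

```ruby
require 'stringio'

# Copy from any readable IO to any writable IO in fixed-size chunks,
# so only one chunk is held in memory at a time.
# copy_in_chunks and chunk_size are illustrative names, not part of aws-s3.
def copy_in_chunks(source, destination, chunk_size = 4096)
  total = 0
  # IO#read(n) returns nil once the source is exhausted.
  while (chunk = source.read(chunk_size))
    total += destination.write(chunk)
  end
  total
end

# StringIO objects stand in for the remote object and the local file.
source = StringIO.new('x' * 10_817)
destination = StringIO.new
puts "copied #{copy_in_chunks(source, destination)} bytes"
```

When the destination is a real file holding binary data, opening it with mode @'wb'@ rather than @'w'@ keeps the bytes intact on platforms that translate line endings.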
h2. S3Syncer Code

Please note that this code is really only useful as an example of how to synchronize with S3. It won't recurse directories and it dies a horrible death if there are any symlinked files in a directory. If you are looking for something to synchronize directories, check out "s3sync.rb":http://s3sync.net/wiki.

{% highlight ruby %}
#!/usr/bin/env ruby

require 'digest/md5'
require 'aws/s3'
include AWS::S3

class S3Syncer
  attr_reader :local_files, :files_to_upload

  def initialize(directory, bucket_name)
    @directory = directory
    @bucket_name = bucket_name
  end

  def S3Syncer.sync(directory, bucket)
    syncer = S3Syncer.new(directory, bucket)
    syncer.get_local_files
    syncer.connect_to_s3
    syncer.get_bucket
    syncer.select_files_to_upload
    syncer.sync
  end

  # This does not recurse directories.
  def get_local_files
    @local_files = Dir.entries(@directory)
  end

  def connect_to_s3
    Base.establish_connection!(
      :access_key_id     => ENV['AMAZON_ACCESS_KEY_ID'],
      :secret_access_key => ENV['AMAZON_SECRET_ACCESS_KEY']
    )
    raise "\nERROR: Connection not made or bad access key " +
          "or bad secret access key. Exiting" unless AWS::S3::Base.connected?
  end

  def get_bucket
    Bucket.create(@bucket_name)
    @bucket = Bucket.find(@bucket_name)
  end

  # Files should be uploaded if
  #   The file doesn't exist in the bucket
  #   OR
  #   The MD5 hashes don't match
  def select_files_to_upload
    @files_to_upload = @local_files.select do |file|
      case
      when File.directory?(local_name(file))
        false # Don't upload directories
      when !@bucket[file]
        true  # Upload if file does not exist on S3
      when @bucket[file].etag != Digest::MD5.hexdigest(File.read(local_name(file)))
        true  # Upload if MD5 sums don't match
      else
        false # the MD5 matches and it exists already, so don't upload it
      end
    end
  end

  # This will choke on symlinked files
  def sync
    (puts "Directories are in sync"; return) if @files_to_upload.empty?
    @files_to_upload.each do |file|
      puts "#{file} ===> #{@bucket.name}:#{file}"
      S3Object.store(file, File.open(local_name(file), 'r'), @bucket_name)
    end
  end

  private

  def local_name(file)
    File.join(@directory, file)
  end
end

if __FILE__ == $0
  S3Syncer.sync('/Users/Scott/versioned/spattendesign/presentations/s3-on-rails/s3demo',
                'spatten_syncdemo')
end
{% endhighlight %}

h2. S3Syncer demo

Start with spatten_syncdemo bucket empty, and four files in the local directory. Run the script
s3demo $>ls
flowers.jpg             vampire.jpg
test.txt                vampire_downloaded.jpg
s3demo $>s3syncer
flowers.jpg ===> spatten_syncdemo:flowers.jpg
test.txt ===> spatten_syncdemo:test.txt
vampire.jpg ===> spatten_syncdemo:vampire.jpg
vampire_downloaded.jpg ===> spatten_syncdemo:vampire_downloaded.jpg
Run it again. This time it says there's no need to do anything.
s3demo $>s3syncer
Directories are in sync
Change a file locally and sync again
s3demo $> vi test.txt
Make some changes using vi
s3demo $>s3syncer
test.txt ===> spatten_syncdemo:test.txt
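Only test.txt goes up this time because its local MD5 no longer matches what the bucket has, while the other files still match. If you want to play with that decision without touching S3, here is a minimal sketch. It uses a plain Hash of file name to MD5 string as a stand-in for the bucket. The method name @files_needing_upload@ and the @remote_etags@ hash are hypothetical and not part of the script above or of aws-s3.

```ruby
require 'digest/md5'
require 'tmpdir'

# Decide which files in a directory need uploading.
# remote_etags is a hypothetical stand-in for the bucket:
# a Hash of file name => MD5 hex string (an etag with its quotes removed).
def files_needing_upload(directory, remote_etags)
  Dir.entries(directory).sort.select do |name|
    path = File.join(directory, name)
    next false if File.directory?(path) # skips '.', '..' and subdirectories
    remote = remote_etags[name]
    # Upload if the file is missing remotely or its digest differs.
    remote.nil? || remote != Digest::MD5.file(path).hexdigest
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'test.txt'), 'This is a test')
  File.write(File.join(dir, 'notes.txt'), 'unchanged')
  remote = {
    'test.txt'  => Digest::MD5.hexdigest('This is a test'),
    'notes.txt' => Digest::MD5.hexdigest('unchanged')
  }

  p files_needing_upload(dir, remote) # local and "remote" agree

  # Edit one file locally, as in the vi step above.
  File.write(File.join(dir, 'test.txt'), 'This is an edited test')
  p files_needing_upload(dir, remote) # only the edited file is selected
end
```

Like the real script, this does not recurse into subdirectories.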
Delete flowers.jpg using the Firefox S3 Organizer and then sync again.
s3demo $>s3syncer
flowers.jpg ===> spatten_syncdemo:flowers.jpg
So there you go, a quick intro to the wonders of Amazon S3.