Amazon S3 - a large, far away, widely distributed gadget
Amazon S3 counts as a gadget, right? :-) I’ve been using it professionally for a while, and of course many of the services we take for granted until us-east-1 goes down use it too. Turns out that you can hook it into a homebrew website without very much work…
The other day I traced a period of terrible performance (8s network latency getting out of the house) to a visit from Googlebot-Video/1.0 fetching an old AVI file from a post-hoc image stabilization project (now made mostly redundant by youtube’s builtin stabilization feature.) The file was about 50M, and anyone interested in the project really wants the less-compressed original, shoving it to youtube really doesn’t help… but it turns out that that’s tiny by Amazon S3 standards, and the free tier covers it just fine.
There was a surprisingly small set of steps; I’m posting them here with the actual domains involved, since they’re visible and public anyway; you just need to adapt them to your own needs…
- Create yourself an AWS account. (You already shop at amazon, right? Just log in and create one…)
- “Better get a bucket.” Go to the console web page, pick S3, hit “create bucket”. Name it something obvious; in my case, `avi.thok.org`, although something more generic like `s3.thok.org` would have been a common choice. Do this first, because the bucket namespace is global and isn’t checked against DNS registration at all, so there’s a very faint chance someone already has a bucket of that name; at this stage, if you find a collision you can just pick a different name, like `s3-namespaces-can-you-speak-it.thok.org`.
- Install `s3cmd` (just `git clone` the github version and run it from the checkout - the one in ubuntu doesn’t actually handle puts with redirects.)
  - Configure it: `s3cmd --configure` and get the Access and Secret keys from the console under “security”; don’t bother to configure encryption or https because these are files that are already available by http, you don’t want to deal with certificates, and you’ll check the md5sums later.
  - Copy your files. Note that s3 doesn’t have directories per se; you just put the path with slashes in place as you go, so `s3cmd put --no-encrypt kicx1440.avi s3://avi.thok.org/me/publish/europython/day2/kicx1440.avi` works just fine, without having to do anything about `me/publish` directly.
  - Make them world-readable. By default S3 is, correctly, private; `s3cmd setacl --acl-public s3://avi.thok.org/me/publish/europython/day2/kicx1440.avi` makes that single file public. At this point, there’s a long convoluted url that will fetch this file, and you could stop here and just change the html that points to it, but let’s handle this cleanly…
- Edit your DNS zone and add `avi IN CNAME s3.amazonaws.com.` Carlton Bale gets credit for having the first google hit that actually said this would work. Once you’ve pushed this through, `curl -L -v -I http://avi.thok.org/me/publish/europython/day2/kicx1440.avi` works - note carefully, the `-I` gets curl to do a `HEAD` (`-H` was already taken?) so you get back headers, not 100m of video. You should see the `Location` header taking you over to S3, and then a convincing `ETag` (md5sum of the file, in this particular case) and `Content-Length`.
- Edit your apache config and add `RewriteRule ^/(me/.*\.avi)$ http://avi.thok.org/$1 [R,L]` To pick this apart:
  - `RewriteRule` is the apache swiss-army-knife of URL mangling.
  - The first bit is a regular expression that matches the entire “path” (`^` for start, `$` for end) and grabs everything after the leading slash (thus the slash is outside the grouping parentheses.) Within this part of the path, it has to start with `me/` and end with `.avi` but can have anything at all in between; if we wanted literally all AVI files, we’d drop the `me/` part, but I have some small ones elsewhere on the site that I didn’t want to bother hunting down and uploading.
  - The second bit is the new URL - `avi.thok.org` to point to the `CNAME` we set up above, `$1` is the first set of parentheses in the match (so, `me/xxx.avi`.)
  - Finally, the last bit is what to do with this bit of hatchet work; `R` says to make it a redirect (and because our result starts with `http` it automatically becomes an “external” redirect, in this case a 302, i.e. “don’t try to fetch this url, just tell the client to go away and find it themselves.”) You can’t get theyah from heah, but you can get there from over there… the `L` is for “last” and just says to stop trying and don’t do any more rewriting on this particular result.
- Don’t forget to actually `/etc/init.d/apache2 reload` or however your system spells that. At this point, you can `curl -L -v -I http://www.thok.org/me/publish/europython/day2/kicx1440.avi` (note that we’re actually starting with the primary domain here, where the original problem started) and follow our `HTTP/1.1 302 Found` and then amazon’s `HTTP/1.1 307 Temporary Redirect` and the bandwidth problem (remember the bandwidth problem? This song’s about a bandwidth problem) is now gone.
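If you want to sanity-check the rewrite pattern and the ETag before touching a live server, here is a minimal Python sketch of both. The helper names `redirect_target` and `etag_matches` are made up for illustration, Python's `re` module is standing in for apache's regex engine (they agree on a pattern this simple), and the ETag comparison assumes a plain single-part upload, which is the only case where S3's ETag is the file's md5sum.

```python
import hashlib
import re

# Python rendering of the pattern from the RewriteRule above. This is only a
# stand-in for apache's own matching, useful for trying out paths offline.
AVI_RULE = re.compile(r"^/(me/.*\.avi)$")
BUCKET_HOST = "avi.thok.org"  # the CNAME from the DNS step


def redirect_target(path):
    """Return the URL the rule would redirect `path` to, or None if it does not match."""
    match = AVI_RULE.match(path)
    if match is None:
        return None
    # group(1) plays the role of $1: everything after the leading slash.
    return "http://{}/{}".format(BUCKET_HOST, match.group(1))


def etag_matches(etag_header, data):
    """Compare an ETag header value (usually quoted) with the MD5 of `data`.

    Assumption: the object was uploaded in one piece, so its ETag is the MD5.
    """
    cleaned = etag_header.strip().strip('"').lower()
    return cleaned == hashlib.md5(data).hexdigest()


if __name__ == "__main__":
    # A path under me/ ending in .avi is redirected; anything else is left alone.
    print(redirect_target("/me/publish/europython/day2/kicx1440.avi"))
    print(redirect_target("/other/small.avi"))
```

To check a real file, read it in binary mode and pass the bytes along with the `ETag` value that `curl -I` printed.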
Future refinements:
- use `[R=307]` and make the first hop a Temporary Redirect as well. Not sure if that’s correct, yet, but given that this all started with a search engine bot that wasn’t aware of the human-readable “slow (home)” and “fast (MIT)” alternate links, it’s worth looking into.
- if `thok.org` were more of a CMS, automatically noticing avi files and pushing them to amazon would be a good transparent trick. For a total of five files on a home website? Not actually worth the trouble, even if the logs say I have at least a month before the bot comes around again :-)
- actually process the logs by object size and see if anything else should get this treatment; in practice, these files got noticed so they’re the right starting point, and this isn’t anything you’d mistake for a major site.
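For the last refinement, a rough Python sketch of “process the logs by object size” might look like the following. It assumes apache's common/combined log format (which may not be what your server writes), and the function name `largest_objects` and the 10 MB default threshold are arbitrary choices for illustration, not anything from the setup above.

```python
import re
from collections import defaultdict

# Assumes apache's common/combined log format:
#   host ident user [time] "METHOD path PROTO" status bytes ...
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]*\] "(?:\S+) (\S+)[^"]*" (\d{3}) (\d+|-)'
)


def largest_objects(lines, min_bytes=10 * 1024 * 1024):
    """Return (path, bytes) pairs whose largest 200 response is >= min_bytes."""
    biggest = defaultdict(int)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        path, status, size = match.groups()
        if status != "200" or size == "-":
            continue  # only count complete, successful responses
        biggest[path] = max(biggest[path], int(size))
    hits = [(path, size) for path, size in biggest.items() if size >= min_bytes]
    # Largest first, so the best candidates for S3 are at the top.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Feed it an open log file (it only needs an iterable of lines) and anything it returns is a candidate for the same bucket-and-redirect treatment.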
“Chromebook Air” and a trip to Santa Clara
Put my mobility and gadget obsessions to the test by going to Pycon 2013 in Santa Clara. Didn’t take the X220t Thinkpad, just grabbed a Samsung Chromebook a couple of days before leaving (and installed crouton on it.) Crouton gives me a chrooted Ubuntu Precise environment, into which I poured my normal desktop coding and photography workflow. I added a 32G SD card with a couple of months of photography and my latest master index, which let me process pictures as I went; the one other useful addition to my pocket pouch was a dual-end mini-micro USB cable because most of my cameras are Canon, and so I still need Mini-USB when everything else is MicroUSB. I also happen to prefer short cables for reducing pocket clutter… but made up for it by carrying a 2ft A-A cable in the bag with the camera chargers.
The other trick was to remind myself that I was, after all, going to Silicon Valley, where they already have all the gadgets, and though “I can’t Amazon-Prime things to my hotel” is a very #firstworld definition of “roughing it”, clearly making several visits to Fry’s Electronics would be just as good… and in fact, when I “accidentally” took a thousand pictures my first day in town (at Point Reyes) the next thing I picked up was a 64G SD card to upgrade my local picture backup… and a knife to open the package with :-)
I did prepare the Chromebook before leaving, with an old Apple sticker; rather than causing comment, I think it served more as camouflage, at least at a distant glance. On closer inspection, of course, the $250 Chromebook is relatively flimsy (hold it up by a corner and the touchpad isn’t clickable anymore, for example) and the screen is very much not IPS (let alone Retina) which isn’t really a problem in conference and hotel settings, but was kind of unpleasant on the flight. Interestingly, I didn’t need to plug in power the entire length of the BOS to SFO flight, nor the return, even with (Gogo) wifi turned on the whole time.
That brings up “why crouton, instead of just installing ubuntu?” - first of all, I do like the idea of a limited self-sysadmining laptop, and have made on-and-off use of the CR-48 since they shipped, though I haven’t been able to make the leap to programming on it (I did try koding, and it’s just Not Emacs) and second, I hadn’t figured out how the “12 free Gogo coupons” was implemented, and figured it was easier to just use the ChromeOS-side browser for that. Just a little thing, but crouton worked well enough that it didn’t really get in my way.
Another nice thing about the Chromebook was that I didn’t especially worry about it; if it got damaged, the SD card would probably be fine, and I could probably just hit BestBuy and grab another one :-) There are a lot of workplaces (not mine, sadly) where being able to salvage a business trip by expensing a laptop that costs less than two nights in a hotel room could be a huge win.
I even picked up a cheap HDMI cable at Fry’s - having not noticed that the conference hotel TVs were (coincidentally Samsung branded) analog panels, with VGA inputs, no digital ones. (The “vacation” part of the trip did have HDMI-capable TVs in the hotel rooms, but part of successfully shooting 4000 pictures in a week was getting out and shooting, using the laptop for backup and picking out highlights, and deciding to leave serious tagging and uploading until I got home, when I had time to do research and identification; after all, there were a couple of inches of snow on the ground when I got home. California was beautiful, New England hasn’t actually managed spring yet.)
Other trip gadgets: I had a rental car, so I got a small, cheap, and hard-to-recommend allegedly-2.1amp 12V-to-USB adapter that couldn’t keep my Gnote at equilibrium, let alone charge it, while using it as a navigation system. Still looking for a sane answer there. More helpful was the 12V camera-battery charger with little clip adapters for each different battery type, that let me top up the camera I’d used most on a given day; while it was a nice and performant gadget, I would have been better off actually being disciplined and keeping a charged backup battery for each camera every night. (Since it’s 12V and 120V, it might be sensible to bring only that charger next time, for the slight benefit over the multiple 120V chargers I’d otherwise carry, though on closer inspection, one of those was also 12V capable already.)
Finally, the trip was to go to Pycon 2013, so I brought back more linux boxes than I went with :-) So far the only interesting thing I’ve done with the RasPi is to hook it up to a tiny keyboard and project my trip-report slides at work. The idea that you can reasonably “rummage around on your desk to find a linux box” the way you used to rummage around looking for a spare ethernet cable is highly entertaining :-)
Colorimeters, Colorful LED displays, and Kicksaver
The IO Rodeo Colorimeter kit is building a basic easy-to-assemble (no soldering!) Open Source Hardware colorimeter - a basic scientific measurement tool, which uses different frequencies of light to measure properties of a liquid sample. The design looks very student-friendly, and is a good start on understanding that instruments aren’t magic…
This 8-digit 7 segment display board is a nice module for old-school lots-of-digits output - if you were doing your own version of a DeLorean back-to-the-future dashboard, it’d be a good component to have :-) I’ll note that in practice, a $60 refurb android tablet might be near-useless as a portable appliance, but it would make a great embedded display with graphs and whatever simulated-digit output you want… but I like LEDs so I backed it anyway.
Given how terrible Kickstarter is at actually helping me find interesting gadget projects (hey Amazon! buy them and force them to use your recommendation engine! :-) I owe credit for finding the LED display project to [KickSaver](http://www.kicksaver.net/) - not actually a search engine, but certainly an interesting browsing alternative - you give them a price threshold, they give you kickstarter projects that need that amount to kick them over to successful funding. (Got a better idea? Fork Kicksaver on github and let me know what you came up with…)