Time for a non-photo, photo-related post. Some of you photographers may find these ramblings interesting… for the rest of you, just look the other way and continue about your business.
A short back-story
As a photographer who also has a day job doing web development, I tend to have random photography-related programming ideas pop into my head. I already have a host of scripts I use to make my life easier: batch renaming, photo backup and archiving, image uploading, etc. Sometimes these scripts come out of necessity for a particular job. Sometimes they’re created to speed things up once I realize I’m repeating the same task over and over. Many times they’re born of sheer curiosity about whether something can actually be done.
Integrating with existing websites can be fun and challenging. I love exploring new APIs. So far I have written integrations with Flickr, Twitter, Facebook and Google. Each web service presents its own challenges and shortcomings, including old services being deprecated, which breaks functionality that had worked great in the past. Ugh. But these APIs open up amazing possibilities that would otherwise be impossible.
OK, Albums, Right?
Yes, I promised you an article about my Facebook gallery implementation. Oftentimes when I photograph an event, particularly sports, I’ll upload a gallery of images to this website, but I’ll also upload an album to Facebook. People can’t get enough of tagging themselves in photos on Facebook.
First I began to explore the idea of using the Facebook photo API to automatically create a new album by transferring the photos from a gallery on this site to a Facebook album, meaning I would only have to upload manually to one location. The FB upload API does work, but during my testing I encountered a major flaw.
When using the FB website to upload photos to an album, after you upload a batch of photos, a summary of that upload is posted to your wall with the album name/link and total number of photos uploaded. But when using the API, each photo uploaded would make its own post on the wall. So uploading 100 photos would result in 100 “Picture Iowa has uploaded a photo…” posts on our wall. Very annoying and news-feed-clogging for those of you that like us on FB.
Facebook provides a batch upload capability in their API, which partially solves the problem (although it should have fully solved it). In my testing, it would still randomly make multiple posts to the wall after smaller groups of photos were uploaded. I also have FB set up to auto-post status updates to Twitter. When uploading via the FB website, this posts a single summary tweet saying that I’ve uploaded photos to an album. But when using the API, multiple tweets would be made, the same way multiple wall posts would be made. I don’t want 100+ wall posts and tweets when I upload a new album. This is not acceptable, so I had no choice but to scrap this idea.
A few months went by without me thinking much more about this… until about a week ago. I thought that even though I still had to manually upload photos to both here and FB, I’d like a way to have any activity on those photos be synchronized between the two sites. This first meant I’d need a way to match each photo in the Facebook album up with the photo in the gallery here. Should be a simple task, right? After all, the exact same images were uploaded to both sites.
Of course, FB strips out all metadata. Not a deal breaker, we can still compare the binary image data, right? Nope. Facebook also re-compresses the uploaded photos. So even though at first glance the photos on FB appear unchanged, there are additional JPG artifacts present in FB’s photos. So, how can I have a computer match up photos that a human can easily see are the same, when the 0s and 1s are different to the computer?
First thought. Even though FB strips the metadata, it extracts the description from the EXIF and adds it as a description to the photo. I could just use Photo Mechanic to assign a unique ID or something within the description before uploading, giving me unique description text to match on. The problem is that the FB uploader has a bug causing it to drop the description data in 5-10% of the uploaded images. So not all uploaded images have the description, making it impossible to match everything using this method. Plus it’s an additional step I have to remember to do before uploading.
Second thought. Just match them based on the upload order. If my photo filenames begin with a sequence number, they should be uploaded to FB in that same sequence, right? Nope. They can get jumbled during the FB upload, and since the original filename is not retained either, this was also a no-go.
What to do now?
ImageMagick to the rescue! This go-to set of tools has saved my butt many a time. I typically just use the image resizing and conversion tools, but there’s also a binary called compare. This can be used “to mathematically and visually annotate the difference between an image and its reconstruction”. Sweet! After a bit of tweaking with the “fuzz” flag, I was getting 100% correct matches when running a scan on photos from a FB album and comparing them to the same gallery here.
/usr/bin/compare -metric AE -fuzz 55% '$photo[facebook]' '$photo[pictureiowa]' null: 2>&1
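For anyone who wants to try the same trick, here is a rough Python sketch of how that command could be wrapped in a matching loop. This is not the script I actually run: the function names (`parse_ae_output`, `differing_pixels`, `best_match`) are made up for illustration, and it assumes `compare` is on your PATH and prints the AE count to stderr as its first token. The idea is simply to score one Facebook photo against every gallery photo and keep the one with the fewest differing pixels.

```python
import subprocess
from typing import Callable, Iterable, Optional


def parse_ae_output(text: str) -> Optional[float]:
    """Return the leading number from compare's metric output, or None."""
    tokens = text.strip().split()
    if not tokens:
        return None
    try:
        # Some builds print extra text after the count, so only
        # the first token is read.
        return float(tokens[0])
    except ValueError:
        return None


def differing_pixels(path_a: str, path_b: str, fuzz: str = "55%") -> Optional[float]:
    """Run ImageMagick's compare and return the AE (differing pixel) count."""
    result = subprocess.run(
        ["compare", "-metric", "AE", "-fuzz", fuzz, path_a, path_b, "null:"],
        capture_output=True,
        text=True,
    )
    # compare writes the metric to stderr; exit status 2 means an error.
    if result.returncode == 2:
        return None
    return parse_ae_output(result.stderr)


def best_match(
    fb_path: str,
    gallery_paths: Iterable[str],
    score: Callable[[str, str], Optional[float]] = differing_pixels,
) -> Optional[str]:
    """Return the gallery photo with the fewest differing pixels, if any."""
    best_path, best_score = None, None
    for candidate in gallery_paths:
        value = score(fb_path, candidate)
        if value is None:
            continue  # this pair could not be scored, skip it
        if best_score is None or value < best_score:
            best_path, best_score = candidate, value
    return best_path
```

Calling `best_match("fb_photo.jpg", ["0001.jpg", "0002.jpg"])` would shell out to `compare` once per candidate. Passing the scoring function in as a parameter also makes the loop easy to try out without ImageMagick installed.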
Now I was able to store this in a MySQL database, matching up my gallery ID with the Facebook ID.
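A link table like that can be tiny. The sketch below uses Python's built-in `sqlite3` instead of MySQL purely so it runs on its own. The table name `photo_link`, the column names and the helper functions are all assumptions for illustration, not my real schema.

```python
import sqlite3
from typing import Optional


def open_link_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) link table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS photo_link (
            gallery_id  INTEGER PRIMARY KEY,
            facebook_id TEXT NOT NULL UNIQUE
        )
        """
    )
    return conn


def link_photos(conn: sqlite3.Connection, gallery_id: int, facebook_id: str) -> None:
    """Record (or overwrite) the match for one gallery photo."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO photo_link (gallery_id, facebook_id) "
            "VALUES (?, ?)",
            (gallery_id, facebook_id),
        )


def facebook_id_for(conn: sqlite3.Connection, gallery_id: int) -> Optional[str]:
    """Look up the Facebook photo ID matched to a gallery photo."""
    row = conn.execute(
        "SELECT facebook_id FROM photo_link WHERE gallery_id = ?",
        (gallery_id,),
    ).fetchone()
    return row[0] if row else None
```

With one row per matched pair, tags, comments and likes pulled from FB can be joined back to the gallery photo through `facebook_id`.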
What shall we do with this new-found data link!?
Tags, Comments and Likes. And oodles of statistics, of course! The first step was to write a script to scan the FB albums for new tags, comments and likes, adding them to our relational MySQL database. Then we need a way to keep this updated when new activity occurs. Scanning an entire album is a somewhat intensive request to the FB API because it returns a lot of data, so I didn’t want to just schedule it to run every hour or so, scanning every single photo in every album I’ve linked on FB. Fortunately, FB provides an activity RSS feed. So, the script can just poll the RSS feed every hour and parse out which albums had activity, re-scanning only those albums with the API.
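The polling step boils down to "read the feed, skip items already seen, collect the album IDs that remain". Here is a hedged Python sketch of that logic. I am not reproducing the real feed format: the `album_id` query parameter in the item link is a stand-in for however the album is actually identified, and `albums_with_activity` is a made-up name.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlparse


def albums_with_activity(rss_xml: str, seen_guids: set) -> set:
    """Return album IDs mentioned by feed items not seen on earlier polls.

    seen_guids is updated in place so the next poll skips old items.
    """
    root = ET.fromstring(rss_xml)
    albums = set()
    for item in root.iter("item"):
        link = item.findtext("link") or ""
        guid = item.findtext("guid") or link
        if not guid or guid in seen_guids:
            continue
        seen_guids.add(guid)
        # Hypothetical: assume the album is named in an album_id parameter.
        album_ids = parse_qs(urlparse(link).query).get("album_id")
        if album_ids:
            albums.add(album_ids[0])
    return albums
```

An hourly cron job could fetch the feed, pass the XML to this function, and then re-scan only the returned albums through the API, which keeps the heavy calls to a minimum.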
I then wrote a simple module for Gallery 2, the gallery software this site runs. It will display the tags, likes and comments next to photos on this site that were posted on FB. Here’s an example. Pretty cool, huh? OK, well, it’s at least kinda neat.
Another thing I had planned was to use the API to update the photo description on FB, so that I could add a direct short-link from the Facebook photo to the photo on this site. Whoops, FB has not implemented this update feature in their API, either because they’re lazy or, most likely, because they’ve decided they don’t want people doing that. Which is silly, considering you can easily update the descriptions directly on FB.
Stats and Reporting
I’ve thrown together a haphazard page that shows some of the data I now have available to me, and what can be done with it: http://www.pictureiowa.com/facebook/photos.php Facebook makes it hard to access such summarized data within their site.
This page crams a lot into one view, and I’ll be refining it as I have time in the future. But for now, it’ll do.
I have no time for this. Spent too much time writing all the garbage up above.