Sunday, December 6, 2009

Google’s Social Graph API

Author

Louis Simoneau

I recently signed up for Jeremy Keith’s podcasting web app, Huffduffer. The sign-up form has received a lot of buzz because of its “Mad Lib”-style design, but what interested me was that after I created my account, I saw the following in the lower right-hand corner of my profile page:

Figure 1. Huffduffer’s Elsewhere profile panel

Huffduffer took the one web site URL I provided it with (louissimoneau.com) and found just about every profile of mine, from every social network or web application I’ve ever used. It’s not apparent in the figure, but those links point to my profiles on those sites. How is this possible?
Digging around through a number of Jeremy’s blog posts, I found that he’s using the Google Social Graph API to gather this data. My web site points to my Twitter, Last.fm, and FriendFeed profiles, and those in turn (mainly FriendFeed) link out to all my other profiles. Google is able to follow these links and know that they refer to other instances of me through clever use of the XFN microformat (more on this a bit later).
Jeremy’s implementation, though definitely cool, is a little blunt. Some people may object to having every single one of their web identities listed on their profile, and as Google’s spider is relying on cached data, some of the information may be out of date. For example, I deleted interblag.tumblr.com years ago and it has since been picked up by another individual.
I thought it might be a cool idea to try and develop a slightly more sophisticated version of this functionality, and at the same time learn a bit about how the Social Graph API works its magic. I’ll be using Ruby on Rails for this implementation, but as the code is very simple you should be able to adapt it easily to whichever platform you’re comfortable with.

Microformats Crash Course

Before we go any further, it’s a good idea to step back and try to figure out how this functionality is possible in the first place. How does Google know that those other sites are also “me”? The answer is microformats. If you’re a real HTML nerd or have a penchant for semantic code, you’ll already know about microformats, but for the rest of us here’s a quick catch-up course.
Microformats are just that: mini formats, which happen to sit inside the larger format of HTML. HTML lacks a way, for instance, of indicating that a link is pointing to a web site of an individual you’ve met in person—that is, there’s no “met” attribute that would let you do:
<a href="http://somesite.com" met="true" />
This code, of course, is invalid. Fortunately, HTML has a few attributes that can be co-opted for this purpose. class is the most obvious, but for links and anchors we also have rel. For example:
<a href="http://somesite.com" rel="met" />
This snippet of code, unlike the previous one, is perfectly valid. This in and of itself may be of little value: at best you could use CSS to give a different style to links pointing to people you’ve met. But if web developers across the world agree on a way of representing this information, suddenly the range of possibilities explodes: a web spider can crawl links between people who’ve met each other and construct a map of these relationships.
There are currently microformat standards for everything from addresses (hCard) to friendships (XFN) to tags (rel-tag), as well as a number of others currently in the draft stage.
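To make the spider's side of this concrete, here is a minimal sketch of reading rel values out of a page. The rel_links helper and the sample markup are my own inventions for illustration, and the regex only copes with double-quoted attributes; a real crawler would use a proper HTML parser.

```ruby
# Simplistic sketch: pull href and rel values out of anchor tags with a regex.
# Assumes double-quoted attributes; not a substitute for an HTML parser.
def rel_links(html)
  html.scan(/<a\b[^>]*>/i).map do |tag|
    href = tag[/\bhref="([^"]*)"/i, 1]
    rel  = tag[/\brel="([^"]*)"/i, 1]
    next unless href && rel
    # rel may hold several space-separated values, e.g. "friend met"
    { :href => href, :rel => rel.split }
  end.compact
end

# Hypothetical page fragment
page = <<-HTML
  <a href="http://somesite.com" rel="friend met">A friend</a>
  <a href="http://twitter.com/example" rel="me">My Twitter</a>
  <a href="http://example.org/about">No rel here</a>
HTML

links = rel_links(page)
met   = links.select { |link| link[:rel].include?("met") }
puts met.map { |link| link[:href] }.inspect
# prints ["http://somesite.com"]
```

The link without a rel attribute is simply ignored, which is the whole point: only links that carry agreed-upon values tell the spider anything about relationships.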



Google uses two standards, XFN (XHTML Friends Network) and FOAF (Friend of a Friend), to crawl networks of links. These links point to people represented by URIs, and it’s the relationships between these people that Google is attempting to figure out. FOAF is a slightly more complicated format (strictly speaking an RDF vocabulary rather than a microformat) that involves creating a separate file detailing all your friendships using XML; that’s unnecessary for our implementation so we’ll just focus on XFN.
The most basic use of XFN is using rel="me" in a link to denote that the site being linked to belongs to the same person as the site we’re on.
There is, of course, a problem here: this microformat data is only present if a person has bothered to put it there. And only a small percentage of web site owners even know what microformats are, let alone use them on their sites. Well, the good news is that several large sites and applications add XFN to links automatically. Last.fm and FriendFeed, for example, both use this standard when you add links to your other sites. So you may have a social graph without even being aware of it.
WordPress blogs also make it easy to mark up links in this way. Scroll down the page when adding a new link to your WordPress blog and you’ll see the following:

Figure 2. WordPress XFN support


Hence, even without widespread knowledge of microformats, it turns out that there are a significant number of marked-up XFN links out in the wild for the Google Social Graph API to spider.
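How a spider might turn those links into a set of identities can be sketched as a simple graph walk. Everything here is hypothetical: the ME_LINKS data is hand-written rather than crawled, and identities_for is my own helper, not anything Google has published.

```ruby
require 'set'

# Hypothetical, hand-written link data: each URL maps to the URLs it links to
# with rel="me". A real spider would build this by crawling pages.
ME_LINKS = {
  "http://example.com"            => ["http://twitter.com/example", "http://friendfeed.com/example"],
  "http://friendfeed.com/example" => ["http://delicious.com/example", "http://example.com"],
  "http://twitter.com/example"    => [],
  "http://delicious.com/example"  => []
}

# Breadth-first walk: start at one URL and keep following rel="me" links,
# remembering what we have seen so that cycles do not loop forever.
def identities_for(start, links)
  seen  = Set.new([start])
  queue = [start]
  until queue.empty?
    url = queue.shift
    (links[url] || []).each do |target|
      # Set#add? returns nil when the URL was already present
      queue << target if seen.add?(target)
    end
  end
  seen.to_a
end

puts identities_for("http://example.com", ME_LINKS).inspect
```

Note that this sketch trusts every link it finds. The stale Tumblr address I mentioned earlier shows why that can go wrong: a link that was true when it was crawled may no longer be.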


Getting to Know the Social Graph API

Let’s have a look at the API, shall we? Essentially, all of its functionality is packed into a single method: lookup. There are two other methods, but they’re mostly useful for testing purposes; lookup is where the real muscle is.
So, how do you call lookup? It’s simple—just point your browser to a URL like this one:
http://socialgraph.apis.google.com/lookup?q=http://friendfeed.com/rssaddict&edo=true&pretty=true
I’m using my FriendFeed page as an example here because it includes a fairly large number of contacts and profiles. The parameters passed to lookup are straightforward: q is a list of URLs to look up, comma-separated; edo tells the API to look for edges out of the selected node, which means that it will find contacts referenced by the URL specified; and pretty just tells it to pretty-print the resulting JSON to make it readable on screen. Here’s an excerpt of what’s returned by the API when we request that URL:
{
 "canonical_mapping": {
  "http://friendfeed.com/rssaddict": "http://friendfeed.com/rssaddict"
 },
 "nodes": {
  "http://friendfeed.com/rssaddict": {
   "attributes": {
    "exists": "1",
    "url": "http://friendfeed.com/rssaddict",
    "profile": "http://friendfeed.com/rssaddict",
    "atom": "http://friendfeed.com/rssaddict?format\u003datom"
   },
   "nodes_referenced": {
    "http://delicious.com/rssaddict": {
     "types": [
      "me"
     ]
    },
    "http://friendfeed.com/aburd": {
     "types": [
      "contact"
     ]
    },
    "http://friendfeed.com/acitrano": {
     "types": [
      "contact"
     ]
    },
    "http://friendfeed.com/adenpenn": {
     "types": [
      "contact"
     ]
    },
⋮
We receive a nodes object containing the nodes found based on the URL provided, and a nodes_referenced object containing nodes linked to from that node. Each referenced node has a types property, which is fairly self-explanatory: me is used for sites that are my own (like my Delicious profile) and contact is used for people I follow on FriendFeed. FriendFeed just happens to mark up friend links with contact, while Last.fm uses acquaintance instead. It’s therefore best, when using the API, to distinguish only between me and anything other than me, unless you have a specific need for more detail.
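The "me versus everything else" split might look something like the following. This is only a sketch: the JSON is a trimmed copy of the excerpt above, and split_references is a helper name I made up.

```ruby
require 'json'

# Trimmed-down copy of the response excerpt above
response = JSON.parse(<<-EOS)
{
  "nodes": {
    "http://friendfeed.com/rssaddict": {
      "nodes_referenced": {
        "http://delicious.com/rssaddict": { "types": ["me"] },
        "http://friendfeed.com/aburd":    { "types": ["contact"] },
        "http://friendfeed.com/acitrano": { "types": ["contact"] }
      }
    }
  }
}
EOS

# Hypothetical helper: separate a node's references into "me" links and all
# other relationship types (contact, acquaintance, and so on).
def split_references(response, url)
  referenced = response["nodes"][url]["nodes_referenced"] || {}
  mine, others = referenced.partition { |_, data| data["types"].include?("me") }
  { :me => mine.map(&:first), :others => others.map(&:first) }
end

result = split_references(response, "http://friendfeed.com/rssaddict")
puts result[:me].inspect
# prints ["http://delicious.com/rssaddict"]
```

Because the code only asks whether me is among the types, it behaves the same whether a site labels its friend links contact, acquaintance, or something else.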
There’s no need to have an API key to access this data, and because of the option to pretty-print the results, the best way to familiarize yourself with the API is to play around with the parameters in your browser. Give it a try! Once you have your head around it, we’ll move on and start building our profile autocompleter.
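If you’d rather generate those URLs than type them, a small helper along these lines should do. It’s a sketch using only Ruby’s standard library; lookup_url and LOOKUP_ENDPOINT are names I’ve invented, and the only parameters used are the ones described above.

```ruby
require 'uri'

LOOKUP_ENDPOINT = "http://socialgraph.apis.google.com/lookup"

# Hypothetical helper: build a lookup URL from one or more URLs to query
# plus any extra parameters (edo, pretty, ...).
def lookup_url(urls, options = {})
  # q is a comma-separated list of URLs
  params = { "q" => Array(urls).join(",") }.merge(options)
  "#{LOOKUP_ENDPOINT}?#{URI.encode_www_form(params)}"
end

url = lookup_url("http://friendfeed.com/rssaddict", "edo" => true, "pretty" => true)
puts url
```

The helper percent-encodes the q value, so the output looks a little noisier than the hand-typed URL from earlier, but it decodes to the same parameters.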

Building the Profile Autocompleter


The behavior we want is simple: when our users create a new account on our web site, they fill in a “Web Site URL” field with their blog or other web site. What we’d like is to use the Social Graph API to find any other sites belonging to that person, and suggest those be added to our users’ profiles.
As a starting point, I’ve set up a basic app using the Authlogic authentication gem. I’ve already created basic User and UserSession models, as well as the associated controllers. I’ve also used Ryan Bates’ Nifty Generators gem to whip up some boilerplate actions and layouts for those controllers. To get up to speed, you can follow along with Ryan’s screencast on Authlogic, which walks through the creation of a similar user and session setup to what I’ll be using here. With this done, you’ll have a basic authentication system for your application: users can register, log in, and log out.
I’ll be showing you some snippets of the application’s code, but to really grasp the idea of what’s going on and follow along, you should download the sample code archive.
Our first task will be to add web site URLs to our User model. Since each User can have more than one web site, the URLs should be stored in a separate model associated with User via has_many. So, each Profile will correspond to a single URL, and each User has_many :profiles. To start, we generate the Profile model and controller using Nifty Generators:
script/generate nifty_scaffold Profile url:string primary:boolean user_id:integer new create destroy
We’ve given the model a primary attribute, which will be true for the first URL added by the user on the registration form and false for subsequent profiles discovered via the Social Graph API. We’ll need to mostly overwrite the generated new and create actions, but at least we have an outline to work from.
To keep our app as simple as possible, we’ll make it RESTful and nest a profiles resource inside a users resource. This is specified in routes.rb with:
map.resources :users, :has_many => :profiles
This way, our app’s URLs will look like /users/2/profiles/new, /users/2/profiles, and so on. To access these paths in our code, all we need to do is call new_user_profile_path(@user) or user_profiles_path(@user), respectively.
Let’s think about what this means for a user’s interaction with our application: we’ll intercept the submission of the registration form (a POST request to /users) and redirect our new user to the new_user_profile_path (/users/:id/profiles/new), where they’ll be offered some profiles to add. That form will in turn POST to user_profiles_path (/users/:id/profiles) to create the profiles, and finally send the user back to their profile on our site: user_path (simply /users/:id). As always, when working in Rails, adhering to these conventions yields significant returns: all of these paths will be mapped to the appropriate actions in our controller with that single line of code in routes.rb.
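If the mapping from paths to actions feels like magic, this plain-Ruby sketch mimics the handful of nested routes the walkthrough relies on. It is not how Rails implements routing; ROUTES and recognize are illustrative names of my own, and only the routes we actually use are listed.

```ruby
# The subset of conventional nested-resource routes used in this walkthrough.
# Rails names the parent parameter :user_id in nested routes.
ROUTES = [
  ["GET",    %r{\A/users/(\d+)/profiles/new\z},   "profiles#new"],
  ["POST",   %r{\A/users/(\d+)/profiles\z},       "profiles#create"],
  ["DELETE", %r{\A/users/(\d+)/profiles/(\d+)\z}, "profiles#destroy"],
  ["GET",    %r{\A/users/(\d+)\z},                "users#show"]
]

# Return the controller#action and user id for a request, or nil if no
# route in our small table matches.
def recognize(verb, path)
  ROUTES.each do |route_verb, pattern, action|
    match = pattern.match(path)
    return { :action => action, :user_id => match[1].to_i } if route_verb == verb && match
  end
  nil
end

puts recognize("GET", "/users/2/profiles/new").inspect
puts recognize("POST", "/users/2/profiles").inspect
```

The same path can map to different actions depending on the HTTP verb, which is why the table is keyed on both.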
We’ll need to modify our User model to indicate that it has_many profiles (I’ve also added a shortcut accessor for the primary profile):

Example 1. app/models/user.rb
class User < ActiveRecord::Base
  has_many :profiles, :dependent => :destroy
  has_one :primary_profile, :class_name => "Profile", :conditions => {:primary => true}
  acts_as_authentic
end


In the Profile model we’ll include the HTTParty gem, which is an extremely simple way of making HTTP requests (like the one we need to make to the Social Graph API). Here we’ll use its convenient get method to send a GET request to a specified URL:

Example 2. app/models/profile.rb
require 'httparty'

class Profile < ActiveRecord::Base
  include HTTParty
  belongs_to :user
  validates_associated :user

  # Ask the API for other URLs that claim to be the same person as `url`
  def self.find_other_me(url)
    get('http://socialgraph.apis.google.com/lookup', :query => {:q => url, :fme => true, :sgn => false})
  end
end


Note that we’ve completed the association we put in user.rb by including belongs_to here, along with a simple validation on the associated user. There’s no need to worry about this if you’re not a Rails-head: the important bit is what comes next. The find_other_me method, which we declare as a class method, issues a GET request to the Social Graph API with the URL that’s passed to it and returns the result. HTTParty automatically parses the returned JSON into a Ruby object.
Now we need to modify our user registration form to include a field for the primary profile URL, and modify our users controller to take it into account. Let’s start with the form:

Example 3. app/views/users/_form.html.erb (excerpt)
<% form_for @user, @profile do |f| %>
  <%= f.error_messages %>
  <p>
    <%= f.label :username %><br />
    <%= f.text_field :username %>
  </p>
  ⋮
  <% fields_for @profile do |p|%>
  <p>
    <%= p.label :url, "URL (including http://)" %><br />
    <%= p.text_field :url %>
  </p>
  <% end %>
  <p><%= f.submit "Submit" %></p>
<% end %>


In the create action of our users controller, we’ll first create the primary profile based on the URL from the registration form. We’ll then redirect to the new action of the profiles controller, which is where we’ll suggest profiles found by the Social Graph API to our users:

Example 4. app/controllers/users_controller.rb (excerpt)
⋮
def create
  @user = User.new(params[:user])
  @user.primary_profile = Profile.new(params[:profile].merge({:primary => true}))
  if @user.save
    flash[:notice] = "Registration successful."
    redirect_to new_user_profile_path(@user)
  else
    render :action => 'new'
  end
end
⋮


Here’s the new action from the profiles controller:

Example 5. app/controllers/profiles_controller.rb (excerpt)
⋮
def new
  # Find suggested profiles
  @suggested = Profile.find_other_me(current_user.primary_profile.url)['nodes'].keys
  # if API only returns one value, it's the one we specified, skip the page
  if @suggested.length == 1
    redirect_to user_path(current_user)
  else
    render
  end
end
⋮


We call the find_other_me method which we coded earlier, and drill down into the results to obtain the URL of every node returned (the keys of the nodes hash in the API’s response). If only one URL is returned by the API, it will be the one we submitted in the first place, so in that case we’ll skip this action and just redirect to the new user’s show action (which is their profile page in our app).
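One possible refinement is to drop the user’s own URL from the list before showing it. The sketch below assumes that canonical_mapping maps the queried URL to the key used under nodes, as the earlier excerpt suggests; suggestions_from and the sample response are hypothetical.

```ruby
# Hypothetical helper: given the parsed lookup response and the URL the user
# registered with, return only the *other* URLs worth suggesting.
def suggestions_from(response, primary_url)
  nodes = response["nodes"] || {}
  # Assumption: canonical_mapping maps the queried URL to its key in "nodes"
  canonical = (response["canonical_mapping"] || {})[primary_url] || primary_url
  nodes.keys.reject { |url| url == canonical }.sort
end

# Hypothetical response, shaped like the excerpt shown earlier
response = {
  "canonical_mapping" => { "http://example.com" => "http://example.com/" },
  "nodes" => {
    "http://example.com/"         => {},
    "http://twitter.com/example"  => {},
    "http://last.fm/user/example" => {}
  }
}

suggested = suggestions_from(response, "http://example.com")
puts suggested.inspect
```

With a helper like this, the "nothing to suggest" case becomes an empty array rather than an array of length one.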
In the view, we just loop over the @suggested array and make a checkbox for each URL. (In the sample code I’ve added check all and uncheck all links, with some jQuery to handle their behavior in the application layout file.) The value of each checkbox is set to the URL itself, which allows us to handle them effortlessly in the controller:

Example 6. app/views/profiles/new.html.erb (excerpt)
⋮
<% form_tag user_profiles_path(current_user), :id => 'profiles' do  %>
  <table>
    <% @suggested.each_with_index do |url, index| %>
      <tr class="<%= cycle 'odd', 'even' %> ">
        <td>
          <%= check_box_tag "profiles[#{index}]", url, true %>
          <%=h url %>
        </td>
      </tr>
    <% end %>
  </table>
  <%= submit_tag "Continue" %>
⋮


Submitting the form calls the create method in the controller, which loops over the URLs and adds a profile to the current user for each one. This is a slight deviation from the usual REST model: generally the create action responds to a POST request and creates a single new model. In this case we’re creating several, as our POST contains an array of URLs. This tweak is worth it, however: we can keep all the Rails magic for dealing with RESTful routes, and the actions still map logically to URLs.
Once we’ve built our profiles, we save the user (which conveniently also saves all the associated profiles), and redirect to the show action of the users controller (just as we did if no other profiles were found):

Example 7. app/controllers/profiles_controller.rb (excerpt)
⋮
def create
  if params[:profiles]
    params[:profiles].each do |key,value|
      current_user.profiles.build(:url => value)
    end
  end
  if current_user.save
    flash[:notice] = "Your profile URLs have been added!"
    redirect_to user_path(current_user)
  else
    render :action => 'new'
  end
end
⋮
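Since the list returned by the API includes the URL the user registered with, it may be worth filtering the submitted values before building profiles from them. Here’s a rough sketch in plain Ruby; urls_to_add is a hypothetical helper and the params hash imitates what the checkboxes above would submit.

```ruby
# Hypothetical helper: turn the submitted checkbox params into a clean list of
# URLs, dropping blanks, duplicates, and URLs the user already has.
def urls_to_add(profile_params, existing_urls)
  submitted = (profile_params || {}).values.map { |url| url.to_s.strip }
  submitted.reject(&:empty?).uniq - existing_urls
end

# Imitates params[:profiles] from the form: index => URL for each checked box
params_profiles = {
  "0" => "http://example.com/",
  "1" => "http://twitter.com/example",
  "3" => "http://last.fm/user/example",
  "4" => "http://twitter.com/example"
}

to_add = urls_to_add(params_profiles, ["http://example.com/"])
puts to_add.inspect
# prints ["http://twitter.com/example", "http://last.fm/user/example"]
```

In the controller, each remaining URL would then be passed to current_user.profiles.build as before.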


And that’s all! Run your database migrations, start up the development server, and navigate to the user registration form: /users/new. Fill out the form with a URL that contains some XFN goodness, and the app will recommend some extra URLs to add to your profile. If there are no XFN links on the page provided, you’ll just skip along with no knowledge of what you missed.

Summary

I like this solution primarily because it’s unobtrusive. Users who enter a URL for which the Social Graph API returns no connections will only see a regular sign-up form. On the other hand, those who indicate a URL with some rich microformat metadata will gain the added benefit of being able to flesh out their profiles with more of their online presence.
Of course, there are other uses for the Social Graph API: one possibility which springs to mind would be using it to recommend friends on your site based on relationships your users have on other applications. If you know some of your users’ Twitter usernames, for example, it’s quite simple to query the API so as to establish which of them follow each other on Twitter, and then recommend they become “friends” on your site as well. You can do the same for their WordPress blogrolls, FriendFeed friends, and so on, all with more or less the same code.
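As a rough idea of how the friend recommendation might work, the sketch below finds pairs of users who reference each other. The data is entirely made up: it imitates the edges-out results for three users’ Twitter URLs, and I’ve assumed the relationship type is contact, which may not match what any given site actually emits.

```ruby
# Hypothetical edges-out data: for each user's Twitter URL, the URLs it
# references and the relationship types attached to each.
EDGES_OUT = {
  "http://twitter.com/alice" => {
    "http://twitter.com/bob"   => ["contact"],
    "http://twitter.com/carol" => ["contact"],
    "http://alice.example.com" => ["me"]
  },
  "http://twitter.com/bob"   => { "http://twitter.com/alice" => ["contact"] },
  "http://twitter.com/carol" => { "http://twitter.com/bob"   => ["contact"] }
}

# True when `from` references `to` as anything other than "me"
def follows?(edges, from, to)
  types = edges.fetch(from, {})[to]
  !types.nil? && !types.include?("me")
end

# Every pair of known users who reference each other
def mutual_pairs(edges)
  edges.keys.sort.combination(2).select do |a, b|
    follows?(edges, a, b) && follows?(edges, b, a)
  end
end

puts mutual_pairs(EDGES_OUT).inspect
# prints [["http://twitter.com/alice", "http://twitter.com/bob"]]
```

Only pairs where both sides are already users of your site are considered, so one-way follows and links to outsiders are ignored.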