24 February 2011

Versioning and Types in REST/HTTP API Resources

There are a variety of ways to type and version data in REST services, and many of them are used successfully. Because APIs are living things, changes to versions and data types can cause headaches that eat up much of an API designer's time. I am going to discuss ways to avoid some of these problems by setting things up properly from the start.

Dealing with Types


Let's look at a common way REST calls are made:

===>
GET /customer/123 HTTP/1.1
Accept: application/xml
<===
HTTP/1.1 200 OK
Content-Type: application/xml
<customer>
  <name>Neil Armstrong</name>
</customer>

OK, we have an API that returns a customer - looks good. The problem is that the API does not actually return a customer, but rather a generic XML document. When the designer made this API, they didn't bother to specify the type of document being returned. Sure, there is an API document somewhere that defines the customer XML, and as soon as you call it you are going to see that it is a customer, so what is the big deal? Well, the problem is that an API is never static. As far as REST/HTTP is concerned, sticking a product in the response would be perfectly valid (but totally wrong in concept).

Wouldn't it be nice if the client could actually validate the information coming back? Wouldn't the client be more stable and predictable if it knew the format of the data that was going to come back? Wouldn't it be great if the server knew which format the client wanted and could give the client what it asked for? Sure it would, so let's change the call to make sure we are asking for what we want:

===>
GET /customer/123 HTTP/1.1
Accept: application/vnd.company.myapp.customer+xml
<===
HTTP/1.1 200 OK
Content-Type: application/vnd.company.myapp.customer+xml
<customer>
  <name>Neil Armstrong</name>
</customer>

Here I have replaced the generic XML MIME type with a vendor-specific MIME type. Awesome, now we ask for a customer formatted as customer XML, and we get customer XML back from the call - which can even be validated on the client! That should solve our data format problem once and for all, right?
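As a rough illustration of what "validated on the client" could look like, here is a minimal Python sketch. The helper names `media_type` and `parse_customer` are made up for this example, and the check on the `<name>` element is an assumption based on the sample document above rather than a real schema. It refuses any response whose declared type is not the customer type before it touches the body.

```python
import xml.etree.ElementTree as ET

# Vendor-specific type from the example request above.
CUSTOMER_TYPE = "application/vnd.company.myapp.customer+xml"


def media_type(header_value: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    return header_value.split(";", 1)[0].strip().lower()


def parse_customer(content_type: str, body: str) -> dict:
    """Parse a customer document, refusing any other declared type.

    Hypothetical helper: the expected structure (<customer> with a <name>
    child) is taken from the sample response, not from a published schema.
    """
    if media_type(content_type) != CUSTOMER_TYPE:
        raise ValueError(f"expected {CUSTOMER_TYPE}, got {content_type!r}")
    root = ET.fromstring(body)
    if root.tag != "customer":
        raise ValueError(f"expected <customer>, got <{root.tag}>")
    name = root.findtext("name")
    if name is None:
        raise ValueError("customer document has no <name> element")
    return {"name": name}
```

With the generic application/xml type the client has nothing to check against; with the vendor type, a product document sent by mistake is rejected as soon as its Content-Type is read.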

Dealing with Versions


BUT WAIT - there is still an issue! Let's say you decide to change the customer XML because you want to add some really cool new stuff to make that million dollar sale. How does that impact your client programs?
Here is the new flow:

===>
GET /customer/123 HTTP/1.1
Accept: application/vnd.company.myapp.customer+xml
<===
HTTP/1.1 200 OK
Content-Type: application/vnd.company.myapp.customer+xml
<customer>
  <firstName>Neil</firstName>
  <lastName>Armstrong</lastName>
  <salutation>Mr.</salutation>
</customer>

This change has just broken all the clients using that resource, since they can no longer parse the changed XML properly! Further, there is no way for the client to ask for a specific version of the returned object, or to check which version it will get without making the call - it just always gets the latest format. Sure, you could say that your API needs to maintain backward compatibility - but that is not very realistic when you are properly reusing your API across your product line. To demonstrate further, let's say you have 30 applications (and maybe a handful of external companies using the API), all of which rely on the "customer" REST resource. Your choices now are:
  1. Keep it backward compatible (and lose the million dollar sale because you couldn't implement cool feature X)
  2. Change all 30 applications simultaneously to handle the new data (you likely don't have enough resources to do this and deliver on time)
  3. Make the change, breaking the apps you don't have time to upgrade, but get the sale (of course you will fix the remaining apps in the future, right?)
Basically, you lose no matter which choice you make. So what can we do now to avoid problems with newer versions within the API?

Common Solutions to Versioning


Some common ways of handling this problem are to request a specific version in the call URI:

Method 1 (Put API version in the URI):
http://server:port/api/v2/customer/123

Method 2 (Add a request parameter in the URI):
http://server:port/api/customer/123?version=2

Both of these methods request a specific version of the resource we are looking for, and for many situations they work fine. But we are mixing up the resource's identification with the resource's representation (breaking an essential REST tenet). We don't want the identifier that we use to find an object to be mingled with the format we are requesting (for reasons discussed in my last post)! But we are smart designers, and this is essentially the same problem we had above, so we immediately recognize that we can solve it with the same solution - just make the requested type more specific:

Method 3 (version the request type):
===>
GET /customer/123 HTTP/1.1
Accept: application/vnd.company.myapp.customer-v1+xml
<===
HTTP/1.1 200 OK
Content-Type: application/vnd.company.myapp.customer-v1+xml
<customer>
  <name>Neil Armstrong</name>
</customer>

Similarly, the newer clients make a different call since they are aware of the new version:

===>
GET /customer/123 HTTP/1.1
Accept: application/vnd.company.myapp.customer-v2+xml
<===
HTTP/1.1 200 OK
Content-Type: application/vnd.company.myapp.customer-v2+xml
<customer>
  <firstName>Neil</firstName>
  <lastName>Armstrong</lastName>
  <salutation>Mr.</salutation>
</customer>

Great! Now you have added the changes, made the sale, and not broken any of the clients - you are the company hero! As an added bonus, clients and servers can work together to keep your API stable going forward. Because the client asks for a specific type and version in the request header, the server can decide whether it is capable of fulfilling the request and inform the client appropriately (e.g. return an HTTP 406 Not Acceptable if it can't fulfill the request, or a 301 redirect if it is changing). This is a much more stable and polite way of dealing with API changes for the client.
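To make the server side of Method 3 concrete, here is a minimal Python sketch of how one resource could serve both versions based on the Accept header. Everything in it apart from the two media type strings is an assumption for illustration: the `respond` function, the renderer names, and the dictionary standing in for a stored customer are invented, and the Accept parsing ignores q-values and wildcards for brevity. The sketch answers an unsupported type with 406 Not Acceptable, the status HTTP defines for an Accept header the server cannot satisfy.

```python
import xml.etree.ElementTree as ET

V1 = "application/vnd.company.myapp.customer-v1+xml"
V2 = "application/vnd.company.myapp.customer-v2+xml"


def render_v1(customer: dict) -> str:
    """Original format: a single <name> element."""
    root = ET.Element("customer")
    full_name = f"{customer['first_name']} {customer['last_name']}"
    ET.SubElement(root, "name").text = full_name
    return ET.tostring(root, encoding="unicode")


def render_v2(customer: dict) -> str:
    """Newer format: split name plus salutation."""
    root = ET.Element("customer")
    ET.SubElement(root, "firstName").text = customer["first_name"]
    ET.SubElement(root, "lastName").text = customer["last_name"]
    ET.SubElement(root, "salutation").text = customer["salutation"]
    return ET.tostring(root, encoding="unicode")


# One renderer per supported representation of the same resource.
RENDERERS = {V1: render_v1, V2: render_v2}


def respond(accept_header: str, customer: dict):
    """Return (status, headers, body) for the first supported type listed.

    Simplified negotiation: entries are tried in the order given, and
    parameters such as q-values are dropped rather than honored.
    """
    for item in accept_header.split(","):
        requested = item.split(";", 1)[0].strip().lower()
        renderer = RENDERERS.get(requested)
        if renderer is not None:
            return 200, {"Content-Type": requested}, renderer(customer)
    return 406, {}, ""
```

The URI stays /customer/123 for every client. Old clients keep sending the v1 type and keep getting the old document, while a client asking for a version the server does not know gets a clear refusal instead of a document it cannot parse.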

Conclusion


We can see that the versioning and the typing of the data are not independent concepts for REST APIs - they are in fact the same, since changing either one results in different data being returned. And since REST is built on a very well-tested technology like HTTP, let's take advantage of the capabilities of that technology. It is a simple matter to type the responses being generated, and doing so adheres to 'proper' calling conventions, simplifies the URI, and solves our versioning and typing issues cleanly.