Web API Design
by Shawn Wildermuth
Designing APIs for use over the web is a key part of most projects these days. Knowing the best way to design them with longevity, testability and reliability in mind is important. This course will show you how.
Introducing Web API Design
Web API Design
Welcome to the Web API Design course. My name is Shawn Wildermuth of Wilder Minds. In this course we're going to talk about how to design an API that's reachable over the internet. This course includes pragmatic advice on what are Web APIs, some basic Archetypes for API Designs, how to Version Web APIs, how to Secure Web APIs, and finally we're going to talk about Hypermedia, and what people mean when they talk about it. This course isn't meant to teach you how to build APIs. This course is meant to help you design APIs. The focus of this course is to help you Design an API, and the course looks at the API from the view of the developer that is using your API. This is not specific to Microsoft's ASP.NET Web API technology, it's about APIs on the web in general. This includes REST APIs, we're going to touch a little bit on RPC, and Hypermedia or HATEOAS APIs. These APIs can be written in a variety of languages; this course is not about any specific language. Let's get started.
What are Web APIs?
Here in Module 1 of the Web API Design course, we're going to talk about the introduction of what we mean by Web API and the different technologies that are involved. We're going to start by simply talking about What are Web APIs? We will then move on to Why Design is Important to designing your APIs, and then we'll discuss the different kinds of APIs, the nature of REST, The Role of HTTP, and finally what is Hypermedia. All of these are going to be related to building our APIs. So the central theme of this course is going to be the nature of what APIs really are. We want to be able to take a look at how people are going to be consuming services or data that you're going to be exposing. This is the essential interface to your application, to your system, to your architecture. So these APIs represent a way for the consumers of your API to be able to access those services and data in the simplest way possible. Now back in the day we would design these APIs, and then we would publish them with some printed materials or a help document, or things like that, that would help them understand exactly how to use them. The APIs we're going to talk about should be much more self-describing. We're going to be using technologies like REST and Hypermedia to help us understand the nature of what we're building without being bogged down in the dogma of some of these keywords that have caused arguments through the community; we're going to try to cut through the dogma and really talk about the pragmatic way to design these APIs. This design is important because the developers that are going to be consuming these APIs are going to want to be able to see how to use them in a fairly simple way. They should be not only self-describing but self-documenting as much as possible.
A developer should be able to look at what calls are available, and it should be natural to take the next step and get more data or use more services in their applications. For example, if you're exposing something like customers from an API, the orders for the customers and the line items for those orders should be a natural progression in the API; I shouldn't have to go back and look for each call in the system to look at interrelated data. In that same way, when I want to deal with the different operations on that data, it should be fairly clear, whether that's using Hypermedia to describe those operations or using HTTP verbs to hint at what other operations are allowed. Let's talk about the API ecosystem on the web today.
The API Ecosystem
When we say Web API, what do we really mean? In the beginning of web development, when we needed APIs we typically relied on something called a Remote Procedure Call. This was really a hearken back to an earlier time when we were building systems that needed to talk to each other by using things like COM+ and DCOM or CORBA in the Java world, in order to create systems that could talk to each other over a network connection. When we came into the web, we decided, hey we already know how to do this idea of Remote Procedure Call, let's just do it over the HTTP layer that we're using to communicate and deliver our websites as well. Remote Procedure Call is typically identified by a couple of ideas. One is that they're going to use URI Endpoints or address to get at certain pieces of functionality, services or data, as we talked about before. But unlike some of the technologies we'll look at, the verbs are typically included in these APIs. For example, we could look at an API that said Get Customers as part of the URI. So, if we had /API/GetCustomers, that is more of the type of operation that we would see in typical Remote Procedure Call systems. Back in 2000, this idea of REST sort of blossomed, and there was some discussion about REST versus things like SOAP, but REST has become a common pattern for building these systems, and REST is different from RPC, it also uses URI Endpoints, but it typically dictates that the URI should be resource-based. So, instead of Get Customer it would just be Customers and Orders and Invoices, and the other types of objects in your system, instead of including the verbs in the API name in the URI Endpoint, it does this with HTTP verbs. So, if you want to be able to get at the customer list, you would simply issue a Get HTTP command to that customer's endpoint; if you wanted to create a new one you would post to it, etc. 
And REST also dictates that the server be stateless, so that as we make additional calls into the system, the server isn't holding on to some state that it needs to remember us by, and this is something that happens an awful lot in RPC where we have some token or some session state that knows about us every time we call. Being REST-ful means trying to be stateless. And then finally, the kind of data that you're pulling back or the type of result from those services typically isn't tied to a single type of format. There's a Content Negotiation, which we'll talk about in a few minutes, to help the services figure out how the client needs the data. If the client is a webpage, something like JSON or JSONP is appropriate, whereas a rich client might be more comfortable with, or find it easier to manage, something like XML. The server should care less about what that content is and allow a negotiation to happen to determine how to return that data. And finally, somewhat more recently, there's this idea of something called HATEOAS, not an acronym I'm particularly fond of, but it stands for Hypermedia As The Engine of Application State. Essentially this adds onto the idea of REST-fulness or REST interfaces, and includes in the payload links to do other operations. So, there will be links inside of the payloads of the data from these services that will indicate to the user of the API other operations that can be performed. You might have the idea of submitting a new invoice, and getting an old invoice might give you the URL for submitting an updated version of that invoice. This is the idea behind Hypermedia; we'll talk a little bit more about that soon.
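To make the contrast between the RPC style and the REST style concrete, here is a minimal Python sketch. Only `/API/GetCustomers` style naming comes from the discussion above; the other operation names, the prefix table, and the `rpc_to_rest` helper are hypothetical and only illustrate how a verb moves out of the URI and into the HTTP method.

```python
from typing import Tuple

# Assumed mapping from RPC-style operation prefixes to HTTP verbs.
# These prefixes are illustrative, not part of any real API.
RPC_PREFIX_TO_VERB = {
    "Get": "GET",
    "Create": "POST",
    "Update": "PUT",
    "Delete": "DELETE",
}


def rpc_to_rest(rpc_path: str) -> Tuple[str, str]:
    """Translate an RPC-style path into (HTTP verb, resource-based path)."""
    base, _, operation = rpc_path.rpartition("/")
    for prefix, verb in RPC_PREFIX_TO_VERB.items():
        if operation.startswith(prefix):
            # What remains after the verb prefix is the noun, i.e. the resource.
            resource = operation[len(prefix):].lower()
            return verb, f"{base}/{resource}"
    raise ValueError(f"No verb prefix recognised in {operation!r}")


if __name__ == "__main__":
    for path in ("/api/GetCustomers", "/api/CreateCustomers", "/api/DeleteCustomers"):
        print(path, "->", rpc_to_rest(path))
```

Three RPC endpoints collapse into one resource endpoint, `/api/customers`, distinguished only by the HTTP verb.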
Resource-based Architecture
Before we dive into the actual design, let's start with some of the foundations, and one of these foundations is the idea behind a Resource Based Architecture. Resources are, simply put, Representations of Real World Objects or Entities. We can think about these as People, Invoices, Payments, other things in systems you're building; you're probably already doing this in your existing software development career, where you've created classes, or structures, or databases to store this sort of information and consume this sort of information. We're simply saying that what we're talking about is going to start with this idea of resources. In these resources, relationships are typically nested down a path of those resources. So, if we have the idea of a customer, the customer may have a relationship to its own orders, and those orders might have a relationship to their own order items, and those order items might have a further relationship to the products that are being purchased in each of those line items. You should think of these as Hierarchies or Webs of information, not necessarily Relational Models, because the kind of data that you're going to be producing and consuming in these APIs is typically going to be Hierarchies and Webs, not Relational Models in the sense of tables and related tables in the strict sense of relational databases. In these Architectures, these Resources are normally Represented as URIs. These URIs are Paths to those Resources, so when you want to create your APIs, you're going to want to use URIs to get at those Resources. Query Strings are often used in these URIs as well, but for non-data elements. So you don't want them to represent verbs, you know, operations against those resources, and you don't want them to represent the actual data. Often they're used for different purposes, like sorting, maybe filtering, and sometimes what format you're getting the data back in. Let's see what this looks like in REST.
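As a rough illustration of resources nested down a path, with query strings kept for non-data concerns such as sorting, the following Python sketch builds URIs from path segments. The `resource_uri` helper and the `sort` and `format` parameter names are assumptions made for this example, not something the course defines.

```python
from urllib.parse import urlencode


def resource_uri(*segments, **query) -> str:
    """Join resource names and keys into a path; append optional query options."""
    path = "/" + "/".join(str(segment) for segment in segments)
    # The query string carries options like sorting or format, never verbs.
    return f"{path}?{urlencode(query)}" if query else path


if __name__ == "__main__":
    # customer -> orders -> order items, walking down the hierarchy
    print(resource_uri("customers", 123))
    print(resource_uri("customers", 123, "orders"))
    print(resource_uri("customers", 123, "orders", 7, "items"))
    # hypothetical sorting and format options
    print(resource_uri("customers", sort="name", format="json"))
```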
Introducing REST
So when I started to do the research for this course, I reached out to some other authors of courses out there to talk about what does REST mean to them, and unfortunately, this has become a fairly contentious discussion. I found that a lot of people are either on the one side of saying that REST as a philosophy isn't super useful because it's so dogmatic; the constraints it puts on a system to be blessed as being REST-based becomes overwhelming and not useful to the day-to-day developer. On the other side I heard arguments that REST is really important because it helps us dictate how we want to design those APIs, and that the specific constraints of the original discussions about what REST is and what REST isn't aren't as useful. So my goal here is to talk about REST in a very pragmatic sense; how REST can help you design good APIs, APIs that are going to stand the test of time, that won't have to change often in order to deal with new constraints, but also leverage what is good about developing APIs on the web. So what is REST? The term REST simply means Representational State Transfer. When we talk about Representational, we typically mean the resources that we want to transfer across the wire. These Representational States are typically resources, customer's orders, details, products, etc. But in order to be considered REST-ful, there are some concepts that Roy Fielding included in his original papers on this, and we want to really understand what at the end of the day is useful in REST, and take what will help us build great APIs from REST, and sort of leave the dogmatic strictness of REST sort of on the table. 
The concepts that come clear from REST that I think are important are the Separation of Client and Server, that the clients are going to call into the server based on URIs, and the server is going to try to meet those URI requests, whether that's returning data, whether that's adding data to the server, whether that's changing or deleting data, and that each of these Requests should be Stateless. There's no notion of who the client is, so that the servers can be scaled out more seamlessly. And that, where possible, as many of these requests as possible can be Cached. When we talk about Cachability, we're typically talking about caching of data results, so typically GETs into a system; you aren't really going to be able to cache inserts of new items or deletions of items, but you can cache Requests for getting data, and as long as the data hasn't changed, you can be pretty aggressive about your caching. And we also want to make sure that we're talking about these Uniform Interfaces, that when someone comes up to our API we're really saying that if you were able to get a customer object through the API, and you know that there are order objects out there, you can probably get them with the same pattern that you used for the customer object, and so that really means that you may be walking down the URI by saying customers/1 for the first customer, /orders to get the orders for that customer; I should also be able to say orders/1 to get the first order in the system as well. These URIs are going to look like each other, and that's what we mean by Uniformity or Uniform Interfaces. All of this is really good when we think about what is useful for defining what REST is. Some problems come in because the specific constraints in Roy Fielding's work that qualify your interface as being strictly REST-ful or not REST-ful tend to add a lot of constraints to the system.
In my experience, trying to make your API strictly REST-ful, or adhere to the REST principles, means you're spending a lot more time trying to follow the letter of the law instead of the spirit of the law, and at the end of the day, worrying about whether you're in this walled garden of what it is to be REST-ful or not REST-ful isn't getting your job done. So I like to take what is good about REST, bring it in pragmatically into what I'm building, but not worry so much about strict adherence to those principles in a black and white way. And a lot of this comes about because when we talk to different experts in the community or even developers in the community, there's a split about how important the idea of REST is, because REST can become very dogmatic. REST can be worried about strict adherence to a defined set of rules instead of getting our job done as developers. We can learn a lot and pattern a lot of what we should be doing from REST, but worrying about never straying from that walled garden can get us in trouble in my experience. So, I'm going to teach this course really from a pragmatic sense. I'm going to try to take what is best from REST and apply it to your API designs.
Hypermedia
So the last big piece of the puzzle is this notion of Hypermedia, but let's step back a minute and talk about Hyperlinks. The web is really drawn in by this notion of Hyperlinks, and it was a core concept in the creation of the web initially, the ability to have documents and websites that are linked to each other within themselves, to create really a web of information, to have each of these different pieces around the internet really tied to each other. Hypermedia is a little bit like this. When we talk about Hypermedia, it's really a way for the results of our API calls being as Self-Describing as possible. Hypermedia is simply a way to have links of resources that describe how to process the data or how to get at the data in special ways. These are Hyperlinked for Resources, so you can imagine that the links may include ways to get the cover for an album, it might be a way to add new items to collection; it's a way that the messages that we're getting from our APIs are going to tell us more about how to use the service itself. Is this important? Hypermedia is HATEOAS. This Hypermedia As The Engine Of Application State, is a design pattern that you're seeing in more and more APIs. Using it doesn't make your API better or worse. Depending on what you're trying to accomplish, this can be very useful or just additional overhead. If you are creating APIs that have special ways of describing what they need to do, or maybe needing to be implemented by machine systems, this can become very important. The idea of HATEOAS or Hypermedia is there so that you can create these self-documenting APIs that can be very dynamic. But you can have great APIs without HATEOAS. In fact, there are many, many APIs out there that are considered solid and dependable and well-documented, that don't have any Hypermedia. Again, don't get caught up in that dogma of your API isn't good enough unless you're using every part of that REST stack, including Hypermedia.
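To show what links inside a payload can look like, here is a small Python sketch that decorates an invoice with hypermedia links. The `links` field layout (`rel`, `href`, `method`), the `/api/invoices` base path, and the `with_links` helper are all assumptions for illustration; this is not a standard format such as HAL, and the course does not prescribe one.

```python
import json


def with_links(invoice: dict, base: str = "/api/invoices") -> dict:
    """Return a copy of the invoice with links describing related operations."""
    item_uri = f"{base}/{invoice['id']}"
    links = [
        {"rel": "self", "href": item_uri, "method": "GET"},
        # Tells the client where an updated version of this invoice would go.
        {"rel": "update", "href": item_uri, "method": "PUT"},
        {"rel": "payments", "href": f"{item_uri}/payments", "method": "GET"},
    ]
    return {**invoice, "links": links}


if __name__ == "__main__":
    print(json.dumps(with_links({"id": 42, "total": 99.5}), indent=2))
```

A client that understands this layout can discover the update and payments URIs from the response itself instead of from separate documentation.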
What kind of API to use?
With all this information about how REST works, you still have to tackle the question of whether you should be using REST, and whether you should be using it in the strict sense of what REST mavens out there expect your API to be. The Archetypes or the types of APIs that are out there do vary, and so you have to look at what you're trying to accomplish and see what makes the most sense for your specific project. REST or REST-ful APIs are easy to use and maintain, but if your API doesn't fit this Resource-based Model, using something like a Remote Procedure Call style or some custom API is acceptable. You may find that REST is too limiting for what you're trying to do; maybe it doesn't fit into a resource model, or maybe you really need something that is more driven by procedures, so something like Remote Procedure Call or a custom API may be the right solution. Trying to force what you need to accomplish into the REST or REST-ful model can be counterproductive, so don't get caught up in the idea that your API has to be REST or REST-ful to be a good and valid API, but remember that REST matches many, many use cases, so make sure that you're not avoiding REST just to avoid REST. I generally say to clients that if you're building a new API, designing it with a lot of the REST semantics we've looked at here is the way to start. If you find that it becomes too limiting or you're trying too hard to make it REST-ful, then you can start breaking out of that box, but starting with REST as the natural starting point for your API design is usually what I suggest.
Summary
Now that you're through the first module, you should see that API design is really important. Designing APIs is as important to developers as designing UI layers is to users. Making your APIs easy to use, obvious, and simple is going to increase the adoption of your API; you're going to get more and more developers using that API, which is often the goal of any API. Looking at the requirements of your API and then deciding whether your API should follow a strict REST pattern or something more flexible is going to be a key to whether you're successful or not. There are no right and wrong answers here. You're going to have to make clear decisions about how you want to design these APIs, and know that you may make mistakes. There are better and worse suggestions here, but there are no black and white rights or wrongs. And I know I've been saying it a lot in this module, because I think it's really important: Avoid the dogma of what everyone thinks is the perfect and great API and understand that as developers we need to be pragmatic about these designs. If your decisions about how to create your APIs continue to be pragmatic, and what I mean by that is that they serve the final use cases and serve the developers who want to use your API, then they are probably the right decisions. If you're making design decisions based on whether it will be classified as a true REST-ful API, you're probably making the wrong decision. This has been Module 1 of the Web API Design course. My name is Shawn Wildermuth.
Designing The API
Introduction
Welcome to the second module of the Web API Design course. My name is Shawn Wildermuth. In this module we're going to be talking about designing the actual API itself. We're going to start by talking about how to Design for the URI, Understanding the role of Verbs, dealing with Status Codes, Associations, Designing the actual Results that are returned, ETags, Paging and Partials, and finally Non-Resource based APIs. Let's get started.
URI Design
So to begin our URI design, we're going to want to look at what the URIs actually look like, and one of the ideas I want you to get a sense of is that the URIs should contain Nouns not Verbs. The problem with verbs is that we start very innocently with something like getCustomers and saveCustomers, but very quickly this starts to balloon into a bunch of different sorts of things we want to do with the data. Now instead of having just one endpoint that we're maintaining, we're having to maintain a number of different endpoints as different parts of our URI. The solution is to use Nouns, or Resources, as REST likes to call them. The idea here is that you want to have endpoints that are described as Plural versions of whatever nouns you're going to expose. For example, this API has an endpoint called Customers; you may have Games, Invoices, so you're going to use a Noun to indicate the endpoint that is going to allow you to manipulate a type of object that you have on the server and that you want the users of your API to be able to use. Against those endpoints you're going to use identifiers to point to individual items in the collection. This does not have to be the key that you use naturally. It does not have to be the primary key that is contained in the database or some magic key, it can be one that is generated. For example, being able to get a customer by its key, the API is simply saying look at that noun, the Customers, and then give me the item that is identified by the number 123. This may be the ID or something else. An example of something else is something like Games where you have some unique identifier to find the individual game, like halo-3; or in the case of Invoices, a date for that invoice. The important idea here about that key is that it does not have to be the internal key that you're used to using. It can be something generated, but it has to point at one and only one item. 
It can't be generated every time someone comes to the API, because there's this notion of idempotency, and what that means is that when someone retrieves data or pushes data using this key inside of your API, it always refers to the same object. Very commonly this is going to be some primary key that you have in your storage mechanism, but it could be something that is unique, like the full title of a story in a blog, or it could be the concatenation of data as it's related; maybe customernumber-invoicenumber for invoices. It's up to you to determine what that key is, but it has to remain tied to that single entity.
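Here is a minimal Python sketch of the two kinds of non-database keys mentioned above: a readable key derived from a title (like halo-3) and a key concatenated from related data. Both helper names are hypothetical, and the sketch assumes the key is computed once when the item is created and then stored, so that it keeps pointing at the same item even if the title later changes.

```python
import re


def slug_key(title: str) -> str:
    """Derive a readable key from a title; same input always gives same key."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def invoice_key(customer_number: int, invoice_number: int) -> str:
    """Concatenate related data into one key (customernumber-invoicenumber)."""
    return f"{customer_number}-{invoice_number}"


if __name__ == "__main__":
    print("/games/" + slug_key("Halo 3"))
    print("/invoices/" + invoice_key(123, 7))
```

Neither function uses randomness or the current time, which is what keeps the key from being regenerated differently on each request.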
Understanding Verbs
So, if your URI design is supposed to be nouns, where do Verbs come in? In the case of verbs, we're talking about HTTP verbs, and these verbs can be easily matched to the create, read, update, and delete that we're probably used to when dealing with things like database data. So, if we take our resource endpoint like customers, we know that if we do a GET it's going to return a list of those customers. If we POST a new customer to that endpoint, it's going to create a New Customer. If we PUT a collection of customers to this endpoint, it will do a Batch Update of those customers, and if we try to issue a DELETE against that resource, it's going to give us an error, because we cannot delete the entire list of customers. Conversely, if we issue these same verbs against an individual item in the collection, it's going to do something a little different. So, if we do a GET it's going to get that individual item for us. If we do a POST it's going to give us an Error, because we can't POST a brand new item to an existing item. If we do a PUT it will Update that Item, and if we do a DELETE it will Delete that Item. But what should you return from those verbs as you encounter them? In these cases we're going to look at those same verbs, but figure out what we have to return back to the client so the client knows what to do with them. So in the case of GET it should be pretty obvious; the customers endpoint returns a List of those customers. A GET to the Item endpoint is going to return just the individual Item that the user has pointed at. If we POST, like we talked about before, to the customers endpoint with the data that represents a customer, it will create a New Item, assuming it's all correct. And we should return from that POST a new version of the item that was inserted; not only that the creation happened, but a formatted object that represents that New Item.
And the reason we do that is sometimes part of that creation process is setting things like default properties and generating the key that they're going to need and things like that. So returning them that brand new object is very useful for them to be able to consume what is really the last version of the object as it existed on the server, which is just a moment ago when we created it. If we attempt to POST to an individual item, because we can't POST to an item like we talked about a minute ago, we should return a Status Code, and that Status Code should be an Error Status Code, probably a 400, in that the user of the API has done something incorrect. In the case of PUT, if we take a collection of updatable objects and PUT them to the customer's endpoint, it's going to attempt to update them all and then return a Status Code to say whether it succeeded or not. But if we do a PUT to the item resource, it will return the updated item because it makes sense to return an individual item that was updated, and it may have been updated with more than just the data that was sent to the server; it could include things like the last updated date or update-related keys, and so you always want to return that updated item. In the case of DELETE, because we can't delete the entire list of customers, we should return a Status Code that is an Error Status Code when a DELETE is done on the resource of customers. But if we delete an individual item, we should return a Status Code of whether you were able to delete that individual item. Let's see how this works. I'm going to pop over to Fiddler, I have a little service running locally that I'm going to do some testing on. So, we can see this. I've got an API here for games, and I've got just an identifier to represent the ID of one of the games. 
If I issue a GET against it, the result is going to be, let's look at the Raw first, the result is going to be a 200, so a good Status Code, and there are some headers that we'll talk a bit more about later, and then we actually get the object that we requested. If we look at the same thing without the identifier, we're still getting that 200 result, but the Raw data that we're requesting is actually a set of information, and here is the collection of the different results that we wanted to return. So this is going to return us the collection of games that we requested. In both cases we did a GET, the only difference was whether we were going after an individual game like we were here with games/2, or whether we were going over the entire collection. In that same way, we could take this and change the verb, and let's change it to a PUT so we can update an item. Now for us to update an item we're going to need to do a couple of things in our API. Our API should expect certain things, like a Content-Type. A Content-Type is going to tell it what the format of the request body is, and again we're going to send it a version of our formatted object, and in this case we're going to use JSON, because that's what it's been returning to us, so I'm going to tell it the content type is application/json, and then we can go ahead and actually paste in, in the Body of the Request, the data we want to update. So let's go ahead and update the second item, and let's just go get that item here. Go back to our Composer. If you're not that familiar with Fiddler, there's a course of mine here in the Pluralsight library that covers Fiddler that you might want to look at, but we're just going to paste that whole object that we got from the server originally, and I'm just going to change the name here to Final Fantasy 5th Edition. 
I'm going to be putting this request up, and I've told it the type of data that I'm going to be giving it, and when I execute this, you can see this came back as a 200, it succeeded. And if we look at the Raw result, it not only told us, hey this succeeded, but here is the new result that was on the server. See the data that's being sent back includes that 5th Edition, that change I just made to it, and every subsequent request now for that same object, let's drag this over to Composer, which will copy that request so we can re-execute it. We can see now that we've got the 5th Edition, because that is now what is being stored on the server for us. Just simply by changing the type of verb you're using, those same endpoints can be used to do different sorts of operations, and in this way, the URIs themselves remain fairly simple and easy for users of your API to get, while maintaining a lot of this different behavior for your API.
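The verb behavior walked through in this section can be summarised as a lookup table. The Python sketch below encodes it; the `BEHAVIOR` table and `describe` helper are hypothetical names, and the sketch assumes paths without an `/api` prefix that alternate collection name and key (so an odd number of segments means a collection and an even number means an individual item).

```python
# (HTTP verb, target) -> what the call does, as described in this section.
BEHAVIOR = {
    ("GET", "collection"): "return list",
    ("POST", "collection"): "create new item, return it",
    ("PUT", "collection"): "batch update, return status code",
    ("DELETE", "collection"): "error status code",
    ("GET", "item"): "return item",
    ("POST", "item"): "error status code",
    ("PUT", "item"): "update item, return it",
    ("DELETE", "item"): "delete item, return status code",
}


def describe(verb: str, path: str) -> str:
    """Look up what a verb means against a collection or an item path."""
    segments = [segment for segment in path.split("/") if segment]
    # /games -> collection, /games/2 -> item, /games/2/ratings -> collection
    target = "collection" if len(segments) % 2 == 1 else "item"
    return BEHAVIOR.get((verb.upper(), target), "error status code")


if __name__ == "__main__":
    for verb in ("GET", "POST", "PUT", "DELETE"):
        print(f"{verb:6} /games   -> {describe(verb, '/games')}")
        print(f"{verb:6} /games/2 -> {describe(verb, '/games/2')}")
```

Two URI shapes and four verbs cover all eight cases, which is why the URIs themselves can stay simple.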
Status Codes
When we talk about returning Status Codes, we're talking about HTTP Status Codes. So, the HTTP Spec Defines certain Status Codes, and there's quite a number of them; not all of them are listed here, but a number of them are pretty common and you're going to see them, 200 OK being the most common in that a request has succeeded, and of course everyone knows about things like 404 Not Found, and 500 Internal Server Error, and you can see some of the other Status Codes here. Now our services can use all of these Status Codes if we want, but we find that in a well-defined API, the number of different Status Codes that can be returned from your particular service can be simplified into maybe 8 to 10 Status Codes. There are exceptions to this, and you may need more than 8 or 10, but getting a sense of what different Status Codes each type of call can return, and simplifying that, is going to make it easier for users of your API. Pragmatically, you want to use these Status Codes in returning from your API. At a minimum you should really support 200 for everything worked well, 400 for you did something wrong, you made a bad request, and 500 for the server doesn't know what it's doing and something has gone bad. Most likely you're going to also include these Status Codes: 201 for Created for when you're doing a POST, 304 for Not Modified for when you're returning a cached object, 404 for Not Found, and then 401 and 403 for when you're dealing with Authentication and Authorization. Again, you may use more Status Codes than just these, but this is a good simple set to start with.
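The minimum and commonly added Status Codes listed above can be written down with Python's standard `http.HTTPStatus` enum. The grouping into `MINIMUM` and `COMMON` and the `status_line` helper are naming choices made for this sketch, not part of any specification.

```python
from http import HTTPStatus

# The three codes suggested as the bare minimum.
MINIMUM = [
    HTTPStatus.OK,                     # 200: everything worked
    HTTPStatus.BAD_REQUEST,            # 400: the caller did something wrong
    HTTPStatus.INTERNAL_SERVER_ERROR,  # 500: the server failed
]

# Codes most APIs are likely to add on top of the minimum.
COMMON = [
    HTTPStatus.CREATED,       # 201: a POST created a new item
    HTTPStatus.NOT_MODIFIED,  # 304: the cached copy is still current
    HTTPStatus.UNAUTHORIZED,  # 401: authentication is required
    HTTPStatus.FORBIDDEN,     # 403: authenticated but not allowed
    HTTPStatus.NOT_FOUND,     # 404: no such resource
]


def status_line(status: HTTPStatus) -> str:
    """Format a status as its numeric code followed by the reason phrase."""
    return f"{status.value} {status.phrase}"


if __name__ == "__main__":
    for status in MINIMUM + COMMON:
        print(status_line(status))
```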
Associations
So far we've talked about simple collections in your URIs or the list of customers, and then individual items in that collection using a key. Associations are that next level of object. Associations are about sub-objects of other objects, and we want to use the URI Navigation path to imply that there's a relationship between them. So, for example, to get all of the invoices for a particular customer, you could add another part of the path that says get the invoices for this particular customer. You could see getting the Ratings for a game, or getting the Payments for an Invoice. So, we're talking about getting information that is contextual to the object that it's behind. These Associations should return a List of those Related Objects or a single Object if that's the kind of relationship it has. So that if we look at the API for getting the Invoices of Customers 123, the shape of that result should be the same as if we just went and got a list of invoices. That way the user of your API can really deal with it in that same way; you're really only telling it that by using this path, you don't have to issue a query against Invoices or walk through to find Invoices for a customer, you're going to return only the ones that are relative to that item in the collection. There may be multiple Associations for the same object. So, while we've looked at Customers having Invoices, Customers also may have Payments, and they also may have Shipments, so you can have multiple Associations for each type of object, you just have to make sure that you're dealing with each of those in the correct way. If you have more complex needs, you should just use query string parameters to deal with them. For example, instead of having states that have Customers and those Customers have Invoices, it might be just simpler to allow you to do something like a query where you can say Customers?state=GA or Customers?state=GA and is from this individual salesperson. 
So, instead of trying to fit everything into simple entities or simple Associations, use Associations where it makes sense, but understand you can go to query strings in order to get very specific data as necessary. Associations are an important part of what you're going to design for your API, but don't try to rely on Associations to solve every related entity problem you have in your API.
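The sketch below shows one way the association and query string ideas could fit together. The in-memory data, the field names, and the get helper are all assumptions made up for illustration; a real API would sit behind a web framework and a data store.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical in-memory data standing in for a real data store.
INVOICES = [
    {"id": 1, "customerId": 123, "total": 40.0},
    {"id": 2, "customerId": 456, "total": 15.5},
    {"id": 3, "customerId": 123, "total": 99.0},
]
CUSTOMERS = [
    {"id": 123, "state": "GA"},
    {"id": 456, "state": "FL"},
]


def get(uri: str) -> list[dict]:
    """Resolve a GET for a collection, an association, or a filtered collection."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    query = parse_qs(parts.query)

    # /invoices returns the whole collection.
    if segments == ["invoices"]:
        return INVOICES

    # /customers/{id}/invoices returns the same shape, limited to one customer.
    if len(segments) == 3 and segments[0] == "customers" and segments[2] == "invoices":
        customer_id = int(segments[1])
        return [i for i in INVOICES if i["customerId"] == customer_id]

    # /customers?state=GA handles a more specific need with a query string.
    if segments == ["customers"]:
        state = query.get("state", [None])[0]
        return [c for c in CUSTOMERS if state is None or c["state"] == state]

    raise LookupError(f"no route for {uri}")
```

Note that the association returns plain invoice objects, exactly as the invoices collection does, so a client can handle both results with the same code.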
Formatting Results
So what about the formatting of the results that are returned? How do you know what format they should be in? The best practice is really to use Content Negotiation, and what that means is to use an Accept header in the request to the server to determine what format comes back. The idea behind the Accept header is simply to tell the server what kind of data you can accept; I can accept HTML, text, RSS, whatever it may be. So here you can see a simple request going after our endpoint of games, and looking for that second game like we saw in our example earlier. But here we're sending an Accept header to hint at what kinds of data we can accept. Now the Accept header takes a comma-delimited list of types of data that you can accept. Ordinarily when a user is calling your API, they're only going to list one type here, and that's the type they expect to get back. When there's more than one, the server looks at the list and finds the first one that it can match. So, in this case, if the server supports JSON, it will always return JSON. But, if for some reason the service didn't support JSON but did support XML, it would fall back to XML. You do not have to support all of the types of data that the Accept header lists; it may list quite a few different formats, many of which you're not going to support, so you'll be able to look at the list and find the one that matches the kind of data that you can format. It's also a good idea to have a sane default, so that in case the Accept header isn't included you can fall back to a simple well-known format; XML or JSON as the default is pretty common, usually JSON. So the MIME types for Content Types are probably useful for you to know and understand. JSON is application/json, XML is text/xml, JSONP is application/javascript. JSONP is a JSON message wrapped in a JavaScript function. 
This is often used for cross-domain calls, so if you want to be able to support JSONP, which is a pretty good idea, you're going to want your API to be looking for application/javascript as something that's different from application/json. It's important to know that when you're using JSONP, your API is going to require something like a callback query parameter, because that's going to be the name of the function that's going to be called with the JSON data when it returns. So, in this case you can see our API here is going to need a parameter, usually called callback, and then the name of some function to call, which the user of your API would specify. RSS and Atom have their own application types, as you can see here, and those are two other fairly common formats; I find that most of the APIs I've written in the last few years are really focused on the top there; I wanted to show you that there are different sorts of Content Types. You might even find APIs that return non-textual data, so you might support things like image/jpeg and image/png. So let's see how this Content Type matters. So back in the Composer in Fiddler, let's say we want to be able to get this game like we did earlier. You may have noticed in our earlier examples that this was returning JSON by default, because the way I've defined the server, it's going to fall back to JSON if no Accept header is included. So, if I include an Accept header for text/xml, because I support XML, when I execute this I'm going to see that the result is actually an XML result. It's the same data that I was looking at before, but it's in a different format; it is XML instead of JSON formatted data. If I go back to the Composer and just simply put application/json first and execute it, we can see that we're getting pure JSON back instead. And so this is the way that you should develop your APIs to deal with Accept headers so that the consumer of your APIs can dictate what format the data comes back as. 
There is another approach that in some cases can be helpful, but I would really lean on Content Negotiation when you can, and this other format is being able to use URI components to do this formatting. I certainly don't consider this a best practice, but sometimes it is easier to do this when you have specific requirements for your data. These are often cases where the consumer of your API can't modify things like Accept headers, so if you are running an API that you also want to be able to get at from let's say, Excel or something like that, being able to add a URI component to do that formatting can be helpful. So an example here would be to include a query string parameter that defined what format you were going to support. I've seen some other cases where some APIs also used an extension; I'm not a big fan of this style, I'd rather have the query string, but that is one approach. And in the case of JSONP, you're going to be able to not only do the format but also include other query parameters you may need like the callback parameter.
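Here is a minimal sketch of the format selection described in this section, written in Python. The choose_format function and the format query parameter are hypothetical names. Letting the query parameter win over the Accept header is an assumption, and the sketch deliberately ignores the q weights that a full Accept header parser would honor; it simply takes the first listed type the server supports.

```python
# Media types this hypothetical server can produce, mapped to a short format name.
SUPPORTED = {
    "application/json": "json",
    "text/xml": "xml",
    "application/javascript": "jsonp",
}
DEFAULT_FORMAT = "json"  # the sane default when nothing matches


def choose_format(accept=None, format_param=None):
    """Pick a response format from a ?format= value and/or an Accept header."""
    # Assumption: an explicit ?format= wins, for clients that cannot set headers.
    if format_param in SUPPORTED.values():
        return format_param

    if accept:
        # The Accept header is a comma-delimited list; take the first supported type.
        for item in accept.split(","):
            media_type = item.split(";")[0].strip().lower()
            if media_type in SUPPORTED:
                return SUPPORTED[media_type]

    # No header, or nothing in it that we support: fall back to the default.
    return DEFAULT_FORMAT
```

For example, choose_format("text/html, text/xml, application/json") skips text/html, which this server cannot produce, and settles on XML.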
Result Design
So in designing the actual results you're going to send back, there are simple rules of thumb I like to talk about. When your API returns single results, those single results or individual items should be just simple objects, whether they be XML or JSON objects. So, if I'm going after Customer/123, I should expect that the format of the data coming back should be just an object that represents the data in that object. Now it may contain related or complex types in it, like we can see with the address here, but it is simply an object that represents that item. When defining these objects, I suggest that Member Names shouldn't expose who wrote the server. I see this a lot where you can see if Ruby on Rails is used there are a lot of underscores, and if Node.js is used it's camel-cased, if .NET is used sometimes it is even Pascal-cased, and I hate to do that; I like to pick one format that all my APIs are going to use regardless of what the background is. I tend to prefer, because most of the clients that I'm writing APIs for are JavaScript, to just use camelCasing. CamelCasing ends up being the one that most developers are used to and camelCasing is the way objects are going to look most natural when you're consuming them from JavaScript. If you don't like using camelCasing, if you want to choose another way of defining your Member Names, if you are using Ruby on Rails, and you want to use sort of the underscore approach, that's fine really, the only thing I would ask is to at least be consistent. Designing Collections is a little different than just returning the collection. We actually saw this in one of our earlier examples, and my suggestion is to wrap the collection in a simple object, that way you can send additional information in the body of the collection. So, if we simply return an object that contains both data about the result that we're returning, as well as the actual result, it can be very useful. 
So if we want to be able to return certain kinds of data, like in this case the number of results that were found on the server as well as the actual results themselves, we can do that, so that we have a container for information about the collection not about items in the collection.
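As a rough sketch of both rules of thumb, the Python below normalizes member names to camelCase and wraps a collection in a simple object. The helper names and the totalResults and results property names are assumptions chosen for illustration.

```python
def to_camel(name: str) -> str:
    """Convert a snake_case or PascalCase member name to camelCase."""
    parts = name.split("_")
    head = parts[0][:1].lower() + parts[0][1:]
    return head + "".join(p[:1].upper() + p[1:] for p in parts[1:])


def camelize(value):
    """Recursively rename the members of dicts, including nested dicts and lists."""
    if isinstance(value, dict):
        return {to_camel(k): camelize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [camelize(v) for v in value]
    return value


def wrap_collection(items: list, total: int) -> dict:
    """Wrap a list in an object so there is room for data about the collection."""
    return {"totalResults": total, "results": [camelize(i) for i in items]}
```

Running every outgoing object through one function like camelize is one way to stay consistent no matter which platform produced the data.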
ETags
Another important part of designing your API is to work with Entity Tags. The idea behind Entity Tags is to help the server cache better, and so when you're developing your API you're going to want to support this notion called Entity Tag Headers. Entity Tags support both Strong and Weak Caching, and they're returned as headers in the response. For example, when we make a request, the response can include this ETag. This is an identifier from the server that is basically a version number for the entity that was returned. ETags also support the notion of a Weak Tag. A Weak Tag starts with W/, and the difference between the Strong and the Weak type of ETag is that the Weak Tag says the two objects are semantically the same, whereas a Strong tag indicates that they are byte-by-byte identical, and so depending on the type of data you're dealing with, you may want to use a weak ETag or a strong ETag. Now, what would you do with this ETag? That's really where the important part of the story comes in, because this is returned with a response, and so the user of the API is expected to be able to test for this ETag when it goes and makes a request, so the client should be sending this ETag back to see if a new version is available instead of getting a brand new object and dealing with a brand new object, even when the data hasn't changed. This is typically done with an If-None-Match header, so if I go and request this game object that I did earlier, I would take the value from the ETag and put it in the If-None-Match header, and if it matches, if the server says this was the ETag for this object, it will simply return a 304 or not modified status, instead of a 200, and the body of this response would be empty. 
This is the same notion as cached images when you're dealing with them in typical web development. This allows you to do it at the entity level itself, and you will use the If-None-Match header with the same value of the ETag that was sent to you, and if this did indeed match, the server would return a 304, which is the not modified status; it wouldn't have the body of the entity or the individual item anymore, but simply say it hasn't been modified. In other words, if the last version you had used this ETag, don't return a new copy, which would just be a literal copy of what the client should already be dealing with. If I switch over to Fiddler, we can see that I can make a request to the server to get an object, so I'm going to go ahead and Execute this, and this returned our object as JSON, but in the headers of that response is this ETag. So, as a client this is going to allow me to, if I choose to, as a smart client, I'm going to try and use this as much as possible, I can use this to test whether this object has changed. So, if I go to the Raw view and let's copy that ETag value, and then in the Composer I'm going to use an If-None-Match; this essentially says if the object I'm about to request doesn't have the same ETag, go ahead and return it to me, otherwise I'm going to get a 304 response as shown here, 304 meaning not modified. And the result, if we look at the Raw version, is the same ETag and no body, because it didn't need to send back a copy of the object because it knew that I already had a copy that was the same one that was on the server. This is often used for optimistic concurrency as well. So this ETag can check to see if there's a new version when it's doing something like a PUT. So, for a PUT I can use the If-Match. If the object on the server matches the ETag I have here, then go ahead and update it with this new data. 
If it doesn't match this, then I should probably go back to the server, get the new version without the If-None-Match, present it to the user, have them make their modifications again, and then do the PUT again. This allows me to test in the header of my PUT that I'm dealing with the same object version on the server as I had before. And if I issue a PUT with the If-Match and it fails, it's not going to return a 200 or a 404, or any of those, it's going to return a Status Code of 412, Precondition Failed; this means one of the preconditions in the header, in this case, If-Match, failed, so I know that the update did not really happen, because the If-Match didn't match the ETag of the object that was on the server.
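The two flows above, a conditional GET with If-None-Match and a conditional PUT with If-Match, can be sketched roughly as follows. The function names are hypothetical, and hashing the serialized entity is only one assumed way to produce a strong ETag; a server could just as well use a version counter. The sketch also compares a single ETag value, whereas the real headers can carry a list of tags.

```python
import hashlib
import json


def make_etag(entity: dict) -> str:
    """Derive a strong ETag from the entity's serialized bytes (an assumed scheme)."""
    body = json.dumps(entity, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(entity: dict, if_none_match=None):
    """Return (status, headers, body); 304 with no body if the client's tag is current."""
    etag = make_etag(entity)
    if if_none_match == etag:
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag}, entity


def handle_put(stored: dict, new_data: dict, if_match=None):
    """Replace the stored entity only if the client's tag matches the stored version."""
    if if_match is not None and if_match != make_etag(stored):
        # The object changed since the client read it: 412 Precondition Failed.
        return 412, {"ETag": make_etag(stored)}, stored
    return 200, {"ETag": make_etag(new_data)}, new_data
```

A client that gets the 412 would re-fetch the object, reapply its changes, and retry the PUT with the fresh ETag.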
Paging
In your APIs whenever you're going to deal with returning collections, these lists should always support paging. Now, you can support paging in a number of ways, but let's talk about the importance of paging. The idea behind paging is to prevent your server from returning voluminous amounts of data that the client can't really deal with anyway. If you returned 1,000 records, the user probably isn't going to look through all 1,000 unless the client really wanted to deal with the paging. You also don't want to have to deal with the load of building up those large result sets when your server is busy, and trying to return them to a number of clients. And so it's not about just supporting paging but really requiring paging. So, you can use Query String parameters to accept the paging information, but one of the important aspects of this is making sure the first list that you return is only the first page of that data. It's common to use the Object Wrapper that we talked about earlier for lists, indicating the next and previous links so that it's easy for a client to walk through the pages by just using these additional links. So here's an example of a result that's going to tell us in the body, oh this is how many objects there are for us to get, and by using simple properties we can see the next page and the previous page as URIs back to our service, so we can very easily do this sort of paging. So let's see what that looks like. I have our sample API here. In fact, I'll go ahead and issue a request just to GET the entire games collection, which happens to be more than 1,000 results. When I Execute it, what it's actually going to show me is a smaller number of results; this number of results is actually 25 by default, so I'm not overwhelming the client with the amount of data, I'm telling him how many are available in the server with this total result, but I'm just supplying the first 25 results, and then providing a link or two to the next result. 
So here, the next page, href, is just ?page=2. Now, I could document this in my API, but it's really useful to be able to put it in the actual package that's being returned back. So this means if I go back to Composer and I simply say page=2, what it's now going to return is the next set of 25 elements, and notice now that I'm not on the first page I can include a previous page, which is pretty common. Obviously, the first result doesn't have a previous page, so in this example API we're not even computing that, we're simply saying, hey, the next page is this, so we can see previous is page 1 and next is page 3, but 1 is the default, so in fact getting just games is going to give that first page. And this allows people to build clients that use your APIs in a much simpler fashion. When you're doing paging, even though you might have a default page size, like our example a minute ago had a page size of 25, you might also want to support different page sizes; you might want to support them getting a different amount than the default by maybe supplying a parameter. You should limit this page size to a reasonable amount so as to not incur extra server load. We saw in the API example a moment ago that we could indicate the page here, but we could also indicate the pageSize. In fact, if we go back to our example and add pageSize of let's say 10, our result here now just includes the 10 items. So, we're allowing the user to dictate how big the pages are, and therefore the number of pages is going to go up, but it allows the user to calculate that based on us returning back the total results. Now the terms page and page size aren't actually terribly important; different APIs use different semantics here, some use Take-And-Skip so that there isn't an implicit page size, you can just sort of do what you will. Many OData REST feeds really lean on this because this is a common strategy in things like LINQ in .NET. 
But using the page number, page size, or result size, or whatever you want to call it, the name isn't as important as the actual functionality.
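A paged result along these lines could be built as in the sketch below. The page_of function, the property names in the wrapper, and the cap of 100 items per page are assumptions for illustration; only the default page size of 25 and the page and pageSize parameters come from the example API.

```python
def page_of(items: list, base_uri: str, page: int = 1,
            page_size: int = 25, max_page_size: int = 100) -> dict:
    """Return one page of items wrapped with the total and next/previous links."""
    page = max(page, 1)
    # Clamp the requested size so a single caller cannot ask for everything at once.
    page_size = max(1, min(page_size, max_page_size))

    total = len(items)
    start = (page - 1) * page_size
    envelope = {
        "totalResults": total,
        "results": items[start:start + page_size],
    }
    # Only include links that actually lead somewhere.
    if start + page_size < total:
        envelope["nextPage"] = f"{base_uri}?page={page + 1}&pageSize={page_size}"
    if page > 1:
        envelope["prevPage"] = f"{base_uri}?page={page - 1}&pageSize={page_size}"
    return envelope
```

Because the links are in the body, a client can walk the whole collection by following nextPage until it disappears, without knowing how the query string is built.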
Partial Items
The last consideration for designing your data-driven API, like most of the examples we've looked at so far, is to deal with partial items. Now it's a pretty typical request to request partial items from the service. Query string parameters is a common pattern for this, and you can see a lot of example on the web that do this. The idea behind partial is to allow a user of your API to pick what fields it needs for a particular request, so that the payloads can be smaller instead of you always returning these very verbose objects that the clients themselves aren't really even using. A good example of this would be using the ?fields Query parameter where you simply list the fields that you want the result to include. This pattern of including the names here could also include the names of fields in sub-objects or associations as well; that's really up to you, but the idea would be to allow the user of your API to decide what fields are important. Now this is sort of an optional part of your design, but doing this will really allow users to consume only the parts of the data that they need, as well as reducing the footprint of your service, because you're going to be producing smaller serialized objects and the clients are going to be consuming smaller objects, therefore the roundtrip should be quicker. You can also support Updating of those Partial Items as well, and there's a special verb that's often used for this that's called PATCH. The idea behind PATCH is to be able to send in a partial object or a subset of the original object with just the fields that are updated, and check for concurrency based on the ETag that we talked about a few videos ago. So here's an example. We're using PATCH against an individual item, and we're using the If-Match header to make sure that our ETag is going to match the original requested object. And you can see here that we're sending back a small subset of the full set of fields that this service can return. 
A service normally has about 10 or 12 different fields, but we are only really updating a couple of them here, and so we're only going to send this partial object back; it's going to be the responsibility of your API based on a PATCH to look at this partial object and map it to the full object in order to do that updating. Using the ETag will allow you to do actual concurrency here without having to rely on field by field checking or whatever other semantics you use for doing that. It will know that the version of the object on the server is the same version as was originally requested, so that it should simplify the partial item update story.
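Both halves of the partial item story, selecting fields on a GET and merging a partial object on a PATCH, can be sketched in a few lines of Python. The function names are hypothetical, the ETag is passed in as a plain string rather than computed, and the merge is a shallow one; nested objects would need more care.

```python
def select_fields(entity: dict, fields=None) -> dict:
    """Return only the members named in a comma-delimited ?fields= value."""
    if not fields:
        return dict(entity)  # no fields parameter: return the full object
    wanted = [f.strip() for f in fields.split(",") if f.strip()]
    return {name: entity[name] for name in wanted if name in entity}


def apply_patch(stored: dict, partial: dict, stored_etag: str, if_match: str):
    """Merge a partial object into the stored one if the If-Match ETag is current."""
    if if_match != stored_etag:
        # Someone else changed the object first: 412 Precondition Failed.
        return 412, stored
    # Shallow merge: fields in the partial object overwrite the stored ones.
    return 200, {**stored, **partial}
```

The single ETag comparison is what replaces field-by-field concurrency checks: if the tag matches, the whole stored object is known to be the version the client started from.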
Non-Resource APIs
So what about parts of your API that aren't really dealing with entities or domain models in the same sense that you may be used to? What if you really need to have some Functional Part of your API. Now this normally breaks the rigorousness of a REST-based API, but in a pragmatic sense, you should be able to add these elements as necessary, because we're trying at the end of the day to solve business problems, solve technical problems, provide the sorts of functionality that our users of our API are going to need. These functional parts of your API should be well documented that they are in fact functional parts of your API and not resource APIs; that you're not going to be able to necessarily do things like PUT and POST and DELETE these elements, that they're really about calling GET and doing some functional basis. It's important that you make sure that these parts of the API continue to be completely functional not resource-based. The problem is that you can very quickly get into a case where you start to build functional parts of your API that really should be resource parts of the API. You start to do things like match the idea of something like a stored procedure to a REST-based API, and you're going to very quickly fall into sort of the morass of a badly designed API. So here's an example of one, calculateTax, where you're sending in with Query parameters some definite data that will help you do the calculation. Now, instead of Query parameters you could also send in the body of a formatted data, or JSON data, or XML data to do this sort of operation, but it is doing some sort of functional piece of work here; we're not asking it to add a new invoice, we're not asking it to create a new customer, we're really doing a non-resource based part of our API. You could even see things like restartServer or beginWorldDomination, things that are functionally part of what you really need your APIs to be able to accomplish.
Summary
Let's wrap up this basic API design part of the story. Remember, you can design a great API, but you need to be careful not to surprise your users. By following some basic tenets of the way REST works, you can create APIs that should be familiar to people that have used other APIs, especially other REST-based APIs. You can certainly invent something yourself in creating APIs, and sometimes even create something very functional. But by taking some of the lessons learned in this module, I really hope you follow the patterns of other APIs that are out there. At the end of the day, part of your job as a developer is to protect the server from the user and protect the user from server, and so getting a good balance in the middle of being very useful for the user but not allowing a single user to do something bad to the server, like make really large requests, is really what you're after. By making sure you're using aggressive caching and the use of ETags, you can really allow the user to be a good citizen to your server without you having to go do the work every time someone hits your server for the same data. At the end of the day you need users to make your API a successful API, so making it easy to use and fulfilling the needs of those users are what's most important, not making conference speakers happy or not fitting into what I consider maybe a too rigorous definition of what we would normally call a REST-based interface. Well, that's been Module 2 of Web API Design. My name is Shawn Wildermuth, thank you.
Versioning
Introduction
Welcome to the third module of the Web API Design course, Versioning. My name is Shawn Wildermuth. In Module 3 we're going to talk all about Versioning your APIs. This is going to include why Versioning is important, we're going to show some examples from public APIs and how they're doing Versioning, and we're then going to talk about patterns for Versioning, including URI Path Versioning, URI Parameter Versioning, Content Type Versioning, Custom Header Versioning, and which one to choose and when. We're also going to touch on the topic of Versioning your Resources themselves. Let's get started.
Why Version your API
So the first thing is we want to talk about why Versioning is important. Once you publish an API it's set in stone, and it's set in stone because this publishing isn't a trivial move. You're telling Users and Customers that your API is out there and they can start to write code against it, and that as you make changes to the API you're not going to break their code; it's an implicit contract between you and your customers and users. But requirements for your API are likely to change, and so you're going to need a way to keep the users and customers happy so that their code doesn't break, but also support new requirements or changes to your API. You need to have a way to evolve this API without breaking those existing clients. And one thing to get your head around is that API Versioning isn't the same as your Product Versioning. Releasing a new API version every time you release a product isn't really useful; only version your API when the semantics, the signatures, and the shapes of the data you're dealing with are changing, and so you should resist the temptation to change your API version with every product release; do your best not to tie the two together. And so at the end of the day you have one and only one commandment when dealing with releasing your API, and that is you will not break existing clients, so your API changes themselves aren't going to cause your clients to have to write new code, unless they want the new features, new shapes, new support that your API provides. This doesn't mean that you can't get rid of old versions of your API, but you will need to get rid of those old versions of the API with some care, with a lot of communication, so that when you eventually do stop supporting those APIs, your customers and users have plenty of notice that they're going to have to upgrade or move to a new version of the API.
Is there a right way?
So many of you may be viewing this module to find out the one and best way to version an API, and unfortunately there isn't one. When you look across the web at the different types of APIs out there, they're versioned in sometimes very different ways, and the methods that are used to version APIs can be pretty different; they have different pros, they have different cons, so you have to really find the version of your API that works best for you, and we're going to present a few options for doing that Versioning, but the important idea here is that you're going to Version your API. So there isn't one way to Version your API. We can see existing APIs out there and see some of the options that are chosen for Versioning of those APIs, but many of those public APIs have done it very specifically to meet internal requirements, so it may not be at the behest of the users of an API why Versioning happens, it may be really driving the way that the developers of the API needed to do the Versioning. There are some external requirements as well, and that is how difficult it is to use the API. You may decide to Version using one method or another method specifically about the difficulty in that. I would love to be able to give you the single option that I would recommend, but I can't. There simply is no one right way to Version your API.
Examples of Versioning
So let's look across the web at some public APIs out there that do Versioning in different ways. Let's start with Tumblr, a pretty popular API out there. The Tumblr API uses a URI path to do the Versioning. They essentially have a version embedded in the path of the URI that we can see as the v2 here, so that everything after the v2 is subject to change as the versions of the APIs change, so there is no guarantee that in v3 of the Tumblr API there's going to be a user object at all; they may make small changes or they may make large changes to the API. This pattern of using the URI path is really common, you've probably run into APIs using this method. Another pattern you can see here is from Netflix, and that is using a Query parameter. So, instead of embedding it in the path, it's dictating with a Query string parameter what version of the API to go after. Another style is the Content Negotiation type, and this is where instead of using anything in the URI to indicate the version, the content type that is requested in an Accept header is used, and so this is a custom MIME type that includes the version information. We can see the 1 here indicates the version of that object in the GitHub API. And the last type of Versioning we'll talk about is a Request Header. With Azure, when you're going against their API, they're using a special Request Header called x-ms-version, that is saying this is the version of the API that this is written against, and the version in this case is just a date from when this API was released. Instead of using simple version numbers, they're using release dates to do this Versioning, so you're not tied into having specific version numbers for your entire API, for individual objects, or for individual types of resources. Let's walk through some more details and talk about the pros and cons of each of these four Versioning patterns.
Versioning in the URI Path
So using the URI Path to do your Versioning, the Version becomes Part of the Path to your API. This allows you to make big drastic changes to your API in later versions. Everything below that version number is open to change, though the amount of change you make will really be dictated by how much pain your users and customers can take in their client code. Here's an example where the v1 in the API is dictating what is available in that version of the API. This is a very common pattern; it's probably the most common of all the Versioning I've seen out there in public APIs. And in this case, you can see instead with their version 2 of their API they might decide instead of including CurrentCustomers as a customer type that they now just expose it as a different kind of resource, so that the two APIs don't have to be that similar, though, of course, what they're doing at the end of day is ultimately similar. So the pro of this pattern is that it's very simple to segregate these old APIs, and what that means is that you can really change the patterns of your APIs as time goes along, and so you may decide to support the old APIs and implement a brand new API. The problem here is that this pattern requires a lot of client changes whenever you change the version. So, even if the whole API doesn't change and you're just adding some additional pieces, all your users and customers are going to have to go into their code and change that v1 to v2, unless they only want to support what is in the old APIs. This also increases the size of the URI surface area that you have to support, so that when you release the v2 version, you may have a whole new set of code that is supporting that version, while still having to maintain and fix bugs in the v1, and so often it's an easier decision to use this type of Versioning if you want that sort of broad reach, but at the end of the day you may decide against it because it can be a larger amount of technical debt.
Versioning with a URI Parameter
The next pattern is using a Query String Parameter. One of the interesting parts of using this is that the version can be an Optional Parameter, which means that you can make sure that your API always without the Parameter is tied to a specific version, usually the latest version. So here's an example of using a simple API. There's no version in the URI right now, but if I decide I wanted to go get the Customers of a very specific version of the API, I could then include some Query Parameter that defined what version I was going after. The pros here are that without a version the users are always going to get the latest version of the API. It's going to encourage users and customers to use the edge version of your API, even in some cases when they don't necessarily need to. There are little changes as the versions mature; this also assumes that you're not going to make great big changes as the version also changes. The problem here is that because you have the optional version included as a Query String, you can surprise developers with changes that they don't expect, and at the end of the day, you may be breaking client code because they didn't include the specific version they were going after. Now some of this can be mitigated by not making the version number optional, and by not making it optional you're making it part of the URI syntax, and it's in a lot of ways semantically the same as using the URI path we saw in the last video.
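Here is a rough sketch of the optional versus required version parameter, in Python. The parameter name v, the version strings, and the version_from_query helper are all assumptions made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical versions this API still supports, oldest first.
SUPPORTED_VERSIONS = ("1", "2")
LATEST = SUPPORTED_VERSIONS[-1]


def version_from_query(uri: str, param: str = "v", required: bool = False) -> str:
    """Read the API version from a query string parameter."""
    values = parse_qs(urlsplit(uri).query).get(param)
    if not values:
        if required:
            # Making the parameter mandatory avoids surprising clients later.
            raise ValueError(f"missing required query parameter '{param}'")
        # Optional parameter: callers without a version get the latest one.
        return LATEST
    if values[0] not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported API version '{values[0]}'")
    return values[0]
```

With required left as False, a client that never sends v silently moves to each new version as it ships, which is exactly the surprise described above; passing required=True trades that convenience for safety.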
Versioning with Content Negotiation
The next type of Versioning we'll talk about is with Content Negotiation. And Content Negotiation simply means using a Custom Content Type and Accept Header in the request. Instead of using standard MIME types for the Accept types, application/JSON, text/XML, etc., you're going to use custom MIME types. Here's an example of a GET where the Accept type includes a custom content type. Here's myapp with a version, and then the kind of object I'm looking at; this is a pretty common pattern. You can include formatting information in this Accept Header as well. So you can see here putting a .JSON or a .XML in the content type could also tell the server what kind of content it wants back, which is normally what the Accept header is being used for anyway. This type of Versioning is becoming increasingly popular. It's becoming increasingly popular because the version itself is separated from the surface area of the API itself. When defining your own MIME type, there is a standard for this. The standard indicates that the "vnd." or vendor prefix can be used as a starting point and usually is. This is a reserved beginning of the MIME type, and this indicates that this is a vendor-specific content type. For example, here, we're doing the same sort of request we saw on the previous slide. The Accept header could begin with vnd, and that's more typical of what you're going to want to do in your own API content types. Let's also look at the pros and cons. The pro here is that the API and Resource Version are all in one. So, when we're looking at the version of what our API looks like, but also the resource that we're returning, we're getting a version that's really tying the two together. It takes that version out of the API surface area or the URI so that clients don't have to change except when it comes to including that Accept Header. The con here is that it adds complexity. 
Understanding how headers work and adding headers isn't easy on all platforms, and isn't easy for all levels of developers. This type of Versioning could also encourage more versions throughout your code, so you might have specific versions of a number of your different kinds of resources. This is good in one sense in that you can have finer grained versioning, but it also means you're going to have to support and understand the complex nature of Versioning across your API. This can encourage your developers to create more versions for different small parts of your API, instead of understanding that making no change to your API version is often better so that clients don't have to make their changes.
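On the server side, the custom content type has to be taken apart to find the version and the resource. The sketch below assumes a layout of `application/vnd.<app>.v<version>.<resource>` with an optional `.json` or `.xml` suffix; that exact layout, and the names `RequestedType` and `parse_accept`, are assumptions for illustration, not a standard.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical layout: application/vnd.<app>.v<version>.<resource>[.<format>]
_VENDOR_TYPE = re.compile(
    r"application/vnd\.(?P<app>[a-z0-9]+)\.v(?P<version>\d+)"
    r"\.(?P<resource>[a-z0-9]+)(?:\.(?P<format>json|xml))?"
)


class RequestedType(NamedTuple):
    app: str
    version: int
    resource: str
    format: str


def parse_accept(accept: str) -> Optional[RequestedType]:
    """Parse one vendor media type; return None for anything else."""
    # Drop parameters such as ";q=0.8" and normalise case before matching.
    media_type = accept.split(";", 1)[0].strip().lower()
    match = _VENDOR_TYPE.fullmatch(media_type)
    if match is None:
        return None
    return RequestedType(
        app=match["app"],
        version=int(match["version"]),
        resource=match["resource"],
        format=match["format"] or "json",  # assumed default when no suffix
    )
```

Returning None for standard types such as application/json leaves room for the server to fall back to a default version, which is a design decision you would have to make explicitly.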
Versioning with Request Headers
And finally, the last type of Versioning we'll look at is using Custom Headers inside the request. This should be a header that is only of value to the API, so it is specific to your API. You're going to use an x- type of header; that's a name that most routers or interrogators of traffic are going to ignore. So here's an example of a header that includes a name that your application is going to look for. Here's MyApp-Version, and then some text after it that's going to indicate that version. Now, it's pretty common for these sorts of custom headers to use dates or numbers, so what you include as the actual App-Version is completely up to you. It does not have to be a numbering scheme like the ones we may be used to as developers, with product versions, or assembly versions, or jar versions; we should get away from that and just think of something that is semantically important to what this specific call should be pointed to. The pro here is that it separates the Versioning from the API call signatures much like the Content Negotiation Versioning does, and in this case it's not tied to the resource versioning, so you're really talking about the version of the API itself, not just the version of the resource. The con here is that it adds complexity; much like the Content Negotiation, adding headers isn't easy on all platforms or for all developers.
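As a rough sketch of the server side of this pattern, the function below reads a date-based version from a custom header. The header name `X-MyApp-Version`, the default date, and the function name are assumptions made for the example.

```python
from datetime import date
from typing import Mapping

VERSION_HEADER = "X-MyApp-Version"  # hypothetical header name
DEFAULT_VERSION = date(2013, 1, 1)  # hypothetical version used when no header is sent


def version_from_headers(headers: Mapping[str, str]) -> date:
    """Return the date-based API version requested by the client."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(VERSION_HEADER.lower())
    if raw is None:
        return DEFAULT_VERSION
    # Expects an ISO date such as 2013-06-01; raises ValueError otherwise.
    return date.fromisoformat(raw.strip())
```

A date works well here because the server can route the call to whatever behavior was current on that date, without clients needing to know a numbering scheme.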
Which to Choose?
So ultimately you're going to be asking yourself which one of these patterns should I choose? And there isn't an easy answer for you. Versioning with Content Negotiation and Custom Headers is very popular right now, it's sort of the trend of where Versioning is going, but it does add that complexity. Versioning with URI components is more common because there are more APIs out there that have chosen that pattern. Versioning with URI components tends to be easier to implement but can add technical debt to the backend of your project. Ultimately you're going to need to make a decision based on the kind of requirements you have. In many cases I would probably start with URI Component Versioning to see whether the technical debt is a hindrance to your project, and switch to something like Content Negotiation if you need something finer grained, as well as if you find out that the sophistication of your users is high enough that headers aren't a big deal. An important part of your decision here is how you're going to do Versioning, but understand it's incredibly important that you version your API from the very first release, because that makes it easier for your users to move from version to version as your API matures.
Versioning Resources
So we've talked about Versioning of the API itself, but what about versions of your Resources? In most cases, unless the nature of your Resources is very strict or set in stone by other standards, your Resources Should Be Versioned as well. So the Versioning of the API calls usually isn't enough. The structures and constraints of the kinds of objects you're dealing with and returning via your API and accepting from your API tend to change, and so Versioning your Resources becomes important. If you're already using Versioning with Content Negotiation or Custom Content Types, this is pretty easy because the server will know from the Accept header or the Content Type what version of the object you're expecting and sending, but this does add complexity as we've talked about. Including a version number in the entity body is another option, but it does pollute the data; it adds a piece of data that is about the API and not about the nature of the data, so I don't tend to recommend this approach. If you need Resource Versioning separate from your API, you should probably be doing Content Negotiation Versioning.
Summary
So to wrap up this module, you must version your API; that's sort of the mantra I'm trying to push towards the viewers of this video. Version your API whether you like it or not; it will help with the maturation of your API as time goes on. If your API is public, it has to be versioned, period. There is no one way to do this API Versioning, so starting with something simple and moving to something more complex is a good approach, but if you feel like you're going to have a lot of version churn, choosing one of the more complex approaches like Content Negotiation or Custom Headers is probably the place to start. You're going to want to pick one that matches the maturity level of your users as well as your internal team. If your internal team is not well versed in dealing with a large set of code, you may decide on one of the approaches that leans on less technical debt, but if your team is not as comfortable worrying about routing based on things like Content Headers, then choosing one of the simpler approaches, like the URI Path approach, may be better, so understanding that maturity level is going to help you pick the right one. Using complex versioning isn't evil in itself, but it can increase friction with developers. So, if you decide on using a versioning scheme that is more complex to implement, you're going to have a tougher time reaching out and getting more developers to work with it. Ultimately, you have to be pragmatic about these decisions. Usually using just enough Versioning to start is where I start new API projects, and that then allows us to make changes as the API matures. Remember that as long as you have a resilient community around your APIs, you can sunset APIs at a certain point and choose a whole new scheme. 
If we look at the way that GitHub went from one version of an API scheme they had several years ago to the Content Negotiation type they're using now, they knew that the API wasn't the one thing holding their customers to their product, so you have to be pragmatic about how much Versioning you're going to deal with to protect your users, as well as how much extra effort you're incurring on their part to use your API. There's a balance there that you're going to have to make, and understanding who your users really are is going to be part of that. This has been Module 3 of Versioning APIs. My name is Shawn Wildermuth.
Securing Web APIs
Introduction
Welcome to Module 4 of the Web API Design course, Securing Web APIs. My name is Shawn Wildermuth. In this module we're going to be talking about how to secure your Web APIs, which Threats are coming after your APIs, how to Protect Your API, Cross Domain Security, Who Should You actually Authenticate with, Working with API Keys, understanding User Authentication, and finally making sense of OAuth.
Understanding the Threats
Before we can look at how to secure your API, we need to really understand the nature of security as it relates to developing Web API. Who are the people that are going to come after your secrets, your work, and even the people that are going to come to just purely create disruption to your business. To begin with, do you even need to secure your API? You may be thinking I'm creating an API, I'm going to use it within my own enterprise for my own applications, who is going to care about these APIs? Ask yourself some questions and we can talk about whether you should secure them or not. Are you using any private or personalized data, data that represents individual people that could be at risk? This could be social security numbers; this could be information about your users or employees. If you are, then you should secure it. Are you sending any of this sensitive data across the wire to your applications? If you are, then you need to secure it. If you're using credentials of any kind in order to do authentication, you need to secure it. And finally, are you trying to protect people from getting to your servers but overwhelming them, maybe even to the point of stopping you from being able to serve your real customers? If so, you're going to need to secure it. So, securing a Web API typically becomes a 1st class citizen of your design. Security isn't something you can just throw on top of your existing design and hope that it will work. You have to think about security through the entire process. Who's coming after your API? Well we have users and the browser that are coming across the internet to get at these APIs, and we are going to have threats from different places here. We have the typical man in the middle attacks where we have Eavesdroppers that are looking at the traffic as it goes back and forth and seeing whether there is interesting data there. 
So, if you're trading any sensitive sort of data across that wire, you're going to have to protect against these eavesdroppers. In addition, you can have Hackers or even your own Personnel that are going after that personal data directly at the servers themselves; this is often behind the API some place. This includes intrusion into your systems through your firewalls or even physical security of your server locations. And finally, you have the Users and Hackers themselves, which are working on the other side of the internet, that are taking the code that you may be publishing, or maybe looking at the website that is using those APIs in order to access your servers through your API. These different kinds of threats are the ones you're going to need to make decisions about how you're going to protect against. So at the end of the day you're going to want to protect your API in almost every case. Securing your server infrastructure itself, protecting your data centers with firewalls, and protecting it against physical intrusion, is outside the scope of protecting your API. We'll assume that you're working in an organization that knows that the data center needs to be protected. When you're communicating with your API, you need to have security In-Transit, so as the clients are calling into your servers, how can you protect that data while it's traveling across the wire? And this is usually where SSL is used to protect the actual payloads of the API calls, so that they can't be modified or changed, or even inspected as it crosses over the internet. SSL does have a cost to it, but is usually worth the expense. So understanding that the overhead of actually doing the encryption and decryption on both sides, and even the handshake between the browser and your server, to do the SSL encryption, there is a cost associated with it. But, in terms of protection from people interrogating your traffic, you're going to want to do this as much as possible. 
And finally, and what we're going to mostly talk about in this module, is securing the API itself. And part of the security is to protect yourself from Cross Origin calls, so knowing what domains are using your API and allowing them to make those calls where appropriate. Additionally, you're going to want to have methods for dealing with Authorization and Authentication, so determining who is coming into the system and what rights to that system they have.
Cross Domain Security
So, the first piece we'll talk about is Cross Domain Security. The question you have is should you allow your API to be called from different domains. You may be creating your API directly for your public website, and then maybe this isn't something you want to deal with, you want to only allow your actual website to go after it. Because the way the browsers work is that when they make a call, an ajax call, into a Web API, if the browser itself is hosted in the same domain as the call that's being made, it just simply allows it. If it's in a different domain, if you're crossing domain, let's say going from foo.com to rd.com, the browser itself is not going to allow it, unless there are some special circumstances that will allow it to happen. Making the decision about whether to allow your API from different domains really depends on whether it's a public or a private API. If it's an API simply for use by your application or your web property, then you probably don't need to worry about it. But if it's a public API, because it's going to be called from different parts of the web, you're going to want this to be supported. Now this whole notion of Cross Domain Security only matters when it's being called from a piece of client script on someone else's web property. If someone is writing an app like an iOS, or an Android, or a Windows phone app, to get at this the API is going to work in either case; this really is about Cross Domain access from within the browser. There are Two Approaches to solve this. The first is to support a different format called JSONP as the type of data coming back. The other is to allow something called Cross-origin Resource Sharing, which is a standard out there for doing sort of a handshaking to see whether a domain is allowed to make those calls. So let's talk about each of these individually. What is this format we're talking about with JSONP? It's simply JSON with Padding, JSON being JavaScript Object Notation. 
JSONP is actually JavaScript. It is a small snippet of JavaScript that's returned, instead of a JSON-formatted body. It typically contains a JSON-formatted body, but it's surrounded with a small function call. The expectation is that when it comes back from the server it will be executed, and so the browsers deal with it in a different way, because we very commonly go get JavaScript that we're going to execute in the browser from different domains. If you're getting jQuery from a CDN, or using other sources of CSS or JavaScript, the browser expects to get those from a variety of different domains, so it allows this call to go across to that domain, if the return type is JavaScript. When the data comes back, this JSONP package, which again is just a small piece of JavaScript, is evaluated, which ends up calling a function that contains all the data that you are looking for. So, let's see how this works. I've created a function ahead of time in my client code called updateUser, and this is going to accept some data that I want from a cross domain server. I can then issue a GET to some API, and here I'm calling an API called games, and the host is going to be some different host than I'm actually hosted on. I might be hosted at foo.com, but I'm going after some cross domain host, and notice that part of the API call is passing into the API a callback. This is the name of the function I want to call, and so this updateUser matches the function that I already have existing on my page. And the Accept header here also includes the information about what kind of data I want to come back, and this is application/javascript, it's not application/JSON, which is the way we would get normal JSON. 
This is actually application/JavaScript, so that the content type will be actually JavaScript, because when this GET is executed, what is returned is a small snippet of JavaScript, and it uses that callback mechanism here to say wrap the results in a call to a local function, in this case updateUser, and then inside is a JSON-formatted object that will be passed in as the data to my updateUser call; that's the core of what JSONP does. Let's look at this in a live API. If we go over to Fiddler, I can make a call here to get an object from an API. We've been doing this throughout the course here or there, and if I tell it that I want JSON as the data, when I Execute this, the result is going to be a JSON object that is returned, and in fact, if you look at the JSON result, we'll see we're getting this object from the server, and it's formatted as JSON. But, if we go back to the Composer and change this to JavaScript, it's important to put on this the parameter of callback; this is the parameter that is usually used for APIs to define the name of the callback to use when I'm returning JSONP. What am I going to wrap my JSON result in? Now, you may decide to make this different, but the convention is actually to call it callback. So, I'll call it foo in this case, and Execute this. Again, I'm including the callback, and specifying that I want JavaScript not JSON. And when I execute this, the Raw body that is returned is wrapped in a function called foo. This assumes that when this is evaluated, I actually do have a function called foo that will accept that data as the callback. Now when you're not calling cross domain, this little extra bit of code and bit of ceremony to using a callback may seem kind of odd and unnecessary. 
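The wrapping the server does for JSONP can be sketched in a few lines of Python. The function name `render` and the rule for which callback names are accepted are assumptions for this example; restricting the callback to a plain identifier is a precaution added here, not something the course describes.

```python
import json
import re
from typing import Any, Optional, Tuple

# Only accept plain JavaScript identifiers as callback names, so a caller
# cannot inject arbitrary script through the callback parameter.
_SAFE_CALLBACK = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def render(data: Any, callback: Optional[str] = None) -> Tuple[str, str]:
    """Return (content_type, body) as plain JSON or as a JSONP call."""
    body = json.dumps(data)
    if callback is None:
        return "application/json", body
    if not _SAFE_CALLBACK.fullmatch(callback):
        raise ValueError("invalid callback name")
    # JSONP is the same JSON payload wrapped in a call to the named function.
    return "application/javascript", f"{callback}({body});"
```

Calling `render({"id": 1}, "foo")` produces the body `foo({"id": 1});` with a JavaScript content type, which mirrors what the Fiddler demonstration returns when the callback parameter is set to foo.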
But if we were going to do the same thing across domains, calling from a separate domain, the browsers would allow us to do this, whereas if we tried to do this otherwise, it would fail because it is a cross domain call. When you're designing your data for JSONP, remember that JSONP is just JSON; it's just the same sort of results you're going to return to the clients, but they're going to be wrapped by the single function. So, the data passes just the same as with JSON, it's just packaged as this JavaScript callback. The other approach is to use something called CORS. CORS allows Cross Site support from any of the browsers, but it involves a little handshaking to make it actually work. Now, the different platforms implement CORS in different ways if you want to add it. So, we're not going to talk about how to actually write the CORS code, but I want you to understand what's going on in order for this to actually work. There is some handshaking that goes on between the browser and your service before the browser is allowed to make the cross domain call. Implementing this yourself is possible, but usually if you look at the platform, there are plugins to help you implement this, because it is not a matter of changing the way your servers work, but actually implementing the handshaking that's going to happen before your service is executed. So let's talk about how it works so you can get your head around what the browser is actually doing. This is a little difficult to see because the browsers hide the handshaking part, and even something like Fiddler hides the handshake, so that if it doesn't work you can sort of see what's going on, but if it does work, you aren't going to see the handshake at all. So, CORS starts by making a Cross-Origin Request as it's called. I'm on foo.com, and I'm calling Ebay.com to make a request. 
The server is asked if this Cross Domain request is allowed, and it does this by issuing a command from the browser; this isn't something you write, it's something the browser does automatically, because CORS is a standard. What it does is issue an OPTIONS call to the server, stating the type of method that it was attempting to use. In this example, the original Cross-Origin Request was a POST request. This would say GET if it were a GET request, etc. And the Origin is the name of my site, the site that I'm coming from, whereas the Host is pointing at where I'm going to. The Server Responds with what the Rules are. We're going to allow these methods and we're going to allow these methods from this Origin, and as long as the calls on the page after this adhere to these rules, it will continue to work. So, then the browser actually makes my request, in this case the POST of some data to the Games API; the Access-Control-Request-Method header was only part of the OPTIONS call. The request still includes the Origin, so the server knows where it is actually coming from and can check that it is allowed, but this handshaking of getting the options and then receiving the rules and caching those rules is the part that needs to be implemented on the server for CORS to be allowed. Typically this handshake option is done at a pretty coarse level. You're not necessarily going to allow it just for individual API calls or methods, but you may decide to do things like allow Cross Domain only for GET but not allow things like POST or PUT or DELETE.
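The server's half of the handshake can be sketched as a function that answers the OPTIONS call. The allowed origin, the allowed methods, the cache time, and the function name are assumptions for illustration, and the sketch expects header names in their usual capitalisation; a real platform plugin handles far more cases.

```python
from typing import Dict, Mapping, Tuple

ALLOWED_ORIGINS = {"http://foo.com"}  # hypothetical trusted origin
ALLOWED_METHODS = {"GET", "POST"}     # hypothetical policy


def handle_preflight(headers: Mapping[str, str]) -> Tuple[int, Dict[str, str]]:
    """Answer an OPTIONS preflight with (status, response headers)."""
    origin = headers.get("Origin", "")
    method = headers.get("Access-Control-Request-Method", "").upper()
    if origin not in ALLOWED_ORIGINS or method not in ALLOWED_METHODS:
        # No CORS headers in the reply, so the browser blocks the real call.
        return 403, {}
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        "Access-Control-Max-Age": "600",  # seconds the browser may cache the rules
    }
```

Keeping the policy in two small sets reflects the coarse level at which these rules are usually set: whole origins and whole HTTP methods rather than individual API calls.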
Who Should You Authenticate?
So in many cases you're also going to want to guarantee who the caller is to your API. You need to figure out who is calling in order to figure out who I'm really authenticating as. You're really doing Server-to-Server, or you might think of it as Service-to-Service Authentication, and in this case, it's most common to design it to work with API Keys and Shared Secrets, and we'll talk a bit about how that works in a minute. There's also this thing called User Proxy Authentication. So I've written some piece of code and I want to work with some 3rd party API, but I don't want to have to collect and be responsible for storing the user information, so I want to simply have the right to go over to this 3rd party API and use that API, and that may be your API. And so in this case you're going to use something like OAuth, something that allows you to proxy the actual Authentication schemes to themselves. And finally, there is Direct User Authentication as well. And this is where you're going to simply piggyback on existing systems. So your API may use cookies or tokens that you use as part of normal Authentication with your website, so, if you're using something like ASP.NET, you may be using forms authentication here and also use that same cookie for your API authentication. This Direct User Authentication is almost always used when you're writing an API for your same property; it's not a public API, it's more of a private API that you're using to communicate for your own single page applications or your own apps. There are some important definitions for us to get our head around before we dive in here. First, what is a Credential? We talk about Credentials an awful lot, but I want to make sure that you, the viewers, have a sense of what that word really means. And a Credential is a fact that can describe an entity. Most commonly this fact is something like an identifier, or like an email address or a user name, and another fact may be something like a password. 
So, a set of Credentials is really a list of those facts that helps the server determine you are who you really are. Authentication is the way the server will validate a set of credentials to figure out who you actually are. Now this who you actually are is a curious one, because it's not necessarily a user of the system, it also may be a developer API Key, so it may be validating that when you signed up for an API Key that it is you, the developer, that created that relationship. So, this authentication idea of credentials is true whether you're calling server-to-server, or app-to-server, or whether you're actually authenticating with user credentials. And Authorization. Authorization is the verification that some known entity, an entity that has been validated with authentication, has rights to access a certain resource or a certain action, so that I can say that Bob is logged into the system, and Authentication has validated that it is in fact Bob on the other side of the wire. Now Bob wants to delete a customer. Does Bob have the right to delete that customer or not? And that's where Authorization comes in. Is that entity allowed to do these certain things? Can it read this, can it delete this, can it insert this, can it modify this?
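The split between the two definitions can be shown with a small sketch. The in-memory tables, the permission names, and both function names are hypothetical; in particular, a real system would store hashed passwords rather than the plain text used here to keep the example short.

```python
import hmac
from typing import Dict, Optional, Set

# Hypothetical in-memory stores, for illustration only.
_PASSWORDS: Dict[str, str] = {"bob": "s3cret"}
_PERMISSIONS: Dict[str, Set[str]] = {"bob": {"customer:read"}}


def authenticate(username: str, password: str) -> Optional[str]:
    """Authentication: validate credentials and return the identity."""
    expected = _PASSWORDS.get(username)
    if expected is None:
        return None
    # Constant-time comparison of the two facts supplied as credentials.
    if not hmac.compare_digest(expected.encode(), password.encode()):
        return None
    return username


def authorize(identity: str, permission: str) -> bool:
    """Authorization: may this already-known identity perform the action?"""
    return permission in _PERMISSIONS.get(identity, set())
```

In this sketch Bob authenticates successfully and may read customers, but `authorize("bob", "customer:delete")` is False, which is the Bob-wants-to-delete-a-customer question from the definition.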
Working with API Keys
So let's talk about API Keys. A very common method when developing Web APIs is to issue developers a set of credentials to identify who the developer is instead of the user, and those are normally thought of as these API Keys. There are even a number of services that APIs can register with that will do this management of the API Keys for you. So whether you implement it yourself or use some service, understanding how API Keys work is a pretty important part of it. API Keys are for non-user specific API usage. For example, if you're writing some code to go after Amazon's Web Services, or to look at the Amazon catalog, you're not representing a user that wants to look at what orders they've made, you're simply using the API to get at some data, and those APIs, instead of being truly open and public, still require a relationship with those APIs so that it knows who the person calling the API is. And this is primarily so that when someone uses the API, then it can monitor their usage. If I see someone is looking at the catalog of my products and they're just walking through and reading them all, I should be able to look at logs and see who is just dumping data out, or maybe calling it so often that it's slowing down the service for others, and identify who the developer that's causing that problem is, and then mitigate it in one of a number of ways. These API Keys are just to verify who the developer is making the call so that I can make some of those decisions. So typically, and you're going to see this from lots and lots of public APIs, and you can implement this yourselves, there's this notion of having an API Key and Signing your requests. So, to start out, the developer will go to the API's website and sign up for the API, it's going to give them some personal information, so we can figure out who the person is, and then they will be returned two pieces of information; they'll be returned a magic string that contains an API Key, and then a Shared Secret. 
The Shared Secret is normally used for encryption, and we'll see how that encryption works in just a moment. So using these two bits of information, when I make a call to one of these services, I'm going to need to use my API Key to make a request and to sign my request so that they can guarantee it was me in fact making the request. So the developer is going to create a request, maybe a call just to a REST-based service, and this is going to include what do I want to do, what my API Key is, and what the Timestamp is. So, the API key is being used here to say who I am, but it's not being secured yet. This API key itself is going to be transmitted across the wire so that the API itself can determine who I am. And then the developer is going to sign the request with the Shared Secret. The Shared Secret that was passed to the developer when they registered is not actually passed in as part of the request; it's going to sign that request. Now what signing the request means is to take the complete request itself and use a Shared Secret to run it through a one way encryption, to get a signature for this request when it is being signed. The developer then takes the request that it generated, plus this signature, which is this one way encryption that they have determined, and sends that whole thing to the service. The service then looks up who's making the call through the API Key, oh, there's Bob the developer, I know who Bob is, and I can also get that Shared Secret that I had given them before, because I know who Bob is. We're using the Shared Secret on both sides of the wire, but we're never transmitting it. The service then takes the request that was given and signs the request with the Shared Secret just like the developer did. And it does this so it can then look at the two signatures and make sure they are the same. 
So it's doing one way encryption on its side, the developer did theirs, and then it sees what the developer sent in as the signature to the request and verifies that the signatures are the same. It does this to verify that that Shared Secret is the same that the developer is using, so that it knows, oh, this is actually a developer, because the developer wouldn't just give out his Shared Secret. So the developer knows something that only he and the API know, and I'm using the signature to verify that. It also looks at the timestamp of this request and verifies that the request is within the allotted time. When the developer created the request it included a timestamp that described the time of the request so that it could make this check to see that the signed request isn't old, so that someone couldn't steal the request that was signed and try to issue it an hour, or two hours, or two weeks later, and mimic that they are actually the developer. If it's valid, it goes ahead and executes the request and returns the data. If it's not valid it then returns an error. The API signing is a way to verify that the developer is who the developer says it is. When you're designing your APIs, and you're designing APIs that are going to be used outside the scope of individual users, simply using an API Key and a Shared Secret will allow you to validate that the developer calling into your service is actually them, so that you can have a way to register developers, and have them use the service, and be able to monitor what developer is using your service, without the need for going down and creating user authentication for each user of the system.
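The signing and verification steps above can be sketched with Python's standard library. The course only speaks of a one way encryption; this sketch swaps in HMAC-SHA256, a common keyed hash for this purpose, as an assumption. The field order in the signed message, the five-minute window, the key store, and the function names are likewise hypothetical.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional

MAX_AGE_SECONDS = 300  # hypothetical freshness window

# Hypothetical key store: API key -> shared secret issued at sign-up.
SECRETS: Dict[str, bytes] = {"bob-key": b"bob-shared-secret"}


def sign(method: str, path: str, api_key: str, timestamp: int, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the request fields."""
    message = "\n".join([method.upper(), path, api_key, str(timestamp)])
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(method: str, path: str, api_key: str, timestamp: int,
           signature: str, now: Optional[int] = None) -> bool:
    """Recompute the signature on the server and check the timestamp."""
    secret = SECRETS.get(api_key)  # look up the developer by API key
    if secret is None:
        return False
    current = int(time.time()) if now is None else now
    if abs(current - timestamp) > MAX_AGE_SECONDS:
        return False  # too old or too far in the future: possible replay
    expected = sign(method, path, api_key, timestamp, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

The client calls `sign` with its own copy of the secret and sends the API key, timestamp, and signature; the secret itself never crosses the wire, and a captured request stops verifying once it falls outside the time window.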
User Security
Identifying individual users is a little different. So if you have the notion of users, how do you verify that API is calling as them? So the developers themselves might be identifying themselves with API Keys, but you also are asking those developers to act in the role of the user they're trying to serve, and how do you verify that that user is actually them? If you're building an API for only use on your website, don't worry too much about it, because you can piggyback on the existing website security. Again, you can take the forms authentication, in the case of ASP.NET, or any sort of authentication scheme that you're using on the website and apply it to the API, because if they're logged into the system that means that they can then use the API in the same way. And if you're building clients for these 1st party APIs, those clients might be able to collect those credentials and send them in as header information when it calls into your API. If you're developing Apps against these 1st party APIs, it tends to be a little bit more painful because your Apps will need to collect user credentials and secure them. Securing them is often the harder one, so making decisions about, oh we're going to keep the user name but force the user into typing the password every time, which isn't necessarily a clean and easy way to do, or maybe storing the password in hopefully a secure way, depending on the platform you're on; it can cause additional problems with that. If you're expecting 3rd party developers to use your API, you're not going to want them to identify individual users themselves. You don't want to ask them to collect those credentials, because you don't know how good they are at protecting those credentials, and since they are a window into getting into your system, you don't want them to know user names or passwords at any point, and that's where you would use something like OAuth. 
OAuth will allow these 3rd party API developers to have access to your system while maintaining that only your code is actually accepting those user credentials and mapping them back to something that the developer can use as that individual user; let's see how that works.
OAuth
So, in order to protect the user, we need a way to allow the developer to act as the user in the system, but allow you to maintain control over accepting those actual user credentials. Once you accept the user credentials, you can then trust that 3rd party with some magic token that represents the developer, and the developer, whenever they call into you with this magic token, you'll know that this 3rd party developer is acting as if some real user in your system. The developer themselves won't ever receive these user credentials, and more importantly won't be responsible for storing them and securing them. So this is how it works. Let's talk about what the Developer will do, what the API will do, and ultimately what the User will do; they all have a role in how OAuth works. The developer is going to request an API Key from the API, much like we saw earlier with pure API Key authentication. And the API is going to supply an API Key and a Shared Secret, again just like it was before, because you still need to have a way for the developer and the API to know who is who. Using this API Key and Shared Secret, the developer requests a token called a Request Token. This Request Token is the magic string that the API is going to return to allow it to make this handshaking by forcing the identification of a user in their system and having the API give them permission to act as that user. The API looks at the API Key and Shared Secret being signed, and returns that token. That token is then used to redirect the user to a specific page in the API to allow them to give the credentials. So the developer redirects to the APIs authentication URI, and the API is going to display a UI for the user. The user is going to supply their credentials if they're not logged in, or once they log in they're going to confirm that the user wants to give the developer the rights to call as them. 
If you've done anything like Facebook or Twitter integration, or even, as a user, allowed a Facebook App to be installed or allowed an application to use Twitter, you have done this before. It forwards you over to the Twitter.com page; it will say Bob's development check wants you to give access to your Twitter account, you say yes because you want whatever Bob is going to give you. Once the user has confirmed this authorization, the API itself redirects back to the developer. So the developer can then request an Access Token. This is a separate token that the developer is going to keep, sometimes for quite a while, in order to make requests to the API. When the developer requests this Access Token, the Access Token is going to come back with a Timeout. Here's a token to make calls into my API, and this is for how long you can use it. From that point the developer can use the API with the Access Token until that timeout occurs. They can make multiple calls as the user, as far as the API is concerned, until that timeout happens, and sometimes this Access Token is good for quite a while; sometimes it's 20 minutes, sometimes it's two weeks, it depends on the nature of your API. If you're developing a banking system, it should be good for a couple of minutes. If you're developing something like Twitter you might want to make it a sliding expiration so that the timeout is good for quite a while. Using that Access Token allows them to use the API, and when you're designing your API, you have to look at this Access Token and be able to determine which user they're calling the API as. When you're developing your API, you shouldn't expect that the user credentials are going to be part of the header, and the user credentials especially are not going to be part of the URI. You're not going to develop Get All Messages From User? User=Bob, right?
You're going to assume that the Access Token that's going to be sent in is going to be mapped to Bob before you determine what data that resource is going to return. And so developing your API as it relates to individual users is going to be very clear and obvious when you start to work with something like OAuth, because it's going to assume that you're going to identify the user without the need for identifying the user by name using something like a query string or a path variable. So how do you design for OAuth? First of all, I want to make it clear that you should probably not implement OAuth directly. Most platforms are going to have a way to implement the OAuth for you. Understanding the flow is going to be useful, but depending on your platform, you're going to want to allow a library or service to implement the OAuth for you, because there are a lot of little moving pieces. Most of the time when I'm developing an API, the last thing I want to do is build a lot of plumbing code. Because it's complex and there are a lot of moving pieces, getting it wrong means you're likely going to have an insecure API, so rely on the benefit of more mature code, so that your OAuth is going to be as secure as OAuth can be. You might also decide to integrate your OAuth using 3rd party identities, so you may use Facebook, Google, or Microsoft ID to determine who the logged in person is. Even if they are an individual person in your system, you may be using these 3rd party identities. Using 3rd party identities can be very helpful when you don't want to store your own identities; you just want to be able to individually ID users. Users don't want their own IDs with your system anyway in most cases, they don't necessarily want to have to remember a username and password for your system and your system alone, unless it's a big part of their environment, like if you're building Enterprise Apps. Users will do it if there's a big payoff.
So if you're providing them a service, especially if it's a free service that has a lot of benefit, they will want their own IDs; it's just not that common.
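The idea that an Access Token carries a timeout and is mapped back to a user on the server can be sketched roughly as follows. This is an in-memory illustration only, in line with the advice above to let a library or service handle real OAuth; the class `AccessTokenStore`, the function `get_messages`, and the sample data are assumptions, not part of any OAuth library.

```python
import secrets
import time
from typing import Dict, List, Optional, Tuple


class AccessTokenStore:
    """Maps opaque access tokens to (user, expiry); a stand-in for real storage."""

    def __init__(self, lifetime_seconds: float) -> None:
        self.lifetime_seconds = lifetime_seconds
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def issue(self, user: str, now: Optional[float] = None) -> str:
        """Create a token for a user who has confirmed the authorization."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self._tokens[token] = (user, now + self.lifetime_seconds)
        return token

    def user_for(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the user the token maps to, or None if unknown or timed out."""
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user, expires_at = entry
        if now >= expires_at:
            del self._tokens[token]  # the timeout has passed
            return None
        return user


def get_messages(
    store: AccessTokenStore,
    token: str,
    messages_by_user: Dict[str, List[str]],
    now: Optional[float] = None,
) -> Optional[List[str]]:
    """Resolve the caller from the token, never from a User=... query string."""
    user = store.user_for(token, now=now)
    if user is None:
        return None  # a real API would answer with an authorization error
    return messages_by_user.get(user, [])
```

Because the user is looked up from the token, the resource never needs a user name in the query string or the path.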
Summary
So let's wrap up some of these ideas. When you're securing your API, you should make it part of your original design. Don't hope that you can tack on security later, or that some higher level piece will just make it secure on its own. Don't try and just drop security on top of your API and hope it works well, think about securing it from the very first step to the last step of developing your API. You want to make sure that the default behavior is secure, so that developers don't have to go an extra step to make it secure. That means never returning results that may be insecure. A common case for this, one I've seen quite a lot, is if you have an existing system that already has data resources that it wants to return, let's say employees as an example. Even though you probably wouldn't ever write code through the API that used something like a social security number or a spouse's name, you may end up leaking some of that information through the API because you simply want to return the same entity objects you're using throughout the system. So be sure that your APIs are secured by default by making sure that the data that you're returning is pruned, so that data that may be fine inside your organization or behind the firewall doesn't get out over the internet. This has been Securing Your API, Module 4 of the Web API Design Course. My name is Shawn Wildermuth of Wilder Minds.
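The pruning advice in this summary can be sketched as an explicit allow-list, so that a new field on the internal entity stays private until someone deliberately exposes it. The field names and the helper `to_api_result` are hypothetical.

```python
from typing import Any, Dict

# Hypothetical allow-list: only these employee fields may leave the system.
PUBLIC_EMPLOYEE_FIELDS = ("id", "name", "title")


def to_api_result(employee: Dict[str, Any]) -> Dict[str, Any]:
    """Copy only allow-listed fields from the internal entity into the API result."""
    return {
        field: employee[field]
        for field in PUBLIC_EMPLOYEE_FIELDS
        if field in employee
    }
```

An allow-list is secure by default in the sense described above: forgetting to update it hides data rather than leaking it.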
Hypermedia
Introduction
Welcome to Module 5 of the Web API Design course. My name is Shawn Wildermuth. In this module we're going to be talking about Hypermedia, and what that means to you as the designer of a Web API. This module is going to cover the notions of Hypermedia, and this is going to include explaining what exactly we mean by Hypermedia, how this relates to REST and HATEOAS, what are Links, looking at some standard formats for HATEOAS, including what is HAL, and what is Collection+JSON? Let's get started.
REST and HATEOAS
So what exactly is Hypermedia? When the web was being envisioned, part of the magic that makes the web work so well is the idea that pages can have hyperlinks over to different parts of the web, so that the web becomes interconnected. And so if we look at something like a standard HTML page, we can see that using typical anchors allows us to link over to other parts of the web just by using an href. We can also use a property of the anchor tag called rel to describe the kind of link we're talking about. There's actually a formalized link tag in HTML that many of you probably already use. The link tag is often used to link over to the style sheet for your page, but this link syntax is actually used for a variety of reasons to link this document over with other documents, so I can say that there's an alternative version of this page, this is the language using the hreflang, that there's a version of this for Arabic, and this is the URL for that, and so we're linking this document to another version of the same document but that is for a different type of reader. We can also do the same thing for an alternate link to a print version of this page. There's also the notion of a type of link that can indicate what is the next or previous page in a cycle of pages? If you're doing something like an article where there's a page 1, and a page 2, and a page 3, you can use these links to indicate to the browser that there is the notion that this is a page and knowing how to go to the next or back to the previous page. So these are different ways that HTML allows us to link a single document or a single URL to other URLs by what is returned back, and hypermedia is meant to do this at the API level. So, essentially hypermedia are just links for an API. These links are essentially documentation to the developer so that they can know how to use your API. In many ways this will help achieve the goal of having the API being as self-describing as possible. 
In most cases, you can't have the API becoming completely self-describing, but this can really help inform the users of the API how to do different things with your API. These links become the State of your Application, and become the model for how to take data that you may be returning as the core of what your API does, but also indicate verbs so that you know how to insert a new invoice or delete an invoice, and allow you to include state that's not just the state of the data on the server, but also ways to take that data and do something with it. This is where the notion of HATEOAS comes in, or Hypermedia As The Engine of Application State. This is something that the REST thesis talks about as an important idea of creating these APIs that are interrelated. Unfortunately, this is a really awful acronym, and you're going to hear me saying it a lot in this course, mostly with my teeth clenched. A long acronym like this can make it a little more confusing than it needs to be. Essentially, you should think of Hypermedia as simply a way to link results with other results or operations in your API. So let's go back to a slide that you may have seen in the first module to really understand where this Hypermedia or HATEOAS fits in. We talked about how simple HTTP and remote procedure calls allow us to create verbs in the API, have URI endpoints, and that the REST-ful nature of APIs allows us to sort of layer on top of that, so that we can have resource-based URIs, verbs, and statelessness, and even caching in our APIs, much like up to this point you've probably learned. This last piece is allowing you to have these relationships between parts of your API using these things called links, and that's really what HATEOAS is adding to the picture. So we're taking what we've learned about creating these REST-ful but pragmatic APIs, and adding the ability for you to indicate information about the results you're giving back, and how to do things with that data.
What are Links?
So in designing your API, if you want to include links, what are we really talking about? Any links you include should be about helping the developer use the API, so that they don't have to go craft URLs and go look up documentation, when what they may be doing may be a next logical step. Some common scenarios for this are things like Paging, Creating New Items, Retrieving Associations, or other sorts of actions like updating an invoice or submitting a new work order; those are common scenarios that may be communicated as these links. Let's look at a simple example using JSON. Now these links aren't limited to only JSON, you could also use them with XML APIs. I'm going to use examples that are going to use JSON, but you can apply the same idea to XML as well. So here we start with just a simple begin and end of a JSON object, and we might have some data that you might be familiar with returning. So, we might have some data about the result, like totalResults or Success, and then a list of results that I've abbreviated here with an ellipsis. In here we may also include a set of links that indicate things that can be done with the results. Links typically are going to have two pieces at a minimum: an href, which is typically a URI to a specific operation, and a rel, which is going to indicate what the link is for. So in this case we can see that we're including an href to the previous page of results that we wanted. You could certainly document what this looks like and force people to construct it themselves, but because going to the previous page or going to the next page is such a common occurrence, including it as links here can self-document your API, and make it easier for your developers to use.
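As a rough sketch of the paged result just described, the function below builds a result object with `totalResults`, `results`, and a `links` list of `rel`/`href` pairs. The rel names `prevPage` and `nextPage`, the `page` query parameter, and the base URL are assumptions for illustration; the course slides are not reproduced here.

```python
from typing import Any, Dict, List


def build_page(
    items: List[Any], page: int, page_size: int, base_url: str
) -> Dict[str, Any]:
    """Return one page of results (1-based) with previous/next links when they exist."""
    start = (page - 1) * page_size
    links = []
    if page > 1:
        links.append({"rel": "prevPage", "href": f"{base_url}?page={page - 1}"})
    if start + page_size < len(items):
        links.append({"rel": "nextPage", "href": f"{base_url}?page={page + 1}"})
    return {
        "totalResults": len(items),
        "results": items[start:start + page_size],
        "links": links,
    }
```

The server already knows whether a next or previous page exists, so it only emits the links that are valid for this result.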
Instead of having to write code in, let's say, the JavaScript of a webpage to determine and to craft this URI, they can just take it as part of the results and go, oh, when someone wants the previous or the next page, I already have a constructed URI that is valid. You can also have it for other sorts of operations like insert, and in this case the URI here is going to be related to a POST instead of a GET. We don't have indications here for the method, but it's also fairly common for these links to include another parameter called method so that they know what the URI is, and what HTTP verb to use. Here's another example, but instead of being with a collection that is returned, this is going to be with an individual object that may be returned. And in this case I'm showing some data that is being returned, and then we have that same sort of idea of links. And here, we can indicate a self link, and this is very common where the item or the object that is returned will include a link to its own resource URI, so that if you needed to do an update, or insert, or delete, you'd have the URI that represents this object that you returned. You may also have other links that relate to associations, so in order to look at this game's rating, you may have a URI that indicates that this is the rating link, and then use this to go ahead and get that additional data if necessary.
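From the consuming developer's side, using these links can be as simple as looking one up by its rel. The sketch below assumes a single game result with `self`, `rating`, and `delete` links and an optional `method` property; the URIs and the helper `find_link` are made up for illustration.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical single-item result, shaped like the individual-object example above.
game: Dict[str, Any] = {
    "id": 1,
    "name": "Halo",
    "links": [
        {"rel": "self", "href": "/api/games/1"},
        {"rel": "rating", "href": "/api/games/1/rating"},
        {"rel": "delete", "href": "/api/games/1", "method": "DELETE"},
    ],
}


def find_link(resource: Dict[str, Any], rel: str) -> Optional[Tuple[str, str]]:
    """Return (HTTP method, href) for the first link with the given rel, if any."""
    for link in resource.get("links", []):
        if link.get("rel") == rel:
            # Links without an explicit method are treated as plain GETs here.
            return link.get("method", "GET"), link["href"]
    return None
```

For example, `find_link(game, "rating")` gives the client the verb and URI for the rating without any URL crafting on its side.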
Standard HATEOAS Formats
I want to introduce an idea here that is related to the Versioning story we saw earlier, but ends up being important when we're talking about standard ways of looking at Hypermedia data, and that is something called Profile Media Types. The idea here is that profiles are simply the descriptions of what the data is that you're returning. This is an alternative to using the custom MIME type that we saw in the versioning section, and it's usually used in coordination with a MIME type. Profile Media Types are typically included at the back of an existing MIME type in an Accept header, and servers can return this type as the content type that was retrieved, so that your client code could automatically know these Profile Media Types. So as an example, if we were going to get this order, we could use an Accept header to say, what is the format we're looking for? In this case it's JSON we're looking for, but at the end of the JSON content type we're going to include the profile of the schema we want. This is often used to separate the idea of the type of format we want versus the object identifier or version that we want. And when we look at the standard formats, they use these pretty commonly to define the type of data that is returned, or more interestingly, the version of the data type that is returned. So there are a few people out there that are creating new standards for how to return Hypermedia data. There are a handful of them out there, but I've chosen to focus on just a couple. These standards are emerging, so they aren't final or set in stone; there isn't a single standard for you to go after, and in some ways by developing APIs now, you're hitching your wagon to a standard that may or may not become the prevailing standard. The two out there we're going to talk about are HAL, or Hypertext Application Language, and Collection+JSON.
These are the two that right now have the most community support, and they do things in a fairly different way and for different reasons, so we're going to explore both and talk about why you may want to choose one over the other. These standards are based on Custom Content Types, and then using the Profile Media Type to define the structure of the data that is returned or accepted. The Content Type defines the data formatting, and in the case of something like HAL or Collection+JSON, it's going to define which of those standards to use, and then the Profile Media Type is going to define the structure of that data. In this way it's going to keep the format and the versioning of the type of or the structure of the data separate. We saw in the Versioning chapter that using Content Types we could version what we were getting back, but invariably that was really mixing the two metaphors, we were mixing both the formatting and the versioning into a single sort of entity; this separates the two. Let's look at these two format types.
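To illustrate how the format and the profile stay separate, here is a small sketch that appends a profile parameter to a media type and reads it back. The profile URL is a placeholder, the helper names are assumptions, and the parser is deliberately naive: it would mis-split a quoted value that contains a semicolon.

```python
from typing import Dict, Tuple


def build_accept(media_type: str, profile: str) -> str:
    """Append a profile parameter to the back of an existing media type."""
    return f'{media_type}; profile="{profile}"'


def parse_media_type(value: str) -> Tuple[str, Dict[str, str]]:
    """Split a header value into the media type and its parameters (naive parser)."""
    parts = [part.strip() for part in value.split(";")]
    media_type = parts[0].lower()
    params: Dict[str, str] = {}
    for part in parts[1:]:
        if "=" in part:
            name, _, raw = part.partition("=")
            params[name.strip().lower()] = raw.strip().strip('"')
    return media_type, params
```

With this shape, the media type answers "how is the data formatted" while the profile answers "which structure or version of the data is this", so either can change without touching the other.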
HAL
So HAL stands for Hypertext Application Language. This language is meant to be a lean Hypermedia type. It wants to be more brief than some of the other proposed standards out there, and in fact it's the one that I'm leaning most heavily towards because I like the brevity of what it's trying to do; it's trying to do just enough Hypermedia to be helpful without inundating you with a lot of structure or forcing you to restructure your code with a lot of ceremony. It supports formats you're already used to like JSON and XML to include resources and links together. The Content Type is called application/hal+json when you're expecting or sending back JSON, or application/hal+xml for XML. More typically, you're going to use a Profile Media Type to define the kind of data you're actually looking for; in this case looking for the order type from the Wilder Minds site. Here's a useful picture that I got actually from the HAL specification itself; you can see the link on the bottom there, to stateless.co. This defines pretty much what HAL is going to look like. It is resources that have links associated with them, and then embedded resources that may have the same structure, and this can continue down the chain. So, the top level object may have embedded resources, and each of those objects may themselves have links as well as other embedded resources. Let's look at an actual example. So here, much like the examples we looked at earlier, we're just looking at a simple JSON object. And the first thing you'll notice is that _links is the standard name they want to give to any of the sort of links that are being passed down as this result. Now one of the first things you'll see is the self link, and this is going to indicate that this is the URL that was retrieved, and the object that is returned represents this self URL. Now this href could be relative or absolute; in this example they are relative, and then it can include other links.
Much like we saw in our earlier examples, this is more formalized, and one of the things you'll notice is that instead of having a separate property called rel, they're simply making an object that you can look up by name. This makes it a little easier to consume from JavaScript, which is one of the more common languages that you're going to consume this from. And it can even have what are called template links. This is part of the HAL Spec that actually points to a different specification for defining how to define URLs that have optional or templatable elements, and we'll look at what the templates look like in a minute, but being able to say that this is a templated URI is useful to tell the user of this API that here is a URI that's useful to you, and then you're going to be able to template it with your own data. You're still going to include data that is simply part of that return, whether it's things like total count or it's the error code, or time to execute, those sorts of things, you're still going to include a simple property, but they have the special _embedded property that's going to include the actual results, the embedded results to the sort of top level object that's being returned, and so we may have something like games or results that is the actual data that you actually asked for. And in this case, the game object itself just has basic data that you might be used to, in this case price, currency, and name of the game, but like the top level object, each of these individual items may also have links. Here we can see just a single self link, but you may have other links for things like deletion and insertion, or updating, and those sorts of links, or even links to associated data, much like we saw in the earlier example before we introduced HAL. So template links are something I want to kind of bring together and help you understand what they're all about. 
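The HAL shape described so far (a `_links` object keyed by name, plain properties, and an `_embedded` property holding the actual results) can be sketched like this. The URIs, the `find` link name, and the property `totalCount` are assumptions chosen to echo the games example, not a copy of the slide.

```python
from typing import Any, Dict, List


def hal_game(game_id: int, name: str, price: float, currency: str) -> Dict[str, Any]:
    """One embedded game resource with its own self link."""
    return {
        "_links": {"self": {"href": f"/games/{game_id}"}},
        "name": name,
        "price": price,
        "currency": currency,
    }


def hal_game_list(games: List[Dict[str, Any]], total_count: int) -> Dict[str, Any]:
    """Top-level resource: links looked up by name, simple data, embedded results."""
    return {
        "_links": {
            "self": {"href": "/games"},
            "next": {"href": "/games?page=2"},
            # A templated link: the client fills in the part inside the braces.
            "find": {"href": "/games{?query}", "templated": True},
        },
        "totalCount": total_count,
        "_embedded": {"games": games},
    }
```

Because `_links` is an object rather than a list, a client can reach a link directly by name, for example `doc["_links"]["next"]["href"]`, without scanning for a matching rel.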
Defining a link as being templated is really pointing at template URIs, and here you can see the URI if you want to see the full Spec of how they're defined. HAL doesn't define what templates look like; HAL simply points at another standard that is out there for template links. But essentially, you're taking what is inside the curly braces and saying, this is where you're going to put data; in this case, the games URI is going to gain a ?query parameter, followed by whatever data is supplied for query. So, if I was searching for games that had the name halo in them, I could indicate that here. And there's a full syntax for adding multiple parts of the query: the variable may be part of the URI, it may be part of the query string, it may have multiple parts. All of that is discussed in RFC 6570, so I don't want to duplicate that effort here, but understand that the power of what HAL is doing with templated links is allowing you to not only have simple links over to discrete operations, but also links that may include variable information, like a search link would have here.
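Here is a minimal sketch of expanding such a templated link. It handles only the form-style query expression of RFC 6570 (the `{?name}` and `{?a,b}` forms) with simple string-like values; everything else in the RFC is out of scope, and the function name `expand_query_template` is an assumption. A real client would more likely rely on an existing URI template library.

```python
import re
from typing import Any, Dict
from urllib.parse import quote

# Matches only form-style query expressions such as {?query} or {?query,page}.
_QUERY_EXPR = re.compile(r"\{\?([^}]*)\}")


def expand_query_template(template: str, values: Dict[str, Any]) -> str:
    """Expand {?a,b} expressions; variables with no value are left out."""

    def replace(match: "re.Match[str]") -> str:
        names = [name.strip() for name in match.group(1).split(",")]
        pairs = [
            f"{name}={quote(str(values[name]), safe='')}"
            for name in names
            if values.get(name) is not None
        ]
        return "?" + "&".join(pairs) if pairs else ""

    return _QUERY_EXPR.sub(replace, template)
```

Searching for games with halo in the name would then be `expand_query_template("/games{?query}", {"query": "halo"})`, which yields `/games?query=halo`.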
Collection+JSON
The next standard we'll look at for communicating Hypermedia in your results is Collection+JSON. And this standard is really for allowing the standard reading and writing of collections; it's meant to be very self-describing. It defines a standard way to communicate lists and individual items, and also includes UI information, so that if you're trying to fill in an object, some object that's inside of a list, you're able to know what sort of labels for input controls you should use in order to gather that information. And so, it contains a lot more metadata than HAL does. It uses the MIME type of application/vnd.collection+json to indicate that, and you can use the profile media type as well; that's very common in Collection+JSON. Collection+JSON doesn't have a counterpart of Collection+XML. Some people have sort of talked about creating that standard, but Collection+JSON is fairly tied to the way that JSON works. Well, let's take a look at an example. In Collection+JSON, it always starts with that collection element as the top object, so it sort of creates one nested object below it that includes information like what the version is and what the href of this collection is. It also has a set of links similar to the way that HAL does, but they're defining them in the typical rel and href properties, and then items in that collection are defined in another property, and each item itself has its own href, and then the data for the items in that collection. So in this case it's just simple JSON describing the objects of the data. And each of those items can still include their own links. In this case we're saying the link is to a blog, and notice the last piece of this where it says prompt:Blog; this is an indicator to you of how you could create a link and what to show to the user. When they say prompt, they mean what is visible to a user that wants to know where this link goes.
In addition to what we've seen, they also have the notion of queries, and so similar to what you saw in the templated links in HAL, queries allow you to define the kinds of different search semantics you have. Again, they're including the prompt property here, so that you know what to show to the user when they're using this link. It also includes a data section so you know what query string parameters to include. And the last one is the notion of what's called template, and the idea behind template is really to give you information about how to build a UI. So this is saying for the property that you want to create, called full-name, the prompt should be full-name, and the value should be, in this case, a string. Now, it's giving you empty values because this is the same data structure that, once filled in with the values, you'll actually post up to the server to make those changes. Again, Collection+JSON is focused on how to create and maintain certain collections up on the web. The sweet spot for Collection+JSON is for simple machine-driven lists; this is really where it excels. So, if you want to be able to point it at an arbitrary Collection+JSON data source, you should be able to build a UI based on what they're giving you, simply from what the API is returning. I find Collection+JSON a bit too verbose for real data-driven REST services, so as I'm creating most REST services, if I want to include Hypermedia, I'm really leaning on HAL instead of Collection+JSON, because of the additional verbosity. The idea of including these UI elements really allows you to create automated code, but I don't find that this works all that well in practice unless you're dealing with maybe SharePoint lists or the kind of information you're dealing with is fairly fixed. In practice, in most APIs I've developed, using something like Collection+JSON is just going to sort of bloat the API and not going to be as informative to my users.
And at the end of the day, you just find that the payloads you're returning to your developers using your APIs end up being bloated because of it.
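For comparison with HAL, here is a rough sketch of the Collection+JSON pieces walked through above: the top-level `collection` object with `version`, `href`, `links`, `items`, `queries`, and `template`, plus a helper that fills in the template for posting. The people data, URIs, and function names are assumptions for illustration.

```python
from typing import Any, Dict, List


def collection_document(href: str, people: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build a Collection+JSON style document for a list of people."""
    items = [
        {
            "href": f"{href}/{person['id']}",
            "data": [
                {"name": "full-name", "value": person["full_name"], "prompt": "Full Name"}
            ],
            "links": [{"rel": "blog", "href": person["blog"], "prompt": "Blog"}],
        }
        for person in people
    ]
    return {
        "collection": {
            "version": "1.0",
            "href": href,
            "links": [{"rel": "feed", "href": f"{href}/rss"}],
            "items": items,
            "queries": [
                {
                    "rel": "search",
                    "href": f"{href}/search",
                    "prompt": "Search",
                    "data": [{"name": "search", "value": ""}],
                }
            ],
            # Empty values: the client fills these in and posts the result back.
            "template": {
                "data": [{"name": "full-name", "value": "", "prompt": "Full Name"}]
            },
        }
    }


def fill_template(template: Dict[str, Any], values: Dict[str, str]) -> Dict[str, Any]:
    """Copy the template, replacing each empty value with the supplied one."""
    return {
        "template": {
            "data": [
                {**field, "value": values.get(field["name"], "")}
                for field in template["data"]
            ]
        }
    }
```

Even for a single item with one field, the document carries noticeably more structure than the HAL equivalent, which is the verbosity trade-off discussed above.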
Summary
So when designing an API where you want to include Hypermedia to create these better versions of APIs that can be more self-describing, and help your users build systems based on your APIs in a more simple way, you can really leverage this information from HATEOAS to create better APIs. There's a lot going on in the thinking around Hypermedia and HATEOAS right now, and so these ideas are emerging. Tying yourself too much to one concept in Hypermedia or another concept in Hypermedia may make you a bit of a star in your company today, but it might end up biting you later when the thinking around HATEOAS might change. Although APIs are going to be longer lived and are going to develop, I'm not sure I would hang my hat on making sure everything adhered to a strict interpretation of what Hypermedia or HATEOAS is. If you're going to do Hypermedia, I really like HAL as being the middle ground of being this brief or lean version of the Hypermedia-driven language, but it's small and consistent. I really think that HAL is the right choice when you're doing HATEOAS today. But at the end of the day when it comes down to it, if your API is an internal one or you have a small number of users that maybe can deal with reading documentation or even you already have documentation written, I find that using Hypermedia sometimes may be more about the ceremony of making sure that I have this fully REST-ful API than purely useful, and I really want to focus on what is pragmatic when I build and design an API, not just what people say about whether it is valid, or good, or true; I'm trying to stay away from what the community thinks about my API, as long as developers are willing and able to use those APIs. 
Now in many cases, using Hypermedia to decorate the result of my APIs can make them easier to work with, and developers are going to be happy about this, but I would limit the use of Hypermedia to the things that are truly useful; things like paging, inserts and deletes, or associations are really sweet spots there, but if you start thinking that every result should return a full list of every operation that could be done on some entity or item that you're returning, you're probably working way too hard to make this happen. Thanks for joining me in this Web API Design course. My name is Shawn Wildermuth of Wilder Minds.
Course author
Shawn Wildermuth
Shawn Wildermuth has been tinkering with computers and software since he
got a Vic-20 back in the early '80s. As a Microsoft MVP since 2002, he's
also involved with Microsoft as an ASP.NET...
Course info
Level: Intermediate
Duration: 2h 17m
Released: 19 Jun 2013