by Troy Hunt and Aaron Powell
In this course, you’ll learn how to minimize the security risks that are present when working with Single-page Applications.
Table of contents
Managing Auth Tokens
Introduction to Tokens
Alright, so I know you're going to do a lot of demos, and I'm going to ask you a lot of questions as we go along, and then probably a little bit vice-versa as we get further through. Can you show us what you mean by this implementation of OAuth tokens, and then we'll have a look at how we're doing things like storing these tokens? Yeah, so I've got a really simple demo here. It's actually using an open source identity server, and it's got some basic interactions between some APIs and how you would log in and stuff like that. So I'll just jump over to my browser, and I'm just going to log into this simple application that we've got. We'll type in username. So this is actually IdentityServer, the product IdentityServer? Yeah, so what we've done is we started off at port 5003, and we've bounced across to port 5000, which is our IdentityServer, the thing that's acting as our intermediary for doing our OAuth and OpenID Connect, and this could be like an Azure B2C, or it could be Google OAuth connections, and you can see here I can do external login if I wanted, but I'm just going to stick with the one that's built into IdentityServer. So what you're really emulating here is that you're logging onto a totally different service which is doing your authentication, and then we're going to hop back to the original one and obviously persist some sort of authentication state. Yeah, so we'll get back a bunch of tokens that represent my allowance to access the stuff that we've got secured. Okay, cool, so you can see that now we're logged in, you can see I've got a bit of information about the person that I've logged in with, that's just really basic stuff, but if I hit this Call API button, obviously I get back that stuff about that current user. So this is just some claims information, so like I said, it's a really simple demo that we've got here. 
But I've got a token back, right, so I probably want to do something with that token so that if I was to reload the page I don't lose everything. Now, for my sake as well, and I honestly have not seen this demo before now, when you hit Call API you're effectively emulating what would happen if I had a web application and I needed to call an API, which probably wouldn't normally return this stuff, you'd return a list of products or something like that. Yeah, you'd normally be pulling something a bit more useful than just a bunch of claims flags for what I can and can't do on the website, but yeah, it is doing something. So let's open up the dev tools and have a look a bit at what it's doing. So we'll just come over to the Network tab and hit that Call API again. And you'll see that we've done a couple of requests because it's an OAuth-based call. We've had to do an OPTIONS request just to make sure that we do have permission to access this, and that's just checking the headers that we're sending and things like that, but really the meat of it is going to be in this second request, which is actually going to get back the data. You can see there's the response that we got back that's on the screen. Now just to double-check these ports again. So we're on port 5003, which is effectively our web application. Yeah. And you have made a call to port 5001, which is sort of emulating, in this case, the authentication environment, which would be the separate website. Yeah, so 5001 is the backend for our system. So if we think of this in a single-page application, or maybe a microservices architecture, you've got APIs that are sitting on different servers, they might be subdomains, or something like that, but in this case I'm just using different ports because I only have one localhost, which is my machine. And the really important part of this request, from a security standpoint at least, is this authorization header that we've got down here. 
It says Bearer, and then a whole bunch of garbled stuff, which is the token that represents my ability to access the API that we're calling. This is what I got back from my IdentityServer. So this is effectively bearer-token-based authorization. Yeah, exactly. So that token is going to be fairly useful. Like if I reload the page, you'll see that I'm still logged in. It's obviously done something to persist that. There's a bunch of ways that we can do that, though.
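To make the Authorization header from the demo concrete, here is a minimal sketch of how a client might attach a bearer token to an API call. The helper name `buildAuthHeaders` is hypothetical and not taken from the demo; only the `Bearer` scheme and the port 5001 endpoint come from the discussion above.

```typescript
// Hypothetical helper: builds the headers for a call to a protected API.
// The token is whatever access token the identity provider handed back.
function buildAuthHeaders(accessToken: string): Record<string, string> {
  if (accessToken.trim() === "") {
    throw new Error("No access token available; the user needs to log in first.");
  }
  return {
    Accept: "application/json",
    // The scheme name "Bearer" is followed by a single space and the raw token.
    Authorization: `Bearer ${accessToken}`,
  };
}

// Usage in a browser (the URL mirrors the demo's API on port 5001):
// fetch("http://localhost:5001/identity", { headers: buildAuthHeaders(token) });
```

Because the header is added by script, the token has to be readable by script, which is the trade-off the later storage discussion keeps returning to.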
Using the OIDC Client Library to Manage Tokens
Session Storage Persistence
Okay, so tell us a little bit then about the persistence of Session Storage. I mean, how does it differ to a cookie in terms of how long it lasts for? I assume it does not get sent automatically on every request like a cookie. No, so with Session Storage, it is entirely owned by that particular browser window or tab for the lifetime of that. So as soon as I close that tab in an in-private session, which I'm running here, it will kill the Session Storage. If I was running not in private, it would be for the lifetime of that window, so until I close the last running instance of Chrome or Edge or Firefox. So it's a little bit like a cookie expiring at the end of a session? Yeah, it's a little bit like a cookie expiring, but we can't control the time that it might expire. That's probably the only downside that we get with this versus a cookie. So if you wanted to have, I mean, I know every time I log into my bank, I log into my bank, I do some stuff, I go and get a cup of coffee, I come back, I'm logged out already, because very often that cookie just expires after a very, very short period of inactivity. So you're saying here that there's no native construct to expire the Session Storage, but you would implement that in other ways. Yeah, you would implement that in other ways. Like I said, Session Storage will terminate once you've closed all browser tabs and all windows, or you then have to build something really custom that sits over the top of that. You sort of timestamp against the thing that you've put in Session Storage, you check is that timestamp less than that current timestamp, stuff like that. So that's obviously a really nice way that you can persist something for that session. But if you close this tab, you have another problem. We mentioned before we're talking about HttpOnly cookies, and we said we flag HttpOnly cookies because we want to be, or we flag cookies as HttpOnly because we want to be resilient to cross-site scripting. 
If I can get cross-site scripting on your page, can I access Session Storage? Yeah, it is definitely a risk that you run with this kind of a storage model, is that this is just a global object that you've got access to, and any script that is running in context of your browser is also able to access Session Storage, so it does have a bit of a security risk that you've opened yourself up to when you're doing that. So I think this is interesting, because we're going to keep coming back to these points where there's trade-offs. So on the upside, it doesn't get sent with every request like a cookie, on the downside, we can't protect it from client script like we can a cookie, but you kind of don't want to anyway because you've got to access it from client script so that you can attach that to the bearer token and send it off to the API. Yeah. It's all trade-offs. Yeah, so it's what's going to be the least risk for the kind of thing that you're doing. But, as I said before, the big downside is if you close this you're going to be fully logged out, so if I was to come back I'd have to go through the login again, and that might be valuable for your banking scenario.
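The "timestamp the thing you put in Session Storage" idea mentioned above could be sketched roughly as follows. This is an illustration under stated assumptions rather than the demo's code: the key name `auth_token`, the `KeyValueStore` interface, and the `MemoryStore` stand-in are all hypothetical. In a browser, `window.sessionStorage` could be passed wherever a `KeyValueStore` is expected.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

interface StoredToken {
  token: string;
  storedAt: number; // milliseconds since the epoch
}

const TOKEN_KEY = "auth_token"; // hypothetical key name

// Store the token together with the time it was saved.
function saveToken(store: KeyValueStore, token: string, now: number = Date.now()): void {
  const entry: StoredToken = { token, storedAt: now };
  store.setItem(TOKEN_KEY, JSON.stringify(entry));
}

// Returns the token, or null if it is missing or older than maxAgeMs.
function loadToken(store: KeyValueStore, maxAgeMs: number, now: number = Date.now()): string | null {
  const raw = store.getItem(TOKEN_KEY);
  if (raw === null) return null;
  const entry = JSON.parse(raw) as StoredToken;
  if (now - entry.storedAt > maxAgeMs) {
    store.removeItem(TOKEN_KEY); // expired: remove it so it cannot be reused
    return null;
  }
  return entry.token;
}

// In-memory stand-in for sessionStorage, used here for illustration only.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null { return this.items.get(key) ?? null; }
  setItem(key: string, value: string): void { this.items.set(key, value); }
  removeItem(key: string): void { this.items.delete(key); }
}
```

Note that this only adds an expiry check on read. It does nothing about the cross-site scripting exposure discussed above, since any script on the page can still read the stored value.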
Remembering Identities Between Sessions
So let me give you a question, so on a website where you login and it has a little box and it says check this to remember me, how do you remember yourself in a Session Storage sort of model? So, well you actually wouldn't use Session Storage for that, we would use a different storage mechanism, which is called localStorage. So localStorage is very similar to sessionStorage. It's actually the same API, so I can do getItem, setItem, etc., but the difference is that localStorage is there for, well, basically the lifetime of your browser, until someone goes in and uses those deep, dark settings that people never actually go to, where it's clear all my data that I've stored about a web page. So we can put stuff in there, and it will last for a really long period of time. So that's how you can do persistence across logging in, logging out, or, sorry, opening and closing browser windows, not really logging in and logging out. You kind of want to _____ at the end anyway. So that has persistence in a more long-term fashion, a little bit like a cookie, but a cookie would have that persistence by virtue of a long expiration date, but localStorage has no expiration date at all, it's up to you to programmatically decide when you're going to take that out of the storage. Yeah, exactly, and that's something that you've got to obviously think about, you want to make sure that you're not leaving someone permanently signed in when they've hit the sign-out button. Gotcha. Okay, cool. So I think that's a neat interest. So we've got the two constructs, localStorage and, of course, sessionStorage before that. We've got the unload situation. Does that sort of cover us for the auth token bit? Yeah, I think that's probably a good wrap-up of how you can manage an auth token. Obviously, cookies are great if you don't need to access them client-side. If you do need to access them client-side, sessionStorage gives you that auto-expiry, but if you want it longer localStorage is your friend then.
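A rough sketch of the "remember me" choice described above: session-only storage by default, long-lived storage when the box is ticked, and an explicit removal on sign-out because nothing expires the long-lived entry automatically. The function names, the key name, and the `TokenStore` interface are assumptions for illustration. In a browser, `window.sessionStorage` and `window.localStorage` could be passed as the two stores.

```typescript
// Minimal subset of the Web Storage API shared by sessionStorage and localStorage.
interface TokenStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const KEY = "auth_token"; // hypothetical key name

// "Remember me" ticked: persist across browser restarts; otherwise keep for this session only.
function signIn(session: TokenStore, local: TokenStore, token: string, rememberMe: boolean): void {
  (rememberMe ? local : session).setItem(KEY, token);
}

// Check the session store first, then fall back to the long-lived one.
function currentToken(session: TokenStore, local: TokenStore): string | null {
  return session.getItem(KEY) ?? local.getItem(KEY);
}

// The long-lived entry never expires on its own, so sign-out must remove it explicitly.
function signOut(session: TokenStore, local: TokenStore): void {
  session.removeItem(KEY);
  local.removeItem(KEY);
}

// In-memory stand-in so the sketch runs outside a browser.
function memoryStore(): TokenStore {
  const items = new Map<string, string>();
  return {
    getItem: (k) => items.get(k) ?? null,
    setItem: (k, v) => { items.set(k, v); },
    removeItem: (k) => { items.delete(k); },
  };
}
```

Clearing both stores on sign-out is what avoids the "permanently signed in after hitting the sign-out button" situation mentioned in the discussion.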
Caching Strings and Service Workers
Defining Service Workers
Okay, so that was a good start. We have covered off auth tokens, we started with cookies, we've looked at sessionStorage, localStorage, let's move on and talk a bit about caching things in the browser and service workers. So where do you want to start there? Yeah, so service workers are obviously starting to get a lot more popular. They're appearing across all of the browsers these days, you know, Edge, Chrome, Firefox, iOS, and Android, they're all getting service workers so that we can do these sexy, new progressive web applications that everyone is talking about. But they do introduce a really interesting thing around how we manage data, because a bunch of what we're doing is we're storing data so that we can have these offline experiences. So to take one step back, do you want to define service workers? So put this in a context. What are we talking about here? Yeah, so a service worker is essentially something that's going to run in the background of a web application, and it will continue to run even when you don't have a browser open, so it's just continual background processing. It's also used so that you can do things like intercept network requests, and maybe proxy them, so if you're offline we can send back some data that you've previously cached, and that's where we can start finding some interesting challenges when we're looking at it from a security standpoint.
Service Workers and Intercepts
Alright, so this leads us to the caching bit. Alright, cool, so what are you going to show us with service workers? Right, so I've taken a demo that we had in that first module around managing our tokens and things like that, and changed it so that we actually have a service worker running. So all I've done is I've added this serviceworker.js file to my application, and it's just basically going to start up a really simple example of how to use a service worker. So it's going to intercept the fetch event. So this is the event that's happening when you're doing a network call. So you might be calling out to an API, you might just be loading another resource which is inside of your website, an image or another HTML page, but that gives us the ability to catch that request and then do some logic with that. So you see here that I'm tracking anything that's going through to our API backend, which is that port 5001/identity, which is the API we're calling, and if we do that we're going to respond with data from cache and then update that data again so that we have the most recent version of the data in our cache for the next time we call it. So you see what we're doing here is we try and call it, we go, well, stopping you actually doing the network request, I'm going to respond with what you've previously requested, and then I'm going to get the data back. Okay, so security, got a question for you. So you have added an EventListener which is looking for a request to localhost port 5001. I take it the browser security model will not let you add a service worker that listens to requests on some totally random other host name, right? You can actually intercept any request that's going to any other domain, you just get limited stuff that can be done with it. 
So let's say I wanted to intercept any request to Google resources, I want to intercept the Google analytics tracking codes, so I want to put them in cache, and I just don't want to have to serve them out every single time, I want to serve them from cache. I can do that, but I can't touch anything about that. I can't change the way the request would have happened. So putting my evil hat on, because this is a fun bit, so given that you're the one writing the service worker in the scope of your app, but you're saying that there are certain things you can't intercept with other requests, how evil can you be? Oh, you can stop requests to some of the domains happening because you're intercepting anything that's happening over the network from your web application. So it gives you the ability to say hang on a second, I might be detecting something that looks a little bit suspect. I ought to just stop that. That's an interesting use case of service workers. Okay, so I want to drill down on this a little bit, because, again, my black hat bit is now starting to go this is kind of cool. When you say you can intercept requests that might go to another host, so say it's Facebook. If you're writing an app, and you're writing an app in this case on localhost port 5001, what can you do with requests to Facebook? So, I can only intercept requests to Facebook that have been made by my web application. So if I have an Ajax call that's happening from my web application, I can intercept that one, but I can't, if you're on facebook.com, I can't touch those directly. So that's what I was hoping to get to, because what we're saying is that the sandbox that we're playing in is really just those requests that are initiated from the scope of your site. It's not like there's another tab over here and it's open on Facebook and now you can get in there and start messing with the traffic. No, you can't play with anything else that's happening inside of your browser. 
It's only stuff that comes from your application as it is. Alright, okay, good, I'm glad you found that out. You define a scope when you define your service worker, so it could be a part of your application, so from a subfolder down, it could be from the whole domain, it could be on a subdomain, and stuff like that, so you can have different service workers on different subdomains and stuff like that. Okay, cool, but you're ultimately just messing with what is there on your own site anyway. Yeah, so the way that works is if we scroll down a little bit we have this caches global variable or global object that we can work with inside of the browser, and that's similar to the localStorage and sessionStorage in that it's always available. But I can open a cache there, based off a cache key, they're defined up top as a constant, and then add and remove things from that, and here's that fromCache method that we're using up above. So we'll try and find it, if it finds it, it returns it, otherwise, it's going to return a failing promise. So, how does that work? Let's jump over to our browser, and we'll hit Login and go through again. Okay, cool. So we've logged in, and now we'll Call API, and it's actually going to immediately fail. You see here I've got a failing promise no match. That's because I've previously not cached this data. Now if you remember from our service worker, it's going to immediately respond with what's in cache. It's not in cache yet, though. Okay, but there's nothing there. So it's going to return an error, but in the background it's also going to have fetched that. And to be clear, what's going to go in the cache is the response from that request, right, so that when you replay that later on it'll go, hey, we've already got it, here it is. Exactly. So if we come over to our Application tab, we'll see that we've got Cache down here, API-CACHE, and you'll see that that's on the domain and port that we are accessing. So host 5003. 
That's where our application is running, and that's our sandbox, so you can see that's the domain. Right, and I guess to the effect of the discussion earlier on, it's not like you're going to be able to go and access something from a completely different host. No, no, you're not. Then you can see here we have that API that we tried to call, and the response that came back. You can see that we know that it was an application/json response, you see the time that it was cached, and you can see the information here. We can even also see the headers that were sent in that request and the response that came back. Alright, so it's cached the entire thing. And in terms of that time that it was cached, is there any default expiration or? No, again, like the other storage objects it doesn't actually expire, it's kind of as long as your service worker is running and things like that. But based on the time, we can see how stale it is as well. Yeah, we can. We can definitely see how stale something is. And now when we Call API again, it's really fast because we've hit cache, and we've pulled it out, and we've shown that to the user, and then it's actually updated in the background, so that Time Cached will have changed because of the new request that's being done. Okay, so while we're here, I'm looking at these request headers, and of course all of this you're saying is being cached that was part of what went into the cache storage we see here. The authorization header is there, so our bearer token there, so that's the one that it would have previously picked up probably from sessionStorage. Correct, yeah, so that's the one that we have that represents us. Okay, cool. Tell us about Access-Control-Allow-Origin.
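The "respond from cache, then update in the background" behaviour walked through in this section can be sketched as below. This is not the demo's serviceworker.js: the browser's Cache API and `fetch` are replaced by small hypothetical interfaces (`ResponseCache`, `Fetcher`) that deal in plain strings so the logic can run outside a browser, and `handleRequest` and `memoryCache` are illustrative names. Only the `fromCache` name, the failing "no match" promise on a cold cache, and the port 5001 endpoint come from the discussion.

```typescript
// Hypothetical stand-ins for the browser's Cache and fetch, reduced to strings.
interface ResponseCache {
  match(url: string): Promise<string | undefined>;
  put(url: string, body: string): Promise<void>;
}
type Fetcher = (url: string) => Promise<string>;

const API_PREFIX = "http://localhost:5001/identity"; // the demo's API endpoint

// Resolve with the cached body, or reject with "no-match" when nothing is cached yet.
async function fromCache(cache: ResponseCache, url: string): Promise<string> {
  const hit = await cache.match(url);
  if (hit === undefined) throw new Error("no-match");
  return hit;
}

// Fetch a fresh copy over the network and store it for the next call.
async function update(cache: ResponseCache, fetcher: Fetcher, url: string): Promise<void> {
  const body = await fetcher(url);
  await cache.put(url, body);
}

// Mirrors the fetch handler: API calls are answered from cache and refreshed in the background.
// In a real service worker this would roughly correspond to:
//   event.respondWith(fromCache(...)); event.waitUntil(update(...));
function handleRequest(
  cache: ResponseCache,
  fetcher: Fetcher,
  url: string
): { response: Promise<string>; refreshed: Promise<void> } {
  if (!url.startsWith(API_PREFIX)) {
    // Not an API call: pass straight through to the network.
    return { response: fetcher(url), refreshed: Promise.resolve() };
  }
  return { response: fromCache(cache, url), refreshed: update(cache, fetcher, url) };
}

// In-memory cache used for illustration.
function memoryCache(): ResponseCache {
  const entries = new Map<string, string>();
  return {
    match: async (url) => entries.get(url),
    put: async (url, body) => { entries.set(url, body); },
  };
}
```

As in the demo, the first call fails because the cache is empty, and every later call returns the previous response while a newer one is stored. Anything cached this way, including a response obtained with a bearer token, stays in the cache until code removes it.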
CORS and Access Control Allow Origin
Third-party Library Vulnerabilities
Defining the Problem of Third-party Tool Integration
Protecting Yourself from Third-party Vulnerabilities
So that sounds like a really big problem. Question then, how can we go about trying to prevent this? It's a little bit tricky in a couple of ways, because what you're sort of saying is, I want to embed something from a third party, now I think we need to break this into two categories as well, because something like browsealoud is a service. So you pull in their script, there's a bunch of other things that then happen in the background as well. It embeds toolbars, it can do things like text to speech. It's very similar, as well, to something like Disqus. So I run Disqus, the commenting engine on my blog. I have one line of script that says, just go and get me Disqus, and then they do a bunch of stuff in the background, so there's that use case. There's also the use case which I think we need to separate, which is what you mentioned before about CDNs. So, for example, on a site like Have I Been Pwned, I pull in jQuery from Cloudflare CDN, and I reference it very, very explicitly. In fact, let me show you what it looks like. I've got Have I Been Pwned open here, and I use jQuery to do a bunch of very typical sort of things like orchestrate API calls and things like that. If we jump down and have a look at the source code, and I'm going to jump to the end of this page, we can see down here around line 364 I'm pulling in jQuery from Cloudflare, and there's a couple of things I want to point out here. So number one is I'm referencing an explicit version. So this is version 2.2.4 of jQuery, the minified version. Now here is where this becomes different to Disqus and browsealoud. This version will never change. It's always going to be exactly that. If they add features and they release an update later on, it's going to be a new version. They're not going to change the one that's there. That is very different to, say, pulling in Disqus. Let's talk about Disqus because most people are familiar with that. 
That is like here's a script tag, you go away and give me the service, and you can do whatever you want within that service. But, of course, the risk we have here is what we just saw with the coinhive and the browsealoud stuff, which is that anyone who controls the script can do whatever they want. Now we'll come back to that in a moment. The defense that I've now got here is you'll see that I've got an integrity attribute off the end of that, and it starts with sha384, and then we've got a great big hash after that. So this is SRI, Subresource Integrity, and what we're doing here is saying that the sha384 hash of that script, so jQuery version 2.2.4, minified version, it is exactly that hash. And when my browser goes and pulls that script down and it hashes it with sha384, if they don't match, the browser doesn't run it. Right, cool, so that probably wouldn't be ideal for something like Disqus where it's doing something a lot more complex. It's great in this case where you're talking about CDNs. Yeah, and here's the rub, because in this particular situation this works beautifully, this allows you to use a CDN and have confidence in the integrity of the file. It's a lot harder with something like Disqus, but we have another way of dealing with it, which is to use a Content Security Policy, or CSP, because a CSP then allows us to say things like, ah, okay, this website is allowed to embed a script from Disqus, and it can embed images from Disqus, and that is it. And then if someone manages to modify that script and put a request to go in and grab something from coinhive, the CSP hasn't actually whitelisted the coinhive address. Right, okay, well so that helps us so we can kind of save ourselves in both scenarios. Yeah, look, I mean we're not going to drill into SRI and CSP here, but that is the defense against this. 
And if you don't have SRI and CSP and you're pulling stuff in from other locations, you need to seriously reconsider that because we have just seen the perfect storm demo of how it goes wrong.
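As a rough illustration of the two defenses named above, the sketch below computes an SRI value for a script's exact bytes, performs the same compare-the-hash check the browser does conceptually, and assembles a simple CSP header value. It assumes a Node.js environment for the `crypto` module; the function names and the example origin `https://widgets.example` used in the usage note are hypothetical, and a real policy would list whatever origins the embedded service actually needs.

```typescript
import { createHash } from "crypto";

// Compute an SRI value of the form "sha384-<base64 digest>" for a file's exact content.
function sriHash(content: string | Uint8Array): string {
  const digest = createHash("sha384").update(content).digest("base64");
  return `sha384-${digest}`;
}

// Conceptually what the browser does: hash what it downloaded and compare.
// (Simplified: real integrity attributes may list several hashes.)
function matchesIntegrity(content: string | Uint8Array, integrity: string): boolean {
  return sriHash(content) === integrity;
}

// Build a CSP header value that only allows scripts and images from the page's
// own origin plus the explicitly listed origins.
function buildCsp(allowedOrigins: string[]): string {
  const sources = ["'self'", ...allowedOrigins].join(" ");
  return `script-src ${sources}; img-src ${sources}`;
}
```

For example, `buildCsp(["https://widgets.example"])` yields a policy under which a modified third-party script that tries to load code from an unlisted host would be blocked, while `matchesIntegrity` returning false corresponds to the browser refusing to run a CDN file whose bytes have changed.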
NPM Package Manager Vulnerabilities
Site Linting with Sonarwhal
Dealing with Typosquatting (And Sonarwhal Followup)
Working with the OWASP Top Ten
Alright, so there's one last thing I think we should refer to in this whole thing about dependencies, and it's the OWASP Top Ten, and I've got the OWASP Top Ten, and the top ten, of course, is the top ten web application security risks. I've got them open on my machine here, and as we're talking about this, I just remembered that we do have an item in the top ten that speaks specifically about this. And, in fact, if we scroll down, and this is the 2017 edition, it's the latest one at the time of recording, A9 Using Components with Known Vulnerabilities. And I'm going to skip down all the way to where that's described, and this would be something really, really worthwhile looking into for folks that maybe have not thought about this too much before. So the OWASP Top Ten is sort of the canonical resource for the ten most critical web application risks that you should be paying attention to. Yep. They recognize this, they've got it there in their list, and they talk about things like, they talk about retire.js actually just here, which we mentioned before, they talk about dependency check tools, and they're saying you should have a process. And the thing that struck me yesterday with that whole coinhive situation and the external dependencies is that so many organizations just don't have a process for safely embedding things, and that was the issue with the coinhive stuff, and also maintaining their versions, as well, which sort of speaks to the bits that you were just talking about.
Client-side Validation and Controls
Incorporating Client and Server-side Validation
Why You Need Server-side Validation
So let's do something similar to that just to sort of illustrate the point. And I've got this website open, hackyourselffirst.troyhunt.com. This is a website that I've used on a heap of my Pluralsight courses before. You can go to this website and hack it and not go to jail, which is really useful. So it's got all sorts of vulnerabilities, SQL injection, cross-site scripting, all this sort of stuff, as well as an implementation of precisely what we just discussed. And I thought I'd show you what that looks like. So I'm logged in to this site. The site is designed for you to go and rate super cars. You can rate on the ones you like, and then it ranks them. Now, let's go ahead and drill down into this particular car, the GT-R, and we see a green message to the left of the screen. It says you've already voted for the GT-R, so you can't vote again. Now, let's go to the home page again. I'm going to pick another car, we'll pick, say, McLaren down here, there's one McLaren. I've already voted for that, too. I've got to pick one I haven't voted for. So I might go and grab the Pagani. Alright, so I can now vote for this car, but before I vote for it, I'm just going to grab this in another tab and do this, paste it over there like that. So now I've got two tabs open, both on the same car. I'm going to go to the first tab, and I'm going to make a vote, and I'm going to call this, say, Vote 1, we'll give it a comment Vote 1. We'll vote for that guy like that. Yep, fine. Scroll down to the bottom, Vote 1 is in there. Now, it says Thank you for voting, I can't vote again, right? Unless, and I know this sounds really fundamentally basic, but this is the sort of stuff that goes on in web apps. I go to this tab here, which is still showing the button. The vote button is still there. The vote button is still there, Vote 2. Vote, and the Vote button disappears. Scroll down to the bottom, Vote 2 is in there, but if I give this page a refresh, Vote 1 and Vote 2. 
Ah, so it hasn't done that server-side validation. Yeah, exactly right. So what's happened here is we've security trimmed it. So we've said, alright, under this circumstance there should no longer be a Vote button because you've already voted. And you can imagine that the, let's say the less sophisticated developer building this game works to the design. Alright, the spec says vote once, can't vote again. I show this a lot in the workshops I run, and I talk about particularly testers. Like if a tester had a test script, and they literally just followed through the test script, they would go this works fine, but they would miss this entirely. Now, there's another interesting thing that happens here, and I might show you this one too. If I jump over to the leader board, and we'll go and grab something else I might not have voted on. Let's go to the Koenigsegg there. And then I open up my DevTools into my Network tab, and let's have a look at what happens when I vote on this one. So we'll give that a little quick comment up there, Nice, like so. Let's actually scroll down to where we can see the button. We'll vote. Watch that request go through, drill down, and now let's go to the bottom and actually have a look at what got sent in that post request. Now what might you as a slightly curious person do with that? Well, there's obviously a couple of IDs there. What would happen if I was to change the user ID? Can I vote on behalf of someone else? Yeah, so that's where your problem is, you can. And, again, I think people will look at this and they'll go, who would do this, but there are so many examples of this happening. So, again, someone who built this would put it together and go, well, this is the way I've constructed it. It will always work this way because this is the way the website is designed. And they miss the fact that we could always go and grab that request. 
We could go to Postman or Fiddler, or we could wget or curl it and recreate this request, because what's actually happening here is it is taking this data. It's not actually looking at the auth cookie, which is up here, we've got this great big value here, cryptographically signed value which can't be tampered with, which uniquely identifies the user. It is persisted via cookie though, which goes back to our earlier discussion. This is what we want to use to identify people. I mean, we could take this on the server-side, grab that, tie it back to an identity and go that's the person who voted. That is tamperproof. This is not tamperproof. So that's sort of a very traditional API call. Now, here's a good question for you. Is this potentially at risk of a cross-site request forgery attack, a CSRF attack? Well, yeah, you did mention that you could use curl to invoke this before. So, yeah, this is something that definitely you could be having a cross-site request forgery attack vector, because it's not really doing any kind of validation to make sure that the page that created it is the one that's submitting it.
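The two server-side checks this section argues for (reject a second vote, and take the voter's identity from the authenticated session rather than from the request body) might look roughly like this. It is a sketch under assumptions, not the code behind the demo site: `VoteService`, the field names, and the in-memory set are hypothetical, and how `sessionUserId` is derived from the signed auth cookie is left out.

```typescript
interface VoteRequest {
  carId: number;
  comment: string;
  userId?: number; // may be sent by the client, but is never trusted
}

type VoteResult = { ok: true } | { ok: false; reason: string };

class VoteService {
  // Keys of the form "userId:carId" for votes already recorded.
  private votes = new Set<string>();

  // sessionUserId must come from the verified auth cookie or token, not the request body.
  castVote(sessionUserId: number | null, request: VoteRequest): VoteResult {
    if (sessionUserId === null) {
      return { ok: false, reason: "not authenticated" };
    }
    const key = `${sessionUserId}:${request.carId}`;
    // Enforced on the server, regardless of whether the client still shows a Vote button.
    if (this.votes.has(key)) {
      return { ok: false, reason: "already voted" };
    }
    this.votes.add(key);
    return { ok: true };
  }

  hasVoted(userId: number, carId: number): boolean {
    return this.votes.has(`${userId}:${carId}`);
  }
}
```

With this shape, replaying the request from a second tab is rejected, and changing the user ID in the body has no effect because the body's `userId` is ignored. It does not by itself address the cross-site request forgery question raised at the end of the section.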
Using Anti-forgery Tokens for API Calls
Using Anti-forgery Tokens for a Get Request
Released 2 May 2018