Good Parts of JavaScript and the Web
by Douglas Crockford

Programming Style and Your Brain

Two Systems

So good morning. I'm Doug Crockford. This is Three Days of the Good Parts. Unexpectedly, JavaScript has become perhaps the most important programming language in the world, which is surprising because it's also the world's most misunderstood programming language. We'll try to fix some of that this week. So this is the outline for the course. Today is going to be a series of five lectures and I apologize for that. There's a lot of information I need to give you and the only way I know to do that is to say it. So you're going to have to listen to me whining on all day, but tomorrow will be much better. Tomorrow we're going to workshop, we're going to have fun with functions, you're going to be writing programs and it's going to be great. We're going to have a much better time tomorrow. But then on the third day we'll do some more fun with functions, then we'll also look at some very important topics, security, asynchronicity which is something you have to deal a lot with, especially in servers, and the better parts, looking at the future of this language and other languages. So does that sound agreeable? So let's begin with part one, Programming Style and Your Brain. These are two topics which appear to have nothing to do with each other. Programming style is the part of your program that the compiler ignores and some people think that because the compiler doesn't care, you shouldn't care, any style is as good as another, it's just a matter of taste. I'm going to try to persuade you that that is not the case, that some styles are significantly better than others. And then your brain. Your brain is that big wad of meat in your skull that you think with and what possible connection could there be between these two things? Well, it turns out there is a connection, a surprising and important one.
I'm going to be misrepresenting the work of Daniel Kahneman, the Nobel-winning psychologist. Now Nobel does not give a prize for Psychology so they gave him the award for Economics, but he's not an economist, but he found that one of the fundamental assumptions of economics doesn't hold and that is that in any transaction a party can be expected to pursue their own best interest. Kahneman shows this is not the case if the party is a human being because human beings do not think the way economists think we think. In fact, none of us think the way we think we think, and the more you study Kahneman, the more amazing it is that we ever get anything done at all. So Kahneman has a metaphor for human thought based on the interaction of two systems, system two and system one. System two, which we can think of as the head, is the higher-level one; it's analytic, it's algorithmic. It's where we do mathematics and reasoning and logic. It's a really sophisticated machine. It requires a tremendous amount of effort to use and it is very slow. So we leave it turned off as much as possible. It's because that system is so slow that we had to invent computers because system two was not capable of generating all of the numbers that society needed in order to deal with social evolution so we needed to come up with machines to do that processing for us. Then there's system one. System one you can think of as your gut. It is intuitive, it's heuristic, it's associative, it is very, very fast. It requires no effort and in fact, you cannot turn it off. It's on all the time. And the idea that we have these two systems is not new and we all have this intuitive sense that we've got these two systems and sometimes you can even hear them. Sometimes if you're confused about something important, you know, you can hear your head telling you one thing and your gut telling you something else. I mean we've all experienced that. What Kahneman found was how the two systems are connected.
It turns out system two gets its assumptions, its givens, its working set from system one and is not aware of that connection. System two thinks it's getting this stuff from the vault of deep truth, but it's not; it's getting it from system one and system one, because it's an approximate system, sometimes gets things wrong and if you have a logical system with false inputs, you can get false outputs and it turns out we do this all the time.

Visual Processing

Let me give you an analogy from visual processing. Visual processing is the opposite of computer graphics. It's where you take a signal from a camera and analyze all the pixels and try to determine what are all the objects in the scene and how are they all moving relative to each other and the camera? You need to be able to solve this problem in order to walk in the world and it's a really, really difficult problem, but we're all able to solve it and in fact, all the animals are able to solve it and the reason we're able to do that is because we don't solve the general problem because the general problem is hard. Instead we solve special problems, problems which are much easier to solve, but which sometimes give the wrong answer and we have techniques for revealing the ways in which we get this wrong and we call them optical illusions. So this is an illusion designed by Edward Adelson at MIT and here we've got a checkerboard with white squares and black squares and two of the squares are labeled A and B and it turns out those two squares are exactly the same color. Some of you may be seeing that, you know, you may believe that I'm telling you that, but you're looking at that and one is clearly white and one is clearly black, but in fact they are both that color. They're both that shade of neutral gray and in fact, if I overlay my square on top of the picture, if A and B were different colors, then you should see some break in continuity, right? You should see some edge forming and you don't.
Now many of you may be seeing a gradient. Who is seeing a gradient here? Seeing the colors going from black and white? Come on, yes, yes? Your brain is lying to you. There's not a gradient. It is that color; it is a solid-colored square. You know the truth of this image and yet you still can't see it correctly. It's because there's another part of your brain that Kahneman doesn't talk about which deals with inconsistency, which is a really important thing to have because if a computer system becomes inconsistent, there's a really good chance it's going to fail, right? It's just, you know, boom, and biological systems can't afford to fail when they get confused. So instead we've got this system which will reject information which is coming from the world through our senses if it contradicts our internal state and that's why you cannot see this correctly, because it is altering what you're seeing with your eyes in order to conform with your patterns of image processing. And it turns out that we do this all the time; it's not just in the visual system, it's in everything and that's why we argue so much because we literally cannot hear what the person is saying. If the other person is saying something which contradicts us, then we will misinterpret what they're saying. It's not intentional, it's just part of what we're doing in order to resist inconsistency and this is the mechanism. Oh, so it turns out none of this was news to the advertising industry. They had figured out a long time ago that they could design messages which would bypass the head that would go straight to the gut, convincing us that we have to buy things that we don't need and they have been doing this for a long, long time. What Kahneman does is provide the theory which explains why that works and I don't think anybody understood this better than the tobacco industry because you look at tobacco and well okay, so how do you sell tobacco? What does tobacco do? 
It makes you smell bad, it turns your teeth yellow, it makes you sick, and then it kills you. So how do you convince people, yes, let's buy some tobacco? And they do it by creating messages which are able to confuse system one. System one is easily confused about the difference between slow death and good for you. So that is the equipment that we have for writing computer programs, and programs are the most complicated things that humans make. There's nothing else in human experience which is composed of as many tiny little pieces which all have to go together perfectly that have to work in real time with changing states and changing inputs in a dynamic situation. Nothing else comes anywhere close to the complexity that we have in software and it's amazing that we're able to manage this, but it's really hard. So when the first von Neumann machines started coming online and it was discovered that programming is really, really difficult, you know there was a lot of thinking that we need to figure out a way in order to make it easier and in fact, to make it unnecessary because otherwise the computers are not much use to us. And so that started research in artificial intelligence and the goal of artificial intelligence originally was to figure out a way to have the computers write their own programs because it was just too hard to have humans writing them. AI has had a lot of amazing successes, but it has failed completely at its original mission. We cannot give a problem to a computer and ask it to write the program that solves that problem. If we could, then we would ask the computer, well write a program that's better than you and keep doing that until they become our overlords and that hasn't happened. Computers don't know how to write programs so we write them. So as a result of the failure of AI, we are still writing programs by hand.
So the most important tool we have for doing programming is the programming language because one thing that computers are really good at is translating one formal language into another formal language and that's what programming languages do. They allow us to write at increasingly higher levels of abstraction where we get more leverage, where we can create more work with less effort and then the computer, through the programming language, figures out how to convert it into a form that the computer can execute. And so programming languages are really, really important. The thing that makes programming so extremely difficult is the requirement for perfection. Programs have to be absolutely perfect in every aspect, in every detail, for all possible inputs, for all possible states, for all time. And the contract with the computer is that if a program is not perfect in all of those ways, the computer has license to do the worst possible thing at the worst possible time and it is not the computer's fault. Whose fault is it? It's your fault! Why did you give us a program which you knew wasn't perfect? So you'd think, well okay, because of that requirement for perfection we would never release a program to production until we know it's perfect and we don't do that for two really good reasons. The first one is we would never release a program to production and eventually management would figure, well why are we paying these guys, they never finish, let's figure out something else, but the second reason is much more insidious and that is we have no test for perfection. It may be that someone once completed a complex program that actually was perfect. There is no way to know. There is no test for perfection. We have tests for imperfection, but no tests for perfect, so there's just no way to tell.
Now it's most likely that none of our programs are perfect and so what we do is we put our programs into production, hoping that we'll discover what's wrong with it and fix it before anyone else finds out, which is crazy, but that is the state of the art; that is the best way we have figured out how to do. We're doing this work with the brains of hunters and gatherers and this is not a metaphor. Our brains have not evolved since the last ice age and there was nothing in that experience to have prepared us for computer programming. You know, hunting and gathering required a completely different set of skills so it's kind of amazing that we're able to do this at all; it's sort of lucky, but there's nothing in our evolution which has prepared us for this. So we're making use of everything we have in order to do this. So programming obviously is making use of the head, of system two, because that's where the algorithms live, that's where we do the analytical stuff, that's where we do the logic and the state and the dynamic stuff, all of that stuff, but we don't understand how we do it. You have all figured it out, but not well enough that you can write down a list of instructions: this is how you write a computer program, you do these things, and hand it to someone else and they can read it and follow those steps and then they can write programs. We can't do that; somehow we have all figured it out, but not well enough that we could tell someone else how to do it and that's why we can't tell the machines how to do it because if we could tell another human how to do it then we could tell a machine how to do it and we don't know. And much of what we're doing involves intuition because we're always making tradeoffs. 
It's never clear what the best approach to solving a problem is going to be and usually we're trying to solve a problem with incomplete knowledge of what the solution is going to be and so we have to guess and system two is not good at guessing; system one loves to guess. System one will guess without knowing it's guessing, it's just great at that and you know, so we look at a problem and we'll look at top-down and bottom-up and take a macro view and a micro view. We keep constantly shifting our point of view until eventually a program emerges and we don't know how we do that. So you know, system two is obviously involved, but I think system one is involved there as well. That's part of the reason why we can't document it. Now I have no evidence to support that argument, but my gut tells me it's true so I believe it.

JavaScript: Good Parts and Bad Parts

That brings us to JavaScript, the world's most misunderstood programming language. JavaScript has some of the best parts ever put into any programming language. It's extraordinary how good the good parts of this language are and we're going to spend a lot of time exploring that. JavaScript also famously has more bad parts than any other programming language and in the next hour we'll talk about why that happened, but it does. Now it turns out every programming language has good parts and bad parts, but very few languages have as much goodness as JavaScript or as much badness as JavaScript. JavaScript has everything from the sublime to the completely ridiculous and the analysis I'm about to do you could do and should do on every language, but it's especially telling with JavaScript. So because JavaScript has so many bad parts in it, I wrote a program called JSLint.
It's written in JavaScript, it reads JavaScript programs, and it warns you, you're messing with the bad parts and by cleaning that up and sticking to the good parts, you end up with programs which are substantially better, which are going to be more maintainable, closer to perfect, that's all good stuff and it's free and everybody should be using it. All of your code should pass JSLint without exceptions. You really want your code to be that good, but a lot of people don't want to do that because it comes with this warning, JSLint will hurt your feelings, and it's true and I hear from people constantly, crying and whining, oh, JSLint, oh, can you fix it so it doesn't complain about me, because you know, I'm awesome. And after a few years I started wondering, well, wait a minute; there's no crying in programming. You know, we imagine ourselves to be the most rational computers in the world, or the most rational people in the world, right, because we're the ambassadors to the computer, you can't BS computers, and we imagine ourselves to be totally, completely logical about everything, superior to all ordinary mortals, but it turns out we get really emotional about things, about how we write our programs. You know, questions like, tabs or spaces? It turns out there's not a lot of data to support one versus the other, but we get really emotional about it or do curly braces go on the left or on the right? You know, you can toss that question into a room of programmers and all work is going to stop and they're going to have these really emotional arguments because it turns out there is no data at all to support either argument. There's no data which shows you get better productivity or lower error rates, any important difference, but that doesn't matter. The arguments will be there and they'll be intense and they won't stop and for this I blame Ken Thompson. Ken Thompson is one of the most important programmers to have ever lived; we'll talk more about him later. 
He designed a language called B, which inspired the language called C, which inspired virtually all the languages that have come since then. Really, really important contributions and in those languages you could put the curly brace on the left or on the right. Thompson put them on the right, it just seemed right to him and his partner Ritchie also put them on the right, but there were other people in their lab who wanted to put them on the left and I'm sure they had a meeting and it probably went on for days and at some point Thompson said, leave me alone, I don't care, just I don't care, leave me alone, which is a shame because he could have said, I'm fixing my compiler so that you have to put them on the right and if you put them on the left it's going to be a syntax error. If he had done that, who knows how many man-centuries of time we could have saved, but he didn't so we are still arguing about it and it's a really emotional argument. So there are some things that we can agree on. Everyone will agree that you should be consistent. You should always put them on the left or you should always put them on the right, you shouldn't mix it up because it looks stupid and we're trying really hard not to intentionally look stupid so you don't want to do that. And everyone will agree that however they do it, everyone else should do it that way, too. That's easy, but we can't agree on what that way is. You know, so if you have someone who is used to putting them on the left and he goes to work for a shop that puts them on the right and they're going to say, well, that's how we do it, you've got to put them on the right, he's going to start to cry. He'll go no, and he's going to start trying to come up with reasons for why this is wrong and the gut is saying, this is wrong, this is wrong, and system two is going, yes, I know, that's wrong, why is that? And it starts rationalizing.
It's trying to come up with reasons to support its deeply held belief, but there are no reasons and so you hear yourself saying things which are completely ridiculous and that just makes you madder and the madder you get, the more, you know, and it's just this, and it ultimately doesn't make any sense. You know, it's sort of like driving. In some countries they drive on the left and in some countries they drive on the right and there's no data to support one versus the other. You know, you don't get better fuel efficiency or lower accident rates on one side or the other, but wherever you are, it's a really good idea that everybody be driving on the same side. And this is kind of like that. We know that there should be one answer, but coming up with the one answer is really hard and leaving it to chance or leaving it to personal taste is not a good predictor of any kind of outcome and so we keep wasting time on this. So ultimately I know what the correct solution to this is, except in JavaScript and in JavaScript I'm absolutely convinced you should always put them on the right and never on the left and here is why. So one of the really nice things about JavaScript is we can have a function that returns an object; sometimes you call those constructors and JavaScript provides this really nice thing called an object literal where we can return a new object and that's very, very nice. And if you put the curly brace on the right, it does exactly what you would expect, but if you put the curly brace on the left, you get a silent error. These are the worst kinds of errors. You don't get an error at compile time, you don't get an error at runtime. What'll happen is your function instead of returning a new object will return the undefined value. So it doesn't fail here. It's going to fail somewhere downstream from here where something will go to operate on an object and it's not there and boom, the program blows up. 
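As a minimal sketch of the two forms being contrasted here (the function names makeRight and makeLeft and the property ok: false are illustrative assumptions, not taken from the slide), the difference looks like this:

```javascript
// Brace on the right: the object literal starts on the same line as `return`.
function makeRight() {
    return {
        ok: false
    };
}

// Brace on the left: a semicolon is automatically inserted after `return`,
// so the function returns before the braces are ever considered.
function makeLeft() {
    return
    {
        ok: false
    };
}

console.log(makeRight()); // { ok: false }
console.log(makeLeft());  // undefined
```

Neither function produces an error when it is defined or called; the second one simply hands back undefined, and the failure shows up later wherever the missing object is used.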
And so you're going to have to debug this thing and you may eventually trace it back to this line and that looks right, where did my object go? And you can easily waste a lot of time on this. So let's look at what happens, let's zoom in on this. So as we're going to find out in the next hour, JavaScript was designed as a language for beginners; the original goal was to make a beginner's programming language, but it also looks like C and C was not designed for beginners. C's syntax is really fussy. It's way too complicated, especially if you're going to give it to people who haven't had much training, so how do you deal with that? I think the correct answer would be, design a new syntax, which is simple and clear. He didn't do that. Instead there was a feature called automatic semicolon insertion, which would attempt to insert semicolons in places where the beginners wouldn't know that they had to go and unfortunately, it sometimes doesn't put them in places where you think it should and it sometimes puts them in places where you think it shouldn't and this is one of those times. So automatic semicolon insertion will put a semicolon after that return. Now we've got all this other stuff. You would hope that something is going to snag a syntax error so we can get some warning. For example, we've got the curly brace, which opens up the object literal, except when you move that curly brace into a statement position, it becomes a block. So there's this syntactic ambiguity. Now JavaScript doesn't have block scope so there's no value in having naked, top-level blocks, but it does have them, so that masks this error. Then we've got ok colon. ok: doesn't look like a statement, but it does look like a label. Now the C language had a goto statement in it which meant that you could transfer control from one statement to any other statement and we have since determined that goto is harmful so we don't use it anymore, it does not exist in JavaScript.
Hooray for JavaScript, it's better than C in that respect, but JavaScript syntax still allows every statement to have a label, even though there are only four statements that make sense having labels because they make use of labeled break and that is not one of them. So that's bad. Then we've got a false statement. A false statement doesn't make any sense, but there's another thing we inherit from C which is a useless expression statement: we can take any expression, put it in a statement position, put a semicolon on the end of it and it's now a statement. Now in my view there are only two expressions that make sense in statement position, assignment and invocation. Everything else I think should be a syntax error, but it's not. So this is allowed and the compiler will look at it and go, false? Yes, you're still a constant? Good, and it will just go on, but then if it's a statement it needs a semicolon. In our problem we've got semicolon insertion so it comes in and inserts that. Now there's a semicolon at the end of the block. Blocks do not end with semicolons, but there's another feature that we inherit from C, which is the empty statement, so anywhere you want you can have extra semicolons and they all get soaked up. Then all of this is unreachable code; you can't possibly execute something that comes after a disruption, but JavaScript is okay with that. It allows anything you want after a return statement without any warning. So if you write code that looks like that, the compiler thinks you're writing code that looks like that. You might be thinking this is the stupidest thing I've ever seen and you're probably not wrong, but that's how it exists. That's what this silly language does. So if you put your curly brace on the right, you will never experience this and if you put your curly braces on the left, this is waiting for you. So in thinking about programming style, let's look at the tradeoffs here.
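The step-by-step reading described above can be written out explicitly. This is a sketch under the assumption that the code on screen is a return followed on the next line by a brace, ok: false, and a closing brace with a semicolon; the function name is made up for illustration. Each comment marks one of the features that lets the mistake pass silently:

```javascript
function makeLeftAsParsed() {
    return;      // automatic semicolon insertion ends the statement here
    {            // in statement position this brace opens a block, not an object literal
        ok:      // a label, which may be attached to any statement
        false;   // a useless expression statement; its semicolon is also inserted
    }
    ;            // the semicolon after the block is an empty statement
}                // everything after `return;` is unreachable, and that is allowed

console.log(makeLeftAsParsed()); // undefined
```

Written this way the function is syntactically valid at every step, which is why nothing flags it.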
So what is the cost of putting a curly brace on one side versus the other? Zero, right? There's no cost, it's just punctuation, there's no cost. What's the benefit? The benefit is you will never waste time on this bug. This bug will never exist in your program. Is that a good benefit? If you can get it for free? Yes, absolutely! So that's why in JavaScript I recommend always put your curly braces on the right, never on the left. You should prefer forms that are error resistant. You should be coding in a way which makes it harder to form new errors.

Programming Style

The switch statement is another of Thompson's things. It was modeled after or inspired by Tony Hoare's case statement which was this brilliant idea that we will have several independent clauses and we will pick one of them based on the value of an expression. A brilliant idea, but Thompson reinterprets it in the form of a computed goto, which is another thing that was in Fortran, which we have since decided is a bad idea, except it survives in almost all modern languages in the form of a switch statement and the hazard is the fall-through hazard that in any case if you do not explicitly disrupt, it will automatically fall into the next case. And early on when I was developing JSLint someone wrote to me and recommended that JSLint give warnings on this because it's really difficult to look at the code and observe that one case is falling through into another because the syntax is designed to obscure that fact.
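To make the fall-through hazard concrete, here is a small sketch. The function name describe, the case labels, and the messages are all hypothetical; the point is only that a missing break is hard to spot by eye:

```javascript
function describe(kind) {
    const notes = [];
    switch (kind) {
    case "warning":
        notes.push("log it");
        // no break here, so control falls through into the next case
    case "error":
        notes.push("alert someone");
        break;
    default:
        notes.push("ignore");
    }
    return notes;
}

console.log(describe("warning")); // [ 'log it', 'alert someone' ]
console.log(describe("error"));   // [ 'alert someone' ]
console.log(describe("other"));   // [ 'ignore' ]
```

Whether the "warning" case was meant to alert someone cannot be determined from the code alone, which is the difficulty: intentional and accidental fall-through look identical.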
And I thought about it deeply and I wrote back to them and I said, I can understand that hazard, but there's this wonderful elegance that you can get if you can cause one case to fall into the next one and you know, that's a valuable thing, and the error can happen, but it hardly ever happens and so given you've got this good thing that can happen versus this unlikely bad thing, I'm not going to test for that, I'm not going to give warnings on it and I think this is actually a good part of the language. The next day the same guy wrote to me and said, I found a bug in JSLint. So I said, okay, good, thank you. I opened the debugger and it turns out, you know where this is going, right? I had a case that was falling through and in that moment I achieved enlightenment because it turns out we spend most of our time making and fixing our own bugs. You know, we think, what I do today is power typing, it's typing as if we didn't know. We spend most of our time saying, what have I done? And then we find it and get this little rush of euphoria, oh I did that again. Boom, and we forget. And so we have all this lost time. We forget about how much time we spend chasing bugs and we fail to learn from our mistakes. But on this particular occasion it was so humiliating because I had just given a speech about how this was a good part and then I'm shown by the same guy a bug in my own code that was caused by that thing that I was defending and because it was so humiliating, I could not avoid learning the lesson, which was if I never intentionally fall through I can find the cases where I accidentally fall through and that turns out to be much more valuable and it caused me to reexamine my analysis and it turned out I was completely wrong in every aspect of it. I thought I was being so logical and measuring the tradeoffs, but I wasn't. You know where I was arguing in favor of that wonderful elegance?
No, it turns out that's a really bad thing because it causes coupling of these clauses which should be independent and that means that the code is now really brittle, because if you need to make a change to one clause, that's going to affect other clauses, which can now be spilling into it so you're making the program harder to maintain. You're making it more likely to have bugs introduced to it as a result of its going through simple revisions because, you know, what we should always be doing is trying to uncouple things, right? But this causes an implicit coupling, which is bad, but even worse was when I said, that hardly ever happens. That's another way of saying it happens, right? And we don't want it to ever happen. It's not like we don't want it to happen very often; we want it never to happen because we want our programs to be perfect. We can't tolerate it ever going wrong. This is system one talking. System one is terrible at math. System one gives most the same weight as all; system one thinks zero and not very many are the same. They're not. Any time we're making mathematical arguments without any data, that's probably system one talking and there's a really good chance we're wrong because it's guessing; it doesn't know. And we do this all the time. So a good style can help produce better programs. Style should not be about personal preference or self-expression or taste; it should be about reducing your error rate. Every decision we make about how are we going to code this thing? How are we going to express this thing? It should be about how are we best going to make a program which is going to be perfect because ultimately that's the goal. Now we can get some clues about programming style from literary style. The Romans wrote Latin all in uppercase with no word breaks or punctuation and it worked for them.
You can go to Rome and you can still see these letters, we call them Roman letters today because that's what they used, you know, engraved in stone, it looks just like this except they were writing in Latin, but it's the same letters. And it worked for them. They had a very powerful civilization that took over a large part of the world and held it for centuries, although there are ambiguities in this. For example, I can read the third line as, now or dB reaks. It's a possible reading. So those sorts of ambiguities can lead to errors. But it worked for them for a long time until Constantine established Christianity as the official religion of the Roman Empire. At this point it becomes necessary to make copies of these documents and distribute them all over the world and the problem is they don't have originals for any of those documents. All they have are copies of copies of copies of copies and it turns out no two copies agree, that every time they are copied they mutate. And that's a problem if you're trying to establish an institution that derives its authority from the word and nobody can know for certain what the word was. So medieval copyists introduced lowercase, word breaks, and punctuation into their manuscripts and these innovations helped reduce their error rate and made it easier to copy a manuscript without introducing errors into it and that was a really important innovation. It also unexpectedly made the manuscripts easier to read. So when Gutenberg begins printing with movable type, he copies those conventions and we're still using them. We now have centuries of experience with literary style using lowercase, word breaks, and punctuation in a particular way. We have all been trained since we were children to read and write in that style and that turns out to be really valuable: you can easily tell the difference between good writing and bad writing by, does it conform to this or not? So good use of style can help reduce the occurrence of errors.
One of my favorite style manuals in English is The Elements of Style. It's a little pamphlet that was self-published by William Strunk about a hundred years ago. It's since been updated by E.B. White and it needed to be updated because English has continued to evolve since then, but most of Strunk's advice is still really good and a number of computer scientists have used it as a model for writing books on programming style very, very effectively. So programs must communicate clearly to people. We should think of our programs as literary works and it's at least as important that they communicate what they do to people as they do to the machine. It's not good enough to write something that's really sloppy, thinking that well, if I can get it past the compiler, I've done my work, because if the program is ever going to be used more than once, it's going to be necessary for somebody to be able to understand it and make it better. And so we need to make the programs communicate what they do in order to allow us to do that, otherwise we're setting ourselves or others up for failure.

Composition

So we should use elements of good composition wherever applicable and I think we should, for example, use a space after a comma and not before a comma. That's how we do it in literary style. In a programming style we should do the same thing unless we have clear evidence of a benefit that occurs from that difference. Because we already have all this education locked into our brains about those sorts of patterns, we can use that to make it easier for writing our programs. Now programming requires significantly more precision than writing does so we'll need additional rules. For example, I propose that we use spaces to disambiguate parens.
We use parens to indicate parts of statements, to indicate invocations and function definitions and grouping, and so to help us to distinguish those cases, I propose a rule: there is no space between a function name and a left paren and there is always one space between all other names and a left paren. So by that rule, these forms are all wrong. You know, for example, return is not a function in this language so we shouldn't make it look like a function invocation. Now you could argue, well, someone could easily figure out what these mean, right? So it doesn't matter, but it does matter because you want the person who's reading your program to understand what the program does; you don't want them to be mentally correcting and proofreading you because that is distracting them from the much more important work of figuring out what your program does and how do we make it better? So one of the good parts in JavaScript is the immediately invocable function expression and we'll look at these much more later today and unfortunately, the design of JavaScript didn't anticipate that this was a feature of the language. So here we've got a function statement and we want to invoke that function immediately, but we can't, it's a syntax error. Someone figured out that we can wrap the function in parens. That function is now in expression position and now we can immediately invoke it, but then you've got the extra invoking parens hanging off on the end like a pair of dog balls. So we can do better for the reader by wrapping the entire invocation expression in parens. So now the goal is not to just get it past the compiler, but to give a little bit more information to the reader so that when the reader sees a function and the closing parens wrapped in parentheses that means something.
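The slides are not reproduced in this transcript, so here is a minimal sketch of the three forms being described. The variable names `dangling` and `wrapped` are made up for illustration and are not from the talk.

```javascript
// Form 1: a function statement cannot be invoked in place.
// Uncommenting the next line produces a syntax error:
// function () { return "result"; }();

// Form 2: wrapping only the function in parens puts it in expression
// position, so it can be invoked, but the invoking parens hang off the end.
var dangling = (function () {
    return "result";
})();

// Form 3: wrapping the entire invocation in parens. The reader sees that
// the value of interest is the result of the call, not the function.
var wrapped = (function () {
    return "result";
}());

console.log(dangling, wrapped);
```

Both working forms produce the same value; the difference is only in what the layout signals to the reader.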
That means what's important here is not the function, it's the result of the function and just by moving the paren one space over, we give a little bit more information to the reader and that can be helpful. And again, we do it at no cost. So remember earlier I warned you about automatic semicolon insertion and how it sometimes puts one in places where you think it shouldn't and sometimes doesn't put them in places where it should? This is one of those times. So here we've got an expression statement. Someone left out the semicolon, expecting that the system was going to add it, but in this case the system is not going to add it. So instead of assigning y to x, we're going to assign the result of calling y as a function, passing the result of the other function as its argument, which is not how anybody would read this, but that's how the compiler reads it and so I recommend never rely on automatic semicolon insertion, always put the semicolons in the right place. If you're not sure where that is, JSLint will be really happy to tell you where they go. JavaScript has a with statement which was modeled after the Pascal with statement, which was intended to make it easier to write certain kinds of object accesses. Unfortunately, it wasn't designed correctly. So here we've got a with statement and here we have four statements that it expands into. Can anyone guess which of these that will do? Anybody? I think it's the first one. Actually, it's a trick question; it could be any of them. There's no way you can tell by reading the program which of these it's going to do and in fact, it's possible that every time that statement executes, it could be a different one. It depends on o? Yes, it depends on o, and since it can't know what o is at compile time there's no way to read the program.
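The with slide is not in the transcript, so this is a hedged sketch of the ambiguity, assuming the statement on the slide was of the form `with (o) { bing = bang; }`. The names `bing`, `bang`, and `assignWith` are assumptions. The function is built with the `Function` constructor only so that the sketch still runs where strict mode, which rejects `with`, is in force.

```javascript
// assignWith is a hypothetical helper. Its body runs in non-strict mode,
// where the with statement is still legal.
var assignWith = new Function(
    "o",
    "bing",
    "bang",
    "with (o) { bing = bang; } return {o: o, bing: bing};"
);

// The same statement does four different things depending on what o holds.
var both = assignWith({bing: 1, bang: 2}, "outer bing", "outer bang");
// acted as: o.bing = o.bang

var onlyBing = assignWith({bing: 1}, "outer bing", "outer bang");
// acted as: o.bing = bang

var onlyBang = assignWith({bang: 2}, "outer bing", "outer bang");
// acted as: bing = o.bang

var neither = assignWith({}, "outer bing", "outer bang");
// acted as: bing = bang

console.log(both.o.bing, onlyBing.o.bing, onlyBang.bing, neither.bing);
```

Nothing in the text of the statement tells the reader which of the four assignments will happen; only the runtime contents of `o` decide.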
So we want to be confident that our programs are perfect but how can you know that your program is perfect if you can't even read the program and know what it does? So for that reason I recommend don't ever use the with statement. You don't need it, leave it out, you're better off without it. Now there are a lot of very clever uses for the with statement and the people who discover these clever uses think that they should be entitled to use it because it's so useful, but I'm not saying it isn't useful; I'm saying there's never a case where it isn't confusing and confusion is the enemy. Confusion must be avoided. Confusion is when a program appears to do one thing and it actually does something else. Another word for that? Bug. We don't want bugs. So I want to eliminate as many sources of confusion as possible because I'm trying to attain perfection. I may never get there, but you know, that's where I'm going. I want to try and make programs that are perfect. JavaScript has a double equal operator. I have problems with that. I think it should've been called equal, but that's another thing that Thompson did to us, but unfortunately, even worse than what Thompson did, it does type coercion. So it first looks at the two types of its arguments and if they're not the same type, it will convert one or both of them to another type and then compare that result and as a result, you get false positives and it also means you lose transitivity. Transitivity means that if two things are equal to the same thing then they should be equal to each other and that's not the case here. So when JavaScript was developed this was thought to be a good idea. It turns out it's not so the inventor of the language realized he had made a mistake so when it went to be standardized, he went to the standardization committee and said, I screwed up the double equal, but I know the fix for it; we just have it not do type coercion. 
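As a small illustration of the coercion and the lost transitivity just described (these particular comparisons are chosen here for illustration and are not taken from the slides):

```javascript
// Each comparison uses the double equal operator, which converts its
// operands to a common type before comparing.
var emptyEqualsZero = ("" == 0);            // the empty string converts to 0
var zeroEqualsZeroString = (0 == "0");      // the string "0" converts to 0
var emptyEqualsZeroString = ("" == "0");    // two strings, compared as strings

console.log(emptyEqualsZero);        // true
console.log(zeroEqualsZeroString);   // true
console.log(emptyEqualsZeroString);  // false
```

Both `""` and `"0"` compare equal to `0`, yet they do not compare equal to each other, which is the failure of transitivity.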
I've already tested it in the Netscape code base and it actually runs a little faster and it's a little smaller so that's good, you know, this is a big win. So let's do this right and the committee said no, no, we're going to keep it the way it is, but we'll offer you a triple equal as a compromise. So a lot of people don't like triple equal because instead of having to go ==, you now have to go === and it's like yes, triple equal. Always use triple equal. JSLint will tell you that? Yes, JSLint will tell you for sure, yes; JSLint will recommend you use triple equal. Will it fix it? Oh, it won't fix it. You have to fix your own code. JSLint is very happy to tell you about all the terrible things you've done, but it's on you to make it right, as it should be. You know, there are some cases where double equal accidentally does exactly what you want, so, you know, should you use it then? And I recommend, no, don't even use it then because you don't want the person who is reading your code to have to ask, did he find the one case where double equal does exactly what you want, or is he just incompetent? You don't want people asking that question when they're reading your code, right? You want them to understand what it's doing and go, yes, this is well written stuff; I can make this better. So if there's a feature of the language that is sometimes problematic and if it could be replaced by another feature that is more reliable then always use the more reliable feature. This is a surprisingly controversial statement, but I think it's obviously true, right. So this is a feature that was added in ES5, multiline string literals. There are lots of other languages that do this in exactly the same way. I think it was a mistake. First because it breaks indentation, because a continuation has to be over on the margin and indentation is really important for understanding the composition of our programs, right?
Because we've got nested functions, nested objects, and nested blocks. That nesting is critically important and this breaks indentation and it does impair readability and that's a bad thing, but worse than that there is this syntax hazard where this statement is correct, this one is a syntax error, can anybody spot the error in the second statement? Is there a space at the beginning of the second line? No. There's a space after the backslash. Yes, exactly. There's a space right here. Can't you guys see it? Yes, I mean it's obvious once it's pointed out, right? But you know, I'm trying to make programs that are perfect. I want to be able to look at it and go, yes, that is perfect and I can't so I don't want to use this. You know, we spend a lot of our time searching for the needle in the haystack. We can cut that time down if our programs look less like haystacks. So avoid forms that are difficult to distinguish from common errors. This is another one due to Thompson. He allowed you to put assignment expressions in condition position of an if statement. Java got this right. Java says you have to put something that evaluates to a Boolean, but then JavaScript got it wrong and went back to the C way and said you can put any expression in there. So this statement appears to do what this does, but it actually does what this one does and so you don't ever want to write this one. You want to figure out which one you mean and write that instead. Make your programs look like what they do and don't make your reader guess did you get it right or not.

Scope

Scope is one of the most important inventions in programming. We got it from ALGOL 60. Almost all modern languages have block scope. JavaScript doesn't; JavaScript only has function scope, which turns out is not a bad thing, that function scope is sufficient for writing good programs.
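A minimal sketch of what function scope means in practice, assuming a hypothetical function `describe`. A variable declared with var inside a block belongs to the whole function, not to the block.

```javascript
// describe is a made-up name for this sketch.
function describe(flag) {
    if (flag) {
        // This declaration looks local to the block, but var gives it
        // the scope of the whole function.
        var message = "set inside the block";
    }
    // message is still in scope here. If the block did not run, the
    // variable exists but was never assigned, so it is undefined.
    return message;
}

console.log(describe(true));   // "set inside the block"
console.log(describe(false));  // undefined
```

A reader trained on block-scoped languages would expect `message` to be out of scope at the return statement.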
The problem is that most programmers who are writing in JavaScript were trained in Java or C or other C-ish languages and expected to have block scope and the rules about where you declare variables are different in block scope languages versus function scope languages and that confusion can cause errors. And this is because of the crazy way that the var statement works. We'll look more at var this afternoon. So because of this craziness I recommend declare all of your variables at the top of the function because that is where JavaScript actually will declare them and so if you put your declarations there, that increases the likelihood that whoever is reading your program will correctly understand what the program is doing. I find the place that gets people the most confused is in the induction variable of the for statement. They really want to say var i there and I recommend no, don't even do it there, move the var i to the top of the function, and they get really upset and they start screaming and they say, but that's not how you write it in Java and I say, write in the language you're writing in. Java is a different language; they look similar, but it's a different language with a different set of bad parts. In JavaScript you don't want to be doing that because it doesn't do what you imagine it does. In ES6, which was published in June of this year by the Ecma General Assembly, there is now a let statement in JavaScript, which is now starting to find its way into implementations and when that becomes totally ubiquitous, I will recommend stop using var, use let from now on because let will respect block scope, but let will break your code on IE6 because let is a syntax error on IE6 and your code won't ever run there, or on IE7, or on IE8, or on IE9 or IE10 or IE11, but if you only have to run on Edge 2 or whatever the next one is going to be called, yes, let! Otherwise keep using var. Does the current Edge allow let? I don't know. Okay. It might, I don't know.
We'll all be good in 2016 because Microsoft is dropping support for all those IE browsers. And none of your customers will be using those browsers. Yes, that's the problem, so Microsoft said we're not supporting you guys anymore, but that doesn't mean that the world is going to stop using it so we're going to be stuck with IE for a while longer, I'm afraid. So global variables are evil in all languages. They cause coupling of things and accidental collisions and security hazards and all sorts of badness and unfortunately, in the browser the use of global variables is required because there's no kind of linkage mechanism that allows one compilation unit to find another. They just all share a common global space, which is crazy. So because of that I recommend in browser applications, use as few global variables as possible and when you do, name them all in uppercase because you want them to really stand out as something that is dangerous and important. JavaScript has a new prefix that was modeled after the C++ and Java new prefix. I don't recommend using it ever, because one of the hazards with it is that if you forget to put the new prefix on the front of a constructor function, instead of creating a new object, it will clobber global variables that happen to have the same names as the instance variables you're trying to initialize, which is awful and there's no runtime warning, no compile time warning, it's just awful. So because of that we have a convention in JavaScript that constructor functions should be named with InitialCaps and nothing else should ever be named with InitialCaps. This is the only clue we have as to what requires a new prefix and what doesn't. This is something which is allowed in JavaScript, but it doesn't mean what people expect. So this appears to do what this does. It's going to create two variables, a and b, and initialize them both to 0.
What it does instead is it initializes b to 0 and creates a, which will initially also become 0, but this one will not be a locally scoped variable. This one will be a global variable. One of the big design errors in JavaScript is if you do not explicitly declare a variable in a function, JavaScript assumes that you intended for it to be a global variable, which was something that was done intentionally to make it easier for the beginners because often they didn't know what variables were at all, but it makes it much harder for you because if you're not really careful, any of your variables could turn into global variables where they could easily get stepped on. So again, I recommend you figure out which one of these you mean and write that instead. Write in a way that . . . Is it the same when you declare multiple variables in one var, as in var a, b, c? Yes? Do b and c ever become global variables if you do that? No, in this comma form they're okay; the problem is if you use assignment to do that. Alright, okay. So write in a way that clearly communicates your intent. This is another one of Thompson's. So the B language was modeled after BCPL, which was a brilliant little language. BCPL was the first curly brace language and BCPL got its if statement right, that the parens around the condition were optional and the curly braces around the consequence were required. Thompson reversed that. Thompson made the parens around the condition required and the curly braces around the consequence optional because that looked more like Fortran, which was more in the style of the day. So as a result, this statement appears to do what this one does, but actually does what this does, that it's going to call c unconditionally and someone reading this statement could easily think that c is only going to be called conditionally. That's a bug, right? And so because of this, I recommend always put the curly braces in every time, on every if, on every else and every while and every for.
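The if statement on the slide is not reproduced here; the following sketch shows the hazard under the assumption that the slide had a condition `a` followed by calls to `b` and `c`. The function names and the `calls` array are made up so the behavior can be observed.

```javascript
// Without braces, only the first statement belongs to the if.
function withoutBraces(a) {
    var calls = [];
    if (a)
        calls.push("b");
        calls.push("c");    // indented like part of the if, but always runs
    return calls;
}

// With braces, the layout and the behavior agree.
function withBraces(a) {
    var calls = [];
    if (a) {
        calls.push("b");
        calls.push("c");
    }
    return calls;
}

console.log(withoutBraces(false));  // ["c"]
console.log(withBraces(false));     // []
```

The indentation in `withoutBraces` suggests that c is conditional, but the language ignores indentation and runs it every time.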
Every time put the curly braces in; it's just two characters, duh-duh, and it's done and what that does is for very low cost it makes your program much more resilient. It means that if someone else is going to come in and add to your if statement, they're going to have a much greater likelihood of being able to do that without introducing an error into it, which is a really good thing and if you are leaving the curly braces out, you are setting your coworkers up for failure, which is inexcusable and unprofessional. So always put them in. And I hear from people all the time, but you have to go, duh-duh, and it's so hard. It's just two characters, it's just two lousy characters and they make your program so much better. As our processes become more agile, our coding must be more resilient because our programs are never finished, right? They are always going to be in a constant state of improvement and we need to code that way. Code our programs so that they are more easily improved over time.

Bad Style

So this is the last thing I'm going to complain about Thompson and again I have to appreciate Thompson. Thompson gave us Unix, Thompson gave us B, which led to all of our modern languages. Thompson gave us regular expressions. Thompson gave us UTF-8, which I think is one of the smartest things I've ever seen. Thompson has given us some amazing gifts and as professional programmers we all owe a tremendous amount to Thompson. Unfortunately, Thompson had terrible taste in programming languages, in programming language design and even worse than that, he was extremely influential so all of us have been reading Thompson's designs from our first hours of programming and we've been doing it so long it all looks right to us. We cannot see how bad this stuff is and how much it compromises us. So for example, the ++ operator. This was added to B for doing pointer arithmetic and we have since determined that pointer arithmetic is harmful so we don't do it anymore.
Modern languages do not support pointer arithmetic. The last popular language to have pointer arithmetic was C++, a language so bad it was named after this operator, but the operator refused to die. It's still in all of our languages, even though we don't need it to increment pointers anymore. So now we use it to add 1 to variables, which is completely unnecessary, but we do it and unfortunately, it leads to a style of programming in which you try to write one-liners, in which you try to take a whole bunch of stuff and try to smash it down into one line and that leads to really bad code, stuff which is very hard to maintain, very hard to correct. We've seen security errors, buffer overruns, those sorts of things. This operator is always implicated in those sorts of security errors and I've found in my own practice, any time I use ++ anywhere, this thing takes hold of me and I can't control it and it makes me want to take code and try and mush it down to one line and even though I know that's a stupid thing to do, I can't control myself. It's just this thing takes hold of me and I start writing really stupid stuff, thinking I'm being really smart. So eventually I had to stop because I couldn't do it a little bit, I had to stop completely. So I said no more ++. From now on it's += 1 and I can relax and it's easy. I can just write good programs now, you know, food tastes better, it's just everything is great and for a while I thought it was just me, but now I'm recommending everybody += 1 all the time, every time, += 1, it's great and so much better. But I hear from people all the time saying, but I want to be able to write x++ because it means the same thing and instead of having to go uh-uh-uh and again uh, and you know I can't go uh-uh! I don't have that kind of time! Except that the typing time is irrelevant, it is completely irrelevant. We don't spend our time typing, but more than that it's ++x which means about the same thing as x += 1. 
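To illustrate why it is ++x, and not x++, that matches x += 1 when the value of the expression is used, here is a short sketch with made-up variable names:

```javascript
var x = 5;
var fromPost = x++;      // post-increment: yields the old value, then adds 1

var y = 5;
var fromPre = ++y;       // pre-increment: adds 1, then yields the new value

var z = 5;
var fromPlus = (z += 1); // assignment: adds 1 and yields the new value

console.log(fromPost, x);   // 5 6
console.log(fromPre, y);    // 6 6
console.log(fromPlus, z);   // 6 6
```

All three leave the variable at 6; only the post-increment form hands back a value that is off by one from the variable it just changed.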
So any time I see someone writing x++ in the increment position, I have to ask, does this clown understand the difference between pre-increment and post-increment? And it means I have to look at every ++ in his program and ask, did he get this one right, did he get this one right? Because it's a little dyslexic thing, where it's really hard to tell when you've got them reversed, and it causes an off-by-1 error that's only off for an instant, but that's enough to cause a bug and it's really hard to debug those things. The argument in favor of ++ is that it improves readability, which is BS; it does not improve readability. It improves ambiguity, it improves confusion, which are things which are not desirable. So I was reviewing some code and I saw ++x; ++x. So what's going on there? So it's possible that it was a copy and paste error, except the code seemed to be working, so more likely what had happened was someone had done a ++x and then someone else noticed oh, there's an off-by-1 problem here, so they did it again. You know, if the original code had said += 1 then the obvious solution would be += 2, right? And it raises a question, why do we think we need completely different syntax for adding 1 to a variable than for every other value? How does that make any sense? The answer is it does not make any sense, but there's this emotional attachment we have to bad grammar in our languages that makes us feel, you know, ++, that's who I am, it's part of who I am. If you take ++ away from me, what am I? What's left? So for no cost, by adopting a more rigorous style, many classes of errors can be automatically avoided. So it occurred to me, so who is writing with bad style? So I came up with four classes of bad stylists and I have to confess at various times in my career I have been all of these people. So first are the under-educated. We see this a lot in web developers.
We have people who are writing JavaScript who received no formal training in JavaScript, who taught themselves how to do it by doing a view source of some of the worst code ever written, copies of crap going all the way back to Dreamweaver, just awful stuff, and nobody ever told these people, you can do this stuff well, right? They've never seen good examples, it's just sad. Then there's old school. You know, people who are extremely skilled and experienced in one particular language and now circumstances force them to work in another one where they don't have that same knowledge or experience. We see this a lot now in JavaScript, that Java and C++ guys are having to go into JavaScript because that's where the jobs are and they're really bitter about it. They say, okay, I'll write in your JavaScript, but there's no way I'm going to know what I'm doing. It's the principle. For some reason people feel really good about not knowing what they're doing with JavaScript, but in fact we know that programming is a complicated business and you never want to undertake it in ignorance, but for some reason we feel really good about being ignorant about JavaScript. Then there are the thrill seekers. These are guys who found out, hey, did you know you could put your semicolon at the beginning of a statement instead of at the end and sometimes it works? Ah! And so they just write this stupid-looking crap just because, ah! I think people like that should have to wear helmets when they're coding. And finally the exhibitionists. These are people who will study the standards, study the implementations, find all the weird corner cases, and there are a lot of weird, bizarre corner cases in this language, and intentionally use them in everything that they write. Stuff where you look at it and go, what the heck is that supposed to mean and they're proud of it. They go, look at that! You have no idea, but they're doing that, and they'll say that was intentional, I know what I'm doing.
And I say no, if you knew what you were doing you would not be doing that.

Performance

So I need to say something about performance because I sometimes hear the need for performance as the excuse for why we can't use good style and it turns out those arguments are all wrong. Performance-specific code is usually crufty because we're removing generality from the code. We're increasing the number of paths through the code, which makes testing and maintenance all much harder to do. Clean code is much easier to reason about so we should try to keep our code as clean as we can for as long as we can. Donald Knuth of Stanford said that premature optimization is the root of all evil, which is true. We should not attempt to optimize any code until we have measured it. We should then do the optimization. We should then measure again and if we did not get a significant improvement then we should consider the change to be a bug and back it out. If the reason we added cruft to the code base was to make it faster, and we didn't actually obtain the significant performance increase, then it fails and we should remove that code. So measure twice, cut once. It turns out most code has no measurable impact on performance. So we should only bother with optimizing the code that is taking the most time. You have only a limited amount of time if you are doing optimization so you don't want to waste that time optimizing code that doesn't matter. You need to optimize your optimizing. And finally, algorithm replacement is vastly more effective than code fiddling. So if your code is slow because you've got an n-squared loop, fiddling with the details of the inner loop will have no effect on significantly increasing n. The only thing that could work is replacing that code with a different algorithm and maybe a log-n algorithm, which can then give you a much larger n and that kind of change is much easier to do if the code has not already been pre-optimized.
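As one hedged illustration of algorithm replacement (the task, detecting a duplicate in an array of numbers, and both function names are invented for this sketch and do not come from the talk), compare an n-squared loop with a version that sorts first and then makes a single pass:

```javascript
// Quadratic version: compares every pair. Tuning the inner loop would
// not change how the running time grows with the length of the array.
function hasDuplicateQuadratic(numbers) {
    var i;
    var j;
    for (i = 0; i < numbers.length; i += 1) {
        for (j = i + 1; j < numbers.length; j += 1) {
            if (numbers[i] === numbers[j]) {
                return true;
            }
        }
    }
    return false;
}

// Replacement algorithm: sort a copy, then compare neighbors. With a
// typical engine's sort this grows roughly as n log n.
function hasDuplicateSorted(numbers) {
    var sorted = numbers.slice().sort(function (a, b) {
        return a - b;
    });
    var i;
    for (i = 1; i < sorted.length; i += 1) {
        if (sorted[i - 1] === sorted[i]) {
            return true;
        }
    }
    return false;
}

console.log(hasDuplicateQuadratic([3, 1, 4, 1, 5]));  // true
console.log(hasDuplicateSorted([3, 1, 4, 1, 5]));     // true
console.log(hasDuplicateSorted([3, 1, 4, 5]));        // false
```

Because the quadratic version was left clean, swapping it for the sorted version is a local change; whether the swap is worth making should still be decided by measuring before and after.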
Programming is the most complicated thing that humans do. Computer programs must be perfect and humans are not good at perfect, especially me. I am a deeply flawed human being, but I'm a pretty good programmer, but it requires a tremendous amount of effort and discipline. So designing a programming style demands discipline. It is not selecting features because they are liked or pretty or familiar; it is selecting them because they help you to reduce your error rate. The alternative is spending more time in the abyss. You know what I'm talking about; that's that cold, dark, hurtful place we descend into where we go and battle the demons and kill the bugs. When I first started programming it was wonderful. I was having this epiphany. I thought I had this new way of understanding everything in the world and I thought it was great and I thought everybody should learn how to do this because this is real exciting stuff. I don't believe that anymore. I think that we are able to do programming because there is something seriously wrong with us. Normal people can't do this. If a normal person goes down into the abyss, they come right back up and say, I'm changing majors, I don't know what's wrong with you people; I'm out of here! We're able to do this because of this amnesia thing that we suffer from, that we forget how much time we spend down there and how awful it is when we're down there and that's what allows us to go down there again. Also, we're tremendous optimists, that we think that we can actually go down there and win and come back and that's a great thing. You know, I don't think you can be a pessimist and be a programmer; you've got to be an optimist to be a programmer, it's just necessary. That's also why we can't schedule worth crap, right? How long is that going to take? And you go, oh, well it's about that many keystrokes, so it should take about that long. We forget about that, right? That's where most of the time goes.
So if you want to be more effective, more productive, figuring out a way to stay out of there, that's the big win. Figuring out how to reduce your keystrokes is irrelevant, right? You want to optimize the thing that's taking your time and the thing that's taking your time isn't the typing, it's that. So the JSLint style was driven by the need to automatically detect defects, and forms that can hide defects are considered defective, even if they are not bugs in that instance. So when I was developing JSLint there was a thing called comp.lang.javascript on the Usenet and there was an endless stream of people saying, my program doesn't work; can someone tell me why? And so I would take them and I'd throw them in JSLint and sometimes it would go, there it is, and sometimes it would go, I couldn't, you know, and why is that? And sometimes it was because they were using forms which made further analysis of the program impossible and eventually, reluctantly, I came to decide that if you're writing that way and you don't need to, then that's a problem because we want to make the errors in our programs stand out and you can do that by writing better all the time. So the idea of subsetting the language is not an original idea. For virtually every language you want to subset it. It's been said only a madman would use all of C++. It's also been said only a madman would use C++, but that's for another time. You know, and there's something nice about having everybody using the same subset. Every team writing C++ will figure out how much of the language they want to use and that's fine until they now have to take over someone else's code and go, oh, friends, crap, you know, and have to deal with that stuff. So if we could get everybody using the same subset, that makes interoperability a lot easier. So there will be bugs.
I'm not promising that you're going to be bug-free if you adopt a better programming style, but you can move the odds to your favor and you definitely win and you do that at no cost.

And Then There Was JavaScript

History of JavaScript

And then there was JavaScript. So to recap, first there was the big bang, then there was the dawn of man, and then there was JavaScript. So I was the first person to recognize that JavaScript had good parts. That was the first important discovery of the 21st century and when I announced my results, they were met with wild skepticism. No way! There could not possibly be any good parts in JavaScript, but it turns out that my results were validated and in fact JavaScript has some very good parts. This is the most important discovery of the 21st century and you made this? Yes, I did that. It's done. So to give you some historical background, at the National Center for Supercomputing Applications at the University of Illinois, Urbana-Champaign, there were a bunch of kids who were developing a client program for the internet. At that time there were a number of protocols that were being considered for the way that the internet was going to deliver information to people. There was WAIS, Archie, Gopher, FTP, Finger, and a few others, and the World Wide Web and these guys didn't know which of those was going to win so they wrote a program that worked for all of them and they called the program Mosaic. And because of the way that they implemented the viewer for the World Wide Web, the World Wide Web won. And the thing that they did was they came up with something called the image tag, which allowed the web to display images, which is something that the other formats couldn't easily do and because the web could display images, it could look like what you wanted it to be, even if it wasn't what you wanted it to be and that was sufficient. And that allowed the web to win and everything took off from there.
A bunch of the people from that project were lured to California where they became part of a company called Netscape. Netscape made the first commercial web browser called Netscape Navigator and it was a huge hit. It kind of disrupted everything and they were then planning what to do for Navigator 2 and they added a bunch of new features including support for electronic commerce. They also wanted to make it easy for end-user programming. They remembered something that had been on the Macintosh called HyperCard. HyperCard was a simple application creation program based on a simple metaphor of stacks of cards and it was an event-driven script thing and it was remarkably easy to use and they wanted something like that in the web browser. So they gave that job to this guy. This is Brendan Eich, a very smart guy; he'd been a kernel hacker at Silicon Graphics. His idea was he would write a Scheme interpreter to do this and he was told, no, don't do Scheme, do a language that people like. You know, make it look like Java or Visual Basic, something popular; this is for the kids. So he was given 10 days to create a prototype of this new interactive browser and in those 10 days he designed and implemented a new programming language, which is an amazing achievement. So from Java he took syntax. In fact, most of the things that are wrong with JavaScript are things that were inherited from Java. There is a language called Scheme, which is a dialect of Lisp that was developed at MIT and Scheme has lambdas. That's what Scheme calls its functions and he took Scheme functions and he put them into his language and there's a dialect of Smalltalk called Self that was developed first at Xerox PARC and then later at Sun Labs. Self took Smalltalk, which was the first modern classical language, object-oriented language and made it better, both better performance and more expressive, easier to use by removing one feature from Smalltalk. 
It's uncommon for someone to make a new language by removing things from another language; usually it's adding more stuff. The thing they removed from Smalltalk was classes. By removing classes they could make it faster and they could make it much better to program and so he took that idea and put it into this language and Netscape called it LiveScript. Now while that was going on, there was another language that was being developed by a guy at Sun named Jim Gosling. He started with something called Greentalk. He was then moved into a new company that was developing set-top box applications, where his language came to be called Oak. That company failed so he was brought back into Sun and they tried to figure out, what do we do with this language now. The internet was becoming popular, the web had become popular. They wrote a web browser in this language. That browser called HotJava was wildly successful, at least for a short time and the language that they wrote it in had its name changed to Java and it became wildly successful. So much so that Sun was making noise that the Java language was going to be the future of software, that if you design all of your programs to target the Java virtual machine instead of the operating system, we can be liberated from Microsoft and that was a wildly successful message and you know, Java shoots up like that. It is the most successful launch of a new programming language in history, it's amazing. At Netscape, they're making similar claims. They're saying, if you design your applications to target the web browser, it doesn't matter what operating system we're on and again, we can be liberated from Microsoft. You know, these two companies realized, if we're both going after Microsoft, we'd probably better work together because if we don't, Microsoft will play us off against each other and we'll both lose. 
So they form an alliance and the first thing they agree on is that Netscape adds Java to the web browser and in exchange for that, Sun will drop their Hot Java browser, which wasn't very complete anyway. So check, that's easy to agree to. Step number two. Sun says you have to kill LiveScript because that's an embarrassment. We're saying that Java is the last programming language you'll ever need, you can't then also be introducing another new language; you're just making us look bad, so kill it. Netscape refused to kill it for two reasons. One is they wanted a language for beginners and Java isn't that language. You need a lot of specific knowledge about Java just to write Hello World and they wanted something with a much lower barrier to entry, but there's also a practical problem. They wanted to launch the new browser right away and so the way they put Java in was they had Java talk to LiveScript through an interface called Live Connect. So LiveScript could talk to the browser, Java could talk to LiveScript through Live Connect and if they took LiveScript out, Java wouldn't work and so in order to get Java in there they'd have to delay the launching of the new browser and they didn't want to do that because they were on internet time and they couldn't afford to wait that long. So their alliance was at an impasse and almost failed when one of the founders of Netscape, maybe as a joke, suggested that they change the name of LiveScript to JavaScript and that they position it not as a new language, but as a subset of Java, interpreted Java, which was silly because Java was interpreted Java. It was Java's stupid little brother. So they went out and held a press conference in which they lied about the relationship of these two languages and echoes of that lie still reverberate pretty loudly through the internet. 
Meanwhile, Microsoft has noticed that there are these two companies in California that are getting ready to destroy Microsoft and they weren't ready for that yet. So Microsoft had completely missed the web and the internet. They thought the future of telecommunications was going to be fax and cable TV. So they went out and they bought a browser company, it was another spinoff out of Illinois called Spyglass, took their thing, relabeled it as Internet Explorer and decided we need to get one of these JavaScript things, too. So they put a team on reverse engineering the first JavaScript engine. Now it turns out 10 days is way too short a time to design and implement a programming language and there are lots of errors, lots of bugs, lots of design defects, lots of blunders. The Microsoft team discovers and carefully documents all of them and replicates them. Now usually when Microsoft goes to knock something off they can't help doing their own thing to it. For example, when Bill Gates told them, I want a Macintosh, they built Windows, okay? He didn't ask for Windows, he asked for Macintosh. That's how it goes there, but on this case they got it exactly right and in fact, they were able to keep the write-once-run-everywhere promise that Java was failing to keep and in fact, had they not done that we wouldn't be talking about this language today, but we'll get to that later. But they couldn't call it JavaScript because Sun had claimed ownership of the JavaScript trademark even though they had nothing to do with the development of the language because they claimed to own the word Java and in fact, they were shaking down coffee companies for daring to have the word java in their URLs, and this is true. So they called it Jscript, just as they called their implementation of Java J++ because they couldn't get a license to use the trademark. 
So now Netscape is alarmed that Microsoft is going to embrace and extend us, we're going to lose control of JavaScript, what are we going to do? We need to make a standard. So they went to W3C and said, W3C, we have developed a programming language for the web, we'd like you to help us make it a standard. Now it turns out, W3C had been waiting for a chance to tell Netscape to go to _____. So they told Netscape to go to _____. So Netscape then went to ISO. Eventually they end up at the European Computer Manufacturers Association, which is a long way to go for a California software company, but ECMA agrees, yes, we'll help you make a standard. So they convene a committee. Microsoft joins the committee, Microsoft dominates the committee. Basically the standard is based on Microsoft's notes that they made in reverse engineering the original JavaScript interpreter and they insist that all of those bugs, all those blunders, and all those defects remain in the standard, where they still exist today. So there's a lot that's intentionally wrong, which will never be repaired. Now when they went to publish it was, what do we call the standard? They can't call it JavaScript because that's a Sun trademark so they kick around other titles, webscript, netscript, they can't agree so it's published with the working title ECMAScript which is maybe the worst name ever put on a programming language and confusion still exists today. What's the difference between JavaScript, JScript, and ECMAScript? There are people out there who think that they are three remarkably similar, but different languages; no, they're three silly names for one silly language. So JavaScript is a standardized language. The edition that is in most versions of IE is the Third Edition which was published in December of 1999. Before that one was finished, work began on a Fourth Edition, which went on for 10 years and was abandoned. In 2009, a Fifth Edition was published, which defines two languages, a Strict and a Default. 
We'll talk more about that dialect this afternoon, and then recently the Sixth Edition was published, which will eventually find its way into browsers. We'll talk more about that on the third day. So where do bad parts come from? I think there are three sources. One is legacy. A lot of what happens in any language is repeating mistakes that happened in previous languages and most of what is wrong in JavaScript is not unique to JavaScript; it's wrong in lots of other languages, too. For some reason it just looks more stupid in JavaScript. I don't know why that is the case, but it is clearly the case. Another is good intentions. There were a number of features that were added to the language to make it easier, which failed to make it easier. Things like with, semicolon insertion, type coercion, implied global variables, were all intended to make the language easier for people, but they actually make it much harder, at least for professional programmers. And then haste. Ten days is just way too short a time to design and implement a programming language. Brendan did not intend for his proof of concept to be shipped as a product, but Netscape did that. I think that was inexcusable and he has suffered for that; it was an unfortunate thing. So for the most part, you know to give you a comparison, maybe the best designed programming language in history was Smalltalk-80. Xerox PARC spent a decade designing, refining, testing, improving that language. They spent almost as many years as Netscape spent days in designing that language and I think Xerox got it right even though the community did not agree. So for the most part the bad parts can be avoided, which is great and the problem with the bad parts isn't that they're useless, it's that they're dangerous. Objects So JavaScript is an object-oriented programming language, but what it thinks an object is differs from what most other object-oriented languages think. In JavaScript an object is simply a dynamic collection of properties. 
Every property has a key string that is unique within that object and that key string will be associated with any value. You can get a property from an object by using the dot notation or the bracket notation. They can be used in some cases interchangeably. If the brackets contain an expression that evaluates to a string, and that string is the same as a name, then you'll get the same property. That makes it possible to dynamically get at things without use of a reflection API. You can set properties of an object simply by assigning to those properties. You can add new properties or modify old properties simply by assignment and you can also delete properties from an object using the delete operator, although I very rarely see this used. Generally once an object is made, it keeps its stuff. One design mistake in JavaScript is that the keys must be strings. It would've been better if they could've been any value, but they're not. So you can pass any type into the brackets. JavaScript will turn that into a string so it means effectively the keys have to be strings. We have a very nice object literal notation. JavaScript's object literal notation was the inspiration for the JSON data interchange format, the world's most loved data interchange format and you've got curly braces that identify the object. Then you've got named fields or properties within the object. You don't have to put quotes around the names if the names are valid identifiers; if they are not identifiers then you need to put the quotes around them. So most object-oriented programming languages are classical. They're all about making classes where objects are instances of classes and classes inherit from other classes. JavaScript has a much simpler and much more highly evolved design, which is based on prototypes where objects inherit from objects and that's it. There are no classes. 
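To make the property operations described above concrete, here is a minimal sketch. The object `point`, the object `table`, and every property name in it are hypothetical and chosen only for illustration; nothing here comes from the talk's slides.

```javascript
// A hypothetical object made with the object literal notation.
const point = {
    x: 1,
    y: 2,
    "not an identifier": 3 // quotes are required: this name is not a valid identifier
};

// Dot notation and bracket notation reach the same property.
const viaDot = point.x;
const viaBracket = point["x"];

// The bracket form takes any expression, so a key can be computed at run time.
const key = "y";
const dynamic = point[key];

// Assignment adds or replaces properties; the delete operator removes them.
point.z = 4;
delete point.x;

// Keys that are not strings are converted to strings.
const table = {};
table[1] = "one";
const sameSlot = table["1"]; // reads the property that table[1] stored
```

The last two lines show the key conversion the speaker calls a design mistake: the number 1 and the string "1" name the same property.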
This was the idea that we got from Self, which was an improvement over Smalltalk, but it's a really controversial idea and most people who are writing in this language do not understand how this really works, but we're going to talk about it. So working with prototypes is actually very easy. You make an object that you like using any of the techniques available for making an object. You can create new instances that inherit from the object and then you can customize the new instances by assigning new properties or replacing properties. The classification and taxonomy operations that you have to do in an object system, in a classical system, aren't necessary, which turns out to be a huge win. So if you're working in a classical system, when you start off you have to classify all the objects and you have to figure out what are all the objects, what are they composed of, how are they related? Then you make a taxonomy in which you figure out what's going to inherit from what, what's going to implement what, and it's a really complicated hierarchy and usually you're doing that work at the point in the project, usually the beginning, where you have the least understanding of how the object system is actually going to work, which means it's almost certain that you're going to get it wrong and you see that in that things don't compose properly. You just can't, they don't work, you find yourself wishing you had multiple inheritance because there's no way to get from here to there and as you layer in more and more classes on top, you find that that brokenness starts to seep up into all the higher layers and eventually you reach the point where it's so broken that you have to refactor, which means you have to tear it all apart and put it back together again, which is really dangerous because there's a chance it might not ever go back together again. 
And all of that is due to the coupling of classes and if you get rid of the classes you don't have to do any of that, which is really quite remarkable. So the model that JavaScript provides is sometimes called delegation where an object can only do what it can do and if it's asked to do something that it can't do, it will designate another object to do that work on its behalf. This is also called differential inheritance where each object will only contain the material which distinguishes it from the object that it inherits from. The primitive in JavaScript for making a new object is Object.create where we pass it an object, which unfortunately is called a prototype and we make a new object that inherits from the old one. The word prototype here is problematic because it's so overloaded. All it means is this is the object that we want to inherit from. So let's diagram this. I'm going to make an object called mother, which will have two properties, a and b, whose values will be 1 and 2, and this is the data structure that is created. So we've got our key value pairs and we also have an invisible pointer to Object.prototype. So whenever you make an object literal it will ultimately inherit from Object.prototype. Then if I make a daughter object, I'll use Object.create(mother) so I now have a daughter, so I've got an empty daughter object which inherits from mother. So if I say daughter.b += 2, originally there's no value here so we get the inherited value of b. Then we add 2 to it and store it in the daughter. So storing operations will always go into the top-most object, but reading operations can go all the way down the prototype chain and we can have these chains go as long as you want. Generally, chains in this language tend to be very shallow. We don't get the deep hierarchies that you tend to see in classical languages, but if you want to go deep, you can. Then we can add material to the daughter object, which has no relation to the mother object. 
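The mother and daughter diagram can be sketched in code roughly as follows. The names `mother`, `daughter`, `a`, and `b` follow the talk; the property `c` and the helper variables are assumptions added for illustration.

```javascript
// mother is an object literal, so it inherits from Object.prototype.
const mother = { a: 1, b: 2 };

// daughter starts out empty and inherits from mother.
const daughter = Object.create(mother);

// Reading goes down the prototype chain: daughter has no b of its own yet.
const inheritedB = daughter.b;

// The read finds mother's b, the sum is stored on daughter itself.
daughter.b += 2;

// Material added to daughter has no effect on mother.
daughter.c = 3; // `c` is a hypothetical extra property

const ownKeys = Object.keys(daughter); // only the properties stored on daughter
```

After this runs, mother still holds its original values, which is the differential inheritance the speaker describes: daughter stores only what distinguishes it from mother.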
So we can add more stuff. So generally the way we'll use this is the mother object will be where we put the methods and then the instances will inherit those methods, okay? There's also a form, Object.create(null), which will make an object that inherits nothing. So you get an object that does not inherit from Object.prototype. This is handy if you want to make something which is just a container of stuff and you don't want to be confused by stuff that you might inherit from Object.prototype. So this will act much more like a hash table. Numbers Everything in JavaScript is an object, so numbers, Booleans, strings, arrays, dates, regular expressions, and functions, they are all objects. So let's look at numbers. Numbers are objects. It's a much simpler number system than you have in Java in that we don't have ints and Integers. We don't have either of those, we just have numbers. We make numbers with number literals. All of these number literals refer to the same object. There is only one number type in JavaScript, which is actually a very good thing and there are no integer types, which is something you have to get used to, but it's actually a good thing, too. The problem is that the one number type we have is the wrong one. It's based on 64-bit binary floating point from the IEEE-754 standard, which is strangely called Double in Java and other languages. Anybody care to know why it's called Double, why they picked such a silly name? It's something that comes from Fortran. Fortran had integer and real and later they added double precision, which was two reals put together in order to give you more precision and C took Fortran's Double Precision and shortened it to Double and everyone else has been using Double ever since then. So we don't have ints and I'm glad we don't have ints because I hate ints. Ints have some really strange properties. For example, we can have two ints which are both greater than 0. 
We can add them together and we can get results that are smaller than the original numbers, which is insane and inexcusable and how can you have any confidence in the correctness and the perfection of your system if it's built on a number type, which can do such insanely ridiculous things? So JavaScript does not have this defect in it, which I think is brilliant. So one problem with computer arithmetic in general is that the associative law will not hold and that's because computer numbers are necessarily finite and real numbers aren't so in many cases we're only dealing with approximations and when the values are approximate, then associativity doesn't hold. Now if you're just confined to the integer space, JavaScript integers go up to around 9 quadrillion, which is pretty big and 9 quadrillion is bigger than the national debt so that's big, right? That's big. So as long as your integers are smaller than 9 quadrillion, they work out exactly. When the integers get above 9 quadrillion, they don't do the crazy wraparound thing that ints do, they just get fuzzy. So if I take a number above 9 quadrillion and add 1 to it, it's like I added 0 to it, which is not good, but it's much less bad than what ints do and because computer arithmetic can be approximate then there are identities that we're used to thinking about which don't hold. So you need to be aware of that and cautious. So the most reported bug for JavaScript is that .1 plus .2 is not equal to .3 and that's because we're trying to represent decimal fractions in a binary floating point and a binary floating point cannot accurately represent most of the decimal fractions. 
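The three claims above (integers are exact up to about 9 quadrillion, they get fuzzy beyond that, and associativity can fail) can be checked with a short sketch. All variable names are assumptions made for this illustration.

```javascript
// 2 to the 53rd power is about 9 quadrillion (9007199254740992).
const big = Math.pow(2, 53);

// Below the limit, integer arithmetic is exact.
const exact = (big - 2) + 1 === big - 1; // true

// At the limit, adding 1 is like adding 0: no wraparound, just fuzziness.
const fuzzy = big + 1 === big; // true

// Decimal fractions are only approximated in binary floating point.
const sum = 0.1 + 0.2;
const equalsPointThree = sum === 0.3; // false

// Because the values are approximate, the associative law can fail.
const leftFirst = (0.1 + 0.2) + 0.3;
const rightFirst = 0.1 + (0.2 + 0.3);
const associative = leftFirst === rightFirst; // false
```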
It can only approximate them, but it approximates them with infinite repeating bit patterns, but we're not allowed to have infinitely long numbers and so they truncate and so every number is going to be slightly wrong, which is only a problem if you're living on a planet that uses the decimal system, but on such a planet, you're counting people's money using this. And when you're adding people's money, they have a reasonable expectation you're going to get the right sum and we're not guaranteed to get the right sum with a binary floating point, which is a huge problem. Numbers are objects and so numbers have methods. You don't have to box them in order to get object behavior. Every number is already an object. Every number inherits from Number.prototype. So if we wanted to add new methods to numbers, we can do that by adding them to Number.prototype. This is not something that applications should ever do, but it is a useful thing for libraries to do and in fact this is how we evolve the language. So we can add new methods to new versions of the standard and libraries can backfill old browsers and old implementations with the new stuff, as long as the new methods can be implemented in JavaScript. Numbers are first-class objects, which means that a number can be stored in a variable. It can be passed as a parameter. It can be returned from a function and it can be stored in an object. And because numbers are themselves objects, they can have methods. JavaScript has made the same mistake that Java made in having a separate math object or a math container for keeping the higher elementary functions. 
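As a rough sketch of what "numbers have methods" and "libraries can backfill" look like in practice: the first two lines use standard methods, while `clampSketch` is a hypothetical method name invented for this example, not part of any standard or library.

```javascript
// Number literals already have methods; no explicit boxing is needed.
const hex = (255).toString(16);
const rounded = (3.14159).toFixed(2);

// Library-style backfill: add a method to Number.prototype only if it is
// missing, so a native implementation is never overwritten.
// `clampSketch` is a hypothetical name used purely for illustration.
if (typeof Number.prototype.clampSketch !== "function") {
    Number.prototype.clampSketch = function (low, high) {
        // `this` is the number the method was called on.
        return Math.min(Math.max(this, low), high);
    };
}

// Every number now inherits the new method.
const clamped = (15).clampSketch(0, 10);
```

The guard around the assignment is the part that matters for libraries: it is the same pattern used to supply newer standard methods to older implementations.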
This was done in Java anticipating that in the future there might be very low memory configurations and they'd want to be able to remove the math functions, but that didn't happen because Moore's law kept cranking on memory capacity so that turned out not to have been a good strategy, but it wouldn't have worked anyway because you'd be throwing away essential things like floor. There's no good way to get the integer part of a number if you get rid of the floor function. So it couldn't have worked. There are also some constants stored in the Math object as well. So one of the things that we get from the IEEE format is something called NaN, which stands for Not a Number. It's the result of confusing or erroneous operations. For example, if you try to divide 0 by 0 the result is NaN. NaN is toxic, which means that any arithmetic operation with NaN as an input will have NaN as an output and despite the fact that NaN stands for Not a Number, it is a number. If you ask JavaScript what is the type of NaN, it says number and it's right. The thing I hate about NaN is that NaN is not equal to anything including NaN. So NaN === NaN is false, which bugs the heck out of me and even worse than that is NaN !== NaN is true, which I hate even more. So if you want to find out if something is NaN there is a global isNaN function and you can pass NaN to it and it will return true, which is good. Unfortunately, isNaN also does type coercion. So if you pass it a string like "Hello World", it'll try to convert the string into a number, the number that "Hello World" turns into is NaN, so "Hello World" is NaN, which is not true. So ever since Fortran we've been writing statements that look like this, you know, x = x + 1, which is mathematical nonsense. So ALGOL got this right. ALGOL came up with an assignment operator so this didn't look so ridiculous and BCPL did the same thing as ALGOL, which got it right. 
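The NaN behaviors described above can be checked directly. This is a small sketch with assumed variable names; the final check uses Number.isNaN, which comes from the Sixth Edition and is not mentioned in the talk.

```javascript
// Dividing 0 by 0 is one way to produce NaN.
const nan = 0 / 0;

// Despite the name, its type is "number".
const typeName = typeof nan;

// NaN is toxic: arithmetic with NaN yields NaN.
const stillNaN = isNaN(nan + 1); // true

// NaN is not equal to anything, including itself.
const equalToItself = nan === nan;   // false
const notEqualToItself = nan !== nan; // true

// The global isNaN coerces its argument to a number first.
const helloIsNaN = isNaN("Hello World"); // true, because the string converts to NaN

// Number.isNaN (Sixth Edition, an addition beyond the talk) does not coerce.
const helloIsReallyNaN = Number.isNaN("Hello World"); // false
```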
Unfortunately, Thompson liked this better and so we reverted back to it and we have not evolved away from this since. So we're stuck with this and it looks crazy, right? Because it looks like an equation, but there's no value of x which equals x + 1, right? Except, it turns out if you're using binary floating point, there is a value called Infinity and if you add 1 to Infinity you get Infinity. So this actually is an equation. There's a value of x for which this is true, and not just that. There's another value called Number.MAX_VALUE which is roughly 1 followed by 308 digits, that's really a big number, and if you add 1 to the biggest number that JavaScript knows, you would think that would be Infinity, but it isn't. It'll be MAX_VALUE so it holds. In fact, that is true for every number above 9 quadrillion. So there's a lot of values for which this holds, but not NaN. Even though NaN + 1 is NaN, NaN is not equal to NaN. So I hate that and NaN, well I hate that even more. Booleans and Strings So Booleans. There are exactly two Booleans in this language and they are true and false. So hooray! We got that right. We have strings. Does anybody know why they're called strings? Anyone? A chain of characters? Well, it turns out nobody knows. Okay. The first place I've been able to find in the literature that refers to a specific data type which is a sequence of characters is the ALGOL 60 report, but there is nothing in that report which explains why they chose the word string. I mean the first time you encountered it you must have thought, well this is strange because it doesn't look like a piece of string in any way, so how did that happen? I asked John McCarthy who was on that committee, he was the inventor of Lisp, why did you call it strings? And he said, well Clannaugh was doing this stuff with strings of symbols he said, but that didn't really answer the question because at the time of the ALGOL 60 report they would talk about a block was a string of statements. 
So when did it become specifically a string of characters? And he was really annoyed that I was asking him a question that he didn't know the answer to so that was the end of that conversation and then he died and he was the last survivor of the Lisp committee, or rather of the ALGOL committee, so we'll never know. So anyway, we've got them and we call them strings and I think we call them strings because there may have been some CPU architecture which linked characters together to make stringed sequences of characters and you know, I think that because what's the operation we do on strings? It's concatenate, but concatenate means to form links. It's an operation in chain making. So you would think if it's strings it should be tie or maybe slice or wind or something, something that strings do. So we've got some mixed metaphors going on. So I'm still searching through the literature trying to find out where strings came from, but I haven't found it yet. So anyway, we've got them, we've got strings. A string is a sequence of 0 or more 16-bit Unicode characters. It's characters in the sense of UCS-2, not UTF-16 because there is no awareness of surrogate pairs. At the time that JavaScript and Java were designed, Unicode was going to be a 16-bit character set. It later grew to a 21-bit character set and JavaScript is not aware of that. There is no separate character type. Characters are represented as strings with the length of 1. Strings are immutable, which is good; once you make a string, it cannot be changed. It's fixed for all time. You can make new strings by concatenating together bits of old strings, but you cannot change a string once it's made. Similar strings are equal, which is great. Java got that wrong and JavaScript got that right and hooray for JavaScript. String literals can use single quotes or double quotes with escapement. 
Both work exactly the same way, no magic quotes or quasi quotes or anything like that, but there's no reason to have two ways to make strings, but we do so given that, I recommend you use double quotes for external strings, strings that make sense outside of the program like URLs and templates and notes to the user, that kind of stuff, and use single quotes for internal strings like the names of properties and character constants and stuff like that. There is nothing that enforces that, but I think it's a good convention. We can convert a number to a string by using its toString method. I prefer instead to use the global String function, which does the same thing and will also work better on things that do not have toString methods. We can convert a string into a number by using the global Number function or by using the + prefix operator, which I prefer. We also have a parseInt function that I don't like; parseInt is something that was borrowed from Java. It will convert a value into a number, but it stops parsing at the first non-digit character and it won't tell you where it stopped or why it stopped or what was left over and you usually want to know that because that can be important, but it doesn't give you any clue. There was a big design error in that if the string you were parsing starts with a 0, it assumes that you meant that to be parsed as base 8, octal. And so if you're parsing something like a date or a time, which might have an 08 or an 09 that will be parsed as a 0 because the 0 goes into base 8, it sees the 8 and says you are not an octal digit so I stop here and we get the 0. Yes. So because of that I recommend you always include the radix argument, which says base 10, we're on planet Earth, darn it! You know, base 10, 0 to 9. Strings have a length property, which tells you how many 16-bit characters are in the string. Surrogate pairs will be counted as two characters in that count. Strings are objects and so they contain a big mess of methods. 
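The conversions and the length behavior just described can be sketched as follows. All variable names are assumptions for illustration. Note that the leading-zero octal behavior described above applied to older engines; the Fifth Edition removed it from parseInt, so the sketch only shows the radix form, which behaves the same everywhere.

```javascript
// Number to string: the method and the global function agree.
const viaMethod = (42).toString();
const viaFunction = String(42);

// String also works on values that have no toString method.
const fromNull = String(null); // null.toString() would throw instead

// String to number: the global function and the + prefix operator agree.
const viaNumber = Number("42");
const viaPlus = +"42";

// parseInt with an explicit radix of 10 handles a leading zero as decimal.
const eight = parseInt("08", 10);

// parseInt stops at the first non-digit and gives no hint that it did so.
const partial = parseInt("12px", 10); // 12, the "px" is silently dropped
const strict = Number("12px");        // NaN, the whole string must be numeric

// length counts 16-bit units, so a surrogate pair counts as two.
const grin = "\uD83D\uDE00"; // a single code point outside the 16-bit range
const units = grin.length;

// Similar strings are equal, however they were built.
const same = ("ab" + "c") === "abc";
```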
I'll let you look them up offline, but there are a lot of them. Many of them are actually very useful. Every string inherits from String.prototype so if we wanted to add new methods to strings, this is the place where we could do that. Again, this is not something applications should do, but this is how we've been growing the language. Arrays An array is one of the fundamental data structures. It's a contiguous series or span of memory divided into equal-size slots where each slot is indexed by a number; very fast, very efficient. JavaScript doesn't have anything like that. Instead what JavaScript does is, well the first version of JavaScript forgot to put arrays in and people figured out that, well, we can just use an object, right? And we can use the bracket notation and we can pass numbers, the numbers get turned into strings, and it all kind of works like an array sort of, and that's still what we do. So there is now an array data type. It inherits from Object and the indexes are converted into strings and used to retrieve and store values. The good news is that is extremely efficient for sparse arrays. Unfortunately, we don't do sparse array stuff, we're almost exclusively doing dense array stuff and it's very inefficient for that. One other advantage it provides is that there is no need to provide a length or a type when creating an array. We can just say, empty square brackets, that's a new array and then you can add as much stuff as you want to it. You don't have to worry about out-of-bounds errors because it's not really an array, it's a hash table. So every value is in bounds. Arrays, unlike objects, have a special length property and that property is always 1 larger than the highest integer subscript, which is not necessarily the same as the number of elements in the array. So we can make an array with an array literal, using the bracket notation. 
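A small sketch of the length rule and the string-keyed storage described above. The array name `sparse` and its contents are hypothetical.

```javascript
// No length or element type is needed to create an array.
const sparse = [];

// Storing at index 5 is allowed even though indexes 0 through 4 are empty.
sparse[5] = "six";

// length is 1 larger than the highest integer subscript,
// not the number of elements actually stored.
const length = sparse.length;

// Only one element exists, and its key is the string "5".
const keys = Object.keys(sparse);

// Reading past the end is not an error; it yields undefined.
const beyond = sparse[100];
```

Here length is 6 while only one element is stored, which is the gap between length and element count that the speaker points out.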
We can have multiple expressions in there separated by commas and each of those will provide a value to an element. We can append to an array by assigning to its current length, which is really weird looking, but it works. So in this case myList.length is 3 so we'll assign to it and the length will now be turned into 3+1, 4. Arrays come with a more interesting set of methods than objects have. Objects have very little useful stuff coming from Object.prototype. This stuff is all stored in Array.prototype, a much more useful set. We'll look at a couple of these. For example, there is a sort method where we can take an array of numbers and we can sort it. So anybody see what's going on here? It's sorting on the strings. So it takes each number and converts it into a string, it takes the other one and converts it into a string, compares them, and it does that n log n times. It's awful. So fortunately, sort can take a function argument, which receives pairs of values and it returns minus 1, 0, or 1, based on their relative magnitudes. So you can override this terrible behavior, but it's stupid by default. You can delete an element from an array, but it doesn't do what you expect. So usually you want to use the splice method to close up a hole. So let me demonstrate that. So here we've got an array containing four elements and I want to delete element number 1. That'll leave a hole in it, which is identified as the undefined value. If you try to retrieve a value from an array and it doesn't have that element, it returns the undefined value. So if you want to get rid of that, you use the splice method, you say go to element 1, delete 1 element. Then you'll get an array that's more like what you were expecting and the way it does that is it deletes element 1, it retrieves element 2, deletes element 2, reinserts it as element 1, it goes to element 3, reads it, deletes it, reinserts it as element 2. It's not fast. There's a look of horror in the crowd. 
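The slides for this passage are not in the transcript, so what follows is a reconstruction under assumptions: the array contents and names are invented for illustration. It shows appending at length, the default string sort, a comparison function, and delete versus splice.

```javascript
// Append by assigning to the current length.
var myList = ["oats", "peas", "beans"];
myList[myList.length] = "barley";              // length goes from 3 to 4

// The default sort compares the elements as strings.
var numbers = [4, 8, 15, 16, 23, 42];
var byString = numbers.slice().sort();         // [15, 16, 23, 4, 42, 8]

// A comparison function restores numeric order.
var byNumber = numbers.slice().sort(function (a, b) {
    return a - b;                              // negative, zero, or positive
});

// delete leaves a hole; splice closes it up.
var letters = ["a", "b", "c", "d"];
delete letters[1];
var lengthAfterDelete = letters.length;        // still 4
var hole = letters[1];                         // undefined
letters.splice(1, 1);                          // go to element 1, delete 1 element
var lengthAfterSplice = letters.length;        // 3: ["a", "c", "d"]
```

The comparison function here returns any negative or positive number instead of exactly minus 1 or 1, which sort accepts.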
Okay, so because objects and arrays are made out of the same stuff in this language, it turns out you can use just one of them most of the time, which is a really bad idea because sometimes it matters. And because it sometimes matters I recommend use the right one. Use objects when the names are arbitrary strings, use arrays when the names are sequential integers. Don't be confused by the term associative array. In JavaScript the associative array is called the object. Dates, RegEx, and Types We have a date function which was inspired by Java's Date class, which was not Y2K ready when it was introduced in 1995. I don't know if they didn't think the language was going to be around that long or if they didn't see it coming or what the deal was, but we survived it. Hooray, and it's been fixed so we got that. We got regular expressions, which were borrowed from Perl 4. This is a regular expression literal that matches regular expression literals. That's what I claim. The thing that I hate about this convention is that if you have a regular expression that's longer than an inch or 2, it's really hard to have any confidence that it contains what you think it contains and that it matches and rejects what you think it will match and reject and I will confess I've written regular expression literals that are several feet long and I am not proud of that. But the language doesn't make it easy to do because they all have to be scrunched together. You can't even use white space in them to kind of separate the elements so you can see what it's doing and all smashed together, it's virtually impossible for a human to decipher what that is. Fortunately, there's something on the web called regulex and you give a regular expression literal to regulex and it will give you a railroad diagram of what that regular expression does. So you can see what it does and you can have a good understanding of how it will behave. 
I will not write another regular expression without running it through regulex. I highly, highly recommend it. Yes? If you find yourself writing a regular expression that's too many inches long, can't you just get a smaller string? Yes, thank you for that. I had to, I'm sorry. No, you didn't have to. So you're probably wondering, when do the good parts start. I haven't heard any good parts yet. Functions. Functions are the good part and we'll talk about those in the next hour. So in JavaScript all values are objects except null and undefined. These are what are sometimes called bottom values and there is some debate as to whether a language should have any bottom values. There is no debate on the question, should a language have two bottom values; the answer is absolutely not, that doesn't make any sense, but we have two, and they act a lot alike, but they don't act exactly alike so they're not interchangeable. Some people use them interchangeably, which is a confusion, confusion causes bugs and I don't like that. So I recommend using only one of them and the one I choose to use is undefined because that's the one that the language itself uses, but both of them are used to represent values that are not objects. These are the only values in the language that are not objects. So if you try to retrieve a value from one of these, you're not going to get anything. If you try to execute them as functions, they'll throw exceptions. They're just used to indicate the absence or the end of something. So if I'm only going to use one, let's use the language's one. It's the default value for variables and parameters. So if you create a variable, but you don't initialize it, it actually gets initialized for you with the undefined value, and if you have a function and you don't pass enough arguments in, the missing parameters will get bound to undefined. 
It's the value of missing members of objects and arrays. So if you try to retrieve a property or an element and it isn't there, you don't get a runtime warning, you don't get a compile time warning. Instead you get the undefined value, which is actually a very nice thing because it allows you to reflect on objects without any effort; you just ask, do you have one of those, and if you get a value back then you've got something and if you didn't, then you didn't. One thing to watch out for though, is that you can store undefined in an object and then you can read that undefined back, but you can't easily tell, am I getting an undefined that was stored in the object or am I getting an undefined because it wasn't stored in the object? They both look the same and that's a confusion and I don't like confusion. There's a typeof operator in JavaScript which will return a string identifying what the type of something is. For example, if you pass it an object, it returns the string object, which is great. If you pass it a function, it returns the word function, which is great. If you pass it an array, it returns object, which is, well, not technically wrong because everything is an object, but it's certainly not useful. You'd like it to return array because that's what it is. But if you pass it null, it returns object, which is wrong. There's no excuse for that. It doesn't make it hard to detect if something is null because there's only one null value so you can ask, are you triple equal to null? That's very effective. The problem is when you're trying to figure out, are you an object, because if you are not an object there are certain things that I don't want to try next, and this test fails for that, which is really, really bad. So we have Array.isArray now that was added to the language in ES5. While it's extremely ugly looking, it does finally allow us to detect if some value is an array or not, which is good, and this is how it is implemented on older browsers. 
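A sketch of the typeof results described above, plus two helper functions. The helpers isObject and isArrayFallback are hypothetical names; the fallback uses the widely known Object.prototype.toString test, which is an assumption about what the missing slide showed.

```javascript
var kinds = [
    typeof {},                  // "object"
    typeof function () {},      // "function"
    typeof [],                  // "object", not "array"
    typeof null                 // "object", which is wrong
];

// Hypothetical helper: typeof alone is not enough, because of null.
function isObject(value) {
    return value !== null && typeof value === "object";
}

// A common fallback for engines that lack Array.isArray (assumed form).
function isArrayFallback(value) {
    return Object.prototype.toString.apply(value) === "[object Array]";
}

// A stored undefined and a missing property read back the same way.
var record = {a: undefined};
var looksTheSame = record.a === record.b;      // true
var reallyStored = record.hasOwnProperty("a"); // true
var neverStored = record.hasOwnProperty("b");  // false
```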
JavaScript is a boolish language in that every value is truthy or falsy. The falsy values are false, null, undefined, the empty string, the number 0, and NaN. All other values are truthy including the string 0, the string false, all objects, all arrays, even if they're empty, all of those are truthy. I think this was a huge mistake. The purpose of this was to allow if statements to work in a way that is similar to the way C works. C is not a strongly typed language and so it uses 0 to represent the number 0 and false and null and a few other things, end of strings, and I'm sure there are lots of others. So the condition of an if statement in C is 0 or not 0 and JavaScript wanted to look like that, but it turns out to be a bad idea. We'll see some examples of that tomorrow. JavaScript is a loosely typed language in that any of these types can be stored in any variable or passed as a parameter to any function. The language is not untyped because as we have seen, we've got a lot of very specific types, but it's loosely typed and I contend that this is a good part, although that statement is very controversial. The prevailing style in the world is calling for strong typing and there are really good arguments for strong typing. The argument is that strong typing allows the compiler to find errors very early and the earlier we can find errors, the more valuable that is, and that is true. And so when I first started working with JavaScript I was very, very nervous because this is a loosely typed language. Any kind of type can be passed in as any parameter or be stored in any property; how can you have any confidence that anything is ever going to work because you've got to be prepared for anything at any time, it was crazy, but what I found in working with the language was that to my surprise, my testing burden did not go up. 
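The truthy and falsy lists above can be checked directly. This is a minimal sketch; the array and function names are illustrative.

```javascript
var falsyValues = [false, null, undefined, "", 0, NaN];
var truthyValues = ["0", "false", {}, []];

// Report which branch an if statement would take for a value.
function branchTaken(value) {
    if (value) {
        return "then";
    }
    return "else";
}

var falsyBranches = falsyValues.map(branchTaken);    // six times "else"
var truthyBranches = truthyValues.map(branchTaken);  // four times "then"
```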
I thought I was going to have to be watching for all of these things and putting in explicit type checks of my own and I very rarely ever had to do that and in fact, what I found was that the sorts of bugs that a type checker can find, you find instantly anyway. If you're doing even the most trivial level of testing, those things show up right away. The type systems provide no help in finding the bugs that keep you up at night, and you end up doing a lot less work because, you know, in a strongly typed language, you end up spending a lot of time working against the type system. There are things you need to do in order to get your job done and the type system doesn't want you doing those things and so you have to figure out ways to get around it and that's a lot of effort, and any time you cast, then that means the type system is failing you and you end up casting way too much. Also, it turns out there is a large class of errors which are caused by the type system. Because the type system is causing you to circumvent it, you end up doing a lot of extra work and some of that work turns into bugs and in JavaScript that tends not to happen. So I find in JavaScript, writing correct programs is not harder than in Java. I think it's actually easier and you do a lot less work because you're not managing all of the types all of the time. That's my argument. You might not believe me and it doesn't matter. If you're writing in JavaScript, get used to it because that's how it works. So in JavaScript objects are passed by reference, not by value, which means that objects are not copied. In fact, there is no way, no easy way in JavaScript to make a copy of an object, which seems like a surprising omission, but in practice I have not found it to be a problem. The triple equal operator compares object references so it'll give you true only if both operands are the same object. There is no easy test for are two objects very similar. 
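A short sketch of reference semantics as described above. The object contents and the rename function are made up for illustration.

```javascript
var original = {name: "widget"};
var alias = original;               // no copy is made; both names refer to one object
var lookalike = {name: "widget"};   // a different object with the same contents

var sameObject = original === alias;         // true
var similarObject = original === lookalike;  // false: === compares references

// A function receives the reference, so its changes are visible to the caller.
function rename(thing) {
    thing.name = "gadget";
}
rename(alias);
var seenThroughOriginal = original.name;     // "gadget"
```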
You know, containing the same properties, having the same values. That again seems like a strange omission, but I've not found that to be a problem. JavaScript Syntax JavaScript is syntactically a member of the C family. We've got identifiers which can start with letters or under bars or dollar signs. I recommend not starting or ending your property names with under bars or dollar signs, but they seem to be really popular. Dollar sign was added to the language for the benefit of machines, for macro processors and code generators, things that wanted to be able to create names and be guaranteed they wouldn't conflict with your names, but they could use a dollar sign and that made them safe. Unfortunately, some kids found out that you could have a function called dollar and went crazy with it so dollar functions are all over the place now. We have both formats for doing comments. I recommend using just the slashslash line format because sometimes we want to use comments for commenting out code and regular expression literals can contain star slash and slash star. So weirdness can happen if you're using block comments. We have the same set of operators that you would expect to see in a C-like language. A few of them work a little bit differently that you need to be aware of. One big mistake in the language is that the plus operator does both addition and concatenation. This is a bad habit JavaScript learned from Java. In Java it's not so bad because it's strongly typed so you can predict which one it's going to do. JavaScript is loosely typed so it's not until you go to do a plus that it looks at the operands and if they're both numbers, it'll add them, otherwise it'll convert them both into strings and concatenate them, which is bad. 
For example, in a web application you might have a form field and you ask the user to type a number into the field, you then want to take that number out and add it to something, forgetting that the value you take out of a form element is always a string even though everything around it says it's a number. That's a really big source of confusion and you don't get an error, you just get extremely bad behavior. So we can convert, we can use the plus unary operator to convert strings into numbers and so you'll often want to do that as a defensive thing. If you've got a value and you want to add it and you're concerned that it might not be a number, you can coerce it to be a number before you do the addition, but if you do that, I recommend putting parens around it because otherwise you end up with two pluses next to each other which can look like another problem. We don't have integers formally so you can divide two integers using the divide operator, but you're not guaranteed to get an integer result so you need to be prepared for that and because it's binary floating point, even the floating point result might not be the one you would expect. The percent sign operator is the remainder operator, not the modulo operator, which is a shame because I think modulo is the more useful one. The difference is in which sign it uses. We already talked about double equal and the problems with it and I recommend you always use triple equal instead, just because there's so many weirdnesses in it, things that aren't expected. There's a meme on YouTube called wat. Has anybody seen wat? Yes, it's a crackup, right? And mostly it's playing fun with this. You know, they take two things which are wildly different and double equal them and it says true and they go, wat, and they get a big laugh. So don't use double equal. The logical and operator works a little differently than it does in Java because the operands do not need to be Booleans, they only need to be boolish. 
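A sketch of the operator pitfalls just described, assuming a form field value of "3". The modulo helper is a hypothetical function, shown only to contrast with the built-in remainder operator.

```javascript
var fieldValue = "3";                   // form fields always yield strings

var concatenated = fieldValue + 4;      // "34": one operand is a string
var added = 4 + (+fieldValue);          // 7: coerce first, with parens for clarity

var quotient = 7 / 2;                   // 3.5: dividing integers need not give an integer
var inexact = 0.1 + 0.2 === 0.3;        // false: binary floating point

var remainder = -7 % 3;                 // -1: the sign follows the dividend

// Hypothetical helper: a true modulo, where the sign follows the divisor.
function modulo(n, d) {
    return ((n % d) + d) % d;
}
var mod = modulo(-7, 3);                // 2
```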
So if the first operand is truthy, the result is the value of the second operand, otherwise it's the value of the first operand. It does do the short-circuiting though, so the second operand will only be evaluated if the first one was truthy. And logical or works in a similar way. The exclamation point is the logical not operator. If the operand is truthy then the result is false, otherwise the result is true. If you have bang bang twice, it'll turn a boolish value into a Boolean. We have bitwise operators, but we don't have ints. So the way that works is we'll take the 64-bit binary floating point, turn it into a 32-bit signed integer, do the nasty to it, and then convert it back. So in some languages you'll see people doing a shift because they think it's going to be faster than a multiply. You shouldn't do that, even in those languages, because compilers are smart enough. You know, you should write the thing that you intend so that someone reading the program will know what's supposed to be happening and the compiler will sort out the fastest way to do it, but in this language, definitely, you're not going to get any speed improvement by doing the wrong one. Statements We have the same set of statements that you would expect to see in a C-like language, again with some differences. We've got labeled break, which is good, so if you've got nested loops and switches and stuff you can break out of innermost things, which is good. We've got the for statement, which we can use to iterate through arrays, but I don't recommend using that. In ES5 we got the forEach method, the map method where you can have a function called on each element of the array. That eliminates almost all need for the for loops. So I don't use for loops anymore. We have a for in statement, which iterates through all the names of all the properties of an object. 
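A sketch of the logical and bitwise operators described above. The touch function is a hypothetical helper that counts how often its operand position is actually evaluated.

```javascript
var calls = 0;
function touch() {
    calls += 1;
    return "touched";
}

var andFalsy = 0 && touch();        // 0: first operand is falsy, touch is skipped
var andTruthy = 1 && touch();       // "touched": result is the second operand
var orTruthy = "set" || touch();    // "set": first operand is truthy, touch is skipped
var orFalsy = "" || touch();        // "touched"
// calls is now 2

var asBoolean = !!"0";              // true: bang bang turns boolish into Boolean
var truncated = 2.7 | 0;            // 2: the operand passes through a 32-bit integer
var wrapped = 4294967301 | 0;       // 5: bits above 32 are discarded

// forEach and map call a function on each element, with no for loop.
var doubled = [1, 2, 3].map(function (n) {
    return n * 2;
});                                 // [2, 4, 6]
```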
Unfortunately, it also iterates through all of the inherited properties, which are usually your methods, and so your methods get mixed up with your data and it gets to be a mess. Fortunately, in ES5 we got Object.keys, which takes an object and returns an array of strings, which is just the enumerable own properties of the object, which are usually the ones you want. So I don't recommend using for in either. We already talked about the problems with the switch statement. Oh, one thing about the switch statement though. JavaScript improved the switch statement in one way and that is that the switch value does not need to be a number. It can be a string, which can be very nice because you can switch on a greater set of values, and the case values can be expressions. They don't have to be constants so that can be useful in doing internationalized applications, in that you can case against a function which will return the yes value for this language; that could be a nice thing. We have exception handling in this language, which is nice. Before ES3 we didn't, which meant that you had to write programs that never went wrong ever because there was no way to recover from anything. That was hard. So we've got exceptions now. You can throw literally any value. There's a convention that you throw something that comes from the Error constructor, which is the same as creating an object that has a name property and a message property, but in fact you can throw literally anything. So the way that exceptions are used is very different than in Java. Exception handling is very simple because we don't have exception types; there is only one catch block and it catches everything and generally it's going to ignore whatever it caught. It doesn't care what happened. All it cares about is that didn't work so let's try something else, maybe it'll work instead. 
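A sketch contrasting for in with Object.keys, followed by the simple style of exception handling described above. The objects and the attempt helper are hypothetical, invented for illustration.

```javascript
// An object that inherits a method from its prototype.
var proto = {
    greet: function () {
        return "hi";
    }
};
var item = Object.create(proto);
item.name = "lamp";
item.size = 3;

var viaForIn = [];
var key;
for (key in item) {
    viaForIn.push(key);             // also visits the inherited "greet"
}
var viaKeys = Object.keys(item);    // ["name", "size"]: own enumerable properties only

// Hypothetical helper: one catch block that ignores what it caught.
function attempt(action, fallback) {
    try {
        return action();
    } catch (ignore) {
        return fallback;            // that didn't work, so try something else
    }
}

var recovered = attempt(function () {
    throw {name: "Oops", message: "any value can be thrown"};
}, "plan B");
```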
So in Java you tend to see exceptions used to implement a form of computed go-to where you can get very complicated control paths that are dictated by whoever threw the exception and I think that was probably a big mistake. JavaScript didn't make that particular mistake. So we tend to use exceptions properly and use them only for failures. We'd never do normal control paths using exceptions. Yes? Could you clarify that? I didn't understand that statement. So in Java there's a tendency to look at why did this fail, right? Oh, you have different reasons, exception a, exception b, exception c. Yes, exactly. You've got a whole bunch of exception cases and so you've got who threw the exception deciding how you're going to go through your code and that usually means that the exceptions are not actually exceptional, they're just alternate control paths, right? And sometimes that's motivated by weakness in the type system, that you've got something that wants to return an int, but something else is going to happen, which is not exceptional, it's just not an int. I'm out of values and so you'll throw an exception instead, whereas in JavaScript, because we're loosely typed, you can pass a number or you can pass undefined, or you can pass a string or you can pass an object, you can return anything that you need to and so these things are not exceptions, they're just more modes of normal processing. So your control flows tend to get much simpler. Any other questions? Okay, so that's the end of this hour. So let's take a break and we'll come back. Hey, I have a quick question from chat; people are asking why you don't use for loops and instead favor forEach. Yes, forEach is just much nicer. Once you get used to it, it's more readable. In future versions of the language it's going to be parallelizable. It's a more modern construct; it's more functional, it's more composable, it's better in every aspect. What about the fact that you can't leave a forEach with a return? 
I'm sorry? The fact that you can't leave a forEach, like stop a forEach loop? Oh, you can. You can, instead of using forEach use every and then you can have something return false and then it stops. Function the Ultimate Functions So, Act III, Function the Ultimate. You've been wondering, when are we going to get to the good parts, right, because everything has been pretty horrible up until now, right? Hasn't this been kind of tragic? So we're going to get to some good parts and the good parts in JavaScript all wrap around the function. Functions are really powerful bits of stuff. In other languages you'll have methods, classes, constructors, modules, and in JavaScript all you need are functions. Functions are so powerful, they can do the work of all of those things and more, just sort of universal stuff for building things. We make functions using the function expression or function literal. It will return a new function object, which could then be invoked. A function expression starts with the word function. It can then take an optional name, which allows you to call the function recursively. It also allows for documenting things like the code and stack traces. You pass it parameters, which are a set of names separated by commas, wrapped in parens and it takes a body, which is a block wrapped in curly braces. A function expression produces an instance of a function object and every time that expression gets evaluated you'll get a new function object. Function objects are first class, which means they can be passed as arguments to functions. They can be returned from functions. They may be assigned to variables and they may be stored in objects and arrays. This is very different than in languages where functions are sort of these static things which just kind of exist before the program starts. In JavaScript the functions run and add things to the environment as they're compiled and because functions are objects, they inherit from Function.prototype. 
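A sketch of function expressions as first-class values, as just described. All names here (fact, twice, toolbox, makeFunction) are hypothetical.

```javascript
// A named function expression; the name allows recursion.
var factorial = function fact(n) {
    return (
        n <= 1
        ? 1
        : n * fact(n - 1)
    );
};

// Functions can be passed as arguments and returned from functions.
function twice(f) {
    return function (x) {
        return f(f(x));
    };
}

// Functions can be stored in objects.
var toolbox = {
    double: function (x) {
        return x * 2;
    }
};
var quadruple = twice(toolbox.double);
var twenty = quadruple(5);                  // 20

// Each evaluation of a function expression yields a new function object.
function makeFunction() {
    return function () {};
}
var distinct = makeFunction() !== makeFunction();           // true
var inherits = Function.prototype.isPrototypeOf(factorial); // true
```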
We use the var statement to declare and initialize variables within functions. You don't specify a type in the var statement, it can accept any type, and variables declared anywhere within a function are visible everywhere within the function. So we don't have block scope with the var statement, we only have function scope. So the var statement does a really weird thing. It gets split into two parts. The declaration part gets hoisted to the top of the function where the variable gets initialized with undefined and the initialization part turns into an ordinary assignment statement. So here we've got var myVar = 0; var myVar = undefined gets hoisted to the top of the function and at the site of the original statement we get an ordinary assignment statement which does the initialization. To make things more complicated, JavaScript has a function statement or function declaration, which unfortunately looks exactly like the function expression. It starts with the word function. In this case the name is mandatory, you can't leave it out. It takes the same parameters and takes the same body. So it looks exactly like the other one. The function statement is a shorthand for a var statement with a function value. So function foo expands into var foo = function foo and that further expands because of hoisting into var foo = undefined and foo = function foo. Both of these now get hoisted to the top of the function. It's because of this second hoisting that the language does not allow you to declare a function inside of an if because the assignment of the function variable is going to be pulled out of the if and moved to the top. So that's illegal. Now it turns out most of the JavaScript engines allow you to do it anyway, but because the standard says you can't, they all have different opinions on what it actually means. I don't recommend doing that. So it's really confusing to have both function expressions and function statements, which both look exactly the same. 
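A sketch of the two kinds of hoisting described above. The function names are illustrative; myVar follows the name used in the talk.

```javascript
function hoistDemo() {
    var seenEarly = myVar;      // undefined, not an error: the declaration was hoisted
    var myVar = 0;              // the assignment stays at its original site
    return [seenEarly, myVar];
}

function callBeforeDeclaration() {
    return helper();            // works: the function statement is hoisted with its value

    function helper() {
        return "hoisted";
    }
}

var demo = hoistDemo();                 // [undefined, 0]
var early = callBeforeDeclaration();    // "hoisted"
```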
So how do you tell them apart? It depends on where you are. If the first token of a statement is function then it is a function statement. If the word function occurs any place else, it's a function expression. So we talked this morning about scope, about block scope versus function scope, and block scope is a very common thing. Function scope is a very unusual thing, which is unique to JavaScript's var statement. So function scope is sufficient, but function scope looks like block scope, or at least the syntax is the same as block scope languages, and so you get confusion. So in some languages that have block scope you can do stupid things like this where you can have two loops, both using the same variable name as their induction variable, and it works because in those languages, each of these will be in a different scope and so they won't interfere with each other. It's extremely unwise, but legal. In JavaScript it is also legal, but it's worse than unwise because there's only one i variable created here. Both loops will be using the same i variable and so they will interfere with each other really badly and this loop will never correctly perform. And that's because everything gets hoisted to the top? It's because both var i declarations get hoisted to the top and there is no check to see if a variable has already been declared. So if there's a second var declaration for the same name, good! That's double good. It should be an error, but it's not, and because it's not an error you can get into trouble. So because of all of this weirdness I recommend declaring all of your variables at the top of the function because that's where they're actually being declared and also declaring all of your functions before you call them. 
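The slide with the two loops is not in the transcript, so this is a reconstruction under assumptions: a pair of nested loops that both declare var i. Under function scope they share a single i, so the outer loop ends after one pass.

```javascript
function countPairs() {
    var count = 0;
    for (var i = 0; i < 3; i += 1) {
        // This second var i is the same variable as the outer one.
        for (var i = 0; i < 3; i += 1) {
            count += 1;
        }
        // Here i is 3, so the outer loop stops after its first pass.
    }
    return count;
}

var pairs = countPairs();   // 3, where a block-scoped language would give 9
```

With other bounds the same mistake can produce a loop that never terminates, which is why this version keeps both limits equal.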
All of this hoisting nonsense was created specifically so that you don't have to do that, that you can call a function before it's declared, but that requires that all this hoisting weirdness be going on and that you understand what the hoisting is doing and I think that's too much to expect of the people reading your program, so instead, I think it's much better to say, declare the function before you call it, and it's a much easier thing to understand. Alright, call? You're using the word call where I think I would call it defining? In JavaScript you can lexically call a function before you declare it. I can say. Oh, I see what you mean. I can say foo() and then below that I can say function foo. You don't mean to say var, some function name and then later on define the function? Oh, I see; never mind. These two statements, there's not this parallel between these two statements that I'm looking for, because functions are variables? No, functions are values that can be stored in variables. Okay, if they have a name? Even anonymous functions can be stored in variables. Okay. Function Best Practices JavaScript functions have return statements. A return statement stops the execution of the function, returns control to the caller, and can also optionally return a value. A return statement can take an expression. It will evaluate that expression and return its value, or you can have a return statement that returns nothing. It will actually return the value undefined, or if the function falls through the bottom, it will return the undefined value. I recommend that you not make function objects in a loop. It can be wasteful because a new function object is created on every iteration so that could be expensive, and it can also be confusing because the new function closes over the loop's variables, not over their current values, and that confusion can lead to bugs. 
For example, here we've got a for loop, which is going to loop over an array of divs and it's going to add a click handler to each one, which when clicked on will alert the id of the div. You know, this code looks very straightforward, but it will fail and the way it fails is no matter which div you click on, they will all display the same id. It'll be the last one and it's because what's being captured by the inner function is the current value of div id, not the value of div id at the time that the function was created. So the way to get around that is to not use a for loop at all, use forEach and we'll pass in a function which will receive each div as an argument and will alert correctly. We invoke functions with the parens suffix operator, which will surround 0 or more comma-separated arguments and each of those arguments will be bound to one of the parameters of the function. If a function is called with too many arguments, the extra arguments are ignored, it is not an error. If a function is called with too few arguments, the missing values will be the undefined value. There is no implicit type checking on the arguments, on their types or on their numbers. So if you really care about that stuff, you need to check it yourself. Generally, you don't have to. It turns out most of the defaults in JavaScript are right so that if you don't do checking, it'll usually do the right thing anyway. So in addition to any parameters that are formally defined as part of a function, there are also two bonus pseudo parameters that you can also get access to, arguments and this. I don't recommend using either of them, but they're both very popular in the language so I need to describe them both. We'll start with arguments. When a function is invoked, in addition to its parameters, it also gets a special parameter called arguments and it contains all of the arguments from the invocation. 
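The click handler slide is not in the transcript, so this sketch replaces the DOM with a plain array of ids, which is an assumption made so it can run anywhere. The handlers return the id instead of alerting it. It also shows what happens with too many or too few arguments.

```javascript
var ids = ["header", "main", "footer"];

// Functions made in a for loop all close over the same variable.
var loopHandlers = [];
var i;
var id;
for (i = 0; i < ids.length; i += 1) {
    id = ids[i];
    loopHandlers.push(function () {
        return id;                  // reads the variable when called, not when created
    });
}
var fromLoop = loopHandlers.map(function (handler) {
    return handler();
});                                 // ["footer", "footer", "footer"]

// With forEach, each call of the outer function has its own id parameter.
var eachHandlers = [];
ids.forEach(function (id) {
    eachHandlers.push(function () {
        return id;
    });
});
var fromEach = eachHandlers.map(function (handler) {
    return handler();
});                                 // ["header", "main", "footer"]

// Extra arguments are ignored; missing ones are undefined.
function pair(a, b) {
    return [a, b];
}
var tooMany = pair(1, 2, 3);        // [1, 2]
var tooFew = pair(1);               // [1, undefined]
```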
So everything that got passed in the parens that called the function, every one of those values will be in the arguments array. Now one problem with the arguments array is that it is not really an array, even in the weird way that JavaScript thinks of arrays. It is an array-like object in that it's an object that has a length property, but it's not a magical length property and it doesn't inherit any of the useful array methods. So it's a difficult thing to work with. You do get arguments.length, which will be the number of arguments that were actually passed, and there's a weird interaction with the parameters. So if you change arguments sub-zero, the first parameter also changes. If you change the second parameter, then arguments sub-one changes. For the people who make JavaScript engines, this is maybe the most hated feature of the language because they're working really hard to try to make it go fast and you can't go fast when you have to mess with behavior like that. So let me show an example of how you could use it. So we've got a simple sum function, which will receive some variable number of numbers and it will add them up and return the total. So we're going to get from arguments.length n, the number of things that we're going to add. We're going to loop through n and we're going to get each of those arguments and add it to the total. Notice I didn't include any parameters in sum because I didn't need to. I could have, but it wasn't necessary because I'm getting all the material from arguments. Okay, so now this. This is a really difficult thing to have in any language because it makes the language hard to talk about. It's like pair programming with Abbott and Costello so bear with me. So the this parameter contains a reference to the object of invocation. This allows a method to know what object it is concerned with. This allows a single function object to service many objects and it's the key to prototypal inheritance. 
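A sketch of the sum function as described in the talk; the exact code on the slide is not in the transcript, so the details are a reconstruction. The argumentsIsArray helper is hypothetical.

```javascript
// No formal parameters: everything comes from the arguments pseudo parameter.
function sum() {
    var i;
    var n = arguments.length;       // how many values were actually passed
    var total = 0;
    for (i = 0; i < n; i += 1) {
        total += arguments[i];
    }
    return total;
}

// arguments is array-like, but it is not an array.
function argumentsIsArray() {
    return Array.isArray(arguments);
}

var ten = sum(1, 2, 3, 4);          // 10
var zero = sum();                   // 0
var notArray = argumentsIsArray(1); // false
```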
There are four ways to call a function in JavaScript. The function form, method form, constructor form, and apply form. Four forms. I think in a well-designed language there should be one form, but we've got four and the way they vary is in what happens to this. So let's start with the method form and in the method form we say some object, dot, method name or some object bracket, you know, an expression that evaluates to a method name. When we call a function this way, then this will get bound to this object and it allows the method that gets called to know which object it's going to be manipulating. In most languages, the binding of this happens fairly late, but nobody does it as late as JavaScript. JavaScript doesn't bind this until the moment of invocation. Then there's the function form, which looks the same except we don't have an object prefix so we just have some function value and we call that. In this case, this gets bound to the global object. It's the thing that is the container of all the global variables and that's a terrible thing because it's leaking way too much capability, it's a security hazard, it's a reliability hazard and this was sort of fixed in ES5 Strict Mode, but not completely. In ES5 Strict Mode this gets bound to undefined. So at least we're not binding it to something dangerous, but sometimes it's not what you want, either. That if you have an inner function inside of a method, you want the inner function to help the method do its work, but the inner function doesn't get to see this because the inner function gets called as a function and so it's this will be either the global object or undefined. So the workaround is in the outer function, in the method, you create a variable called that, assign this to it, and then the inner function gets to see that. Then there's a constructor form. 
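The method form, the function form, and the that workaround described above can be sketched as follows. The `counter` object and the `whatIsThis` function are hypothetical names used only for illustration.

```javascript
var counter = {
    count: 0,
    // Method form: this is bound to counter at the moment of invocation.
    increment: function () {
        this.count += 1;
        return this.count;
    },
    addAll: function (numbers) {
        var that = this;          // keep the method's this for the inner function
        numbers.forEach(function (n) {
            that.count += n;      // the inner function is called in function form
        });
        return that.count;
    }
};

function whatIsThis() {
    'use strict';
    return this;                  // function form in strict mode: undefined
}

var afterIncrement = counter.increment();    // 1
var afterAddAll = counter.addAll([2, 3]);    // 6
var plain = whatIsThis();                    // undefined
```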
It looks like the function form, except there's a new prefix and when a function is called with the new prefix, a new object is created and will be bound to this and if the constructor function does not have an explicit return value, then this will be the return value and this is used in the pseudo classical style, which we'll get to in a few minutes. Then finally there is the apply form. In the apply form, you call the apply method of the function object and you get to specify what the value of this is going to be. In addition, you can also provide either an array of arguments or you can provide individual arguments separated by commas. So to summarize, there are four invocation forms, function, method, constructor, and apply, and they each vary according to what they do with this and again, I don't recommend using this at all, but that's how most people use the language and I'll show an alternative in a few minutes. So you all know about recursion; that's when you have a function that is defined in terms of itself or that can call itself, a really important thing. I don't need to tell you about Quicksort, that's one of the important examples of using recursion. JavaScript has recursion, which is good. If you're not familiar with recursive functions, I highly, highly recommend a book called The Little Lisper, which has been revised and it's called The Little Schemer. It's based on the scheme language, but it's not really about scheme, it's about recursive functions and everything in that book can be written in JavaScript and this web page will give you the key to doing that translation. I highly, highly recommend it and this is one of those books that can significantly change the way you think in a really good way. There's another book called The Principles of Programming Languages by R.C. Tennent and one of his principles is the principle of correspondence in which he talks about how variables are like parameters in some languages. 
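The constructor form and the apply form summarized earlier can be sketched like this. `Point` and `describe` are hypothetical names. Note that in standard JavaScript the array variant is the `apply` method, while the comma-separated variant is the separate `call` method.

```javascript
// Constructor form: new creates an object, binds it to this,
// and returns it because the function has no explicit return value.
function Point(x, y) {
    this.x = x;
    this.y = y;
}
var p = new Point(3, 4);

// Apply form: the caller chooses the value of this.
function describe(prefix, suffix) {
    return prefix + this.x + ',' + this.y + suffix;
}
var viaApply = describe.apply(p, ['(', ')']);    // arguments as an array
var viaCall = describe.call(p, '(', ')');        // arguments separated by commas
// both are '(3,4)'
```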
So here we have two versions of the factorial function. In one of them, result is a variable, and in the other, result is a parameter; otherwise these two functions do exactly the same thing and so this demonstrates the similarity of variables and parameters based on how the function is constructed. The interesting thing about the second function is it's using an immediately invoked function expression. So here we have a function expression, which is creating a new function object, which we then call immediately, passing in the 1. So in general we can take any expression and wrap it in a function that will return that expression and call the function immediately and it does the same thing as the expression; it's just more verbose, but it does the same thing. Now it turns out this is not true in JavaScript for all values. For example, this and arguments change their meaning when you put them in a different function. So it doesn't work for them, but it works for all other expressions, and it works for statements. You can take any bunch of statements and put them in a function and call it immediately and it's as though you had executed those statements, except that var, function, break, continue and return change their meaning when you put them in another function, otherwise you can do this transformation and there are some interesting things that can happen as a result of being able to do this. So this is a feature that you look for in a functional language. Closure I promised you some good parts and we're finally getting to a good part and this is something that you probably don't have experience with. It's something called closure. It's also called lexical scoping or static scoping. It's a consequence of functions that can nest and functions that are first-class values and JavaScript has it and is almost completely right in the way it implements them and it's maybe the best feature ever put into a programming language.
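The two factorial versions mentioned earlier could be written roughly as below. This is a sketch based on the description; the names `factorialVar` and `factorialParam` are assumptions, chosen so both versions can live in one file.

```javascript
// result is a variable.
function factorialVar(n) {
    var result = 1;
    while (n > 1) {
        result *= n;
        n -= 1;
    }
    return result;
}

// result is a parameter of a function expression that is invoked
// immediately, passing in the 1.
function factorialParam(n) {
    return (function (result) {
        while (n > 1) {
            result *= n;
            n -= 1;
        }
        return result;
    }(1));
}

var a = factorialVar(5);      // 120
var b = factorialParam(5);    // 120
```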
The context of an inner function includes the scope of the outer function and the inner function enjoys that context even after the outer function has returned. Now I expect that statement made no sense to you at all and I understand that so I'm going to have to explain this in steps. So we'll start with the observation that function scope works like block scope. So you're familiar with the idea of block scope. So next year we'll probably have let statements in most of our systems and here we've got two blocks, one inside of the other and the inner block can see its variables and the variables of the outer block. The outer block can only see its own variables. So I assume everybody is comfortable with the idea of block scope, right? We've had that since 1960 and it's great. So we can do the same thing with functions. You can think of a function as simply being a block with a little bit of extra mechanism attached to it so that you can invoke it in the future, but you don't have to do it immediately and the same relationships apply. So we've got the yellow function which can see a and b and the green function that can only see a, right? Exactly the same relationship and we can represent this relationship as sets. So this is the set of variables that the outer function can see and this is the set of variables that the inner function can see. You know, nothing surprising here, except that we can describe this relationship in that the set of variables of the inner function encloses the set of the outer function and that's why we call this closure. And it's kind of too bad that we call it closure because most people think of closure as meaning something else like retribution or vengeance. You know, I've been victimized but I'm getting me some closure and it's going to feel good. It's not that kind of closure, but that's what we're calling it so we're kind of stuck with that and it seems like a simple idea, right?
You just nest functions, which have what should be an obvious relationship because of the way that scope works, but it took a long, long time for this idea to get developed. So you needed a language that had three things. You needed lexical scoping, you needed nested functions, and functions as first-class values and pretty early on we got languages that would have two out of three, but we didn't get a language that had all three until Scheme. Scheme was an experiment at MIT in an attempt to understand Carl Hewitt's actor model in the early 1970s. The experiment failed in that they never did understand what Carl was talking about, but they did discover this new way of programming, which I think was the most important breakthrough in the history of computing and like all of the really important, significant breakthroughs, the world took no notice of it whatsoever and in fact it took two generations before we finally figured out that this was a good idea and brought it to the mainstream, but it was always a good idea. And the reason it took so long was because of this problem. So we've got the same thing we had before, we've got an inner function and an outer function, but this time the inner function survives the execution of the outer function because the outer function is returning the inner function. So we will call the green function, it'll allocate a on the stack, it will return a new function and exit. Now we want to call the yellow function that it returned, but the yellow function wants access to a, but a is no longer on the stack. Boom! It took us 20 years, 40 years to figure out what to do about this. It turns out the solution is trivial. Don't use a stack. Allocate all of the activation records on the heap, get a good garbage collector, done. Really, really easy, but it took a long time to figure this out. Can you repeat that solution? Yes. Don't allocate the activation records on the stack. Allocate them on the heap, get a good garbage collector, done.
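The green and yellow functions discussed above can be sketched as follows. The values 3 and 4 are arbitrary assumptions; the point is only that the inner function still reaches `a` after the outer function has returned.

```javascript
function green() {
    var a = 3;                   // belongs to green's activation
    return function yellow() {
        var b = 4;
        return a + b;            // yellow can see both a and b
    };
}

var yellow = green();            // green has returned at this point
var total = yellow();            // 7: a is still reachable through the closure
```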
That's it. That is the whole thing. So anyway this idea is discovered in scheme. It takes a long time for it to finally get to the mainstream. Anybody happen to know what was the first language to bring this idea to the mainstream was? Anybody? It was JavaScript. JavaScript was the first language to do this. It was followed quickly by Python and Ruby, C# got it, eventually C++ got it, PHP does it in a really _____ way, but they got it. Just last year Java finally got it. Java is having a tough time keeping up with its stupid little brother, but it finally got this. This idea took so long to get to the mainstream that there was an opinion that well, it's not getting to the mainstream because it's not a good idea and the proof that it's not a good idea is that it hasn't been adopted, but we're now adopting it and we're adopting it because it turns out it's the solution to a lot of the problems that we're having now, dealing with distributed systems and asynchronicity and the things that we do now are much easier if you've got functions working for you. And this particular pattern where we have a function that survives another function, a function that returns a function, is an incredibly powerful, important, construction. It's amazingly useful, amazingly powerful, and we got it and we got it first with JavaScript, which is amazing. Closure Examples I need to demonstrate how this works because I can talk all day about a function that returns a function, but if we don't actually mess with it it won't make any sense. So let me start with a very simple example. We're going to make a function called digit_name. Digit_name will take a string or take a number and return a string based on what that number is and it will do that by returning an element from an array of strings. So you can see this is very trivial and unfortunately, there's a problem with the way I wrote this and that is I'm using a global variable called names. 
The problem with that is if there's anything else in my system that also has a global variable called names, we're going to get a conflict and either my function will fail or theirs will fail or maybe we both fail and that's bad, and there is no way you can test for that because you can't anticipate all of the code that your code is ever going to have to run with and this is especially a problem in browsers if you have to run with ads because ads come with some of the worst code you've ever seen because they'll pay you to take their crap, right? And you can threaten them and say, we're not going to take your ads unless you clean it up and they say, okay, we'll give our money to someone else and you say, oh, just kidding. That's how it works and they use lots of global variables and eventually you're going to get the call that your stuff stopped working because of some ad, but it's not the ad's fault, it's your fault. So we need to reduce our dependence on global variables. So here is another way that I could write the function. Here I've got the digit name function and it has a private variable called names because all the variables are private within the function scope so if there is a global variable called names, no conflict. So that's all really good. The bad part is that every time I call digit name, I'm going to have to construct a new array, put 10 things in it, just so I can take one thing out of it. That's hugely expensive. Now an optimizing compiler might observe that names is invariant over invocations and so it might try to optimize that out, but optimizing compilers can take minutes to do their work and on the web we can't take many minutes. We have to start instantly so the compilers are always going to be really fast, get going quick compilers. So they're not going to do this so we need to do it. So here is a third way to write this function. Here I've got digit name.
I've got a function with the private variable containing the names, but this time I'm returning a function and the function I return is going to use the names and I'm calling the outer function immediately, which means that what I'm going to be storing in digit name is not the outer function, but the return value of the outer function, which happens to be the inner function and that inner function because of closure will continue to have access to the names variable, even after the green function has returned. This is the most important idea we're going to talk about over these three days so I want to see everybody, including you at home, nodding like yes, okay, I got that, okay? You didn't get it so let's do it again. So let's start over. This is the original one, okay? We've got the global variable, we've got our simple digit name function. Now remember the thing we just talked about with Tennent and his correspondence principle that we can take any value, wrap it in a function that will return that value, call that function immediately and it's the same thing. So I can do that in this case. I've got, I'm going to assign to a digit name the result of a function which will return a function that I call immediately. So you can see that this and this do exactly the same thing, this one is just a little bit more code, okay, is everybody comfortable with that? Could you show those two? Yes, so we're going to assign a value to the variable. We're going to assign the result of calling a function that will return that value in calling the function immediately. So in this example, is the main benefit of using this that we're preventing polluting the global stack or what are the benefits? Well, that's what we're going to get to, we haven't done that yet. So these two are doing exactly the same thing. So we still have the global variable problem. We haven't fixed anything yet, we're just getting set up. Now you're familiar with the idea that we can have local variables. 
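The three versions of digit_name discussed so far might be written like this. It is a sketch from the spoken description; the suffixed names `digit_name_global` and `digit_name_local` are assumptions so that all three can sit in one file, and the contents of the names array are assumed.

```javascript
// Version 1: depends on a global variable, which can collide with other code.
var names = ['zero', 'one', 'two', 'three', 'four',
        'five', 'six', 'seven', 'eight', 'nine'];
function digit_name_global(n) {
    return names[n];
}

// Version 2: private variable, but the array is rebuilt on every call.
function digit_name_local(n) {
    var names = ['zero', 'one', 'two', 'three', 'four',
            'five', 'six', 'seven', 'eight', 'nine'];
    return names[n];
}

// Version 3: the outer function runs once, immediately. What gets stored in
// digit_name is the inner function, which keeps access to names by closure.
var digit_name = (function () {
    var names = ['zero', 'one', 'two', 'three', 'four',
            'five', 'six', 'seven', 'eight', 'nine'];
    return function (n) {
        return names[n];
    };
}());

var three = digit_name(3);    // 'three'
```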
We declare a variable within a function that's local to the function and we've got global variables, which all the functions can see, but we can have, you know, so that's two worlds, right? You can think of it like the instance world and the static world. It's not exactly like that, but it's kind of like that. We've got two levels of visibility, but we can go much finer because every time we nest a function, we get a new place where we can keep stuff. Another place where we can get variables, and each function will determine the visibility and lifetime of the variables that are defined inside of it and so it's not just local and global, we can now have in-between scopes and they're just scopes. So I'm going to make one change to this one. I'm going to take this global variable and I'm going to just move it down one line to be inside of the outer function, okay? So we've introduced a third place where we can keep stuff, which is now in the scope of the outer function. This is the same result that we had before, okay? So we've got digit name, which will receive a value which is a function which is obtained from a function that we call immediately and that inner function will continue to have access to the names variable of the outer function. So this is closure. This is the big idea, the key idea in functional programming and this is the good stuff, this is the good part. We're going to spend a lot of time tomorrow playing with this model. Any questions before we go on to the next one? Yes? So you solved the global variable problem by moving it in there, but you said that the solution to this was basically you'd move the invocation, move everything onto a heap, right? So now in the heap, are you going to have three instances of the names array? If you call alert digit name 3, then 4, then 5? No, so that's a really good question. 
So you're familiar with the idea of an activation record in a programming language so an activation record will contain the return address, contain all the input parameters and might also contain local variables and stuff like that. So in JavaScript or in a functional language like this, a function object will contain a pointer to the code so it knows what to execute when the function is invoked and the function object will contain a reference to the activation of the function that created it. So in this case this function, or the object that's created by that function expression, will be linked to the context of the green function that created it. So through that activation connection it will have access to the names variable, not to the value of the names variable that it was created with, but the actual names variable itself. It will always have access to that and will continue to have it as long as it survives. If at some point I set digit name to undefined, at that point all of that stuff can get garbage collected, but until then as long as the inner function survives and needs that activation, it will keep it and the garbage collector will not touch it. Okay, thank you. So when we call digit_name we'll be calling this function which will simply go to the array and return a value. So we can call it with 1, 2, 3, and we're just calling it and returning three values. We're not creating any stuff because we're just calling that little function. So let me show you a slightly more complex example. This is the fade function. The fade function is something we might run in a browser, which will pass in the id of a dom element and it will change the colors of that element from yellow to white gradually over a couple seconds and so here's the fade function.
The first thing it will do is it will use document.getElementById to look up the id and obtain a dom element and it will create a variable called level and it will set it to 1. It'll create a function called step and it will then call setTimeout, which will call step in 100 milliseconds and then it returns. So at this point the fade function has finished, it's returned, it's done. Suddenly 100 milliseconds later, the step function runs. So the step function will go to its level variable, the actual level variable of the parent function and turn it into a hex character. It will go to the dom element and change its background color. It will go to the level variable again; if it's less than 15, which it will be because it was 1, we will add 1 to it. We're now changing the value of the variable in the outer function and we'll call setTimeout to do this again and we'll keep doing it until eventually the level gets up to 15 and then we stop and at that point the garbage collector can come in and take it all out. So, but you call fade, you would just say fade 75? You don't have to say var x = fade 75, you just fade 75? Right, well we'd probably give it a name. Will the garbage collect it if we don't give it a name? Well, no, I mean the argument to fade will probably, because we're talking to a dom, or it could be the string 75. Yes, it should be a string. Yes, but yes, so we could do that. So suppose there are two dom elements and we want to fade them both simultaneously, so we call fade a, fade b immediately. Will there be any conflict between them? And the answer is no because every time we call fade we get a new dom variable, we get a new level variable and a new step function and they will not interfere with each other; they're all within their own function scopes and will be completely unique. 
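The fade function walked through above can be sketched as follows. This is a reconstruction from the description rather than the slide itself; the `'#FFFF'` color prefix is an assumption consistent with fading from yellow to white. It relies on the browser's `document.getElementById` and `setTimeout`, so it is meant to run in a page.

```javascript
function fade(id) {
    var dom = document.getElementById(id),
        level = 1;

    function step() {
        var h = level.toString(16);            // one hex character, '1' to 'f'
        dom.style.backgroundColor = '#FFFF' + h + h;
        if (level < 15) {
            level += 1;                        // changes the outer function's variable
            setTimeout(step, 100);             // schedule the next step
        }
    }

    setTimeout(step, 100);                     // fade returns right after this
}

// In a page: fade('a'); fade('b');
// Each call gets its own dom, level, and step, so the two fades do not interfere.
```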
So one of the nice things about this construction is that we get that separation so that we can have lots of complex behavior going on without those behaviors interfering with each other. Each time you call fade, setTimeout passes a new instance of the step function to the dom? Is that correct? And so the dom doesn't garbage collect the step, those step functions until after it's invoked the timeout, after it's called that function? The dom is only involved in that we're calling, we're looking at the style to change the color. Otherwise the dom is not aware that any. I'm sorry, setTimeout is? Yes, it doesn't actually belong to JavaScript, it does belong to the browser, or it also has it, yes. So each time we call fade we will make a new step function and we'll pass that step function many times to setTimeout until we finish. I was trying to get my head around why step doesn't get garbage collected; it's because the browser? Right, because the browser is holding it in its timer queue. Okay, its timer queue. And as long as anything in the system that's rooted is aware of something, then the garbage collector won't touch it. Okay, thanks. Yes? What's the initial value of h when you declare it there? The first time h will be the string 1. And then what does the toString 16 do? ToString 16 means make a hex character so this'll give us the hex digits from 1 to f, and then we add them to here so the final color will be pure white so it'll be FFFFFF. Okay. Pseudoclassical Inheritance Let's get into object-oriented programming, what do you say? So when Brendan Eich designed JavaScript he was very strongly influenced by a paper that he read about the Self project and the Self project was into prototypes, not into classes and he thought, wow, that's pretty neat. So he decided to put that in his language, but he didn't fully understand it or have confidence in it so he came up with something kind of intended to be more classical, thinking that the classical guys would like it better.
So this is the original intended model for how you were supposed to use JavaScript. So we're going to make a gizmo. That's our Gizmo constructor function. We'll pass in an id and we'll create a property of the object that is named id, which will get that value and we want our gizmos to inherit a toString method so the way you do that is you say Gizmo.prototype.toString = function and that function implements the toString method and the Java programmers looked at this and said, what the heck is that? Okay, this is awful looking! You know, what's this .prototype stuff, why are you leaking your stuff and nothing is contained in anything, right? And you want to have your class have some integrity like it's wrapped in something and this has its guts spilled all over the place. This is just really awful, but this is how Brendan thought the language was going to work. So let me diagram this for you so you can see what's actually going on. So this is that code put up on the screen. New Gizmo is how we make the instance so this is the instance of the Gizmo, we see the id property that the constructor put in it. This is the Gizmo function because in this language functions are objects so they can also have properties. Every function is born with a prototype property just in case it's going to be used as a method or as a constructor and the Gizmo.prototype is this object. This is Gizmo.prototype and it contains these fields. This is the toString method that we assigned there and the system also has the object function and that's the constructor of all objects and we've already talked about object.prototype; that's the thing that all object literals inherit from. So we can add a constructor link here. The constructor property contains a reference to the constructor. 
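The Gizmo code being diagrammed can be sketched like this. The body of toString and the id `'g1'` are assumptions; the structure (constructor, prototype method, constructor link) follows the description above.

```javascript
function Gizmo(id) {
    this.id = id;                          // own property set by the constructor
}
Gizmo.prototype.toString = function () {   // inherited by every Gizmo instance
    return 'gizmo ' + this.id;
};

var g = new Gizmo('g1');

var text = g.toString();                                        // 'gizmo g1'
var inheritsFromProto = Object.getPrototypeOf(g) === Gizmo.prototype;   // true
var constructorLink = Gizmo.prototype.constructor === Gizmo;            // true
```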
So Gizmo.prototype.constructor is Gizmo and similarly, Object.prototype.constructor is Object and you can go around that loop as many times as you want, but the important thing is the delegation link, the inheritance link. So our instance inherits from Gizmo.prototype and Gizmo.prototype inherits from Object.prototype. So I ask my instance for an id, we find it here and we return the string. If I ask my instance for its toString method, it'll say no, I don't have one, he's got one so we return this function as though it were part of this object and if we ask for its foo property, he doesn't have one, he doesn't have one, he doesn't have one, it's undefined. Okay? So that's how Brendan thought you were going to use this language and there are some good ideas in here, but it's kind of a mess, but you know, you've got some idea about what JavaScript is actually doing now. So the new prefix, if it had been implemented as a method instead of as an operator, this is what it would do. It would create a new object, which inherits from the function's prototype property and will then call the function, passing in that object and binding it to this and getting a result and that result is probably what's going to get returned. And again, this is kind of a mess, but we haven't inherited much yet, so let's reexamine this thinking about inheritance. So if we replace the original prototype object we can then inherit another object's stuff. So that made no sense. So let's look at an example and try to figure this out. So we're going to make a Hoozit. The Hoozit is something that will inherit from Gizmo, okay? So we've got our Hoozit constructor like before and we're going to replace Hoozit.prototype with an instance of Gizmo and then we'll add an additional method, we're going to add a test method to our new prototype and the Java guys looked at this and said, what the heck is that? I mean, we thought the other one was bad, but holy cow!
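The Hoozit example described above might look roughly like this. The body of the test method and the id `'h1'` are assumptions made for illustration.

```javascript
function Gizmo(id) {
    this.id = id;
}
Gizmo.prototype.toString = function () {
    return 'gizmo ' + this.id;
};

function Hoozit(id) {
    this.id = id;                      // repeated from Gizmo: no reuse here
}
Hoozit.prototype = new Gizmo();        // replace the prototype with a Gizmo instance
Hoozit.prototype.test = function (id) {
    return this.id === id;
};

var h = new Hoozit('h1');
var text = h.toString();               // 'gizmo h1', found two links up the chain
var same = h.test('h1');               // true, found on Hoozit.prototype
var odd = h.constructor === Gizmo;     // true: the constructor link says Gizmo
```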
That's how you write extends, are you serious?! This is just horrible! This is absolutely horrible! So let's diagram it. Let's look at what's actually going on here. So here is our instance of Hoozit. This is the Gizmo function that we had before and the Gizmo.prototype. There's our Hoozit function and Hoozit.prototype, but we replace Hoozit.prototype with this new instance of Gizmo. So when we add the delegation links we've got that inheriting from that, which inherits from that and that will inherit from Object.prototype, but I left that one out. So if we ask our new instance for its id, we get it there; if we ask for its test method, no, there it is and it'll return that method. If we ask for its toString method, go no, no, there it is, and it'll return that one. If we ask for its constructor property we'll go no, no, yes, except it's a Gizmo and go oops, which I think is not that bad because you should never ask anything what it inherits from. You should only be asking what can you do? You know, we shouldn't judge our objects by the character of their contents. So you know, again, that's how it works and one source of confusion is that sometimes we consider this to be the prototype link, but this one is also called .prototype. So having two pointers which are designated as the prototype pointer which are completely different and distinct is certainly a source of confusion as well. But this is how most of the people who are writing JavaScript are using the language. They are doing this and they are miserable. They're hating their lives and they're hating JavaScript and they're angry and bitter and wishing that their favorite language was doing better because, you know, why have they come to this? Writing classes in JavaScript is just awful. So yes, so putting them together, it's just, well there's awful stuff here. Like you know, there is just the horrendous ugliness and the lack of soothing syntax. We're not getting enough code reuse.
For example, both constructors create an id property, but they both have to repeat that. That's not code reuse. So there are lots and lots of JavaScript libraries in the world and most of them recognize that there's something seriously lacking here and so they'll provide some mechanism for sugar-coating this pseudo classical system in order to make it a little bit nicer. For example, they might do something like create a function called new_constructor and it will make constructor functions and I will pass into it the thing that I want to inherit from. So I want to extend an object, I want to extend Gizmo and I'll pass in a constructor function and I'll pass in an object containing the methods that I want the instances to inherit. So you know, it doesn't look like Java, but at least you can recognize the components, right? And so there is something nicer about this and the surprising thing about this function is that that's the entire function. So JavaScript is such an amazingly expressive language, such a powerful language that one little piece of code can radically transform the appearance of the language, which is pretty extraordinary. Not a lot of languages can do that. Do I recommend this approach? No, I don't. Even though this is clearly better than using the language as intended, you're still stuck in this classical paradigm, except you're trying to do classical programming in a language without a type system and that is really, really hard. Classes provide a lot of brittleness and without the constant type checking to keep you honest, it's really easy for things to go bad and in JavaScript things go bad very quickly all the time and that's why people who are trying to write in this model in JavaScript are so angry at the language, they're constantly in a state of rage.
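A helper along the lines of new_constructor could be sketched as below. The transcript does not show the function body, so this is one possible implementation under stated assumptions: the parameter names `extend`, `initializer`, and `methods`, and the fallback to `Object.prototype` when nothing is extended, are all guesses rather than the code from the slide.

```javascript
function new_constructor(extend, initializer, methods) {
    // The new prototype inherits from the prototype of the thing being extended.
    var prototype = Object.create(extend ? extend.prototype : Object.prototype);

    if (methods) {
        Object.keys(methods).forEach(function (key) {
            prototype[key] = methods[key];     // copy the methods to inherit
        });
    }

    function func() {
        var that = Object.create(prototype);
        if (typeof initializer === 'function') {
            initializer.apply(that, arguments);
        }
        return that;
    }

    func.prototype = prototype;
    prototype.constructor = func;
    return func;
}

var Gizmo = new_constructor(Object, function (id) {
    this.id = id;
}, {
    toString: function () {
        return 'gizmo ' + this.id;
    }
});

var Hoozit = new_constructor(Gizmo, function (id) {
    this.id = id;
}, {
    test: function (id) {
        return this.id === id;
    }
});

var h = new Hoozit('h2');
```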
Module Pattern I think we can do better than that and I think the way we do better is by going back to the module pattern, that we can take any bunch of stuff and put it inside of a function and invoke it immediately and it does the same thing, except that in this space we're not creating global variables. This stuff is not leaking out into creation where it's a danger and a hazard. So we can make stuff using this pattern. For example, I want to make a singleton. I want to make one instance of an object containing two methods, firstMethod and secondMethod and I want those two methods to share private variables and private functions. In JavaScript there is no easy way to get privacy, except when you're doing this stuff with function modules, because each function has a function scope, which is a really effective container for keeping stuff, that nothing leaks out of function scope ever. There is no force in the universe that can force a function to leak what is held in its scope and we can take advantage of that and it's a very nice pattern. So what I'm going to store in singleton is not this outer function because I'm going to invoke that function immediately. What I'm going to store will be its return value, which is this object containing two methods and those two methods will close over the private state and those two methods will share that private state. If one changes one of those private variables, the other, the next time he looks at that variable he will see the change. Now before we figured out how to do this stuff with functions in JavaScript, the private variable and private function would've been global variables and there is no privacy there. And one nice thing about this, you know, compared to a classical language, in a classical language if you want to make a singleton, you still have to make a class, which is a lot of work just to make one thing whereas in JavaScript, none of these are very much work.
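The singleton described above can be sketched as follows. The method names firstMethod and secondMethod come from the talk; what the methods actually do, and the names `privateVariable` and `privateFunction`, are assumptions made for illustration.

```javascript
var singleton = (function () {
    var privateVariable = 0;               // shared private state

    function privateFunction(x) {          // private helper
        return x + privateVariable;
    }

    // The returned object is what gets stored in singleton.
    return {
        firstMethod: function (a) {
            privateVariable += a;
            return privateVariable;
        },
        secondMethod: function (c) {
            return privateFunction(c);
        }
    };
}());

singleton.firstMethod(5);                  // privateVariable is now 5
var seen = singleton.secondMethod(1);      // 6: secondMethod sees the change
```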
And there are lots of variations on this pattern. For example, I might have it instead of returning an object, I want to have a global variable, which is going to be the container of everything in my application. It's going to be this one object which is the root of everything and I want to enhance it; I want to add a methodical property to our global object that has these two methods in it which share this private state and we can do that easily. So we don't have to make a class just to make one of and again, there are lots and lots of variations on this pattern; it's a very rich pattern. And again this works because we're invoking a function immediately, but if we don't invoke the function immediately, then we could hold onto that function and make lots of instances. So let's do that. This module pattern is easily transformed into a powerful constructor pattern. So here's the recipe. Step one, we're going to make an object using all of the techniques, any of the techniques available for making objects. We can use an object literal, we can use new, we can use Object.create, we can call another of these power constructors. Any way we can get an object, we get an object. Step 2, we will define some variables and functions. These will become the private members of our object. Step 3, we will augment the object with privileged methods. A privileged method is a publicly available method of an object, which closes over the private stuff. And step 4, we return the object. So it's a pretty simple recipe; it's way too abstract to make sense out of it, right? So we need to go a little deeper. So let's turn it into a template. So step 1, I'm going to make a constructor function and I'm spelling constructor now with a lowercase c instead of an uppercase c because this form of constructor does not care about the new prefix. If you call this function with a new prefix, it'll run a little bit slower, but it will still do exactly the right thing. 
So this way we don't have to worry about people forgetting the new prefix because nothing can go wrong. Then I'm going to recommend passing in a specification object, that the way you normally write constructors is you'll just pass in some number of things separated by commas. Some years ago I designed a constructor that had 10 arguments in it, which was a problem because nobody could remember what order they went in so it was really hard to use and then it turned out that nobody used the third parameter, but we couldn't take it out, right? Because if we took that argument out then all of the code would break and so people had to use the third parameter unnecessarily forever and that was really awful and brittle. So what I would do instead is pass in an object using an object literal. So by just adding two more characters we can now have named parameters. I can give each of the parameters a name so that the call is self-documenting so we can see what's getting passed in. We can have them in any order. If we leave things out we can have nice defaults; if it turns out some parameters become unnecessary, we simply ignore them. We have lots of power in the way that we can manipulate objects so I would take advantage of that. Also, we could get the specification object from a JSON payload. So we might have some store and we want to use the constructor to re-constitute the object, we can do that as well or to cause the creation of objects across the network using specifications that get passed over the wire through JSON. Then I'm going to call another constructor to make an object and I'm going to pass the same specification object to him as well. So it may be that that other constructor will make use of some properties in the specification object that I don't need, but that makes sense to him and if there's anything we both need, we're sharing it. 
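To make the specification object concrete, here is a small sketch. The rectangle constructor and its width and height properties are hypothetical, chosen only to show named parameters, defaults, ignored extras, and a spec that arrives as JSON.

```javascript
// Hypothetical constructor that takes a single specification object.
function rectangle(spec) {
    spec = spec || {};
    // Anything left out gets a default.
    var width = spec.width === undefined ? 1 : spec.width;
    var height = spec.height === undefined ? 1 : spec.height;
    return {
        area: function () {
            return width * height;
        }
    };
}

// Named parameters, in any order.
var a = rectangle({height: 3, width: 4});

// A missing parameter falls back to its default.
var b = rectangle({width: 5});

// The spec can come from a JSON payload; properties we don't need are ignored.
var c = rectangle(JSON.parse('{"width": 2, "height": 6, "color": "red"}'));
```

The two extra characters are the curly braces around the arguments at the call site.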
I used to call this parasitic inheritance where we're going to call another constructor and we're going to take credit for its work. I was inspired by a wasp that lays its eggs in the bodies of live spiders. So the spider does most of the work of making instances, but the wasp gets all the credit. This is going to do something like that. And then I'll take the result of what the other maker did and put it in a variable called that. I can't put it in something called this because this is a reserved word. Then I can create my member variables and I can create as many member variables as I want and because they're going to be held in the function scope, they're not visible outside of the object, they're only visible inside the object. I will create my private methods and my private methods will have access to the specification object, to the member variables and to the other methods as well. I do not use this in here; I don't need this. So the code is going to be a little bit smaller and cleaner and any of these methods that I need to be public, I simply assign them to the outgoing object and the last step, I return the object and that's it. And it's a very flexible pattern; there are lots and lots of variations on this, but this is the basic core idea that we're making use of closure in order to keep private state within the object. Pseudoclassical Inheritance vs. Functional Inheritance So let's compare this to the pseudo classical model. So you remember this. It's just awful with all this stuff hanging around. So this is the same thing now in this functional model and the code gets a lot cleaner. For one thing we don't have the .prototype stuff hanging off the end. Everything is nice and neat and contained within here. So the Gizmo constructor, it's very simple, it's simply returning a new object. Notice the curly braces are on the right so this is going to work perfectly and instead of saying this.id = id, we simply say id: id and we're done. 
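The template walked through above (make an object by calling another maker, add private state, add the methods, return the object) can be sketched roughly like this. The names otherMaker and counter, and the spec properties name, start, and step, are hypothetical; only the variable called that and the overall shape come from the talk.

```javascript
// Hypothetical "other maker": any function that returns an object will do.
function otherMaker(spec) {
    return {
        name: function () {
            return spec.name;
        }
    };
}

function counter(spec) {
    // Step 1: make an object, passing along the same specification object.
    var that = otherMaker(spec);

    // Step 2: private member variable, held in the function scope.
    var count = spec.start || 0;

    // Step 3: methods that close over spec, the members, and each other.
    function increment() {
        count += spec.step || 1;
        return count;
    }
    function describe() {
        return that.name() + ": " + count;
    }

    // Only the methods assigned to the outgoing object become public.
    that.increment = increment;
    that.describe = describe;

    // Step 4: return the object.
    return that;
}
```

Since counter returns an object explicitly and never touches this, calling it as counter(spec) or as new counter(spec) produces the same kind of result.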
Similarly, then our Hoozit is going to make a new Gizmo and we're then going to add to that function our test method, or add to that object our test method and return the object and it's done. So the code got a lot simpler as well. Yes? I think about 10 minutes ago you were saying not to use this or you didn't recommend it or even like it? Yes, that's the next step. Okay. So let's suppose we want to have privacy, which is something that I think is very important to have in object systems so that we may have a goal where the only way to get access to the id is through the methods of the object, that we don't want anybody to be able to access id except through the methods. JavaScript doesn't provide any way of having privacy in the classical model, in the pseudo classical model because everything you attach to an object is visible to anybody who gets access to the object. So there's no privacy there, but the functional model, because we have function scope and closure, we do. So let's say we want to hide the id. So you can only get id through the toString and test methods, the code actually gets simpler. So instead of saying this.id here we simply say id and we're getting it from or we're closing over the variable that got passed in. So it's private, it's completely private and this code got simpler, too. We've got this id, which again we're getting through closure and the code got smaller and simpler. Yes? Aren't the functions copied though then for each instance? Right, so there is a cost to this model that in the prototypal model we're saving memory because we only have one of each method, shared by all of the instances. In this one we're going to have lots more function objects being created and we'll talk more about this on Friday, but it turns out that that's significant only if you've got millions of instances, that memory has become so expansive, you've got gigabytes of RAM in your pocket. 
Making important architectural decisions based on memory conservation is not a good way to go now and I'll contend that if you are worried about millions of instances, maybe this is not the language you should be using, but for most of what we do, the number of objects that you're going to make is going to be relatively small and performance wise this is going to be great. So we're going to spend a little bit more in construction, but we're going to spend a lot less in execution so I think it's a win. The fact that you create an instance of that function for each instance of the object, that's because of closure, right? Or, we can't have closure without that. That's right. Okay. Yes, because of closure we're willing to pay that cost. Right. And it's not much of a cost. If you look at what is in a function object, there is basically an object with two extra pointers in it, one for the pointer to the code and one for the pointer to the activation of the creating function, there's not much else in there. There'll be a link to the prototype object, which is a waste because we're not going to use that and that's it. So there's not much memory in those things, not a lot of work to initialize them. So it's very lightweight. Anybody else? So this also solves another problem that we have in the language, which is related to this binding. So this is the old Hoozit, the original pseudo classical Hoozit. 
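The slides are not in this transcript, so the following is a reconstruction of the two forms being compared: a pseudoclassical Hoozit whose test method depends on this, and a functional hoozit whose id is private and reached only through closure. The method bodies and the sample id "h1" are assumptions.

```javascript
// Pseudoclassical form: methods live on the prototype and depend on this.
function Gizmo(id) {
    this.id = id;
}
Gizmo.prototype.toString = function () {
    return "gizmo " + this.id;
};

function Hoozit(id) {
    this.id = id;
}
Hoozit.prototype = new Gizmo();
Hoozit.prototype.test = function (id) {
    return this.id === id;
};

// Functional form: id is never stored on the object, so it is private.
function gizmo(id) {
    return {
        toString: function () {
            return "gizmo " + id;
        }
    };
}

function hoozit(id) {
    var that = gizmo(id);
    that.test = function (testid) {
        return testid === id;
    };
    return that;
}

// Copy the test method out of each kind of object into a plain variable.
var oldTest = new Hoozit("h1").test;
var newTest = hoozit("h1").test;
```

Called on its own, oldTest("h1") no longer sees the object it came from: depending on whether the code runs in strict mode it either throws or compares against the wrong object, so it does not report a match. newTest("h1") still finds the id through its closure.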
Suppose we want to take the test function out of the object, which we can do because you can copy any object reference out of an object and put it in a separate variable and then call that variable as a function, that call will fail because when we call test as a function, this will be bound not to the object that it came from, but to something else, either to the global object or undefined, either of which will cause this test to fail, but in the new form, because our methods do not have this in them, we can take those functions, pull them out of the object and call them independently and they still work exactly the same way because they don't depend on this so no matter how those methods are called, they will always work correctly. All the fragileness of this is completely avoided. Just so I'm unclear on this, in the middle one there, in your function where you're taking the test id, you have access to id because it was pushed onto the heap and it hasn't been garbage collected yet because it's rooted somewhere? Right. Okay. Yes, in fact it's being rooted by that object or by that method. As long as that method survives, the context it needs will also survive. Okay. The Metamorphosis of AJAX The History of HTML Episode IV, The Metamorphosis of Ajax. All the world's a page and all the men and women merely pointers and clickers. So we're going to start this hour with Sir John Harrington, a poet, courtier, the saucy godson of Queen Elizabeth I of England. He is best remembered today as being an inventor and he invented one of the fundamental inventions which makes civilization at our scale possible. Without his breakthrough we could not live in cities, we could not be living in the density necessary for technological achievement. A great move forward. So this is a picture from his book of his invention. Can anybody identify it? It's a toilet. It is a flush toilet, indeed! 
Civilization couldn't work without them, but the Romans had flush technology but it was lost when the empire collapsed, but Harrington rediscovered it and he built one of these devices for the Queen and installed it in her residence and as usually happens with the really important inventions, she refused to use it. She complained that it made too much noise. She didn't want everybody in the castle knowing when she was going about her royal business. So it never got used, but he published a book about it and eventually other inventors saw it and improved it. Over the next couple hundred years people added other essential elements to it including the float valve and the S trap and the siphon and eventually we got it. But where are the fish though? I mean, I want fish in my toilet. Of course, you do. So he wrote a book and the title of the book was The Metamorphosis of Ajax, which he published in 1596, he called it the Ajax. We'll now flash forward a little bit. This is Jesse James Garrett. He was a designer and consultant in San Francisco. He was on a project where he was consulting with the engineers of a company and they told him that they had found this way of writing applications for the browser, that instead of doing a page replacement each time the user clicks, instead they will send a little bit of data from the browser to the server. The server will then send a small response back. JavaScript in the browser will then display it on the screen so that they can get much better user experiences using this technique and it worked. The problem they were having was they couldn't convince their own management that this was an acceptable thing to do. So they asked Garrett if he could please explain to management that this was a good idea. So Garrett goes off and thinks about this problem and thinks about how do we present this idea and he says the solution came to him when he was in the shower. 
I think it would've been a better story if he'd been on the throne going about his royal business, but he says he was in the shower when it occurred to him that he had to give it a name and the name he gave it was Asynchronous JavaScript and XML or Ajax and he published a memo about this on his company's website in 2005 and it went viral. Almost overnight everybody was talking about Ajax and writing Ajax applications for the web and it completely transformed the way we think about browsers and JavaScript. So to give some historical context for this, the web comes from word processing and word processing historically comes in two very distinct schools. There is the binary proprietary school which started with standalone equipment and then shared logic and eventually personal computers, which was dominated at various times by companies like IBM, Wang, and today Microsoft. Then there's the textual open school in which everything is represented as text all the time. One of the first examples of this is a program called Runoff that was developed by MIT and Runoff was intended for producing text that could be sent to various printers. So here we've got an example of a Runoff file. If a line starts with a period in column 1, that means it's a command. So in this case we'll skip 1 line and then we'll tab 4, we'll offset 4 and so on and then the text between the commands will be filled into the margins and this was a very popular program. It was ported from one mainframe to another, it moved all over the place. Eventually it got to Bell Labs where its name was shortened to roff and other versions of it were created including nroff and troff. Troff was the way the UNIX community did typesetting and made books for a long, long time. Meanwhile, there's an attorney at IBM named Charles Goldfarb who thinks he can do this stuff better. So he begins a project called Generalized Markup Language. 
Now this is an example of a generalized markup language, kind of mid point through its evolution. The markup has gotten a little bit more complicated, now he's got a colon in column 1 followed by a cryptic command and then if the command is followed by a dot, he can then have content on the remainder of the line. So he's starting to mix up the commands and the content and if you're familiar with HTML, some of these commands might be eerily familiar to you and that's not accidental. In fact, HTML comes out of this heritage. All of them except for eol, but you can probably guess what that one means. And in fact, as GML went through its evolution, we got :e and then :: and then finally angled brackets and so you know what happened after that. So if you've ever looked at HTML entities where you've got an ampersand and a cryptic code and a semicolon and you're wondering in what bizarre universe does it make sense to have a piece of random punctuation and then a code and another random piece of punctuation. Where did that come from? This is where it came from. He ran out of angled brackets; there was no other way to make it look nice so he went with the awful-looking thing. So where did the angled brackets come from? The angled brackets were inspired by Scribe. Brian Reid was an amazingly bright guy at Carnegie Mellon who made Scribe, which he called a document compiler and this is the first time someone got the separation between content and formatting right. Scribe was a brilliant piece of work. Scribe could take documents and put them to all sorts of output devices and it had an extremely nice language in that there was one reserved character, which was the at sign. 
You'd say @ and then name of an environment and then you'd have some quoted stuff, which would be affected by that environment and he had six sets of quoting characters so you could pick the one that's guaranteed not to interfere with the content that you're putting inside and you could nest these things as deeply as you wanted and if the nesting got too deep, he had special forms with begin and end so that you didn't care about things accidentally matching. So for long things like chapters and sections you could enclose them like that and Goldfarb looked at that and went, oh angled brackets; I didn't know you could do angled brackets, that's great! So he stole some ideas from this and unfortunately he didn't steal enough ideas. So one of the things that Scribe could do was bibliographies. You know, since it was developed in the University, it has to be able to deal with academic papers and such. So here we've got a description of a tech report and a book and I believe that this is the very first time that a document format was being used as a representation for data because it looks like JSON, right? It's key value pairs separated by commas and it's really reasonable looking. I mean, even things like the details, like year equals 1972. There are no quotes around the year because requiring quotes around numbers would be insane, right? You know, it's just really good. So the GML community got the idea of attributes from this, but they didn't copy enough of the good stuff and it's a shame that Tim Berners-Lee hadn't been more knowledgeable about text processing systems because if he had based the web on Scribe instead of on SGML, our lives today would be so much better, but he didn't do that. So you know, we can see that Runoff inspires GML and Scribe and Scribe helps inspire SGML, but not enough, and eventually that leads to HTML. How JavaScript Saved HTML HTML was not state of the art when it was introduced in the late 20th century. 
It was intended for simple document viewers; it was not intended to be an application delivery platform. We insisted on using it as an application delivery platform because we needed one and that was the best thing available so we used it, but it was not well suited to what we wanted to do. A lot of people looked at it in the beginning and thought it didn't have what it takes and they were right, but we went ahead and did it anyway. So since then the web standards were grown from a naive hypertext system under intense, highly unstable competitive pressure, as Netscape and Microsoft attempted to destroy each other by manipulating web standards and it wasn't designed to do all of this Ajax stuff. Its success is due to a lot of very clever people who found ways to make it work despite its design limitations. Now HTML was a huge improvement over SGML, primarily in that it was much simpler. Any time you take something that's too complex and turn it into something that's simple, you generally are making it better. Unfortunately, they also made it more resilient. One of the rules in SGML was if you don't recognize a tag then total failure and nothing happens and they thought that was too much for the web so instead on the web the rule is if you see a tag and you don't recognize it, ignore it and keep parsing and it works. And that's actually been good for the web because it meant that we could upgrade. We could have forward and backward compatibility because the browsers would be ignoring the differences as they enter and leave the standards. Unfortunately, the dark side of that is that there were and maybe still are incompetent web masters who could not correctly write HTML and the browsers would do heroic stuff to try to make sense out of the stuff that they were writing, which turned into security exploits, which we'll talk about on the third day. 
So in the original formation of HTML, authors had virtually no control over presentation and the thing didn't anticipate applications beyond simple document retrieval, which is a very small part of what we're doing now. It's not internally self-consistent. For example, it provides two ways of writing outlines; one is nested, one is not, it's not consistent. And the thing that we call a web page is not a page, it's a scroll. I'm hoping someday we invent pages because pages are great. Pages were a big step forward in the march of civilization. Maybe someday the web will catch up to that. So the SGML community hated HTML; they thought it was an abomination. They did not like the way it was simplified and even more than that, they did not like the way it was so much more popular than SGML. So eventually they took over W3C and started changing things to be more to their liking. For example, they changed the way p worked. P had originally been a separator; they turned it into a container. They started the thing about semantic markup, which turned out to be a colossal waste of time and they also created the XML fiasco. The idea was that they would create XML which would be the successor of SGML which would be used as the new document format, replacing HTML and be used as the world's data interchange format. It failed at both of those things. HTML it turned out refused to die and it is still the web's document format and the world's data interchange format is JSON. You said you were going to talk about mythical semantic forms? No, I am not going to be talking about semantic markup. Okay. You'll have to find someone else to tell you about that. No, I'm curious as to why. It didn't pay off. We spent a lot of time on it and didn't get anything for it. Yes, the only folks that we run into that like it are people who have disabilities, who want the semantic markup to provide some extra bit of information for them. 
I have a lot of sympathy for people with disabilities who are trying to use the web. I don't think that was the solution, but I can understand being desperate enough that you know, compared to what the web delivers that anyone else might look to be better than what you got. I could talk about CSS, but I just start ranting and you don't need to hear me rant, do you? Anyway, so talking to someone who's finally gotten good at CSS, I mean it's hard; I mean the step up to CSS is really hard and difficult and eventually designers get there and once they've done it the Stockholm thing seems to happen to them. It's kind of like watching a domestic dispute on Cops. You know, CSS isn't bad, you just don't understand it like I do, you know? Yes, anyway, if all there was to the web was HTML and CSS, it would've been replaced by now. The web would be gone and we would be working on something else and this is the proof of that. George Colony, the chairman and CEO of Forrester Research was saying, another software technology will come along and kill off the web just as it killed News, Gopher, et al. and that judgment day will arrive very soon in the next two to three years. So he was predicting in 2000 that by 2003 the web would be dead, replaced by what he called the X Internet, which was the executable internet, it was an application delivery system that he thought that the web was deficient because all it could do was dispense documents and that is not what the world needed. This quote is no longer on the Forrester website. has it though, you can go to and it's right there. So a lot of people heard this message, a lot of people believed it including Microsoft so when Netscape self-destructed, Microsoft said, good, we didn't want the web in the first place. So they disbanded the IE team and put them to work on the X Internet. So they start up the .NET project, they put people on Avalon, they got moving on that stuff. 
So the surprise was that the web didn't die and the reason it didn't die was because of JavaScript. That what Colony was arguing for correctly was that the world needed an application delivery system, the thing he didn't recognize was that the web already was one because JavaScript was in all the browsers. Now the Java community remembers this differently. The Java community is really angry at JavaScript for being in the browser and thinks it's completely unfair that JavaScript was the language in the browser, had Java been in the browser, things would have been different, except that's not the way it was. Java was the first language in the browser and Java applets failed. Java applets were the biggest failure in the history of software, a total, out-there-in-public, huge scale, flat on your face, the biggest failure we've ever seen in software was Java applets. They were supposed to write once and run everywhere and do everything and they didn't. A total failure. On the other hand, the browser is still in existence because JavaScript was there to save it. So a lot of people who used, or everybody hates JavaScript. There are people who don't know JavaScript who hate it because they should know it. The people who do use JavaScript and hate it actually hate the DOM. The DOM is the API that the browser presents to JavaScript and it is one of the worst APIs ever invented. It's just really awful. It was designed also by Brendan Eich. He designed it the same week that he designed JavaScript. It was a busy week. He was handed Danny Goodman's HyperCard bible, which is a book that's 2 or 3 inches thick about HyperCard. He had never used HyperCard, never used a scripting system. A very smart guy, a very quick study, he read the book very quickly and thought he got the sense of it. You know, how do you apply the HyperCard model to the browser and he came up with the DOM and it's just awful, it's just awful. 
There have been many other people over the years who have improved it. For example, one of the most important contributions was Scott Isaacs of Microsoft. He was on the IE 4 team and he looked at the very peculiar model that Netscape had come up with and normalized it in a really good way. In the original Netscape model, not all elements were scriptable and those that were scriptable were only scriptable in a way that matched what was in the HyperCard book and Isaacs said, let's make them all work the same way, which made DOM programming significantly easier, but they never finished and the reason they never finished was because Forrester said it's done so they said, okay, and they were all put on other projects and after IE 6 it was left and that was the end of the story. So they knew it wasn't finished. They never intended to leave it at that state, it's just the web was done so they went off and did other things. So as a result, it's this incomplete API that's at the wrong level of abstraction that's just horrible to use. The Browser So this is a flow chart of a browser. This is how the original web browser worked. You can think of it as a snake. You put URLs in one side and you get pixels out the other side. So you feed it with a URL that goes to the fetch engine, which will then go out on the internet and find the thing and bring it back and put in the cache. Then it gets given to the parse engine, which will parse it and turn it into a tree, the tree being the data structure, which represents the document. The tree is then given to the flow or layout engine, which will figure out all of the components on the page, how big they are and how they are located relative to each other and create a display list and then the display list gets given to the paint engine, which will then turn it all into pixels, which you can send to the screen or to the printer and that's, well all browsers still do essentially that. 
When work started on the Mosaic browser they added the image tag. So this is the hack that they came up with for making the image tag work. When they got to the parse engine and they'd get an image tag, they'd stop, sneak back to the fetch engine and say, go get that picture and they'd wait for it to come back, and then they'd resume parsing. They were on one of the world's fastest university networks so that was working really well for them, but Mosaic got loose, got into the world and at that time we were still on dial-up modems. Anybody remember dial-up modems? Can anybody sing the dial-up modem song? Yes, and so the experience of running Mosaic on those modems was that you would wait until every image got loaded and then everything would display. So it could take a long time to get a web page going. So when those kids then moved to Netscape, their goal was to kill Mosaic. They want to create a monster that kills Mosaic so they make a Mozilla and Mozilla works a little differently. So when the parse engine gets to an image tag, it goes to the fetch engine and says go get it, but they then resume parsing. They put a placeholder in the tree to represent the picture and they continue to parse and if they see another image tag, they tell the fetch engine, get that one too; they put a second placeholder in the tree and they continue parsing and then at the end they will display what they've got so far and so you would see little placeholders in the thing, but you saw text right away and then as the fetch engine delivers the images, they then repeat the flow and the paint to incorporate the new things and so there'd be an animation effect where it'd go boom, boom, boom, boom as the images would appear. So overall this could take longer than it did on Mosaic, but the user experience was much better because they would see things immediately and so it was a hit. 
It was very successful and all browsers today are essentially doing that, it's just our networks are going so fast that we don't see the placeholders anymore. So in Netscape Navigator 2 they added scripting. So there is now an event loop in the browser, which looks something like this. We'll do the layout, we'll do the painting, we'll then wait for an event, which could be something happening with the UI, someone moving a mouse or typing on a keyboard or it could be something coming from the fetch engine, something happened on the network or it could be a timer queue saying some time is elapsed and something happens. Whatever it is, it will cause some script to run and that script will run to completion. It's guaranteed it will not be interrupted by the next event, which is a good thing because it makes the scripts much easier to write. The script will probably mutate the tree in some way, the event will probably cause something to want to be displayed or modified, which means we'll then do another flow, another paint, and then we'll get the next event and so on and that's basically what browsers do. Now this is way over simplified. There are some mutations of the tree which will cause flow to happen immediately so it's not in these very clean phases necessarily, but this is pretty much how they work. This is the way of the browser, they all do that. The Script Tag Brendan invented the script tag because he had the problem, where do you put the scripts that go on the page, and since HTML was a text format, he decided to deliver the programs to the browser in text form, which is unusual because most languages will deliver an executable to the execution site. JavaScript delivers source to the execution site and it was because of this problem. 
So one of the very first things they found when they started writing pages to take advantage of JavaScript was that when you display those pages on Navigator 1 and on Mosaic, the script would show up as text and it was because of the HTML rule, if you see a tag and you don't recognize it, just keep going, and that was hugely embarrassing and there was no way they could go back in time and tell those older browsers not to do that. So they came up with this terrible hack. They wrapped the script in HTML comments and as long as you're not using a minus minus, then the script will be hidden and no one will see it. I still see people doing that so if you see anybody doing that, tell them this hasn't been necessary since 1996 so knock it off. Microsoft added a language attribute because they intended to kill off JScript and replace it with their own VBScript, VBScript being a dialect of Visual Basic. That didn't happen and the irony is the reason it didn't happen is because they did such a good job on JScript. If they'd done their usual thing then JScript would not have gotten critical mass and JavaScript probably would've failed and they had a chance to steal the market, but they didn't. So as it turned out, JavaScript was the only language that ran reliably on all browsers and so that was the language everybody used. The only people I've seen using VBScript are criminals and advertisers. Everybody else is using JavaScript. They added the source attribute, which was a really good idea because it turns out you should not put script tags on to pages. If you put scripts into separate files then they can be minified, they can be g-zipped, they can be cached, all of which are extremely good for page startup performance, which is critical. If you put them on the page, you don't get any of those benefits. So everything should be in separate files. 
Then finally, W3C didn't like the language attribute because they didn't make it up so they replaced it with their own thing and they say that it's required, but it turns out, if you're using a source attribute and you should, it is the server that is authoritative on what the mind type of the acid is, not the tag that requested it. So the browsers are required to ignore it. W3C says it's required. I say leave it out. It's not necessary, it's just a waste of space. So document.write is I hope the worst idea Brendan Eich ever has. The way he thought that interaction with the browser was going to work was that JavaScript would run as the page is being loaded incrementally and that JavaScript as it's running can insert new HTML text into the document as it's being parsed, which was kind of awful. So I don't see it being used very much anymore except by criminals and advertisers. I see advertisers use this a lot. In the early days of web advertising there was a huge amount of fraud and it was all the ad companies were ripping each other off. You know, misreporting image views and other things. 
So they came up with this agreement that the way an ad placement can work is you put some ad script on your page and it will do a document.write of a script tag going to an ad server and that one will then return code, which will do a document.write of a script tag going to another server and they can do several of those things going off to different places and it allows all of these different companies or agencies to separately count the thing so that they can all agree on what actually happened, but one of the consequences of that is they can add huge delays to the rendering of the page because everything has to wait until all those ad redirections get finished and it's also a huge security vulnerability because any of those servers can be sending anything they want and if they send something nasty, bad things happen and there's no way to defend against it so that's all pretty awful. So I don't recommend anybody use it ever again. So unfortunately where you put a script tag on a page can have a huge impact on the page loading time. The correct place to put script tags should be in the head because it's meta, right? Scripts are not content, they're meta so that's what the head is for. Unfortunately, browsers are extremely incompetent at script loading and so if you put scripts in the head then all parsing blocks until the scripts load, compile, and execute, which means that if there are any images in the body that need to get loaded, they don't even start to load until all the scripts have finished and that's not good. So Steve Souders figured out that we need to move all of the scripts to the bottom of the body instead, which is great for performance, but terrible for reliability, but that's what we do now. He also recommended that we minify and g-zip the script files, which is very good advice. 
He also recommended that we reduce the number of script files as much as possible by concatenating them all together and there are really good reasons to recommend that because HTTP is incredibly incompetent at loading script files and so the serial delays in HTTP request transactions really hurt you badly there. So Souders recommended that you concatenate all of your script files together and turn them into one big file and that way you avoid some of that HTTP overhead. Unfortunately, that leads to other problems. It completely breaks caching because every page will have a different combinatorial set of scripts it's going to load so the likelihood that one combined script file is going to be reused again is extremely low. It also introduces bugs because there are certain errors that can happen, which are not correctable if you put lots of files together. For example, you might have one file that's written by an incompetent idiot who depends on semicolon insertion and that gets concatenated onto someone else's file and the place where the semicolon would be inserted is no longer an insertion site because of the concatenation and now the file fails. Document Tree Structure I talked about parsing and making trees. So this is some HTML text. This is a tree that it expands into and there are some interesting things to observe about this. One is first, we've got lowercase here and uppercase here. So when the web first started, the first generation of web masters typed their markup all in uppercase because they wanted it to stand out. They wanted to make it really obvious what was markup and what was content and writing it all in uppercase made that clearer. After a few years of that, they got tired of leaning on the Shift key and they decided, ah, what the heck, and it's all lowercase now. 
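Going back to the concatenation hazard mentioned above, here is one way it can show up. This is a sketch with two made-up "files" held as strings (`fileA`, `fileB`, and the helper `typeOfGreet` are all hypothetical). The first file ends without a semicolon, and the second begins with a parenthesis, so after concatenation the end of the first file is no longer a place where a semicolon gets inserted.

```javascript
// Two hypothetical source files, kept as strings so this can run anywhere.
var fileA = "var greet = function () { return 'hello'; }";   // no final semicolon
var fileB = "(function () { return 'setup'; }())";           // starts with a paren

// Evaluate some source text and report what kind of value greet became.
function typeOfGreet(source) {
    return new Function(source + "\nreturn typeof greet;")();
}

// On its own, fileA defines a function.
var alone = typeOfGreet(fileA);

// Concatenated, the paren at the start of fileB turns the function
// expression in fileA into a call, so greet holds the call's result.
var concatenated = typeOfGreet(fileA + "\n" + fileB);

// With an explicit semicolon, concatenation is harmless.
var repaired = typeOfGreet(fileA + ";\n" + fileB);

console.log(alone, concatenated, repaired);
```

Nothing is reported as an error at the point of concatenation; `greet` just quietly stops being a function, which is part of what makes this kind of bug hard to track down.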
That transition happened just as Brendan was designing the DOM and so the DOM had to pick a convention and the convention was let's go with uppercase because that's what people were using at the time. So everything that's lowercase here gets shifted up to uppercase there and you need to be aware of that because sometimes it doesn't matter, but sometimes it does matter so you need to be aware. Then there are features in the tree that are not in the text. For example, I did not specify a head tag, but there is a head tag in the tree. So it'll add extra bits to the markup. Another place it'll do that is in a table. If you don't specify a tbody it'll stick a tbody in there for you and that can get you into trouble if you're trying to parse around things and you'll find levels of content that you didn't expect to find. Then the other thing here is that this is the ideal Microsoft tree. So Microsoft IE 6 would make a tree like this one. W3C under the influence of the SGML community said no, you need to have more stuff than that. For example, the white space between here and here, which you would ordinarily ignore, that has to go into the tree. Microsoft correctly decided, no, we shouldn't do that because that's just a waste of space, but everybody else did and now Microsoft does too, so everybody does that, but I went with the Microsoft tree because the real tree is too hairy and it's hard to talk about so I'm going with the simpler tree instead. Then there is some implied hierarchy in this. For example, you've got h1, h2, and it kind of looks like the h2 is subordinate to h1, and that the p's are subordinate to that, but they're not. They're all at the same level in the tree. So you need to be aware of that. Then on the JavaScript side, in the browser, there's a global variable called document, which is the root of the tree and document.body is a shortcut for getting to the body node. 
There is also a shortcut for getting to the HTML node, which is called documentElement, which is, you would think it'd be called HTML, right, because it goes to that one and the reason it isn't is because at the time that documentElement was created, W3C was planning to kill HTML and they didn't want to leave that evidence in the DOM. So they went with the longer name so that no one would know, but that plot didn't work. So this is a subset of the same tree; I turned it sideways to demonstrate the next thing. So each node has pointers to other nodes. For example, each node has a firstChild and a lastChild pointer, which point to the children and these are the neglected middle children and they don't get pointers. Then this p node only has one child so both pointers point to the same element. Then there are sibling pointers, nextSibling and previousSibling going back that way. The body will have a sibling relationship with the head tag. These guys are cousins and you'll be glad to know there are no cousin pointers. Then there is the parentNode pointer, which goes up. The body will have a parentNode going to HTML and the HTML will go up to the document root and you might be thinking, wow, that's a lot of pointers. So if I edit this tree do I have to update all of those pointers? That could be kind of hairy. And the answer is, no, that in fact you cannot edit the tree. These are read-only from your perspective. You have to use the DOM API if you want to edit this tree and I'll show you that API in a moment. Now if it turns out all you want to do is traverse the tree, for example, if you want to visit every node in display order, you don't need all these pointers, right? If you understand how recursion works, you only need a binary tree, right, and you can make that with two pointers. So I can use firstChild and nextSibling with a walkTheDOM function. 
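A traversal along those lines can be sketched as follows. The name `walkTheDOM` comes from the talk, but the body here is a reconstruction, not a copy of the slide. `makeNode` is a hypothetical stand-in that builds plain objects with `firstChild` and `nextSibling` links, so the sketch can run outside a browser.

```javascript
// Visit a node, then each of its children in order, using only
// the firstChild and nextSibling pointers.
function walkTheDOM(node, func) {
    func(node);
    node = node.firstChild;
    while (node) {
        walkTheDOM(node, func);
        node = node.nextSibling;
    }
}

// Hypothetical stand-in for real DOM nodes (not a DOM API).
function makeNode(nodeName, children) {
    var node = {nodeName: nodeName, firstChild: null, nextSibling: null};
    (children || []).forEach(function (child, i, all) {
        if (i === 0) {
            node.firstChild = child;
        }
        child.nextSibling = all[i + 1] || null;
    });
    return node;
}

var tree = makeNode("HTML", [
    makeNode("HEAD"),
    makeNode("BODY", [makeNode("H1"), makeNode("P")])
]);

var visited = [];
walkTheDOM(tree, function (node) {
    visited.push(node.nodeName);
});
console.log(visited.join(" "));
```

In a browser the same function could be started from `document.body`, with the callback testing each node and collecting the ones it wants into a list.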
It's a recursive function, which knows how to do that traversal and that allows me to do things like implement getElementsByName by passing in a function which will look for names and compile things into a list. In addition to all of those pointers, each node also has a list of child nodes, childNodes, which is kind of like an array, which will have all of the children in it. Retrieving Nodes You can get a node by retrieving it. The way I prefer to do it is document.getElementById. You can also get things by name, by tag name, you can do CSS queries and things like that. There are quite a lot of ways of getting access to a node now. Once you have access to a node you can manipulate it. So these are the standard properties of an image tag and if you have access to an image tag, you can change any of these properties. The most interesting one to change is the source. If you replace the source with a different URL, then suddenly a different picture will show up and JavaScript provides a very convenient way of doing that by just saying the property name equals whatever value you want it to have and it works. These are the standard properties of an image tag. Every browser will have additional properties that are unique to that browser. I recommend staying away from that stuff because it's a trap. You want to stay with the common part so it'll work everywhere. W3C was not happy about JavaScript surviving in the web after they'd rejected it. They thought that was going to be the end of it, but it wasn't, and there was a lot more sympathy at W3C for Java than for JavaScript and so over the years they've been trying to replace the API with something that would be more friendly for Java than for JavaScript, even though Java has never lived in browsers in this way. So they didn't like the old-school way of doing things so they added this one where you can call getAttribute and setAttribute, this form having the obvious advantage that it's a lot more typing and people like that. 
Another thing you can do when you've got hold of a node is style it. So you've got lots of options for styling. One is you can get at its class name, which is a misnamed thing. It should be .class, but it's not, it's called className even though it can be several class names, it's still className. More interesting is you can get at the style object and change attributes of the style. Microsoft added a really nice thing called currentStyle, because sometimes you want to know what is the current attribute for something? You know, I want to know how big something is or where it is or what color it is currently, something like that and Microsoft provides a very nice way of finding that out, but W3C said, no, that's not the way we want to go; instead, we'd rather have you write document.defaultView.getComputedStyle(node, "").getPropertyValue(stylename), right? This is obviously a Java API, right? I mean, nobody who knows JavaScript would write that. Now this is clearly designed by somebody who did not know anything about JavaScript and I don't know what happens with people in the Java world that they want to write stuff like this, but that's kind of the way things happen. So I need to rant a moment. So CSS and the DOM were both designed about the same time and each project was aware of the other. The guy who designed CSS was aware that someday programming languages were going to manipulate style sheets, he thought that was a certainty. And yet, he chose to use the minus sign as a hyphen, knowing that most of our programming languages want to do subtraction with it and that this creates a syntactic difficulty for all of you. He did that anyway. Meanwhile at Netscape they're working on the DOM and they see what just happened to CSS and they go, okay, what are we going to do about that? What they could've said was, well, it's annoying, but we'll just say you take the brackets and you put the string in and that's that. 
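For reference while following this rant: on the DOM's style object, a hyphenated CSS name such as `background-color` is spelled in camelCase, `backgroundColor`. The two helpers below are a sketch of that mapping. Their names are made up, and they ignore special cases such as `float`, which the DOM spells `cssFloat`.

```javascript
// Convert a CSS property name to the spelling used on the DOM style object.
function cssToDomName(cssName) {
    return cssName.replace(/-([a-z])/g, function (match, letter) {
        return letter.toUpperCase();
    });
}

// Convert a DOM style property name back to its CSS spelling.
function domToCssName(domName) {
    return domName.replace(/[A-Z]/g, function (letter) {
        return "-" + letter.toLowerCase();
    });
}

console.log(cssToDomName("background-color"));
console.log(domToCssName("fontSize"));
```

Needing a conversion at all is the point of the complaint: the same property has one name in a style sheet and another name in script.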
So you'll have to type four extra characters, but it's completely compatible, or they could've said, well, minus signs are a problem so we'll just change them to underbars so you know, minimal change, search and replace can fix. No! They went with the least compatible way of writing these names and it matters because you need to be aware of which space am I in right now? Am I in CSS space or am I in DOM space and it's really easy to get confused about where you are and if you pick the wrong form, it's total failure. They could've made this work like a system, you know? They could've made this work well for you and they chose not to and it's still there and it's never going to get fixed. Anyway, so another thing you can do with nodes is you can make nodes. You can make brand-new nodes. So you can call document.createElement, pass in a tag name like a div or something and you get a new element. That new element is not visible yet and it won't be visible until you paste it into the tree; we'll do that next. You can make text nodes the same way; you pass a string to createTextNode and it'll make you a text node which you can then attach to some other node. Another way you can make nodes is by cloning nodes that you already have. A clone is not a perfect copy. For example, a node might have event handlers on it, but a cloned node will not, but otherwise it's pretty similar and if you pass true to cloneNode, then if the node has children, you'll get clones of the children as well. So if you want to make it visible you have to stick it into the tree. So you can call a node's appendChild with the new node and so the new node becomes the new last child of that node and you can also insert things before with insertBefore and you can replace a node with a different node with replaceChild. And again, this is a Java API, right, because you have to say old twice. Like I could just say node, replace yourself with this. 
No, you have to tell the node's parent, replace that child with that one and the way you find the parent is with the node itself. Why do you have to write old twice? Why do Java people do that? Can anyone? Anyway, you have to remove children and so you can call removeChild, but again you don't tell the node to remove itself, you have to tell the parent to remove it so you have to say it twice. There's a particular hazard here for garbage collection if you're on one of the older IE browsers. So anybody supporting IE 6? 7? 8? Oh, you're breaking my heart, really? Oh, okay, I was hoping I wouldn't have to say this anymore, but I do. So there is a design error in IE that if you attach an event handler to a node, that node will not get garbage collected and the event handler and everything that it's holding onto will not get garbage collected, even if you remove that node from the tree. So the requirement is that you have to remove all of the event handlers from the node before you remove the node from the tree, which is too much to ask of anybody, but that's what you have to do. It was identified in IE 6, it was supposed to be fixed in IE 7 and it wasn't, it was supposed to be fixed in IE 8 and it wasn't. I believe it finally did get fixed in IE 9. So sometimes what you'll want to do is take a fragment of HTML text and incorporate that into the document somehow and W3C in their API provided no good way to do that. Their model was you would write an XML parser in JavaScript and have it parse the text and then you would call all the methods that I just showed you and have it build the structure. Microsoft said, no, that's way too much work so they came up with a property of a node called innerHTML, which is a terrible name, but it works. So what you do is you assign an HTML text fragment to that and it will parse it and turn it into a tree and stick it into the document and do all of that very nicely. 
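The "say old twice" shape of the API looks like this in code. `replaceChild` and `removeChild` are the real DOM methods, and both live on the parent. The wrapper functions `replaceNode` and `removeNode` are hypothetical conveniences added here for illustration.

```javascript
// Replace oldNode with newNode. The node is named twice: once to find
// its parent, and once to say which child to replace.
function replaceNode(oldNode, newNode) {
    return oldNode.parentNode.replaceChild(newNode, oldNode);
}

// Removing follows the same pattern: the parent does the removing.
function removeNode(node) {
    return node.parentNode.removeChild(node);
}
```

In a browser, `replaceNode(document.getElementById("old"), document.createElement("div"))` would swap in a fresh, empty div, assuming an element with the id `old` exists on the page.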
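Because whatever is assigned to innerHTML is parsed as markup, any plain text mixed into the fragment needs care. One common precaution, sketched below, is to escape the text first. `escapeHTML` and `renderGreeting` are hypothetical helpers, not part of the DOM. Escaping of this kind covers text placed in element content or in a quoted attribute value, and should not be taken as a complete defense.

```javascript
// Escape text so that it is treated as content, not markup, when it is
// placed inside an HTML fragment.
function escapeHTML(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}

// Build a small fragment and hand it to innerHTML. The node can be any
// element; userName is assumed to be untrusted text.
function renderGreeting(node, userName) {
    node.innerHTML = "<p>Hello, <b>" + escapeHTML(userName) + "</b></p>";
}
```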
Unfortunately, it's a security hazard any time you have any manipulation of HTML text, particularly by concatenation, but also by templating, there is a good chance that an attacker can exploit that. So you need to be really, really cautious with this stuff. So which way is better? Is it better to build or clone elements and append them to the document or is it better to compile an HTML text and use innerHTML to realize it? Generally, these sorts of questions I want to answer in terms of what gives you cleaner code and better maintenance? You know, what best matches the way you make the application? That you should only favor performance in cases where it really matters and the cases where it really matters, you want to use innerHTML because one of the few things that browsers are really good at is parsing HTML and they can do that really quickly and they can get the whole thing done in one transaction whereas messing with the DOM, every time you touch the DOM you're going to pay a big time penalty. Events In Netscape 2 we added an event model which is still in all the browsers and the browser has an event-driven single-threaded programming model, which is one of the best things about the browser. Every event will be targeted at a particular node and the events cause the invocation of event handler functions. Unfortunately, the composition of events is really sloppy. From my perspective, there is layer confusion. There are some events which are intended for widgets, you know, low-level components and some intended for application, higher-level components and the design of the DOM mixes them all up. So you've got application-level things like click and double-click completely mixed up with widget-level things like mouse-down and mouse-move. Now it turns out the silver lining here is because everything was exposed to everybody. 
It was possible for the libraries to come and clean things up and impose order on it and that worked because everything was available, but it would've been much better had they not even been necessary, if the DOM had just been designed correctly in the first place. There is similar layer confusion in the input events. You've got application-level things like blur, change, and focus, and keypress and widget-level things like keyup and keydown and again, it's all mixed up and because it's all mixed up, I see applications mix it up. You know, you'll see people doing all of those things all the time, everywhere because it's available. So there are three ways of adding an event handler to a node. There's the original Netscape model where you can say node.onclick and assign a function to it and done, and that still works everywhere, and that's pretty nice. Microsoft decided that it should be a method. So you'd say node.attachEvent("onclick", f), okay. W3C said, well, that's not enough typing so let's make it node.addEventListener(type, f, false) and this false thing is kind of weird. Normally if you leave a parameter off the browser will replace it with undefined, which is falsy, but in this case it really has to be false or it's going to fail and in a minute I'll explain how the false is used. So an event handler takes an optional event parameter, which is how the handler knows what happened. Microsoft unfortunately didn't do that and instead they had a global event variable, which was a bad thing. So because of that this is the standard template for writing event handlers where you either use the event that was passed in or you use the global variable and the target is either going to be the event's target property or the event's srcElement. Well, there are two names and it doesn't make any sense. And then after that you can do your normal thing. Now I don't recommend that anybody use any of what I've just shown you. 
You don't want to use any of that because it's painful and it just doesn't work very well. You want to be using libraries instead. So there were two models for how to do event propagation. There's the trickling model that was done at Netscape and there was the bubbling model that was done at Microsoft. So with trickling you would start with the top of the tree and you would descend down the tree until you get to the target node and any node along the way down can capture the event and say, I'm taking care of it, don't pass it down. Microsoft on the other hand started with the bottom and they go up, parent, parent, parent, parent until they get to the top and any node along the way can say, I'm interested in this event, I want to handle it. It turns out, Microsoft got it right and bubbling up is the correct way to do that. So when W3C went to standardize and said, well, we need to come up with a standard for how we do the event propagation, do we do the trickling down or the bubbling up? They could've said, let's do the one that's right. No, why would they do that, right? And so what they did instead was, we'll do it both ways. We'll require that the browsers trickle down and bubble up. So they do both. They'll first do a trickle-down phase and then they'll do a bubble-up phase and that false that was on addEventListener tells you on which phase you're doing it and false will be on the bubble-up phase. So why do we even care? Why do they have this bubbling stuff at all? It was to solve a problem that's probably not a problem anymore. So suppose you're making a catalog page and you've got 100 things and you can drag any of those things from a reservoir and put them onto a page and so that's how you've got this nice authoring package that you're writing and say you've got 100 draggable elements. That means you need to add a set of event handlers to each one of those. You have to add mouse-down, mouse-up, mouse-move, and so on. 
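The three ways of adding a handler, and the template for writing one, can be sketched roughly as follows. This is a reconstruction from the description above rather than a copy of the slides. `addHandler` and `makeHandler` are made-up names, while `addEventListener`, `attachEvent`, `window.event`, `target`, and `srcElement` are the browser features being described.

```javascript
// Add a handler using whichever of the three models the node supports.
function addHandler(node, type, handler) {
    if (node.addEventListener) {
        node.addEventListener(type, handler, false);   // W3C, bubbling phase
    } else if (node.attachEvent) {
        node.attachEvent("on" + type, handler);        // old IE
    } else {
        node["on" + type] = handler;                   // original model
    }
}

// Wrap an action in the usual template: take the event that was passed
// in, or fall back to the global one, then find the target under either
// of its two names.
function makeHandler(action) {
    return function (e) {
        e = e || window.event;                   // old IE used a global
        var target = e.target || e.srcElement;   // W3C name, or old IE name
        return action(e, target);
    };
}
```

In a browser, something like `addHandler(button, "click", makeHandler(function (e, target) { ... }))` would wire the two together, where `button` is whatever node you have in hand.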
So that's hundreds of sets of event handlers. In 1995, that took a long time. Browsers were a lot slower then and JavaScript was a lot slower then and it would take time to do all of that work. So instead what you would do is add one set of event handlers to the common parent of that 100 and that common parent would intercept all of the events and move all the children as needed. I don't think it matters anymore. The systems have gotten so much faster that it's in the noise, but it's still in the model and so you still need to be aware of it. One thing you have to do is cancel bubbling. So at some point some node has taken care of everything and it says, please don't tell my parents, it's under control. And there are two ways to do that and you need to be able to do it both ways. And then sometimes you want to prevent the default action. After everything is done then the browser itself may want to do something. It may want to submit a form or give focus to something, if you don't want that to happen, we would prevent that and there are three ways to prevent that from happening and you have to do all of those as well. Performance So performance is a huge problem in working with the DOM. Every time you touch a node you pay a huge time penalty. Restyling has a big cost, reflowing has a big cost, repainting has a big cost, random things like node lists can have a huge cost. A node list looks like an array except every time you touch it, it can repeat the query that caused it to come into existence, which can be wildly expensive and it turns out in most applications JavaScript has a very small cost. So if you took all the time that's being spent on the browser and say, this much is parsing, this much is marshalling, this much is rendering and so on, JavaScript will be like that much, but I see people going after their code and trying to shrink it down more and more and more when it's all this other stuff that's taking the time, that trying to optimize that. 
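Looking back at the two ways to cancel bubbling and the three ways to prevent the default action mentioned above: the transcript does not spell them out, so the sketch below uses the forms commonly cited for browsers of that era. The function names `stopBubbling` and `cancelDefault` are made up. `cancelBubble`, `stopPropagation`, `returnValue`, and `preventDefault` are the event members involved.

```javascript
// Cancel bubbling, both ways.
function stopBubbling(e) {
    e.cancelBubble = true;        // old IE
    if (e.stopPropagation) {
        e.stopPropagation();      // W3C
    }
}

// Prevent the browser's default action, all three ways.
function cancelDefault(e) {
    e.returnValue = false;        // old IE
    if (e.preventDefault) {
        e.preventDefault();       // W3C
    }
    return false;                 // for handlers assigned as node.onclick
}
```

A handler would typically end with `return cancelDefault(e);` so that the third form, the return value, also takes effect.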
If JavaScript engines were infinitely fast, most web applications would run at about the same speed and no one would notice any difference. So you can't optimize for performance unless you have good tools. Fortunately, we now have some good tools. For example, there's Speed Tracer on Chrome. It records micro events as your application is running on the browser and when it's done, you can then do a post-mortem analysis and it'll show you all the hot spots in the code and what is consuming time and with that information, it is possible to then optimize the application. Microsoft has a similar thing called Performance Dashboard, which I think starts becoming available on IE 11. Optimization without good performance data is a waste of time. So it turns out that a small amount of JavaScript can transform the DOM which is one of the world's most awful APIs into something pleasant and productive. So Ajax libraries are fun and easy to make, which is why there are so many of them and you absolutely should be using one. They provide portability because the browsers are hugely inconsistent and so you need a library that will deal with those inconsistencies for you so you can be working with a much better model. They provide correction of errors that are in the specification of the DOM. They provide a much higher programming model, something where you can be much more productive, and they also provide sets of widgets, which can simplify your development work. So how do you choose? There are so many libraries out there. It would take longer to do a complete evaluation of all the existing libraries than to build a new one from scratch. I don't recommend that anybody build a new one from scratch. So some years ago the Ajaxians suggested just throw a dart because it's just picking one at random; they're all pretty good and they're all much better than talking to the DOM directly so you can't lose, just pick one at random. 
The problem with that is that each one is a trap, that they are all extremely incompatible with each other. In fact, some are incompatible with themselves from one version to another. That .1 of something might be completely incompatible with .2 of something. And once you get into one of these things you are stuck and it's made of tar and everything you write is going to be dependent on this thing and getting loose from it and onto something else is extremely expensive and painful. So how do you decide what to do? Some years ago I predicted that the market could not tolerate having so many of these libraries around, that there is going to have to be a shakeout and I predicted that they would also disappear and there would be maybe two winners. One of them was probably going to be Microsoft because one of the winners is always Microsoft and one would be something else, Dojo, or jQuery or something. That turned out to be completely wrong. The number of libraries has only increased since then and the first library to fail and completely leave the market was Microsoft's Atlas, which was so bad that even they couldn't use it and they switched to jQuery. So how do you decide? It's a really hard problem and I do not have a solution and we find more libraries coming online all the time, more platforms, more ways of doing stuff and I found I can't keep up with this stuff so how do you choose? I have no good advice, except this, for what it's worth. Ask JSLint. Take any candidates that you're considering and run them through JSLint. That's an objective measure of code quality and it'll give you some indication of how well the thing's written maybe that's meaningful to you. 
Division of Labor Finally, one more thing about living in the browser is the problem of division of labor because we're now doing client-server programming and for some people, working in the browser, this is the first time to be doing client-server programming and that means you're writing a distributed application where half of the application is running in one machine and half of the application is running in another machine, how do you divide the work? How do you decide what goes in which machine, and I've seen people make mistakes in every possible dimension. For example, early on in the web, everything was in the server and the browser was treated as a terminal, specifically as an IBM 3270 terminal and you know, you would send information to the server and the server would generate a new view and send it out and that was hugely inefficient and it was recognition of that inefficiency which got the Ajax thing so popular. So when Ajax started, it went the other way. People started putting the entire application in the browser and they were treating the server as a file system and I saw people trying to basically replicate their database in the browser. They'd say, take everything we've got because who knows if we need it or not and they'd send everything over and then they'd complain, well, why does it take so long to send all that data? That turns out not to be a good way to do it either. So what's the right way? It's to seek the middle way, that you want to create a pleasant dialog between specialized peers, you want to minimize the volume of traffic, you want to be sending stuff on a just-in-time basis. You don't need to send the browser everything it might ever need to know; you just need to send it what it needs next and that's usually a much smaller set of data, something you can send very, very quickly, particularly as our networks have gotten so performant and that's the end of the browser. Any questions about that? Yes? 
I'm just curious what your thoughts are on kind of the virtual DOM approach, like diffing and, I don't know if you have any opinions? Oh, yes, I've got opinions, I don't know if they're useful opinions. So generally, the DOM is a horrible model and propagating that horribleness onto the other side of the network, I think just keeps you stuck in that awful model for longer. I'm hoping someday we figure out a way to liberate ourselves from the DOM because it really is dreadful and the thing that's encouraging to me about the libraries is that they provide a way of doing that. You know, for example, jQuery is so much superior as an API for addressing graphical elements on a page to what the DOM provides, I would like to see us get better in that division. Isn't jQuery just a wrapper around the DOM? It is, but it turns out you don't need much wrappage to make the DOM significantly better. The DOM is so horrible I recommend don't use anything that I've shown you in the last hour. Don't use anything! (inaudible), just use jQuery. Find some library, I'm not recommending jQuery necessarily, but some library, every library is better than the DOM; I'm happy to say that, even, I don't know what the worst library out there is, but whatever it is, it can't be half as bad as the DOM is. If jQuery just calls getElement by id, how can using a jQuery CSS selector be better than getElement by id? It's hiding that stuff for you. It's adding a level of abstraction. It's adding enough indirection so you're not thinking in terms of what the DOM does, you're thinking more in terms of what you need to do. Okay. And it's very effective at that. Lots of other libraries are, too. Okay, so other question. I kind of thought that the DOM was the browser; how did the browser work without a DOM, because you said there was HTML and then right around the same time that HTML was going from all caps to lowercase, the DOM was invented, how would have a browser without a DOM? 
Oh, before there was JavaScript there was no need for a DOM, right? Other than there was no modeling of the DOM, but the DOM existed in some. The browser had a tree, but it had no API for the world to get access to that tree. Okay, alright. Yes? I guess listening to just the history of everything and it might be an odd question, but everything has fallen into what it is right now and it seems like things are pretty stably moving in a very succinct way, and then when I listen to your stories of like how this came from this and this came from this and all of these languages came from all these other languages, do you have like an opinion or an answer to why such a minute, like how come everything is just as focused so much? Like, how come there isn't so much of this development of these different ideas? Like just how you're saying like the idea of like having returning a function took years to figure that out, like, why, because that's still going on and like, what's going on with that? Why the slowdown? It has always been slow and we'll talk about this on the third day. We think of ourselves as being the most innovative of industries and maybe we are, but we're also humans and humans tend to be extremely change averse and the people who are the last to recognize the value of a new technology are often the people who would most benefit from it and there are lots of examples of that in software technology and we will look at those. ES5: The New Parts A Better JavaScript ECMAScript 5: The New Parts. So complete implementations of the 5th edition are now in all the best web browsers and also in IE 10. So it's almost everywhere. So again, to review the history of the standards, the 3rd edition was ratified in December of 1999, work on the 4th edition started almost a year before that. The 4th edition was attempting to solve some problems that I think were unnecessary and eventually it got so big and so complicated that it was not completable. 
That project slipped a year per year for 10 years. So it was eventually abandoned and instead we did the 5th edition, which started with the working title, ES 3.1, indicating that it was going to be a much less ambitious attempt at adding goodness to the 3rd edition and it adds two languages or describes two languages, the default language and the strict language and of the two, I recommend not using the default, but using the strict language exclusively and we'll talk later about what's in the strict language. So the goal of the ECMAScript 5 project was to make a better JavaScript. There were a lot of people who wanted us to not make a better JavaScript, but to make a different language and there are certainly good arguments for doing that, but I think standardization is not the correct place to try to do that. So instead of trying to do a big thing, we tried to do a lot of little things. We tried to make the standard conform better to reality. There are some cases where the standard said, implementations must do this and none of the implementations did that. So we said, okay, the standard is obviously wrong; we should make the standard conform to what people actually do. We should try to make the browsers conform better to each other. There were cases where three browser makers would do things one way and Microsoft would do things another way. Microsoft very generously agreed to do what everybody else was doing and that was a very nice thing. In cases where all of the browsers disagreed, where every browser did something different, we took that as license to fix something deep in the standard, that we assume that the web doesn't care if every browser does something different and in that case we can go in and do deeper changes. As a result of all of this work, interoperability is improved. JavaScript was already very good at write once, run everywhere and with ES5 it gets even better. 
So our number one goal for ES5 was don't break the web and that's a really difficult goal to keep because any time you change anything, something will break and we did break some stuff with ES5, but we tried really hard not to. Now you could argue as I did, there is a lot of stuff out there which deserves to break, but we tried not to break even that stuff. We wanted to improve the language for the users of the language. Most of the critics of JavaScript are people who do not use the language and would not use it, even if we did everything they told us to, so we tried not to listen to them too much and not always successfully. We tried instead to listen to you. We paid a lot of attention to third-party security or mashups. We wanted to make it possible for you to add someone else's code to your page and have it not violate your security or your customers' security. We decided not to try to protect stupid people from themselves because that is just too hard. So yes, we in fact added new ways that stupid people can do outrageously stupid things and we'll get to that a little bit later. And we decided to have no new syntax and the reason for that is that at the time that we were doing this work, IE 6 was still the dominant browser and our concern was that if we launched a new version of the language and if IE 6 is still dominant, then if the value of that new language depended on its syntax, then it's going to fail because every new feature means total failure and that's not helping you at all. So we tried to add as much value to the language as we could without changing the syntax of the language, hoping that eventually we would solve the IE problem and then later editions of the language would be able to be freer with syntax. New Syntax So saying that, we couldn't resist adding some new syntax. So we did it not too much and I'll warn you, all of these are fatal errors in browsers before IE 9. So if you're still on IE 8, please ignore the following. 
So we added trailing commas to object literals, which is not useful. I don't recommend you use it. The reason we did that was because we fixed the way trailing commas worked in array literals. It turns out arrays always allowed that dangling comma, but browsers disagreed on what it meant. So in some browsers that array would have a length of 2. On IE browsers it had a length of 3. When you have that kind of disparity, bugs can happen so we changed it so the rule is now that the dangling comma is ignored so everybody will agree that it has a length of 2. We fixed a terrible reserved word policy. JavaScript's original reserved word policy said that you could not have a reserved word in name position in an object literal or in dot position and there is no reason for that so we fixed that. So everything that's yellow now used to be an error; that's all now okay. So that's good. In fact, the reason that JSON requires quotes around its names was because of this design problem in ES3. We added getters and setters. These are accessor properties so that when you go to retrieve a value from an object or go to store a value into an object, instead a function will be called, which will either consume your value or return a value and that allows some new forms of programming. So here we've got a temperature constructor which has a Celsius property and a Fahrenheit property. You can read either one and you can set either one. So you don't know how it works, but it just works and that's a pretty nifty thing. Now it turns out you can do some really stupid stuff with this because if you use this correctly then these methods will be very limited, very restricted in what they do, but there is no actual restriction on them. They can do anything, they can change everything in the world and so simply by reading a property of an object you can cause things to happen over there. I don't recommend you do that because it's stupid. 
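The three syntax additions described above can be tried in a few lines. This is a minimal sketch, not the code from the slide: the names `Temperature`, `celsius`, and `fahrenheit` and the Kelvin-based storage are assumptions made for illustration.

```javascript
// Dangling comma in an array literal: ignored in ES5, so the length is 2.
var list = [1, 2,];
var listLength = list.length; // 2

// Reserved words are allowed in name position and in dot position.
var record = {class: "first"};
var recordClass = record.class; // "first"

// Hypothetical temperature constructor with two accessor properties.
// The value is stored privately in kelvin; neither property holds data.
function Temperature(kelvin) {
    return {
        get celsius() {
            return kelvin - 273.15;
        },
        set celsius(value) {
            kelvin = value + 273.15;
        },
        get fahrenheit() {
            return kelvin * 9 / 5 - 459.67;
        },
        set fahrenheit(value) {
            kelvin = (value + 459.67) * 5 / 9;
        }
    };
}

var t = Temperature(273.15);
t.celsius = 100;            // calls the celsius setter
var boiling = t.fahrenheit; // calls the fahrenheit getter, about 212
```

Because this uses the getter and setter syntax, it is a syntax error in engines older than ES5, which is the caveat given above for IE 8 and earlier.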
I'm sure somebody's going to do that and, I'll bet nobody thought that you could do that, and they'll go and do that. We actually did think about that, just hoping nobody does that. Earlier we talked about the multiline string literals. I still think this was a mistake. Oh, this is great. So it used to be Infinity, NaN, and undefined were not constants, they were global variables, which meant you could change them. You could say today Infinity is 5, let's see what happens. I know that security experts were very worried about someone redefining undefined and what the consequences of that could be. I'm not aware of any exploits that actually happened as a result of that, but it is something that people worried about. So they are now read-only variables, so whew! That's fixed. We fixed parseInt so it no longer defaults to octal when the first character is 0. I still recommend putting in the radix argument anyway, but at least the default behavior is not nearly as bad as it had been. Regular expression literals now will produce a new regular expression object every time they're evaluated so they work the same way that functions do. The reason we did that is that originally the compiler would only create one regular expression object for each regular expression literal and the problem with that is that they contain state. For example, they contain a last, what is it? Last position or last? Anybody remember? The thing that exec uses to remember where the last match happened? No one knows? Anyway, there is this variable that's in there, which if everybody is sharing it, then you can't have multiple execs happening because they'll interfere with each other so we fixed that. Oh, this was awful. So I told you earlier about the Object function. It gets called every time you make an object and the Array function gets called every time you make an array. It turns out you could replace those functions with your functions. 
So you get called every time someone makes an object or an array and that's a huge security violation, right? You don't want some code to be able to take those things over. So the specification will now say as if by the original Object function or Array function, that you can still replace them, but you don't get the security hazard. I'm happy to report that JSON, the world's best loved data interchange format is now built into the language. So we've got a JSON object that provides a parse function and a stringify function. Those names are my fault, completely my fault. I probably should have said encode or decode or something like that, but I didn't. So we're stuck with those so you're welcome. If you're using json2.js then it works exactly the same except json2.js knows to get out of the way if the built-in one is there. So it'll just get faster, which is what you want. New Methods We added a lot of new methods because new methods allow us to provide new functionality without new syntax and so that's where most of the attention went. We added function.bind, which was something that jQuery has been promoting. It allows you to turn a method into a function so that you can use it as a callback. We added string.trim so we now have a method which will remove the extraneous white space from a string. This should've been in the language from day 1, right? It shouldn't have taken this long to get that. Does it trim on both sides? It trims both sides. Okay. We didn't find enough demand to do the individual, the left and right versions and I haven't heard anyone complain about it. If anybody does, maybe it could get into ES7. Yes? You just showed some JS for trim and now is that what it's doing under the hood? It's actually just more JS or is it written in a lower level? It depends on the implementation. It could be written in a lower level. So this is called polyfilling; it's also been called monkey patching. It allows you to insert code into the system. 
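A minimal sketch of the additions just described: the built-in JSON object, bind, and a trim polyfill. The polyfill body is an assumption about what such a patch typically looks like, not necessarily the code shown on the slide, and the `counter` object is a hypothetical example.

```javascript
// JSON is built in: stringify produces text, parse turns it back into data.
var text = JSON.stringify({a: [1, 2]}); // '{"a":[1,2]}'
var second = JSON.parse(text).a[1];     // 2

// bind turns a method into a plain function that remembers its object,
// so it can be handed to something else as a callback.
var counter = {
    count: 0,
    increment: function () {
        this.count += 1;
    }
};
var callback = counter.increment.bind(counter);
callback();
callback();
var count = counter.count; // 2

// Polyfill sketch: install trim only when the engine does not provide one,
// so a native (possibly faster) implementation is never replaced.
if (typeof String.prototype.trim !== "function") {
    String.prototype.trim = function () {
        return this.replace(/^\s+|\s+$/g, "");
    };
}
var cleaned = "  hello  ".trim(); // "hello"
```

The guard around the assignment is the important design choice: the same script runs on old and new engines, and only the old ones pay for the JavaScript implementation.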
So when Brendan originally designed the language, you know, he only had 10 days to do this prototype, he was told that or he expected that if his demo didn't work that Netscape would've done something much stupider. I don't know what it would've been, I can't imagine what that would've been, but that was his race so he only had 10 days and he was pretty confident that he wasn't going to get everything right in 10 days, that there were probably going to need to be some tweaks in the field. So he left almost everything exposed and unlocked. So things which in a properly designed language would be locked down are completely open. So you can add things to object.prototype. For example, in a real language you shouldn't even be able to see object.prototype, you know, all that stuff should be sealed and he did that probably correctly because he figured he didn't get everything right and someday people were going to need to patch the system in real time in order to make it work, which is exactly what happened. So there's all this stuff in the language which is exposed including this. So if you take this code and run it on an IE 6 browser or an IE 7 or IE 8 browser, then they will work the same way as IE 10 with respect to string.trim. Now IE 10 might have its own implementation of this. It might be a native implementation so it might run faster. So this version on IE 6 runs slower. I think that's great! That's exactly how it should be. We added a large new set of array methods and I really like these methods. The design wasn't perfect, but they're good enough that we can use these. So most of these work the same way in that you'll pass, you'll call a method on an array and pass in a function and that function will get called for each element of the array and things will happen based on what method you provide. So the every method will keep doing that behavior until a function returns false and when a function returns false, then it stops doing that. 
So that allows you to loop for a time and then bail out, similar to what break might be doing in a for loop. Filter will take all of the return values; if the return value is true then the original element of the array will be copied into a new array. So that allows you to take a big array and produce a smaller array based on what happens for each element. forEach is the thing that replaces a for loop. It just goes and gives it every element. I like forEach. indexOf will do a search so you can do a search in an array similar to the searching you can do on strings. lastIndexOf searches the array from the back end. Map is maybe the most powerful of them. It will take every return value from that function and store it into a new array so you can do transformations and you can do selections and lots of new things using map. Reduce will take an array and reduce it down to a single value and it'll do that by, you pass it a function and that function gets called for pairs of values from the array. For example, if you pass the add function to reduce, you will get a total function. If you pass a multiply function to reduce, you'll get a product function. Then there is reduceRight, which reduces from the other end of the array. I've not found any reason for that. I don't know why we did that, but it's in there. Some is similar to every, except it uses the opposite Boolean value. We added Date.now. So it used to be if you wanted to get the real time clock you'd have to do what this is doing. You would make a new date object with the default configuration and then extract its time. So usually when you're getting the time, it's because you want to measure something, but this process of getting access to the time adds a lot of latency and so you can now call it directly and it'll give you the current value of the real time clock. 
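The array methods described above can be exercised on one small array. This is an illustrative sketch; the `numbers` array and the `add` and `mul` helpers are made up for the example.

```javascript
var numbers = [1, 2, 3, 4, 5];

function add(a, b) {
    return a + b;
}
function mul(a, b) {
    return a * b;
}

// every stops at the first false; some stops at the first true.
var allPositive = numbers.every(function (n) {
    return n > 0;
}); // true
var anyLarge = numbers.some(function (n) {
    return n > 4;
}); // true

// filter keeps the elements for which the function returns true.
var evens = numbers.filter(function (n) {
    return n % 2 === 0;
}); // [2, 4]

// map stores every return value in a new array.
var squares = numbers.map(function (n) {
    return n * n;
}); // [1, 4, 9, 16, 25]

// reduce combines the elements pairwise into a single value.
var total = numbers.reduce(add, 0);   // 15
var product = numbers.reduce(mul, 1); // 120

// indexOf searches from the front, lastIndexOf from the back.
var first = [1, 2, 1].indexOf(1);    // 0
var last = [1, 2, 1].lastIndexOf(1); // 2

// forEach visits every element in order.
var seen = [];
numbers.forEach(function (n) {
    seen.push(n);
});

// Date.now reads the clock without first creating a Date object.
var start = Date.now();
var elapsed = Date.now() - start; // milliseconds, a small non-negative number
```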
Now browsers still add a lot of latency and operating systems also add a lot of latency so it's not proven that this actually does a better job, but it's not so much JavaScript's fault now so that's gotten better. When the first standard was made, Netscape and Microsoft could not agree on the language in the standard for describing how date parsing works. There is a date constructor that can take a string and it can then attempt to turn that string into a date object, but they couldn't agree on the details for how that worked. So instead, incorrectly they agreed to disagree by leaving it completely unspecified. So there is no language in the standard that gives you a guarantee that if you pass it this form that it will parse, which is kind of bad. Now since then, ISO has come up with an international date format, which is pretty nice, sometimes called ISO dates or ISO strings in which you have a year, then a dash, and a month and a dash and so on. So the date object will now produce strings in that form and the date parsers are now guaranteed to accept that form. So we now have at least one date format, which is guaranteed to be acceptable. So this is the first one they'll try. If that one fails, then they'll fall back on the underspecified proprietary ones, but at least there's one that's guaranteed to work. A question about those two? So the ISO format slide before this, everything was getting UTC times whereas the one before, so if you do toISOString you're going to get everything based on the UTC time; or are you going to get server time or the time that the machine that the code is running on thinks it is? You get a date object, which has a UTC object? Yes, the date object has some fundamental time that's independent of the current time zone or situation. Okay, alright, okay. So you're getting that core value. Okay, thank you. We added Array.isArray to the language. 
We tried to fix typeof and couldn't; there is too much in the web that was dependent on the broken behavior of typeof so instead we added this horrendously ugly thing, but at least we got it into the language and Brendan had actually counted the keystrokes and this is smaller than the typeof form, so even though it looks a lot worse it's actually a little bit smaller. I managed to add Object.keys, which will give you an enumeration of all of the own properties of an object so you don't get the inherited method names, you only get the data members, which are the thing that you're probably most concerned with and it returns it in the form of an array, which means you can then do forEach on it and so that's pretty nice. Did you propose that? Yes, I proposed that. Cool. I use that all the time and then I do like .length just to see like how big this object is and stuff like that. You're welcome. Yes, sweet! And as you were talking about earlier, there is some, was it s-o-m-e? Yes. How is that different than keys? Keys is an array. Some is for looping on an array; it's the opposite of every. Every gives you all the values, some gives you all the keys? Every gives you everything as long as the functions are returning true and some keeps working as long as everything is returning false. Oh! But it's operating on the values, not the keys? Okay. On the return values that are based on the function that is processing the keys. Got it. Yes? Back on the dates there, there was a question about are there any plans for immutable date objects? Immutable date objects? Yes, I don't know what he means by that. We're going to get to freeze, you can freeze any date object, is that what he wants? I don't know, we don't know. I'm just going to ask for clarification, but I'd just go on. Okay, I also managed to get Object.create added. Object.create is the primitive which makes a new object that inherits from another object. 
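A short sketch of the additions covered in this stretch of the talk: ISO date strings, Array.isArray, Object.keys, and Object.create. The objects `base` and `child` are hypothetical names chosen for the example.

```javascript
// ISO dates: toISOString always reports UTC, and the Date constructor is
// guaranteed to accept that form back.
var moment = new Date(Date.UTC(2014, 0, 2, 3, 4, 5));
var iso = moment.toISOString(); // "2014-01-02T03:04:05.000Z"
var sameInstant = new Date(iso).getTime() === moment.getTime(); // true

// typeof cannot tell arrays from other objects; Array.isArray can.
var kind = typeof [];                         // "object"
var isList = Array.isArray([]);               // true
var looksLike = Array.isArray({length: 0});   // false

// Object.create makes a new object that inherits from another object.
var base = {
    greet: function () {
        return "hi";
    }
};
var child = Object.create(base);
child.name = "kid";

// Object.keys returns only the own enumerable names, as an array.
var names = Object.keys(child);        // ["name"]
var size = Object.keys(child).length;  // 1
var inherited = child.greet();         // "hi", found through the prototype
```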
In a prototypal language this should've been in the language from day 1 so instead we had the weird thing with new and .prototype. So this gives us a direct way of doing prototypal inheritance, which is good. Meta Object API Then we added a meta object API to give us more control over the attributes of the properties of objects so we can get much finer control over what's going on. So we now have two types of properties. We have data properties which are the things we've always had where it's just some data that you can store in the object and retrieve and we have accessor properties. I showed you an example earlier with the temperature object where we can have accessors. Actually there have always been accessor properties in the language, but they were never exposed to you so you couldn't make them. So there are things in the DOM like .innerHTML is an accessor property. That's why you can assign something to it and then something happens and so there was a lot of interest in allowing you to write stuff, which is as crappy as the DOM and we succeeded so there's no limit on the crap you can write now. So we've got objects. Objects are composed of properties. Each property is composed of attributes. Every property has four attributes. If you're a data property, your attributes are value, writable, enumerable, and configurable. If you're an accessor property your attributes are enumerable, configurable, get, and set. So value is the actual value of the property. Writable is a Boolean that determines if you can write it. So if writable is false, it's read-only. It turns out the language has always had read-only properties, but it was never exposed to you so you couldn't create them; only the language could create them. So now anybody can. Enumerable means it will show up in a for in loop or will show up in Object.keys. 
If you turn that off it means it's not going to show up in enumerations, which means you can hide it a little bit better, that it won't get dredged up all the time. Configurable means you can delete it or you can change it into an accessor property if it was a data property. Get is a function that'll be called if you try to get the property and set is a function that will be called if you try to set the property. So having that, we've got two versions of a statement. The first statement is the way you could create an object literal in ES3 and the second one does exactly the same thing in ES5 and you might be thinking, thank you very much for that, that's huge! So the thing is we wanted to be able to add this functionality, but we were constrained that we couldn't add syntax so we wanted to make it possible, but we couldn't make it nice, but at least possible is better than impossible, which is where we were. So this means that a library can now construct objects and can have control over what's going on. So an object can designate what it wants to inherit from. It doesn't have to inherit from Object.prototype. It can inherit from anything or nothing. That's never been an option before, now we can do that and it can say this property is not writable, it's not enumerable, it's not configurable so we can allow you to lock those things down and once you set one of those things to false, it can never be turned back to true again. So that gives you some control that if you want to lock your object down, you can now do that. So we added a new method, Object.defineProperty, which allows you to take advantage of this stuff. So this shows an example of creating an accessor property without using new syntax. 
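A sketch of the two forms discussed above, using the standard ES5 methods Object.create and Object.defineProperty. The property names (`foo`, `degrees`, `id`) and the object `angle` are assumptions for illustration; the slides are not reproduced here.

```javascript
// The ES3 object literal and a long-hand ES5 equivalent that spells out
// the prototype and all four attributes of the data property.
var es3 = {foo: "bar"};
var es5 = Object.create(Object.prototype, {
    foo: {
        value: "bar",
        writable: true,
        enumerable: true,
        configurable: true
    }
});

// An accessor property made with method calls only. Because no new syntax
// is involved, the whole thing can sit behind a feature test.
var angle = {};
if (typeof Object.defineProperty === "function") {
    var radians = 0; // private state shared by the getter and setter
    Object.defineProperty(angle, "degrees", {
        get: function () {
            return radians * 180 / Math.PI;
        },
        set: function (value) {
            radians = value * Math.PI / 180;
        },
        enumerable: true,
        configurable: false
    });

    // A locked-down data property: read-only, hidden from enumeration,
    // and not deletable or reconfigurable.
    Object.defineProperty(angle, "id", {
        value: 42,
        writable: false,
        enumerable: false,
        configurable: false
    });
}

angle.degrees = 180;
var half = angle.degrees;          // about 180
var visible = Object.keys(angle);  // ["degrees"], id is not enumerable
var descriptor = Object.getOwnPropertyDescriptor(angle, "id");
```

Object.getOwnPropertyDescriptor returns an object holding the attributes, so `descriptor.writable` is `false` here.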
So this does something similar to what we did before, but it's only using method calls and the advantage of this form is that you can put an if around this and you won't get a syntax error if you're trying to run it on a browser that doesn't have it, whereas with the syntax form, you can't put an if around a syntax error. So if you're trying to run it on an older browser, it will simply fail. So the meta object API contains Object.defineProperty, which I just showed you, Object.defineProperties, which will allow you to define several of these at once, and also Object.getOwnPropertyDescriptor, which will return an object, which describes the attributes of a property. I should point out that this system was clearly designed by committee. You've got things like Object.create and you've got things like getOwnPropertyDescriptor. Committees do stuff like that, right? There's no consistency on how the names work. We added a couple of methods that you'll probably never get to use, getOwnPropertyNames, getPrototypeOf. We added these for the purpose of security libraries that want to run before everything else and lock down everything that Brendan intentionally left unlocked so that the environment is now safe for third-party code to run in your environment and in order to get at everything that needed to be locked down, it needed special access ports drilled into the language so it could get at this stuff. So what those libraries are likely to do is take these, use them, then destroy them so that nobody else can use them. So having all of this stuff, it becomes possible to do things that we couldn't do before. For example, this is the replace_prototype function and it makes a perfect copy of an object, except that it now inherits from a different prototype. That's something that people have been asking for for years and there is no, until now there was no way to accomplish that and now there is. 
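One way a function like replace_prototype could be written with the meta object API is sketched below. It is an assumption about the approach, not a transcription of the slide, and the copy is shallow: property values that are objects are shared with the original.

```javascript
// Build a new object on the given prototype, then copy every own property,
// enumerable or not, together with its attributes.
function replace_prototype(object, prototype) {
    var result = Object.create(prototype);
    Object.getOwnPropertyNames(object).forEach(function (key) {
        Object.defineProperty(
            result,
            key,
            Object.getOwnPropertyDescriptor(object, key)
        );
    });
    return result;
}

// Usage with hypothetical objects.
var original = {a: 1};
Object.defineProperty(original, "hidden", {value: 2, enumerable: false});
var greeter = {
    hello: function () {
        return "hi";
    }
};
var copy = replace_prototype(original, greeter);
var greeting = copy.hello();        // "hi", inherited from greeter
var copiedKeys = Object.keys(copy); // ["a"], hidden stays non-enumerable
```

Copying descriptors rather than assigning values is what preserves the read-only and non-enumerable attributes, and it also carries accessor properties across without calling their getters.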
So the function itself is an ugly-looking function, but you can wrap that in a name and now you can provide that in your library and anybody can now make an object that's got a different prototype. It used to be the case that you could add a new property to any object at any time simply by assigning to it and you can now turn that off. If you call Object.preventExtensions and pass it an object, that object will now refuse to accept new properties. If you attempt to give it a new property, it'll throw an exception instead. And we can go even further than that. We can freeze the object. Freeze prevents extensions and also makes every property read-only and nonconfigurable, which means it is now an immutable object and that has some nice properties. It means that future versions of the language may be able to make some interesting optimizations because they can make assumptions that this object cannot change. That means we can be smarter in the way we generate code for that object. It also means that we can take a frozen object, hand it to a third party and be confident that the third party cannot corrupt or tamper with the object and that's an extremely valuable thing, particularly as you know, we're doing more and more complicated stuff with more parties. Strict Mode We added a strict mode to the language because there are a lot of things in the language that were clearly wrong that we wanted to repair, but they would be breaking changes and so we wanted or we needed to have some kind of opt-in so you'd say, yes, I'm prepared for the breakage this might cause because I want to be using the better language. The difficulty we had in specifying this was how do you say, I want strict mode without introducing new syntax, because we wanted older browsers to simply ignore the fact that we're in strict mode and work the way they always have. So we spent a lot of time trying to figure out how to do that and eventually we came up with a terrible hack. 
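The object-locking calls described earlier in this passage, Object.preventExtensions and Object.freeze, can be tried as follows. This is a minimal sketch with a made-up `account` object. One hedge worth stating: the rejected writes throw a TypeError only in strict code; in non-strict code they are silently ignored, so the attempts are wrapped in try blocks to behave the same either way.

```javascript
var account = {owner: "alice", balance: 10};

// After preventExtensions the object refuses new properties.
Object.preventExtensions(account);
try {
    account.nickname = "al";
} catch (ignore) {
    // strict code lands here; non-strict code just drops the write
}
var extended = account.hasOwnProperty("nickname"); // false
var extensible = Object.isExtensible(account);     // false

// Existing properties are still writable at this point.
account.balance = 20;

// freeze also makes every property read-only and nonconfigurable.
Object.freeze(account);
try {
    account.balance = 0;
} catch (ignore) {
    // same as above
}
var balance = account.balance;         // 20
var frozen = Object.isFrozen(account); // true
```

Note that freeze is shallow: an object stored in a property of a frozen object can still be changed unless it is frozen as well.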
We used the useless expression statement that I was complaining about earlier to specify the pragma. So if the first statement of a function or a file is the string use strict semicolon, that puts that file or that function into strict mode and on older browsers they simply ignore it and it does nothing so that works. We've got the file form, which is good if you're on node. If you're on a browser I recommend you only use the function form. The reason for that is Steve Souders will tell you to concatenate all of your files together and that means that all of your files will have the same strictness as the first file and that could be bad in any number of ways so in browsers we recommend you only use the function form. Strict mode adds a number of new reserved words. So far I'm only aware of let and yield being used in the next edition of the language. The others may or may not be used in other editions. This is the list of things that are provided by strict mode. Unfortunately, it's a boring list so I'm just going to read the boring list. There are no more implied global variables within functions. This was a huge design error in JavaScript so now if you forget to declare a variable in a function, it's going to be a syntax error. It's not going to default to global variables so that's good. This is no longer bound to the global object by the function form. So if you call a method as a function, this will get bound to undefined and not the global object. That turned out to be really important for security. The call and apply methods no longer default to the global object. So it used to be if you called the apply method and passed in null or undefined, meaning you don't want this to get bound to anything, JavaScript will go, oh you can't mean that, you probably mean the global object, and would do that substitution and it doesn't do that anymore. We got rid of the with statement. If you try to assign to something which is not writable you will now throw an exception. 
It used to fail silently, which is really bad for integrity because the code may have made a change assuming that it succeeded in making the change and if the change failed and it isn't notified then it could become inconsistent. We did a similar thing with deleting nonconfigurable properties. We put restrictions on eval. I haven't talked at all this week about eval and I'm not going to. Eval is the most misused feature of the language and I don't recommend using it and the implementation of it in the old language was really quite horrible in that it gave extreme powers to whoever provided a string to it and so we put some limits on eval. The thing I complained about with arguments getting bound strangely to the parameters, that's been fixed. We got rid of arguments.caller and arguments.callee and getting rid of those was surprisingly difficult because arguments.caller had never been in the standard. So we couldn't simply go to a line of the standard and delete it. Instead we had to add caller to the language and then poison it. We got rid of octal literals because we found that they are confusing to humans. Most people in school learned that a leading 0 in front of a number is not significant, but JavaScript said it was and turned you into base 8. So we fixed that and that got us a lot of complaints from the node community. Apparently the node guys are still using octal literals for setting file permissions. I didn't know anybody was still doing that, but they are so ES6 added octal back in, but in a slightly less awful way. And the duplicate names in an object literal or function parameters are now a syntax error. So if you say function foo a comma a, it'll recognize that the second a is an error. We fixed the new operator so forgetting to use the new prefix in strict mode will now throw an exception and not silently clobber global variables so that's good. There are a few things that we know did break because of strict mode. 
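A few of the strict mode changes listed above can be observed at run time with the function form of the pragma. This is a sketch with made-up function names. The changes that are syntax errors (with, octal literals, duplicate parameter names) cannot appear in a runnable example, so they are left out. In current engines the undeclared assignment surfaces as a ReferenceError when the statement runs.

```javascript
// Function form: only this function body is strict.
function strictThis() {
    "use strict";
    return this;
}
var boundTo = strictThis(); // undefined, not the global object

// Assigning to an undeclared name no longer creates a global variable.
function leak() {
    "use strict";
    try {
        undeclared_name_for_this_demo = 1;
        return "assigned";
    } catch (e) {
        return e.name;
    }
}
var leakResult = leak(); // "ReferenceError"

// Writing to a read-only property throws instead of failing silently.
function writeReadOnly() {
    "use strict";
    var locked = Object.freeze({a: 1});
    try {
        locked.a = 2;
        return "written";
    } catch (e) {
        return e.name;
    }
}
var writeResult = writeReadOnly(); // "TypeError"

// Forgetting new: this is undefined, so the constructor throws instead of
// clobbering global variables.
function Point(x) {
    "use strict";
    this.x = x;
}
var forgotNew;
try {
    Point(1);
    forgotNew = "no exception";
} catch (e) {
    forgotNew = e.name;
}
// forgotNew is "TypeError"
```

The first function doubles as a feature test: on an engine without strict mode the string is ignored and `this` is the global object, so `strictThis() === undefined` is true only where strict mode is supported.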
For example, if you call addEventListener in a browser intending to add an event to the window object, that accidentally worked in the old language and the reason it accidentally worked was in the browser, the window object is the global object, it just happened to be the same thing and when you call a function as a function, this gets bound to the global object, which happens to be the window object so it worked. It was never intended to work, it just accidentally worked. So now in strict mode you have to be explicit. If you want to add an event listener to the window, you have to say window dot. There is nothing in the language that will tell you if you are in strict mode or if strict mode is available, but you can write either of these little functions and they will tell you exactly what you need to know. The design of strict mode was informed by JSLint. They're not entirely the same because JSLint is forced to do a static analysis and strict mode can do some things dynamically, but if you're using JSLint and you should, then you should be very happy with strict mode. A lot of the work that we did was motivated by the problems of mashups. A mashup is where you've got code representing two parties that wants to work together in the same page for the benefit of the user without one being able to corrupt the other and that turns out to be a very difficult problem. So we don't have a complete solution to that, but we are on the road to solving that. So the design of the mashup solutions that we put into the language were derived from Google's Caja project and my own ADsafe project and by fixing things like the binding of this and some of the other problems we can now get security solutions which provide all the benefits of both of these which should be a good thing. So any questions about any of this stuff about ES5 or anything about anything today or anything at all? Just in the discussion we're all talking about ES6 and stuff like that. 
We'll get to ES6 on the third day. Yes, you had that in the schedule. Okay, anybody else? When you talked about how pervasive certain constructs are out in the web and figuring out how much you'd break and stuff like that, how did you gather those metrics? It's really hard. Some of it was just I think somebody did something. You know, I heard about a guy who did something, you know, and a lot of it was that. Some of it was mining Google code. Microsoft also had an extensive database of code and we would mine against that. I'm not confident how effective our tools were in all cases because some of the patterns we're looking for are really complex and some code is so badly written that you might not necessarily be able to recognize the patterns, but we tried as much as was possible given the technology available to match stuff against what the web was doing. Fun with Functions Function Challenge 1 I'm going to give you a series of problems. A problem might look like this. Write an identity function that takes an argument and returns that argument, and I'll show an example of how you would call it and what the result of that call would be. I will then give you some time to work on it, probably about 10 minutes or so. Some of you are going to work faster than others, so for those of you who finish quickly, you're going to get frustrated because you're going to have to wait for the time before we go on and some of you are going to be working slower and you're going to get frustrated because you might not have enough time to finish every problem. So my goal today is to frustrate everybody equally. Okay, that's where we're going to be going with this. So you'll take some time, you'll work on the problem, you'll come up with a solution, I will then show you my solution, and in some cases, I may show you multiple solutions and you can compare yours with mine. 
If you like mine better than yours, you'll want to record it because each of these later problems will refer to earlier problems either in pattern or we'll actually be calling the earlier functions, so make sure that you get at least one version of everything that works. Also, I highly recommend that you use exactly the same function names that I do because later we're going to write functions which call these functions, and if you're giving them different names, it'll be really easy to get confused. Okay. Then you'll probably want to test it. If you have a JavaScript engine on your machine and you know how to use it, for example, if you have Node installed and you know how to use the REPL, that's great, you can just plug it in there and it'll work. If you don't, you have at least one web browser and all the web browsers now come with very nice debuggers so you can do that as well. This is one way to use a web browser. You could simply have a form like this or a page containing a script tag. For my convenience, I created a log function so I could log results, it'll just write them to the screen. Then in the box, I've got the function that I wrote and then I called the function sending the result to the log. If you want to do something more sophisticated, you're certainly welcome to do that. This is sort of a minimum that you need to get started. You don't have to test your functions, but you probably want to, right, because otherwise you can't have confidence that they're working correctly. While everybody is still getting set up, we're going to have a quiz, pop quiz, so here we go. Ready? Question number one, we have a function called funky, it takes an argument o, sets o to null, we create a global variable x, which is an empty array, we pass x to funky, what is now the value of x? So who thinks x is null? And who thinks x is the empty array? And who thinks it's undefined? And who thinks it'll throw an exception? Okay. The answer is B, the empty array.
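The slide for this quiz is not part of the transcript, so the following is a minimal sketch reconstructed from the spoken description. The names funky, o, and x come from the talk; the console.log call is an assumption added for illustration.

```javascript
function funky(o) {
    // Assigning to o rebinds only the local parameter.
    // The caller's variable is not affected.
    o = null;
}

var x = [];
funky(x);

console.log(x); // x still refers to the same empty array
```

Running this in Node or a browser console should show that x is unchanged after the call.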
So let's look at what's happening here. So we start off, we have the global variable x that points at the array, we pass the contents of x, which is that reference to funky, which has a bound to o, we replace o with null, and there you are. So funky as written is a completely useless function and this thing about the way variables work is not peculiar to JavaScript, almost all modern languages work this way. Most languages do not allow you to pass a reference to a variable, what they allow you to do is to pass a reference to the contents of the variable. Okay. ALGOL 60, which was a brilliant language, had something in it called call by name where you actually could pass a reference to a variable, that was one of the very few, very bad ideas in ALGOL 60. Ready for another one? Okay. Here is a function swap, it takes two arguments a and b and it swaps them using a temp variable, we have two global variables, x and y, whose values are 1 and 2, we pass x and y to swap. What is now the value of x? So who thinks x is 1? Who thinks x is 2? Okay, I'm not going to fool you this time. The answer is 1 and it's similar to what we did before. So here we've got our two global variables x and y that point to 1 and 2, we pass x and y to swap, then we fiddle them around with swap, but that did not change what x and y are doing. I'm hoping that someday in the future JavaScript gets macros, so instead of saying function swap, I'd say macro swap, and in that case, macros do get called or do get passed with variable names and macro swap would be a useful thing, but as written, function swap is pretty useless. Okay, any questions about the quiz? Quiz is done. It's time to do some real work. Are you ready? So the first one is going to be totally trivial, no tricks, it's just something to allow you to test your environment. 
So write three binary functions (a binary function is a function that takes two arguments), add, sub, and mul, that take two numbers and return their sum, difference, and product. This is as trivial as it sounds, there are no tricks. We're doing this so that you can practice and also so that we'll have functions that the ones we write later will be able to call. So here we have add, sub, and mul. I assume that everybody got this right, yes. Who got this? Yeah. If you didn't, write them down because you're going to need these for later. Okay, ready for the first interesting problem of the day? Here we go. Write a function identityf that takes an argument and returns a function that returns that argument. So we're going to call identityf, we're going to pass it 3, that will return a function. When we call that function, it will return 3. So here is identityf. Identityf takes an argument and returns a function that returns that argument. So who got identityf? Very good. If you didn't get it, write it down because you're definitely going to need it later and don't be discouraged. I'm not expecting everybody to get every problem. What I'm expecting is that everybody gets the last problem. So if you don't see what's going on immediately, just stay with it. Eventually the patterns should start to become clearer and you'll get a sense of where we're going. Any questions so far? We ready for the next one? Okay. Write a function addf that adds from two invocations. So if we pass a 3 to addf, it will return a function. If we pass 4 to that function, it will return 7. So here is addf. Addf takes a first argument, it returns a function that takes a second argument and it returns the result of adding the first and second arguments. So who got addf? Very good. If you didn't get it, write it down. You're going to need it for the next one.
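The solutions shown on the slides are not in the transcript. A minimal sketch that matches the spoken descriptions of add, sub, mul, identityf, and addf might look like this; the parameter names are assumptions.

```javascript
// Three binary functions: each takes two numbers.
function add(first, second) {
    return first + second;
}

function sub(first, second) {
    return first - second;
}

function mul(first, second) {
    return first * second;
}

// Takes an argument and returns a function that returns that argument.
function identityf(x) {
    return function () {
        return x;
    };
}

// Adds from two invocations: addf(3)(4) is 7.
function addf(first) {
    return function (second) {
        return first + second;
    };
}

var three = identityf(3);
console.log(three());    // 3
console.log(addf(3)(4)); // 7
```

In both identityf and addf the inner function keeps access to the outer function's parameter after the outer function has returned, which is the pattern the later problems build on.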
So this problem was suggested to me by Dmitry Baranovskiy, who is a brilliant Ukrainian programmer living in Sydney, Australia, he is the author of Raphaël, a very nice graphics package for the browser. He suggested that this should be a hiring question, and if you got this right, maybe you think so too. So we ready for the next one? Okay, here we go. Write a function liftf that takes a binary function and makes it callable with two invocations. So if we pass the add function that we wrote first thing this morning to liftf, it returns a function that works exactly like the addf function that we just wrote, and to make it even more interesting, we can pass other binary functions like the multiply function and get a similar kind of capability. Let's look at liftf. Liftf takes a binary function, it returns a function that takes a first argument that returns a function that takes a second argument that returns the result of passing the first and second arguments to the binary function. So who got liftf? Very good. If you didn't get it, write it down, you're going to need it for a later one. This is an example of a higher order function. Higher order functions are functions that receive other functions as parameters and return other functions as results so we're starting to move down the rabbit hole into a very interesting way of constructing things. Function Challenge 2 Here is the next one. Write a function curry that takes a binary function and an argument and returns a function that can take a second argument. So here we'll pass to the curry function the add function that we wrote this morning and 3, it will return a function that will add 3 to things. Similarly, we could pass to curry the multiply function and 5 and it will return a function that'll multiply things by 5. We did curry yesterday. So we've come full circle. Falafel balls.
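As a recap of liftf as described above, here is a minimal sketch. It follows the spoken description (a function that returns a function that returns a function); the parameter names and the sample calls are assumptions, and add and mul are repeated so the block runs on its own.

```javascript
function add(first, second) {
    return first + second;
}

function mul(first, second) {
    return first * second;
}

// Takes a binary function and makes it callable with two invocations.
function liftf(binary) {
    return function (first) {
        return function (second) {
            return binary(first, second);
        };
    };
}

var addf = liftf(add);
console.log(addf(3)(4));       // 7
console.log(liftf(mul)(5)(6)); // 30
```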
Yeah, it's not uncommon when I introduce this problem that some people perk up and go curry, this is finally starting to get interesting. In the previous solution, you had a var addf and earlier on we had a function addf and JSLint complains that it's a redefinition. That's right. JSLint is right. Okay. Okay, so here's the curry function. Curry takes a binary function and a first argument and returns a function that takes a second argument and returns the result of calling the binary function with the first and second arguments. So who got curry? Very good. If you didn't get curry, get it now because you're going to need it later. Now another way you could have written curry is you could have used the liftf function that we wrote earlier. Did anyone use liftf? That'd be extra credit if you'd done that. So this process of taking a function with multiple arguments and turning it into multiple functions that take a single argument is called currying and it's named after Haskell Curry, who was a mathematician who did a lot of work with Church's lambda calculus, and by normalizing all functions to take only one argument, it made a lot of operations easier to think about. Some people think it should be called Schönfinkelisation because there was a fellow named Schönfinkel who was doing this stuff earlier, but we're going to call it currying. Now some people would like curry to be able to work with functions of any number of arguments and it's possible to do that with JavaScript, and it is horrible, it's terribly ugly. So if I want to have either of the two functions take any number of things rather than just the one thing that we're concerned with, you have to do that and it's inexcusably awful and it's because the arguments array isn't really an array and so it doesn't work right and so you have to do all these tricks in order to get things to happen. I'm not going to explain what this does because I just don't want to waste time on it, it's awful.
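The curry slides are not in the transcript, so the following is a sketch under stated assumptions. curry follows the spoken description. curryViaLiftf and curryAny are hypothetical names used here for the liftf-based variant and for one common ES5 way of handling any number of arguments, which borrows Array.prototype.slice to copy the array-like arguments object; the slide may have used a different formulation.

```javascript
function add(first, second) {
    return first + second;
}

function mul(first, second) {
    return first * second;
}

function liftf(binary) {
    return function (first) {
        return function (second) {
            return binary(first, second);
        };
    };
}

// Binary curry, as described in the talk.
function curry(binary, first) {
    return function (second) {
        return binary(first, second);
    };
}

// The same thing, letting liftf do the work (hypothetical name).
function curryViaLiftf(binary, first) {
    return liftf(binary)(first);
}

// ES5 version for any number of arguments (hypothetical name).
// arguments is array-like but not a real array, so slice is borrowed
// from Array.prototype to copy it.
function curryAny(func) {
    var slice = Array.prototype.slice;
    var args = slice.call(arguments, 1); // everything after func
    return function () {
        return func.apply(null, args.concat(slice.call(arguments, 0)));
    };
}

console.log(curry(add, 3)(4));         // 7
console.log(curry(mul, 5)(6));         // 30
console.log(curryViaLiftf(add, 3)(4)); // 7
console.log(curryAny(add, 3)(4));      // 7
```

Most of curryAny is bookkeeping around arguments rather than currying itself, which is the ugliness the talk complains about.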
And so, because of this awfulness, all of the functions that we write today will only take a fixed number of arguments, so we'll have unary functions, it'll take one argument, and binary functions will take two arguments. Even though lots of the applications we're going to do would really like it to be variable, I don't want you to have to waste your time on this because you can see there is a whole lot of stuff going on that has nothing to do with curry. Now in ES6, which is starting to find its way into implementations, there is a new syntax with the ellipses operator, the … operator, and if you put the ellipses in a parameter list, what that says is take all of the rest of the arguments, all of the remaining arguments, put them into an array and bind that array to that parameter and that is so much cleaner. And then on the call side in an argument list, if you go … and there is an array there, it says take that array and spread it out so each element of the array will be a separate argument. And so, by doing that, we've got this version of the curry function, which it looks exactly like the first one, except it's got the annotations in it, which says this is where you can have a multiple number of things being passed and both of these do exactly the same thing, which one would you rather be reading, I'm guessing it'd be the second one. So when ES6 becomes everywhere, then I'll change this course so that we'll be writing functions that look like that. That was kind of a question I had for you yesterday, but it's saying now, so like when you were saying the admin from ES3 to like ES5, did that take like a long time for browsers to adopt or like kind of like do you see us moving a little faster now or is it still going to take quite a while to get ES6 adopted fully. 
It's faster, but the big hang up with ES5 was the IE problem and IE is not nearly the problem that it was before and all of the other browsers are updating so the problem isn't propagation and adoption, the problem is now just implementation. So I'm hoping that we'll get all of this stuff out much faster. So anyway, ellipses will be my second most favorite feature in ES6 if it ever gets finished, so that'll be great, but in the meantime, we're only going to be concerned with functions taking a fixed number of arguments. So we ready for the next one? This next problem is going to be a little bit different because you are not going to write any new functions. Instead, you're going to be using functions that we have already written. So without writing any new functions, show three ways to create the inc function. The inc function adds one to a number and returns it. So if you pass 5 to inc, it'll return 6. And so, you're going to call a function, which will create the inc function and you're going to show three different ways to do that using functions you've already written. So here are three ways to do inc. First one is addf of 1. Who got that one? And the next one is liftf of add and 1. Who got that? And curry of add and 1. Who got that? And who got all three? Very good. So this illustrates the first rule of functional programming, which is let the functions do the work. If you've already written a function that does what you need, you don't need to write another one. Function Challenge 3 Write a function twice that takes a binary function and returns a unary function that passes its argument to the binary function twice. So by twice, I mean this, we've got the add function and we're going to add 11 twice, okay, that'll produce 22, so we're going to automate that. We're going to make a twice function, we can pass add to that, and it will create a double function, which does the same thing. 
We could also pass the multiply function to twice, it will produce a square function, which will square things. Now I intentionally misspelled the word double as doubl because in some implementations double is a reserved word and if you spell it correctly, you'll get a syntax error, which is inexcusable since double isn't even used in this language. So I misspelled it, I recommend that you misspell it too for the same reason. So here is twice. Twice takes a binary function and returns a function that takes an argument and it returns the result of calling the binary function with that argument twice. So who got twice? Very good. If you didn't get it, write it down, you're going to need it for another one. Any questions before we go onto the next one? It may be a dumb question, I was trying to figure out if there was a way to define this using curry or liftf and I couldn't think of one. There probably is, but it's probably just as well that you didn't. Alright. So if we did, would we want to change it to that implementation? That's up to you. The most important thing today is that it works. Okay, we ready to move on? Okay, here's the next one. Write reverse, a function that reverses the arguments of a binary function. So we're going to pass to reverse the sub function that we wrote this morning and it returns the bus function, which is subtract backwards. So if we pass 3 and 2 to bus, we'll get -1. Okay, so here is reverse. Reverse takes a binary function and returns a function that takes a first and second argument and returns the result of calling the binary function with the second and first arguments. So who got reverse? Very good. So this is how we'll write it next year, we'll be able to reverse any number of arguments.
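A minimal sketch of twice and reverse as described, plus a guess at the "next year" version of reverse using the ES6 rest and spread syntax. The slides are not in the transcript, so parameter names are assumptions and reverseAny is a hypothetical name.

```javascript
function add(first, second) {
    return first + second;
}

function sub(first, second) {
    return first - second;
}

function mul(first, second) {
    return first * second;
}

// Returns a unary function that passes its argument to binary twice.
function twice(binary) {
    return function (a) {
        return binary(a, a);
    };
}

// Returns a function that calls binary with its two arguments swapped.
function reverse(binary) {
    return function (first, second) {
        return binary(second, first);
    };
}

// ES6 sketch for any number of arguments (hypothetical name).
// ...args in the parameter list gathers the arguments into a real array;
// ... at the call site spreads an array back out into separate arguments.
function reverseAny(func) {
    return function (...args) {
        return func(...args.reverse());
    };
}

var doubl = twice(add);
var square = twice(mul);
var bus = reverse(sub);

console.log(doubl(11));             // 22
console.log(square(11));            // 121
console.log(bus(3, 2));             // -1
console.log(reverseAny(sub)(3, 2)); // -1
```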
So we're starting to get into practical stuff now, so you might imagine you've got two APIs and you need to make them work together, but they were not designed to work together, so their calling sequences are incompatible and you could rewrite one of them to be more like the other, but that's too much work, or you could write a wrapper function around every entry point of one, but that's too much work too. With this approach, we could let functions do that work so we can have functions write the wrappers, which will allow us to use the two APIs interchangeably. Okay, so ready for the next one? Write a function composeu that takes two unary functions and returns a unary function that calls them both. So we're going to take the doubl and the square functions that we wrote earlier and we're going to pass them both to composeu and then we'll pass 5 to the function that it returns, and it will return 100 and it'll do that by taking the 5 and doubling it and then taking that and squaring it. So here is composeu. Composeu takes functions f and g, it returns a function that takes an argument and returns the result of calling g of f of a. So who got composeu? Very good. The tricky thing about this one was that nested function invocations are written inside out, which lexically looks backwards, and so, you just need to get the g before the f, even though it's called later. So this introduces a way of programming, which is kind of like adding UNIX pipes, except at the function level, in that we can take existing functions and kind of string them together and pass values through them and it'll go through this chain of functions until something comes out the other end. So next year when we write this function, we'll allow it to take not two functions, but any number of functions and you can just program a whole series of things. Until then, you could call composeu several times, each time adding a new function to the list, sort of like currying, I guess.
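A minimal sketch of composeu as described above. The names f, g, and a come from the spoken description; doubl and square are written out directly here so the block is self-contained, and the chained call at the end is an illustration of "calling composeu several times".

```javascript
function doubl(a) {
    return a + a;
}

function square(a) {
    return a * a;
}

// f runs first, g runs second, even though g is written first in g(f(a)).
function composeu(f, g) {
    return function (a) {
        return g(f(a));
    };
}

console.log(composeu(doubl, square)(5)); // 100

// Chaining a third stage by composing again: double, square, then double.
var pipeline = composeu(composeu(doubl, square), doubl);
console.log(pipeline(5)); // 200
```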
Okay, any questions about that? Ready for another one? Anybody? Yeah, alright, good, good. Alright. So write a function composeb that takes two binary functions and returns a function that calls them both. So we're going to pass add and mul to composeb and it will return a function, and if we pass it 2, 3, and 7, it'll return 35. So here is composeb. Composeb takes two functions, f and g, returns a function that takes a, b, and c, and returns the result of calling g with f of a and b, and c. So who got composeb? Really good. You guys are doing great. Feeling good? Yeah, okay. Want to do another one? I've got another one, so let's go. Write a limit function that allows a binary function to be called a limited number of times. So we're going to pass the add function to limit and say you can use it one time and that will produce a limited add function. We could then give that to a third party and the third party can call it once and it works just right, but if they call it a second time, all it does is return undefined, it doesn't do anything else. Okay, you could think of it this way: you could pass a wish function to limit and say you only get three wishes, and the wish function could grant any number of wishes, but the function that we hand you will have a limit on it. Does the function you pass have to be a binary function? Let's say yes. In the future, we want it to work with anything, but for today, we'll just say a binary function. Okay. So here is limit. Limit takes a binary function and a count and it returns a function that takes two arguments. If the count is greater than or equal to 1, it decrements the count and returns the result of calling the binary function with the two arguments. Otherwise, it will return undefined. So from this point on, the functions are starting to get a little bit more complicated so it's unlikely that you did the same thing I did. So from this point on, I'm going to ask who got something that works. Okay, very good.
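A minimal sketch of composeb and limit following the spoken descriptions. The slides are not in the transcript; add_ltd is an assumed name for the limited add function.

```javascript
function add(first, second) {
    return first + second;
}

function mul(first, second) {
    return first * second;
}

// Calls f with a and b, then calls g with that result and c.
function composeb(f, g) {
    return function (a, b, c) {
        return g(f(a, b), c);
    };
}

// Allows binary to be called at most count times.
// The inner function closes over count and decrements it on each use.
function limit(binary, count) {
    return function (a, b) {
        if (count >= 1) {
            count -= 1;
            return binary(a, b);
        }
        return undefined;
    };
}

console.log(composeb(add, mul)(2, 3, 7)); // 35

var add_ltd = limit(add, 1);
console.log(add_ltd(3, 4)); // 7
console.log(add_ltd(3, 5)); // undefined
```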
Anyone do something interesting, a different approach? I used a variable, instead of, I had two variables instead of just one. Okay, and where did you put them? Before return function, I set a max, I set i equal to 0 and count I used max, and each time I ran function, it just incremented i by 1. Great, very good. So anybody else? I'm a little confused how the count or how is this being stored? How is the number of times it's being called? But we're using closure, right, so the other function has access to the variables and parameters of the outer function, so it sees the count and is changing that parameter. Do you need a return undefined there? Yeah, so that's a really good question and that's why I underlined this statement because I want to talk about it. So there are two schools of thought on that statement, one is this is completely unnecessary because in JavaScript, if a function falls off the bottom, it returns undefined, so this is just a waste of space, there is no reason to say this. The other school of thought is that part of the contract for this function is that it returns undefined when the limit is reached. And so, by explicitly returning undefined, we're providing self-documenting code. I think both points of view are valid. I have not been able to decide which one is better, so you'll see me flip flopping on this. The thing that is clearly bad would be saying just return semicolon because that doesn't accomplish either thing. It's a waste of space and isn't explicit about what's being returned. Yes. You answered my question right there. Okay. Anybody else? I had something maybe less clever. I had the local variable we'll call it, which I initialized to 0, which was the number of calls made. Okay, and the local variable was stored above. Yeah, above the, before the return. Brilliant. Okay, very good. Function Challenge 4 Write a from function that produces a generator that will produce a series of values. 
So we're going to pass 0 to the from function, it will return a generator, in this case, I'm calling it index. Every time I call index, it will return the next value in the sequence that started with the starting value. What's a generator? A generator is a function that'll make things, so each time you call a generator, you'll get another thing. In this case, we're making a generator that's going to produce a sequence of integers starting at some value. Okay, so here is from. From takes a start value, it returns a function which saves the current value of start as next, adds 1 to start, and then returns next. So who got something that works? Okay, very good. Anybody take a different approach? Yeah. I just, in the inner function, I just did return x += 1. Okay. That work? Sure. Anybody else? That won't give you the starting point you put in. You put in 0 and it starts at 1. Yeah, I think he's right. You won't like this. I did return start++. Yeah, there's always one of those. Anybody else? You want to give start the first time? Yeah, start wants to be, if we say from 0, we need to start with 0. It could be ++1, but you return ++ start. Yeah, but we're not doing that. Yeah, I know you're not doing that. Okay, ready to move on? Okay, here is the next one. Write a to function that takes a generator and an end value and returns a generator that will produce numbers up to that limit. So we'll pass to the to function a generator that we make with our from function, so we'll pass it from of 1 and we'll also pass it a 3, and it will return a generator which will return values up to but not including the limit, so 1 and 2, and from that point on, it will return undefined instead. So on the previous limit, this one does not include the limit? That's not the limited number of times. Right, this is to a value. Upper ceiling, not including the limit. Right. Maybe we should call it almost, instead of to. Well, this is the convention we have in our languages, right?
That's true. When you're taking the substring of something, this is the way we do it. Because it's 0 based index. Yeah. So here is one way to write the to function. To takes a generator and an end value, it returns a function that gets the next value from the generator. If that value is less than the end value, it returns that value, otherwise, it returns undefined. So who got something that works. Very good. Anyone take a different approach or do something amazing? Wonderful, spectacular, anything like that. I don't know how amazing it is, but mine looks very similar to the limit function. It just uses from generator inside of it. So I set a variable, the outer scope, and then add a number to that and if that number is less than or equal to the limit, I return, or greater than the limit, I return undefined, otherwise, I return generator. Okay. Anybody else? Alright, want to do another one? Yes, alright. Let's do another one. Okay, let's write a fromTo function that produces a generator that produces value in a range. So we're going to pass to fromTo 0 and 3 and it will get us the generator that gives us a sequence 0, 1, 2, and then undefined. So here is fromTo. FromTo takes a start value and an end value and it returns a result of calling to, passing it from start and end. So who got something that works? Very good. Who did it the hard way? Yeah, first rule of functional programming, let the functions do the work. Write an element function that takes an array and a generator and returns a generator that will produce elements from the array. So we're going to pass to the element factory an array containing a, b, c, and d, and we'll give it a generator, which does fromTo 1 to 3 and that will give us a generator which will produce b and c and then undefined. So here is element. Element takes an array and a generator, it returns a function that gets the next index from the generator. 
If the index is not undefined, it returns the element of the array at that index. So who got something that works? So I want to talk about the statement that's underlined because if you leave that if out, if we do the return unconditionally, it does the same thing, and the reason it does the same thing is kind of weird. So, excuse me. If we don't have the if there and if index is undefined, then the brackets will turn undefined into the string undefined, it will then look for the member undefined in the array and not find it and return undefined, which is what you return if you can't find something. So it kind of accidentally works if you leave that test out, except in the case where someone creates an undefined property in the array, then they can cause the behavior of this function to change. So you would have to call, like, delete on the array? No, not deleting anything. If we said, what are we calling the array, yeah, if we said array.undefined = 5, then in that case, we'll return 5, instead of undefined, which might not be what you want. And so, while most of the time it works, I'm concerned about the weird cases, which are what actually screw you up in production and in life, and so, I would rather be explicit and look for that case. Anybody else do something different? I just returned the array element. I didn't do the test. I just returned array[gen()]. Yeah. Because it's too much typing. That will almost always work. Modify the element function so that the generator argument is optional. If the generator is not provided, then each of the elements of the array will be provided. So we can call element passing the array a, b, c, and d and we will get a generator, which will return a, b, c, d, and undefined. So here is the revised element. All I did was add the code that's in the yellow box.
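The slide with the yellow box is not in the transcript. The following is a minimal sketch of the revised element, together with from, to, and fromTo as described earlier so the block runs on its own. Variable names are assumptions; the comment marks the part that corresponds to the added code.

```javascript
// Produces start, start + 1, start + 2, ...
function from(start) {
    return function () {
        var next = start;
        start += 1;
        return next;
    };
}

// Passes values through until one reaches end, then returns undefined.
function to(gen, end) {
    return function () {
        var value = gen();
        if (value < end) {
            return value;
        }
        return undefined;
    };
}

function fromTo(start, end) {
    return to(from(start), end);
}

function element(array, gen) {
    // Added for the optional generator: default to every index.
    if (gen === undefined) {
        gen = fromTo(0, array.length);
    }
    return function () {
        var index = gen();
        // Explicit test, so a stray array.undefined property is never read.
        if (index !== undefined) {
            return array[index];
        }
        return undefined;
    };
}

var ele = element(["a", "b", "c", "d"], fromTo(1, 3));
console.log(ele()); // "b"
console.log(ele()); // "c"
console.log(ele()); // undefined

var all = element(["a", "b", "c", "d"]);
console.log(all()); // "a"
```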
If we didn't get passed a generator, then we'll call fromTo and get a new generator, otherwise, everything else is the same. So who got something that works? Did anybody try to do it the hard way? First rule of functional programming, yeah, let the functions do the work. So I want to talk about the condition I underlined. There are three ways I could have written that. That's the first one where I'm looking explicitly, did undefined get passed in, which means did nothing get passed in. Another way I could have written it is typeof gen not equal to function. So if they passed in something, but it wasn't a function, we'll make it a function. So the difference between those two approaches is the second one will tend to always succeed and the first one will tend to fail if something bad was passed in and it depends on the characteristics of your application. Generally, fast failure is what you want because it helps you discover bugs faster, but sometimes you've got code which is really critical and you want to be sure that no matter what happens, this code is going to be working right. The third way you could write that condition would be to do a boolish check, if not gen, then do something, and I don't like that because what we're trying to do in making the condition is divide the whole universe of possibilities into one of two states, it either is or it isn't, and the boolish case splits in a really weird way in that if they pass in a 1, the behavior will be very, very different than if they passed in a 0 and it doesn't make sense to me that 1 and 0 should behave that differently when we're looking for a function. So this is why I think that JavaScript depending on boolish values in conditions was a mistake. I recommend always be more explicit, either figure out do I want the fast failure question or do I want the will seem to always succeed question, but not the convenient one which can straddle both of those in unexpected ways. Yeah.
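To make the difference between the three conditions concrete, here is a small sketch that applies each test to a few inputs. The three function names are hypothetical and exist only for this comparison.

```javascript
// 1. Explicit check: was nothing passed in?
function isMissing(gen) {
    return gen === undefined;
}

// 2. Type check: is it anything other than a function?
function isNotFunction(gen) {
    return typeof gen !== "function";
}

// 3. Boolish check: is it falsy?
function isFalsy(gen) {
    return !gen;
}

var inputs = [undefined, 0, 1, function () {}];

inputs.forEach(function (value) {
    console.log(
        String(value),
        isMissing(value),
        isNotFunction(value),
        isFalsy(value)
    );
});
// undefined: true,  true,  true
// 0:         false, true,  true
// 1:         false, true,  false
// function:  false, false, false
```

Only the boolish check treats 0 and 1 differently, even though neither is a usable generator.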
Why not just gen = from(0)? In that case, that'll work fine too. From will keep going forever, but we run into the undefined thing, so it'll stop, so from by itself is also okay. Anybody else? So you see a lot of minifiers do this and some people would write this code like gen = gen || fromTo(0, array.length), it's a funny syntax, but like I say, you see a lot of minifiers do it, but I don't think it would minify this code that way because you checked for undefined explicitly, as opposed to the boolish way. Right, so I used to recommend using the logical or operator to do something like you're suggesting to supply default values and I've stopped recommending that because it's too hazardous, that if someone passes in a 0, 0 is falsy and so it might get replaced in cases where you don't expect it will and that's problematic. So I now recommend instead be explicit, don't depend on boolish checks. Function Challenge 5 Write a collect function that takes a generator and an array and produces a function that will collect the results in the array. So we're going to make a generator that works like the NSA, it's going to spy on a function and record everything interesting that function returns. Okay, so first we'll set up the empty array where collect is going to put its stuff. Then we'll call collect passing it any generator and that array and it will return a generator which will work exactly like the first generator, except it will record everything that it returns, except for undefined. So we can call that generator a few times and then we can look at what's in the array and it will have captured those values. So here is collect. Collect takes a generator and an array, it returns a function that gets the next value from the generator, if the value is not undefined, it pushes it onto the end of the array, in either case, it will return the value. So who got something that works? So close. Anybody do something interesting?
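A minimal sketch of collect as just described. from, to, and fromTo are repeated so the block is self-contained; the names array and col are assumptions.

```javascript
function from(start) {
    return function () {
        var next = start;
        start += 1;
        return next;
    };
}

function to(gen, end) {
    return function () {
        var value = gen();
        if (value < end) {
            return value;
        }
        return undefined;
    };
}

function fromTo(start, end) {
    return to(from(start), end);
}

// Wraps gen so that every value other than undefined is also recorded.
function collect(gen, array) {
    return function () {
        var value = gen();
        if (value !== undefined) {
            array.push(value);
        }
        return value;
    };
}

var array = [];
var col = collect(fromTo(0, 2), array);

console.log(col()); // 0
console.log(col()); // 1
console.log(col()); // undefined
console.log(array); // [0, 1]
```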
Could we just return array.push of value instead of having the return value outside of it? I don't recall what it is that array.push returns, I think it might be the updated length of the array. Does anyone know? Yeah, it says right here. Yeah, so unfortunately, array.push isn't smart enough to do that. Anybody else? Alright, should we go onto the next one? Write a filter function that takes a generator and a predicate (a predicate is a function that returns a Boolean, true or false) and produces a generator that produces only the values approved by the predicate. So in this case, we're going to be filtering using the third function. The third function will return true only if its argument is divisible by 3. So we'll use that to select multiples of 3 from the sequence given by the generator. So we'll have the fromTo generator going from 0 to 5, and the third function, and it will return 0, 3, and then undefined. So you're asking if I call this sequentially five times, I should only get 0, a 3, or an undefined? Uh-huh. And those aren't the first three results I should get. Those are the results. I call it three times and should get those three results. Right. Okay, let's look at filter. So did everybody figure out they need to use a loop? Yeah, that was a key to this one. I did it recursive. Very good. We'll get to that. So here is one way to write filter. Well first off, who got something that works? Very good. So filter takes a generator and a predicate and returns a function which will stay in a loop calling the generator until it gets a value which is approved by the predicate, or until the generator runs out and returns undefined, in which case, it'll drop out and return the value. And that's an acceptable way to do it, although, in ES6, my most favorite new feature in ES6 is going to be tail recursion or proper tail calls, which would let us write it like this, we're going to have a recursive function that I'm calling recur and it will continue to call itself until the predicate becomes true.
And so, this is where it calls itself and the reason why this needs a new feature in the language is that in ES6, the compiler is required to optimize that call. So instead of doing a call and then a return, it will do a jump back to the recur function, so this function will run in the same time and same memory pressure as the earlier one. So at this point, there will no longer be an advantage to using loops over recursion and that's great, so I'm really looking forward to that. Anybody else do a third approach or any other observation of this. Do you think there should be a way to use that compose function that we wrote earlier calling a function on a function, well we're not really calling a function on a function, but it feels like we are. I guess while is set to doWhile because I'm so used to Python that I forget that there are other fun scripts. Can you flip back to the ES5 version? So he used while there, I actually think that we have too many looping statements. If it were just me, it would be loop and that would be it, now loop bracket and that's everything would happen in that. So you're saying do your qualifications… Yeah, because I find well I'm not using loops much anymore, but when I do, I tend to want to break out of the middle and we've got loop syntax which says we'll break out of the top or we'll break out of the bottom, but that's not usually where I'm going. If I'm using a loop that is that disciplined, I probably don't need to use a loop at all. Anybody else? Okay, should we go onto another one? Sure, why not? Okay, so let's do one more. This time, we're going to concatenate a couple of functions together. So write a concat function that takes two generators and produces a generator that combines their sequences. So we're going to call concat, we're going to pass it a generator which is fromTo 0 to 3 and fromTo 0 to 2 and that will give us a new generator, which will produce the sequence 0, 1, 2, 0, 1, and then undefined. 
So here is concat. It takes two generator functions, it has a variable to remember what the current generator function is, and it returns a function which gets a value from the current generator. If the value is not undefined, it returns it; otherwise, it replaces the current generator with the next generator and returns a value from that. So who got something that works? Very good. Anyone take a different approach? Why not just return generator 2? At the very end, why set the current generator to generator 2 instead of just calling it? Because next year, when we have the … thing, then I want to write it this way, where I can take any number of generators, and I'm going to be using the element function that we wrote earlier to help me step through that array of generators and I'll just load the next one in. So that's why. I didn't mean to show off, but that's why. What is element gens? Can I see the previous slide? I'm sorry, why are you returning gen instead of value and returning the result of calling? Did you say value was, I see value = gen and if that doesn't succeed, you're calling it again. Gen2. I skip to the next generator and call that generator. At the very top, you assign gen1 to that gen variable. Right. So that changes if the if doesn't succeed. So the first time through, it calls gen1, second time through, it calls gen1, third time through, it calls gen1, still passing because it goes 0, 1, 2. Fourth time through, it calls gen1, which returns undefined. It sets gen to gen2 and returns the result of gen2. Oh, we have a, okay. You had two returns, I wasn't expecting two returns, that's why I was mixed up. Then you set value = gen2. Right, I could have done that business inside of the if, I could have reversed it, but yeah, you're right. You think it would have been clearer doing it that way? Well, I'd prefer to have one return rather than two, I guess. I'll think about that. So in your next function, your next big function, what's element? Element is a function we wrote this morning. You remember that.
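For reference while following that exchange, a sketch of the two-generator concat with its two returns, plus a stand-in for element. Both are reconstructed from the spoken descriptions; the exact code on the slides is not in the transcript, and fromTo is an assumed helper from an earlier challenge.

```javascript
// Stand-in for the fromTo generator factory from an earlier challenge.
function fromTo(start, end) {
    return function () {
        if (start < end) {
            var value = start;
            start += 1;
            return value;
        }
        return undefined;
    };
}

// element steps through an array, using an index generator.
// If no generator is given, it defaults to every index in order.
function element(array, gen) {
    if (gen === undefined) {
        gen = fromTo(0, array.length);
    }
    return function () {
        var index = gen();
        if (index !== undefined) {
            return array[index];
        }
    };
}

// concat remembers the current generator in gen and switches to gen2
// the first time gen1 is exhausted.
function concat(gen1, gen2) {
    var gen = gen1;
    return function () {
        var value = gen();
        if (value !== undefined) {
            return value;
        }
        gen = gen2;
        return gen();
    };
}

var con = concat(fromTo(0, 3), fromTo(0, 2));
con();    // 0
con();    // 1
con();    // 2
con();    // 0
con();    // 1
con();    // undefined
```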
It pops off the next, it just pops. It returns the next thing from an array. Oh, not because it pops, it pulls it or whatever. I don't have that one yet. You should, we did it twice. Oh yeah, there it is. Function Challenge 6 Make a function gensymf that makes a function that generates symbols. So we're taking the generator idea and we're going to try to do something practical with it now. So gensymf is a symbol generator or gensymf is a factory that makes symbol generators or things that make serial numbers. So we designate a serial number with a prefix, and so, you can send it any string and that becomes the prefix string, and then we will get a series of strings starting with that symbol. So we're going to make two generators this time from the same factory. We're going to make the G series and the H series, and when we call them, we'll get G1, H1, G2, H2. Okay, let's look at gensymf. Gensymf takes a prefix string, it creates a number, which it's going to use for keeping track of where it is in the sequence, it'll return a function, which will add 1 to that number and return the result of concatenating the number to the prefix. So who got something that works? Good. Anyone do anything substantially different? I attached that prefix to the string, next you want to pass in digit as that first thing. That's a wise precaution. Anybody else? I used the from function. Used the from function, that's great too. So this is an example, we've seen a number of these where we've got a factory function, which then makes something which will do some work, usually a generator, but it could be, there are lots of different kinds of functions, but they're both functions, it's just one is nested in the other, and in fact, if we nest further, if we put another function outside of this, we could make a factory, factory and you could wrap that with a factory, factory, factory. So just for fun, let's look at what this would look like if it were a factory, factory. 
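A sketch of gensymf and of a factory factory built on the same pattern. The parameter names unary and seed, the inc helper, and the variable name gensymf2 are assumptions chosen for this example.

```javascript
// gensymf: a factory that makes symbol generators for a given prefix.
function gensymf(prefix) {
    var number = 0;
    return function () {
        number += 1;
        return prefix + number;
    };
}

// gensymff: a factory factory. It takes the function that advances the
// counter and the seed value, and returns something that works like gensymf.
function gensymff(unary, seed) {
    return function (prefix) {
        var number = seed;    // one counter per generator
        return function () {
            number = unary(number);
            return prefix + number;
        };
    };
}

function inc(x) {
    return x + 1;
}

var geng = gensymf("G");
var genh = gensymf("H");
geng();    // "G1"
genh();    // "H1"
geng();    // "G2"
genh();    // "H2"

var gensymf2 = gensymff(inc, 0);
var geni = gensymf2("I");
geni();    // "I1"
geni();    // "I2"
```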
So gensymff is the factory, factory and we're going to pass the increment function and the initial seed value to gensymff and it produces a function that works exactly like gensymf and it'll make those sequences, right, and we've done things like this a couple of times already. So I'm just going to show you the thing because we've already done this one. So it looks very similar to patterns we've seen before where the factory, factory is supplying values that go into the generator and so we can automate the making of factories. Now the interesting thing about this one is that statement there where we're creating the number variable, which is going to hold the value that is being used to generate the sequence members. So if we were to move that up one line so that it's not in the factory anymore, but it's in the factory, factory, that would change the visibility of that variable so it would be shared by all the generators. So instead of generating G1, H1, G2, H2, we would generate G1, H2, G3, H4 and that's a really interesting change in behavior from just moving one variable declaration from one place to another. So we've been dealing with closure and we saw that we can have things that are global and things that are local and things that are sort of in between, but there can be more of those in-betweens and if we get into nesting things in useful ways, we have tremendous control over the visibility and the lifetime of the variables and can do interesting things to affect their behavior. How about that? Any questions about that? Okay, ready for another one or do you want to take a break? Let's do another one, sure why not. Okay, so everybody remembers Fibonacci, right, you did Fibonacci numbers in school.
So Fibonacci was an important mathematician, he discovered a lot of good stuff, but the only thing we seem to remember is the Fibonacci sequence, and there are an infinite number of Fibonacci sequences, but mostly, we only remember the famous one, which started with 0 and 1 and that's what we're going to be doing now. So we're going to make a factory, which will make a Fibonacci generator and you will seed the factory with the first two numbers in the sequence. So everybody remember how Fibonacci numbers work? Nope. Well let's review. So the Fibonacci sequence will be a sequence of integers. You specify the first two integers in the sequence. The third number will be the sum of the first two. The fourth number will be the sum of the previous two and so on. So the first numbers we get are 0 and 1 because those are the ones we provide, but then the next one will be 1 because that's the sum of 0 and 1. The next one will be 2 because it's the sum of 1 and 1, the next will be 3 because it's the sum of 1 and 2, the next will be 5 because it's the sum of 2 and 3, and the next in the series will be 8. Exactly. So this was a tricky one, wasn't it? I mean the Fibonacci sequence itself is totally trivial, right, it's three simple statements, but getting the first two numbers to come out, that was the trick, right. So first off, who got something that works? Congratulations. This one was hard, right. Let's look at a number of approaches that we could take. So here is one. There is the Fibonacci function there in the box, so that's it. And then we've got an if statement around it, or a switch statement, which asks where are we in the sequence. If we're at the beginning of the sequence, put out the first number, otherwise, put out the second number, otherwise, use the Fibonacci function and do that. And this works, this absolutely works.
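A sketch of that first approach, reconstructed from the description: a position variable i selects between the two seed values and the three-statement Fibonacci step. The name fibonaccif is an assumption, since the transcript does not name the factory.

```javascript
// fibonaccif (assumed name): makes a Fibonacci generator seeded with a and b.
function fibonaccif(a, b) {
    var i = 0;    // where we are in the sequence
    return function () {
        var next;
        switch (i) {
        case 0:
            i = 1;
            return a;
        case 1:
            i = 2;
            return b;
        default:
            // The Fibonacci step itself: three simple statements.
            next = a + b;
            a = b;
            b = next;
            return next;
        }
    };
}

var fib = fibonaccif(0, 1);
fib();    // 0
fib();    // 1
fib();    // 1
fib();    // 2
fib();    // 3
fib();    // 5
```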
Who took this approach or something like it, maybe used an if instead, but basically yeah, you've got a variable which is telling you where you are in the sequence. This is a completely, yeah. Except for case 1, you're inputting a and b, a and b don't necessarily equal 0 and 1. If a is something other than 0 for case 1, you need to return a plus b, don't you? No. I think I don't know the rules. He's switching on i. I've got i which is telling me where I am in the sequence, which is just how many numbers I have looked at, so if I'm looking at the first number, I output a, if I'm looking at the second number where i is 1, I output b, otherwise, I output a + b. Oh, so if you want the Fibonacci, if your starting elements are 5 and 7, first number is 5, second is 7, third number is 12, oh, okay. Yeah, that's how they work. Yeah, I was thinking that you added the first, first you took the second, then you added the first two. It's not until you get to the third, then you add. Right, because they tell you… Got it. Is i incremented? I'm not getting that, I guess. You need two cases. Yeah, well instead of adding one, I set it to one because I know it is 0. And beyond that, I don't need to increment it. Once I get past the first two cases, I don't care what i is anymore. Oh, okay. Okay, so completely acceptable, this is a reasonable way to do it. I would argue that a reasonably intelligent person could figure out what this code is doing and that's most of what we want code to do, so this is good. It's okay. My complaint with it is it's a fairly big function and only that much of it is concerned with the Fibonacci thing so it just feels kind of lopsided to me. So here is another approach. In this one, I kind of permuted the statements of the Fibonacci sequence in order to delay the output, to cause the first two numbers to get output. So who did something like this? Yeah.
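One way the permuted version might look; this is a sketch matching the description, not necessarily the statement order used on the slide, and fibonaccif is again an assumed name. The generator returns the old value of a, so the two seeds come out first without any position variable.

```javascript
// fibonaccif (assumed name), permuted form: return the current a,
// then slide the window forward by one.
function fibonaccif(a, b) {
    return function () {
        var next = a;
        a = b;
        b += next;
        return next;
    };
}

var fib = fibonaccif(0, 1);
fib();    // 0
fib();    // 1
fib();    // 1
fib();    // 2
fib();    // 3
fib();    // 5
```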
So this is probably the most optimal solution, it's going to be the smallest code, fastest performance, not that either of those matter in real life, but it does have that advantage. The disadvantage of this is I'd hate to be the guy who has to debug it, right. I said b = a + next instead. Yeah, they are variations on it, but it's basically the same idea. So another approach we could take is recognizing that we're making a generator and we already have some tools for constructing generators. So here is another approach. I've got my Fibonacci generator here, which will give me the next number I just need to get the first two on top of it, so I'm going to make a special generator and I'm going to start by taking identityf, which was the first interesting function that we wrote this morning that you thought had no practical application. It turns out what identityf is a constant generator, it will always produce the same value. So I'm going to use that to make generators and then I'm going to use the limit function that we wrote earlier to cut off the sequence so I only get one. So I got a pair of things and I can then concatenate those two together and then concatenate that onto the Fibonacci function. So who did that? Of course, nobody would do that. Yeah, so there is that. And if I were going to be doing this a lot, I would take limit identityf and encapsulate that into something, which would make sequences of one more compactly. Or we could do this, we could a similar thing, we take the element function that we wrote earlier, make an array containing the first two things and concatenate that onto the Fibonacci generator, so who did that? No, no one. Similar, I used an array. Very good. Brilliant. Function Challenge 7 What do you say we do some objectory programming? Yeah, alright. So we're going to do something with an object now. 
We're going to write a counter function that returns an object containing two functions that implement an up/down counter, hiding the counter. So we'll call the counter factory and pass in an initial value and it will return an object containing an up method and a down method, and when we call up, it will add one to the value and return it. When we call down, it will subtract one from the value and return it. And I'll give you a hint, no global variables, no this, none of that crap. Darn it! I always use global variables everywhere. That's what Ben taught me to do. I did not teach you to do that. What could go wrong? Ben always says, more globals. That's exactly the opposite of that. More globals. So here is counter, it takes a value and returns an object containing two functions. The first one, the up function, adds 1 to the value and returns it; the other one, the down function, subtracts 1 from the value and returns it. So who got something that works? Brilliant. If you didn't get it, write it down because you'll want to refer to this pattern later. So this is very similar to what we talked about yesterday where we've got two functions inside of a closure, which are both sharing common data. This is a very simple example of that, but basically, all object constructors are going to follow this pattern. Any questions about this? Okay, want to do another one? So this next one I promise is going to sound much worse than it actually is. So make a revocable function that takes a binary function and returns an object containing an invoke function that can invoke the binary function and a revoke function that disables the invoke function. Okay. So let me explain what's actually going on here.
This is something that might have some security properties in that we might have some guest code that we allow into our system and we want it to be able to run as long as we want it to, but at any point we want to be able to cut it off and we don't want to have to rewrite our existing APIs in order to accommodate that. So this is a variation on the limit function that we wrote earlier, in fact, you might want to refer to your implementation of limit when you're doing this one, that except in this one, instead of keeping a count about how many times you get to do it, we will have a separate function called revoke, which will when we call it cause the thing to stop working. So we'll call our revocable factory, we'll pass in any existing function, in this case, the add function, and it will return an object containing two functions, one of them will be the invoke function, which is the revocable Add function, and we can give that function to the third-party, but we will hold onto the revoke function for ourselves. And so, the revocable add function will work just like add until we call the revoke function, at that point, all it does is return undefined. Everybody clear? It's one of those cases where you really need the … arguments event. Yeah, we want this to work for all functions, right. Right now, it only works for… For binary. And actually, anything that works for binary probably works for unary because we're just doing pass undefined as the second argument and it gets ignored. So the rest you could just plan ahead and… Yeah, if you knew we never used more than 10 rathers you could do that. This looks like… Who wants to do that, yeah, that's awful. But you could call if inside you could always just call our, sorry, you can pass that forever. You can, yeah. So you can do that today, it's just inexcusable ugly and I don't want to waste your time with it so we're not going to do that. 
So argument 0, arguments 1, arguments 2, you can't just pass in the arguments object. Actually you can, which is kind of a problem. It didn't work for me, that's why I didn't write that out. That was my go-to to try that and it ended up doing something funny. Oh good. You probably did something wrong, which is great. Keep doing that. What's the name of it, is it args or it's arguments? Arguments. So if you get arguments in a function and then you call another function inside of that passing arguments, you're passing an array to that function and you're not passing each of the arguments in that array. Right. That's why it doesn't work. But you can then use apply to spread it out. In ES6? In ES3. Oh, okay. So array.apply and function, I'm sorry, arguments.apply. No, unfortunately, it's function.apply. We're not doing that. It's too awful. That's right. You showed us yesterday and you said it was ugly so I ignored it. Yeah, it was good that you ignored it. Okay, are we ready for revocable? So here is revocable, it takes a binary function and returns an object containing an invoke method and revoke method. The invoke method looks to see if a binary is undefined, if it isn't, then it will call it passing the first and second argument, otherwise, it doesn't do anything, it returns undefined. Revoke function sets binary to undefined, thereby, disabling the invoke function. So who got something that works? Very good. Anybody do something different, something notable? I put a variable okay, which I set to false, but it's the same thing. That is okay. Thanks. So I opened this up, so I create, we made this object, which is in the and I look in the console at my object and I want to introspect it, so I expand it, well where is binary in there because it's not exposed as publicly as a private object it supposedly knows about it. 
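A sketch of revocable as described. The add helper stands in for the one written earlier in the workshop. Note that binary appears nowhere on the returned object, which is why expanding it in the console shows only invoke and revoke.

```javascript
// Stand-in for the add function from an earlier challenge.
function add(first, second) {
    return first + second;
}

// revocable hides the binary function in its closure. Only invoke and
// revoke can reach it.
function revocable(binary) {
    return {
        invoke: function (first, second) {
            if (binary !== undefined) {
                return binary(first, second);
            }
            // Falls through and returns undefined once revoked.
        },
        revoke: function () {
            binary = undefined;
        }
    };
}

var rev = revocable(add);
var addRev = rev.invoke;    // this is what we hand to the third party
addRev(3, 4);               // 7
rev.revoke();
addRev(5, 7);               // undefined
```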
Right, binary is hidden in the function scope of the revocable function and is available only through its closure and the only functions in the universe that have access to that closure are invoke and revoke. So that's why this is something that we can build secure systems out of. You know, if we were to take your okay flag and put it in the object itself, then the attacker could go to the object and turn it the other way, right, so that wouldn't accomplish what we want to do here. So obviously, they could look at the source, unless you obfuscated that. I mean, it's… So we assume that the attacker can always look at the source, but they can look at this source and it doesn't help them, right, unless they were there at the creation of the object, they can't get to it, they can't get to binary. So revoke is irrevocable, in this case. That's right, this is a one-way trip. Now we could design this to work a different way. Now using his okay variable as an on/off switch, we could provide a second function or maybe an argument to revoke, which could reverse it, but generally, I prefer systems where once we cut them off, it's off and we don't have to worry about something turning it back on again in an unexpected way. Yeah. Since you're not explicitly returning undefined in the invoke, is this a case where you said you'd flip flop and really you should return undefined or not? That's right. I mean, I like the idea of being explicit and I also like not doing anything I don't have to do and so. Kind of at odds with each other. Yeah, exactly. There is a real conflict there. Most of the time, doing the right thing and doing the right thing are obviously the same thing. This is one of those cases where it isn't so much. Function Challenge 8 So this one is going to be totally trivial so I'm just going to give you the solution. We're just going to do this because it's going to set up the next problem.
So we're going to write a function m that takes a value and an optional source string and returns them as an object, and in this case, instead of simply passing things to the log, I'm going to be using JSON.stringify to make strings of the object because it turns out the object's toString is completely worthless, it doesn't show you anything about what's in the object, it's just not worth it. JSON.stringify on the other hand does a fairly good job of showing you what's in the object. So in this case, the stringification of m of 1 is an object where the value is 1 and the source is the string 1, we make the string from the value. If we pass in a second argument, then that second argument will be the source property. So if we pass in pi and the word pi, the value will be pi and the source will be the string pi. So this is the function, the constructor that does that. So go ahead and type this in, I'll give you a minute, and then we'll go on and we'll do the next problem. Write a function addm that adds two m objects and returns an m object. So you can imagine we could have a regulatory requirement that not only do we provide results, we also provide extreme detail on how we obtained the results so this is a way to help automate that. So if I add two m objects, I get an object where the value is the sum of the values and the source is the concatenation of the two sources with a plus sign and parens around them. So if we add an m of 3 and an m of 4, the source will be paren 3 + 4 close paren. And a similar thing happens with pi. So here is addm. Addm takes two m objects, it returns a new m object where the value will be the sum of the two values and the source will be the concatenation of the sources. So who got something that works? Did anybody do it the hard way? Yep, first rule of objectory programming, let the objects do the work.
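A sketch of m and addm as described. The exact test used to detect a missing source string is an assumption; any check that falls back to the stringified value would fit the description.

```javascript
// m wraps a value together with a string describing where it came from.
function m(value, source) {
    return {
        value: value,
        source: (typeof source === "string")
            ? source
            : String(value)
    };
}

// addm lets the m constructor do the work of building the result.
function addm(a, b) {
    return m(
        a.value + b.value,
        "(" + a.source + "+" + b.source + ")"
    );
}

JSON.stringify(m(1));                  // {"value":1,"source":"1"}
JSON.stringify(addm(m(3), m(4)));      // {"value":7,"source":"(3+4)"}
JSON.stringify(addm(m(1), m(Math.PI, "pi")).source);    // "\"(1+pi)\""
```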
We already have a nice constructor and we want to use it because there is a chance that we might change what m does, and so we want all of the instances to take advantage of that. So if you didn't get this one, write it down because we're going to need it for the next one. (Waiting) Okay, we ready? So write a function liftm that takes a binary function and a string and returns a function that acts on m objects. So we've done this before, right. If we pass the add function to liftm and also the plus string, it will make a function that works exactly like addm and we can do the same thing passing multiply and asterisk or any binary function. So this will help us to automate the process of building this system of journaling arithmetic. Do you need to do the check on the source or the string array? No, if we try to concatenate it with the string, it will be sourced with a string. Can you show the problem again? Okay, here is liftm. Liftm takes a binary function and an op string, it returns a function that takes a and b, it returns a new m object where the value will be the result of the binary function on the two values, and the source will be the concatenation of the sources of the op string. So who got liftm? Way to go! So have any of you ever heard of monads? The Haskell guys talk about them all the time. You just made a monad. So you can go home and tell your kids, made a monad today. You did that. Alright. So what is a monad? So a monad is a thing where you can take functions and lift them up to a higher level where they can have or require some new capability. The Haskell community uses them because there is a trap in Haskell. Haskell is a brilliantly designed language and one of the characteristics about it is it does not allow any kind of mutation. 
So all functions are pure functions in the mathematical sense and that's a really interesting thing, except that if your programs have to interact with the real world, the real world is constantly mutating, right, and something like an account balance cannot be a constant, it's got to vary. They can't even do I/O in immutable system, right, you can't have anything coming in, you can't have anything going out because nothing could ever change. So that kind of made things practically real hard for them, and they figured out this trick with monads which by using higher order functions in really clever ways, they can have the appearance of mutation without actually mutating anything. But that's for another time. Anyway, you made one of those, so congratulations. So if you didn't get this function, you should have it now because we're going to use it again in just a second. So is everybody ready to move on. Okay, so here is the next problem. Modify function liftm so that the functions it produces can accept arguments that are either numbers or m objects. So to make this a little bit easier to use, you can pass in any number, you don't have to explicitly wrap it with m first, so it will just change liftm to do that. Okay. And then make it really flexible so that you can pass in a number or an m object or two m objects or two numbers, any combination will work. Either choose three or four arguments or. No, it'll take two arguments. Whether it's either m or a number. Okay. Okay, here is liftm. So the change I made was in the box. I just looked at the type of the argument, and if it is a number, I call m to turn it into an m object. So who got something that works? Very good. Who did it the hard way? What's the hard way? Where you didn't do that. Yeah, so first rule of objects. Function Challenge 9 Write a function exp that evaluates simple array expressions. 
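Before going on to exp, here is a sketch of the revised liftm described above, the version that accepts numbers or m objects in any combination. The m, add and mul helpers are stand-ins for the ones written earlier in the workshop.

```javascript
// Stand-ins for helpers from earlier challenges.
function add(first, second) {
    return first + second;
}

function mul(first, second) {
    return first * second;
}

function m(value, source) {
    return {
        value: value,
        source: (typeof source === "string")
            ? source
            : String(value)
    };
}

// liftm lifts a binary function so that it works on m objects and
// records how each result was obtained.
function liftm(binary, op) {
    return function (a, b) {
        // The change "in the box": wrap bare numbers with m.
        if (typeof a === "number") {
            a = m(a);
        }
        if (typeof b === "number") {
            b = m(b);
        }
        return m(
            binary(a.value, b.value),
            "(" + a.source + op + b.source + ")"
        );
    };
}

var addm = liftm(add, "+");
JSON.stringify(addm(3, 4));                    // {"value":7,"source":"(3+4)"}
JSON.stringify(liftm(mul, "*")(m(3), 4));      // {"value":12,"source":"(3*4)"}
```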
A simple array expression is an array in which the first element is a function and the remaining elements are the arguments to that function. So if we pass that array mul, 5, 11 to the exp function, it will return 55. And if we simply pass a number to the exp function, it'll just return that number. Okay. It looks like a binary. Yeah, assume a binary function. It turns out it will also work with unary functions. What's the exp, oh we're writing exp, sorry. Yeah, you're writing it. And it's always three elements in the array? It could be two or three. Two or three, okay. But you don't care about that. Assume three, it's always three. Next year, it'll be m, but this year it'll be three. Okay, here is exp. Exp takes a value and it returns something. If the value is an array, it returns the result of calling the first element, passing the next two elements as arguments; otherwise, it returns the value. So who got something that works? Very good. You want to do one more, you think you got it in you? It's going to be based on this one, so if anybody didn't get this one, get it now because you're going to need it. Okay, ready? Last problem. Finish strong, okay. Modify exp to evaluate nested array expressions. A nested array expression is just like the simple array expressions, except any of the arguments can also be a nested array expression. So in this case, we've got a hypotenuse thing going on here and if we pass that nested array structure to exp, it will now evaluate the whole thing and come up with 5. Okay, everybody got it? If any one of the parameters is an array, then that has to get passed off to, that gets evaluated. Yes, exactly. So let's take a look at exp. So the only change that I made was I called exp on each of the arguments before we used them, that's it. And so, recursion ends up doing all the work, which is really, really nice. Any time you're dealing with nested data structures, recursion is usually the ideal way to deal with that.
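A sketch of the nested exp. The helper functions and the variable name nae are assumptions for this example, and Array.isArray is one reasonable way to test for an array; the slide may have used a different check.

```javascript
// Stand-ins for helpers from earlier challenges.
function add(first, second) {
    return first + second;
}

function mul(first, second) {
    return first * second;
}

function square(x) {
    return x * x;
}

// exp evaluates an array expression: [function, argument, argument].
// Each argument is itself passed through exp, so nesting just works.
function exp(value) {
    return (Array.isArray(value))
        ? value[0](exp(value[1]), exp(value[2]))
        : value;
}

exp([mul, 5, 11]);    // 55
exp(42);              // 42

// The hypotenuse of a 3-4 right triangle.
var nae = [Math.sqrt, [add, [square, 3], [square, 4]]];
exp(nae);             // 5
```

Unary functions such as square and Math.sqrt receive undefined as a second argument and ignore it.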
So one of the nice things that this function illustrates is just how powerful object systems or function systems can be, that this little function, and it is a little function, implements most of a programming language. Anyone recognize what language this is? Lisp. Lisp, yeah. In Lisp, they use parens instead of brackets and the commas are optional, otherwise, this is Lisp. It's clearly not the whole language, but it's an interesting part of the language and one of the reasons why people who do Lisp act the way they do is because they can do stuff like this really, really fast, really, really easily. That is just, you compare that to implementing a system to do one of our languages and it's just a completely different thing. So I hope you all enjoyed this. Today was grueling, but you came through. Does anybody feel like their brain got bigger today, anyone feeling that? So before I send you home, obviously you're going to need some homework, right. So let's get to the homework. The homework problem is write a function addg that adds from many invocations until it sees an empty invocation. You know you've got an empty invocation when you receive undefined as an argument. So in this case, if you pass nothing to addg, it'll return undefined, if you pass 2 and then nothing, it'll return 2, if you pass 2, and then 7, and then nothing, it'll return 9, if you pass 3, and then 0, then 4, and then nothing, it'll return 7, and if you pass 1, and then 2, and then 4, and then 8, and then nothing, it'll return 15. So I'll give you a hint, this problem is going to involve a function returning itself, which is something you may have never encountered before. Function Challenge 10 Okay, let's look at addg. Addg takes a first argument, it creates a more function, if the first argument is not undefined, it returns the more function. The more function takes the next argument. If the next argument is undefined, it returns first, otherwise, it adds next to first and returns itself.
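A sketch of addg following that description. The explicit `=== undefined` test matters here: a falsy check would treat the 0 in the fourth example as an empty invocation.

```javascript
// addg adds from many invocations until it sees an empty one.
function addg(first) {
    function more(next) {
        if (next === undefined) {
            return first;
        }
        first += next;
        return more;    // the function returns itself
    }
    if (first !== undefined) {
        return more;
    }
    // An empty first invocation falls through and returns undefined.
}

addg();                  // undefined
addg(2)();               // 2
addg(2)(7)();            // 9
addg(3)(0)(4)();         // 7
addg(1)(2)(4)(8)();      // 15
```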
By returning itself, it allows for the next thing to happen and I call this retursion. Now recursion is when you have a function that calls itself. Retursion is when you have a function that returns itself. So anybody do anything else, take a different approach to solving this? There are lots of possible solutions. You're doing the same thing where you compute ahead the next value, you did another one yesterday. So if you didn't get this one, please write it down because you're going to need it for the next one. Anybody else? Well it is different, but I don't how to read it. I'll say it. If x is undefined, return undefined. I don't want to say anything. Never mind. Keep going. So I'll give you a minute to capture this. So while that's happening, can anyone guess the name of the next function that we should write? Addg. Liftg. Yeah, very, very good. Liftg, of course. What else do we want to write next. Write a function, liftg that will take a binary function and apply it to many invocations. So it's similar to what we've done before. If you pass the add function to liftg, it gives you the addg function. We could do the same thing with multiply and make the mulg function. If you pass add into liftg, what would the value of a first operation be, what would you get back from the first time you call? Undefined. It would be undefined. If you recall the result. No, instead of using passing in mul, you pass in add. Right, although, so you'll get undefined, you'll get 3, and then you'll get 7, and then you'll get 16. So you add 3 to 0, but with mul, you're multiplying 3 times 1. Okay. Alright, so here is liftg. Liftg returns a function, that function gets the first argument. If first is undefined, it returns first, otherwise it returns the more function. The more function gets the next argument. If next is undefined, it returns first, otherwise, it sets first to the result of the binary function with first and next and returns itself. So who got something that works? 
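A sketch of liftg as described, with add and mul as stand-ins for the helpers written earlier.

```javascript
// Stand-ins for helpers from earlier challenges.
function add(first, second) {
    return first + second;
}

function mul(first, second) {
    return first * second;
}

// liftg generalizes addg: any binary function can be applied across
// many invocations.
function liftg(binary) {
    return function (first) {
        if (first === undefined) {
            return first;
        }
        return function more(next) {
            if (next === undefined) {
                return first;
            }
            first = binary(first, next);
            return more;
        };
    };
}

liftg(mul)();                  // undefined
liftg(mul)(3)();               // 3
liftg(mul)(3)(0)(4)();         // 0
liftg(mul)(1)(2)(4)(8)();      // 64
liftg(add)(1)(2)(4)(8)();      // 15
```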
Outstanding. Very, very good. Any questions about this one? Anybody do a different approach that they think is worth mentioning? I didn't get it to work, but I tried to do type of first to object and I do something. I was trying to figure out if it's a function or number. Alright, so if you didn't get this one, get it down, you might want to refer to it in the next one. I can still serve the addg function inside the new function and generalize the addg function. So the first time, if you call this with nothing, if you pass in undefined, you just get that itself. Right, we return undefined. So the next time when you, so if you pass in a number. That number is now first, and then the second time, if we get undefined, we return first. The number is now first. Oh, I see. Yeah, great. Write a function arrayg that will build an array from many invocations. So if we call arrayg and pass it nothing, it returns an empty array. If we pass it 3 and then nothing, it returns the array containing three. If we pass it 3, and then 4, and then 5, and nothing, it returns 3, 4, and 5. Okay, so here is arrayg. Arrayg takes a first argument, it creates an array, it makes a more function and it returns the result of calling the more function with the first argument. The more function gets the next argument. If next is undefined, it returns the array, otherwise, it pushes the next argument onto the array and returns itself. So who got something that works? Very good. Very good. Did anybody try using liftg? Anyone? That would have been extra credit. Alright, so let's do another one. I've got, this is a good one for you. Make a function continuize that takes a unary function and returns a function that takes a callback and an argument. 
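Before moving on to continuize, here is a sketch of arrayg as it was walked through above, reconstructed from the spoken description rather than copied from the slide.

```javascript
// arrayg collects every argument it is given until an empty invocation.
function arrayg(first) {
    const array = [];
    function more(next) {
        if (next === undefined) {
            return array;       // empty invocation: hand back the array
        }
        array.push(next);
        return more;
    }
    // Delegating the first argument to more() handles arrayg() too.
    return more(first);
}
```

Unlike addg, calling arrayg with nothing returns an empty array instead of undefined, which is why the first argument is simply passed to more.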
There is a style of programming called Continuation Passing Style in which functions, instead of returning values, are given an additional argument, which is the function to which to deliver that value. In continuation passing style, computation is always going forward; functions tend not to return anything, they're always going forward, and it's amazing. And when JavaScript gets tail recursion with ES6, then we'll be able to do that in JavaScript and that will be great. So this function is intended to help us to migrate toward that style of programming. What it'll do is allow us to take any existing unary function and turn it into a function that will take a callback. So in this case, we're going to pass the square root function to continuize, and it will return the square root with a callback function. We will then call that function, passing in a function that will receive the result. In this case, I'm suggesting that it could be the alert function, which is something that lives in the browser, which will pop up a box and show you the answer. If you're not running in a browser, you could log to the console or whatever you're doing, just some way of seeing where the result is going to go. And then we'll pass in 81, which is the number we want to take the square root of. So if this function is working correctly, then we should see a box show up in the browser with the number 9 in it. Okay, so here is continuize. Continuize takes a unary function; it returns a function that takes a callback and an argument, and returns the result of calling the callback function with the result of the unary function and the argument. So who got something that works? Congratulations. You are now in the 90th percentile of JavaScript programmers.
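As a sketch of the continuize function just described (console.log stands in for alert here so the example also runs outside a browser; the name sqrtc is only illustrative):

```javascript
// continuize wraps a unary function so that its result is delivered
// to a callback instead of being returned directly to the caller.
function continuize(unary) {
    return function (callback, arg) {
        return callback(unary(arg));
    };
}

const sqrtc = continuize(Math.sqrt);
sqrtc(console.log, 81);     // logs 9
```

In a browser, sqrtc(alert, 81) would show the same result in a dialog box.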
Not that the bar was ever very high, but you're up there, and it's because you understand an aspect of the language which is vitally important and which most of the people working in this language don't know is even there. What we've been doing over the last day is a paradigm shift. Now I could talk for hours and hours about what could happen when you have a function that returns a function, and you would think you understand the words and go yeah, but unless you've done it a few dozen times, it really won't make any difference, it won't make any sense. But you've done it now. When you came in here yesterday morning, you did not know how to write this function, and you just knocked it off; you wrote it very quickly and you got it right, so good for you. And the reason we went through all of this is that I wanted to explain to you again something that I did on the first day that you thought you understood, but you didn't. So we're going to go through that again now, and this time you have the tools and the context to understand what I was talking about.

Building a Better Constructor

This is the pattern for using functions to construct objects. I've got my constructor function, I've passed in my initialization object, and I'm calling another constructor, which allows me to inherit what that thing does; I'm going to add stuff to that, and that will be the object that I'm constructing. I will create my member variables, which are the things that my methods will have access to; that's where I'm going to be keeping all of the state, all of the data that's within this object. I will create my member methods, which are just local functions within this scope, and each of these will close over the initialization value that we passed in (whatever that is; I recommend an object), over all of the other members, and over all of the other methods, so that we don't ever use this in this pattern.
And then anything which needs to be public, I publish it by assigning it to the outgoing object and when I'm done, I return the outgoing object. So I'll have as many members as I want, as many methods as I want, I make as many of them public as I need to, and then I'm done. It's a really straightforward way of making objects, it's very flexible, there are lots and lots of variations on this, but this is the basic pattern that I recommend for using, for constructing objects in JavaScript. Now next year, when all of ES6 becomes available and there is some new syntax in ES6, which can be applied toward this, and also with an eye on making our systems even more secure, I'm going to revise the pattern to be like this. So this is next year's pattern. I'm going to start with a constructor object as before, which will contain lots of good stuff, which tells me how to make the object. Then I'm going to make my instance variables and I'm using some new syntax here. First off, I'm using the let statement. In this case, there is no advantage to using let over var, but it makes the Java guys happier, so I'm going to try to use let as we move to the new language. And the curly braces around the variable name means something special here. So what I'm doing is I'm creating a new variable called member and I'm going to initialize it with spec.member, so it's a shorthand for doing those sorts of things, and I can put as many names in the curly braces as I want separated by commas and each of those names will be initialized by a similarly named property from that object. So this doesn't let us do anything that we couldn't do before, but it's some convenient syntax for pulling values out of the initialization object and putting them into our local variables. When you say let where member equals spec, now spec.member is initialized. It's the initialization value for the member variable. So that statement means the same thing as let member = spec.member. So member = spec.member. Okay. 
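To make the first pattern concrete, here is a minimal sketch. The constructors makeNamed and makeCounter and the properties name and start are hypothetical, chosen only for illustration; the talk describes the shape of the pattern, not these particular objects.

```javascript
// Hypothetical "other constructor" that we inherit from.
function makeNamed(spec) {
    var that = {};
    that.getName = function () {
        return spec.name;
    };
    return that;
}

// The pattern: call another constructor, keep state in local variables,
// define methods as local functions, publish by assignment, never use `this`.
function makeCounter(spec) {
    var that = makeNamed(spec);     // the outgoing object, with inherited methods
    var count = spec.start;         // private member variable
    var increment = function () {   // member method closing over count
        count += 1;
        return count;
    };
    that.increment = increment;     // publish what needs to be public
    return that;
}

// The destructuring shorthand: these two declarations mean the same thing.
var spec = {name: "clicks", start: 5};
let {start} = spec;
let sameStart = spec.start;
```

The count variable is reachable only through increment, because it lives in the function scope rather than on the returned object.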
Then I'm going to use the new const statement to, in a similar way I'm going to this other constructor where I'm going to get methods which I'm going to be using and I'm going to extract the methods that I want to use and put them into private variables and I'm doing that because I anticipate that this object could be frozen, and so, I don't want to be adding stuff to it as I was in the previous model because I don't want to break if someone that I'm trying to inherit from passes me a frozen object. This also solves what's sometimes called the banana problem, that there is a complaint about object systems where you want to inherit a banana, but you end up also inheriting the gorilla that's holding the banana in the jungle that the gorilla is in, you get all of this stuff and all you wanted was this little thing. So this allows us to do that, we can extract things that we want, and because we're extracting things now, we can call as many of these guys as we want. So we can get to as many of these other constructors and take all of their goodness and pull it up into local variables and we get all that goodness. Then again, I'm going to be making my methods, and again, my methods will close over all of this stuff, and again, my methods are not using this. Then when I'm done, I publish the public methods here, the ones that I made or the ones that I inherited, and I'm taking advantage of new object literal syntax here that if you have an object literal and you just say a name and you leave out the colon, it's a shorthand for name:name. So this is short for method:method and it's short for other:other. Again, not essential, but it's nice and it gives us a, this starts to look more like a declaration than like code where I'm just giving the list of things that I want to publish. Then finally, I'm going to freeze that object because freezing gives us very good security and reliability properties for objects, which we can't get in this language in any other way. 
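The revised pattern can be sketched like this. As before, makeNamed, makeCounter, name, and count are hypothetical names used only to give the pattern something to do.

```javascript
// Hypothetical "other constructor"; it returns a frozen object.
function makeNamed(spec) {
    const {name} = spec;
    const getName = function () {
        return name;
    };
    return Object.freeze({getName});
}

function makeCounter(spec) {
    let {count} = spec;                     // instance variable from spec.count
    const {getName} = makeNamed(spec);      // take only the method we want
    const increment = function () {
        count += 1;
        return count;
    };
    // Shorthand properties: {increment, getName} means
    // {increment: increment, getName: getName}.
    return Object.freeze({increment, getName});
}
```

Because the inherited object is never modified, only read from, it does not matter that makeNamed hands back a frozen object, and several such constructors could be called and mined for methods in the same way.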
So when we look at the evolution of programming, in languages like C and Pascal we got structs and records, you know, data structures which let you have named properties. Then in object-oriented programming, we took it a step further, where we could have functions or methods which are related to those structures, which will act upon them, and I think that was a really important evolutionary step, but it shouldn't have been the last step. So I think JavaScript actually gives us a way to go forward from this, so that I'm now thinking that we have two very distinct kinds of objects. We have objects which just contain data, only data, and objects which contain only functions, which are frozen. Those objects are very strong and very reliable; they cannot be tampered with. They provide the interface for dealing with the objects which are containing the data, and that way we can create good APIs which can defend themselves, which can remain robust in the face of all the confusion happening inside our system. So how would you pair those two things, the data object and… I'm going to put the data in the member variables. All of the data goes in the member variables, and those could contain simple values like numbers and strings, or they could contain objects. So any other questions about that? I don't understand any of this. Yeah, well we'll review it again this afternoon. So what I want to do now: do we need a break, or are we ready to go on to the next problem? Can I just ask a question for Barry's sake? So if you were to use that pattern, you'll have your function object, you'll have your data object that you'll feed it into and you'll feed your data object into the function object, act on that function object creating a new data object to then act upon next in the GF function objects that are all function objects that never change or are immutable, but they all create new data sets.
Right, each time we call this constructor, we'll get a new thing, and within its function scope, it will close over whatever objects you need to work on. But in itself and its data when it's created is all immutable and it's going to give you new information out of it. Well the data is mutable, but only within the function scope. It is not, you cannot mutate it from outside, except through the function. The freeze doesn't mean that it can't change its own data if you have the functions, you just can't give it new input. Freeze is only on the object containing the functions. The functions themselves are still free to act on anything that they close over. Could you go back to the previous slide please? I think it could be 5 and it could be an object, it could be anything you want. It's anything you want to pass to a constructor. (Waiting) I think I'm confused by fact that what does the other constructor do, it creates yet another object that. Right, it provides a means of inheriting something else, this is how you would do inheritance in this pattern. If you don't need to inherit from something else, then we'll just start with that being an empty object. Right, or it could be something else I use could assigned. Whatever, yeah. I don't understand how this is related to what, you congratulated us for doing something five minutes ago that I've been doing that with Node for a year, and this still doesn't make any sense. I mean, this makes sense to me, I don't understand the relationship, we are passing a function in. The relationship is this is only possible if you understand how to use closure and function scope, which is something you can do now. Function Challenge 11 Here is the next problem. This one is going to be a little bit different in that we're going to do it all together. So we're going to make an array wrapper object with methods get, store, and append such that an attacker cannot get access to the private array. 
See, the idea is that we've got an array and we want to protect it behind a good API, and we want to be able to hand it to a third party, even a malicious third party that might want to get directly to the array, which we want to prevent. So this is how we would do it. We've got our vector constructor, which will make a vector instance. It will have an append method, which can be used to append things onto that secret array; it'll have a store method, which will allow you to store things into that array; and it will have a get method, which will allow you to retrieve things from that array. Okay. So think about how you might implement something like that. I'm going to guess, based on what we've been doing, it might look something like this. So here is a function called vector. We have an array variable containing the array; that's the secret that we want to protect, and it's hidden in the function scope, so we're already off to a good start. And we're going to return an object containing three functions: the get function, the store function, and the append function. The get function will take i and return array[i]. The store function will take i and v and store v into array[i]. And append will take a value and push that value onto the array. So the guarantee that we want to make is that we can give this to a third party, and the third party can access the array indirectly using these methods, but the third party cannot get direct access to the array itself, because we want to limit access to the array to only the things that we allow and not to any of the other things that you could do to an array. Now it turns out there is a vulnerability in JavaScript which invalidates that guarantee: it is possible for a determined hacker to get direct access to the array. Now this problem has been shown to some of the top JavaScript experts in the world and they could not see the attack.
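The vector implementation described above can be sketched as follows, reconstructed from the spoken description. This is the version that still contains the vulnerability the exercise is about.

```javascript
// The wrapper as described: the array lives in the function scope and
// is reachable only through get, store, and append.
function vector() {
    var array = [];
    return {
        get: function get(i) {
            return array[i];
        },
        store: function store(i, v) {
            array[i] = v;
        },
        append: function append(v) {
            array.push(v);
        }
    };
}
```

A caller would use it as myvector = vector(), then myvector.append(7), myvector.store(1, 8), and myvector.get(0).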
The attack is not something that's due to bugs and implementations, so we're only concerned with standard behavior of the language. There are some things about the language which we know are problematic, for example, you can go to array.prototype and replace its push method with your own method or it, we'll assume that those things have been fixed. So we're just concerned with the language as it works as we've discussed. So your job will be to figure out this code and suggest how the code could be attacked and how an attacker could get access to the array and then how we would repair it so that the attacker couldn't do that. And my job will be to honestly answer all of your questions about how this code works and there you are. Okay. Identifying Security Vulnerabilities So the goal is to protect array. Right, which we have started to do. We've got the thing in a function scope, we've got methods that close over it, we've done all of that stuff well, but there is a vulnerability in the language which will frustrate our work. I mean, can't you just, and you just have the return object right, so you could make a new, could you potentially define a new key with a function that just returns array? How would you do that? Kind of store a function. So be more explicit. So you call and you pass in the function that portrays the array. Okay, but the function you pass in is created on the outside, right, so that function doesn't close over the array when it's on the outside. And so, only functions created on the inside will be part of that closure. You add a method to a function prototype. It is possible to add functions to the prototypes of system objects and that is definitely a thing to be concerned about, we're assuming that's been fixed. So if you, I think this is what Joe was saying, if you were to instantiate a vector and then you say vector getArray=function return array, array is undefined because it's in that scope of where you're creating it. That's right. 
So that's good. That's a place where the language is working for us. That's not where the language is weak. (Waiting) If you inherit from vector, do you have access to the array? It's a really good question. No because function scopes don't work like Java scopes, there is no package kind of thing, so no, you don't inherit the contents of a function scope. Can you twiddle with that function? So vector, your instance.get has this function and can you, it is a function, is there some way to manipulate the guts of that function once you have a pointer to it. No. No, functions, like you could store a function, you could use store to put a function into the array and then you use get to get it back out, but the function is not altered as a result of that experience. It doesn't remember that it's supposed to be over there. Yeah, the function doesn't have any properties that you can go and pick at or change so that when you call it the next time, it gives you something completely different. Right. If I tack a function onto that and then call this, does this have access to array? You are on to something. So if I call it, if I have a function called screw it up and screw it up said this.array= the new array. It wouldn't be this.array because there is not an object which represents the contents of the scope, but this is definitely part of the attack. Can we change push so that we said array, we change what the prototypical push of an array does so that it returns its own value. Yeah, you're getting close. Oh because the return I or the first function doesn't return anything, so that's not going to get it out. 
Second one doesn't return anything so that's not going to get it out, or actually, when you pass in an array to store, you could change, or no, you could pass in an array to append and change push so that what push does it takes the thing that you've passed in and pushes onto the end of it itself, so you call append with this array and then you get the last thing off of the array and that's the array you want. No. That didn't work. You're drifting away. You almost had it. You were very, very close. Push a function out at the end of the array using append and then you use get to call that function somehow. You can do that, but the function does not remember that happened, so it is not altered by that. How do you in JavaScript change the function of a prototype? Do you check like array.prototype.push and now you're manipulating? Okay, there it is. That's exactly it. Yeah, you could go to array.prototype.push, but we're assuming that that's been fixed. So one of the aspects of the design of JavaScript is that everything is unlocked with its guts exposed, and if you're thinking about security, that's obviously a terrible thing, so we're assuming that that's been fixed. So we're concerned now just with the performance of ordinary programs. So if you were to store a function that calls this.array, you still wouldn't have access to the internal, right. This.array won't get you there, but this itself will get you there. So this returns this. Yes. I got it. What I described exactly did it. Maybe you're right. Array.prototype.push = function on a, a bracket 0 = this, so then I created an array 1, 2, 3, 4 and I created array x empty array and then I called my first array.push passing in x. It took that empty array, it took the array itself made at the first element of x that I passed into it and now x, the first element is x is the array I wanted to hack out. Right, but I've said a couple times, we're not looking for array.prototype. Now we're assuming that that's been fixed. 
Oh, I'm sorry. Array.prototype is fixed in the new version of the language. No. We're assuming in the context of this problem that that's been fixed because of the very thing that you just described. It has nothing to do with using the call method? It does not require use of the call method. Identifying Security Vulnerabilities - Continued So you've gotten very close a couple of times. It is concerned with push, it is a concerned with a function that will return this. So can it return this or uses this referring to the array or is it referring to vector? The array. Find a new function does it through the lengths, can do it through the length of… No. If you push a function onto the, if you append a function onto the in this, you call that, I know it's the wrong scope, is it at all passing a function into one of these? Yes, it does. If that function includes this inside of it and then you call that function, what will the scope of this there be. It will depend on how the function is called. Remember that there are four different ways to call a function and what happens to this depends on which form of call it is. You want it to be… You want it to be the method form. What's that look like? It looks like a method invocation. So the function could assign this to a global variable. It could. Outside of the function. Right, that's not, in fact, that is a possible way of exploiting this, but it's not key to the attack. Okay, so I passed in, I pushed onto the end of the function, which console logs this 0 and it's putting the first element of the array into the console. Right, but they can get the first element of the array simply by calling push. So we want the whole array. Not necessarily the whole contents of the array, but the array object itself, that's the thing that we're trying to protect. Right. So I pass in a… Okay, got it. Can you call this.this.array from a function that you pass in the storage and to the array. I don't think that's going to do anything. 
Poking around the constructor, am I on the right path at all? I'm sorry. I say this.constructor. That's not going to help you. This by itself, if you can get it bound correctly, is what you need. So I don't have this bound correctly, but I passed in a function that returned a console.this, bound that via get to a variable and then called the variable, and that's returning the window object, so I don't know if that's getting on the right track or not. No, it's not. I mean, you can get to the window object without calling anything. So all I did was push a function on and it returns this. And so now I'm going to get this function here. I guess I need to know how many items are on it. So now I have to actually use store, it looks like. Right. So I'm going to store at a position, say five, a function that returns this. Now all I have to call is get 5, parens, and the result of that is going to be the array. No, it's not, but you're getting closer. You do want to store your function. Okay. So the thing you haven't got yet is the value of i. I've never gotten the value of i. Yeah. There is a specific value of i that makes this attack work. Undefined? Nope. Negative 1? No. What'll happen if you say -1? I have no idea. Well, it'll take the brackets, right, the brackets will say I'll take that number, turn it into a string so you get the string -1, and it will then store it in array.-1; that's where it goes. JavaScript group objects don't fill in if you set a really high number, do they? Nope. Just because they're not really arrays. They're really hash tables. Yeah. Does it have to do with the max number, what is that number? No. You're trying to store current.length + 1. No, I guess. You don't know the length. No, you're right, I don't. In fact, for this attack, the length doesn't matter. Okay. So does i where want to pass in this to refer to the array object. I'm sorry. Is i where we want to pass in this to refer to the array object?
No, you want the reference to this to be in the function that you pass in as v. I don't know, it seems to me that you could, first of all, you have to know that this vector that it's an array that you're dealing with, I would think. Right. If you. Yeah, we'll assume that the attacker sees the source code. You said an arbitrary value then you're obliterating that arbitrary value, so let's just choose a random number five, first you've got to get five, so now you've got that. Now you can set five to this function that returns the whole array and then you need to fix five in that array with the one that you got the first time and now you've got the array. But you don't. I don't. No, because in order for the binding of this to happen, you have to call it as a method in this context. Yeah, I mean I'm just playing the console window here. Okay, yeah. I'm sure I'm obviously doing some things wrong here. I act like I pretend like I know what I'm doing here. Right. So there is a specific value of i where you need to store this function. Yes. Okay. You're trying to get this from an exception, you mean? Nope. So are you stumped? So it's not undefined you said. It's not undefined. No. It's not all, it's not one. Can you do a pass of one of the array functions like map? You can, but that's not going to help you. Zero? It's not 0. There's no magic number? Hey come on. Is it 22, is it 3? Is it like vector.prototype or something like that. Nope. It's not some weird magic number. It's not a weird magic number. Forty-two. I was thinking that's high, but in that case, you can bring in a string. In fact, it's not even a number. Is it not a number? It is not, not a number. It's a number! Not a number is a number. So it is not a number like nan. No, it's not nan. It's not nan. How about function? I say we just go for it, what do you guys think? You going to make him tell us? Ikina is curious hacking over there. 
Ikina hacks to have people give up the answer before we solve it, but we could be here all day. What does Chad say? Somebody Google it, then we can pretend like we know it. Look at the answer on Stack Overflow. Is eval in play anywhere in the mix? No, we are not using eval. That's a really good question. Eval is full of trouble, but we're not concerned about eval here. Is it a string? It is a string. Is it forEach? Nope. Is it the empty string? It is not the empty string. GetThis. ToStrings? It is not toString. It's not the string.this. It is not the string.this. Is it every? It is not every. Is it the name of a function that you can call on an array? Yes. Well, okay, is it concat, constructor, copy, within… Yes, it is one of those. Okay. So it has to be an array method. Well, it's push. It is push. So you set. Right, so we --- You store push. --- we store push with a function that has access to this. So then the question is, how do you get that function to get called? Call append. You call append. Ladies and gentlemen.

Identifying Security Vulnerabilities - Solution

So let's look at the attack that he just outlined. So we've got a global variable; I'm calling it stash. Someone suggested a global variable and there it is. Then we will store into push a function which will take this and store it in that global variable. And then, in order to get this function to be executed, you call append. Append will call push, and because this is a method invocation, this will be bound to the array. So this attack exploits several confusions that are in this language. The principal confusion is that there are not arrays in this language; they don't behave like arrays in other languages, but we think that they do. Even the people who build JavaScript engines, when they were shown this problem, and they know how arrays in JavaScript work at least better than anybody, couldn't see it, because everybody thinks this is what arrays do.
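The attack just outlined can be reproduced with a few lines. The vector function is the vulnerable sketch reconstructed from the description earlier; stash is the global variable named in the talk, and myvector is an assumed instance name.

```javascript
// The vulnerable wrapper, as described earlier in the talk.
function vector() {
    var array = [];
    return {
        get: function get(i) {
            return array[i];
        },
        store: function store(i, v) {
            array[i] = v;
        },
        append: function append(v) {
            array.push(v);
        }
    };
}

var myvector = vector();
myvector.append(1);
myvector.append(2);

// The attack: the key "push" is just a property name, so store() replaces
// the array's push with our function.
var stash;
myvector.store("push", function () {
    stash = this;       // bound to the private array on method invocation
});
myvector.append();      // append calls array.push(...), which is now ours

console.log(Array.isArray(stash));  // true
```

After the last call, stash refers to the private array itself, so the attacker can read, truncate, or rewrite it without going through the wrapper.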
So when you have this i here, even though we know we don't have ints in this language, everybody thinks it's an int, right, so it doesn't occur to anybody that this could be anything else. It's just a key. It's just a key. Right. In fact, what JavaScript will do is it will take whatever you pass in, and if it's not a string here, it'll turn it into a string. That's what JavaScript does, but that's not how we think a programming language should behave, right? And when you have a system that does something other than what we expect, you get confusion, and confusion can lead to security exploits, as in this case. The other source of this problem is that all variable bindings in the language are static because of function scope: if array is defined there, then this one will be bound to that one. It's predictable, with the exception of this. This is bound dynamically, and so it's much harder to reason about what the value of this is going to be at any instant, because it can be different depending on where it's called and when it's called, and that turns out, if you're thinking about the security of systems or the reliability of systems, to be a bad thing. That's why I prefer to try to figure out ways of programming in this language that don't depend on this, since this tends to be unreliable. So what do you think of that problem? Was that interesting? So now that you understand the attack, how would you correct this code to prevent that attack from happening? Just to give credit, Vincent in the chatroom, I think he had got this just a little bit before we did, but then he's saying to fix it, he would just force the i to an int, like +i. That's exactly right. I would do that too. So I would put a +i here. What that accomplishes is + will turn the string into a number.
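A sketch of the +i repair, applied to the vector wrapper reconstructed from the earlier description. Coercing in get as well as store is an assumption made here for symmetry; the talk only points at the subscript in question.

```javascript
// Same wrapper, but every subscript is coerced to a number first.
function vector() {
    var array = [];
    return {
        get: function get(i) {
            return array[+i];
        },
        store: function store(i, v) {
            // +"push" is NaN, so a string key can no longer name
            // an array method such as push.
            array[+i] = v;
        },
        append: function append(v) {
            array.push(v);
        }
    };
}
```

With this change, storing under the key "push" lands on a harmless property instead of replacing the method that append relies on.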
If you pass in a number, it just stays a number, so it's a no-op, but if you pass in the word push, it will turn it into a number. What number does push turn into? NaN. And so, it will then turn NaN into a string, so we'll be storing at array.NaN, which is stupid, but at least it's not push. So we get around that. The other thing we could do is not use push, and instead use the old-fashioned way of assigning to the current length. This illustrates another design error in the language: I think something as fundamental as appending to an array should be an operator, it should be something that's just built into the language which can't be corrupted, but that's not what JavaScript allows. There is also the weird way of how these get bound. So ordinarily, when you call a push method, you're going to be using the inherited array.prototype.push, but when we assigned our own push property to this object using store, now push is local, and so we don't look at array.prototype.push. That again doesn't match most people's mindset about how a programming language should work; that's probably not the kind of behavior that you want this function to be able to modify. So what did you think of that problem? Was that fun? Was that interesting? Yes. So why not just give JavaScript real arrays? Would that break it? Oh, it would absolutely break it. There is code all over the internet that assumes this terrible behavior. And you can't just say that ECMAScript 6 is a new standard that has a new… I wish. The web is unlike any other platform, in that if you're developing a server, you dictate what software is installed on it, at what revision level, and so on. You've got control over everything happening on the server, so you can dictate that stuff. And so, you can tolerate breaking changes because you decide when you're going to make the change.
But on the web, you have to run with whatever it is that someone's got installed, and if they're running some awful old thing, your code is still expected to run on that old thing. So for ECMA 6, they're going to have to have new stuff that can eat ECMA 6, right. Right. So when I create a new thing, instead of an array, call it a list. They could do that, in fact, they have added new things in ES6, for example, there is a new thing called a map which behaves like objects should behave, it is more correctly like a hash table. And there are new forms for doing that, but it doesn't look like this, and so, it's going to be very difficult to get people to use it because it doesn't have the syntactic conveniences that this one has. Yeah. I know there are good answers for these things. No, I'm not trying to wave it off. I mean, the web is a really, once we make a mistake in a web standard, we're stuck with it, and all we can do is pile more mistakes on top of it. We can't correct any of the stuff that's ever been done, and so, working with web standards, as you all know, is incredibly frustrating because nothing ever gets better, it just gets bigger. It did get rid of the blink tag. Well not entirely, I mean, you can still write blink and it's not going to give you headaches anymore, but it's still allowed, it's still building structures in the DOM, it's still there. And in fact, you can write CSS, which will cause the blink to come back to life. People have written JavaScript plugins in the CSS, but bring back the blink. Yeah, that's very, very hard to fix the web because when something wrong happens, the problem with the wrong things isn't that they're useless, it's that they're dangerous and weird and there are very clever people who will go wow that's crappy, what can I do with that and they'll figure it out, they'll take the stupidest thing you ever saw and it now becomes part of some popular library and we can't get rid of it anymore. 
Function Challenge 12 So this is a problem similar to the first one. We're going to make a function that makes a publish subscribe object and it will reliably deliver all publications to all subscribers in the right order. So the idea is that we might have multiple parties running in our system and they need to be able to communicate with each other and we will provide them a pubsub instance that they will all share that allows them to communicate, but we want to make sure that it works correctly, and also that it can reliably deliver all of the messages to everybody in the right order. So this is how it might work. We'll have our pubsub constructor, which will make a pubsub instance, which we then share with all the parties, it will have a subscribe method on it, which can receive a function which will be called whenever somebody publishes, and they can publish by calling the publish method, which will then cause all the subscribers to get called. Okay, so you can imagine how you might implement something like that, it's not very complicated. Here is a possible implementation. We've got a pubsub constructor, it has an array of subscribers where it's going to keep the list of subscribers, it will return an object containing two methods, first the subscribe method will receive a subscriber and push it onto the subscribers list and the publish method will loop through all of the subscribers and call each one and give them the message. Okay, so it looks pretty straightforward. And unfortunately, it doesn't do any of the things that we need to guarantee. It is possible to prevent people from publishing and subscribing, it's possible to remove subscribers and cause messages to get delivered out of order, suppress messages, everything that this is supposed to prevent, it will allow. So your job is to figure out how an attacker can do this. So if an attacker is one of the subscribers and wants to screw it up for all of the other subscribers, how does he do that? 
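The implementation described above can be sketched roughly as follows. This is a reconstruction from the spoken description, not the slide itself, so the variable names and the sample subscribers are assumptions.

```javascript
// Naive publish/subscribe object, as described in the lecture.
// It is deliberately insecure; the exercise is to find the attacks.
function pubsub() {
    var subscribers = [];
    return {
        subscribe: function (subscriber) {
            subscribers.push(subscriber);
        },
        publish: function (publication) {
            var i;
            var length = subscribers.length;
            for (i = 0; i < length; i += 1) {
                subscribers[i](publication);
            }
        }
    };
}

// Usage: every party shares one instance.
var my_pubsub = pubsub();
var received = [];
my_pubsub.subscribe(function (message) {
    received.push("a:" + message);
});
my_pubsub.subscribe(function (message) {
    received.push("b:" + message);
});
my_pubsub.publish("It works!");
console.log(received);  // ["a:It works!", "b:It works!"]
```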
Could he use the attack from before and empty out the subscriber's array? That is something to be concerned about. We don't have a store method this time, so we don't provide a way of changing the push property of a subscriber's array, but that's definitely something you want to be thinking about. You push a function though, publish. I'm sorry. You could push a function called publish or push a function called subscribe and call it publish? I think you push a function that deletes the subscribers, for example, inside this because this, when you get to me, the one that I just pushed, which is a function when you call subscribers i and you call it on publication, you now have access to this. You are exactly right. Unfortunately, PowerPoint compels me to force you to do these in a particular order, so please remember that we will come back to it, okay. Sorry, PowerPoint. Yeah, I'm a slave to PowerPoint here. Yeah, so we're going to have to do that, so we're going to have to, but I will call on you to do that again because you are absolutely right. John got it. So let's start with how can I prevent other people from receiving messages? You almost had it earlier. So push a function called publish and override it? No, it's much easier than that. Will this hack fail immediately on the next publish or will it not fail until the publish after the next publish. It should fail immediately. You'll fail instantly. I'm sorry, it'll fail on the next publication. Oh, I can just set some, I can just set publish, can I just set publish equal to null? Yes, unfortunately, PowerPoint compels me to ask you later. Ask you to remember that one too. I try to anticipate what order people are going to come up with these. I'm sorry. No, no, no you're good. We're being tasked with sabotaging, preventing all of those subscribers who have previously called subscribe from getting publications when publish is called. Right. Okay. 
Okay, so since you're getting way ahead of me, let me show the one that we need to do first. Securing the pubsub() Function The simplest attack to prevent people from getting publications is this one, where we simply subscribe with nothing. That means we're going to push undefined onto the subscribers array, so when we go to do this loop, kaboom, right, and so everybody after us doesn't get any messages sent. So how would you fix that? It doesn't solve the problem though, because people before you still get it. Right, but the people after don't. So what exactly are we trying to prevent here? We have to make sure that everybody receives every message. If one person gets it, we can't say that we've succeeded, everybody has to get it. You could just do a type check before you push it onto the array, correct? We could do that, up here we could do a type check, if typeof subscriber === 'function'. You should be wrapping your call in a try catch inside the for loop. Yeah, that's it, because if they were to pass in a function that throws, then the type check wouldn't catch that. So pubsub is something we give out to all the… We give it to all the third parties, they all share that instance. So really in this case, it's kind of like don't trust user input. They're supposed to send a function as a subscriber, but they could send you an empty string. Right. Well in this case, it could be a malicious failure, but it could also be an accidental failure, someone might simply call it incorrectly and we don't want the whole system to fail because of that. So this is what try catch is for: we can simply catch the thing, ignore the error, and keep on processing, so that's good. So we're now ready for your second observation. I said subscribe to null.
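A sketch of the first attack and the try/catch fix discussed above. The names are illustrative assumptions. The point is that a subscriber that is not callable, or that throws, no longer stops delivery to the subscribers after it.

```javascript
function pubsub() {
    var subscribers = [];
    return {
        subscribe: function (subscriber) {
            subscribers.push(subscriber);
        },
        publish: function (publication) {
            var i;
            var length = subscribers.length;
            for (i = 0; i < length; i += 1) {
                try {
                    subscribers[i](publication);
                } catch (ignore) {
                    // A bad subscriber must not stop delivery to the rest.
                }
            }
        }
    };
}

var my_pubsub = pubsub();
var log = [];
my_pubsub.subscribe(function (m) {
    log.push("first:" + m);
});
my_pubsub.subscribe();  // attack or mistake: pushes undefined
my_pubsub.subscribe(function (m) {
    log.push("last:" + m);
});
my_pubsub.publish("hello");
console.log(log);  // ["first:hello", "last:hello"]
```

A `typeof` check in `subscribe` would reject the `undefined`, but only the try/catch also covers a real function that throws.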
Yeah, so we can tamper with the pubsub instance itself and we can delete the property, change either of them to undefined or null or replace them with other functions that could do things more insidious like allow only certain people to subscribe or allow people to think they subscribed when they didn't or to filter the messages when they publish or get tamper with the messages when they publish. There is an infinite number of terrible variations on this thing. So how would you fix that? Use a setter. I'm sorry. So you aren't going to really expose publish so much as you're going to return a function, you're going to return a function that returns a function. I don't know. It's something like a getter and a setter. Like you don't want to set that they can only get it. You could, but there is an easier thing to do than that. Freeze the option there. Yeah, you want to freeze it. So if we freeze the object, then all of those attacks are completely frustrated. We can do this right now in JavaScript. Yeah, freeze was added in ES5, so that's, it's in IE9, and 10, and 11, and in all the good browsers, so we can freeze. So that completely solves this, and so, that's one of the reasons why I like freeze as an object construction pattern because there is a whole lot of stuff we'll never have to worry about if the objects are frozen. And freezing this object does not impair its ability to do what it's supposed to do, that these methods still have access to the subscribers array and still can do all of that dynamic stuff, it's just nobody can tamper with the instance. Okay, we're now ready for your next suggestion, your first one. I forgot it already. Just kidding. We're going to pass in, we're going to subscribe with a function that deletes, iterates through this and deletes everything, something like that. Yeah, something like that. So that'd be something like this. 
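A sketch of where the code stands after the two fixes so far (try/catch and `Object.freeze`), followed by the attack that was just proposed. Names are illustrative assumptions. The tampering attempt is wrapped in its own try/catch because assignment to a frozen property fails silently in sloppy mode but throws in strict mode.

```javascript
function pubsub() {
    var subscribers = [];
    return Object.freeze({
        subscribe: function (subscriber) {
            subscribers.push(subscriber);
        },
        publish: function (publication) {
            var i;
            var length = subscribers.length;
            for (i = 0; i < length; i += 1) {
                try {
                    subscribers[i](publication);  // still a method invocation
                } catch (ignore) {
                    // keep delivering to the remaining subscribers
                }
            }
        }
    });
}

var my_pubsub = pubsub();
var log = [];

// Tampering with the frozen instance has no effect.
try {
    my_pubsub.publish = null;
} catch (ignore) {
    // strict mode reports the failed assignment as a TypeError
}
console.log(typeof my_pubsub.publish);  // "function"

// The remaining hole: the subscriber is called as subscribers[i](...),
// so inside it `this` is the private subscribers array.
my_pubsub.subscribe(function () {
    this.length = 0;
});
my_pubsub.subscribe(function (m) {
    log.push(m);
});
my_pubsub.publish("lost");
console.log(log);  // []: the second subscriber was deleted before its turn
```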
So we're going to subscribe with a function which will get access to this, and that gives us access to the subscribers array because this is a method invocation, even though it doesn't look like one. We're calling a function that is stored in an array, and we don't think of that as being a method invocation, but JavaScript does. So everything to the left of the bracket gets bound to this. So in this case, we're going to set this.length to 0, which will delete all of the subscribers, but we could do much more insidious things as well. Again, there is an infinite number of bad things you can do to this object if you get access to this. And so, this is a big source of confusion again, right: in JavaScript, calls made through brackets are method invocations, but we don't see them as method invocations. So again, this is a confusion which leads to misunderstanding of what our programs do, which makes it possible for bugs and security exploits to happen. Inside this function that you've just written, this is the pubsub object, right, with its var subscribers, that's its scope, right? No, in this case, this is the subscribers array, because this is a method call. Okay, here is your function: you pass it into subscribe so it gets stored in the subscribers array, and in this loop it now gets called, but it's being called as a method. There are four ways to call a function in this language. I think there should only be one, but there are four, and at least one of the forms is confusing as to which kind of invocation it is. So how would you fix that? Your brain says call, no? I'm sorry. My brain says using call. That would be one way to do it. If we say subscribers[i].call and then pass in that, that would be one way to do it. The other thing you could do is assign subscribers[i] to a local variable and then call that variable, or, the way I would prefer to do it, we could use forEach. I'm now distrustful of for loops in general, and I like forEach much better.
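The alternatives mentioned above can be sketched like this. The forEach version is shown in full; the two for-loop alternatives appear only as comments, and the explicit `undefined` passed to `call` is an assumption, since the transcript does not say which value was passed.

```javascript
function pubsub() {
    var subscribers = [];
    return Object.freeze({
        subscribe: function (subscriber) {
            subscribers.push(subscriber);
        },
        publish: function (publication) {
            // Alternatives that keep the for loop:
            //     subscribers[i].call(undefined, publication);
            //     var s = subscribers[i]; s(publication);
            subscribers.forEach(function (s) {
                try {
                    s(publication);  // plain function call, not a method call
                } catch (ignore) {
                    // keep delivering to the remaining subscribers
                }
            });
        }
    });
}

var my_pubsub = pubsub();
var log = [];
var sawArray;

// The earlier attack relied on `this` being the subscribers array.
my_pubsub.subscribe(function () {
    sawArray = Array.isArray(this);
});
my_pubsub.subscribe(function (m) {
    log.push(m);
});
my_pubsub.publish("delivered");
console.log(sawArray);  // false
console.log(log);       // ["delivered"]
```

Because forEach hands each element to the callback as an argument, the subscriber is invoked by a plain name and never sees the array.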
So I can pass to forEach a function which will call each element of the function and ignore any exceptions and I really like this, I think this is very, very nice. So I'm not using for loops anymore, I'm doing this stuff instead, and because it's passing in each individual element, there is no confusion about how this gets called. It's never going to be a method invocation. In fact, this function doesn't even see the array, all it sees are the individual elements. Remind us how to escape forEach. I'm sorry. Remind us how you break out of the forEach. You use every, instead of forEach, and we would design it so that this function would always return true. It returns false, then it leaves. Then yeah. False is the exit signal. On the previous slide, somebody wants you to clarify why does this refer to the array. Right, so this is the function that is being executed here and this form of method invocation, of function invocation is the method form. The method form will have a dot in it or a bracket in it, and so, everything to the left of the last dot or last bracket gets bound to this and that's why that happens. Securing the pubsub() Function - Continued There is still one attack left and this one is an out of order attack that the attacker can cause a message to be delivered before another message, at least to some subscribers, and by doing so, he can create confusion in the marketplace. So how do you do that? Meaning that the subscriber should be getting the messages in the order that they subscribed. Each of the subscriber should get them in the order that it subscribed and they should also each of them should get the messages in the order that they are sent. Subscribe with a number. I'm sorry. Subscribe with a number? No, that would cause an exception to get thrown and we're already ignoring those. So we're assuming at this point that we're working on the forEach loop rather than the previous. 
Either one, but we can focus on this one if you like since we can see it. But the corrections that we've made up until this point haven't fixed this new one. So much like we pushed push onto the subscribers array, can't you push any of the other methods onto that to screw with the order of it or reverse it? No. You're talking about the first problem we did? Yeah. Yeah, that one was using store because we could name where it was going to go. In this one, we haven't provided an API for doing that. Got you. And the good thing about this construction is that that's impossible because, well, assuming that we've fixed the thing where they can get access to the subscribers array through this, there is no way that they can add a property to subscribers, except through the subscribe function. S doesn't have access to this anymore. That's correct. Are you trying to get the messages out of order or the subscribers out of order? The messages out of order. Just setTimeout? Could it just be asynchronous and, I mean, it'd just change where yours comes in. Yeah. So timeout might be part of the solution, but it's not part of the problem. You have like wait? We don't have wait in this language. Yeah, because that's synchronous anyways. If it was, it would just cause everything to just… Exactly. And in fact, wait in a system like this is a form of denial of service attack because it stops everything, so that's not part of this. Every time somebody calls publish, publication is, that's the message, right, it's going to iterate through the subscribers and call each one. And then what you're saying is I call it and then you call it after me, you're the bad guy, you call it after me, but somehow yours jumped the line in front of mine. That's what we're looking for. But you can't call it until it's finished with mine because the process is locked, but I guess apparently, I'm wrong. Right.
Can I pass in the pubsub object to my function and then resubscribe myself in the midst of a call? You're close. It's not quite like that, but it's that kind of thing. Is this a vulnerability in forEach? No. Because these are forEach is synchronous. Right. So are we essentially creating an infinite loop based on what he said. We are not looking for an infinite loop because an infinite loop would be a denial of service attack and someone can do an infinite loop without calling our code. They can just sub in their own code and do an infinite loop, so that's not something this code can defend against. And instead of forEach, can you change a value of s? Yes. So if I publish one object and then later on I publish another object and that second object has a function in it that sets s back to the first object, is that a nasty thing to do? Well that's not likely to happen because whatever function you're going to be doing with that will be created outside of the scope and so, it will not see s. Oh yeah, duh. Only functions in this scope can see s. So if I pass in the pubsub object to my subscribers function, when my function gets called, I call public again and then restart the loop in the beginning again. Exactly. And we keep on doing that and the end people never get it. Right. So this is the attack he just described. So he subscribes a function which will publish and this function gets called in the publication loop, so he can then cause his own message to get published and delivered to everybody who is after himself in the subscriber list, and in doing that, he causes things to go out of order. Leonardo in the chat room beat us again. Way to go Leonardo. Well Vincent beat us last time, but this time it was Leonardo. You need to be telling us that before. I know. I didn't know exactly. He needs some glory. Yeah. Okay, so the reason I used the limit function that we wrote yesterday because if I don't, then I'm just creating a denial of service attack. 
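A sketch of the out-of-order attack just described. The `limit` helper here is an assumed stand-in for the one written in the earlier workshop (its real signature is not shown in this section), and the subscriber names are illustrative.

```javascript
// Assumed shape of the `limit` helper: the returned function
// calls `func` at most `count` times and then does nothing.
function limit(func, count) {
    return function () {
        if (count >= 1) {
            count -= 1;
            return func.apply(undefined, arguments);
        }
        return undefined;
    };
}

function pubsub() {
    var subscribers = [];
    return Object.freeze({
        subscribe: function (subscriber) {
            subscribers.push(subscriber);
        },
        publish: function (publication) {
            subscribers.forEach(function (s) {
                try {
                    s(publication);
                } catch (ignore) {
                    // keep delivering to the remaining subscribers
                }
            });
        }
    });
}

var my_pubsub = pubsub();
var log = [];
my_pubsub.subscribe(function (m) {
    log.push("early:" + m);
});
// The attacker publishes from inside the publication loop, once.
my_pubsub.subscribe(limit(function () {
    my_pubsub.publish("Out of order");
}, 1));
my_pubsub.subscribe(function (m) {
    log.push("late:" + m);
});
my_pubsub.publish("first");
console.log(log);
// ["early:first", "early:Out of order", "late:Out of order", "late:first"]
```

The late subscriber sees "Out of order" before "first", even though "first" was published earlier.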
It will keep calling it over and over again recursively, and a denial of service attack doesn't accomplish the security exploit that we're trying to do, it just impairs the system. Limit is what we wrote yesterday? Yeah, we wrote limit yesterday, so that means when I get called, I am going to send one message and that message will get delivered before everybody else's. So how would you fix that? Every time somebody publishes a message, my message will go out, oh, and it will only ever happen once. Well, but I can control it. This is the simplest way to keep it from getting abusive, but obviously, you could put a smarter function in there that could change its behavior every time it's called. You could type check the subscriber and make sure it's not an object or a function, right? Type check the subscriber before you push. No, the subscribers have to be functions. In fact, that's what the try catch was for, to deal with the ones that aren't functions. Okay, can you, never mind. Would you just temporarily change the value of publish to something else? But it's frozen. Right, yeah, we fixed that one. I was just trying to think of a way to disable publish inside of this loop, so you just set, yeah, you put a var at the top of this that says publish off. Yeah, we could put a Boolean up at the top which says we're in publication mode, so subscribe or publish is turned off while we're looping. That would certainly work. The way I would approach it is I would do an asynchronous solution because I like asynchrony. So there is something that's in browsers and in Node called setTimeout. setTimeout receives a function and causes that function to get executed in the future, and you can give it a time which says no sooner than 0 ms from now, call this function. But this function happens in a different turn, so it'll get scheduled after all of the current work in the current turn is finished.
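A sketch of the asynchronous approach being described, assuming an environment that provides `setTimeout` (a browser or Node). The attacker and the subscriber names are illustrative, and the final 50 ms timer exists only so the demo can print after the queue has drained.

```javascript
function pubsub() {
    var subscribers = [];
    return Object.freeze({
        subscribe: function (subscriber) {
            subscribers.push(subscriber);
        },
        publish: function (publication) {
            subscribers.forEach(function (s) {
                // Each delivery runs in its own future turn.
                setTimeout(function () {
                    s(publication);
                }, 0);
            });
        }
    });
}

var my_pubsub = pubsub();
var log = [];
var attacked = false;

my_pubsub.subscribe(function (m) {
    log.push("early:" + m);
});
// The same attack as before: publish from inside a delivery, once.
my_pubsub.subscribe(function () {
    if (!attacked) {
        attacked = true;
        my_pubsub.publish("Out of order");
    }
});
my_pubsub.subscribe(function (m) {
    log.push("late:" + m);
});
my_pubsub.publish("first");

setTimeout(function () {
    console.log(log);
    // ["early:first", "late:first", "early:Out of order", "late:Out of order"]
}, 50);
```

The attacker's publication is queued behind the deliveries that were already scheduled, so every subscriber still sees "first" before "Out of order".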
So this would cause everything to get lined up in the timer queue, and then if someone wants to subscribe during a publication, their stuff will get added later in the timer queue so everything stays in order, we're using the timer queue to sort the messages. So you move the publication off into a separate queue that gets processed independently and whatever goes on here. Right. And it also means I don't need to try catch here now because if it fails, it'll fail in that turn, but then the system says well that turn is finished and it'll go onto the next one. So everything keeps going. Now it turns out there is one hazard in this because of a design error and the way setTimeout works. When you call setTimeout, it returns a number and you can pass that number to clear timeout and that'll prevent things from happening. Unfortunately, it is an easily guessable number, and that means if the attacker can guess what the next clear timeout number is going to be, he can prevent messages from being delivered by cancelling those messages that are in the queue so we would have to fix that, but we'll fix that one another day. Principles of Security The History of Security Okay, so those last two exercises are the introduction to the subject of security and security is a really, really important topic, and unfortunately, it's not well understood at any level in our industry. So some people think of security as a war between people in invisible colored hats, they're the white hat guys and the black hat guys, and white hat guys are probably good guys and black hat guys are probably bad guys, except there are famous white hats that used to be black hats, and black hats who used to be white hats, and gray hats who seem to be playing both sides, and it turns out you cannot easily identify somebody by their invisible hat. In fact, this is a model of security that just doesn't work. Security is not about hats. 
The thing that's even worse about that model is that it says that security belongs to the specialists and that model doesn't work. So in the specialist model, it means you've got security experts who are responsible for all the security and nobody else is responsible for security, and in fact, everybody else is working against the guys who are helping to deal with security and that model is not effective at all. One consequence or one source of insecurity is that things change. It might be that within a limited scope or a limited context, you could do things and there is no security vulnerability in doing that, but then things change and the scope increases or whatever and now suddenly that turns into a big security problem. It is not unusual for the purpose or use or scope of software to change over its life. Rarely are the security properties of software systems reexamined in the context of new or evolving missions, this leads to insecure systems. Do you know who wrote this? I did. I wrote that, that's me. So I'm going to be giving you a set of principles and most of them are pretty simple and pretty obvious and once you've got it, you're able to reason about security on your own. The world of security is incredibly complicated and is always changing and it's impossible for any human to keep on top of it, particularly, a human who has a day job, there is just no way, but the set of principles is actually very small, and if you can understand the principles, you can work out most of the rest of it on your own. Security is not obtained by tricks or hacks. If you ever encounter someone who says we can secure our system by doing this trick, that person is either misinformed or lying because it turns out tricks don't work, it's only adherence to principles that works. So one of the first principles for online security of computer systems is that deterrence is not effective and that's because you can't publish or you can't punish an invisible attacker. 
Now in the real world, deterrence is very effective, that's why we have not blown ourselves up yet, but online, the attacker doesn't need to be awake while he's attacking you, so there is no threat you can make to someone you can't see or touch to prevent them from doing things. So the only thing that works is prevention. Prevention is the only effective mechanism. So I'm going to stop here and tell you a story about Johann Martin Schleyer. Schleyer was a priest living in Bavaria, and one night, God came to him in a dream and told him to do something. Now in order to understand that story, I need to tell you an earlier story. Long, long ago on the plain of Shinar, the world's best architects, builders, engineers, material specialists, and workers got together to build a tower that would reach to heaven. It was the biggest construction project in the history of the world at that point and it was an amazing project, and God was not happy about it. We don't know what his complaint about the project was, but he did not want it succeeding, so he decided to cause the project to fail, and he had a lot of options available as to how he was going to fail the project. He could have caused earthquakes, he could have flooded the plain, he could have thrown thunderbolts down on them. Instead, he decided to go down and confuse their speech, and after he did that, they could no longer understand each other when they spoke, and being unable to communicate, the project collapsed and they all wandered off and started their own countries. Basically, he created the I18N problem. So thousands of years later, He comes to Schleyer while he's sleeping and says I've changed my mind about that and what I want you to do now is to create a language that everybody in the world can understand. So Schleyer wakes up and he begins working on his language, which he calls Volapuk, Volapuk being the word in Volapuk meaning world speak.
He based his vocabulary on English, but he transformed it so much that it was no help to English speakers at all. It's hard to see the words world and speech in there, but that's where the roots came from. And he was told that English speakers did not have a problem with umlauts, but I can tell you this English speaker has a lot of trouble with umlauts. But he published his language anyway. He worked for about a year and then published a book in German about Volapuk in 1880. Now people had been designing artificial languages for many years before Schleyer. John Wilkins and Dalgarno, in England, were doing similar sorts of things, and there have been lots of artificial languages after, but Schleyer was doing this at just the right time. Europe had been in a fairly constant state of war for centuries and people were getting really tired of it. They had observed what had just happened in the US with the Civil War, where new technologies were coming onto the battlefield for the first time, and the terrible devastation that happened there, and they were very concerned that this was going to get really, really bad. There was a lot of interest in trying to solve the world peace problem, and a lot of them saw Volapuk as a method for doing that: Volapuk would allow us to experience debabelization, where we would break down the language barriers between countries, and maybe if we could communicate more effectively, the coming war could be avoided. And so, the Volapuk movement took off all over the world, not just Europe, but everywhere. In cities all over the world Volapuk societies were being created, books were being published about and in Volapuk, new journals written in Volapuk were coming online every month, and it was reaching all classes of society, it wasn't just the intellectual elites learning this stuff. Everybody was interested in Volapuk and it was really taking off, and a lot of the success of the language was due to this guy.
This is Auguste Kerckhoffs. He was a Dutch linguist, and he wrote extensively about Volapuk and traveled all over the world teaching the language, and he was so successful at doing that, that at the second Volapuk conference, he was elected to be director of the Volapuk Institute and given the job of popularizing Volapuk throughout the world. The next year, at the third congress, they did all of the business of the congress in Volapuk, even the waiters serving the meals were speaking Volapuk, and Kerckhoffs at that meeting proposed a change to the language. There were some moods in the language which were rarely used, that were very complicated and difficult to teach and learn, and Kerckhoffs was proposing that they be removed from the language in order to make it more accessible to everybody. Basically, Volapuk, the good parts. I love this guy, he was great. Schleyer was extremely upset about this. Schleyer insisted that the language belonged to him and he demanded a veto over anything that the congress might propose. At that point, the movement forked. About half of the attendees were in the German delegation and they went with Schleyer, everybody else went with Kerckhoffs, and then suddenly, chaos. There were all these cunning linguists that said, well, as long as we're proposing changes, I've got some ideas, and they started throwing new features out for what could go into Volapuk, and other people were saying, well, Esperanto is a better language, we should go with that one, and the thing collapsed. Almost overnight, Volapuk was dead. So we don't know what would have happened had Volapuk succeeded, but we do know what happened after Volapuk failed: the world had its bloodiest century in history. So they ended up with Rebabelization. After it collapsed, they had more languages than when they started, and people continued to design languages. It's similar to the compulsion that causes some people to design programming languages.
The guy who created The Saint, you remember that TV show and the movie, he designed a language called Paleneo. The guy who designed the board game Careers designed a language called Loglan. Maybe the most famous of all language designers was J. R. R. Tolkien, who designed languages for epic races like elves and dwarfs and wrote epic poetry in those languages and then wrote histories to explain the context of those epics and then used all of that as the back story to the Lord of the Rings. He once gave a talk about his compulsion, he called it a secret vice. Anybody happen to know what the most popular invented language in the world is today? Klingon? Very good. It is Klingon. Cryptography I told you that story because I want to introduce you to Kerckhoffs. Kerckhoffs was an amazing guy. After the Volapuk thing exploded, things went really badly for him and his health suffered, but just before he got involved with them, he wrote the first book about modern cryptography. Prior to Kerckhoffs, cryptography was understood as the science of secret codes, generally, codes written on pieces of paper. Kerckhoffs reinvented cryptography based on electronic communication, because they were now sending messages through the telegraph and the properties of the telegraph are radically different from the properties of paper, and he worked this stuff out and came up with a set of principles which are still understood to be true today. He was one of these amazing thinkers who was just so smart and so far ahead of his time. So one of the things he wrote is called the Kerckhoffs Principle, and he said the design of a system should not require secrecy and compromise of the system should not inconvenience the correspondents. So let me demonstrate what he's talking about here. So we have two correspondents, Alice and Bob, and they need to communicate and they don't want to be observed while they're doing that, so they have a cryptographic system.
In this cryptographic system, Alice takes her message, her plain text, and puts it in an encryption machine. She also puts a key into the encryption machine and out comes a Cypher text. She then transmits the Cypher text to Bob and it's entirely possible that wire is being observed and anybody can see it. Bob receives the Cypher text, he puts it in his decryption machine, which may be identical to Alice's encryption machine, and also puts in the same key, which they exchanged in a prior meeting and that allows him to obtain the plain text, and this is basically how all cryptographic systems work. The thing that Kerckhoffs said which was so amazing was that there are no secrets in the encryption machine, the only secret in this system is the key, and I still see people today, academics who are saying we should try to keep as much of our system private as we can in order to frustrate the attackers and Kerckhoffs said no, just protect the keys, that's all you have to do and he was right. And he was right because there is no security in obscurity and it turns out the more secrets you have, the harder they are to keep and just keeping the keys safe is really hard so you should focus all of your attention on that. There are some people who take the Kerckhoffs principle so far as to say we should publish everything about our system, except our keys, and the good thing about that attitude is you're never confused about what are the important secrets in your system. So cryptography is not security, but the way that cryptographers think about problems is a really useful way of thinking and we should think like that too, so I'm going to try to teach you to think like a cryptographer and I'm going to do that by teaching about a cryptographic system called the One Time Pad. 
This is the only cryptographic system which is guaranteed to be truly unbreakable and all other cryptographic systems if you have enough time and enough computer power, eventually you can brute force try every key and eventually you'll break the system, not possible with this system, guaranteed unbreakable, which is pretty nice. There are rumors that the hotline between the White House and the Kremlin is encrypted with the One Time Pad. So this is how it works. We have a number of rules that we have to follow. The first one is that the key must always remain secret, this is true of every cryptographic system. Number two, the key must be at least as long as the plain text. Now this is really unusual. In most cryptographic systems, the key is some fixed size and the messages tend to be much longer than the key. This one says the key has to be at least as long as the message. Then the cypher text is obtained by exclusive or-ing the plain text and the key. That is the algorithm, that is the implementation of this system, so it's really, really simple. So let me demonstrate it to you. This is my plain text, this is the JSON logo, I'm using it because I can. Usually, you think of a plain text as being a text, but this is an array of ones and zeros, right, so we can encrypt that. This is my key, the key is the same size as the message. I'm going to exclusive or them together, and if I do this correctly, you should not see any sign, any ghost, and whisper of the original message. If the message is completely obscured, hidden in the noise, then I have encrypted it correctly. The next rule says that the key must be perfectly random or cryptographically random. Getting that kind of randomness is hard and I went to a lot of trouble to get that into my key. So let me show you what happens if you don't. 
So this is a key, it looks just like the key I showed you first, except I made this one in Photoshop with a noise filter and Photoshop's noise filters are not guaranteed to be cryptographically random. So if I exclusive or my plain text with this weak key, you can see it, right, it's leaking through. This is what cryptanalysts are looking for when they're breaking codes, they're looking for any kind of pattern, any kind of signal that's leaking through the noise. You are breaking an unbreakable code with your brain and you're not even working hard at it and the reason you're able to do that is your visual system is extremely effective at separating signal from noise and that's what cryptanalysts do when they're trying to break a code. The next rule is so important, it is the name of the system, a key must never be used more than once. So the metaphor for how the system is used is I will print two identical books of random numbers, I will keep one, I will give you the other one. When I want to encrypt a message to you, I will take the first sheet, I will tear it out, I will encrypt with it, I will then destroy it. When you receive the message, you will take the corresponding sheet out of your book, you will decrypt with it, and you will destroy it. You will never use it again, that's why it's called the one time pad. So what happens if we violate that rule? So this is a picture of me, this was taken in Istanbul, I'm using it because I can. This you'll recognize is the first key that I showed you, this is the first good key, and I'm going to encrypt my picture with that, and again, it's completely hidden. I'm now going to exclusive or this one with the first cypher text that I obtained, and what happens is because they each are exclusive or'd with the key, the key will cancel out and I'm left with the two messages exclusive or'd with each other and there is no security in that. So cryptography is not security, but cryptography can teach us stuff about security.
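The exclusive-or step and the "never reuse a key" rule can be sketched in a few lines of JavaScript. This is only an illustration under some assumptions: it runs on Node.js, the helper names `xorBytes` and `makePad` are made up for the example, and the pad comes from `crypto.randomBytes`, which is a cryptographically strong pseudorandom source rather than the perfectly random key a real one-time pad calls for. The two short messages stand in for the two pictures.

```javascript
const { randomBytes } = require("node:crypto");

// XOR two equal-length byte buffers. With a one-time pad, the same
// operation both encrypts and decrypts.
function xorBytes(a, b) {
    if (a.length !== b.length) {
        throw new Error("operands must be the same length");
    }
    const result = Buffer.alloc(a.length);
    for (let i = 0; i < a.length; i += 1) {
        result[i] = a[i] ^ b[i];
    }
    return result;
}

// Hypothetical helper: draw a pad exactly as long as the message.
function makePad(length) {
    return randomBytes(length);
}

// Two messages of the same length (14 bytes each).
const first = Buffer.from("ATTACK AT DAWN", "utf8");
const second = Buffer.from("RETREAT AT SIX", "utf8");

// Correct use: encrypt with the pad, decrypt with the same pad.
const pad = makePad(first.length);
const firstCipher = xorBytes(first, pad);
const recovered = xorBytes(firstCipher, pad);

// Breaking the rule: encrypt a second message with the SAME pad.
const secondCipher = xorBytes(second, pad);

// An eavesdropper who XORs the two cypher texts cancels the pad out
// and is left with the two messages XORed with each other.
const leaked = xorBytes(firstCipher, secondCipher);
const messagesXored = xorBytes(first, second);

console.log(recovered.toString("utf8"));   // ATTACK AT DAWN
console.log(leaked.equals(messagesXored)); // true
```

The last line is the point of the demonstration with the two pictures: the eavesdropper never sees the pad, yet it drops out entirely once it has been used twice.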
Something I've found is that often we think incorrectly about how to use cryptography. If I'm looking at someone's system and there are obvious security vulnerabilities in it and I point them out to the architect, the architect's first response is we need to encrypt something. That turns out not to work, but thinking like cryptographers, that does work, and one of the tools that they use for thinking is they have a cast of characters who they use for reasoning about their systems. We've already met Alice and Bob, they are standard characters in the cryptographic show, but there are lots of other characters, for example, there is Eve. Eve is an eavesdropper. Eve might have a packet sniffer and is looking, Eve might be NSA, for example. Eve is looking at everything that's going between Alice and Bob and Eve is hoping to accumulate enough stuff that eventually she'll have enough information to break their system or at least be able to observe the pattern of their communication and do traffic analysis. Then there is Mallory. Mallory can do man-in-the-middle attacks. Mallory might be operating a public hotspot, and so, Bob connects to Mallory's network thinking that he's going to connect to Alice, but he's actually connected to Mallory. Mallory connects to Alice and Bob tries to log in, Mallory asks Bob, what's your password, Mallory connects to Alice, Alice says what's your password, Mallory gives Bob's password to Alice. Mallory tells Alice, please transfer my funds, Mallory says to Alice, please change my password. Mallory says to Bob, sorry we're down right now, please come back later. Is that something we should be worried about? Yeah, we should be worried about that, yeah. And then finally, in my own practice, I have a character I call Satan. Satan is totally malicious, he is very, very powerful, he is very, very smart, he is very, very determined, he is financed, he's got the tools, and he wants to hurt us and our customers.
The approach that some people take to him is to say well we won't let him connect to us. If you're Satan, you can't connect, that's called a blacklist and that doesn't work. On the internet, it is so easy to create new identities that you can't do that because he'll just be somebody else. So the other approach is we want to allow everybody to connect to us who is not Satan. That's a whitelist and that doesn't work either, in this case. The only thing that works is recognizing that Satan is going to connect to us and he is going to do to us and our customers whatever we allow him to do to us and our customers, and so we have a responsibility to do everything right so that the capabilities that we give to Satan are so limited and so weak that he cannot cause us harm. That is the only thing that works. Nothing else works. So because of that, security must be factored into every decision we make. Everything we do has security implications and can either make things better or worse for Satan. One big source of insecurity is the idea that we'll go back and make it secure later. It's very common that engineers who are building something will think the hard part is going to be to get the systems to boot, or the boxes to talk, or the pixels on the screen, and once we've done that, in 2.0 we'll go back and add security. It turns out that doesn't work. Security is a really hard thing to retrofit. You need to get the stuff right at the beginning and all along all the time. You can't add security just as you can't add reliability. These are things that work by removal: you have to remove insecurity, you have to remove unreliability, you can't add them. Having survived to this point does not guarantee future survival. Everybody who does business on the web has been under attack since the beginning of the web and they have not destroyed us yet, but that does not guarantee that they won't, and so, you have to always remain vigilant.
The impossible is not possible so don't expect impossible means to protect you. If a measure is not effective, it is ineffective and we should not be wasting any time or attention on it. Don't attempt to prohibit what you can't prevent, and what you don't prevent, you allow. False security is worse than no security because if you think you're, if you know you're not secure, you'll be cautious and if you think you're secure and you're not, you will be reckless. Security and the Browser So that brings us to the browser. So the browser platform is horribly insecure, we are still fixing it later after over 20 years. HTML5 made things worse instead of better by providing powerful new capabilities to the attacker without mitigating any of the pre-existing weaknesses, and yet, it is still the world's best application delivery system, it is better than everything else including systems that were designed after the web. Everybody refused to learn the web's lessons. So one of the things that the web got right that virtually every other platform has gotten wrong is the web does not have a blame the victim security model. A very common thing in systems is that if a system has to make a decision about security and does not have enough information to make a correct decision, it will ask the user and it will ask the user in language that the user cannot understand, and if the user says no, then it fails. And if the user says yes, it is the user's fault for giving up their security. This is not a valid model of security, but we see it all the time. For example, I recently bought an alarm clock at the Android store and it was only after I bought it that I looked at the permissions and one of the permissions it wanted was unlimited access to the internet. 
Wait a minute, why does a clock need unlimited access to the internet and my choices are to now let my clock leak everything it can find in my device and send it who knows where or I'm out the money I spent on the clock, and that's not a good model of security. So the thing that the browser got right that everything else got wrong was the answer to this question, whose interest does the program represent? From the beginning of computing, it has always been assumed that a program represents the owner of the machine or at least the owner of the account, and the browser says no, the program represents a site and a site does not necessarily represent the user, which was absolutely right and nobody else has figured that out. Unfortunately, the web got a lot of other things wrong. The web didn't anticipate that there could be more interests involved in a page than just the user and the site and that a malicious party can exploit coding conventions to inject malicious code and that malicious code gets all of the rights of the site, this is known as the XSS problem. We'll talk more about XSS in a minute, but to ground this, I mean, you're always hearing people whining about security and how terrible all the problems are, let's get really specific in this XSS attack, what can actually go wrong? What can an attacker do if he can get some script onto your page? The attacker can request additional scripts from any server in the world. Once it gets a foothold, it only needs about that much text in order to do that, it can then obtain all the scripts it wants. Browsers have a security policy called the same-origin policy which limits the ability of the browser to get data from other servers, but there is no limit at all on how much program you can load from the most evil server in the world. An attacker can read the document.
The attacker can see everything the user can see and even things the user can't see, the attacker can see comments, and hidden fields, cookies, everything we transmit to the browser, the attacker gets access to. The attacker can make requests of your server and your server cannot detect that the request did not originate with your application. If you're using SSL and you should, the attacker gets access to your secure connection. If your server accepts SQL queries, then the attacker gets access to your database. Now if instead your server is constructing SQL queries based on information that it gets from the browser, then you're only probably giving access to the attacker to your database and that's because SQL was optimized for SQL injection attacks. Thank you. An attacker has control over the display and can request information from the user and the user cannot detect that the request did not originate with your application. So modern browsers have anti-phishing chrome, and unfortunately, most users don't pay any attention to that, but if they are paying attention to it, in this case, the browser is saying it's good because all the browser is concerned with is where did the HTML come from. The browser doesn't pay any attention to where did the script come from and the script could have come from anywhere and the script is what's going to cause the damage, not the HTML. So knowing this, on some sites, whenever something dangerous is about to happen, they will ask the user to type their password again and the theory is that only the user will know the password and that way we can be confident that this is not going on, except that the attacker has control of the display, and so the attacker can ask the user what's your password and the browser is saying good, tell him. 
So if you're operating one of these sites which asks people to reidentify themselves at unlikely times, what you're actually doing is training your users to give up their passwords the instant the attacker gets control. Then the attacker can send information to servers anywhere in the world, everything that they learned by scraping the page, by talking to the user, by querying your database, they can then send this to the most evil server on the planet. Again, the browser has a same-origin policy which limits their ability to receive data from other servers, but it puts no limit on their ability to send all of this stolen information. So in the browser, is this starting to freak anybody out? Why are we doing business on this platform? Well I thought that I just kind of walked through, well, CORS takes care of all of this for us, right? No. Nobody is taking… I guess I'm really waiting for you to explain to me, I don't know, where, I want to know what I'm doing wrong. I think I'm making it clear. Anyway, the browser does not prevent any of these. Web standards require these weaknesses. If you were to build a new web browser from scratch, if it didn't subject you and your customers to all of these potential problems, it would not be standards compliant. The consequences of a successful attack are horrible, there is harm to customers, there is loss of trust, there are legal liabilities, there is even talk now about criminal liabilities for negligence for allowing people to have their identity stolen. Cross Site Scripting So this attack, which I just outlined, is called the XSS attack, which stands for Cross Site Scripting. Now it should be called CSS, but there was another abomination called CSS so they call it XSS, but the name is a problem. The name says that there is something wrong with cross site scripting, but in fact, that is a highly desirable thing.
We want to be able to have sites cooperate with each other and businesses and services cooperate with each other which might be located on different sites, that's a desirable thing, it's not a bad thing. And there are forms of this attack which do not require the use of a second site, so the name is just completely wrong, but the security experts who first identified this attack did not understand what this attack was, they gave it an incorrect name, but the security industry is still operating on that name and they're expecting you as practitioners to understand what they mean when their language is incorrect. Cross site scripting attacks were invented in 1995 when JavaScript was first released and they have been going on ever since. We're just now on the 20th anniversary of cross site scripting attacks. So we have made some baby steps, for example, there is a Content Security Policy thing that's in browsers now. It took a long time to get it out, but it's out there. Unfortunately, most sites aren't using it so it's still unsafe by default. In fact, in order to use it, it means that a lot of common practices now become illegal as they should, but because they're common practices, it's not going to get used. A mashup is a self-inflicted XSS attack and it turns out advertising is a mashup and the most reliable cost-effective method to inject evil code is to buy an ad. So why is there XSS? So when things go wrong at an architectural level, it's never just one thing. You need a lot of things to go wrong simultaneously and that happened on the web. The first one is that the web stack is too complicated. There are too many languages each with its own encoding, quoting, commenting, and escaping conventions that can all be nested inside of each other. I've called this the turducken problem. It's very difficult to look at a piece of code and determine that it's going to be benign in all contexts.
That's a very hard analysis to do and it's made even worse because browsers do heroic things to try to make sense out of malformed content. In fact, HTML5 began as an attempt to standardize the terrible stupid things that browsers do to try to make sense of invalid HTML. Then that's compounded with the popularity of template-based web frameworks that are optimized for XSS injection. I hate templating, we'll talk more about that later. And then finally, the JavaScript global object gives every scrap of script the same powerful capabilities, and yet as bad as it is at security, the browser is still a vast improvement over everything else. So the problem is not a cross site scripting attack, it is a confusion of interests. The browser distinguishes between the interests of the user and of the site, but it did not anticipate that there might be other interests involved and that's where it fails. So within a page, interests are confused. So an ad, a widget, an AJAX library, an analytics library, anything that gets loaded from a third party, all of that code gets exactly the same rights as you do. It means you are trusting everybody in all of those other institutions. Now JavaScript gets close to getting it right and I think JavaScript can be repaired and become an object capability language, we'll talk more about what that means. HTML, I don't have any hope that we're ever going to fix HTML, and the DOM, which is that hideous API that HTML uses, is horribly insecure, I don't see that going away either. So this stuff is not going to get fixed in a hurry, so it's up to web developers to create secure applications on an insecure platform and that is really, really hard. It shouldn't have to be that hard, but it is that hard. Object Capabilities But there is hope in the Principle of Least Authority, which teaches that any unit of software should be given just the capabilities it needs to do its work and no more.
Capabilities can be seen in the actor model, which was discovered at MIT in 1973. The investigation of the actor model led to the invention of Scheme, which led to higher-order functions and all that good stuff. It also led to actors. So in the actor model, an actor is a computational entity, it's like asynchronous object systems. An actor can send a message to another actor only if it knows its address, an actor can create a new actor, and an actor can create messages. Web workers are kind of like actors, web services aren't, but should be. There was a system called Waterken that was developed at HP Labs, which applies the actor model to web services, I think this stuff is brilliant. It gives you very reliable systems, I highly recommend getting into that. So one of the things that comes out of the actor model is the idea of capabilities. So an address to an actor is a capability or a reference to an object is a capability. So let's apply capabilities to object-oriented programming looking at object capabilities. So here is an object, A is an object, it has state and behavior. Object A has a reference to Object B because objects can have references. Object A can communicate with Object B because it has that reference. Object B provides an interface that constrains access to its own state and references, so Object A does not get access to B's internals, only to its interface. This is why I like freezing in JavaScript because in JavaScript that is the only way to guarantee this, the only way to make good, reliable objects. Object A does not have a reference to C, so A cannot communicate with C. It's as though there is a firewall between A and C, except that firewall is implemented at 0 cost. An object capability system is produced by constraining the ways that references are obtained. A reference cannot be obtained simply by knowing the name of a global variable or a public class.
In fact, there are exactly three ways to obtain a reference, by creation, by construction, and by introduction. By creation means that if a function creates an object, it gets a reference to that object. By construction means that an object may be endowed by its constructor with references, this can include references in the constructor's context and inherited references. Then three, this is the interesting one, by introduction. So here, A has references to B and C. B and C have no references to each other so they cannot communicate, but A wants them to be able to communicate. So A communicates with B passing a reference to C, and once that is done, B has now acquired the capability to communicate with C, that's why this is called the capability model. If references can only be obtained by creation, construction, or introduction, then you may have a safe system, and if they can be obtained in any other way, you don't. So potential weaknesses to watch out for include arrogation, corruption, confusion, and collusion. Arrogation means to take or claim for oneself without right, this includes global variables, public static variables, standard libraries that grant powerful capabilities, address generation, which is why C++ can never be a secure language, and known URLs in web systems. Corruption, it should not be possible to tamper with or circumvent the system or other objects. Again, that's why freezing is so critically important. Number three, confusion, it should be possible to create objects that are not subject to confusion because a confused object can be tricked into misusing its capabilities. And then finally, collusion, it must not be possible for two objects to communicate until they are introduced. If two independent objects can collude, they might be able to pool their capabilities to cause harm. Some capabilities are too dangerous to give to guest code, so we can instead give those capabilities to intermediate objects that will constrain their power.
For example, an intermediate object for a file system might limit access to a particular device, or directory, or so on. Ultimately, every object should be given exactly the capabilities it needs to do its work and no more. So capabilities should be granted on a need-to-do basis just as we grant information on a need-to-know basis. The principles of information hiding, which we know leads to good designs in software systems, are enhanced when you're thinking about capability hiding. Intermediate objects or facets can be very light weight and class-free languages can be especially effective. I've tried making facets in Java and it turns out it's really hard because every time you want to make a little connector, you have to make another class and that's a lot of work. Whereas, in JavaScript, most of these things are just a simple object or a generator, and boom, you're done. So here the Facet object limits the guest object's access to a powerful object. The Guest object cannot tamper with the facet to get a direct reference to the dangerous object. References are not revocable in a capability system. Once you introduce an object, you can't ask it to forget it. You can ask, but you cannot depend on the request being honored. So here the Guest object has a reference to an Agency object. The guest asks for an introduction to the Powerful object, but the agency instead, makes a facet and gives the facet to the guest and the facet might be a simple pass through. The guest can then call the facet, the facet will then call the Powerful object. When the agency wants to revoke the capability, it tells the facet to forget its capability. The facet is now useless to the guest. This is what you did yesterday. Yesterday's revocable function was this pattern so you know these things are really easy to implement. A facet can mark requests so that the Powerful object can know where they came from, that gives us accountability. 
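The guest, agency, facet, and Powerful object arrangement described above can be sketched as follows. This is a hedged illustration, not the code from the workshop: `makeFileSystem`, `makeFacet`, the in-memory `Map` of files, and the log format are all invented for the example. The facet shows three things at once: attenuation (the guest can only read from one directory), marking of requests (each read is logged with the guest's name), and revocation (the agency can tell the facet to forget the Powerful object).

```javascript
// Hypothetical Powerful object: read access to every file in an
// in-memory "file system" (a Map from path to contents).
function makeFileSystem(files) {
    return Object.freeze({
        read(path) {
            return files.get(path);
        }
    });
}

// Agency side: wrap the powerful object in a facet. The guest only ever
// receives `facet`; the agency keeps `revoke` to itself.
function makeFacet(powerful, directory, guestName, log) {
    let target = powerful;
    const facet = Object.freeze({
        read(name) {
            if (target === undefined) {
                throw new Error("capability revoked");
            }
            // Attenuation: only plain file names inside one directory.
            if (typeof name !== "string" || name.includes("/")) {
                throw new Error("invalid file name");
            }
            // Accountability: mark the request with where it came from.
            log.push(guestName + " read " + name);
            return target.read(directory + "/" + name);
        }
    });
    return Object.freeze({
        facet,
        revoke() {
            target = undefined;
        }
    });
}

const files = new Map([
    ["/public/readme.txt", "hello"],
    ["/private/keys.txt", "secret"]
]);
const log = [];
const agency = makeFacet(makeFileSystem(files), "/public", "guest-1", log);
const guestView = agency.facet;

const readme = guestView.read("readme.txt"); // "hello"

let escapeBlocked = false;
try {
    guestView.read("../private/keys.txt");
} catch (ignore) {
    escapeBlocked = true; // the facet refused the path
}

agency.revoke();
let revokedBlocked = false;
try {
    guestView.read("readme.txt");
} catch (ignore) {
    revokedBlocked = true; // the facet is now useless to the guest
}
```

Because the facet is frozen and the reference to the Powerful object lives only in the closure, the guest has nothing to tamper with to reach the file system directly.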
Facets are very expressive, they're easy to construct, they're lightweight, they provide attenuation or power reduction, they give us revocation, notification, delegation, and it turns out the best object-oriented patterns are also capability patterns. Sometimes when you're trying to design a system and you're trying to figure out does this capability go there or there, sorting that out early on when you're doing the design and taxonomy is hard, but if you think about it in terms of capabilities, what is the least amount of power I need to give to this guy in order to have it work correctly, you tend to end up with the correct designs. Thinking about security makes system design easier, not harder. So facets can reduce the power of dangerous objects. Most code should not be given direct access to dangerous things, for example, in the browser, innerHTML or document.write. Instead of trying to guess if a piece of code can do something bad, you can give it a safe set of capabilities instead. You know that even if it is bad, it's limited in what it can do, and capabilities can aid in API design. Reducing Complexity So one of the nice things about this model is it completely changes the economics of hacking. In most architectures, if you can confuse an object, basically, that gives you access to everything in the system, whereas in this model, if you can confuse an object, you get the capabilities of that object, but only that, and in most cases… So you can confuse an object? Well, yeah, the object is not created correctly and you can cause it to do something that it's not supposed to do. You might be able to get control of what that object can do, but you don't get control of everything that the system can do. So we change the economics of hacking. There is a really nice talk on YouTube about this stuff, The Lazy Programmer's Guide to Secure Computing by Marc Stiegler.
Stiegler's premise is that the best programmers are lazy, but they're lazy in a way which avoids future work, and if you take that approach, you tend to lead yourself to secure systems as well. So this morning, we did this exercise where we found this leakage because of the unexpected thing that this does which compromised our security. So we need to understand this stuff, we need to plan for it, we need to improve the language, we need to improve our understanding of the language so that we're not compromised by it. That attack is an example of a confusion attack and that's why I hate confusion so much because confusion is what causes our systems to fail. Confusion aids the enemy. Bugs are a manifestation of confusion, but also security exploits are a manifestation of confusion and usually attackers get to take over systems when their understanding of our systems is better than ours, and unfortunately, that tends to happen a lot, and the way we stop that is by getting a better understanding of our own systems and we do that by eliminating as many sources of confusion as possible. I have no tolerance for confusion. I want to eliminate as much confusion as I can. So a confusion attack is when you don't understand how the code that you're writing works. We think the system works in a particular way and the attacker by studying us figures out that's not how we work and he's able to exploit that, in fact, virtually all security exploits are like that. We never intentionally go into production with code that we know has security vulnerabilities in it, and yet, everything that's ever been deployed has security vulnerabilities in it and it's because we're confused about how our own systems work. We need to stop doing that. So with great complexity comes great confusion, so that's why I'm a minimalist.
I want to find the simplest solution in all cases because those are the things that are easiest to reason about, so we should keep everything as simple as we can, we should keep our code bases as clean as we can. If we allow our code bases to get crufty, then they become harder to understand and more likely to be confusing and will get exploited. So we should always code well. Good code is ultimately cheaper to produce than bad code, so we should always make good code. Good code is easier to reason about, code that is difficult to reason about is more likely to be problematic and we should have strict conformance to good style rules, which means that if we're writing in JavaScript, everything should pass JSLint without exceptions. We should not put anything out on the web, which is not at least that good. You should never trust a machine that is not under your absolute control and then I'm not even sure about all of those, but one thing for sure, you must never trust the browser, it cannot and will not protect your interests. You need to properly filter and validate all input that you get from the browser. You need to properly encode everything that you send to the browser. The context of that encoding and decoding and filtering is critical. You need to filter and encode for the correct context. So let me tell you what I mean about trusting the browser. So this is a true story. A few years ago, a friend of mine was going to go visit China, so he bought a ticket on a famous airline, and it's an expensive flight, so he bought a coach ticket, and then he realized wow it's a really long flight, it's really uncomfortable, it will be much nicer to upgrade to first class. So he went to the website of the airline and looked to see if upgrades were available and they were but you needed a certain number of upgrade certificates and it told him his number was 0 because he doesn't fly all that often. 
So he was very disappointed until he opened the debugger and he found the variable that contained his number of certificates and he flew first class to China. It's a true story. The reason that worked was the guys who designed that system assumed that there is a certain URL that could only be generated if a certain set of conditions were true and they were relying on the browser to guarantee that and the browser will not guarantee that, you cannot trust the browser. Another browser story, in the first days of ecommerce when online stores first started happening, they didn't understand yet how to scale web systems, right, they tried to do all the work in one server and very quickly the amount of traffic was bigger than one box could handle and they panicked. About the same time, JavaScript comes online and they go great, we can now offload all this processing onto the browser. So they now have the browser do all the work in preparing the invoice and totaling everything up, they just allow the browser to then submit this thing, which then gets sent directly to fulfillment. So if you could figure out how to type a URL, you could order anything you want and pay whatever you think is reasonable. So it turns out, there are good reasons to do stuff in the browser, most of them are to provide a better user experience. So you can avoid the thing where the user fills in a form and then you kick it back in their teeth because they got something wrong. We can help them do that and make it pleasant, but that doesn't mean that the server shouldn't check every detail of it and make sure that it happened right, you have to do that. So templating is this process, we'll talk more about templating this afternoon. I hate templating because templating is a gun pointed at your head and templating is what pulls the trigger. A Simple Attack This is the simplest XSS attack, just to show you what the beast actually looks like.
So in this attack, the attacker tricks your user into clicking on a URL that looks like that. It's a really weird looking URL with angle brackets and crazy stuff in it, and you might think no user would ever click on that. Unfortunately, no, they will, and if they're smart enough not to click on a URL that looks like that, they'll probably click on a bitly URL, they'll probably accept a QR code. Yeah, they're going to go there.

So it used to be that web servers, by default, would simply take the file name and stick it in the body of a 404 page and send it back. In this case, now in an HTML context, that becomes a script tag, which will then execute and load script from the world's most dangerous server, and they get access to everything that you've got on the browser. They've got your cookies, they've got your local storage, they've got the chrome saying this is a valid thing, so when they ask for your password, everything is good.

And there are thousands, maybe millions of variations on this, and a security expert can get some fame by discovering another one. They can always do that because there is an endless supply of these things, but the solution to all of these is exactly the same: everything has to be filtered correctly, everything has to be encoded correctly. So if you encode this and turn those characters into entities before putting it on the page, then it becomes inert. It becomes really important to do that everywhere, and if you do that correctly everywhere, then these sorts of injection attacks cannot happen.

Alright, I think I'm missing the first part of this. So you go to all this nonsense, this script, it's in there, and the server responds with 404 and it sticks that content in it.

Yeah, it takes the file name and puts it in; it says this is the file we couldn't find.

Oh, I see. Okay. And now this script runs a form that says please submit your password, and basically the password, whatever it is.

Whatever it is.
In fact, this script can now talk to both sides.

Yep.

And you're saying the mitigation is done in the 404 page you have on your server.

Right. In fact, every page your server can construct needs to do this encoding correctly, for all cases, for all inputs.

But you never tell them what they submitted; it should just say that page isn't found.

Right. Like another example, this is one that doesn't require a second site. On Yahoo!, they used to have profile pages, and on your Profile page there was a box for gender. Being very nice people, they allowed you to type anything you wanted into that box, so it wasn't like you had to pick M or F; they said whatever you want, that's okay. And there were a lot of people who figured out, oh, it starts with an angle bracket, let's type a script tag in there. And so it means every time anybody looked at your Profile page, they can see all your account.

There was a similar thing that happened on MySpace. MySpace at that time was a little bit smarter and had some filters, but a guy named Samy figured out how to get around them. So he injected a script so that whenever anybody looked at his Profile page, the script would run and then add him to their Heroes list, which is the best spot on a friends list, and put that script onto their page as well. And so the number of people who had looked at one of these pages doubled every minute, or no, every hour. So in 20 hours, he had control of 2 million accounts, and about that time MySpace went, whoa, what's going on, and they shut it down. But you can still search out on the web for "Samy is my hero" and you'll probably still find some of these out there. So that's a version of this attack, and Samy turns out to be a nice guy, or I don't know if he's a nice guy, but he recognized, whoops, I didn't expect I'd take over 2 million accounts in a day, so he turned himself in and was charged with a crime.
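As a rough illustration of the encoding step discussed above (this is not code from the talk), here is a minimal sketch of turning the HTML-significant characters into entities before echoing untrusted text into a page. The names `encodeHtml` and `notFoundPage` and the hostile URL are hypothetical, and the encoding shown is only appropriate for HTML text and quoted attribute values; other contexts such as URLs, JavaScript strings, or JSON need their own encoders.

```javascript
// Hypothetical helper: replaces the characters that are special in HTML
// text and quoted attribute values with character entities.
function encodeHtml(text) {
    const entities = {
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
        "\"": "&quot;",
        "'": "&#39;"
    };
    return String(text).replace(/[&<>"']/g, function (character) {
        return entities[character];
    });
}

// Hypothetical 404 body builder that echoes the requested path.
// The path comes from the browser, so it is encoded before use.
function notFoundPage(requestedPath) {
    return "<p>Not found: " + encodeHtml(requestedPath) + "</p>";
}

// An illustrative hostile path of the kind described in the talk.
const hostile = "/<script src=\"https://evil.example/x.js\"></script>";
console.log(notFoundPage(hostile));
```

With the encoding in place, the angle brackets arrive in the browser as inert text rather than as a script tag. The same helper would apply to a free-form profile field before it is written into a page.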
Probably got a sweet job working somewhere as a white hat.

Yeah, I'm sure Samy did okay, but he could have been a bad guy, right? He could have just kept it to himself and figured out how to exploit these 2 million people.

So you also need to worry about concatenation. The plus sign operator in JavaScript is dangerous depending on how you're using it, because you can be taking bits of string and putting them together. In some cases, if you're just putting together text, it's not a problem, but if you're doing something like assembling JSON or assembling HTML, there is a potential for that to go bad. So whenever possible, you should be using good encoders and writers and not just trying to plus things together.

So I'm very happy to report that this is apparently no longer a source of insecurity: a manager, when being warned that we need to protect against this, would ask, why would anybody do that?

Oh no.

We now know why they do that.

We still do that for measures.

Well, that's a problem. Yeah, we now know why they will do that. They will do that because we didn't prevent them from doing that. In the case of some companies, those companies will unintentionally pay them to do that; in some cases, maybe pay them really to do that.

So these are things which are not security, which are sometimes thought of as being security. For example, inconvenience. We can't stop them, but we can slow them down. We'll put speedbumps on the information superhighway, we'll stop them. That doesn't work. Those things tend to inconvenience you more than them. Again, if it's not effective, it's not effective, and don't waste time on it. Identity is not security; finding out who someone is doesn't answer the question of what they should be allowed to do to you. Tainting is sometimes cited as a model of security, but it doesn't work.

What's the taint?
It's a model where you try to track the source of data and how it's transformed over time and try to determine at what point it becomes tainted or corrupted, but it doesn't work. And then finally, intrusion detection is not security. There are some system administrators who are so tired of being hacked that they've given up all hope of stopping it; they just want to know when it's happening, but that's not security.

The biggest avoidable source of insecurity, and we'll end here, is mismanagement. It's when a manager says we're going to get this past the security guys because we've got a deadline, we've got to meet it, we've got to get out there, I've got a bonus on the line, and we're going to go. Don't let that happen. You've got to kick it up, because no company can afford to have this kind of stuff going on in their sites for any possible short-term gain.

Managing Asynchronicity

Synchronous Functions

It's time now for asynchronicity. So there are two kinds of functions in the world: there are synchronous functions and asynchronous functions. So let's first look at synchronous functions. A synchronous function is a function that does not return until the work is completed or it has failed. So all of the functions that we wrote over the last couple of days have been synchronous functions, because that's how they work, and that's a very useful thing to have in a function because it means it's easy to reason about its behavior over time. When a synchronous function calls another synchronous function, the caller is suspended in time and nothing advances until the callee returns.
If the caller is looking at a clock at the moment that they make the call, their experience will be that the hands jump forward quickly, but otherwise they're not aware that this stuff has happened, except that the thing that they asked for has magically been completed. That makes it easy for us to reason about things, unless we need to make multiple things happen at the same time. You can't make multiple things happen at the same time if you're suspended in time.

So the way that's often mitigated is by use of threads. Threads allow you to have multiple threads of execution happening through a memory space at the same time, so that lots of things can happen at the same time. Unfortunately, threads come with problems, including races, deadlocks, and other reliability problems, and performance problems, and we'll look more at these.

So the threading model, like all models, comes with pros and cons, and the first pro is a really significant one: no rethinking is necessary. You can take any existing piece of code, put it in a thread, and it'll just work that way. You don't have to make any changes to it in order to introduce it to a threaded environment. Now that doesn't necessarily mean that you'll never need to change it, but the starting up phase is really easy. The next pro is that blocking programs are okay. It's okay for programs to block, and in fact, that is what threads are for. Threads exist so that things can stop and have other things happening while they're stopped. So execution will continue as long as any thread is not blocked.

But there are some cons. The first con is that there is stack memory allocated per thread. This used to be a significant problem. It's not anymore. Moore's law continues to ramp on memory capacity, and so thread stacks are now in the noise; we don't care about them anymore.
A more important con is that if two threads use the same memory at the same time, a race may occur, and in fact, this is a big problem. So a lot of people have been asking, when are we going to get threads in JavaScript, and the answer is never, and let me illustrate why.

So here we have two programs, which will each run in its own thread, and we will run them possibly at the same time. And there are a number of possible consequences of this program: one is an array containing a and b, and the other is an array containing b and a. Because we can't control in what order these two threads run, these are two possible outcomes, but that's not the race I'm worried about. The race I'm worried about is this one. Another possible outcome of this program is an array containing only a or an array containing only b, and this is not due to anything except what's happening in these two statements. You can look at these for a long time and try to figure out where did half of my data go, how did this fail. It's really hard to reason about programs that race in threads.

So let's zoom in on this and look at what's going on. So that's the first statement. One of the things that makes JavaScript an expressively powerful language is that one statement can do the work of many statements, and that's the case in this one. That one statement on the top does what the four green lines do: it'll get the current length, it will assign to that part of the array, and if the length of the array has increased, then we'll make that change. And if we look at the second thread, it also expands into stuff like that, and we cannot control how these might be ordered in actual execution. So one possible ordering could be that both capture the length at the same time, and both will be using that to store into the array, and both will be using it to change the length of the array. As the result, whichever one runs second is probably going to win, so that's where half of our data went.
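The slide itself is not part of this transcript, so the following is only a sketch of the interleaving just described, under the assumption that each thread runs a statement of the form `array[array.length] = value`. JavaScript has no shared-memory threads of this kind, so the two "threads" here are hypothetical lists of small steps (`makeThread`) and the schedule passed to `run` decides whose step goes next.

```javascript
// Each "thread" is modeled as a list of small steps, mimicking how one
// statement such as  array[array.length] = value  expands into several.
function makeThread(array, value) {
    let length;                                   // this thread's private copy
    return [
        function () { length = array.length; },   // capture the length
        function () { array[length] = value; },    // store at that position
        function () {                              // grow the array if needed
            if (length >= array.length) {
                array.length = length + 1;
            }
        }
    ];
}

// Runs the two threads' steps in the order given by schedule,
// where each entry (0 or 1) names the thread that takes the next step.
function run(schedule) {
    const shared = [];
    const threads = [makeThread(shared, "a"), makeThread(shared, "b")];
    const next = [0, 0];                           // next step index per thread
    schedule.forEach(function (who) {
        threads[who][next[who]]();
        next[who] += 1;
    });
    return shared;
}

console.log(run([0, 0, 0, 1, 1, 1]));  // one thread finishes before the other starts
console.log(run([0, 1, 0, 1, 0, 1]));  // both capture the length before either stores
```

In the second schedule both threads capture a length of 0, so both store into position 0 and the later store overwrites the earlier one. Nothing in the original one-line statement hints that this can happen.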
And this stuff is really hard to reason about. For one thing, you can't see it in the original code, but it's worse than this, because each of these statements could expand into multiple machine language statements and we don't know how those are going to interleave, and each of those could expand at the low level into micro instructions and you can't control how those will interleave. And it may be that the real time behavior of this threaded code changes according to loading. So it might be working fine during development but then fail when we put it in production, or it's working fine most of the year but it fails at Christmas. As things change, things that wouldn't appear to affect the behavior of the program can actually radically affect the behavior of the program.

So it's impossible to have application integrity when we're subject to race conditions. We mitigate that with mutual exclusion. Mutual exclusion allows only one thread to be running in a critical section of memory at a time, and there is a long history of this stuff, starting with Dijkstra's semaphores and Hoare's monitors and Ada's rendezvous, and now we call it synchronization, but these are all transforms on the same idea. This used to be operating system stuff. It used to be that only operating systems were concerned with mutual exclusion and running multiple things at the same time, but this has all leaked into applications because of networking and because of the multicore problem.

The concern with networking is that we want to be able to have stuff happening while slow things are happening, and one of the slowest things we can do is go out to the network, because that has large latencies, and in some cases unknowable latencies. So you can be stopped for a long time, and generally you can't afford to have systems be suspended for that long, so you need threads in order to allow them to continue. Then there is the multicore problem.
CPU designers have lost the ability to make CPUs go faster. So instead, they're giving us more of them, and we don't know how to use them. If you have a problem that's embarrassingly parallel, then we can take advantage of it, but most of what we do is embarrassingly serial, and we don't know how to take multiple cores and put them in our applications and get a significant benefit from it. So that's where we are.

So with mutual exclusion, only one thread can be executing in a critical section at a time, and all other threads waiting to execute in the critical section will be blocked. If the threads don't interact, then the programs can run at full speed across all the cores, but if they do interact, then races will occur unless mutual exclusion is employed. Unfortunately, mutual exclusion comes with its own dark side, and that is deadlock. Here we have two threads, Alphonse and Gaston. They are both programmed to wait for the other to stand up before they can stand up. This is deadlock. It turns out computer systems do this all the time. Here is another example; this is a real-world example from São Paulo. Apparently, they do this all the time. You can think of each of these cars as being a thread which is ready to run; it's just waiting for the thread that's blocking it to get out of the way. So deadlock, that's a serious thing.

Asynchronous Functions

So an alternative to all of that is asynchronous functions. Asynchronous functions return immediately; you call it, it comes right back. There is almost no passage of time, and success or failure will be determined somehow in the future. When the asynchronous function returns, there is no solution yet. The solution might happen maybe later, but not now. So we like to use asynchronous functions in turn systems. A turn is started by an external event, such as delivery of a message, or completion of an asynchronous request, a user action, or the ticking of the clock.
Then a callback function associated with that event is called and it runs to completion. It doesn't have to worry about races, because nothing else will get to run until it's finished. When it returns, the turn ends, so there is no need for threads, no races, no deadlocks. It's a very reliable, very straightforward programming model. We call it turns because of games. It comes from chess: in chess, when it's my turn, I get to control a piece, you don't get to touch any pieces until my turn is done, and then it exchanges. We're doing a similar thing with functions and events.

But when you're using turns, you have to follow the iron law of turns, which says you must never block, you must never wait, you must finish fast. If you have any code which has to block or has to wait or can't finish quickly, it has to be isolated and run in a separate process. It is not allowed to run in the turn system, so that's a cost, right? I mean, you're allowed to run some code, but there is some code which you are definitely not allowed to run.

We usually do this in an event loop. Event loops are turn-based systems, and they come with pros and cons. The first pro is a huge one: it is completely free of races and deadlocks, and that's a huge advantage. Applications should never be written, I think, in systems that use threads, because it's just too hard to reason about them and it's too unreliable. Another pro is that there is only one stack, and we reuse that stack on every turn, so it's extremely memory efficient, which is of no interest at all. Again, because memory is so cheap, the fact that we're memory efficient is irrelevant.
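To make the turn model above concrete, here is a toy sketch (not how any real browser or Node event loop is implemented). The names `post` and `runLoop` are hypothetical. Each handler is one turn: it runs to completion, and anything it schedules waits in the queue until the turns ahead of it have finished.

```javascript
// A toy turn-based system: a queue of pending events and a loop that
// runs each handler to completion before taking the next event.
const queue = [];
const log = [];

// Adds an event to the back of the queue; nothing runs yet.
function post(handler, payload) {
    queue.push({handler: handler, payload: payload});
}

// Takes events off the front of the queue, one turn at a time.
function runLoop() {
    while (queue.length > 0) {
        const event = queue.shift();
        event.handler(event.payload);       // one complete turn
    }
}

post(function (name) {
    log.push("start " + name);
    // Work requested during a turn becomes a later turn.
    post(function () {
        log.push("follow-up for " + name);
    });
    log.push("end " + name);
}, "first");

post(function (name) {
    log.push("start " + name);
}, "second");

runLoop();
console.log(log);
```

The follow-up is logged last, after the "second" turn, because it was queued behind it. No turn is ever interrupted partway through, which is why no locking is needed.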
What's more important is that it is very low overhead, because all we're doing in the event loop is take something off an event queue, hand it to a function, let it run, take the next one, and so on. So there is very little overhead, whereas in a threaded system you're doing lots of locking, you're doing lots of process switching and context switching, which are the most expensive things that CPUs know how to do. In a turn-based system, you're not doing that; you're just pulling something off a queue and running it, pulling it and running it.

It's also a surprisingly resilient programming model. If a turn fails, it's usually the case that the program can still go on. For example, if you ever take any web browser and open up the debugger and just go wading out into the web, you're going to see almost a constant string of failures. It's amazing how much failure is going on in web pages all the time, but if you don't have a debugger open, you don't see it. Now in a threaded environment, when something fails, then there'll be an exception with one of the stacks, and one of the threads will get unwound and it'll try to recover, but that thread may now be in an inconsistent state compared to the other threads because it's lost all of this context. That could lead to cascading thread failures, and so it tends to be a fairly brittle model. Whereas what we see in web browsers is that as long as there is any button that still works and the user can find it, there is a good chance they're going to be able to complete the transaction and never know that the thing has been failing hugely behind the scenes.

Now there are some important cons here. The most important con is that programs must never block and that turns must finish quickly, that we have to obey the law of turns. That is definitely a con, but it's something that has to be respected. Also, another con is that programs are written inside out, and that makes some people cry.
They call it inversion of control: it's unnatural, it's an unrealistic way to write programs, we can't do it, it's too hard. Wah, wah. But actually, it turns out it's not hard, it's actually pretty easy. So we do things in event driven systems that are turn-based. There is no preemption, which is really good; that makes them very reliable. We associate events with actions, and it turns out, despite the people who are complaining that it's very hard and inside out and unnatural, it's actually very easy, and in fact, beginners can do it; it's actually not very hard. And in fact, it's how all user interfaces are implemented, even on systems that have threading, because it just turns out this is the best way to implement a user interface.

Event loops and asynchronous programming have a long history. It was done in real time systems, and experimental systems, and game systems for a long time. It doesn't get into the mainstream until the Macintosh. The Macintosh is the first consumer device, although it's kind of expensive for a consumer device, which is programmable only in a turn-based manner. Prior to that, everything had been using blocking I/O, going all the way back to Fortran. And our memory of this is that when Apple and Steve Jobs introduced the Macintosh in 1984, it changed the world and turned Apple into one of the world's most successful companies, and that's not actually what happened.
This machine came very close to bankrupting Apple, and part of the reason was that they couldn't sell very many. The reason for that was that they couldn't convince programmers to write programs for this machine, because they had never seen an event loop before and didn't understand how to write programs in that model, and they were complaining it's unnatural, it's too hard, it's inside-out, wah, we're not going to do it. Instead they wrote for MS-DOS, which was horrible and crappy, one of the worst things ever imagined, which outsold this by several orders of magnitude. It was just ridiculous.

So the thing that changed this, which turned this machine into a success, was HyperCard. HyperCard was a system that was built by Bill Atkinson. Bill Atkinson had written QuickDraw, which was the graphics layer of the original Macintosh. He also wrote the first paint program, called MacPaint. It's hard today to recognize it as a paint program because it only had two colors, black and white, but it came free with the machine, and when you bought a Macintosh, that was literally all there was to do with it, and so people did a lot of stuff with MacPaint and called it art.

His next program was HyperCard. He took MacPaint and allowed it to work on several pictures at the same time, only one of which would be visible. He called these pictures cards, and a file was a stack of cards, or a stack. Then he got the idea that he could put buttons on the cards and wire those buttons to behavior, and put input fields on those cards and allow those fields to contain text, which you could then search for and process. And then he added an event-driven programming language to that, called HyperTalk.
Everything in it was events, and you would say things like on key up and on mouse down and so on, and beginners loved HyperCard. They got all into HyperCard, they were writing stuff. They'd start with very simple little event handlers and then start doing stuff that was much more sophisticated, inventing whole new classes of applications that all ran in HyperCard. There were predictions that HyperCard was going to be the future of software development, and it might have been, had Apple not run it into the ground. Atkinson originally demanded of Steve Jobs that it be distributed for free with Macintoshes, and Jobs agreed, and that's how they did it for several years. Once they saw how successful it was, they decided we need to figure out how to monetize this, and in the process of doing that, they killed it. But it worked, and it inspired stuff that happened in the browser. In fact, you see that Home thing; the concept of the Home Page came out of HyperCard.

JavaScript on the Server

That's where the stuff came from, and because the beginners were writing in HyperCard, we know it's easy for beginners to use event-based systems and to do asynchronous programming. JavaScript is now moving to the server. After having conquered the browser, it's now moving to the server and it's conquering there too, which is great, because it means all of the skill that we've developed in doing stuff on the browser side, we can now bring back to the server side. It means you only need to be really competent in one language, and that can help save some time. Unfortunately, what servers do is quite different from what browsers do; in fact, it's so different that you need a fairly different programming model. So we're doing server stuff now in Node.
Node implements a web server in a JavaScript event loop. It's just a high-performance event pump. It's very efficient, very fast, can do a tremendous amount of work. It does file I/O correctly for the first time. Going all the way back to Fortran and COBOL, file I/O has always been blocking, and in Node, we finally get non-blocking file I/O, so we do it asynchronously. So when you want to read a file, you pass in a file name and a callback function, and when the file is read, it'll call your callback function and you get the data, and it's great. And so the programs don't block on doing file I/O, because blocking would violate the law of turns, and we never violate the law of turns. So in Node, everything is, or can be, or should be non-blocking.

Unfortunately, like everything that has anything to do with JavaScript, Node doesn't get everything right, and one of the things that I'm really unhappy about with Node is that it has a number of synchronous functions baked into the API. Now these synchronous functions block everything, and so you go from having a very fast, high performance system to a very slow, low performance system instantly, because you're blocking; nothing else gets to happen until the synchronous calls are done. So I would like to outlaw those synchronous functions. Unfortunately, what we've seen in the JavaScript community is that if you put something really stupid into an API, there are web developers who say I have a right to use it. There are these synchronous functions in Node which should never be used, but because they're in Node, there are people using them, and they violate the law of turns. Then I don't like the require thing, because it is also blocking.
The very first version of it on Node was asynchronous, which is the correct way to do it, and there were complaints that oh, it's too hard, it's inside-out, wah, and we only start things up in the morning when we turn the servers on, and so once they get warmed up then it doesn't matter. Except that's not true. I mean, we're bringing servers up and down all the time, so anything which blocks is a bad thing. So fortunately, in ES6, we're getting a new module system which obsoletes that, so that'll be better.

But servers are significantly different from browsers. In a server, we're not dealing with events, we're dealing with messages. A message comes in from the network, we'll do something, we'll send another message out; it's all about messaging. A Node server is actor-like, in that it's a thing which receives messages and sends messages, but the simple event model that we use on the browser doesn't really fit the workflows that we do on a server.

For example, we may have things which take several sequential steps. A request comes in, we take that request, we have to send something to a database, get something back, we take that information, we use that to go to another database and get stuff back. So each of these things depends on the previous thing. How do you do that? If you don't know how to do it, the naive approach is to write deeply nested event handlers. So in the event handler of one thing, you make the request of the next service, and in its event handler, you make the request of the next one, and that leads to code which is extremely brittle, extremely hard to maintain, low performance; it's all bad stuff.
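For readers who have not seen it, the naive shape described above looks roughly like this. It is only a sketch: `lookupUser`, `lookupPreferences`, and `lookupNav` are hypothetical stand-ins for real services, faked here with `setTimeout`, and the error-first callback convention is an assumption borrowed from common Node practice rather than something stated in the talk.

```javascript
// Hypothetical stand-ins for three services; each calls back later
// with (error, result), the way a database or network client might.
function lookupUser(request, callback) {
    setTimeout(function () {
        callback(undefined, {id: 7, name: request.user});
    }, 0);
}
function lookupPreferences(user, callback) {
    setTimeout(function () {
        callback(undefined, {userId: user.id, theme: "dark"});
    }, 0);
}
function lookupNav(preferences, callback) {
    setTimeout(function () {
        callback(undefined, "nav for " + preferences.theme + " theme");
    }, 0);
}

// The naive shape: each step lives inside the previous step's handler,
// and every level has to repeat the error handling.
function handleRequest(request, respond) {
    lookupUser(request, function (error, user) {
        if (error) {
            return respond(error);
        }
        lookupPreferences(user, function (error, preferences) {
            if (error) {
                return respond(error);
            }
            lookupNav(preferences, function (error, nav) {
                if (error) {
                    return respond(error);
                }
                respond(undefined, nav);
            });
        });
    });
}

handleRequest({user: "alice"}, function (error, nav) {
    console.log(error || nav);
});
```

Three dependent steps already produce three levels of nesting, and there is no obvious place to add a time limit or cancel work that is no longer needed.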
Then we have the opportunity to do things in parallel. It might be that a request comes in and we can go to several systems at the same time because the requests are not dependent on each other, so they're independent requests. By going in parallel, we get to change the performance of the thing significantly, because instead of waiting for each to finish sequentially, instead of taking that much time, we only have to wait for the slowest of those things, which is usually a much shorter time, so we can significantly improve our response processing. Unfortunately, each of these requests may be coming back at unexpected times and in unexpected orders, and how do you deal with that? And some might not come back at all, and how do you deal with that? So that's hard. So if you don't know how to manage that, the naive thing is to not do it in parallel, and instead do it sequentially with deeply nested event handlers, which is again quite awful.

Then in addition to that, you may have to deal with limited time. You may have SLAs or other policies which say we have to get a response within so many milliseconds, and if we don't, we have to go to plan B; we can't just keep the request hanging for some amount of time. And then we have to deal with cancellation. If we go to plan B, we want to stop all of the work which is no longer necessary. How do you do all of those things in deeply nested event handlers?

Functional Programming

Functional programming to the rescue. So one of the reasons why, after forty years, functional programming has suddenly become important is because it contains the solutions to these problems, and we've got a lot of history with this stuff. It goes back to futures, which came out of dataflow and Lisp. A future is an object which represents something which isn't knowable yet, but might be in the future. So you can begin interacting with the future object, and eventually it will communicate your interest to whatever the answer turns out to be.
Futures had a big influence on a feature called promises. Promises were discovered in a company that I founded called Electric Communities, for a language that we called E. Promises have since escaped from E and moved into Python and then into JavaScript, where unfortunately they've mutated kind of badly, but they're still a really interesting mechanism for managing asynchronicity and distributed systems. The Haskell community is big into monads and arrows. Microsoft has something called Rx, Reactive Extensions, which allows for composing of event streams; very interesting stuff. Unfortunately, it's never been documented accurately or documented sufficiently.

We do have a course by Jafar Husain which covers essentially the observable pattern and how Rx kind of works and everything, just throwing it out there.

Yeah, it's good stuff. It's just I wouldn't use it, because if I'm putting it into production, I want to know how it works. But it's been inspirational; it's created a new form of programming called Functional Reactive Programming. Examples include Flapjax, Bacon, and Elm, and others, but it occurred to me that this list just isn't long enough, so I'm adding my own thing to it, and that is a library called RQ.

RQ is a library for managing asynchronicity in server applications. The thing that distinguishes it from the others that I mentioned is that this one was designed specifically for what you do in servers; none of the other ones were. Everything can be made to work, but I think for what you do in servers, this one is going to work better for you. So RQ is a very small library. It only contains four or five methods, depending on how you count. If you count in Java, it's five; if you count in JavaScript, it's four. So let's look at the four, because once you know these four functions, that's the whole thing.
So a sequence takes an array of requestor functions and returns a function that will call them one at a time, passing the result of the previous requestor to the next requestor. So here, for example, we're going to make a getNav requestor which will first read the file, and then get the preference, and then get the custom nav.

Then we can do things in parallel. Now we're not adding parallelism to JavaScript. What we're doing is allowing JavaScript to exploit the natural parallelism of the universe, because it's likely that you're going to be calling services that are in different boxes or on different networks. That's all going out, and the universe runs asynchronously all the time, so you get to take advantage of that. So here, we're going to make a getStuff requestor, which will, when it's called, get the nav, get the ads, and get the message of the day. It'll get them all at once. So the cost of this will be whichever of these three is the slowest.

Then we can also have optional things. So we can provide a second array of optional things. So we will also go and get the horoscope and the gossip, but we will not wait for those. If either of those doesn't complete before the main three complete, we'll just cancel them and return with whatever has been finished.

We can deal with races; these are the good kinds of races. So we can start several things at once, and whichever is the first one to finish successfully, that's the one we get. So we might be having trouble with our ad network. We're dealing with three ad networks and they're all too slow. So we tell them, we're going to have a race: whenever we have a position, we will ask you and two of your competitors, and whoever comes back first with a suitable placement will win.

We can also deal with fallbacks. So we can try one thing, and if that fails, we'll try another. So we're going to do that to get our weather.
We'll first go to our local cache, and if that fails, we'll go to our localDB, if that fails, we go to the remoteDB. Now in theory, the fastest way to get the weather would be to have a race, but the whole point to having this kind of hierarchy is that we want to go to the remoteDB only as a last resort and fallbacks allow us to do that. So that's it, that's the whole matrix for RQ. We can start all the requestors at once or we can start them one at a time, and then we can take one result or we can take all of the results. RQ Example So this is an example of an RQ program. You can write RQ in the form of nested arrays, which describe the work that you want to do, with the calls to RQ acting as annotations as to what happens in parallel, what happens in serial. So here we've got one request that's going to do stuff in parallel: in parallel, it's going to do a couple of sequences and then one standalone thing and a race and a fallback, and simultaneously, it will also do this optional set, which is a similar set of stuff, and when it's all done, we will then call this function that's the continuation which will show the result. This is an actual JavaScript program, so let me execute it so you can see what it does. So each of these widgets represents some service which is being accessed by that program, so those are the ones that start immediately and I get to decide which of these are going to succeed and which are going to fail. For example, if I have that one fail, then the entire request failed and you can see it cancelled everything else because we don't need it if the whole job is not going to work. So let me reset and we'll try it again. So let's say this time that one succeeded, okay, we're still running, things are good. This one is optional, let's say it fails, okay it failed, but no problem, it's optional so we're still running. Let's say this sequence starts or is successful, good. 
So then A2 started: A1 finished, so A2 starts. We do a similar thing with B1, that succeeded, so B2 is now running. We've got a race with D1, D2, D3, which one should win? Two. Two, so race two wins and we cancel D1 and D3 because we don't need those results anymore. Let's say our fallback fails, not a problem, the next step in the fallback succeeds and I can finish these in any order. So let's say that goes and maybe that succeeds and that succeeds and we're now good. So at that point, the whole thing finished, all of the optional things that were still pending, those got cancelled, and we're done. Any questions about that? Yeah, could you give us a real-world example of where you use this. Yeah, I will. And I've been sort of doing that. So I gave examples that we get the nav and we get the ads and all of those, those are real world things, those are the sorts of flows that you would have in a web application. Submitting 1, 3 ads, but any of 12 could load there, it will just take the first 3 or. Sure. You can do those sorts of things, whatever you need. So this again is the program that accomplished that. All of that behavior happened just in the RQ functions handling that stuff. And you can compose these things in lots of ways. For example, this is a sequence in which we'll do one thing and we'll take its result and we'll pass that to three different things and then we'll accumulate all of their results into an array and pass that array to the last one, so you can get lots of workflows. So for a real-world example, this is the most recent thing that I've been working on, I'm working on an encyclopedia and I'm writing in an authoring language and this is the process that I run in Node for building the book. 
So I first read the file, then I pass that result to the include processor, which handles the stuff with all the include files, and then I pass it to a thing which will then compile it and produce the HTML, and then I have another thing which will then write it out to the file system. So this is a sequence, one thing happens after another, but it's not being written as nested event handlers, it's written as a list of things that happen, one thing after another, so it looks more like a program. What's the callback? Is the callback the next function? The callback is similar to the thing that we did with continuize this morning, it's the thing that will receive the result. So this is a special case of continuation passing style where at each step we don't return the result, instead, we pass the result to a function that's provided to us. And the final callback is essentially the console log. Yeah, the last one will send it to the console. Okay, this is your program, right. Yep. Where is the, so this first function, it's a sequence, first it's going to call the first function, and it's going to call the second function, then the third, then the fourth, where is the first value of callback that gets passed into the first function, where is that coming from? What I didn't show here is that RQ sequence returns a function. When we call that function, it will then call the first in this list, providing a callback function to it. And that's the hook that makes it call the second one. Right. So RQ sequence kind of always passes itself into each of the functions that it calls. Sort of, yeah. It passes the value from the previous callback. Yeah, so. Okay, you mentioned that earlier, yeah. Right, so you can think of it as sort of like a pipe that's passing this stuff through, so each gets the thing from the previous one. And this is not Node, this is just pure JS. 
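The hook described in that exchange can be sketched in a few lines. This is a minimal illustration only, not RQ's source: the name `sequence` and the steps `double`, `increment`, and `broken` are hypothetical, and it assumes the conventions stated in the talk (a requestor takes a callback and an optional value; a callback takes a success value and a failure reason, with an undefined success meaning failure). Timeouts and cancellation are left out.

```javascript
// A hypothetical, stripped-down sequence factory (not RQ's implementation).
// It returns a requestor that runs each step in order, piping results along.
function sequence(requestors) {
    return function requestor(callback, initial) {
        let index = 0;
        function next(value) {
            if (index >= requestors.length) {
                return callback(value); // every step succeeded: deliver the final result
            }
            const step = requestors[index];
            index += 1;
            // This inner callback is the hook: a success starts the next step,
            // a failure ends the whole sequence.
            step(function (success, failure) {
                if (success === undefined) {
                    return callback(undefined, failure);
                }
                return next(success);
            }, value);
        }
        next(initial);
    };
}

// Hypothetical steps. They call back synchronously to keep the sketch short;
// real requestors would usually call back in a later turn.
function double(callback, value) {
    callback(value * 2);
}
function increment(callback, value) {
    callback(value + 1);
}
function broken(callback) {
    callback(undefined, "broken step");
}

const results = [];
function record(success, failure) {
    results.push({success: success, failure: failure});
}

sequence([double, increment])(record, 5);         // 5 is doubled, then incremented
sequence([double, broken, increment])(record, 5); // stops at the failing step
```

The first call ends with a success value; the second ends with the failure reason from `broken`, and `increment` is never started.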
Right, this is pure JS, you could run this in a browser as well if you've got asynchronous things to do there, it runs well. Yeah, it's not a module, it's just a global, RQ would be a global or wherever you declare it. It's just a JS file, you can load it however you want. It's just a JS file. It creates a global though, called RQ, in your global namespace. In a browser it creates a global, in Node it will export as a module. So RQ can also deal with timeouts. So each of those functions will also take a time in milliseconds, which says, this needs to complete successfully in this much time or we will fail it. And in the case of the parallel, you can also specify untilliseconds, which tells the optional set, you've got this much time until you get cancelled, just in case the main ones finish early. Then we can also deal with cancellation. So any requestor function can optionally return a cancel function and that cancel function, when it is called, will attempt to cancel the request. There is no guarantee that the cancellation will happen before the request completes because there can be network races and other things going on, so it's intended to stop unnecessary work, it's not an undo, it's not a rollback. So if you're trying to cancel a transaction, for example, this is not the cancellation that you want to do. RQ Function Types RQ has four types of functions in it and if you can manage these four types of functions, then it's pretty easy to use. 
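To make the race and cancellation behavior concrete, here is a small sketch under stated assumptions. It is not RQ's code: `race` and `widget` are hypothetical names, there are no timeouts, and the widgets are hand-driven stand-ins for the demo's services so that the outcome can be decided from the outside.

```javascript
// A hypothetical race factory (not RQ's implementation). It starts every
// requestor, takes the first success, and asks the others to cancel.
function race(requestors) {
    return function requestor(callback, value) {
        const cancels = [];
        const settled = requestors.map(function () {
            return false;
        });
        let finished = false;
        let pending = requestors.length;

        function cancelOthers(reason) {
            cancels.forEach(function (cancel, index) {
                if (!settled[index] && typeof cancel === "function") {
                    cancel(reason); // best effort only: not an undo or a rollback
                }
            });
        }

        requestors.forEach(function (start, index) {
            if (finished) {
                return;
            }
            cancels[index] = start(function (success, failure) {
                if (finished || settled[index]) {
                    return;
                }
                settled[index] = true;
                pending -= 1;
                if (success !== undefined) {
                    finished = true;
                    cancelOthers("lost the race");
                    callback(success);
                } else if (pending === 0) {
                    finished = true;
                    callback(undefined, failure); // everyone failed
                }
            }, value);
        });

        return function cancel(reason) {
            if (!finished) {
                finished = true;
                cancelOthers(reason);
            }
        };
    };
}

// A hand-driven stand-in for a service, loosely modeled on the demo widgets.
function widget(name, log) {
    let pendingCallback;
    return {
        requestor: function (callback) {
            pendingCallback = callback;
            return function cancel() {
                log.push(name + " cancelled");
            };
        },
        succeed: function () {
            pendingCallback(name + " result");
        }
    };
}

const log = [];
const d1 = widget("D1", log);
const d2 = widget("D2", log);
const d3 = widget("D3", log);
let winner;
race([d1.requestor, d2.requestor, d3.requestor])(function (success) {
    winner = success;
});
d2.succeed(); // D2 wins; D1 and D3 are asked to cancel
```

After `d2.succeed()`, `winner` holds D2's result and `log` records that D1 and D3 were cancelled, which mirrors what the demo showed.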
We have requestor functions, which are functions that can execute a request, we've got callbacks, which are continuation functions that are passed to requestors, that's how a requestor returns its result, we have factories, which are like the factories you've been making, which make requestor functions, they act as a convenience, and in fact, the four functions that I've shown you are all factories, and then cancel functions, a cancel function can be returned by a requestor to cancel a request. So a callback function takes two parameters, a success parameter and a failure parameter. So if success is undefined, then it fails, and if failure is undefined, then it succeeds. There is just one callback. Promises, for example, will have two callbacks, one for success, one for failure, I think it's easier if you just have one. Then a requestor function will take a callback and can optionally also take a value and that's how a sequence will move results from one step to the next. A requestor function can return a cancel function, which is used for cancellation, and factories make requestors. So that's the relationship of the four kinds of functions. So let's look at an example. This is the identity requestor, this is the simplest possible requestor, it will receive a value and it will then send the value to the next thing in the sequence. So if you put this in the middle of a sequence, it's a do nothing, it'll just take whatever is going on and give it to the next one. You would never do that, but this is the model for the simplest form. Then this is so we can wrap those things, I'm sorry. This is the full name requestor, this one receives an object containing parts of names and will concatenate them together to make a name and then deliver that to the callback, so I could add this as a processing step, for example. And we can write functions which will automate that for us. So requestorize is similar to functions that you've been writing. 
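The slides are not part of the transcript, so the following is only a sketch of what the two requestors just described might look like. The function names, the `firstname` and `lastname` property names, and the failure message are assumptions made for illustration; the single callback with success and failure parameters follows the convention stated above.

```javascript
// The simplest possible requestor: pass the value straight to the callback.
function identityRequestor(callback, value) {
    return callback(value);
}

// Receives an object holding name parts (property names are hypothetical)
// and delivers the concatenated name, or a failure reason if parts are missing.
function fullnameRequestor(callback, value) {
    if (
        !value
        || typeof value.firstname !== "string"
        || typeof value.lastname !== "string"
    ) {
        return callback(undefined, "fullname: missing name parts");
    }
    return callback(value.firstname + " " + value.lastname);
}

// One callback handles both outcomes: an undefined success means failure.
function describe(success, failure) {
    return (
        success === undefined
        ? "failed: " + failure
        : "succeeded: " + success
    );
}

const seen = [];
function callback(success, failure) {
    seen.push(describe(success, failure));
}

identityRequestor(callback, 42);
fullnameRequestor(callback, {firstname: "Ada", lastname: "Lovelace"});
fullnameRequestor(callback, {firstname: "Ada"});
```

One consequence of this convention is that `undefined` cannot be delivered as a successful result, since the callback would read it as a failure.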
It will produce a requestor that will call some ordinary function, so we can take any existing function and turn it into a requestor, which could reduce the amount of work you have to do in transitioning to RQ. This is a delay requestor, this is the simplest real time thing. You would never put a delay into a server application. What you would do in a real situation is instead of calling setTimeout, you're going to send a message to some service, and instead of calling clearTimeout, you would send a message probably to the same service telling it to please stop. So you just make the substitution, but if you were to take the delay requestor and put it in a sequence, the sequence will run exactly the same, it'll just take longer, but it will take longer without blocking. Then this is the delay factory, so this code is the delay requestor, except the factory is providing the milliseconds value. These are all part of RQ? No, these are examples of how you could use RQ. Things that you put in there, okay. Yeah, none of this code is in RQ. What does RQ stand for? Request. Then this is a factory for reading files. So this factory will produce a requestor, which you can then put in a list which will then read files for you, and this one is wrapping the Node file system. This is the widget that was in the demo that I showed you earlier. This is all HTML crap, so I'm not going to bother you with that, and I did it this way because I didn't want to have to depend on any one library, because if I chose a library, that's not everybody's favorite, and it turns out everybody's favorite is not somebody's favorite, so it's just DOM, so it's awful. 
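The requestorize and delay factories described above can be sketched as follows. This is an illustration written for this transcript, not the code from the slides: the try/catch handling and the names `record`, `outcomes`, and `cancelDelay` are assumptions. `setTimeout` and `clearTimeout` stand in for the messages a real server would send to a service.

```javascript
// Wrap an ordinary function so it can take part in requestor workflows.
function requestorize(func) {
    return function requestor(callback, value) {
        let result;
        try {
            result = func(value);
        } catch (exception) {
            return callback(undefined, exception); // a throw becomes a failure
        }
        return callback(result);
    };
}

// A factory: it supplies the milliseconds value and makes a delay requestor.
function delay(milliseconds) {
    return function requestor(callback, value) {
        const timer = setTimeout(function () {
            callback(value); // pass the value along, later, without blocking
        }, milliseconds);
        // The optional cancel function: stops the unnecessary work.
        return function cancel() {
            clearTimeout(timer);
        };
    };
}

const outcomes = [];
function record(success, failure) {
    outcomes.push({success: success, failure: failure});
}

requestorize(Math.sqrt)(record, 49);           // succeeds with 7
requestorize(JSON.parse)(record, "{not json"); // fails with a SyntaxError

// Start a one minute delay, then cancel it before it fires.
const cancelDelay = delay(60000)(record, "late");
cancelDelay();
```

Because the delay is cancelled, `record` is called only twice, once per requestorized function.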
But the interesting features here are that this is the click handler or the success handler, so when I clicked on yes on success, I changed the color of the widget to green and I called the callback to let it know that it finished successfully, and the failure side was similar, except I set the background color to pink and success is undefined now to indicate that we have a failure result. And in the case of cancellation, I just set the background to gray. Why did you reverse your success and failure from the Node convention? Because Node got it wrong. Isn't it wrong? Yeah. Okay, I agree that it seemed really weird, but now you start to use Node for a while and you encounter people who do it the right way and it seems weird. Yeah, exactly. And I know I'm introducing a hazard doing things right in a system that does things wrong, I understand that and I apologize for it. I just couldn't bring myself to do it the wrong way. Testing with JSCheck So let's talk about testing. So testing in asynchronous systems is tricky because most test frameworks come with stuff like this, right, you've got some assertion and you've got a descriptive message and sometimes you'll have an expected value and an actual value and if they match, then it's good, and if they don't match, it's bad, except this doesn't work in asynchronous systems, right, because you can't wait for the actual value. The actual value might not happen until many turns into the future and this just doesn't work. More than that though, I have a problem with the way we do testing in general in that we're trying to guess what the expected value is, which is going to identify the bug, but most of our errors happen in the interactions of things, and finding where a bunch of things are going to come together and interact at the one point that fails, it's virtually impossible to find that one point. 
So what you really want is to have a grid, right, a dragnet, where you're going to have a much larger array of spaces that's distributed randomly over the possibility space and you pick them all out and hopefully then you'll improve the likelihood that you're going to find the error, except you don't want to do that, right, you don't want to have to write a thousand times more test cases, most of which are unlikely to succeed in finding anything anyway, that's a huge amount of effort, no one is going to want to do that. Plus, if there is a model change, it means you now have to update thousands more tests and nobody is going to do that, and I was kind of despairing about that when I saw a talk by John Hughes of Chalmers University about something called QuickCheck, which I thought was brilliant. So to give you the context about that, QuickCheck was developed in Haskell. Haskell is a language that was developed at the University of Edinburgh and it is a pure functional programming language, they say pure in that it's a language without side effects, and that's good and bad, but there are good things about it. The other thing about Haskell is it's got maybe the best type system of any language in the world. Instead of having something like in Java where you specify the type of every little thing, you specify the types of almost nothing, and instead, there is an inference engine that runs as part of the compiler, which goes through looking up everything in the program trying to determine what everything is. Now if you're one of those and you interact with that, then you must be one of those and so on, and it'll keep doing that, trying to solve the entire program, and if it gets to a point where it finds an inconsistency, it can stop and go ah, something is wrong here. And the difficulty with that is that where it finds its inconsistency may be miles away from where you actually made your mistake. 
And so, getting a program to compile can be really challenging, but the theory is that once you get a program to compile, it's guaranteed to run, except that it's not, because it turns out the class of potential errors is infinitely bigger than the class of errors that could be found even by the world's best type system, so you still have to test, and these guys came up with a really nice way of doing tests. Instead of writing specific compare actual versus expected, you write a function which will be true if the system is working correctly, and they call those properties: the system is working correctly if these properties hold. And then QuickCheck will generate random data and throw them at your functions and try to disprove your assertion. And so, they can get tremendous coverage, they're even able to debug real time systems, which is something that is really, really hard to do. So I thought, wow, we should get one of those for JavaScript, so I wrote one. It's called JSCheck and JSCheck provides two nice things, one is case generation, so it'll generate random test cases for you, and it also supports testing over turns, so you can use it to test stuff in Node, you can use it to test things in browsers, you can use it to test synchronous and asynchronous functions. So it's a small library that comes with some functions, these are a few of the interesting ones. The most important one is claim, we'll talk more about that in a minute, that's where you make a claim about the system: if it's working correctly, then this will be true. You can then tell it to check all of the claims that you've made so far, and you can also put a time limit on it. So it'll start all of the tests simultaneously and they all have to finish by a certain amount of time. 
If they don't, then you can record that, and that's really important because the way our systems work now, getting the right answer, but taking too long, is indistinguishable from the wrong answer. And so, we need to be able to test performance as well and that's something that the synchronous frameworks don't do. Then we can get a callback when the thing is done and we'll get a full report of everything that happened. We can also get a callback on each error as the errors occur and you can program that callback to take you into the debugger, so as you're finding bugs, you're in the context where you can fix them immediately. So this is the claim function, it takes a descriptive name, it takes a predicate function, which will produce true if your system is working correctly, and it takes a signature, which is an array of type descriptors, which describes the parameters to the predicate function. So here is an example, we're going to compare the old code with the new code, our predicate will take a verdict function and an argument and we will then determine that the old code for that argument does the same thing as the new code, and we tell it that argument is an integer, so when we tell this thing to check, it will generate random integers and throw them at that function trying to disprove our assertion, and you can set as many as you want. So all of the effort in using the system is in writing these predicates and there are lots of ways you could do it. In this case, we're comparing the old code against the new code, so as we're migrating the system, we can make sure that we haven't changed the behavior of anything. Another way you could do it is if you have symmetrical operations, for example, if you're writing an encoder and a decoder, it's usually the case that the decode of the encode should equal the original thing, so we can test that that's actually true for a large class of trials. 
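A claim of the kind just described can be approximated without the library. The sketch below is not JSCheck and does not use its API: `checkClaim`, `randomInteger`, `lostCount`, `oldCode`, `newCode`, and `buggyCode` are hypothetical names. It assumes the shape given in the talk: a descriptive name, a predicate that receives a verdict function followed by generated arguments, and a signature that is an array of generators.

```javascript
// A generator factory: each call to the returned function yields an
// integer from 1 to limit.
function randomInteger(limit) {
    return function () {
        return Math.floor(Math.random() * limit) + 1;
    };
}

// Run a claim against randomly generated cases and tally the verdicts.
function checkClaim(name, predicate, signature, trials) {
    const report = {name: name, trials: trials, pass: 0, fail: 0, failures: []};
    for (let trial = 0; trial < trials; trial += 1) {
        const args = signature.map(function (generate) {
            return generate();
        });
        let reported = false;
        // The verdict is a continuation, so a predicate may call it in a later turn.
        predicate(function verdict(result) {
            if (reported) {
                return; // only the first verdict for a trial counts
            }
            reported = true;
            if (result === true) {
                report.pass += 1;
            } else {
                report.fail += 1;
                report.failures.push(args); // keep the case that disproved the claim
            }
        }, ...args);
    }
    return report;
}

// Trials with no verdict so far. With a time limit, these would be the lost ones.
function lostCount(report) {
    return report.trials - report.pass - report.fail;
}

// Hypothetical migration: the new code should behave like the old code.
function oldCode(n) {
    return n * 2;
}
function newCode(n) {
    return n + n;
}
function buggyCode(n) {
    return n + n + 1;
}

const good = checkClaim("new matches old", function (verdict, n) {
    return verdict(oldCode(n) === newCode(n));
}, [randomInteger(1000)], 100);

const bad = checkClaim("buggy matches old", function (verdict, n) {
    return verdict(oldCode(n) === buggyCode(n));
}, [randomInteger(1000)], 100);
```

The same shape covers the symmetric case: a predicate could call `verdict(decode(encode(value)) === value)` for a hypothetical encoder and decoder pair.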
We can generate symmetric pairs of things like we can generate a credit and a debit, they're both using random values and make sure everything balances. In some cases, you might just throw a lot of random transactions at something, and after each one, run a deep diagnostic of all your data structures, make sure everything is still consistent. So how would this look like implemented? I don't understand what verdict is. We'll get to verdict in a minute. So verdict is the callback that is being used in the cases. So it comes with a small library of specifiers. You can put each of these in the descriptor array of what types we want to throw to our predicate and it'll try, it'll generate values, random values of each of these types. So if you need Booleans, if you need integers, numbers, objects, whatever, it'll make random things and pass them out. And these are also composable in interesting ways, for example, if you want random social security numbers, I can say I want a string with three digits and then a dash and two digits and a dash and so on or if I need an array containing three elements where the first element is an integer and the second is a number between 0 and 100 and the third is a character string, I can get that, or I need an object with a left property, a top property, and a color selected from the list, or I need an object with a variable number of properties where each property is a four letter name and each value is a Boolean, I can do that. There are lots of ways to compose these, and if it turns out there are some particular test data that you need that's not easily composed from these, you can write your own and it's a generator exactly like the generators you've been writing, so you have a function that returns a function that each time you call it, you'll get the next value, you can write one of those to generate all the random test data that you need. Then the reason this works asynchronously is because of the verdict function. 
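The composable specifiers described above follow the generator pattern: a function that returns a function that yields the next value. The sketch below shows one way such specifiers could compose. The names `integer`, `oneOf`, `digits`, `concat`, `array`, and `object` are hypothetical and are not claimed to match JSCheck's own specifiers or signatures.

```javascript
// A specifier is either a generator function or a literal value.
function resolve(spec) {
    return (
        typeof spec === "function"
        ? spec()
        : spec
    );
}

function integer(low, high) {
    return function () {
        return low + Math.floor(Math.random() * (high - low + 1));
    };
}

function oneOf(choices) {
    return function () {
        return resolve(choices[Math.floor(Math.random() * choices.length)]);
    };
}

// A string of count random decimal digits.
function digits(count) {
    const digit = integer(0, 9);
    return function () {
        let text = "";
        for (let i = 0; i < count; i += 1) {
            text += String(digit());
        }
        return text;
    };
}

// Join the resolved pieces into one string.
function concat(specs) {
    return function () {
        return specs.map(resolve).join("");
    };
}

// An array with one element per specifier.
function array(specs) {
    return function () {
        return specs.map(resolve);
    };
}

// An object whose property values come from the matching specifiers.
function object(shape) {
    return function () {
        const result = {};
        Object.keys(shape).forEach(function (key) {
            result[key] = resolve(shape[key]);
        });
        return result;
    };
}

// Three digits, a dash, two digits, a dash, four digits.
const ssn = concat([digits(3), "-", digits(2), "-", digits(4)]);

// An object with a left property, a top property, and a color from a list.
const box = object({
    left: integer(0, 640),
    top: integer(0, 480),
    color: oneOf(["red", "green", "blue"])
});

// An array of an integer, an integer from 0 to 100, and a generated string.
const triple = array([integer(1, 10), integer(0, 100), digits(5)]);

const sampleSsn = ssn();
const sampleBox = box();
const sampleTriple = triple();
```

Because every specifier is just a function returning a value, a hand-written generator for unusual test data plugs in anywhere a built-in one would.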
Every check when it begins is passed a verdict function that it uses to return its result, and the verdict is just a continuation, it's just a callback, which allows the trials to be extended over many turns, and because we can also put time limits on this, we now have three possible outcomes for every trial, we can see a pass, we can see a fail, or we can see a lost. Lost means we did not get a report before the time expired and those are sometimes as important as the passes and fails. Closure and Continuation Both of these systems, RQ and JSCheck, make tremendous use of closure and continuation, and the really great thing about working with closure and continuation is that they make you feel really smart. I mean, you notice this, right. Yesterday, you were doing a lot of work with closure, it makes you feel smart, when you get these things, it's like mmm. And everybody deserves to feel that way. You know how people who do a lot of functional programming are always acting like they're so much smarter than you are, they're not, but they're working with this stuff all the time and it makes them feel good all the time, makes them feel smart, and that's a really good way to feel, everybody should feel like that. Whereas the stuff you're working with every day, does it make you feel smart? No, it does not. So you want to get a hold of this stuff. It's great. So both of these projects are public domain, they're both on GitHub, they're both totally free. If they can help you, go ahead and take them, there is lots of other good stuff in the world too, if that is better suited to you, use that. So any last questions about RQ or JSCheck or asynchronicity? Yes. There are a couple that queued up here, one, he was wondering if you have checked out Elm, since the type inference is the same as Haskell. Yeah, I mentioned Elm earlier. There is good stuff. 
Everything in that list is good, but I think RQ is of interest to you because it is the best suited, I think, to the workflows that you have to implement. Anything else? There was one from earlier from the security section, he just wanted you to clarify, or he is asking, we are learning about objects and security and how to prevent attacks, but isn't this something that servers are supposed to deal with? It's something everything has to deal with. It's not so much the servers, it's us. It's the people writing the programs who have to deal with this stuff. Yeah, a little bit more specifically though, it sounds like we're talking about, you have an object and you don't want some other object to attack it, this has to do with mashups, right, because you want to lend out your code to somebody else and you want to make sure they don't infect your code. I want to strengthen everything. So if I can make my code strong enough that I can run it with third parties, which is the thing that we want mashups to be able to do, I can be confident that it's going to work with my coworker's stuff as well, because most of the attacks that are bringing our systems down are self-inflicted, right, the stuff that we do to ourselves because we just didn't get it working well. Because you're not saying if you're not doing mashups, don't worry about this. No, no. We need to worry about this all the time. Everything we write should be to that level of quality, everything, everything. The Better Parts The Good Parts And now the better parts. So this is Antoine de Saint-Exupery. He was an aviator back in a time when aviation was really dangerous. That's what we used to call pilots, but he was an aviator. It was really dangerous because the planes crashed. And he experienced several crashes during his career and survived almost all of them. He was once trying to set the airspeed record from Paris to Saigon, and his plane crashed in Egypt, near Cairo out in the Sahara Desert. 
And he was not prepared for survival in the desert. He did very badly there, suffered terrible dehydration. Dehydrated so much that he stopped sweating, which is a really dangerous thing to do in the desert, suffered terrible hallucinations. Fortunately, he was rescued by someone who understood how to treat severe dehydration, and he survived and recovered. Moved to America as the war's beginning, becomes a writer. And he, turns out he's a brilliant writer. He writes one of the most famous children's books, The Little Prince, which is about an aviator who was stranded in the desert, who is visited by a strange little boy who lives on an asteroid and is suicidal. Like all the best children's books, it's not really a children's book. And he wrote other books too. He wrote some very good books about aviation, books aimed at grownups. In one of them, he has one of the best sentences ever written. He says, it seems that perfection is attained not when there is nothing more to add, but when there is nothing more to subtract, which is just brilliant. I've seen this quoted and requoted over and over again. It's used all over the place, in engineering, design, anything that's creative that requires some kind of discipline. It's just a brilliant quote, and he's talking about the design of airplanes, but it seems to talk about everything. And I think it's especially telling for programming because we have a special relationship with perfection, right, that our programs have to be perfect or they're not going to work correctly. And he gives us some insight as to how perfection is attained. It's by removing things. It's not by adding things. And I think we can also apply it to programming languages because we have similar things going on there, that most of our programming languages tend to want to add things, but I think most of them could be improved by removing things. 
And that's where the principle of The Good Parts comes from, that if a feature is sometimes useful and sometimes dangerous, and if there is a better option, always use the better option. And it surprises me that this is a controversial statement. There are lots of people out there who say, I don't want to use the better option. You can't make me use the better option. It's like, but being better should be enough, but it's not. And I think that comes from a fundamental misunderstanding about what it is that we do. We are not paid to use every feature of the language. At the end of a project, there is never a manager with a clipboard saying, did you use a with statement? Did you leave out any semicolons? Did you use comma as an operator? Excellent. Nobody's doing that. We are paid to write programs that work well and are free of error. That's what it's all about. Now free of error, where did that come from? Well it turns out free of error has always been the first requirement. It's just, we so rarely attain it, it's easy to forget that it's the most important thing. (Audience) That's why I never got paid, I guess. So a good programming language should teach you. I recommend everybody learn as many languages as you can because every language will give you some new perspective on something, maybe push you into a new paradigm, and you can take the things that you learned and apply it to all the other languages you know. Everybody should learn as many languages as they can. And the language in my career that has taught me the most has been JavaScript. And it took a while for me to learn from the language because, well I made every mistake in JavaScript that you can make. Starting with the first one, the worst one. I didn't think I needed to learn the language before I started writing in it. That I had so much contempt for it, the first time I saw JavaScript I thought it was the stupidest thing I'd ever seen. 
And it turns out I wasn't completely wrong, but I wasn't completely right either. And I was cursing all of the time I was writing in JavaScript. It made me very, very angry. And eventually, I decided to learn the language. So I got the standard, which you can get free off of the Ecma site, and read it and was surprised to discover that it had lambdas in it, it had functions as first-class values with lexical closure, which I had no idea was in there. Brendan put it in there intentionally, that's not something that gets into a language by accident, but he didn't tell anybody. And so I discovered it in reading the standard, and was like wow, this changes everything. And at that point, my relationship with the language changed. And at that point, it was able to finally start teaching me things, and it has taught me a lot. So one of the things I did with JavaScript was I wrote JSLint because I needed a tool to help me avoid all of the traps that are in that language that are there to defeat me. And JSLint, turns out, is much smarter about JavaScript than I am. My initial intuitions about what were good features of languages and what were bad were extremely unreliable, whereas JSLint was very good at always directing me to the correct point of view. What is it that we can do mechanically in order to find mistakes, because that's what I need to do in order to try to make my programs more perfect. And JSLint has taught me an enormous amount about programming. Arguments Against the Good Parts And so I wrote my own book about my experience with JavaScript and with JSLint, which is called JavaScript: The Good Parts. And the surprising thing about this book is it is still a best seller in its category, which is extremely rare for software books. Most software books are obsolete before they're printed, and it's because things tend to change and roll over so quickly in software. 
But it turns out the things that I was saying in this book are, for the most part, right. And so it's still a relevant book, even though JavaScript has gone through two major standard revisions. This book is still pretty good. So it's not universally adored though. There are lots of arguments against the idea of Good Parts, and I'd like to run through some of those if I may. The first one, it's just a matter of opinion. It's your opinion, my opinion's different. One opinion's as good as another, so there. Turns out that's not the case, that I am the maintainer of JSLint and as such, I get bug reports from people from all over the world telling me, you know I just spent two weeks chasing this thing down. And it was due to some weird edge case in the grammar of the language that no one had ever noticed and it just crippled us, hurt us at a bad time. Could you have JSLint look for that? Because if you can, then no one else will ever have to endure that. So whenever it makes sense to, I incorporate those things. So if you're using JSLint, then you will never experience those things. That is not an opinion. That is a fact. Every feature is an essential tool. I need every tool available in order to do my work, but that's simply not true. We can show that you can write better programs by not using all of the features. So if you can write a better program without using a feature, that feature is not essential. But this feature is sometimes useful. And that sounds like an important thing. I need to be able to use it because it's sometimes useful, except it turns out everything is sometimes useful. You cannot identify anything, which is so dangerous, so toxic, so disgusting, that it is not also sometimes useful. So sometimes useful is about as strong a statement as it exists. It sounds like it's a lot more than that, but it isn't. It actually is not an argument. So if the reason for using a feature is because it's sometimes useful, you simply do not have an argument. 
I have a right to use every feature. Now at this point, the argument has changed. We're no longer talking about what's the best way to write programs. We're now talking about our rights. And it sounds really important and righteous to be talking about rights and defending and blah, blah, blah, except if you follow this argument to its conclusion, ultimately it goes like this. Do you have the right to write crap? Yes, I have the right to write crap. So now we're arguing about, do you have the right to write crap? Is that true or not? I don't know. I don't care. It's not important. What's more important is you have a responsibility to write code that works well and is free of error. I need the freedom to express myself. I'm a poet, and the way I express my poetry is by leaving out my semicolons. I need to reduce my keystrokes. This is a really insidious one, again because we imagine that we spend most of our time typing, but we don't. So if you were to take all of the clean, finished code you write in a year, you could type it in a day. So that raises the question, well what are you doing with the other 99% of your time? Keystrokes is not the important thing. If I could figure out a way to increase your keystrokes by a factor of 10 and cut your errors in half, that would be a huge win. Unfortunately, I don't have that formula for you, but that's the sort of ratios we're looking at. It is an insult to suggest that I would ever make a mistake with a dangerous feature. Inferior people could obviously have problems by doing that, but I must have satisfaction. And then, there is a good reason those features were added to the language. And I can tell you from experience, that is absolutely not true. I've seen how things get put into languages, and those things happen for all kinds of reasons and they're not all good ones. The creator of JavaScript, Brendan Eich, talks about foot guns. 
A foot gun is a feature of a language that you use to shoot yourself in the foot, and he put a lot of those in JavaScript, and I think he regrets it. And from time to time, he would tell the standards committee, let's not be adding too many more foot guns. They don't always listen to him, so you can always find more, but they're --- guys, look, look, boom, I almost always miss. Watch this, boom. (Laughing) So the purpose of a programming language is to aid programmers in producing error-free programs. That's the whole deal. So we used to think that it was not possible to write good programs in JavaScript because the language was such a mess that it just wasn't possible, so no reason to even try. It turned out that was not true, that it is not only possible to write good programs in JavaScript, it is necessary. It is so easy to go off the rails with this language. It requires tremendous discipline, maybe more discipline than any other language, so you really have to bring it when you're using this language. So there's a lot of confusion still about Java and JavaScript because they kind of look similar and they have the same name. Sometimes I'm introduced as a Java expert, which I am, but I'm usually invited because I'm a JavaScript expert. And it's similar to the confusion we have between Star Trek and Star Wars. We've got two science fiction franchises with basically the same name. And from 10 feet away, they kind of look like the same show, but we know they're not the same thing, right? The differences are really significant. In Star Trek, you've got phasers and photon torpedoes, you've got uniforms, regulations, everything's regulations, right? Whereas in Star Wars, you've got lightsabers and blasters, you've got proton torpedoes, completely different torpedoes, you've got sand, I don't like sand, and chaos. So, which one of these is JavaScript? Well, obviously Star Wars, right? In fact, when you're working with JavaScript, JavaScript presents you with a choice. 
You can go Jedi or you can go Jar Jar. And, quite a lot of our brothers like to go Jar Jar. So one thing I've observed, or two things I've observed, are the fantasy of infallibility and the futility of faultlessness. I say the fantasy of infallibility, especially in younger guys who think they have such mad skills that they can write just any crap, and they're going to get away with it. That they can just do stupid, and it's going to be all right. Then the futility of faultlessness, I see this more among older guys, guys who've been doing this for a long time, never seen it go right. It's never going to go right. Why even try? So two very different perspectives, but they both lead to the same thing, danger driven development where you're doing crazy reckless stuff in the code just to keep it interesting. Anyway, don't do that. Oh, and don't forget your semicolons, right? So, one of the things that makes programming difficult to manage is the difficulty of scheduling, that usually when we're doing something we're doing it for the first time we've ever done it, and so we don't know, there's no way we could know how long it's going to take, but we'll make guesses. And our guesses, because we're optimists, are usually wildly wrong, but that's the way it goes. But what's even harder is scheduling the second time, which is the time B, the time it takes to make the code work right. Now that should be 0, right? You should write it and it should be right, and we're done. But that's not the case. That often time B is greater than time A. I've seen cases where time B is infinite. That's what happens when the code's finished, but then the project is cancelled before they can get it to work. That happens way too often actually. So if you do anything in time A that increases the likelihood of time B being bigger, you're doing it wrong. That's not the time to be saving time. So we should always take the time to code well. 
Sometimes there's this idea that, well we're just doing this quick and dirty thing so let's just be really sloppy and get it done fast, except getting it done still means it has to work. And getting it to work is still going to be easier if you write it well. That the time it takes to write it well is going to be less time than it takes to make it work right if it's not written well.

New Good Parts in ES6

We've got a new version of JavaScript that was approved by the General Assembly six months ago. And many of the features of it are already in browsers now, and some of them will be coming in later. And so this is a good time to look at what's in the new language. Did we get any good stuff in it? And the answer is yes. We've got actually a lot of good stuff in it. So my number one most favorite new feature of the language, proper tail calls. So the compiler will turn that into a jump instead of a call return. So it will go a little faster, take less memory in getting there. It enables continuation-passing style and other modes of programming. It's great, so that's my most favorite feature. With this, JavaScript finally becomes a real functional programming language, which is great. My second most favorite new feature, the ellipsis operator, which we talked about quite a bit yesterday. Again, there's two versions of the curry function. There's that one and that one. They both do the same thing. This one's inexcusable. This one's actually pretty nice, so that's going to be a good thing. We've got modules in the language now. Unfortunately, the module system we've got is way too complex, but it works. So there's a subset of the module system, which is adequate, which is you just want to say, here's some stuff. I'm going to export this thing, that's how you use it, and that's all it needs to be. And it can do that, so that's good. That's much better than requires or any of the other module loaders. 
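The two curry versions shown on the slides are not reproduced in this transcript. As a rough sketch of the contrast being described, the first function below does the job with the arguments object, and the second uses the ellipsis operator. The names curryOld, curry, and add3 are illustrative, not taken from the talk.

```javascript
// ES5 style: arguments is not a real array, so it has to be sliced.
function curryOld(func) {
    var slice = Array.prototype.slice;
    var first = slice.call(arguments, 1);
    return function () {
        return func.apply(null, first.concat(slice.call(arguments, 0)));
    };
}

// ES6 style: the ellipsis collects and spreads the arguments directly.
function curry(func, ...first) {
    return function (...second) {
        return func(...first, ...second);
    };
}

// Hypothetical function used only to exercise both versions.
function add3(a, b, c) {
    return a + b + c;
}

const viaOld = curryOld(add3, 1, 2)(3);
const viaNew = curry(add3, 1)(2, 3);
```

Both calls supply the same three arguments in two steps, so the two versions should agree on the result.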
We've got let and const, which by themselves are not that big a deal, except they are less confusing to Java programmers. So it's good not to confuse the Java programmers, so I like that. One thing I find confusing about this is that I've seen a lot of developers who are confused about the difference between const and freezing. And so they're surprised that you can put a mutable object in a const and still mutate that object. So const is for variables, freezing is for values. Just remember that. It's pretty easy. We have destructuring. Destructuring is something you can do in let and const statements, and also in assignment statements where we can create new variables and initialize them from some object at the same time. So again, not a big change, doesn't let us do anything we couldn't do before, but it is a convenience. Yeah? (Audience) I think you said freezing is for values. Did you mean freezing is for objects? Objects or values. (Audience) Okay. In theory you could freeze numbers, except they're already frozen. And strings are already frozen, so you don't need to freeze those. (Audience) One follow-up, what about nested objects? It only freezes the outer one. There is no deep freeze in the language. (Audience) So you have to do a recursive freezing. If you care about that, you have to descend into the object and freeze each level. Then we have WeakMap. WeakMap is a really nice thing. WeakMap works the way objects should work. With a WeakMap, you can take any JavaScript value and use it as a key, and then associate any other value with it, which is the way objects should've worked. And it also has a nice property that it does not root the keys for the purpose of garbage collecting. So if the only place where an object still exists is as the key of a property in a WeakMap, it will get garbage collected automatically, which is brilliant. 
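Two points from this passage can be sketched together: const versus freezing (including the recursive freeze from the audience question), and a WeakMap keyed by an object. This is a minimal illustration rather than anything shown in the talk. The deepFreeze helper is hypothetical and assumes the object graph has no cycles. One caveat worth knowing is that WeakMap keys have to be objects; primitives such as strings are rejected with a TypeError.

```javascript
// const fixes the variable, not the value it holds.
const settings = {theme: "dark"};
settings.theme = "light";               // allowed: the object is mutable

// Object.freeze fixes the value, but only one level deep.
const shallow = Object.freeze({fonts: {body: "serif"}});
shallow.fonts.body = "mono";            // allowed: the inner object is not frozen

// Hypothetical recursive freeze; assumes no cyclic references.
function deepFreeze(value) {
    if (typeof value === "object" && value !== null) {
        Object.keys(value).forEach(function (key) {
            deepFreeze(value[key]);
        });
        Object.freeze(value);
    }
    return value;
}
const deep = deepFreeze({fonts: {body: "serif"}});

// WeakMap: the key is the object itself, compared by identity.
const notes = new WeakMap();
const session = {id: 1};
notes.set(session, "private note");
const found = notes.get(session);       // the stored note
const lookalike = notes.has({id: 1});   // a different object, so not found
```

Once nothing else refers to session, the engine is free to collect both the key and its note, although when that happens is not observable from code.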
So that means there is a class of applications, which we can write in JavaScript now using WeakMap that we could not write in the language before, which is really nice. The biggest problem with WeakMap is it has maybe the worst name ever put on a feature of a programming language. Because who wants to use something called a WeakMap, right? But it's actually a really good thing. It's just a really bad name. And we have megastring literals. So we have the backtick, which we can now use to make literals that can span many lines. So here's an example of a regular expression. This particular regular expression matches all of the number literals that are in ES6. And you can see why I don't like regular expressions, because you can't make any sense out of this, right? Just looking at that, it's just noise. It's really hard to understand what it's about. So I can make things a little easier now. I've got a function here, which will call the regular expression constructor passing it in a string, but first it will remove all of the spaces from that string. And so now I can use one of these mega literals to write it out with all the whitespace in it. And with whitespace in it, regular expressions aren't quite that bad. We've got spaces between each of the elements and I can line them up, and you can kind of see what's going on. So we now have binary literals. You can write 0b followed by 0s and 1s. And we've got octal literals, which I'm still not sure about those. You've got a 0 followed by an o, and then octal digits. And we've always had the hex and the floating point. So that's pretty neat. This is a much nicer way of writing regular expressions. There's still things about the mega literals that I don't like. That backtick is the smallest operator we've got, and it's used to bracket the biggest structures we've ever had. So there's potential for misreading here, so we're going to have to be cautious with this. But I think this is probably a good thing. And again, regulex. 
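The slide with the whitespace-stripping function is not in the transcript. A minimal sketch of the idea follows, assuming a helper named megaRegExp (a made-up name) and a simplified number-literal pattern that is not the full ES6 grammar. String.raw is used so that backslashes in the literal reach the RegExp constructor unchanged.

```javascript
// Build a RegExp from a multi-line template literal by deleting all
// whitespace first, so the pattern can be spaced out and lined up.
function megaRegExp(source, flags) {
    return new RegExp(source.replace(/\s/g, ""), flags);
}

// Simplified sketch of number literals: binary, octal, hex, decimal.
const numberLiteral = megaRegExp(String.raw`
    ^ (?:
        0 [bB] [01]+
      | 0 [oO] [0-7]+
      | 0 [xX] [0-9a-fA-F]+
      | (?: \d+ (?: \. \d* )? | \. \d+ ) (?: [eE] [+\-]? \d+ )?
    ) $
`);

const samples = ["0b1010", "0o17", "0xFF", "3.14e-2", ".5"];
const allMatch = samples.every(function (text) {
    return numberLiteral.test(text);
});
```

One consequence of this design is that the pattern can no longer match a literal space directly; an escape such as \x20 would be needed for that.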
If you're writing regular expressions, even with the whitespace, you still want to be using regulex. So we've got fat arrow functions, or farts for short. The motivation here, there were people who were complaining that with function you have to type eight letters, you have to f-u-n-c. They said, that's too much to type. And I said, well there's this new --- there are these new things now called keyboard back rows. And they said, it's too much to read. Okay so, we added them. So it's a short form for writing a function that will return a value. So we've got the parameter list, we've got name, and then we've got the fart, and then we've got the return value after it. So you don't have to write function and you don't have to write return. Except this will fail. This looks like it's a little mini constructor that's going to return a new object, right? It's going to return a new thing where the id is whatever the name is that we passed. This is going to fail. Instead, it's going to give you a function that returns undefined, which is not good. And it's because of a syntactic ambiguity in the language. The committee was aware of this when they did it. They just, we just keep adding new bad parts. We can't stop doing it. So because of that, I'm not a big fan of these. (Audience) So it will compile, but it won't run? It will compile. It'll run. It'll return undefined when you call it. You will not get the new object that you're trying to construct. We keep doing this to ourselves. We keep doing this to you, is how that actually works. (Audience) Because the brackets are ambiguous? Yeah, because the brackets are ambiguous. We could've --- there are a number of ways they could have disambiguated it, but they decided instead to make them blocks. (Audience) Because that's not --- what you put inside the brackets isn't really the contents of a function. 
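The failing slide is not in the transcript, but the pitfall can be reconstructed from the description: braces directly after the arrow are parsed as a function body, not an object literal. In this sketch the names makeBroken and makeFixed are illustrative. Wrapping the literal in parentheses is the usual workaround.

```javascript
// The braces are read as a block; `id:` becomes a statement label and
// `name` an expression statement, so nothing is returned.
const makeBroken = (name) => {id: name};

// Parentheses force the braces to be read as an object literal.
const makeFixed = (name) => ({id: name});

const broken = makeBroken("widget");    // undefined
const fixed = makeFixed("widget");      // an object whose id is "widget"
```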
Typically a function would return --- I'm confused about --- I'm confusing this fart with what we're using in another language, in TypeScript I think. Because they put fart in TypeScript and then using it --- been using it to call a function to do something. So this will work for lots of other purposes. The place where it fails is if you're trying to return a new object literal. (Audience) You wouldn't have on the other side of this x var x=. Yeah, or you'd pass this --- more likely you'd pass this as an argument to another function. That's one of the most common uses for these. So anyway, moving on. And there are lots of other bad parts in the language too, like generators with the yield operator. I think that was probably a mistake. It adds a lot of complexity and not much value and kind of locks people into patterns, which I think are things that we should be evolving away from and not getting trapped in. A lot of the other new features, we haven't had enough experience with them yet to know if they're going to be good parts or bad parts. But there is one part that I'm confident is going to be a very, very bad part, and that is class. This was the most requested new feature for ES6. And it was mostly requested by Java programmers who were having to move into JavaScript and were unhappy about it and saying please, can you turn JavaScript into Java so we don't have to learn this crappy language? And so we did that, except we didn't actually add class. What we did was we added syntactic sugar on top of the prototypal stuff that's already there. So when you get down into the edges, it's not going to work the way you expect it will. But worse than that is it keeps you trapped in the mindset of the classical model, which means you will never learn to use the functional model. And also, you don't have the strong type system that you need in order to deal with classes. 
And so I'm pretty confident that the people who are going to be using classes are going to feel vindicated. It's in the standard, I have a right to use it, it's all that good stuff, but they will never learn to use the language effectively. They will go to their graves never knowing how miserable they were.

Good Parts Reconsidered

So I'm also reconsidering things that I had put in the book. Do I still think the things I said were good then, are they good now? So in the book I didn't recommend use of new, I stopped using it. I recommended using Object.create instead. And in fact, I managed to get Object.create added to the language just so that I could use it. So I was really surprised to kind of notice that I've stopped using Object.create. And the reason for that is because I've stopped using this. If you're not using this, then create doesn't do all that much for you. And the reason I stopped using this was because of ADsafe. So in 2007, there were a number of experiments in trying to figure out how to turn JavaScript into a more secure language for doing mashups and third-party interactions. There was FBJS at Facebook, there was Web Sandbox at Microsoft, there was the Caja project at Google, and there was my own ADsafe project. And one of the difficult problems everybody was struggling to solve was, what do you do about this? Because if you have a method, this will get bound to the object of interest, and that's good. But if you call that same method as a function, this gets bound to the global object, which gives away the farm. And how do you deal with that? And so the approach that the other three projects took was they have translators. They read JavaScript and they write JavaScript, and the JavaScript they write is much bigger because they're adding a lot of runtime checking and indirection in order to find cases where this is turning into a global object and allowing bad things to happen. The approach I took with ADsafe was much simpler. 
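The hazard described here, the same function seeing a different this depending on how it is called, can be shown in a few lines. This is only an illustrative sketch with made-up names. The Function constructor is used because it always produces non-strict code, so the example behaves the same whether or not the surrounding file is strict. The strict-mode variant at the end is an addition, not something covered in the talk.

```javascript
// Non-strict function that reports what `this` is bound to.
const whoAmI = new Function("return this;");

const account = {owner: "alice", whoAmI: whoAmI};

// Called as a method: this is the object of interest.
const asMethod = account.whoAmI();

// The same function called as a plain function: this is the global object.
const detached = account.whoAmI;
const asFunction = detached();

// In strict code the plain call gets undefined rather than the global object.
function strictWhoAmI() {
    "use strict";
    return this;
}
const asStrictFunction = strictWhoAmI();
```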
I said, this is illegal. So if we see a program with this in it, we reject it. Done. Easy. So we don't rewrite the programs, we don't make them harder to debug or slower to run. If a program passes ADsafe, it gets run at full speed without any modification. But the ADsafe rules would reject pretty close to 100% of all existing programs in the web. So in order to use ADsafe, you had to work in the ADsafe dialect. So I decided as an experiment, well, I'll work in that dialect, I'll see what it's like and how painful it is. And after working with it for a while, I was really surprised to learn that it got easier. That I expected it was going to be a hardship that I was going to have to get around, but it actually got easier and my programs were getting better. I was forced to use the functional patterns because the prototypal patterns weren't available because I stopped using this, and I really liked it. And so as I've gained confidence in this model, I've become an advocate now of getting rid of this entirely and going all with functions because that's where the goodness is. Yeah? (Audience) What does JSLint say about this? So JSLint, this is now an option. So there is now an option that you can specify, which will suppress warnings about this. I don't recommend using that option, but it's there. I added options to JSLint for transition, because over time I'm constantly looking at how can I make this language smaller and better? And things that I thought were maybe a good idea last year, this year I'm thinking I'm not sure. Like this year I'm starting to decide that labeled break is probably a bad idea. I actually haven't used one in many years, and I think I'm better for not having used one in many years. So labeled break is on its way out to the bad part list. But there are people who are using JSLint who, when JSLint's rules change, go ah, we're used to doing that and we can't do that now. So I added the options so that they can transit out. 
We can just get by for now and eventually we'll fix our code, except what happens is people don't fix their code and they design their programming style based on what options they can turn on. So I'm trying to remove options now from JSLint as much as I can. But I recently added the this option so we can transit out. (Audience) You don't use prototypes at all? You don't use prototype at all in the code you write, you personally? I am not using prototypes anymore. (Audience) So then pointing out yesterday that one of the drawbacks to functional versus objects is that all interfunctions --- every function you have that there's of instance of that function, that's functional programming. That's ---. Yeah, we're going to talk about that. That's coming up next. (Audience) Okay. I was going to say, are there any other drawbacks to abating prototypical? That's the one. (Audience) Okay. So I've stopped using null. JavaScript has two bottom values, which is at least one more than it should, and so I've stopped using null. I just use undefined now for everything, even though it's longer to type. That doesn't matter. And I stopped relying on falsiness. I used to think that falsiness was a good idea, and I advocated having comparisons in ifs that were as small as possible. I now think that was a mistake, that we should be trying to make comparisons that are as explicit as possible, trying to intelligently bifurcate the true cases from the false cases. I've stopped using for because in ES5 we got forEach and every and map and all those others, and those are great. So for most purposes, I don't need to use for loops anymore. I don't use for in because I managed to get Object.keys added to ES5. For in never worked correctly anyway because it would always dredge through the prototype chain and you get all the inherited methods coming up, which you would then have to filter out. For in never worked right. So ES6 will have proper tail calls. 
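Proper tail calls matter for code like the repeat function discussed in this part of the talk, which takes a function and calls it until it returns undefined. The slides are not in the transcript, so the two versions below are a reconstruction from that description, with illustrative names. Not every engine implements proper tail calls, so the recursive form may still grow the stack where they are missing.

```javascript
// Version 1: an ordinary loop.
function repeatLoop(func) {
    while (func() !== undefined) {
        // keep calling until func signals that it is done
    }
}

// Version 2: tail recursion. The recursive call is the last thing the
// function does, so an engine with proper tail calls can turn it into a jump.
function repeatTail(func) {
    if (func() !== undefined) {
        return repeatTail(func);
    }
}

// Hypothetical source: yields 1, 2, ... up to limit, then undefined.
function makeSource(limit) {
    let calls = 0;
    return Object.freeze({
        next: function () {
            calls += 1;
            return (calls <= limit) ? calls : undefined;
        },
        calls: function () {
            return calls;
        }
    });
}

const loopSource = makeSource(3);
repeatLoop(loopSource.next);

const tailSource = makeSource(3);
repeatTail(tailSource.next);
```

Each source is called four times: three calls that return a value and a final one that returns undefined and stops the repetition.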
And when we get that, I will stop using while. So here are two versions of the repeat function. Repeat takes a function as an argument and will call it until it returns undefined. And that's a version written using a loop, and that's a version using tail recursion. In ES6, these should both run at the same speed, both consuming the same amount of memory. So this will probably be the last feature implemented in ES6, so I'm still waiting for it. But when it comes, then I'm done with loops. I'm going to be doing tail recursion from here on out. (Audience) What does ---, under the hood what does tail recursion look like in order to prevent from having this extremely long stack? It's really simple. It's just, this turns into jump to. It's like a goto with an argument list, so we jump to the repeat function. So we're jumping to the top, and if this has a new value, which it doesn't, the new value would get assigned up here and we go again.

The Next Language

So I've been thinking a lot about the next language. What's the language that replaces JavaScript? And I really hope there is a next language because if JavaScript is the last programming language, that would be really sad, wouldn't it? I mean for our kids, right? We need to leave them a better language, right? We can't allow JavaScript to be the last language, so something is going to have to succeed it. And I've been searching for the next language. What are the signs? How will we know when it's here? Sort of like, awaiting the Messiah. How do we know? And I don't know, but I'm looking for it and I'm starting to try to understand what it's going to be. I'm confident it's not here yet. Well actually, I don't know that. It might be it's here, but I just haven't recognized it yet. People would tell me, oh it's C#. No, no, no, or it's Java. It's been here the whole time. No, no, no, the next language is another language, and it's the thing which is the right thing to replace JavaScript, and we haven't seen it yet. 
But I think a lot about what it's going to be and what it's going to do. I am confident that when it arrives we will reject it out of hand as we always do because programmers are as emotional and irrational as normal people. We think that we're not, and maybe our spouses tell us we're not, but it's true. We are, and most of what we think about what we do is based on emotion and not on reason, even though we imagine it's the other way around. So this sounds like a wild charge to make of my own profession, but I have some good reasons for saying this, and I think the historical record backs me up on this. For example, it took a generation to agree that high-level languages were a good idea. Back in the early days when everything was in assembly language and the first higher-order languages were being developed, FORTRAN, COBOL, and so on, who would have most benefitted from use of those languages? The programmers. Who was it who was opposed to those languages? It was the programmers. They were complaining that the languages were taking control away from them, that they didn't give them the performance that they needed. They had all of these reasons for why they wanted to stay down in the muck, that they didn't want to be elevated into these more expressive, more productive languages. It took a generation to agree that goto was a bad idea. Dijkstra publishes his letter in '67, and that starts an argument that literally goes on for 20 years. And the arguments were all silly arguments. We need the performance of the goto. It's how I express myself. I can't be me if I can't use goto. We have a goto tradition. My grandfather used goto. You can't take --- my cold dead hands. All of those arguments were all made about goto. It took a generation to agree that objects were a good idea. So objects are first discovered in Simula in Norway in 1967. (Audience) Under the ice? I'm sorry? (Audience) Under the ice? You said they were discovered. Well, in Norway. 
(Audience) Under the ice. No, in Oslo. And as always happens with the really important innovations, the world took no notice at all, except for one guy. A graduate student at the University of Utah, Alan Kay, who then takes this idea to Xerox PARC in California thinking that this idea of objects is so incredibly powerful that he can use it as a programming language for children that children can use to program their personal portable devices. There was a lot of his vision, which was right, a lot of his vision we haven't caught up to yet. We still don't have the language for children. I don't think we have the language for adults yet, but we're still working on it. But they then spent a decade developing that language and did a brilliant, brilliant job of it. And so in the late '80s, the industry had a choice. We're going to go into object-oriented programming, it took a long time to get there, but we're going to do it, are we going to go with Smalltalk-80, one of the best designed programming languages in history or are we going to go with C++? And the decision was made by people who fundamentally did not understand object-oriented programming, and they chose C++ because in order to use that language, you did not need to understand anything about object-oriented programming. That language got some things fundamentally wrong about object systems. Unfortunately, the language has been extremely influential and has set the mold for virtually everything that's happened ever since. I don't know if we're going to ever catch up to Smalltalk. Then it took two generations to agree that lambdas were a good idea. So Alan Kay, who was the Smalltalk guy, he started by writing a little program in NOVA BASIC on a Data General minicomputer, which demonstrated his weird little language. And he started touring with this idea, taking it to labs and universities. And he visited MIT. Very smart guys at MIT, and he's telling them about his new language. 
But it's still early in object-oriented programming and he doesn't have the vocabulary that we have now to describe what's going on. So he couldn't say, you invoke a method on an object, because nobody knew to say that yet. So he described it as you send a message to an object. Well the guys at MIT listening to him said, well you're not actually sending a message, you're making some routine invocation with indirection, but what if you did send a message? And that started research in the actor model. In the actor model, you actually have these entities that are running in separate systems that can send messages to each other. That's basically what actors are. And the guy who came up with this, Carl Hewitt, is one of these guys who is so amazingly smart. It's like he was born on the other side of the paradigm shift. When he talks about stuff, people cannot understand what he's saying. He's very clear, he's very eloquent, he speaks really well, but he's talking from a frame of reference which is so foreign to the rest of humanity that nobody understands him, including the other people at MIT. They are all very smart and they all had a lot of respect for Carl, but they couldn't figure out what he was raving about with all these actors. So a couple of them, Sussman and Steele, decide we need to build something in order to understand what it is that Carl's talking about. So they start by taking Lisp and rewriting it into a language which models the actor model. And they didn't fully implement the actor model, so they never did figure it out. But the language that they created was called Scheme. And they had accidentally discovered functions with lexical closure and all the stuff that we've been working on today. So the fact that you could have higher-order functions, functions that return functions, and all that stuff, happened by accident trying to figure out what Carl Hewitt was talking about. 
And that is maybe the most important history --- important discovery in the history of programming. And as always happens, the world took no notice of it at all. And it just sat around at MIT for years and years going nowhere, and is only now, finally after 40 years, finally coming to the mainstream. And it's coming to the mainstream because it was actually a really good idea, and its time has come, we really need it now. So the reason these things take so long, everything takes at least a generation, is because we don't change minds. We have to wait for a generation to retire or die before we can get critical mass on the next new idea. And that's the way progress goes. We imagine this is an extremely innovative industry, and in some ways it is, but in some ways we're just like everybody else. So I lived through the goto thing. I remember when that was happening and all the arguments, and they were really emotional, angry arguments. And all the arguments were from emotion. There was very little argument from fact. And it just went on and on and on and on, and then it got quiet. And I was like, are they gone? Can we get rid of the goto now? And we did, and we got rid of it. And it's gone, and we're not missing it. And all of the promises that the world was going to end if we got rid of goto or the lives of programmers were going to be made more miserable, none of that turned out to be true. We're not missing it. We're doing great without it. In fact, some of our languages have maybe a little bit too much goto left in them, but we're getting by. So it turned out we never needed goto. We don't need it, we're doing better without it. And in fact, what actually happened was by getting rid of goto, it made it easier to write programs of greater complexity. Because if your only control construct is goto and conditional goto, there's a limit to how complex a program can get before it becomes unmanageable. 
And by getting rid of goto, we could do better with larger programs and we could do better at achieving our ambitions in terms of writing software. So the people who were arguing against getting rid of goto were the people who would've benefitted from getting rid of it. They benefitted from getting rid of goto. And it is always like that. I expect it will always be like that. So in looking for the next language, I'm looking for it to be different in some important ways, and I'm expecting it to be crucified for those differences.

Evolution of Inheritance

I think of programming languages as being in two very distinct classes, systems languages and application languages. A systems language is a language that you would use for writing system stuff, something that needs to be close to the metal like a kernel, a device driver, a memory manager, that kind of low level stuff, and nothing else. Everything else should be written in application languages. Now we need a new system language. Our dominant system language is still C, which comes from the late '60s. That was a long, long time ago. We seem to have lost the ability to innovate in system languages, and it's not like we don't need them anymore. I think we need a better version of C, but it's a long time coming. But I'm more concerned with the application languages because that's where we all live, and we need better application languages. I think a mistake that you can see in lots of languages is that they're confused about which side of this line that they're on. For example, for my money the biggest design error in Java was it couldn't decide if it wanted to be a system language or an application language. And one of the signs of that is that it requires the use of threads at the application level, which I think was a tragic mistake due from that confusion. So you can take the application languages and divide them into two classes. There's the classical school and the prototypal school. 
And for a long time, I've been an advocate of the prototypal school because of the obvious problems that are in the classical school. You've got the classification taxonomy problem that we talked about earlier. Back when the goto argument was going on, maybe at its loudest, someone published an article saying we should also eliminate the COMEFROM statement. And it was a joke. There were people who attempted to design programming languages based on the COMEFROM statement. But it turns out, we have COMEFROM statements in our classical languages, except we call it extends. And what it does is cause coupling of things, and coupling can be a problem. When you couple things together, then one becomes dependent on the other. You cannot change them independently anymore, and that causes systems to become brittle. And that happens in class systems. And so the big advantage in prototypal systems is we eliminate that coupling, or we at least reduce it significantly such that that no longer becomes a problem or a barrier to progress. So I was a big advocate of prototypal inheritance. Since then, I've changed my mind. The biggest benefit you get from prototypes is memory conservation. And I think that might have been important, probably was important, well maybe it wasn't important in 1995. It's not so important now that we've had --- memory has become so abundant. You now have gigabytes of RAM in your pocket. You just can't imagine that much memory. You really don't understand numbers that big, but we still think about memory as though it's doled out in K. So yeah, it may have made sense then, not so much now. It also is a source of confusion that we have own properties and inherited properties, which for some purposes act the same, but for some purposes don't. I'm looking to get rid of all sources of confusion, as you know, because confusion causes bugs and exploits. It also provides retroactive heredity, that you can change what an object inherits after it's constructed. 
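Both sources of confusion mentioned here can be seen in a few lines. This is an illustrative sketch with made-up names (base, child), not code from the talk. It shows own and inherited properties agreeing for some operations and disagreeing for others, and then a change to the prototype reaching an object that was constructed earlier.

```javascript
const base = {
    greet: function () {
        return "hello";
    }
};
const child = Object.create(base);
child.name = "child";

// Own versus inherited: some operations see both, some only own properties.
const seesGreet = "greet" in child;     // true: the `in` operator walks the chain
const ownsGreet = Object.prototype.hasOwnProperty.call(child, "greet");   // false
const ownKeys = Object.keys(child);     // only "name"

// Retroactive heredity: changing the prototype changes existing objects.
const before = child.greet();           // "hello"
base.greet = function () {
    return "goodbye";
};
const after = child.greet();            // "goodbye"
```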
I've not found any good uses for this, but I can imagine a lot of really bad uses for it. And it's something that is provided by this architecture. One bad thing about it is that it's performance inhibiting. So JavaScript engines go fast by making assumptions about the shape of objects. And they have to be pessimistic when it comes to prototype chains because a prototype chain can change its contents at any time and they need to be prepared for that. And that slows things down. So I used to be in favor of prototypal inheritance. I'm now in favor of class-free, object-oriented programming. I think class-free, object-oriented programming is JavaScript's gift to humanity. I think that's why this is an important language. It's why I hope it becomes an influential language. So, let's review this again. So this is block scope. And we've got functions, which have the same scope relationship. And we've got our closure again. And we've got this really important form where we have a function that returns a function. And this is, I think, the best, most important discovery in the history of programming. And we have it in this silly little language. So to review one last time, this is the model that I recommend for doing class-free, object-oriented programming using functions in JavaScript. We've got a constructor, which takes a specification object. We make our instance variables, which we can initialize from the specification object. We get other methods or other functions that we can use that we can get from other constructors, and we can call as many of these as we like. We then create our methods, which will close over all of the members, all of the other methods, all of our methods, and the specification object. And any of these which need to be public, we will put in the outgoing object, which we then freeze. By doing this, we get an object which has a hard interface, an un-corruptible interface. It will always work exactly the way it should. It cannot be confused. 
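As an illustrative sketch of the recipe just described (not code from the talk), the steps might look like this in JavaScript. The names `logger`, `counter`, `spec`, `prefix`, `name`, and `start` are hypothetical, chosen only for the example: a constructor takes a specification object, keeps its state in closure variables, borrows a method from another constructor, and returns a frozen object holding only the public methods.

```javascript
// Hypothetical helper constructor; its method is reused by `counter` below.
function logger(spec) {
    const prefix = spec.prefix;
    function format(message) {
        return prefix + ": " + message;
    }
    return Object.freeze({format});
}

// Hypothetical constructor following the described recipe.
function counter(spec) {
    // Instance variable, initialized from the specification object.
    let count = spec.start;
    // A method obtained from another constructor (composition, no inheritance).
    const {format} = logger({prefix: spec.name});

    // Methods close over `count`, `format`, and `spec`.
    function increment() {
        count += 1;
        return count;
    }
    function report() {
        return format(String(count));
    }

    // Only the public methods go into the outgoing object, which is frozen.
    return Object.freeze({increment, report});
}

const clicks = counter({name: "clicks", start: 0});
clicks.increment();
clicks.increment();
console.log(clicks.report());        // "clicks: 2"
console.log("count" in clicks);      // false: state is reachable only via methods
console.log(Object.isFrozen(clicks)); // true
```

Because `count` lives in the closure and the returned object is frozen, callers can neither read the state directly nor replace `increment` or `report`, which is one way to get the hard interface described above.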
And it also corrects, I think, a problem that we've always had in objects in that some things you want to access through a method and some simply by poking at the properties. Except if you allow people to poke at the properties, then there's no defense against inconsistency. Everything, all changes at least to the object, should go through the interfaces. A Bug Story I made a bug once, and I need to confess. I need to come clean and share this with you. It was in 2001. And I was writing in Java. I wrote one of the first JSON parsers, and it was part of a reference implementation to show people how easy it was to write a JSON parser. And it included this statement, which created an index variable which counted characters into the string or stream that was being parsed. So if there were a syntax error, we could tell you at what character position that error occurred. And an int is big enough to cover, what, about 2 billion characters? Which is pretty big. That was --- 2 GB was a big disk drive at that time. And the way I was using JSON, I never had a message that was bigger than a couple of K. So I thought, that's a lot of head room. That's being pretty generous. I'm future-proof, this is good. So a couple years ago, I got a bug report from somebody who had passed a JSON text to the parser that was several gigabytes in size, that contained a syntax error past 2 GB. And the error message that my parser provided was wildly wrong. And they looked at the code and they very quickly figured out why it was wildly wrong, and it was because I said int. So I hate int. Int's terrible. Int, I think, is one of the worst ideas in the history of programming. And the reason I hate int is because of what it does on overflow. So let's talk philosophically. So what should happen if you have a value that's too big to fit in a cell of memory? What should happen? There's one school of thought that says, well the CPU should halt. That's kind of old school.
You're not going to get nine 9s that way, but at least you're guaranteed you're not going to create any bad results, right? Death before confusion. There's another school that says, we'll have an interrupt. We'll raise an exception. We'll go to some other path which says something eh, eh, do something about that, give some warning, and prevent the original operation from following through. That's reasonable. Another approach would say, we'll provide some sentinel, some NaN value, and put that in memory instead. So when you go back and look for it, it'll say, I don't know what you're looking for, but it's not here. Another approach might say, take the largest thing that does fit and put that in there instead. That's called saturation. That's a reasonable thing to do in computer graphics and signal processing, but you don't want to do that in financial applications. No, no, no. But suppose what you want to do is maximize the likelihood and effect of errors. I think what you want to do is throw away the most significant bits and don't tell anybody. And that's what we do. And I don't think it makes sense, so why do we do that? It's because it made sense in the '50s. So in the '50s, we made computers out of vacuum tubes, and vacuum tubes are big, and they consume a lot of power, and they get hot, and they burn out quickly. So the more tubes you have in your ALU, the more it costs to build, the more it costs to operate, the more it costs to maintain, and the shorter the mean time to failure. So if you can figure out a way to take some tubes out, that's a huge win. And some genius, and I do believe he was a genius, figured out that if we use complement arithmetic instead of signed magnitude, we don't have to implement the subtract circuit. All we have to do is put a complement, which is almost free, in front of the add circuit and ignore the overflow. That was brilliant. That was a great idea at the time.
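The overflow policies listed above can be compared side by side. The following is only a sketch, not code from the talk: JavaScript has no `int` type, so 32-bit behavior is emulated here with the `| 0` coercion, and the function names (`addWrapping`, `addSaturating`, `addChecked`, `addOrNaN`) are made up for illustration.

```javascript
const INT32_MAX = 2147483647;
const INT32_MIN = -2147483648;

// Assumes a and b are already in the 32-bit range, so a + b is exact.
function overflows(sum) {
    return sum > INT32_MAX || sum < INT32_MIN;
}

// Wraparound: drop the most significant bits and tell nobody (what int does).
function addWrapping(a, b) {
    return (a + b) | 0;
}

// Saturation: store the largest (or smallest) value that does fit.
function addSaturating(a, b) {
    return Math.min(INT32_MAX, Math.max(INT32_MIN, a + b));
}

// Exception: refuse to complete the operation.
function addChecked(a, b) {
    const sum = a + b;
    if (overflows(sum)) {
        throw new RangeError("32-bit overflow");
    }
    return sum;
}

// Sentinel: store a value that means "there is no valid result here".
function addOrNaN(a, b) {
    const sum = a + b;
    return overflows(sum) ? NaN : sum;
}

console.log(addWrapping(INT32_MAX, 1));   // -2147483648
console.log(addSaturating(INT32_MAX, 1)); // 2147483647
console.log(addOrNaN(INT32_MAX, 1));      // NaN
```

With wraparound, a character index that passes 2147483647 silently becomes a large negative number, which is the kind of wildly wrong position the bug story describes.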
But since then, Moore's Law has gone crazy with the number of devices that are possible on a chip, and we have not reconsidered that decision that was made in the '50s. And I think we are way, way overdue on that. To make this even worse, memory used to be extremely scarce and expensive. A machine might only have a couple of K in it. The Atari 2600, if you remember, only had 128 bytes of RAM. That was all of the RAM in the entire machine. So you were really careful in allocating memory to numbers, right? And if you could figure out a way to get 2 numbers into 1 byte, you would do it because you didn't have very many of them. And our programming languages are still in that mindset. So for example, in Java you've got byte, char, short, int, long, float, and double just as the main built-in types. So every time you create a property or a parameter or a variable, you have to ask, hmmm, which one of those? And you've got to pick one, and if you get it right, then okay. But if you get it wrong, the program's going to fail. And it's not going to fail immediately, necessarily. In fact, it might fail sometime in the future. And tests cannot find that because your tests all assume a particular data model, and they are not sufficient for finding these sorts of errors. Tests fail on this stuff. And there's no value in having picked the right type. Because I used --- I saved 16 bits on this variable, yay. What's the cash value of the savings of that memory? The answer is 0. In fact, it's nothing compared to your time. The time you wasted trying to decide which type to use is infinitely more valuable than the value of the memory that you're saving. Now JavaScript, on the other hand, gets it right because it only has one number type. That means it's impossible to make an error by picking the wrong number type, and that's a huge win. I'm totally in favor of that. The only problem is it's the wrong type, and you know why it's the wrong type.
It's because 0.1 + 0.2 is not equal to 0.3. I've heard from some people who say, that doesn't matter. And they are absolutely wrong. That matters a lot. So how did this happen? How did we get to this place? Well again, we go back in time. So in the '40s, when the first Von Neumann machines start coming online, they are integer-only machines, but most of the programmers are mathematicians, and they're trying to figure out how to do real computation, and it's hard. They're trying to do stuff with scaled integers, and it's a lot of work, and it's error prone. And someone who is really smart figures out floating point, that we will have two numbers per number. One is the number itself and the other is a scale factor, which tells us how many positions to move the decimal point. And then we can just give it to a subroutine, and the subroutine will figure out how to add these things. And it worked, and it made programming much easier to do. Unfortunately, those libraries were really slow. So when we get to the '50s, there's now interest in putting floating point into hardware. But we're making stuff out of tubes, and it's hard to do. And some genius, and I do believe he was a genius, figured out that if we use binary floating point instead of decimal floating point, we don't have to implement a divide by 10 in order to do a scaling, we can just shift 1 bit, which is free. And so they did that. Now that worked great for scientific computing because in scientific computing, your low-order digits are probably wrong anyway. So the fact that you can't exactly represent the decimal digits is not all that important. But it doesn't work for business processing because they're adding up money and they need to be exact. They have to get the cents exact. So there's this division. There's scientific computing and there's business computing, and they use different hardware, they use different programming languages.
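The 0.1 + 0.2 claim, and the reason business code wants exact cents, can be checked directly in any JavaScript engine. This is a small illustration rather than anything shown in the talk; the variable names are arbitrary, and the integer-cents workaround is the scaled-integer technique mentioned above.

```javascript
// Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly.
const floatSum = 0.1 + 0.2;
console.log(floatSum === 0.3);  // false
console.log(floatSum);          // 0.30000000000000004

// Scaled integers: count cents instead of dollars, and the sum is exact.
const dimeInCents = 10;
const twentyCents = 20;
const centSum = dimeInCents + twentyCents;
console.log(centSum === 30);    // true
```

The scaled version is exact, but the programmer has to carry the scale factor around by hand, which is the error-prone work that floating point was invented to remove.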
So when you order your mainframe, you either get the floating-point package or you get the BCD package for doing your COBOL processing. And that's sort of the way the world worked until we kind of came into the modern age and COBOL dies off, and Java is the successor to COBOL. But Java comes from the FORTRAN school, not from the COBOL school, and it doesn't do a good job of dealing with the business types. But that's kind of the tragedy that we're in now. So, I propose to fix it, and this is my solution. I call it DEC64, and it's a 64-bit quantity containing 2 numbers, modeled very much after what they were doing in the '40s. So I've got one number, which is the coefficient, it's 56 bits long, and I've got another number which is the exponent, which is 8 bits long. And it's really nice. So you can --- and it works because of this 10. So it's really easy to see what something's going to do. It's just this formula, really, really simple. No complicated packing or unpacking. I put the exponent in the lower 8 bits because on the Intel architecture we can unpack that for free. So a software implementation can be very efficient. And this gets us 16 or 17 decimal places, very accurate. We can do exact business processing. It's not bad at doing scientific processing either. So my proposal is that this be the only data format in future application languages. I would just have one number type, and it'd be this one. So to prove it, I have a software implementation available in x64 assembly language. That's on GitHub. If you're curious about this, you can check it out. And it performs pretty well. So if you're adding integers in a software implementation, it can add two integers in five instructions, which is pretty good. It's maybe four more than you'd like, except that with those five instructions, you also get overflow protection and you get NaNs, which are nice things to have. I think essential things to have.
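To make the layout concrete, here is a rough sketch of the packing described above, written with JavaScript BigInt. It is not the DEC64 reference implementation (that one is in assembly language). It assumes the value is coefficient × 10^exponent, that both fields are signed two's complement, and that the exponent sits in the low 8 bits. The function names are invented for the example, and NaN handling, rounding, and overflow checks are deliberately left out.

```javascript
// Pack a signed 56-bit coefficient (BigInt) and a signed 8-bit exponent
// (Number) into one 64-bit pattern. No range checks: sketch only.
function pack(coefficient, exponent) {
    const low8 = BigInt.asUintN(8, BigInt(exponent));
    return BigInt.asUintN(64, (coefficient << 8n) | low8);
}

// Unpacking is a shift and a truncation.
function unpack(bits) {
    const signed = BigInt.asIntN(64, bits);
    return {
        coefficient: signed >> 8n,                  // arithmetic shift keeps the sign
        exponent: Number(BigInt.asIntN(8, signed))  // low 8 bits, read as signed
    };
}

// Add by rescaling both coefficients to the smaller exponent.
// A real implementation must also detect coefficient overflow.
function add(x, y) {
    const a = unpack(x);
    const b = unpack(y);
    const exponent = Math.min(a.exponent, b.exponent);
    const scale = (n) => n.coefficient * 10n ** BigInt(n.exponent - exponent);
    return pack(scale(a) + scale(b), exponent);
}

// Render coefficient * 10^exponent exactly, with no binary rounding.
function toDecimalString({coefficient, exponent}) {
    if (exponent >= 0) {
        return (coefficient * 10n ** BigInt(exponent)).toString();
    }
    const negative = coefficient < 0n;
    const magnitude = negative ? -coefficient : coefficient;
    const digits = magnitude.toString().padStart(1 - exponent, "0");
    const cut = digits.length + exponent;
    return (negative ? "-" : "") + digits.slice(0, cut) + "." + digits.slice(cut);
}

const tenth = pack(1n, -1);        // 1 * 10^-1
const twoTenths = pack(2n, -1);    // 2 * 10^-1
console.log(toDecimalString(unpack(add(tenth, twoTenths)))); // "0.3"
```

Because the scale factor is a power of 10 rather than a power of 2, 0.1 and 0.2 are represented exactly and their sum is exactly 0.3.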
In a hardware implementation, adding integers should happen in one cycle, which means we don't need to have ints as a separate type in order to get performance. We can get performance, and we can get the range of values that we need in one number type. So my goal is to convince everybody in the world that this is the one number type to use in the next system. Now, it turns out I don't actually have to convince everybody in the world, I only have to convince one person, and that's the man or the woman who designs the next language. If I can convince that person that they want a language with one number type that works well with humans, that does arithmetic the way we were taught to do arithmetic in school so that we won't get confused, then we'll win. JSON Now it turns out, I have a little bit of experience at convincing everybody in the world to do the right thing. For example, there's the JSON data interchange format, the world's best loved data interchange format. It took a while to convince everybody, but eventually everybody got onboard, the death threats stopped, and we're all happy about it now. I'm very happy to report that this took much less than a generation. This is one of those very rare things where we were able to move it in just a few years. So if you look at Google Trends and compare the trends of JSON versus XML, you'll see that the world has steadily been losing interest in XML. Looks like the years are chopped off here, but it's on the other screen. Since 2005, the world has steadily been losing interest in XML, and we've slowly, quietly been seeing interest in JSON creeping up. And this increase was not motivated by any industry. There is no JSON industry which drives this because all the tools are free and simple, and so it's just happening because you decided this is a better way of doing things, and so it's just taking off. So I can't predict when these lines cross over, but Google can.
So it looks like next year for sure, JSON becomes officially more interesting, according to Google, than XML. (Audience) But in terms of use, people are --- we're using JSON more across --- as a data interchange, there's more people using JSON, right? I have no way of knowing. It seems like that to me too, but I haven't --- (Audience) I think they're just Googling XML because it's still difficult, and we don't have to Google JSON so often because it's easy. I think there's a lot to that, yeah, so a lot of --- like in the search suggestions, XML sucks, that's a popular thing. You don't see that on JSON searches, so there may very well be something to that. So I did not intend for JSON to be the last data interchange format. JSON was intended to do a very specific thing of allowing applications written in different languages to be able to communicate with each other over the internet. But JSON is now being used for lots of other things. Things that it wasn't necessarily designed for. And so I expect there are going to be other data formats as well. So I have some advice to anybody else who might be designing another data format. Number one, please don't break JSON. JSON is working really well for the thing that it was originally designed for. One of the best things that I did in JSON was I did not put a version number on it, so there is no way to revise it. We can't say JSON 1.1 because there was no JSON 1, it's just JSON. So the way it is, is the way it will always be, which is great. Because if you look at the stack, I expect over the years, the stack is going to be changing all the time at every level, except in the JSON layer. The JSON layer will always work the way it does right now. And that's a pretty nice thing, having that level of reliability and integrity in anything in the net is huge. I can't think of any feature we could add to JSON which is more valuable than that.
So if you want to make something else, please make something else and make it better, but don't try to break JSON. Leave JSON alone. Whatever you're going to do, if you're going to do something that's kind of like JSON but different, make it significantly better. If it's just JSON with one or two little things, then what you're doing is you're adding a compatibility hazard to the world, that people are going to have to deal with this thing that's kind of like JSON but isn't. And so there's a cost to that. So make sure the cost is worth it. Add enough value to it to justify that cost. And oh, get a better name. Because no question, the worst thing about JSON is the name. For one thing, people are confused about how to pronounce it. I say JSON, a lot of people say JSON. It turns out the correct pronunciation is JSON. Everybody? (Audience) JSON. Yeah, so, and I hear from guys all the time named Jason saying, I'm in the office, and I hear someone and go, what? So I'm sorry, Jason, if you're here. I don't know what I was thinking, sorry. So I've heard from guys named Jason who threatened to make new standards called Douglas just to show me what it's like. But again, the worst thing about JSON is the name. It stood for JavaScript Object Notation, and almost everything about that is wrong. So I called it JavaScript because I wanted to give credit to JavaScript because that is where the idea came from. I didn't invent JSON, I discovered it already in this language, and I was trying to give respect to it. I was not trying to jump on JavaScript's coattails. Because I'll tell you, in 2001 JavaScript was not a much-loved language, and saying I'm one of the ---, I'm JavaScript. That's not a way to get credibility from anybody. I was just trying to be honest. I was just trying to recognize where it came from, but the name is problematic. For one thing, people still don't know what Java is versus JavaScript, and that stuff is still going on.
And also there's a confusion that, so JSON only works with JavaScript? No, that's not true. That was never true. There's also a confusion that, I guess the JavaScript language standard is what defines JSON, and that was never true. But there were all these confusions that happened because I chose the wrong name. (Audience) What should you have called it? I don't know. That's too ---. (Audience) Have you at least thought about it? That boat has sailed. We're done. Then there's object. Object here represents the way that JavaScript thinks about objects, right, just a loose collection of key-value pairs, that's an object, which is not the way most languages think about objects, right? They have this more brittle thing. They've got classes and instances and so on. So they get confused when we send them a JSON object and they go, that's not an object, and we go, um yeah. Notation, maybe that's okay. So notation's not bad. So if you're going to make up a new data standard, and I think you should, don't call it JSON. Don't stick a letter in front of it or a letter on the end of it or anything like that. Come up with some creativity, do something good there. Final Thoughts Let's talk about responsibility. Who are you responsible to when you're programming? Well this is my answer. This is who I think I am responsible to when I'm writing code. My number one responsibility is to the people, to the poor people that have to use my software. I want it to work really well for them. I don't want to frustrate them, I don't want to confuse them, I don't want to hurt them, I don't want to lead them to ruin, I want to give them the best experience possible. I want to make their lives better to the extent that I can. We imagine, in the software industry, that we're making the world a better place despite all the obvious evidence to the contrary, we really want to believe that, but we should at least try to make it true for the people who are using our stuff as they are using it.
We should at least do that. I've seen teams where they have contempt for their users. They hate their users, their users are always complaining, they don't understand how to use our stuff, they don't appreciate our genius, they're stupid, they're awful people, hate them, which I think is totally toxic. It's inexcusable. We should never be thinking like that. Everything that we do should be in service to humanity. Number two, the team. I want to be writing the best code that I can, the clearest, best, well-documented, easy-to-understand, good code because other people who I'm going to be working with are going to have to take that code and make it better. And I don't want to be setting them up for failure. I don't want to be frustrating them. I don't want to be making their lives worse. I want to make their lives better. I want good code for everybody, and I expect them to do the same. So programming can be a solo activity, but generally we're trying to do stuff which is so complicated and we're trying to do it so quickly, we're working in teams almost all the time, and sometimes big teams. And in large literary efforts, there's this idea of writing with a single voice, like if you're writing an encyclopedia or if you're a couple of authors writing a book or if you're writing a magazine or some other thing where you've got lots of writers working on the same project, they're all trying to write in the same voice, and that doesn't interfere at all with being able to express good ideas and presenting it clearly. And I think we need to be doing that in writing software too, so I want us to be writing all in the same voice. And that voice is the smartest, most professional programmer who has ever lived. I want everybody on the team to be writing up to that level. I don't want anybody dumbing it down because they think they need to express themselves by leaving out their semicolons. I want everybody to be writing really well. Then number three, management. 
If they ask for it and it makes sense, okay, we'll do that too. Now some managers might be saying, whoa, whoa, whoa, wait a minute, we're supposed to be in charge here, shouldn't we be number one? And I don't think so, I think number three is the right place for them. So if you look at what a CEO does, one of their most important responsibilities is to help the company look out. There's a tendency in a company, especially as they get bigger, that everything is focused inside. There's competition for resources and attention and promotion and all of that stuff, and you tend to forget about the customer and you tend to forget about the marketplace, and all these things that are actually much more important than what's going on inside. And part of the CEO's role is to remind the company of that. And you see CEOs also saying things like, be the customer and be customer-focused and all that stuff. That's what that's all about. And that's very compatible with what I'm saying. I'm saying number one, the people. Then, one of the biggest problems that companies have is that their codebases are awful. That they're full of cruft, they're what they call legacy, it's just awful stuff, and that seriously impairs the company's ability to compete. That you can't make changes, you can't extend, you can't grow because the software is holding you back because of its poor quality. So software becomes a liability rather than an asset. And I think we can change that if we're writing high-quality software for the team. We allow the company then to be much more competitive, much more responsive, much better able to serve the market. So suppose you're writing your own software. You're at home, and you're doing your own stuff. What are your responsibilities then? Number one's still the people. I mean, assuming that someone else is ever going to have to use your stuff, you really should write it well. Write it for them, whoever they might be.
Even if you don't know who they're going to be, you should still have somebody in mind when you're writing this stuff. And then number two, the team. If you have any ambition of this stuff getting good enough that you might want to open source it, that means your team is now potentially every programmer in the world. So make it good for them. And then number three, you're management, so knock yourself out. So that brings us to the end. I can't think of anything else to tell you. So thank you all very much. But before we go, I've got one piece of advice. This is really serious and I hope you don't forget this. I'm really serious now. Please don't make bugs. Thank you. (Clapping) (Audience) We've got a couple random questions from Chad, and I'm not sure if you want to take any more questions but --- Are they good? (Audience) Well, it's just kind of opinions, opinions on Swift. I don't have an opinion on Swift. (Audience) TypeScript. I'm sorry? (Audience) How about your opinion on TypeScript? On TypeScript? I see TypeScript as a trap. It keeps you in the classical model. You'll never figure out functions if you're locked into TypeScript. Audience Q&A (Audience) There's a question about being able to freeze data objects. You can. I don't recommend it, unless you have something that really wants to be immutable. Remember that once you freeze something, you can't unfreeze it, so you're only going one way. (Audience) And then, let's see, dates, dates in JSON. Nope, JSON's done. (Audience) Yeah, well no, okay, so you just don't have a recommended format? JavaScript didn't have one, so JSON didn't get one. (Audience) Do you use timestamp? I recommend using ISO strings. That seems to be the smartest thing. I know there are a lot of people who don't like doing the smartest things, so they're free to do whatever they want. (Audience) What's your take on the JSON (Inaudible) that's been going around recently? For Java?
(Audience) No, it's a (Inaudible) embraced it as being kind of the way to format JSON additives for (Inaudible), it's basically a collection of how you organize the data inside of JSON. I'm not aware of it. Anybody's free to do anything they want. (Audience) Except make bugs. No unfortunately, they're free to do that too. (Audience) Why don't you have an opinion on Swift? Have you just not looked at it enough? I've not looked at it enough. I went to get the Swift report, and it was hidden behind some Apple login thing. I said * with that, so I didn't get it, so I'm not up to date on it. (Audience) Is there a reason you don't use CamelCase and snake case? So there's this big debate. What is the correct way to do multiword identifiers in any language? Should you use CamelCase or should you use underbars? And this is another one of those things that programmers can't agree on, that there's an endless argument. And the reason there is an endless argument, why it will never end, is because they are both wrong. That the correct thing should have been to use space. That our languages only have to get a tiny bit smarter, and we could have spaces as valid characters inside of names. And then they would read better, and they'd be easier to process and all of that stuff. So, one of the aspects I'm looking for in the next language besides getting numbers right is that it'll also allow space as a component of names. (Audience) Thoughts on Angular or Angular 2? Thoughts on Angular or Angular 2? That's two completely different questions. (Audience) Either or. I've looked at Angular, and I was kind of surprised at how hard it is to make something easy. I don't get it. I hear from people who swear by it, and I think that's great, and maybe it solves the problem they have. I've not observed that it solves a problem that I have. (Audience) It's the Stockholm thing, I think, with Angular. Well the problem with all these frameworks is that once you get committed, you are stuck. 
You can't go, ooh, that was a mistake and back off. You're fully committed, and so you can't find out if any of these things are good or not until you commit, and so you have to be careful. (Audience) So I went from GitHub to, and you clearly have no love of CSS. There's nothing about CSS on there. Why? What? (Audience) What's your issue with CSS? Oh, you're going to bring up that old wound again. (Audience) Sorry, I, yeah. So CSS was designed for formatting technical documents. That was its purpose in life, and that is not what we use it for. We use it for all kinds of stuff that it is very badly suited for. And we use it because it's the only option, and it's better than it was before it came along because in the past, there were no options. You used to see people doing typographic spacing by having tiny little GIFs with a little bit of whitespace in them, which was considered a best practice in the day, but was always a terrible thing to do. So compared to that, CSS is great. But compared to what we deserve, it's awful. (Audience) You can say the same for HTML, right? Yeah, that too. (Audience) Any opinions on the Rust language? No I don't. I have read some of their material. It wasn't locked behind an Apple password, so I have read some of that. And some of it looks promising, but I don't have any conclusions yet. I've not worked with it. (Audience) What about transpilers? I mean, do you have any recommendations? So let me say this about CoffeeScript, which started this craze. I like CoffeeScript a lot. CoffeeScript makes some of the good parts of JavaScript more visible. For some people, it's easier to learn how to use JavaScript well if they learn CoffeeScript also. I think the thing I like best about CoffeeScript is that it doesn't look like C or Java or any of those other languages. It's got its own look. It used to be said that no new programming language could succeed unless it looked like C, and CoffeeScript is showing that that's no longer true.
CoffeeScript succeeds because it doesn't look like C, so I like that a lot. Then the question is, should you use CoffeeScript in production? Absolutely not. No, you should not use CoffeeScript in production. It's still an experimental language. It adds a lot of weirdness. It makes it harder to recruit people because who's going to maintain the CoffeeScript stuff that some hipster left? It's just not worth it. So I always recommend, learn the language. Pick up a new language. And some of the transpiled languages are interesting and are worth a study, but I don't recommend using any of them in production. I think they don't add enough value to justify the weirdness. (Audience) Lots of thank yous and please extend our gratitude for the training. (Audience) Did you find any more? (Audience) Yeah, there's somebody asking, is there a reason that you disabled bug reports on your GitHub projects? I didn't do that. I have complained to GitHub that there is no --- I'm trying to think of the right way to say this. I tend to attract trolls. I don't know why this is, but there are a lot of people who really want to come onto where I am and do troll-ish behavior, which is more than annoying, it's a waste of time. I have to deal with the trolls, and I don't like having to do that. So I've asked GitHub for a no troll option, and so far they have not figured out how to implement that. So there are some things that I've turned off, but it's not completely off. You can still get messages to me. But if you have trolled me, I put you on the block list, so that may be his concern. He may have trolled me in the past and he can't troll me now. So, I'm not going to apologize for that. (Audience) Did you have any opinions on build systems? (Audience) You mentioned practices of concatenating, minifying, and all that stuff. They're saying, things like webpack or whatever. No, I don't care. Whatever works. (Audience) Yeah, that's what I figured. Not seeing any other questions.
Anybody in the room? This is your last chance to get an opinion. All right. K, I think we're adjourned. (Audience) We'll conclude it, all right. Thank you. Yep, thank you.