SOLID Principles of Object Oriented Design
-
Definition
The Single Responsibility Principle is defined on Wikipedia to state that every object should have a single responsibility, and that responsibility should be entirely encapsulated by the class. Robert C. "Uncle Bob" Martin, who is a very well known author and speaker on these topics, sums this up by saying there should never be more than one reason for a class to change. This motivational poster does a great job of describing what happens when an object has too many responsibilities, and how that affects the usefulness of the object. Take the pocket knife: this particular pocket knife has been extended to the point where it can do nearly anything, except maybe fit in your pocket. As the poster demonstrates, just because you can doesn't mean you should, and that applies to the design of our classes as well. This relates to the concepts of cohesion and coupling. We want to strive for high cohesion, but also for loose coupling. Cohesion is basically how strongly related and focused the various responsibilities of a module or class are. Coupling is the degree to which each program module or class relies on other modules. Again, we want to strive for low coupling, but high cohesion.
-
Responsibilities
Responsibilities are defined as axes of change. Requirements changes typically map to responsibilities, so the more responsibilities a class has, the more likely it is to change. Having multiple responsibilities within a class couples those responsibilities together, making it likely that changes to one responsibility will affect or break features belonging to the class's other responsibilities. The more classes a change affects, the more likely the change will introduce errors into our system. Thus it's important to craft our classes in such a way that the areas most likely to change are encapsulated into separate classes with single responsibilities.
-
Demo: The Problem
Let's move on to a demo that shows the problem with too many responsibilities. First let's look at a simple example, from one of Uncle Bob's articles on the Single Responsibility Principle, that demonstrates the issue. In this UML diagram you can see we have a Rectangle class, and this Rectangle class has two operations: one called Area, which returns the area of a given rectangle, and another called Render, which actually draws a rectangle onto a graphical user interface. Now the Render operation depends upon a GUI library, and you can see from the diagram that it's depending on that library for the GUI operations. But the rectangle itself is being used by two different types of applications. The first is a simple geometry service that encapsulates various types of operations and has no front end, no graphical user interface, so it's only calling the Area method on this rectangle. The other is a graphical application that might make use of the Area method, but is primarily going to be drawing rectangles onto the graphical user interface, which it too depends upon. You can see from these responsibilities that Rectangle is now being used by the geometry service, and the geometry service will therefore have a dependency on the graphical user interface. If the graphical user interface changes, that will require Rectangle to be recompiled and ultimately the geometry service to be recompiled, despite the fact that it has no knowledge of, or any concern about, graphical user interfaces or rendering. An improvement to this design would be to split the rectangle into two classes: one that is only concerned with geometry, calculating areas and such things, and another that is concerned with the actual drawing or rendering of rectangles on a screen. Looking at a more business-oriented example, we have the concept of an order. An order in this case is what happens when a cart, whether a user's cart on a website or actual physical items at a point of sale store, is checked out: the order is placed and the purchase is made. The order in this case has a number of operations that it supports. It supports checkout; it can also charge credit cards, it can notify the customer, and it can reserve inventory, so that online orders can verify that inventory is available before the card is finally charged and the order is processed. Let's look at the code for this order implementation. You can see that most of the work is done in the Checkout method, in which we check whether the payment method is a credit card, and if so we charge the card using the payment details that were provided. Once we know that this has gone through, we reserve the inventory from our inventory system, and then finally, if the customer has requested to be notified, we notify the customer that their order was processed. The rest of this class simply demonstrates some sample code that would perform these responsibilities. In this case the customer notification depends on SMTP email, the inventory is using an inventory service that is being newed up within the method, and of course the card charging is using a payment gateway and passing in some information to actually charge the card.
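To make the problem concrete, here is a minimal, standalone sketch of what an order class like the one just described might look like. This is not the actual demo source; the type names (Cart, OrderItem, PaymentDetails, InventorySystem, PaymentGateway) are assumptions made purely for illustration.

```csharp
// Illustrative sketch only; type names are assumptions, not the actual demo source.
using System.Collections.Generic;

public enum PaymentMethod { Cash, CreditCard }

public class OrderItem { public string Sku { get; set; } public int Quantity { get; set; } }

public class Cart
{
    public List<OrderItem> Items { get; } = new List<OrderItem>();
    public string CustomerEmail { get; set; }
    public decimal TotalAmount { get; set; }
}

public class PaymentDetails
{
    public PaymentMethod PaymentMethod { get; set; }
    public string CreditCardNumber { get; set; }
    public string ExpiresMonthYear { get; set; }
}

public class PaymentGateway { public void Charge(string card, string expires, decimal amount) { /* call external service */ } }
public class InventorySystem { public void Reserve(string sku, int quantity) { /* call inventory system */ } }

// One class, three unrelated reasons to change.
public class Order
{
    private readonly Cart _cart;
    public Order(Cart cart) { _cart = cart; }

    public void Checkout(PaymentDetails paymentDetails, bool notifyCustomer)
    {
        if (paymentDetails.PaymentMethod == PaymentMethod.CreditCard)
            ChargeCard(paymentDetails);              // responsibility 1: payment processing

        ReserveInventory(_cart);                     // responsibility 2: inventory management

        if (notifyCustomer)
            NotifyCustomer(_cart.CustomerEmail);     // responsibility 3: customer notification
    }

    private void ChargeCard(PaymentDetails details)
    {
        var gateway = new PaymentGateway();          // concrete gateway newed up inline
        gateway.Charge(details.CreditCardNumber, details.ExpiresMonthYear, _cart.TotalAmount);
    }

    private void ReserveInventory(Cart cart)
    {
        var inventory = new InventorySystem();       // concrete inventory service newed up inline
        foreach (var item in cart.Items)
            inventory.Reserve(item.Sku, item.Quantity);
    }

    private void NotifyCustomer(string email)
    {
        // SMTP dependency baked directly into the order class.
        using (var smtp = new System.Net.Mail.SmtpClient("localhost"))
            smtp.Send("orders@example.com", email, "Order placed", "Thanks for your order!");
    }
}
```

Any change to the gateway, the inventory system, or the notification mechanism forces a change to this one class.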
-
Problem Analysis
So what's the problem with this design? Well, there are a number of problems. First of all, you can imagine that a point of sale application, in which a customer is checking out items right there at the store, perhaps paying with cash, is not going to need to know anything about credit card processing. Point of sale transactions also don't need to be concerned with inventory reservations; there's not going to be a back-ordered item if the customer has the item in their hand when they walk up to the checkout register, and we can assume that the store inventory is being updated separately in our system. Likewise, point of sale transactions don't need email notifications, because the customer doesn't provide an email address, and of course a customer in a store knows immediately whether or not their order was successful. Unfortunately, with this design any change to the notification logic, the credit card processing, or the inventory management will affect Order, and thus will also affect both the web and point of sale implementations of Order.
-
Refactoring to a Better Design
So how can we refactor this to a better design? The first step in refactoring our application is to identify the responsibilities that are likely to change. In this case we can see that the three things our Checkout method is doing are processing payment, reserving inventory, and sending notifications. We can break these out into separate interfaces, as seen here at the bottom, where we have a payment processor with an operation for processing credit cards, a reservation service that can reserve inventory, and a notification service that can notify a customer that their order has been created. Then we can take the Order class and break it up into smaller classes that have fewer responsibilities and only use the interfaces that are appropriate for them. For example, we might identify an online order, which is created on the website, and the website only knows about online orders, as you can see from this new UML diagram with the dependencies shown. The online order makes use of all three of these interfaces, because it has to do credit card processing, reservations, and notifications. However, the retail point of sale can work with different order objects. In this case there could be a point of sale credit order, which would still know how to process payments using the shared interface, or a point of sale cash order, which would not have any dependencies on these interfaces, because none of them are concerns of a cash transaction at the point of sale. If we look at the code to see how this is done, we'll see that we start now with an abstract class Order that takes in a cart, since most of the operations we're working with involve the cart, and exposes a virtual Checkout method that, for the sake of argument, we'll say logs the order to the database as part of its processing. Then the online order makes use of these interfaces and news up a payment processor, reservation service, and notification service, and you can see that its Checkout method looks very similar to the one we started with, in which we process the credit card, reserve our inventory, and then send a notification to the customer. A separate class for the point of sale credit order will have only the dependencies and responsibilities that concern it, in this case the payment processor, so its Checkout method will merely do payment processing, and of course call the base Checkout. And we see that the point of sale cash order is very simple at this point; it merely needs to inherit from Order and allow the base class to take over the responsibility of performing the checkout. Now what this means is that if we decide we need to change who our payment processor is, that responsibility no longer lives within any of these order classes; it's been completely abstracted away into a payment processor class. Likewise, if we decide to change how we notify our customers, that logic is in a separate class as well, and it only affects the online order implementation. The point of sale system will see no impact from changes to how we notify customers.
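Here is a hedged, standalone sketch of the refactored design just described. The interface and class names follow the narration, but the exact signatures are assumptions; it reuses the Cart type from the earlier sketch and uses constructor injection where the demo simply news up concrete services inside the online order.

```csharp
// Illustrative sketch; names follow the narration, signatures are assumptions.
using System;

public interface IPaymentProcessor    { void ProcessCreditCard(Cart cart); }
public interface IReservationService  { void ReserveInventory(Cart cart); }
public interface INotificationService { void NotifyCustomerOrderCreated(Cart cart); }

public abstract class Order
{
    protected Cart Cart { get; }
    protected Order(Cart cart) { Cart = cart; }

    public virtual void Checkout()
    {
        // For the sake of argument, the base behavior just logs the order.
        Console.WriteLine("Order logged to the database.");
    }
}

public class OnlineOrder : Order
{
    private readonly IPaymentProcessor _paymentProcessor;
    private readonly IReservationService _reservationService;
    private readonly INotificationService _notificationService;

    public OnlineOrder(Cart cart, IPaymentProcessor paymentProcessor,
        IReservationService reservationService, INotificationService notificationService)
        : base(cart)
    {
        _paymentProcessor = paymentProcessor;
        _reservationService = reservationService;
        _notificationService = notificationService;
    }

    public override void Checkout()
    {
        _paymentProcessor.ProcessCreditCard(Cart);
        _reservationService.ReserveInventory(Cart);
        _notificationService.NotifyCustomerOrderCreated(Cart);
        base.Checkout();
    }
}

public class PointOfSaleCreditOrder : Order
{
    private readonly IPaymentProcessor _paymentProcessor;

    public PointOfSaleCreditOrder(Cart cart, IPaymentProcessor paymentProcessor) : base(cart)
    {
        _paymentProcessor = paymentProcessor;
    }

    public override void Checkout()
    {
        _paymentProcessor.ProcessCreditCard(Cart);
        base.Checkout();
    }
}

public class PointOfSaleCashOrder : Order
{
    public PointOfSaleCashOrder(Cart cart) : base(cart) { }
    // No override needed; the base Checkout is all a cash sale requires.
}
```

Each order type now depends only on the collaborators it actually needs, and a change to payment, reservation, or notification logic lives behind its own interface.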
-
Summary
So what are responsibilities? Responsibilities are, stated simply, a reason to change. You can think of them as differences in usage scenarios from the client's perspective, where the client is the code that's making use of your code. Multiple small interfaces, which follow the Interface Segregation Principle, can help you achieve the Single Responsibility Principle, and you can see from our refactoring that we ended up with three small interfaces that each had one responsibility. It wasn't any concern of our order implementations how these were being achieved, and it made it so our order class had the sole responsibility of stitching together these other collaborators and letting each of them do the job they're responsible for. So to summarize, following the Single Responsibility Principle can lead you to lower coupling and higher cohesion. Many small classes with distinct responsibilities tend to result in a more flexible design. You'll find that if you follow this principle, a lot of the time you'll have very few if statements, switch statements, or other branching logic in your code, because you've isolated the things that would act differently into separate classes that know how to behave in their specific environment and their context. They have only one responsibility, they do it, and there's not a lot of checking to see whether they should behave one way or another. Some of the related fundamentals include the Open/Closed Principle and the Interface Segregation Principle, both of which are part of the larger SOLID acronym of best practices for object oriented design. It's also related to the concept of separation of concerns. I recommend reading Clean Code by Robert C. Martin, which you can get at the URL listed below. And I need to provide some credits for the images that were used, for the motivational poster as well as the Single Responsibility Principle article from which the rectangle example came. Thank you very much; this has been a Pluralsight On-Demand video on the Single Responsibility Principle. Come back soon to find out more about best practices for writing object oriented software.
-
The Open / Closed Principle
Introduction
Hi, this is Steve Smith. Welcome to this Pluralsight video on one of the software fundamentals, the Open/Closed Principle, which represents the "O" in SOLID. The outline for this module includes a definition of the Open/Closed Principle; we'll look at the problem that the principle addresses and solves, and we'll see an example of what happens when one does not follow the Open/Closed Principle. Then we'll examine what's wrong with that example, and finally we'll refactor it in order to apply the Open/Closed Principle. We'll wrap up with a quick summary and a look at some related fundamentals.
-
Definition and Overview
The Open/Closed Principle states that software entities such as classes, modules, and functions should be open for extension, but closed for modification. This particular wording comes from Wikipedia. This slide demonstrates how the Open/Closed Principle can apply in a graphic notation; as it says below, open chest surgery is not needed when putting on a coat. Likewise, when you extend your software you should not need to go and dig around in its internals just to change its behavior. You should be able to extend it by adding to it, with new functionality, new classes, and new functions, without the need to change your existing classes and functions, while still achieving new behavior. So the Open/Closed Principle offers up a sort of conundrum: it's stating that things should be open to extension, meaning new behavior and changes to behavior can be added in the future, but closed to modification. What does that mean? It means you shouldn't need to change the source code or the binary code, and you shouldn't necessarily have to recompile the existing pieces of your application in order to achieve this. Dr. Bertrand Meyer originated the term Open/Closed Principle in his 1988 book Object Oriented Software Construction. So how do we do this? How do we change behavior without changing code? The key is to rely on abstractions. Once we start to rely on abstractions in our code, there is no limit to the number of different ways we can implement each abstraction, and thus no limit to the number of ways we can change the behavior of the code that uses those abstractions. So what do we mean by abstractions? Well, in .NET abstractions include interfaces as well as abstract base classes. In procedural code we can also achieve some level of the Open/Closed Principle using parameters. Let's look at an example of some code that has been written in such a way that it is not closed to modification, nor is it open to extension without having to hack at the actual source code.
-
Demo: The Problem
Let's look at this example, where we have a commerce solution that includes a cart, for instance a shopping cart. The shopping cart has a list of items; each item is of type OrderItem. If we look at OrderItem, we'll see that it simply represents an item that one might place in the cart, with a quantity and a stock keeping unit, or SKU, associated with it. The cart has this important method called TotalAmount that is used to determine what the total price of all items in the cart should be, and because this is a pricing calculator example, you can imagine that there are different types of rules that determine how the various order items within that cart are totaled up to reach the final total amount that the cart represents. To make this easy for a demo, we've simply stated that the SKU is what determines the rule. So if the SKU starts with the word EACH, then that item is going to be exactly five dollars times the quantity; the rule is simply the order item quantity times five. However, if it is instead a weight item, then we're going to treat the price as a unit price per kilogram, so we'll take the order item quantity, which we've decided for some reason is in grams, and we're going to say that it's four dollars per kilogram. Finally we have this concept of SPECIAL as a third rule for the SKU: special items only cost 40 cents apiece, but they are three for a dollar. So what that means is if you buy one it's 40 cents, if you buy two it's 80 cents, and if you buy three it's a dollar. The way this has been implemented is to simply provide a discount of 20 cents for each set of three that comes along. Now you can imagine that these rules have been come up with by the business user, the owner, whoever it is that is trying to maximize the sales of this particular store, and we know at this point that there are more rules coming. So this is a valid time for us to be looking at this and thinking, how can we change this code in such a way that we don't have to go in and edit this particular method every time someone comes up with a different way to price the items that are in the cart? Now it's important, before we make any changes to this sort of code, that we look at some of the tests that apply to it, and of course before we do any refactoring, ideally we would have tests that represent the current state of the code. So we have over here our unit test class that we can run, which will verify that this is doing what we expect. Let's go ahead and run the tests, and we'll see that they all pass. This is telling us that the cart total should return 80 cents with two special items, two dollars with a half-kilo weight item, two dollars with six special items, five dollars with one EACH item, and zero when empty, and all of these pass. If you look through each one of these tests, you'll see that they are very straightforward: we have a cart that's set up at the class level and is newed up before each test in our setup method, and our first test simply says that the cart total is zero when the cart is empty. So when it first starts, we haven't added anything to it, and we assert that it has a value of zero. If we add an "EACH_WIDGET" with a quantity of one, we expect it to cost a total of five. If we add some "Peanuts" that are priced by weight, with 500 grams, we expect four dollars per kilo times half a kilo, so we end up with two dollars. We expect 80 cents if we buy two of the special candy bars, and that's what we get here.
With a total of six of them, that's three for a dollar, and two sets of three-for-a-dollar makes two dollars, so we expect to get two dollars in that case. Now if we think about how we would add additional rules to our cart for additional items, where for instance maybe it's buy four get one free or something like that, we have to add yet another else-if to our block in order to achieve that kind of rule, and very quickly this type of thing gets out of hand. So we'd really like to apply the Open/Closed Principle at this point, to make our cart much more flexible and maintainable, so it does not need to have surgery performed on it every time someone comes up with a new pricing rule within our model.
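Here is a minimal, standalone sketch of what the cart's TotalAmount method described above might look like. The SKU prefixes and the exact structure are assumptions based on the narration, not the actual demo code.

```csharp
// Illustrative sketch of the "before" cart; names and SKU prefixes are assumptions.
using System.Collections.Generic;

public class OrderItem
{
    public string Sku { get; set; }
    public int Quantity { get; set; }
}

public class Cart
{
    private readonly List<OrderItem> _items = new List<OrderItem>();
    public void Add(OrderItem item) => _items.Add(item);

    public decimal TotalAmount()
    {
        decimal total = 0m;
        foreach (var item in _items)
        {
            if (item.Sku.StartsWith("EACH"))
            {
                // Five dollars per unit.
                total += item.Quantity * 5m;
            }
            else if (item.Sku.StartsWith("WEIGHT"))
            {
                // Four dollars per kilogram; Quantity is in grams.
                total += item.Quantity * 4m / 1000m;
            }
            else if (item.Sku.StartsWith("SPECIAL"))
            {
                // 40 cents apiece, three for a dollar:
                // a 20-cent discount for each complete set of three.
                total += item.Quantity * 0.4m;
                total -= (item.Quantity / 3) * 0.2m;
            }
            // Every new pricing rule forces another else-if branch here.
        }
        return total;
    }
}
```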
-
Problem Analysis
So the problem with our cart is that adding new rules requires changes to the calculation method for each new rule. Every change we make can introduce additional bugs and requires retesting, plus we've tightly coupled our pricing logic with our cart, so that any time we want to test some new pricing logic, we have to set up a cart and actually test the cart, rather than just testing a pricing rule. Ideally we want to avoid introducing changes that cascade through many different modules in our application, and in this case, if there were many different things that depended on how the cart's behavior worked, then each change to the pricing within that cart could potentially cause that sort of cascade throughout our application. In your own applications you're going to want to look for things that tend to be tightly coupled to many different parts of your app, and ideally you want to remove those dependencies by introducing abstractions that separate them from different areas of the application. This is something that's called introducing seams within your app, where you can separate different parts of your application from one another. Ideally, when we are adding to our software, if it's complying with the Open/Closed Principle, the way we'll be introducing new behavior is by writing new classes. You'll find that if you follow the Open/Closed Principle, oftentimes when you're adding behavior or changing the way your software works, you'll be doing so by adding multiple new classes, and as a consequence you'll typically have smaller classes, each one doing a very tightly focused job and following the Single Responsibility Principle as well. The benefit of adding new classes, in terms of how easy that code is to write, design, and test, is that by definition nothing depends on these new classes yet. They didn't exist previously in your application, so none of the code that is already there knows about them or has a dependency on them that you'll be breaking, perhaps inadvertently, when you change that code. New classes also don't have any of the baggage that your existing app may already have, in terms of dependencies on infrastructure or other hard-to-test classes. So you'll be able to start fresh with this new class and make something that is very straightforward, simple, easy to design, and hopefully easy for you to test. Now there are typically three approaches to achieving the Open/Closed Principle that I have found in my experience. The first one is really more of something you would do in a procedural programming language, and that is to use parameters. By exposing parameters on your application, class, or function, you allow the client to control behavior via the specifics of that parameter. This typically involves passing some kind of state to a function or a class, for instance a string that has some information in it, and based on that information your class will behave a certain way. But realize too that you can combine this approach with delegates and lambda expressions, so it can be very powerful at changing how the class or function behaves and what it actually does.
An example of how you can use parameters to achieve the Open/Closed Principle that I like to tell goes like this: imagine that you have a digital camera and you've written a C# program that pulls all the files off of your digital camera and drops them into a folder on your C drive. You've decided, or I've decided since my name is Steve, that I'm going to put all these files into a folder called Steve's Pictures, and I decide to just hard code that into my C# program. Now this program works so well that I decide to post it up on my blog, and a whole lot of people start to use it, but I soon start getting complaints that the pictures all end up in a folder called Steve's Pictures, and not in some folder that they might prefer them to be in, especially those people out there who aren't named Steve. The easiest way for me to change my software and allow the pictures to go to whatever folder the user wants is to introduce a command line parameter where they can specify the output folder for the program. If I don't do that, then the only people who will be able to use my software the way they would like are developers, who are going to have to open up that code, change the string where I set the folder name, rebuild the application, and then run it so that it goes to whatever folder they've set. Now if they are smart they'll introduce a parameter, but if they don't, and there is a guy named Bob who wants to have his pictures in Bob's Pictures, he could in fact just go in and change that one line in my C# file, rebuild it, and now he's got a perfectly usable program that always drops the pictures into Bob's Pictures, and he's happy. However, the fact that he had to change the source code means that this was not, in fact, following the Open/Closed Principle. Now the second way we typically approach the Open/Closed Principle in an object oriented programming language is to use inheritance, and specifically a pattern called the template method pattern. We'll talk more about the template method pattern in other videos, but suffice to say that this is a pattern by which you define the default behavior in a base class, often as a series of steps that need to be followed in a particular order, and then within your child types you override what each of those steps does. The third approach, which is the one we are going to look at now, involves composition in addition to inheritance, and it uses a design pattern called the strategy pattern. With the strategy pattern, the client code, the code that's calling the behavior, depends on an abstraction. This provides a sort of plug-in model, where the actual work being done is defined in a class that gets injected into the class doing the work. So with this particular approach the implementation utilizes inheritance, because it inherits from a base class or an interface, and the client utilizes composition, because rather than using inheritance itself to achieve this new behavior, it exposes a way for other classes to pass in an implementation, which it then sets on a field in its class. Now let's look at how we can refactor our pricing calculator in that shopping cart so that it has a better design and conforms to the Open/Closed Principle a little bit better.
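Since the demo that follows uses the strategy pattern, here is a brief, hypothetical sketch of the template method approach mentioned above. The class and method names are invented purely for illustration and are not part of the demo project.

```csharp
// Hedged sketch of the template method pattern; names are illustrative only.
using System;

public abstract class OrderProcessorTemplate
{
    // The template method fixes the order of the steps...
    public void Process()
    {
        ValidateOrder();
        CalculateTotal();
        Complete();
    }

    // ...while derived types override what the individual steps do.
    protected virtual void ValidateOrder() => Console.WriteLine("Default validation.");
    protected abstract void CalculateTotal();
    protected virtual void Complete() => Console.WriteLine("Order logged.");
}

public class DiscountedOrderProcessor : OrderProcessorTemplate
{
    protected override void CalculateTotal() =>
        Console.WriteLine("Total calculated with a bulk discount applied.");
}
```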
-
Refactoring to a Better Design
Let's have a look at how we can refactor our shopping cart so that it takes advantage of the Open/Closed Principle. The first thing we need to do is recognize that there is some other job this cart is doing, and that job is pricing calculation. So we need to define an interface that describes what that pricing calculation is, what it needs to do and how it should behave. What I've done is create an IPricingCalculator interface that is simply able to calculate the price for a given order item. With this in place we can go back to our cart, look at its TotalAmount, and see that as we loop through each order item and sum up the total, all we need to do is call CalculatePrice on our pricingCalculator for each item, so that we are summing up the results of these rules. This replaces the logic we had before, where each if basically represented a different rule: we had an EACH rule, a WEIGHT rule, and a SPECIAL rule, and of course there are more rules coming. So the pricing calculator has a very simple interface, which we implement in PricingCalculator.cs, and this is where we new up a list of pricing rules and decide how they get evaluated. When we call CalculatePrice, all it's going to do is run through each rule and see whether or not it matches; this lambda expression says, find me the first pricing rule where there is a match with the current item, and call its CalculatePrice method. So what is IsMatch, and what are these pricing rules? Well, here again we had to introduce an interface, and in this case that interface is IPriceRule. There were two pieces to each one of those rules in our original cart. The first was, how do we know whether or not this particular rule applies? That was determined by the logic within the if statement. The second piece was the actual use of the amount in the equation that determined the unit price or the total price for a given quantity of units, given that rule. So if we look at the price rule, we've abstracted out those two things into two separate members: one is a Boolean that determines whether or not this price rule matches a particular order item, and the other calculates the price. Now, you could also achieve this same sort of behavior by just using CalculatePrice, if you had some magic behavior that said that if the rule doesn't apply we're going to return 0, or negative 1, or something from CalculatePrice. Personally I prefer to be more explicit about this and make it very clear whether a rule applies or does not apply to a given order item. Now that we have our abstractions in place and we have a PricingCalculator that goes through our list of rules and applies each one's price, we need to actually implement each one of these rules. The easiest one is the EachPriceRule, and if we look at this particular rule, we see that IsMatch simply takes the exact expression we had inside of our if statement and returns it, so we're going to return item.Sku.StartsWith("EACH"), and then in CalculatePrice we're simply going to return the body of the if statement, so item.Quantity times 5. If we look at the PerGramPriceRule, you can see that it's very similar: the if expression becomes IsMatch, and the contents of the if when it is true become CalculatePrice.
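Here is a standalone sketch of the refactored pricing design just described. The names IPriceRule, PricingCalculator, EachPriceRule, PerGramPriceRule, and RefactoredCart follow the narration, but the exact signatures are assumptions; OrderItem has the same shape as in the earlier cart sketch.

```csharp
// Illustrative sketch of the refactored design; signatures are assumptions.
using System.Collections.Generic;
using System.Linq;

public interface IPriceRule
{
    bool IsMatch(OrderItem item);
    decimal CalculatePrice(OrderItem item);
}

public interface IPricingCalculator
{
    decimal CalculatePrice(OrderItem item);
}

public class EachPriceRule : IPriceRule
{
    public bool IsMatch(OrderItem item) => item.Sku.StartsWith("EACH");
    public decimal CalculatePrice(OrderItem item) => item.Quantity * 5m;
}

public class PerGramPriceRule : IPriceRule
{
    public bool IsMatch(OrderItem item) => item.Sku.StartsWith("WEIGHT");
    public decimal CalculatePrice(OrderItem item) => item.Quantity * 4m / 1000m;
}

public class SpecialPriceRule : IPriceRule
{
    public bool IsMatch(OrderItem item) => item.Sku.StartsWith("SPECIAL");
    public decimal CalculatePrice(OrderItem item) =>
        item.Quantity * 0.4m - (item.Quantity / 3) * 0.2m;
}

public class PricingCalculator : IPricingCalculator
{
    private readonly List<IPriceRule> _pricingRules = new List<IPriceRule>
    {
        new EachPriceRule(),
        new PerGramPriceRule(),
        new SpecialPriceRule()
    };

    // Find the first rule that matches the item and let it do the pricing.
    public decimal CalculatePrice(OrderItem item) =>
        _pricingRules.First(rule => rule.IsMatch(item)).CalculatePrice(item);
}

public class RefactoredCart
{
    private readonly List<OrderItem> _items = new List<OrderItem>();
    private readonly IPricingCalculator _pricingCalculator = new PricingCalculator();

    public void Add(OrderItem item) => _items.Add(item);

    // The cart no longer knows anything about individual pricing rules.
    public decimal TotalAmount() =>
        _items.Sum(item => _pricingCalculator.CalculatePrice(item));
}
```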
And finally, the special price of three for a dollar is handled the same way: we've pulled out the logic that was inside of our if statement and applied it here. So let's make sure our tests still pass. We'll come down here and we'll change which one of our carts we're going to use, to use the refactored cart, and I think that's going to have to be Model.RefactoredCart, there we go, and we'll make this one the same, Model.RefactoredCart. Alright, so now we are using the cart that we just changed. We'll run all of our tests again, and you can see they all pass once more. So what if we wanted to add a different rule? We mentioned that perhaps there is a pricing rule that says if you buy four, you get the fifth one free. Now just for argument's sake, let's say that each one of these items costs one dollar per unit, so we'll say that the cart total should return four dollars with four buy-four-get-one-free items. So we're going to say cart.Add(new OrderItem), set the quantity to four, and for our SKU, which we haven't defined yet, we'll say buy four get one (B4GO) instead of BOGO, and we'll say these are apples. Alright, once we've done that, we are going to assert that Cart.TotalAmount in fact returns four dollars. Now we're going to run our tests, and our test fails, because of course we don't have a rule that matches this yet. So how would we go about introducing a new rule that matches buy four get one, gives us back four dollars for four items, and also gives us four dollars when we buy the fifth one? Well, the first thing we need to do is create a new price rule. So we can come to our price rule and create a derived type, and this price rule is going to be called BuyFourGetOneFreePriceRule. Of course we are going to want this to be public, and we're going to need to implement these members, so IsMatch will return item.Sku.StartsWith("B4GO"), buy four get one. That looks pretty much like our EachPriceRule, right? Yep, so on that same note let's go ahead and copy the EACH calculation, but with quantity times one, and in the spirit of test driven development we're going to do just the simplest thing that could possibly work, which I think I've done now, to get that test to pass. So when I run this, oh, I need to actually add this rule to our PricingCalculator, so PricingRules.Add, new it up. Build, alright, run our tests, and look, it works. Of course we haven't actually tested that it does what it should do when there are five items, so that will be our next step. Where we said four dollars with four buy-four-get-one-free items, we think it should be four dollars with five of those items as well. So if we change this quantity to five, everything else should still work; actually it should fail, because we haven't implemented our rule yet. And we get "four dollars with five buy four items" failed; it actually returned five dollars. So we need to go back to our rule, and while we're here let's move it to another file. It's the buy four get one rule, and it turns out this rule is actually very similar to the three for a dollar rule. So we'll just snag that code real quick, jump in, and drop it in here. What it turns out is that we can just take the total as quantity times one.
The sets of five, in this case, equal the item quantity divided by five, and then we subtract a dollar for each of those sets of five, and that achieves our rule. So buy four get one free is basically the same as saying that you get a dollar off every time you buy five, where a dollar happens to also be the cost of each unit. So we'll build this and run our tests one more time, and you can see that they all pass. Now in this case, the way we achieved this, we didn't have to touch the cart: our implementation of TotalAmount remained unchanged. In our PricingCalculator we added one item in its constructor, where we're setting up these rules, but you could imagine that these rules could easily be passed in through some kind of configuration or database, such that we wouldn't have to touch the PricingCalculator itself. The logic of the PricingCalculator, its CalculatePrice method, remains unchanged. So the only real new code that we had to write was in this new buy-four-get-one-free rule class that did not previously exist, where we implemented the interface. So when do we apply the Open/Closed Principle? It's important not to just willy-nilly add abstractions everywhere within your code, because the result is going to be something that is very difficult for anyone to follow, and more complex than needed. The first thing you should think about is what your experience with this particular type of problem tells you. If you know from your own experience in the problem domain that a particular class of change is likely to happen, you can apply OCP up front in your design. However, if you don't know this, and oftentimes if you're working in new problem domains you won't, you should follow the "fool me once, shame on you; fool me twice, shame on me" practice. At first, just code up the simplest thing that could possibly work: hard code the values if you need to, put the logic inside of an if-then statement, do something that's simple and works, and make sure you test it. If the module changes once, do the simplest thing that can make it work; add an else to your if statement if that's all it takes, and continue on, making sure again that you've got tests that demonstrate the behavior. However, once it changes a second time, you're at the point where you know this is something that's volatile; it's shown that it has a propensity for frequent change, and it's time to refactor it to achieve the Open/Closed Principle. The way you do that is by finding an abstraction, creating an interface, and then extracting the if-then logic, or switch statement logic, into separate classes where each one represents a particular node in that decision tree. Remember TANSTAAFL: There Ain't No Such Thing As A Free Lunch. Implementing the Open/Closed Principle will add complexity to your design, and you cannot have a design that is closed against all changes. So you want to make sure that you choose the types of changes that are actually likely to occur for your design to be closed against; otherwise you'll have added complexity with no benefit. So to summarize, if you write your software such that it conforms to the Open/Closed Principle, it will yield flexibility, reusability, and maintainability in your application. You want to try to know which changes to guard against, and of course resist premature abstraction.
There are some related fundamentals: the Single Responsibility Principle, which is also part of SOLID, as well as a couple of design patterns, the strategy pattern, which we saw utilized here and which is explained further in other videos, and the template method pattern, which I mentioned but which you'll have to go watch another video to see how it's applied. For recommended reading, I highly recommend the book Agile Principles, Patterns, and Practices in C# by Robert C. Martin and Micah Martin. To wrap up, I'd like to include some credits for the motivational poster on the Open/Closed Principle that we saw in this presentation. I'd like to thank you very much for your time, and I hope that you'll continue to learn from us through additional Pluralsight On-Demand videos.
-
The Liskov Substitution Principle
Introduction
Hello, my name is Steve Smith, and in this Pluralsight On-Demand video we'll be discussing the Liskov Substitution Principle, one of the software fundamentals and the "L" in the SOLID principles of object oriented development. In this brief course we are going to define what the Liskov Substitution Principle means. We'll identify the problem that it intends to solve, we'll show some examples of how violations of this principle can become problematic in your code, and then we'll refactor the problem so that we are able to apply the principle and eliminate these issues. Finally we'll look at some related fundamentals and additional resources. Let's get started.
-
Definition and Overview
The Liskov Substitution Principle simply states that subtypes must be substitutable for their base types. This principle is named for Barbara Liskov, who first described the principle in 1988. This motivational poster sort of drives home the idea that if you have an abstraction that looks like a duck and quacks like a duck, but needs batteries, it's probably not the right one. In this case you could say that the rubber ducky that requires batteries is a duck, but it certainly isn't substitutable in all places where you would expect to have an actual duck. In order for substitutability to work, child classes must not remove behavior from their base class, nor should they violate base class invariants. In general, calling code should not know that there is any difference at all between a derived type and its base type. One of the things that many software developers learn early on in their education, when they learn about object oriented development, is the use of the IS-A relationship to describe inheritance. It's very common to say that a particular class IS-A whatever its base class is; for example, one might have an Employee class that IS-A Contact, which IS-A Person, or you might have a Square that IS-A Shape, or a Car that IS-A Vehicle. This is a very common tool, and what the Liskov Substitution Principle suggests is that rather than simply considering whether or not some noun is another noun, you should instead consider whether or not it is substitutable for that other noun in all situations where one might expect it to be. This is most easily seen through the use of an example, which we'll get to in a moment. Now, one of the things that is important when we talk about the Liskov Substitution Principle is this concept of invariants. Invariants are things that have to do with the integrity of the model that your classes represent, and they consist of reasonable assumptions of behavior made by clients, by other classes that make use of your class. These can often be expressed as preconditions and postconditions for methods, and they're not necessarily indicated within your code. Frequently unit tests can be used to identify what the expected behavior is for a given method or class, and these unit tests should fail if that behavior is broken or changed by a subtype that violates it. There's a practice called design by contract, which is a technique that makes defining these pre and post conditions explicit within the code itself. There are a number of ways that you can use design by contract with C#, but those are beyond the scope of this particular session. In order to follow the Liskov Substitution Principle, derived classes must not violate any of the constraints defined, or assumed, by clients of their base classes. Let's look at an example of how Liskov Substitution Principle violations can cause problems.
-
Demo: The Problem
Let's look at this using some simple geometric shapes. This demo uses two simple shape-related classes; at the top of the screen you see we have a Rectangle class which defines a Height and a Width. Then we've derived Square from Rectangle, because of course a square is a rectangle, and we are taking advantage of inheritance here to override those virtual properties and replace them with our own, such that whenever we set the Width it will set both the Width and the Height, and likewise if we set the Height, the setter will set both the Width and the Height as well. We've also defined an area calculator; the area calculator knows how to calculate the area for a rectangle or for a square, and in this case the two are very similar, of course. We've also written a few unit tests. Our first unit test simply says that our CalculateArea method should return six if we give it a two by three rectangle, so we set up our test with a two by three rectangle, call the area calculator, and run it to see if it works. And it passes. Likewise we expect CalculateArea to return nine given a square with a height of three, and again if we run this test, you can see "nine for three by three square" passes. But then we come to this third test, which suggests that if I have a rectangle and I give it a width and height of four and five, I should expect an area of 20. In this case we happen to get this rectangle from an instance of Square, but that shouldn't have any bearing on our use of a rectangle from that point forward. So the problem we're going to see in a moment is that this Square type we are using is not in fact substitutable for Rectangle, where we would expect this invariant to hold true. If we run this test, we will, not surprisingly, see that our assertion failed, and that where we expected 20 we actually got 25. So the problem here is clearly our Square: although in geometry a square is a rectangle, it is violating the expected behavior of a rectangle, which is simply that when one sets the height and the width of an actual rectangle, these should not have any effect on one another. It should be possible to set the height independent of the width and vice versa. Our implementation of this Square class has broken that expectation of clients, which is a reasonable expectation for the behavior of a rectangle. There is another problem with this design that we should look at. If we take a look at the area calculator class, it's violating a principle called tell, don't ask. What's going on is that it's asking its parameter, the rectangle, for its height and its width, and performing some algorithm on them, in this case simply multiplication. Likewise it's asking a square for its height so that it can square the height. The problem with this is that we've got behavior decoupled from state. The state of the rectangle, its height and its width, is contained within the Rectangle class, while the behavior of the rectangle, its area calculation, has been moved to this area calculator. Now, you might think that the Single Responsibility Principle states that a rectangle shouldn't need to know how to calculate its area, but it's also perfectly valid to argue that our rectangle lacks cohesion, because operations that are wholly dependent upon the rectangle, for instance calculating its area, have now been moved out into this separate class, which can't exist on its own.
This class only works if it's able to collaborate with a rectangle. So in this case it might be worth considering a design change that pushes the responsibility for calculating area into the Rectangle class or the Square class as appropriate. Now there are several ways one can do this; typically you would expect to put the logic for calculating area into some kind of base class, because it's common to many different types of geometric shapes. But let's consider what happens if we don't do that: we take our simple rectangle, and we have an abstract class Shape that it's going to derive from, but our abstract class does not in fact define an Area method. Then we can push an Area method onto Rectangle that multiplies its height and its width, and we can also define a Square that inherits from Shape and returns its area using the SideLength property we've defined. Then we can write our tests such that they work correctly, and these are the same three tests that we had before, for the most part. So we've got a two by three rectangle, where we're now calling its Area method. We've got a three by three square, calling its Area method, and then finally we have this last test that is actually going to calculate 20 from a rectangle and nine for a three by three square. Let's just run them to prove that they are currently working, and you can see that they are. But let's look at how we implemented this, because this is actually not something that we want to be doing. First of all, I created a list of shapes so that we could polymorphically enumerate over a set of shapes and then do something with each one, in a way that didn't care about the particular type of shape it was. Likewise I created a list of areas just so I have something to check in my assertions: I'm going to check what the first area calculated was and expect it to be 20, and I'm going to check the second area and expect it to be nine. That's because I've passed in a four by five rectangle as my first item and a square with a side length of three as my second item. Now in order to achieve this, I had to investigate the type of each one of the shapes in my list. So when I enumerate through my collection using this foreach loop, I can then go in and say, well, if my shape is of type Rectangle, then call Rectangle's Area method, and if it's of type Square, then call Square's Area method, and I'm having to do a direct cast here. So you see that this works, but it's not maintainable. The next time I add another shape, I come down here and say, okay, let's add a new public class Triangle that derives from Shape, and we give it something like a public int Base and a public int Height, and we can say its Area returns 0.5 times Base times Height; of course that stops working with ints, but you get the idea. At this point, in order to get my method here to continue to work with any type of shape, I would have to go in and add Triangle to this if check. You can imagine that as I implement multiple different shapes, this would continue to get out of hand. So this is causing an Open/Closed Principle violation, because my code here in this foreach loop is no longer closed to modification. I'm going to have to open it up and edit this if statement block for every new type of shape that I define, and I'm going to have to do that in every location in my code that relies on this kind of behavior. So it's going to end up causing maintenance issues in the future.
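Here is a compact, standalone sketch of the Rectangle/Square problem just described; member names are assumptions based on the narration.

```csharp
// Illustrative sketch of the LSP violation; names are assumptions.
public class Rectangle
{
    public virtual int Height { get; set; }
    public virtual int Width { get; set; }
}

public class Square : Rectangle
{
    // Setting either dimension silently sets both, which breaks the
    // reasonable expectation clients have of a Rectangle.
    public override int Height
    {
        get { return base.Height; }
        set { base.Height = value; base.Width = value; }
    }

    public override int Width
    {
        get { return base.Width; }
        set { base.Width = value; base.Height = value; }
    }
}

public class AreaCalculator
{
    // "Ask" style: interrogates the shape for its state instead of telling it
    // to compute its own area.
    public static int CalculateArea(Rectangle rectangle) => rectangle.Height * rectangle.Width;
}

public static class LspDemo
{
    public static void Main()
    {
        Rectangle rectangle = new Square();
        rectangle.Width = 4;
        rectangle.Height = 5;

        // A client working with a Rectangle expects 20, but gets 25,
        // because the Square overwrote Width when Height was set.
        System.Console.WriteLine(AreaCalculator.CalculateArea(rectangle));
    }
}
```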
-
Problem Analysis
So we see that the problem with these Liskov Substitution Principle violations is that non-substitutable code breaks polymorphism: the code can no longer be used as if it were its base type. Also, client code expects child classes to work in place of their base classes, and they can no longer do so. And finally, we saw that fixing substitutability problems by adding if-then or switch statements can quickly become a maintenance nightmare, as it violates the Open/Closed Principle.
-
LSP Design Smells
Here are a few of the smells that you should look for in your code that could be indicators of a Liskov Substitution Principle violation. In this case we've got something similar to the example we just showed, where we're iterating over a collection of a particular base type, in this case Employee. Now we're going to test to see whether that employee is a Manager, and if that's the case we'll go ahead and call PrintManager, otherwise we'll call PrintEmployee. You can imagine that as we added additional classes of employees, we might have to further break up this if statement and add additional cases to it, thus making our code more and more difficult to maintain. It would be better if the manager knew how to print itself, or if a single print method were able to do the work regardless of which type the employee was, so this if logic could be removed. Another smell you'll find is a child type that inherits from a base class or interface but does not fully implement that interface. In this case we have an abstract base class called Base, which has methods one and two, and then we have this class Child, which has decided that it really only wants to implement method two and is going to leave method one throwing an exception. This kind of thing is fairly common; it can cause issues if clients of your code are expecting you to have fully implemented the interface, and in fact you've only implemented it partially. If you control all the code yourself, this may not be an issue in your case, but it is something to keep an eye on and be aware of. It's also something you might consider adding a unit test for, just to prove that you do in fact expect this not implemented exception, so that a user of your interface, or of your actual implementation, will know that that's the expected behavior. Another thing that can help fix this kind of issue is to follow the Interface Segregation Principle, which we'll cover in a later episode. The Interface Segregation Principle suggests that you use smaller, well factored interfaces that are suited to the client code, rather than interfaces that are larger than necessary. So if this were a case where in fact you only needed method two of this base class, you could refactor that base class so that it only gave you the methods you needed, and specify an interface, either a base class or an actual C# interface type, that would only include the methods and fields you require. Let's look at how we can refactor our shape example in order to achieve a better design.
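Before that, here are compact sketches of the two smells just described. The Employee/Manager and Base/Child names come from the narration; everything else is illustrative only.

```csharp
// Hedged sketches of two LSP smells; these are not the slide code verbatim.
using System;
using System.Collections.Generic;

public class Employee { public string Name { get; set; } }
public class Manager : Employee { }

public static class ReportPrinter
{
    // Smell 1: type checks inside a polymorphic loop. Every new employee
    // subtype forces another branch here.
    public static void PrintAll(IEnumerable<Employee> employees)
    {
        foreach (var employee in employees)
        {
            if (employee is Manager manager)
                PrintManager(manager);
            else
                PrintEmployee(employee);
        }
    }

    private static void PrintManager(Manager m) => Console.WriteLine($"Manager: {m.Name}");
    private static void PrintEmployee(Employee e) => Console.WriteLine($"Employee: {e.Name}");
}

// Smell 2: a child type that only partially implements its base type.
public abstract class Base
{
    public abstract void MethodOne();
    public abstract void MethodTwo();
}

public class Child : Base
{
    public override void MethodOne() => throw new NotImplementedException(); // not substitutable
    public override void MethodTwo() => Console.WriteLine("Only this part is supported.");
}
```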
-
Refactoring to a Better Design
Recall that in our first design, we were unable to calculate the area of a rectangle correctly because, when we passed it a rectangle that was actually a square, it treated the object as if it were a square and set both sides to the same value. Then we considered the version without that inheritance relationship, where we created a new class called Shape that Square and Rectangle both derive from. So we broke the IS-A relationship between square and rectangle and replaced it with one that says both a rectangle and a square are a shape. However, in this case we didn't take it quite far enough, because we didn't move the Area behavior into that abstract class, and therefore we weren't able to use it polymorphically. Instead we had this nasty smell where we have to check the type of each shape and then call its particular Area method. We can fix both of these problems by applying a little bit of intelligence to our design and simply creating an abstract Area method on Shape. Now that we have that method, we can implement it via override on both Rectangle and Square, and then we're able to run those same tests that we saw before. Here we have one where we take a shape from a rectangle and we expect to get back 20, and in this case we do, and likewise if we take a list of shapes and deal with it polymorphically, by enumerating through it, our code is much simpler now. There is no if-then logic here at all; we are simply adding up each shape's area, and we're not asking each shape for its height and its width, or its side length, or whatever, from some third-party calculating code. Instead we just delegate to the shape and say, calculate your area and let me know what it is. If we run this, we will also see that it's now working. So by applying the refactoring such that we take Rectangle and Square and break them apart, so that Square no longer is a Rectangle, since they are not substitutable for one another, and then moving that behavior into Shape so that we can access it polymorphically on each object itself, we're able to get a design that follows the Liskov Substitution Principle, and also follows the Open/Closed Principle, because we'll be able to add additional shapes to this collection without having to touch the code that calculates the area of multiple shapes. So when should you take the time to look for and fix Liskov Substitution Principle violations? If you notice obvious smells like the ones I just showed, either where you're doing if-then checks within a polymorphic enumeration of a collection of types, or where you're inheriting from a type or interface and not fully implementing it, you may want to consider fixing that violation, by using a better interface, or simply fully implementing it, or refactoring your code so that a different base class is used that does in fact offer substitutability. Otherwise you can use the same rules as for the Open/Closed Principle: if you find yourself having to change the code more than once or twice, then it's time to refactor it so that it satisfies the Open/Closed Principle, and therefore also the Liskov Substitution Principle.
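Here is a minimal, standalone sketch of the refactored hierarchy just described; the names follow the narration, but the details are assumptions.

```csharp
// Illustrative sketch of the refactored shape hierarchy; details are assumptions.
using System;
using System.Collections.Generic;
using System.Linq;

public abstract class Shape
{
    public abstract int Area();
}

public class Rectangle : Shape
{
    public int Height { get; set; }
    public int Width { get; set; }
    public override int Area() => Height * Width;
}

public class Square : Shape
{
    public int SideLength { get; set; }
    public override int Area() => SideLength * SideLength;
}

public static class ShapeDemo
{
    public static void Main()
    {
        var shapes = new List<Shape>
        {
            new Rectangle { Height = 4, Width = 5 },
            new Square { SideLength = 3 }
        };

        // No type checks: each shape is told to calculate its own area.
        int totalArea = shapes.Sum(shape => shape.Area());
        Console.WriteLine(totalArea); // 20 + 9 = 29
    }
}
```

Adding a Triangle later means adding one new class with its own Area override; the summation code stays closed to modification.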
-
Tips
A couple of last tips. Remember: tell, don't ask. You should prefer not to interrogate objects about their internal state. If you find yourself doing that, it's usually a good idea to move that behavior into the object in question, or possibly to extract out an object that has the state and the behavior collected together. Rather than asking, let me know this property, let me know that property, let me know the other property, because I want to do some operation on them, you should just tell the object to run the operation, and it should use its own internal state to do so. You should also consider refactoring to a new base class when you have two types that seem related but you are not able to substitute one for the other, as we saw with the square and the rectangle. In that case you can create a third class that does in fact allow each of them to be substituted for it, and change both of the original classes so that they derive from this new base class.
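As a small illustration of tell, don't ask, here is a hypothetical example; the Invoice type is invented purely for this sketch and is not from the course demos.

```csharp
// Hypothetical "tell, don't ask" illustration.
public class Invoice
{
    public decimal Subtotal { get; set; }
    public decimal TaxRate { get; set; }

    // Tell: the object uses its own state to perform the operation.
    public decimal Total() => Subtotal * (1 + TaxRate);
}

public static class BillingExamples
{
    // Ask (avoid): interrogating the object's state and doing the work outside it.
    public static decimal TotalByAsking(Invoice invoice) =>
        invoice.Subtotal * (1 + invoice.TaxRate);

    // Tell (prefer): delegate the operation to the object itself.
    public static decimal TotalByTelling(Invoice invoice) => invoice.Total();
}
```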
-
Summary
To summarize, conformance to the Liskov Substitution Principle allows you to properly use polymorphism in your application and will produce more maintainable code. You should remember that is-substitutable-for is the preferred relationship to look for when you consider inheritance, rather than simply the IS-A relationship that's so commonly used. Some of the related principles to the Liskov Substitution Principle include polymorphism and inheritance, of course, as well as two other SOLID principles, the Interface Segregation Principle and the Open/Closed Principle. For more reading on this subject I recommend the Agile Principles, Patterns, and Practices in C# book by Robert C. Martin and Micah Martin, which you can get at the URL shown on your screen. I need to offer some credits for the image of the ducks that I showed, and I'd like to thank you for your time. I hope that you'll return to Pluralsight On-Demand to learn more about software fundamentals in the near future. Thank you very much.
-
The Interface Segregation Principle
Introduction
Hi, this is Steve Smith, and in this Pluralsight On-Demand module we're going to look at one of the fundamentals of software development, the Interface Segregation Principle. The Interface Segregation Principle represents the letter "I" in the SOLID acronym of principles of object oriented design, and applying it can help you create projects and applications that have fewer hidden dependencies and are more cohesive and easier to maintain. In this module we'll begin by defining the Interface Segregation Principle and see why it's a problem when it isn't followed. We'll look at some examples of classes that have too many dependencies because the Interface Segregation Principle was not followed and the classes were forced to depend on fat interfaces. Once we've analyzed the example and found what the problems are, we'll refactor it in order to apply the Interface Segregation Principle and arrive at a better, more maintainable design. Finally we'll look at some tips for when and how to apply ISP, and at some related fundamentals, and then wrap up. So the Interface Segregation Principle basically states that clients should not be forced to depend on methods they do not use. This comes from the excellent book Agile Principles, Patterns, and Practices in C#. The corollary to this, of course, is that you should prefer small, cohesive interfaces to fat interfaces. I really like these motivational posters; you can actually put them up in your team room if you like, or in your cubicle. This one is for the Interface Segregation Principle, of course, and says, "You want me to plug this in where?" The idea here is that we have a very small dependency that we require, which is this upper-left USB cable, but instead of being able to simply plug that into a USB port on a computer, say, we're being forced to haul around a large dependency, represented by whatever this widget is, with knobs and switches and buttons, as well as multiple USB ports. It is basically way more than we need; it does provide us with a USB port we can plug into, but it also brings along a whole lot more stuff that we don't actually care about. When we talk about the interfaces that we're supposedly segregating with the Interface Segregation Principle, it's important for us to further define what we mean by an interface. Of course in C#, an interface is a reserved word, and represents a non-implementable type that specifies a public set of methods and properties that must be implemented by anything that chooses to implement that interface. However, it's also the public interface of a class: any class, any type, whatever its public methods and properties are. If these are things that are used by some client, and the client only requires some small subset of those things, it's possible that you would end up with a better design if you were to segregate that class in some way that made it so the client didn't need to depend on as much of it. An example of this would be the classic design pattern, the façade pattern, which basically lets you take a large class, or a set of complex classes, and replace them with a much simpler class that offers only the subset, or interface, that the client actually needs. Let's go ahead and look at an example showing how violating the Interface Segregation Principle results in a worse design.
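Before we get to the ASP.NET example, here is a tiny hypothetical sketch of the core idea: a fat interface forces clients to depend on members they never use, while a small, client-focused interface does not. The names here are illustrative only, not real framework types.

```csharp
// Hypothetical illustration of fat versus segregated interfaces.
public interface IMembershipService      // fat: every client must depend on all of this
{
    bool ValidateUser(string userName, string password);
    void CreateUser(string userName, string password, string email);
    void ResetPassword(string userName);
    void UnlockUser(string userName);
}

// Segregated: a login control only needs validation, so that is all it asks for.
public interface IValidateUser
{
    bool ValidateUser(string userName, string password);
}

public class LoginControl
{
    private readonly IValidateUser _validator;
    public LoginControl(IValidateUser validator) { _validator = validator; }

    public bool Authenticate(string userName, string password) =>
        _validator.ValidateUser(userName, password);
}
```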
-
Demo: The Problem
Let's take a look at a real example first; this is the ASP.NET membership provider that ships with ASP.NET 2.0. The base type here is MembershipProvider, which is defined in System.Web.Security. This is what's used out of the box for things like the SQL membership provider that many websites are built on today, and it lets you leverage the login controls, the forgot-password control, and other things out of the box with ASP.NET 2.0 or later. One of the common criticisms of the membership provider, though, is that it has a very fat interface. In fact it has quite a large interface: if we look at this very, very simple implementation of the membership provider, where we have not yet implemented anything at all, you can see that just filling out the set of not-implemented exceptions for each method or property is over 100 lines of code. Now let's look at some of the classes that utilize this MembershipProvider type. Consider the Login control, again from the ASP.NET world, in System.Web.UI.WebControls. Specifically we want to look at a method named AuthenticateUsingMembershipProvider. This calls a little utility called LoginUtil.GetProvider, which fetches the current membership provider. That method is shown here; you can see it's not doing anything too special, it's just grabbing whatever the currently configured provider might be. The part that we're interested in is right here: once it's gotten a membership provider, the type returned by that static method, the method that it calls is ValidateUser. If we look at all the rest of this class, we'll find that this is in fact the only reference to the membership provider that the Login control has. Which makes sense, because the Login control is a very cohesive control; it's very tightly focused, with pretty much the single responsibility of validating the user. So it makes sense that it would only need to call this one method, and that it would only need to pass that method the two things it requires, which are of course the user name and the password. But if we want to leverage the Login control in our ASP.NET website with our own custom membership provider that talks to something other than SQL Server, whether we write our own version of the SQL membership provider, or we talk to Oracle, or we talk to Facebook, whatever, we are going to have to implement the entire MembershipProvider base type, because the Login control depends on MembershipProvider and not on, for example, an IValidateUser interface that would be specific to what it actually needs. Now, moving away from this example of ASP.NET membership, because it would be rather time consuming to actually work with this large interface, we're going to work with another example that's still very practical. In many of my applications, I use a configuration section in order to define some of the common settings; I think most developers in .NET do this as well. If we look at a very simple implementation called AboutPage, which you could imagine might exist in an ASP.NET application with some modification of course, you can see that it takes in a TextWriter and then writes out the current application name and who wrote it, the author name.
Now it's not going to hard code these things; it's going to do this in such a way that it's reusable and configurable, and so it's going to use this ConfigurationSettings class in order to achieve that. Of course this application does a lot of things, so this ConfigurationSettings class has grown somewhat large over the course of building this application. Let's have a look at that class. The ConfigurationSettings class inherits from ConfigurationSection, which means that we can use it to read from the web.config or app.config file of our application. It has some boilerplate code in here whereby it exposes a static Settings property, which is of its own type, and when called will go ahead and give you a loaded-up version of these settings. We also take advantage of some nice features with the attributes here, where we can specify that some of these configuration settings are required, and these will actually be enforced at run time when someone attempts to retrieve one of these properties. The actual implementation of all this is beyond the scope of a talk on the Interface Segregation Principle, but if you're using .NET it's worth learning how this stuff works with the ConfigurationProperty attribute. If we look at a summary of these types, we'll see that we've got an application name, an author name, a cache duration, a database server name, database name, database user name, database password, and a web service base URI. These are all the things that represent the settings for my application. The way this has been implemented, it's also using a configuration file, so you can see that I've defined my new section here, called configurationSettings, and then I have all of those settings specified here with their default configuration. Now with this in place, everything actually works. I can go into my about page tester, called AboutPageShould, and I can say AboutPageShould DisplayApplicationName, and if I run this, we should get a green light, and we do. What this is basically saying is that I expect to get "Interface Segregation by Steve Smith" as my output when I call AboutPage.Render and I don't pass anything in. So we have a default constructor, but we have a note here that says this is hard to test; we have to have an app.config in order to test this. If I go and take my app.config and remove it, or change the name of anything in here, say I change my name to Steve Smith 2, and then rerun my tests, we find that this is going to be very brittle. Now we're going to see that it failed: Steve Smith was expected, but we actually got Steve Smith 2. So a problem here is that when you start relying on these configuration files throughout your code, two things happen. One, you've got a dependency on a file, which means your unit tests are much more brittle and more difficult for multiple different users to have working correctly on their own machines. The other issue is that now we've got a bunch of stuff that we had to set even though we didn't need it. For instance, in order to do all this testing of the about page, I had to go into app.config and also put in settings for cache duration, database server name, and all this other stuff that I really don't care about at all for this test or for that class. But if I go look at ConfigurationSettings, I'm going to find that those things are all required, so if I didn't set them I would be getting an exception when I went to test it.
And we can show that real quick as well if we just delete one of these, change my name back to Steve Smith, and run our test one more time. You can see that now we're getting another error that says we threw an exception, and if we look at that exception we'll see that it's telling us that the required attribute cacheDuration was not found. So again, because of our dependency on this fat interface of ConfigurationSettings, we are now getting hosed by these dependencies on things that we really don't care about. Let's go analyze what some of the problems are with this.
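For reference, here is roughly what the two classes in this demo look like. This is a reconstruction from the description above, so the real course code may differ in its details:

using System;
using System.Configuration;
using System.IO;

// Reconstructed fat configuration class: every property is required, so even a
// test that only cares about two values must supply all of them in an app.config.
public class ConfigurationSettings : ConfigurationSection
{
    public static ConfigurationSettings Settings
    {
        get { return (ConfigurationSettings)ConfigurationManager.GetSection("configurationSettings"); }
    }

    [ConfigurationProperty("applicationName", IsRequired = true)]
    public string ApplicationName { get { return (string)this["applicationName"]; } }

    [ConfigurationProperty("authorName", IsRequired = true)]
    public string AuthorName { get { return (string)this["authorName"]; } }

    [ConfigurationProperty("cacheDuration", IsRequired = true)]
    public int CacheDuration { get { return (int)this["cacheDuration"]; } }

    [ConfigurationProperty("databaseServerName", IsRequired = true)]
    public string DatabaseServerName { get { return (string)this["databaseServerName"]; } }
    // ...plus database name, user name, password, web service base URI, and so on.
}

// The AboutPage only needs two strings, but it is hard-wired to the static
// Settings property, and therefore to the configuration file and everything in it.
public class AboutPage
{
    public void Render(TextWriter output)
    {
        output.WriteLine("{0} by {1}",
            ConfigurationSettings.Settings.ApplicationName,
            ConfigurationSettings.Settings.AuthorName);
    }
}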
-
Problem Analysis
So the problem is that we have a client class like the Login control that needs only the ValidateUser method of the membership provider, but unfortunately it's being forced to use this massive MembershipProvider API, this fat interface that has the one thing it needs and thirty things that it doesn't need. Likewise we have the about page example, which simply needs an application name and an author name, but it's being forced to deal with this ConfigurationSettings class that has additional properties that are required to be set on it, and which our client, the AboutPage class, really doesn't care about. Furthermore, we're having to deal with configuration files in a class that really just cares about two strings, so it's bringing along dependencies that we really don't want to have on our simple little class. Interface Segregation Principle violations result in classes that depend on things they don't need, which increases coupling and reduces flexibility and maintainability. Let's go ahead and look at how we can refactor our design to make it conform with the Interface Segregation Principle and take care of some of these problems.
-
Refactoring to a Better Design
The first thing we can do, if we go ahead and move into another folder here called Configuration2, is extract the dependency on the settings file into something that we can inject from our test. What this lets us do is specify an interface called IConfigurationSettings and then push that interface in through what's called Dependency Injection and the Strategy Pattern, which we'll learn more about in the Dependency Inversion Principle talk. Otherwise, if our class doesn't have anything passed in, it will default to using the ConfigurationSettings.Settings property that it was actually using before. So with this refactoring we've gone from our initial setup, where the about page was simply hard coded to use the static property ConfigurationSettings.Settings, to one that uses an interface that gets set to a field, this private readonly _configurationSettings, and that is what's used in the Render method. So where's this interface? Well, we created this interface by looking at all the things that our ConfigurationSettings class exposed, and I added some comments here that show what I consider to be the general gist of each of these settings: we have some application identity settings, some performance tuning settings, data access settings, and web service API settings. Now, all of these things are in one interface because we have the same ConfigurationSettings class that we had before, but we are going to make it implement the IConfigurationSettings interface. Truth be told, it was doing this before, because I've only got one of these classes in my solution, but it wasn't important until we got to this refactoring step. If we look at our test code now, in Configuration2, we've got AboutPageShould still, and you'll see that if we call the about page now, which is in Configuration2, it should still work. This is the exact same test we had before, but we are also now able to create our own implementation of IConfigurationSettings and basically create a fake version of our configuration here. I've done that, and I've set it to be a test app name and a test author name as my two values that I care about. And then I also still had to implement the rest of this huge interface, with all these other things that I don't care about. And I've done what most people do, which is to take that ugly stuff that I don't want anyone to see and sweep it under the rug using a region tag. Any time you see people putting things in a region tag that they don't want you to see, or that are cluttering up the class, that's usually a smell, and it's telling you that these are things that need to be somewhere else, need to be refactored, need to be gotten rid of. So the next thing that we can do once we've got this class is take the settings class and pass it into our about page, and now we can actually write tests that don't rely on a configuration file. We can show that this particular value right here, for test author name and test app name, is actually what we get when we run this test. And so if we run the tests that are in this class, we'll see that they both pass, showing that the original implementation running from the config file still works, and now we're also able to run this test with our own implementation, through the use of the interface that's being injected into our AboutPage class.
So this is all very nice, and it's getting us a little bit further away from our dependency on the configuration file, but it hasn't really done anything for us in terms of segregating our interface. We took our big fat interface, which was the ConfigurationSettings class whose interface consisted of all these public values, and we simply moved that from a class interface to an actual C# interface with the exact same signature. So we made a little bit of progress, but we didn't really get to the point where we're conforming to the Interface Segregation Principle. The next step, in Configuration3, is to look at our about page and ask what are the actual things that we depend on. And so, especially looking at the comments on IConfigurationSettings about what exactly these things are, we realize that we only care about application identity settings on this page. So we can create our own interface, called IApplicationIdentitySettings, and we'll define it right here, and it only includes the application name and the author name. Now, once we make this change we have an about page that works fine for any new implementation, but if someone is still passing in the old ConfigurationSettings class, this is going to break, because ConfigurationSettings doesn't know anything about IApplicationIdentitySettings. Furthermore, if we change ConfigurationSettings so that it no longer has these two properties on its main interface, but only has them on the sub-interface, that's going to cause some other things to break. So in order to make this work we're going to have to make some changes to our base interface, and if I do that right now all of this will work and I can get rid of my TODO comments. If we undo this stuff right now, we'll see that this doesn't work, because I'm passing in ConfigurationSettings but I'm expecting IApplicationIdentitySettings. So to make that work, I can simply delete these members from my main interface, and then say that this interface actually inherits from IApplicationIdentitySettings, which, oh by the way, means I need to reference that other namespace. But once I do that, everything magically works again, because this interface and the other are together the same as they were before. I'm able to rely on just the subset of the interface that I care about, and everyone else is still able to use the exact same interface they used before. So through this simple refactoring I'm able to apply the Interface Segregation Principle and get away from having to depend on that fat interface, without breaking anything else that depends on the existing interface. If we look at our test, we'll see there's a little bit more that we have to do here, basically in terms of uncommenting stuff. This wouldn't have worked before, but now I can uncomment it and it should work. You can see I've got test settings like I had before, but there is no longer any ugly region with stuff that is not implemented. I'm only using the interface that I actually need, and I can run my test now with just this small interface and see that it works. So if we run these tests, we see that they pass.
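Condensed into code, the refactoring described above looks roughly like this; the names follow the transcript, but the member lists and bodies are illustrative:

using System.IO;

// The role interface the AboutPage actually needs.
public interface IApplicationIdentitySettings
{
    string ApplicationName { get; }
    string AuthorName { get; }
}

// The existing fat interface now inherits from the smaller one, so existing
// callers keep working while new clients can depend on just the subset.
public interface IConfigurationSettings : IApplicationIdentitySettings
{
    int CacheDuration { get; }
    string DatabaseServerName { get; }
    // ...other data access and web service settings...
}

public class AboutPage
{
    private readonly IApplicationIdentitySettings _identity;

    public AboutPage(IApplicationIdentitySettings identity)
    {
        _identity = identity;
    }

    public void Render(TextWriter output)
    {
        output.WriteLine("{0} by {1}", _identity.ApplicationName, _identity.AuthorName);
    }
}

// In a test, a tiny fake is all that's needed: no app.config, and no region
// full of unimplemented members.
public class TestIdentitySettings : IApplicationIdentitySettings
{
    public string ApplicationName { get { return "Test App Name"; } }
    public string AuthorName { get { return "Test Author Name"; } }
}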
-
Design Smells and Tips
Some of the smells that you should be looking for in your code that indicate you might be violating the Interface Segregation Principle include unimplemented interface methods, whether in an abstract class, any other type of base class, or an actual interface. If you find places in your code where you're overriding methods from your base class or base interface and then simply throwing a new NotImplementedException, or providing some other kind of degenerate implementation, you should realize that this is probably violating ISP, because clearly the class that's using this implementation is not using this particular method, and therefore it's using a smaller subset of the interface that it's being forced to depend upon. Remember too that these violate the Liskov Substitution Principle, because these degenerate classes will not be substitutable for their base classes when clients expect the entire interface to be implemented and all the methods to do something useful. Another smell is when you have a client that references a class but only uses a small portion of it. This is very similar to the last smell, but not quite the same; this one is seen from the client side rather than from the implementation side. When you see it, sometimes you can make a façade or some other kind of class that your class depends on, so that you're not depending on the larger class, which is perhaps more likely to change and break your class. When should we fix violations of the Interface Segregation Principle? Like most of these principles, you really only want to address them if there's pain. If there's no pain, then there's not really a problem that needs to be attended to, and you should continue adding new features, fixing bugs, and generally adding value to your application. However, if you find yourself depending on a fat interface that you own, and this is causing problems because of the dependencies involved, the best thing to do is create a smaller interface that has just what the client needs, have the fat interface implement this new interface, as we just did in our demo, and then reference the new interface within your client code, ignoring the fat interface from then on. If you find fat interfaces are problematic but you don't own them, for instance in the example I showed with the membership provider that's built into the .NET framework, something you can do is create a smaller interface with just what you need and then implement this interface using an adapter that implements the full interface. This allows you to work with a subset of the large interface, with the adapter sitting between you and the third party interface that you don't have control over. So some basic tips for the Interface Segregation Principle: keep your interfaces small, cohesive, and focused. Whenever possible, let the client define the interface, because this will ensure that the interface really only includes what the client needs. Also, whenever possible, package the interface with the client. Alternately, you can package the interface in a third assembly that both the client and the implementation depend upon, and only as a last resort should you package interfaces with their implementations.
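Here is a small sketch of that adapter approach, assuming the IValidateUser interface suggested earlier in the module. MembershipProvider is the real framework type; the adapter and interface names are illustrative:

using System.Web.Security;

// The small, client-defined interface: exactly what the login code needs.
public interface IValidateUser
{
    bool ValidateUser(string userName, string password);
}

// An adapter that implements the small interface by delegating to the full
// third-party MembershipProvider, which we don't own and can't change.
public class MembershipProviderAdapter : IValidateUser
{
    private readonly MembershipProvider _provider;

    public MembershipProviderAdapter(MembershipProvider provider)
    {
        _provider = provider;
    }

    public bool ValidateUser(string userName, string password)
    {
        return _provider.ValidateUser(userName, password);
    }
}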
-
Summary
So to summarize, the Interface Segregation Principle states that you should not force client code to depend on things that it doesn't need. You want to make sure you keep your interfaces lean and focused, and refactor large interfaces so they inherit from smaller interfaces that your clients use. There are some related fundamentals here, including polymorphism, inheritance, the Liskov Substitution Principle, and the façade pattern. I recommend the Agile Principles, Patterns, and Practices book that I mentioned earlier; you can get it at the URL shown here. And I have to provide some credits for the motivational poster that I showed earlier. There is a set of them available at this URL that you can download; they have some high-res versions, so you can print them out for free and put them up if you like. And with that, thank you very much. This has been a presentation of Pluralsight On-Demand by Steve Smith on the Interface Segregation Principle. I hope you'll come back again soon to learn some more about software development.
-
The Dependency Inversion Principle
Introduction
Hi, this is Steve Smith, and in this Pluralsight On-Demand! video we're going to take a look at the Dependency Inversion Principle, the D in the SOLID principles of object-oriented design. The Dependency Inversion Principle is one of the most important principles of building object-oriented software, and we're actually going to have to split this module into two parts because there's quite a lot to cover. We're going to start off by defining the Dependency Inversion Principle. We'll outline some of the issues that occur in software that does not follow this principle. I'll further demonstrate that with an example, in which we'll see how these problems arise, and then we'll analyze that problem and figure out a way to refactor it in order to apply this principle. And then we'll have a look at some related fundamentals that apply along the way.
-
Definition
So the Dependency Inversion Principle states that high-level modules should not depend on low-level modules. Both should depend on abstractions. Furthermore, abstractions should not depend on details, but rather details should depend on abstractions. This is from the Agile Principles, Patterns, and Practices in C# book by Robert C. Martin and Micah Martin. So, if you've been following along on these modules thus far, you know that I like these motivational posters that demonstrate these principles. This one is for the Dependency Inversion Principle and it says, Would You Solder A Lamp Directly To The Electrical Wiring In A Wall? As you can see with our electric gadgets, we have a common interface, which is our plug, which if you are in the U.S. looks like the one shown in the upper right here, and using this common interface we're able to plug in any number of different devices that implement that interface. That's what Dependency Inversion allows us to do as well. When we write our classes in such a way that their dependencies are exposed as interfaces, we are then able to pass in implementations of those interfaces just as we're able to plug in any particular device we want into the wall socket.
-
What are Dependencies
So before we go any further talking about dependency inversion, we should consider: what exactly are dependencies? Obviously, if you're writing .NET software you have taken a dependency on the .NET platform, the .NET framework, and more or less on Windows unless you're developing for Mono. That's not within the scope of dependency inversion per se; it's a dependency that you're probably pretty comfortable with and that you don't expect to change much over the course of your software's lifetime. However, the dependencies we're talking about with regard to our application design are lower level, things that we expect may change as part of our application during its lifetime. For example, access to third party libraries: these can often change frequently, and therefore, if possible, we want to be able to inject alternate implementations of these third party libraries into our code, unless we're certain that our choice of third party library is not likely to change for the lifetime of our application. Certainly our database represents a dependency, assuming that our application has one, and it's definitely one of the things that you will want to wrap in such a way that it is not an implicit dependency within your code, but rather something that can be injected and replaced. We'll see more about how to do that soon. Other dependencies are less obvious. For example, if your code references the file system, if it sends email or even checks email from a POP mailbox, or if it uses web services or really any kind of network access at all, those are dependencies as well. Any sort of system resource, even something as simple as the clock that you might access via DateTime.Now, represents a further dependency that might need to be inverted in situations where it affects the behavior of your application and there's no way for you to test it unless you run it at certain times of day. Configuration can be a dependency in terms of the files that you use for configuring your application. The new keyword is itself a possible indication that you've got a dependency within your application: you want to limit the places in which you allow your application to instantiate new objects, unless they're primitives like strings, for example. Related to that is the use of static methods. Any time you're calling a static method, you're adding a dependency to your code that cannot easily be separated from the calling code when you're trying to write a test for it, or when you want to change the way your code works throughout the entire application in one place. If you have static methods sprinkled throughout your code, it's very difficult to change them all through one configuration change or one startup file change. Thread.Sleep can also be a dependency, as can the use of Random. It can be very difficult to test code that is supposed to give you random results, so if you specify an interface that you use for generating your random values, then in your test you can override that interface and say, well, if the random value is this, then I should expect this result; this allows you to more easily test so-called random behavior. These are some of the dependencies that you should be aware of. It's certainly not an exhaustive list, but I think I've covered most of the major ones that you'll typically find in a .NET business application.
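As a simple illustration of that last point about Random, here is a minimal sketch; the interface and class names are my own, not from the course:

using System;

// Wrapping Random behind an interface so that tests can supply predictable values.
public interface IRandomProvider
{
    int Next(int minValue, int maxValue);
}

public class SystemRandomProvider : IRandomProvider
{
    private readonly Random _random = new Random();

    public int Next(int minValue, int maxValue)
    {
        return _random.Next(minValue, maxValue);
    }
}

// A test double that always returns a known value, so "random" behavior
// becomes deterministic and assertable.
public class FixedRandomProvider : IRandomProvider
{
    private readonly int _value;
    public FixedRandomProvider(int value) { _value = value; }
    public int Next(int minValue, int maxValue) { return _value; }
}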
-
Traditional Programming
Most traditional programming works in such a way that dependencies naturally accrue within the code, because the higher-level modules tend to call the lower-level modules, and by calling them they also tend to instantiate them as they need them. So it's very typical to have, for instance, a user interface application that references some business logic; it might new up one of those business logic classes, perhaps a customer class or an order class, and do some work with it. In the course of calling the methods on that class, maybe it has to talk to the database, or maybe it has to talk to some other sort of infrastructure. Typically that business logic class would then new up a data access component, or a logger component, or some other kind of class to do its work, and so ultimately the user interface is depending on the business logic, and the business logic is depending on the infrastructure, utility, and data access classes. Furthermore, it's often the case that a façade layer is implemented using static methods in order to provide a simpler API for those business level methods. It's much simpler for the business logic method to call SaveCustomer on a data access layer static method than for it to implement 20 lines of ADO.NET code itself, but these static façades can also represent dependencies that are very tricky to extract out of your application. Finally, it's typical that class instantiation and call stack logic is scattered throughout the application. This violates the Single Responsibility Principle, because now every class that decides who its collaborators are, through the use of static methods or the new keyword to instantiate the specific instances it wants to work with, is responsible not just for its actual work but also for determining who it's working with, and those are separate responsibilities that the Single Responsibility Principle dictates should be put into separate classes.
-
Class Dependencies
When we're talking about class dependencies, we want to be honest. That means that our class constructor should require any dependencies that the class actually needs. Classes whose constructors make this very clear have what I would call explicit dependencies; classes that do not make this clear have implicit, or hidden, dependencies. Those classes are lying to you: they say that you can just new them up without passing them anything, but then they don't work if, say, the database isn't there. They're not letting you know that, oh, by the way, they actually need a database in order to do anything. So, for example, here's a class called HelloWorldHidden that has a hidden dependency on the system clock. It'll work a certain way in the morning, another way in the afternoon, and another way in the evening, but there's no way for you to alter that dependency. In this case, it's also violating the Open/Closed Principle, in that if you wanted to change the logic for the dates or the times of day when it does its thing, or if you wanted to test it in some way that lets you run all three paths through the method without having to run your tests at different times of day, you would have a hard time doing so. Your classes should instead declare what it is that they need. For instance, with the HelloWorld example, we need a DateTime in order to determine which greeting to pass back. We could pass in a DateTime through the constructor, or, if it were only being used by one method, we could pass it into the method itself. If we don't want to pass in a particular value object like a DateTime, we could pass in an interface that knows how to return DateTimes, such as an ICalendar that exposes a Now member. Then we could write an implementation of the ICalendar interface that uses System.DateTime.Now by default, but when we're writing our tests we could use whatever implementation we wanted. We could have an afternoon ICalendar that always returns 2:00 p.m., for example, and use that in our test to verify that we receive "Good afternoon" whenever we pass in the afternoon calendar implementation.
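Here is a minimal sketch of that HelloWorld example, reconstructed from the description; the actual slide code may differ, and I have modeled Now as a property:

using System;

// Hidden dependency: the greeting silently depends on the system clock.
public class HelloWorldHidden
{
    public string Hello(string name)
    {
        int hour = DateTime.Now.Hour;                // hidden dependency on the clock
        if (hour < 12) return "Good morning, " + name;
        if (hour < 18) return "Good afternoon, " + name;
        return "Good evening, " + name;
    }
}

// Explicit dependency: the clock is abstracted behind an interface and injected.
public interface ICalendar
{
    DateTime Now { get; }
}

public class SystemCalendar : ICalendar
{
    public DateTime Now { get { return DateTime.Now; } }
}

public class HelloWorldExplicit
{
    private readonly ICalendar _calendar;
    public HelloWorldExplicit(ICalendar calendar) { _calendar = calendar; }

    public string Hello(string name)
    {
        int hour = _calendar.Now.Hour;
        if (hour < 12) return "Good morning, " + name;
        if (hour < 18) return "Good afternoon, " + name;
        return "Good evening, " + name;
    }
}

// In a test, an "afternoon" calendar always returns 2:00 p.m.
public class AfternoonCalendar : ICalendar
{
    public DateTime Now { get { return DateTime.Today.AddHours(14); } }
}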
-
Demo
Let's look at a simple demo showing how violating the Dependency Inversion Principle can cause problems in your application. In this example, we're going to look at a simple ecommerce application which includes an Order class that represents the customer's order. The Order class has a Checkout method which takes in a shopping cart; the shopping cart has some items in it that the user is purchasing. We have some PaymentDetails that will have values such as credit card or cash, and we have an option for whether or not we want to notify the customer when their order has checked out. If we look at the logic for this, we can see that when we have a credit card payment type we're going to charge the card. In any event, we're going to reserve some inventory, and if the customer has requested it, we will notify the customer. As it stands, we see that there are no explicit dependencies being set in the constructor for the Order class, nor are there any obvious dependencies in this method in the form of a static method call or a new instance of a particular implementation. However, when we start to drill down into the methods that it's calling, for instance NotifyCustomer, you can see that we do, in fact, have dependencies on MailMessage and SmtpClient, on DateTime.Now, which is a static property, as well as on a static Logger.Error method that's being used for error handling. Similarly, the inventory reservation method uses an InventorySystem that it instantiates, and ChargeCard, likewise, uses a new PaymentGateway. So the dependencies in this class include the PaymentGateway, the InventorySystem, the SmtpClient and MailMessage classes, the Logger, as well as DateTime.Now. We can find a lot of these dependencies using the Architecture, Generate Dependency Graph, By Class menu item in Visual Studio 2010, assuming that you have the correct version. If you do that, you'll get something that looks similar to this, which will show you all the class-level dependencies. You'll see here that we're depending on PaymentDetails, Cart, Logger, OrderItem, etc. With the settings that I've used, it's not going to show you anything that's in the framework, only classes that are defined within the project, which is why our Logger class shows up but the dependency on DateTime.Now does not. So what's the problem here? The issue is that if we want to try and test the Order class, it's going to be difficult to inject some of these classes that it requires. If we look at our test class, we have two tests written so far. We've named our class OrderCheckoutShould, and that lets us use sort of a sentence structure when we read these tests, so we can say OrderCheckoutShould NotFailWithNoItemsNoNotificationNoCreditCard. So we set it up with a new Order, a new Cart, PaymentDetails of cash, and we set NotifyCustomer to false. When we run this, we don't have any real indication of success or failure except for the fact that we assume that if we didn't get an exception it must have worked, because our method is a void method. And so we can run this one test, and it does in fact pass when we have nothing to do and no items to do it with. The next test is that we don't want it to fail when we have no items, but we do have notification, and we still have no credit card. At this point, we've set a CustomerEmail to the bogus someone@nowhere.com and we've set shouldNotifyCustomer to true.
When we run this test, however, we get an exception, because I'm running this on my developer machine and I don't actually have an SMTP server running. If we view these test results, we'll see that it was a System.Net.WebException: unable to connect to the remote server. It was unable to make a connection to port 25 of localhost when it tried to send the message, and if we look at the stack trace, you'll see that this is all SmtpClient trying to send this MailMessage. So at that point I'm pretty much stuck. I've got a couple of options. I can come into my test method and start doing things like creating a fake SMTP message, or I can go into my order and say something like, if I'm in test mode, don't actually send my message. There are various hacks I can do to try and get around this. There's even a handy tool that maybe you'll want to use sometime called smtp4dev, which you can download from CodePlex. It'll sit there and listen on port 25 on your dev machine, and then if you go and run tests like this one, you'll see it flashes and shows you the message that you received. So here are my order details, for instance. And if you completely shut it down, minimize it, and rerun your tests, it'll pop up a little toast message that shows you that a message was received. So this will still let you test your emails with this particular tool. Similarly, if this were database access, I could set up some kind of test database, and if this were talking to my PaymentGateway and my PaymentGateway supported some kind of fake credentials that I could pass it, that would give me back some kind of test data. There are all kinds of ways that I can do some sort of integration test that really does use the implementation that I expect to use in production. But having to put together and glue together all this infrastructure, just to be able to test the logic of my order? Really, I just want to know that when NotifyCustomer is true, this NotifyCustomer method is called. I don't care about the details of it. I don't care if it's notifying the customer with an email, or with an SMS text, or if somebody is running down the hall to let them know that their cart happened to have been processed. I just care that when that flag is set, this method is called. And right now, because of the tight coupling of the Order class to these dependencies, the MailMessage and the SmtpClient, I have no way of testing that, and that's causing me a lot of pain when I want to be able to write some tests to show that this is really doing what I expect it to do.
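For reference, the tightly coupled NotifyCustomer probably looks something like the following. This is a reconstruction from the description above, with small stub types for Cart and Logger, so the real demo code will differ in its details:

using System;
using System.Net.Mail;

public class Cart
{
    public string CustomerEmail { get; set; }
}

public static class Logger
{
    public static void Error(string message, Exception ex) { /* write to a log */ }
}

public class Order
{
    // The Order news up MailMessage and SmtpClient directly, so there is no way
    // to test Checkout without a real SMTP server listening on localhost.
    private void NotifyCustomer(Cart cart)
    {
        string customerEmail = cart.CustomerEmail;
        if (!String.IsNullOrEmpty(customerEmail))
        {
            using (var message = new MailMessage("orders@example.com", customerEmail))
            {
                message.Subject = "Your order placed on " + DateTime.Now.ToShortDateString();
                message.Body = "Your order details: ...";
                try
                {
                    new SmtpClient("localhost").Send(message);   // hidden infrastructure dependency
                }
                catch (Exception ex)
                {
                    Logger.Error("Problem sending notification email", ex);  // static logger dependency
                }
            }
        }
    }
}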
-
The Problem
The problem with our Order class is that it has a lot of hidden dependencies. It's depending on MailMessage, SmtpClient, the InventorySystem, the PaymentGateway, a Logger, and DateTime.Now. The result is that we have a class with very tight coupling; there's no easy way to change any of these implementation details, so this is an Open/Closed Principle violation. The only way we can change how Order works is to go in and actually change its code, and it all ends up being very difficult to test. So let's talk about a particular technique that we can use to make this easier. Dependency injection is a technique that's used to allow calling code to inject the dependencies that a class needs when it is instantiated. This also goes by the name of the Hollywood Principle, which is basically, "Don't call us; we'll call you." So instead of our class calling SmtpClient, it could say that it needs some kind of a notification service that knows how to do notification, and it will go ahead and call that, but it doesn't need to instantiate it itself. Now, the term dependency injection, or DI, covers three primary techniques that I want to talk about here. The first one is Constructor Injection, the second is Property Injection, also called Setter Injection, and the third one is Parameter Injection. There are other methods that exist, both for dependency injection and for solving the general problem of the dependencies that exist when a class knows about the things that it's newing up; one of those is called service location, but those are beyond the scope of this particular module. So the first type of dependency injection is Constructor Injection, and this is an instance of the Strategy Pattern, which is a very popular design pattern that's extremely useful in object-oriented programming. With Constructor Injection, dependencies are passed in via the constructor. The constructor is then being honest with the things that call it, in that it explicitly states the things that it needs in order to be in a valid state and to be able to do the work that it expects to do. The pros of this approach are that classes self-document what they need, it works well with or without a container (I'll talk more about containers in a moment), and classes are always in a valid state once constructed. Some of the cons are that constructors can end up with a great many parameters if they have a lot of dependencies, which is a design smell in and of itself that needs to be addressed. Also, some features, for example serialization, may require a default, parameterless constructor, so even though you may have a constructor that explicitly specifies all of the things that your class depends on, you may also still have to expose a parameterless constructor that doesn't have those things set up. Also, some methods in the class may not require things that other methods require, so placing those dependencies into the constructor effectively makes all of the methods require all of the dependencies. This is also a design smell, because if you have methods in your class that don't require certain things, and other methods that do, it's likely telling you that your class lacks cohesion, and it would be wise for you to refactor it into multiple classes that share dependencies and common tasks.
Another type of injection is Property Injection, which is also known as Setter Injection because you supply the thing that you're injecting by setting a property. One of the pros is that the dependency can be changed at any time during the object's lifetime, so this is very flexible. One of the cons, however, is that objects are in an invalid state between construction and the setting of their dependencies via setters, unless the constructor calls the setters, and it also can be less intuitive, because there isn't any one place on the class, short of reading the documentation, that tells you exactly which properties need to be set, and perhaps in what order, for the class to be usable by the calling code. The third method is Parameter Injection. In this case, the dependencies are simply passed in as parameters to the method. This is the most granular approach and it's very flexible; it does not require any changes to the rest of the class. However, one of the cons is that this breaks the method's signature, so if you have a method that's already in use by a great many classes in your application, some of which perhaps you cannot easily change, then breaking that method signature could be very expensive, whereas adding another constructor that allows you to pass in this dependency might be something you can easily do without breaking any of the existing code that depends on that method. As with Constructor Injection, this can result in having many parameters on your method, but again, that is a design smell in and of itself and something you should address in order to create a more cohesive design. You should consider using Parameter Injection primarily if you only have one method in a particular class that has a certain dependency; otherwise it's better, in my opinion, to use Constructor Injection, because it makes it very explicit to anyone using your code exactly what it needs in order to function.
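Here is a compact sketch of the three techniques, using a hypothetical INotificationService; these classes are illustrative rather than the course code:

// Hypothetical service used to show the three injection styles.
public interface INotificationService
{
    void Notify(string customerEmail);
}

// 1. Constructor Injection (an instance of the Strategy Pattern): the dependency
//    is required up front, so the object is always in a valid state once constructed.
public class ConstructorInjectedOrder
{
    private readonly INotificationService _notifier;
    public ConstructorInjectedOrder(INotificationService notifier) { _notifier = notifier; }
    public void Checkout(string customerEmail) { _notifier.Notify(customerEmail); }
}

// 2. Property (Setter) Injection: flexible and swappable at any time, but the
//    object is in an invalid state until the property has been set.
public class PropertyInjectedOrder
{
    public INotificationService Notifier { get; set; }
    public void Checkout(string customerEmail) { Notifier.Notify(customerEmail); }
}

// 3. Parameter Injection: the most granular; only this one method's signature changes.
public class ParameterInjectedOrder
{
    public void Checkout(string customerEmail, INotificationService notifier)
    {
        notifier.Notify(customerEmail);
    }
}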
-
Refactoring
When we refactor our example class, the things that we're going to do in order to apply the Dependency Inversion Principle are to extract the dependencies out into interfaces and then inject implementations of those interfaces into the Order. As a side benefit, we'll also reduce Order's responsibilities by applying the Single Responsibility Principle, but we're going to gloss over that, as it's a separate module in and of itself. Let's look back at our solution and at our Order object. The Order object has a number of dependencies that we need to extract, and we're going to do this by applying the Strategy Pattern and using Constructor Injection. In order to do Constructor Injection, the first thing we need is a constructor, so I'll add that, and we can take some of the things that are common throughout the entire class, that are currently parameters on Checkout, and move them into our constructor's parameters as well. For instance, we could say that our order requires a cart, and then we can turn that into a field that we can use elsewhere. Likewise, we could say that we always have to have payment details, so we'll add those as well and make that a field too. The next thing we need to do is take some of our dependencies and convert them into interfaces that we can inject. The first one that we see here is NotifyCustomer, so we can simply take this exact signature and make that our interface. We're going to say that we have a void NotifyCustomer, and we'll come up here and create a new interface, in the same file for now, and we'll call it INotifyCustomer. You see it takes a cart in order to do its work; if we look at this, we're going to see that the cart has the customer email, and pretty much everything else it needs is here. Now, the NotifyCustomer implementation might need to take in an ILogger or something at some point, but for now we can do what we need with just this simple interface, I think. So once we have this interface, let's go ahead and create a derived type for it, right here, and we'll call it NotifyCustomerService. Of course, we want to implement the members, so we have a NotifyCustomer method, and we really just want to take this whole method out of Order and place it in here; at that point we have this NotifyCustomerService that we can use. If we come back into our constructor, we can then pass in an INotifyCustomer as well, and now it's just a matter of wiring things up so that the calls use the fields instead. If we come in here and change the call to NotifyCustomer to go through _notifyCustomer, that's going to call the NotifyCustomer method on our interface, and we've eliminated the dependency within Order on anything having to do with SMTP. We'll see in a moment how we can repeat this same process in order to pull out the dependency on the InventorySystem and the dependency in ChargeCard. So let's look at our class dependencies as they began with our Order class: you can see that the Order class had strong dependencies on the PaymentDetails, the PaymentGateway, the InventorySystem, and, not shown here, a dependency on the System.Net.Mail classes that are doing the SMTP work. Now, I've gone ahead and done all of this refactoring already inside of the loose coupling folder here, and you can download this code and go through it yourself.
The OnlineOrder is something that currently does the work of everything that you would expect for an online version of the order: it needs to do notification, as well as credit card processing and inventory reservation. In the Single Responsibility Principle video, we looked at a couple of other alternatives to Order; for instance, if it were a point of sale order, you might not need a NotificationService, because the customer is standing there and they know that they made their purchase, but in this case we're talking about online orders. We identified these four fields that we wanted to pass in, in addition to the cart, which is being passed in on the base Order class through its constructor. And finally, with each of these interfaces established, we also created default implementations, so the NotificationService looks like the one that I just showed you, and there's also a PaymentProcessor and a ReservationService that are simply cut and pasted out of what was there previously. Looking at OnlineOrder, we can see that the entire class is now only 36 lines of code and it's very easy to follow. You can also see that the new keyword does not exist anywhere in here. This class now takes in all of its dependencies through its constructor, making it very easy for us to test the class through the use of fake implementations. So let's look at what the tests would look like. If we look at our first test here, where we want to send the total amount to the credit card processor, we want to verify that whatever the cart amount is is actually what we're going to charge to the user's credit card. For this, we're going to create fake implementations of each of our dependencies. We have a FakePaymentProcessor, a FakeReservationService, a FakeNotificationService, and we have a cart with $5.05 in it that's of payment type CreditCard. We create our order and check out. Once we check out, we want to check two things: we want to verify first of all that our PaymentProcessor was called at all, and then we want to verify that our PaymentProcessor had the correct amount passed into it, which we compare with the cart.TotalAmount. Now, our PaymentProcessor interface does not support WasCalled or AmountPassed; if we go look at IPaymentProcessor, we'll see that the only method it supports is ProcessCreditCard. So where are these coming from? Well, the nice thing about our fake implementation is that we can do anything we want with it, so within our FakePaymentProcessor we don't actually process any credit cards, but we do have a couple of members that we want to be able to access, one called WasCalled and one called AmountPassed, and using these we are able to verify that the method was called and that the amount passed is the one that we expected. If we were interested in verifying the PaymentDetails, we could expose those here as well and then check for them in our test. Remember that the point of unit tests is not to verify that the full system works, or to test whether or not the PaymentProcessor itself works; the point of the unit test on order.Checkout, which is the method that we're testing, is to verify that Checkout does what we expect it to do. What do we expect Checkout to do in this case? Well, we want it to process the credit card with the correct amount, we want it to call our ReservationService, and we want it to call our _notificationService.
We want to know that it does these things, and perhaps that it does them under certain conditions and in a certain order if necessary, and that's it. We don't want to have to test every little piece of the system just to test this one method, and that's what separating out the dependencies allows us to do: isolate the order.Checkout method and test just the content of this method without having to have actual implementations of all of its dependencies. If we look back at one of the original method tests that we had, we had a test here that said that it should not fail with no items and no credit card, but with notification turned on. And if we run this, we see that we get the exact same behavior that we did before. So now let's look at the class dependencies on OnlineOrder and compare them with what we had with Order. We've broken away from dependencies on many of the concrete classes that Order had and replaced those with interfaces. We now have dependencies on three interfaces, Reservation, Payment, and Notification, as well as dependencies still on PaymentDetails, the Cart, and the Order, but those we're able to new up without any kind of external infrastructure requirements like web services, databases, SMTP servers, etc., so this still allows us to test our OnlineOrder without having to use any of those extraneous bits of infrastructure. So just to review, the main change that we made is that we went into OnlineOrder, identified the dependencies that it had, and created interfaces for those dependencies. We modeled those interfaces based simply on what this particular class needs, so the client is the one that dictated what these interfaces should be, and then we moved the implementation details that previously existed in this class into implementations of those interfaces. This was pretty much just a straight cut and paste of the method that we previously had on Order, moved into a NotificationService, which currently has just the one responsibility of sending this message to a customer. That makes this particular class very easy to test and very easy to follow, because it's doing only one thing, and it cleans up our OnlineOrder class considerably as well, so that it's much simpler and easier to follow too. The nice thing about this is that OnlineOrder is now being very explicit about its dependencies. You know when you create an OnlineOrder that you must have a Cart, a PaymentDetails, a PaymentProcessor, a ReservationService, and a NotificationService. If you don't have those things, you simply cannot create an OnlineOrder; there's no default constructor for it, and so we are able to tell clients of the OnlineOrder class exactly what dependencies it needs in order to function.
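Here is a sketch of the fake described above. The ProcessCreditCard signature and the PaymentDetails placeholder are assumptions on my part, so the real demo code may differ:

// Placeholder type standing in for the demo's payment details.
public class PaymentDetails { /* card number, expiry date, etc. */ }

public interface IPaymentProcessor
{
    void ProcessCreditCard(PaymentDetails paymentDetails, decimal amount);
}

public class FakePaymentProcessor : IPaymentProcessor
{
    // Extra members for test verification. They are not on the interface, which
    // is exactly why fakes are free to expose whatever the test needs.
    public bool WasCalled { get; private set; }
    public decimal AmountPassed { get; private set; }

    public void ProcessCreditCard(PaymentDetails paymentDetails, decimal amount)
    {
        // Record the interaction instead of charging a real card.
        WasCalled = true;
        AmountPassed = amount;
    }
}

A test can then pass this fake into the OnlineOrder constructor, call Checkout, and assert against WasCalled and AmountPassed, with no payment gateway involved.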
-
Design Smells
Now let's take a look at some design smells related to the Dependency Inversion Principle. The first one is the simplest: it's the use of the new keyword. If you find that your code is newing up instances of particular concrete classes rather than depending on interfaces, it's often a sign that you could apply the Dependency Inversion Principle. In this case, what we're showing is this new InventorySystem. If it has external dependencies, maybe it's talking to a database, then the code around it has inherited that dependency. The only way to keep this foreach loop from having to deal with whatever dependencies InventorySystem carries along with it is to replace it with an abstraction, and the simplest way to do that is to replace it with an interface that you use and then inject that interface using the Strategy Pattern and Constructor Injection. Similar to the use of the new keyword is the use of static methods or properties. This can be something as simple as a DateTime.Now that's crept into your code, instead of having an actual DateTime passed in, or an abstraction such as an ICalendar or IDateTime interface that supports the Now member that you can pass in. Another very common scenario is the use of static methods to create a sort of façade layer for your data access, so you'll see this frequently where you have something that offers a bunch of different methods like SaveCustomer or ValidateCustomer, which are static. Unfortunately, if you have a method that does a bunch of stuff and then ends with DataAccess.SaveCustomer, and that static method talks directly to ADO.NET and the database, there is now no way to eliminate that dependency on the database. The same is true if that static method is talking to the file system or any other dependency. So the best thing is to avoid static methods, because they cause these kinds of problems with inherited dependencies. The one place where you could reasonably use static methods is when they don't touch anything other than the parameters that are passed into them. For example, if you had a static method that added two numbers together, and it took in those numbers as parameters, there would be no problem with that, because it's not going to cause any sort of dependency problem. But if you have a static method that instantiates other classes, and those classes might have dependencies of their own, that's something that is likely to cause you problems when it comes time to test.
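To contrast the two cases, here is a small illustrative example of my own, not taken from the demo:

using System;

// A pure static method: it touches nothing but its parameters, so it creates
// no hidden dependencies and is trivial to test.
public static class MathHelpers
{
    public static int Add(int a, int b)
    {
        return a + b;
    }
}

// A static facade that talks to infrastructure: every caller of SaveCustomer
// now transitively depends on the database, with no seam for substituting it.
public static class DataAccess
{
    public static void SaveCustomer(string name)
    {
        // Imagine ADO.NET code opening a connection and executing a command here.
        Console.WriteLine("Saving " + name + " directly to the database...");
    }
}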
-
Where to Instantiate
If we're not instantiating our objects, where do we instantiate them? Somewhere we have to do this. Typically, when you apply dependency injection you get many small interfaces, which is good, because each one of them is very cohesive, they're loosely coupled to one another, and they follow the Single Responsibility Principle and the Interface Segregation Principle, but at some point you have to actually create these objects. There are a couple of different choices for this. The first is that you can create a default constructor that chains to your constructor that actually takes in the interfaces, and provides a default implementation of each of those interfaces. So in the example that we showed, we had an OnlineOrder that took an INotificationService as one of the parameters in the constructor. We could create a default constructor that automatically passes in a new NotificationService, which is our default implementation of the INotificationService. Now any code that is calling this will continue to work just as it did before, and we'll still be able to inject alternate implementations, for instance in our tests, when we wish to do so. This is sometimes called poor man's dependency injection, or poor man's IoC; IoC means Inversion of Control, and we'll see that in just a moment. The other option is to manually instantiate everything in your application startup routine or main() method; in a web app this could be Application_Start in the global.asax. Another option is to use an IoC container, which does the same thing, typically in the main or startup method, but it has a bunch of features that it supports so that you can wire these things up in an intelligent fashion and go to one place to see how your object graph is going to be set up for your application.
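A minimal sketch of that poor man's DI approach, reusing the names from the earlier example; the bodies are illustrative:

public interface INotificationService
{
    void Notify(string customerEmail);
}

public class NotificationService : INotificationService
{
    public void Notify(string customerEmail) { /* send email via SMTP */ }
}

public class OnlineOrder
{
    private readonly INotificationService _notificationService;

    // Default constructor: chains to the explicit constructor with the
    // production implementation, so existing callers keep working.
    public OnlineOrder() : this(new NotificationService())
    {
    }

    // Explicit constructor: the real dependency declaration, used by tests
    // (and later by an IoC container) to inject alternatives.
    public OnlineOrder(INotificationService notificationService)
    {
        _notificationService = notificationService;
    }

    public void Checkout(string customerEmail)
    {
        _notificationService.Notify(customerEmail);
    }
}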
-
IoC Containers
So, IoC containers, or Inversion of Control containers, are responsible for object graph instantiation. They're initialized when the application begins, and typically they use either code or configuration, such as an XML file, to determine what is going to be used whenever an interface is called for. The interfaces and the implementations to be used are registered with the container, so for instance, you might have something in your container that registers INotificationService and says that anywhere it sees an object that requires INotificationService, it should use a new instance of NotificationService. Dependencies on these interfaces are then resolved either at application startup or at runtime. You can call an IoC Resolve method that will go and find whatever instance is mapped to that interface at runtime, or you can have a container that automatically creates the dependency graph for a given class by using its constructor; typically the constructor that has the most parameters is the one the container will use. So in the example with our OnlineOrder, we could have an IoC container create one of those for us, and it would automatically supply the types that were needed for the IPaymentProcessor, the IReservationService, and the INotificationService. Sometimes it's necessary to create a factory class that is able to create your class for you, and then register the factory class and its dependencies with your IoC container. IoC containers are a fairly large topic in and of themselves. There are quite a few of them available, most of which are completely free. Some of those are listed here, including Microsoft Unity, StructureMap, Ninject, Windsor, and Funq or Munq, just to name a few.
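As a rough sketch, assuming Microsoft Unity (other containers have equivalent registration and resolution APIs), the startup wiring might look like this; the interfaces and implementations referenced are the ones from the earlier Order example and are not defined here:

using Microsoft.Practices.Unity;

public static class ContainerBootstrapper
{
    // Called once at application startup (Main, Application_Start, etc.).
    public static IUnityContainer Configure()
    {
        var container = new UnityContainer();

        // Whenever something asks for these interfaces, supply these types.
        container.RegisterType<IPaymentProcessor, PaymentProcessor>();
        container.RegisterType<IReservationService, ReservationService>();
        container.RegisterType<INotificationService, NotificationService>();

        return container;
    }
}

// Usage: var container = ContainerBootstrapper.Configure();
//        var order = container.Resolve<OnlineOrder>();
// Values such as the Cart or PaymentDetails would typically come from a
// factory registered with the container rather than from the container itself.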
-
Summary
So to summarize, you should depend on abstractions rather than concrete types whenever possible. You want to avoid forcing your high-level modules to depend on low-level modules through direct instantiation or through static method or property calls. You want to declare your class dependencies explicitly in their constructors wherever possible. You can inject dependencies through such constructors, or alternatively through property or parameter injection. Some of the fundamentals that are related to this topic include SRP, ISP, the Façade Pattern, and Inversion of Control containers, as well as the Strategy Pattern that we discussed. I recommend the Agile Principles, Patterns, and Practices book listed here, as well as Martin Fowler's article on dependency injection, which is at the URL shown right here. I need to provide credits for the one motivational picture I used for the Dependency Inversion Principle as shown here, and with that, thank you very much. This has been part 1 of Principles of Object-Oriented Design Software Fundamentals, the Dependency Inversion Principle. There will be a second part that shows how you can apply this principle at the application level and at the solution and project level in your Visual Studio projects. Thanks.
-
The Dependency Inversion Principle, Part 2
Introduction
Hi, this is Steve Smith, and this is going to be part 2 of the Dependency Inversion Principle, one of the principles of object-oriented design and a software fundamental, part of the SOLID principles of object-oriented programming. In this module, we're going to talk about project dependencies in our Microsoft Visual Studio applications. We'll look at the problem that occurs when dependencies flow in such a way that the infrastructure and low-level concerns of our application are depended upon by all the other projects in our app. We'll go through an example that shows this particular type of architecture, which is very common, and the problems that arise when one uses it, and then we'll refactor this design, applying the Dependency Inversion Principle at the solution level so that we can improve the design and maintainability of our application. Finally, we'll wrap up with some related fundamentals.
-
Definition
In a typical layered or tiered application design, there are separate logical and sometimes physical layers. For instance, it's very common to have a User Interface Layer, a Business Logic Layer, and a Data Access Layer. Oftentimes, these will each correspond to separate projects in Visual Studio. The nice thing about this approach to design is that it supports encapsulation and abstraction, with each layer working at a level of abstraction appropriate to it. Each layer should only know about the layer one level below it if possible, because this allows individual layers to be swapped out at a later date without affecting the layers above them. Layers also provide a unit of reuse, though the lowest levels are generally the most reusable because there are many layers above them; a User Interface Layer, which sits at the top of the dependency hierarchy, is typically not reusable at all. This diagram shows an example of the traditional or naïve layered architecture approach, with the flow of dependencies going from top to bottom. As you can see, the User Interface is sitting at the top and depending on a Business Logic Layer that makes up the main central area of the app, and most of the Business Logic Layer classes end up calling classes in the Data Access Layer or a Common assembly, or some kind of service layer, which in turn talks to the Database or to Services. If the dependencies flow in this direction, and since we know dependency is transitive, the User Interface is always dependent on the Business Layer, and the Business Layer is always dependent on the Data Access Layer, which in turn depends on the existence of a Database and Services in order for it to function. What this means is that your application is very difficult to test, change, or work with in isolation from a Database or these Services, because of this hard dependency flowing from the Business Layer through the Data Access Layer. It also means we are not depending on an abstraction; rather, we are depending on an explicit instance of the Data Access Layer. We can invert this architecture so that the thing sitting at the bottom of our dependency structure is, instead, the Object Model, the Core, the Domain Objects, as well as perhaps our Business Logic and Services. These can be packaged together as a single assembly or separately if you prefer. It's also important to note that any dependencies these have are represented by interfaces defined at this layer. Next, the other areas of the application, such as Data Access, User Interface, Tests, I/O operations, and Web Services or WCF, would all sit on top of this and depend on these services and objects. Infrastructure concerns outside of our application, such as the Database or Web Services themselves, would reside off to the side where only the Data Access module or the WCF or Web Service module would depend on them, but note that our Business Logic and Core Domain Objects would not have this dependency, and as a result they could be tested without having to have these in play.
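A small sketch of what that inversion looks like in code, with the Core assembly owning the abstraction and the data access assembly implementing it; the Customer and ICustomerRepository names are illustrative, not from the demo.

```csharp
// Core / Business assembly: no references to data access or other infrastructure.
namespace NTier.Core
{
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    // The abstraction lives with the layer that consumes it.
    public interface ICustomerRepository
    {
        Customer GetById(int id);
    }
}

// Data Access assembly: references NTier.Core and the database, not the reverse.
namespace NTier.Data.Sql
{
    public class SqlCustomerRepository : NTier.Core.ICustomerRepository
    {
        public NTier.Core.Customer GetById(int id)
        {
            // ADO.NET or ORM code would go here, hidden behind the Core interface.
            throw new System.NotImplementedException();
        }
    }
}
```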
-
What are Dependencies
Let's look at a demo of how we can build an application that follows the traditional hierarchy and what effect this has on our ability to work with that application. One of the new features of Visual Studio 2010 is the ability to use the Architecture tab to generate a dependency graph by assembly. If we do this for this project, we will see something like what's shown here, where we have our NTier.Web and WebService assemblies at the top of our dependency hierarchy, and these reference our Business Logic Layer, and the Business Logic Layer is currently referencing the Data Access Layer. The Data Access Layer, as well as pretty much every other assembly, makes use of a set of common functions that live in a Common assembly, and then these all reference externals, such as ADO.NET or WebService classes. This is also where you would find things like the Database or external things like the file system, etc. Looking at this design, it's very clear that our Business Layer depends on the Data Access Layer, which in turn depends on externals. This can make it very difficult to change the Business Layer or to test it in an isolated fashion. For instance, if we were to go and create a Test Class like this one, we could say that we wanted to have a class that tests the Security class of the Business Layer and say that Security's Login method should return a user Id for a valid user. We construct our test with a validUserId, email, and password, and of course, we would like to be able to inject these into the system somehow so that we could run this test with the data that we are passing in. Unfortunately, Login is implemented as a static method, making it very difficult for us to do any sort of dependency injection. And when we write this test, if we stop ignoring it and run it, we'll see that because of this dependency on the Data Layer, our test throws an exception, and that exception is that it can't find the connection string in the config file. Of course, I don't want there to be a connection string in my unit test config, because I don't want unit tests to be talking to external resources. I only want to be testing the Business Layer; I don't want to be testing the Data Access Layer or the Database with this particular test suite. Now, I actually put together this demo many years ago when .NET 1.0 was out, in order to show how N-Tier applications should be structured. That was before I was familiar with the Dependency Inversion Principle, and one of the things that I struggled with as I put together this demo was how I could easily swap out one Data Access Layer for another. Part of the intent of the demo was to show how easy it would be to take, for instance, our Security class and swap the Data Access Layer it uses from one that talks to SQL Server through SqlParameters and SqlConnections to a different Data Access Layer, shown here, called Data Access Layer 2. Data Access Layer 2 uses an XML file. So the idea was that one could switch from one to the other, and in the course of my demo, the way I would do that, because I didn't know any better, was to go in here and change DAL to DAL2, change my references to make sure I had a reference to Data Access Layer 2, and everywhere that I called Data Access Layer 1, change those references.
Of course, if I can just delete the reference and lean on the compiler a bit, it'll show me all those places and I can come in here and change them appropriately. In this particular demo, I only have three methods that call Data Access Layer, so I only have three places where I need to make this change, but the fact that I have to go and touch every line of code, is definitely a violation of the Open Closed principle and it was certainly something that bothered me at the time, but I just didn't know of a proper way to change that and avoid that situation. Many applications that I still see have the same problem because they rely on static methods and/or because they use this same type of dependency hierarchy that puts the Data Access Layer below the Business Layer with a direct class-level reference and assembly-level reference that passes from the Business logic to the Data Access Logic. This results in more fragile applications that are more difficult to change as you evolve them and as you wish to adjust their behavior.
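For reference, the tightly coupled "before" state being described looks roughly like this; the method bodies are condensed and the connection string name is a placeholder.

```csharp
using System.Configuration;
using System.Data.SqlClient;

namespace NTier.BLL
{
    public static class Security
    {
        // Callers (and tests) have no seam here for substituting a different
        // data access implementation.
        public static int Login(string email, string password)
        {
            return NTier.DAL.Security.Login(email, password);
        }
    }
}

namespace NTier.DAL
{
    public static class Security
    {
        public static int Login(string email, string password)
        {
            // Blows up in a test project whose config has no such entry.
            string connectionString = ConfigurationManager
                .ConnectionStrings["Main"].ConnectionString;

            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();
                // ... build the SQL, execute it, and return the user id ...
                return 0;
            }
        }
    }
}
```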
-
Traditional Programming
Let's analyze some of the problems with the existing application structure. The biggest one is that dependencies tend to flow toward infrastructure concerns, such as the database and XML files, going through Data Access Layers that are tightly coupled to these structures. The Core, Business Layer, and Domain classes, such as we have them in this example, all depend on these implementation details. And if these implementation details change, it requires us to go in and change our Business Layer, at the very least recompiling it, but in this case actually going and touching lots of individual method calls in order to fix their references so that they continue to function. The result is tight coupling between the Business Layer and, by extension, the Data Access Layer, the XML file, and the infrastructure components that we are using in our application. There's no way to change these implementation details without a recompile, resulting in an Open Closed Principle violation and making this very difficult to test in any kind of an isolated fashion. Remember that dependency is transitive: if the UI depends on the Business Logic Layer, which in turn depends on the Data Access Layer, which in turn depends on the Database, then everything depends on the Database. We want to, instead, depend on abstractions where we know that there are likely to be changes in our application. We want to package up these abstractions, these interfaces, with the client that's using them, in this case the Business Logic Layer, in keeping with the Interface Segregation Principle. Finally, we want to structure our solutions and projects so that the Core or Business Logic Layer is at the center or at the bottom of our dependency hierarchy, with the fewest dependencies and with no dependencies on external infrastructure.
-
Class Dependencies
Now let's see a demo of how we can refactor our naïve N-Tier design into something that follows the Dependency Inversion Principle, and gain some of the benefits of a more flexible architecture. We're going to refactor this NTier application so that the Business Layer sits at the center of the dependencies rather than depending upon the Data Access Layer. I've changed this application back to where it was when we started, so it's now using the NTier Data Access Layer assembly, which depends on the SQL database. If we run our test here, we can see that it fails, and it fails with the exception saying that it cannot find our connection string. The reason for that is because this Security.Login method is defined over here and it's calling the Data Access Layer's Security.Login through a static method call, and if we look at the Data Access Layer Security.Login, we can see that it's actually generating some SQL, opening up a connection, and ultimately executing that using ADO.NET. When we're done, we'll be able to use this test, with a little bit of additional code, to pass in the expected record that we want to find, run our Login, and verify that it did, in fact, return the correct result. We'll then be able to add additional tests, if we like, to make sure our Login and Security classes function the way we expect them to. Let's get started. The first thing we need to do is go into the Security.Login method and make it so that it's no longer static. Static methods are very difficult to test and very difficult to mock out in your tests, so eliminating that is the first step in improving our design. Once we do this, we need to build and fix anything that was expecting a static method, changing it by simply newing up the class that we need and applying a couple of parentheses. If we do this in each location where it is being used, we quickly get back to a state where everything builds. There, now we have a Security class instance that we can use. The next step is to generate an interface that represents the things this class depends upon. I'm going to use what's called the Repository Pattern in the definition of my interface to make this work. Now, if we look at what the Security code is actually doing, it's selecting a user based on their email and password and returning back that user record. So let's look at how we would identify an interface for this. In the Security class, at the Business Logic Layer level, we would say that this is depending on some kind of data access repository that's able to get a user by email and password. So, I'm going to say that we need a new interface. We'll call this public interface IUserRepository, and we'll say that it has a method called GetByEmailPassword that takes in a string email and a string password and returns back an int, which is the user ID. Now I can also come in here and say that I need a new constructor for Security, and in this constructor I am expecting an IUserRepository called userRepository, and I'm going to introduce and initialize a field called _userRepository.
And now I can take that instance and use its method here, so I really want to say _userRepository.GetByEmailPassword, and at this point I'm back to a state where I can rebuild, except that I don't have a default constructor. So the next thing I need to do is create a default constructor, and I'm going to have it chain to the other constructor and pass in a default implementation, which in this case we're going to say is a SqlUserRepository. Now I don't actually have one of these yet, so we're going to generate one, and this all looks good, and we're going to implement the members, and now we should be able to build. Alright, maybe not, what did we miss? Oh yes, we need some curly braces, and now we can build. Alright, so at this point we have an interface, we have a couple of constructors, and we've created a SqlUserRepository that doesn't actually do anything yet, but it will. And the SqlUserRepository right now is living inside of this same file, but we're going to break that up in a moment. So the next step is to move around some files. We need this interface to live in its own file, so we'll move it to another file, and we further want the actual SqlUserRepository to live in its own file. In fact, we're going to see that we want it to live in an entirely different location, a different assembly, but for now we're going to create a new folder for our interfaces and move our IUserRepository into this folder. Then I also want to create another location to put this SqlUserRepository, so I'm going to create a new Data Access project. We'll call it a C# Class Library named NTier.Data.Sql. NTier.Data.Sql does not need to have any classes in it at the moment, but we do want it to have our SqlUserRepository, which we can now delete from here, and this SqlUserRepository is going to need a reference back to that interface, so we'll add a project reference to the Business Logic Layer project and ensure that this repository here has the right namespace. And that looks good. Now the challenge is that our default constructor can no longer know about that SqlUserRepository, so we're going to just comment that out for now, but we'll take that code and put it where things are calling this. So in the places where we call our constructor, really only two of them, we're going to need to insert that code, so we're going to insert it here, and in addition to this we're going to have to give them a reference to this new location. So NTier.Web is going to have to know about this new project reference to NTier.Data.Sql, and NTier.WebService is going to have to have a new reference to that same assembly, and that's just so that we can build. Now if we look back at our error list, I have one more I need to fix, I believe. I'm pretty sure there were two. So here I need to add that, and that's happy. And then here I need that namespace, so it's happy. Now in my test, I don't want to add any dependency on the Sql assembly, because that's not what I want, so instead I want to have something like a new FakeUserRepository. For this we're going to go ahead and new it up, and the FakeUserRepository will need to implement that method as well, and we can have it do something like return ValidUserId, where ValidUserId is a public int field on that class.
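Here is roughly where the Business Layer code stands at this point in the walkthrough; names follow the narration, and the details are approximate. The FakeUserRepository is finished in the next step.

```csharp
// In the Business Logic Layer:
public interface IUserRepository
{
    int GetByEmailPassword(string email, string password);
}

public class Security
{
    private readonly IUserRepository _userRepository;

    // The "poor man's DI" default constructor was commented out once
    // SqlUserRepository moved to NTier.Data.Sql, so callers now choose
    // the implementation, e.g. new Security(new SqlUserRepository()).
    public Security(IUserRepository userRepository)
    {
        _userRepository = userRepository;
    }

    public int Login(string email, string password)
    {
        return _userRepository.GetByEmailPassword(email, password);
    }
}

// In the new NTier.Data.Sql project, which references the Business Layer:
public class SqlUserRepository : IUserRepository
{
    public int GetByEmailPassword(string email, string password)
    {
        // The ADO.NET query previously in the Data Access Layer goes here.
        throw new System.NotImplementedException();
    }
}
```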
We can further say that we have a public string ValidEmail and a public string ValidPassword, and then we can simply say, if email equals ValidEmail and password equals ValidPassword, return ValidUserId, else return 0. Now we can come up here and change the test so that we say var fakeRepository = new FakeUserRepository, fakeRepository.ValidUserId = validUserId, fakeRepository.ValidEmail = validEmail, and fakeRepository.ValidPassword = validPassword. The last thing we need to do is pass our configured fakeRepository into the Security constructor instead of newing up a FakeUserRepository inline. At this point we should be able to run our test, and we should have a passing test. If we look back at our Business Logic Layer, and specifically at the method that we changed, you can see that in this case there was a single interface that we needed to create, and we eliminated the new keyword from everywhere inside this class. We delegated the responsibility for determining which particular instance of a UserRepository we would be using to the caller, which is inverting that dependency; that's applying the Dependency Inversion Principle. Note that poor man's IoC wasn't an option in this case, because we moved the implementation into its own assembly that we don't want the Business Layer referencing, so the responsibility for newing up that SqlUserRepository was pushed to our calling code. What this allowed us to do is have our User Interface Layer, which is this Login.aspx, say that it wants to use a SQL database here, and, in fact, we could use an IoC container to make this much more flexible, but for our purposes here you can see that we can just hard-code it into the user interface, and this still allows our test to pass in a different implementation of the IUserRepository. In this case, we've created a FakeUserRepository that lets us set the valid user and password information and then test for it. Now, there's nothing to say that I had to do this in this particular way with three public fields. Instead, I could have had a list of users that I wanted to specify, and that would've worked just as well. One last note: in addition to services like Security here, which has no state and is simply a collection of methods, there are also going to be domain objects. In this case, the only domain object that we really have in our application is this UserDetails class that we haven't really done anything with thus far, and right now it lives in the Common assembly because it's used by multiple layers in this application. It would be perfectly acceptable to take this class and move it into our Core as a domain object, most likely called User, and any kind of logic that was specific to the user would also live in this class, and then it would be depended upon by any of the other areas in our application. So for instance, if we implemented our user services, which have things like AddUser and GetUser, these would turn into a repository. The repository would know how to create users or fetch users, and when returning a user, it would return back that Core domain User class that lives inside the Business Layer assembly rather than in a separate assembly.
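And this is roughly what the fake and the test end up looking like; MSTest attributes are shown for illustration (the course's test framework may differ), and the sample values are arbitrary.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

public class FakeUserRepository : IUserRepository
{
    public int ValidUserId;
    public string ValidEmail;
    public string ValidPassword;

    public int GetByEmailPassword(string email, string password)
    {
        if (email == ValidEmail && password == ValidPassword)
        {
            return ValidUserId;
        }
        return 0;
    }
}

[TestClass]
public class SecurityLoginShould
{
    [TestMethod]
    public void ReturnUserIdForValidUser()
    {
        var fakeRepository = new FakeUserRepository
        {
            ValidUserId = 123,
            ValidEmail = "user@example.com",
            ValidPassword = "secret"
        };
        var security = new Security(fakeRepository);

        int result = security.Login("user@example.com", "secret");

        Assert.AreEqual(123, result);
    }
}
```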
After we complete our refactoring of the NTier Business Layer to make it use separate interfaces for its dependencies, you can see that the dependency graph now looks something like this, with the User Interface, Web, and Web Service layers depending directly on both the infrastructure or SQL layer and the Business Logic Layer, but notice that the Business Logic Layer no longer depends on its infrastructure. Unit tests are able to depend on it and test it in isolation. This dependency from the UI to the Data Layer could also easily be broken if we introduced another assembly specifically for dependency resolution, and it's a common practice to create such an assembly where one's IoC container work resides. So to summarize, the most important thing that I want you to take away from this module is that you should not depend on infrastructure assemblies from your Core Business Layer in your Visual Studio solution. You can apply the Dependency Inversion Principle to reverse these dependencies by adding new interfaces to your Business Layer and then having the implementations of those interfaces live in separate projects that depend on the Core Business Layer. Some of the fundamentals that are related to what we've talked about include the Open Closed Principle, the Interface Segregation Principle, and the Strategy design pattern. For additional reading I suggest the Agile Principles, Patterns, and Practices book by Robert C. Martin and Micah Martin, as well as Martin Fowler's article on dependency injection. This has been Principles of Object-Oriented Design, the Dependency Inversion Principle Part 2, for Pluralsight On-Demand. Thanks for watching, and I hope that you'll find the additional videos in the Pluralsight On-Demand library useful.
-
The Don't Repeat Yourself Principle, Part 1
Introduction
Hi, this is Steve Smith. Welcome to this Pluralsight on-demand module on the Don't Repeat Yourself Principle. The Don't Repeat Yourself Principle, or DRY, is one of the fundamentals of object-oriented design and software engineering, and personally I think it is one of the most important principles for every developer to learn. In this module, we'll start out by defining the Don't Repeat Yourself Principle, and then we'll go through a demo showing how violating this principle results in spaghetti code that's very difficult to maintain. After a little bit of analysis, we'll go through a series of demos showing how we can refactor such code to DRY it up and make it much easier to maintain and continue to use. We'll briefly look at some code generation options before moving on to show what repetition in your process can do and how it adds to the waste of your process. And then we'll show a quick demo of how we can automate our process to apply DRY as well. Finally, we'll wrap up with a summary and related fundamentals.
-
Definition
If you've been following this series, you probably know that I like these motivational posters. This one I produced myself, and as you can see it shows the punishment for repeating yourself should be something like repeating yourself at the chalkboard. I do believe that repetition is the root of all software evil and that most of the major problems and bugs that crop up in software or that make it more difficult to maintain could be avoided through the proper use of the Don't Repeat Yourself Principle. This principle was first coined in the Pragmatic Programmer book and stated as, "every piece of knowledge must have a single unambiguous representation in the system." Note that these pieces of knowledge are not just values, but they might also be processes or algorithms or approaches to problems and abstractions in your application. In the book 97 Things Every Programmer Should Know, I noted that "Repetition in logic calls for abstraction. Repetition in process calls for automation." Variations of the DRY Principle include Once and Only Once, as well as Duplication is Evil.
-
Demo App and Analysis
Now let's take a look at a demo of a simple Data Warehouse example that I wrote, which violates DRY all over the place, and then we'll see how we can clean this up by applying this principle. In this sample application we have a Data Warehouse loading program, which we're calling Extract, Transform, Load, or ETL. ETL is a common pattern in Data Warehousing, and if you do a search for that pattern you'll find a number of commercial products that focus on Data Warehousing solutions. In this case, the basic steps that we want to apply are to extract some data from our Data Source, perform some kind of aggregation or transformation on that data, load this into our summary tables or our Data Warehouse, and then finally finish up the program. In this example, I'm going to be talking to the Northwind database. As you can see here in my hard-coded connectionString, my Extract method is going to open up a SqlConnection, perform a query against Invoices, and put that data into an invoiceTable. It's going to do some logging and output what it's doing as it goes. Then it's going to open up another connection, also to Northwind, query the employees, store this information in an employeeTable, and as it goes it'll write out the results. In our Transform step we're going to first take that list of shippers that we loaded in the extraction method and go through each one of these shippers, get a count of them, and then basically take the sum of the freight that that shipper received over the period of time that we loaded. When we're done, we'll also calculate the total freight from that period of time as well. Next we'll go and analyze the employees. We'll work out whether or not each employee is a manager based on how many employees report to that employee, and then we'll calculate the bonus for each employee based on that determination of whether or not they're a manager. Finally, now that we have all of our data, we're prepared to load it back into our Data Warehouse, so we'll have to open up another connection. In this case we're talking back to the same Data Source, but in a real system it would likely be a different one. We'll clean up any data that was already there and then we'll insert our records with our new summary data. We'll do that same thing for the employees, cleaning up what was there and then inserting new records for their bonuses. You can see that this is about 200 lines of code, and the only classes in it are our main Program class here, as well as a class for FreightByShipper, which simply holds the shipper name and the freight that they charged during that period. And then for employees, these simply hold the name, whether or not they're a manager, and their bonus. If we run this code, we'll see that it all executes, it finishes up in not too much time, and you can see that it logged all of the things that we expected it to out to the console. Now let's see what we can find as violations of the Don't Repeat Yourself Principle in this code and what we can do to correct it. A very quick analysis of the code shows a number of problems with the Don't Repeat Yourself Principle. These can be broken out into the following bullet points. The first one is the use of many Magic Strings or Values throughout the code, followed by duplicate logic showing up in many different locations. There is repeated if-then logic in some branches of the code, as well as the use of conditionals instead of polymorphism.
There are some repeated execution patterns where the same couple of lines or several lines of code are executed with some slight variation over and over again. And there's a lot of duplicate, most likely copy-pasted code that we can probably find with an automated tool and clean up. We have only manual tests in the code at the moment and there are a lot of static methods, which we'll see can also prove to be problematic.
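To make the analysis concrete, the kind of block being called out looks roughly like this (details are approximate, not the demo's exact code): the connection string, the ADO.NET boilerplate, and the logging format all repeat with minor variations in Extract, Load, and elsewhere.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

static class EtlSnippets
{
    static DataTable LoadInvoices()
    {
        var invoiceTable = new DataTable();

        // Magic string #1: the hard-coded connection string, repeated per method.
        using (var connection = new SqlConnection(
            "Server=.;Database=Northwind;Trusted_Connection=True;"))
        {
            var adapter = new SqlDataAdapter(
                "SELECT * FROM Invoices WHERE OrderDate >= '1998-01-01'",
                connection);
            adapter.Fill(invoiceTable);
        }

        // Magic string #2: the logging format, also repeated per method.
        Console.WriteLine("Loaded {0} invoice rows", invoiceTable.Rows.Count);
        return invoiceTable;
    }

    // LoadEmployees() repeats the same structure with a different query, and the
    // Load step repeats it again around ExecuteNonQuery calls.
}
```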
-
Refactor Magic Strings
There are quite a few examples of Magic Strings or Values in this code. The first is the ConnectionString being hard-coded throughout the code. Obviously this is not a good practice, and it's something we should move to a more secure location perhaps, and in any event to a single location that defines our ConnectionStrings for this application. Formatting strings are also specified in many different locations, as are these blocks of foreach logic to dump things to the console. If at some point we want to change how this data is formatted, or if we wanted to change where it's being displayed, these would need to be updated in many different locations. Another section uses magic numbers; in this case the number 1 is specified in multiple locations here without any indicator of why it is significant. And again, there's the use of formatting strings here that are repeated in multiple locations. Let's look at how we can apply DRY to remove some of these magic strings and values from our application. We've analyzed it and found that we have a number of cases where we're setting this ConnectionString over and over in our code. We'd like to move that to a single unambiguous location that defines the ConnectionString for the application. One option would be to move this to the ConnectionStrings configuration section. We would do this by using ConfigurationManager, which we'll have to add a reference to, so we'll add a .NET reference to System.Configuration. ConfigurationManager has a ConnectionStrings collection where we can look up a named entry; we'll call ours Main and read its .ConnectionString property. Now, in order for this to work, we'll also need to add a new App.config file, and within our app.config we'll add a connectionStrings section with a new connection string entry. We'll paste in the actual ConnectionString we're using and give it that name. By doing that, we can delete this hard-coded one here and run our code, and since we have no actual tests, if it still seems to work, then we can be reasonably confident that it's working. However, if we go through and add this ConfigurationManager lookup everywhere that we're using a ConnectionString in this code, we'll simply have exchanged one kind of duplication for another, because now if we later decide to change the name of that Main entry, or if we decide to use a different method of storing the ConnectionString or passing it into this method, we're going to have to change this in many different locations. So we've improved where the value of the ConnectionString is stored, that's no longer repeated, it's only in one place here, but we haven't gotten rid of the repetition in how we get to that ConnectionString. We can do that by simply pulling that lookup up to a class-level variable. So we'll say string connectionString equals this value, and we'll replace this with this, and now there's a refactoring that we can use here to introduce this as a field called _connectionString. Now we're using that _connectionString field, and if we move this initialization code up here to the beginning of our Main method, we can then use _connectionString throughout for all of our ConnectionStrings. (Typing) I think that's all of them, so now if we run the code again it should still work, and it does. And at this point we can see that we've pulled a fair bit of repetition into this one line that now aggregates all of that information in one place.
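Sketched out, the centralized connection string looks something like this; the Main name follows the walkthrough, and the rest is approximate.

```csharp
// app.config (the entry added above):
//   <connectionStrings>
//     <add name="Main"
//          connectionString="Server=.;Database=Northwind;Trusted_Connection=True;" />
//   </connectionStrings>

using System.Configuration; // requires a reference to System.Configuration

class Program
{
    private static string _connectionString;

    static void Main(string[] args)
    {
        // Read the value once, at the top of Main; everything else uses the field.
        _connectionString = ConfigurationManager
            .ConnectionStrings["Main"].ConnectionString;

        // Extract(); Transform(); Load(); ...
    }
}
```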
Another thing that we saw when we analyzed our code was the frequent use of two things, one being this Console.WriteLine, another being the use of this format string, and in fact, this entire foreach loop is somewhat repetitive. As you can see, we're doing it both for our invoice table here, as well as down here for our employee table. We can apply an Extract method refactoring in order to pull this data out and give us a common way to write it out. If we want to pull out this logic for writing out a number of rows inside of a DataTable, rather than working at the Row level, we can simply create a new method that takes in the table and dumps out the results using this kind of format. We'll start off by creating this new method, and at this point, we simply want to take the logic we had before, for instance, this foreach loop, and convert it so that it's using our table. However, we want it to work over the number of columns that the table has, so we'll do something like this. We've added in a bit of logic here that will determine what the total number of columns is and then loop through those columns, adding in the row with that column index and then a hyphen between each one and finally a WriteLine at the end if we don't have any more columns. This will dump out all of the columns in the table. If we want to actually limit this, because for instance, our actual usage up here is perhaps only showing a certain number of columns, 3, even though it's selecting all the columns in the table, we can pass in a column count, so int columnsToDisplay, let's say, and we'll use that here in place of columnCount. So instead of just table.Columns.Count, we'll say it's going to be that, but if columnsToDisplay is greater than 0 and less than columnCount, we'll say columnCount = columnsToDisplay. And then we can delete this old Console.WriteLine that we had before and use our OutputTable method now in place of these locations here. So we'll say OutputTable and we'll pass in invoiceTable, which we want to show 3 columns, and we can have our columnsToDisplay have a default value if we want to, so we'll make that 0, = 0. And that'll let us pass in nothing for it when we use it down here for our employeeTable. Now if we run this, it hopefully will still work, and it does. And if we scroll all the way to the top, well we'll have to turn off some of our logging here to make this easier. Let's just do our load and run to show. These are all the invoices being output, and at the end these are our customers or our employees, rather. So you can see this is still working the way it did before. We have our hyphen-separated columns. The next magic value that we had in our application appeared in our Transform section where we're doing these if statement checks and we're checking whether the count was greater than 0 and then we're grabbing the first item, and if the count was greater than 1, then we're going to grab the second item, which has index 1. And this number 1 here is the same as this number 1 here and this number 1 over here, and so we could do a simple change to say something, obviously we could fix this in several ways, but the simplest change here to remove that magic number is to have something like int index, which we'll set to 1 and then replace those 1's with our index value. Now if we need to change that, even if it's a copy-paste change, we won't forget one of those 1's when we do it down here and we have these 2's instead. We'll see how we can clean up those if statements in a separate refactoring. 
And the other thing that we have here is a FormatString, used for formatting numeric values, decimals in this case. If we want to pull that out, it's simply a matter of creating a new value. We could create it here, for instance, and then anywhere we want to use that decimal FormatString, we would just replace it here, and here, and here with that value. Alternatively, this whole WriteLine could be extracted out into its own method; in fact, this entire chunk of if logic could be extracted into its own method or its own separate loop that does this work, but that's a separate exercise. So at the moment, we've pulled out a number of magic strings, we've eliminated the connectionString duplication, we've shown how we can output an entire table rather than having repeated loops for displaying rows (and we could apply that same logic in other places in our code where we're looping and writing output to the console), and lastly we've looked at some display FormatStrings and shown how we can easily replace those with a variable in our application.
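Here is a rough sketch of the OutputTable helper extracted in this clip, together with the display format string pulled into a single constant; the column separator and format details are approximate.

```csharp
using System;
using System.Data;

static class ConsoleOutput
{
    // Single definition of the numeric display format used when writing decimals.
    public const string DecimalFormat = "0.00";

    // Dumps a DataTable's rows, optionally limited to the first N columns.
    public static void OutputTable(DataTable table, int columnsToDisplay = 0)
    {
        int columnCount = table.Columns.Count;
        if (columnsToDisplay > 0 && columnsToDisplay < columnCount)
        {
            columnCount = columnsToDisplay;
        }

        foreach (DataRow row in table.Rows)
        {
            for (int i = 0; i < columnCount; i++)
            {
                Console.Write(row[i]);
                if (i < columnCount - 1)
                {
                    Console.Write(" - ");
                }
            }
            Console.WriteLine();
        }
    }
}

// Usage:
//   ConsoleOutput.OutputTable(invoiceTable, 3);
//   ConsoleOutput.OutputTable(employeeTable);
```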
-
Refactor Duplicate Logic
We see that our application has duplicate logic in multiple places. For instance, there is this logic here for specifying how to output data to the console, as well as separate locations where we're opening up a connection. Let's look at how we can apply some refactorings to clean this code up as well. We'll start with a fresh copy of our original solution. Here's our code, and we can see that we have our ConnectionStrings duplicated as we did before, and we have logic here for dumping things out to the console, which we would like to replace with a single new method. We also have a number of different places where we're doing this connection logic to open up a connection and get back a DataTable. We're doing that twice here, and down here in the Load section we're opening up a connection and executing a query with an ExecuteNonQuery call, and we're doing that four times just in this one method. Now, we can extract out this code ourselves, or there are a number of well-proven tools that will do this for us. Let's look at an example. The simplest option would be to add a Northwind DataContext using LINQ to SQL. So we're going to add a new item, we're going to use a LINQ to SQL Class, we'll call it Northwind, we'll go over to our Server Explorer and we'll drag on the tables that we're interested in, including employees and invoices. We'll set up the properties for this; we'll say that our Entity Namespace should be Entities, and everything else looks good. Now if we look at our code, the first thing we're going to do is delete a FreightSummary. So instead of using a connection, we're going to say using(var db = new NorthwindDataContext(...)), and because we don't have our configuration set up for our ConnectionString, we'll just hard-code it for now. Within this DataContext we first want to grab the record that we want to delete, so our shipperToDelete query is going to be: from shippers in db.FreightSummaries where shippers.Name equals the name of the individual shipper in our loop and the RunDate equals DateTime.Today, select the whole record. We just want the first one, so we'll say db.FreightSummaries.DeleteOnSubmit(shipperToDelete.First()) and then submit the changes. That takes care of this big chunk of code here. Next, we want to insert a new record with the RunDate and the values from the shipper that we're on. For that, we'll do basically the same thing. We'll say our shipperToInsert is a new FreightSummary; this is the entity FreightSummary that LINQ to SQL just created for us. We'll say that its Freight is our shipper's freight, its Name is our shipper's name, and its RunDate is still DateTime.Today. And then we want to insert it, so it's db.FreightSummaries.InsertOnSubmit(shipperToInsert) followed by db.SubmitChanges(). If we apply this now to this logic here, inside our using statement, you can see it becomes much smaller, and we should be able to run our code and it should work, and it still does. We would do the exact same thing here, but for the sake of time we won't show that to you. And so that's an example of how we can use LINQ to SQL to eliminate a lot of repeated ADO.NET code.
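Cleaned up, the LINQ to SQL replacement described above looks roughly like this. It assumes the designer-generated NorthwindDataContext and FreightSummary entity and the demo's FreightByShipper class; DeleteAllOnSubmit is used here so the sketch also handles the case where no summary row exists yet.

```csharp
using System;
using System.Linq;

static class FreightSummaryLoader
{
    public static void ReplaceTodaysSummary(FreightByShipper shipper,
                                            string connectionString)
    {
        using (var db = new NorthwindDataContext(connectionString))
        {
            // Remove any summary rows already recorded for this shipper today.
            var existingRows =
                from s in db.FreightSummaries
                where s.Name == shipper.ShipperName && s.RunDate == DateTime.Today
                select s;
            db.FreightSummaries.DeleteAllOnSubmit(existingRows);

            // Queue up the replacement row.
            var shipperToInsert = new FreightSummary
            {
                Name = shipper.ShipperName,
                Freight = shipper.Freight,
                RunDate = DateTime.Today
            };
            db.FreightSummaries.InsertOnSubmit(shipperToInsert);

            db.SubmitChanges();
        }
    }
}
```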
We're still going to see some repetition in this pattern where we're looping through each one of these and doing some kind of a Delete followed by some kind of an Insert and we could certainly eliminate that duplication as well using a different refactoring technique. The other duplicate logic that we saw was this output logic up here where we're looping and dumping out this stuff. We actually cleaned that up in our magic string section, but just for the sake of completeness, we'll show that technique here as well. If we go and we add a method to Output the Table here, then we can refactor these calls so that they simply say, OutputTable with our table name, and the same thing here for our invoices, taking care to change this. And we only want to show three columns for the invoice one. And if we run this you can see that we're still getting here are our employees and here are invoices being output as expected. So we've done that refactoring. Let's look at the next one.
-
Refactor Repeat if-then
Another symptom of violations of the Don't Repeat Yourself Principle that we see in our code is this repeated if-then logic. In this case, too, we can see that we're simply checking whether or not the count is a certain value in order to ensure that we are able to write out the value to our table. Let's look at how we can clean this up. This logic occurs within our Transform method, where the first thing that we're doing is grabbing this invoiceTable and loading the values that we want into our freightByShipperList. So for each row, we're adding a value, and then we're checking to see if our list has a certain count. We want to go and calculate the freight for that particular entry and specify it here. And then if it's another count of greater than 1 or a count greater than 2, we're going to do this, and you can imagine that if we had dozens and dozens of shippers we would end up with many, many instances of this particular if-then block. This can be replaced quite easily with a loop. In fact, it could all be done inside the loop that we already have up here, where instead of specifying what we're doing in a separate if block, we can just extract out a method for it. First we'll pull out the actual entity that we want, so we'll say var myShipper = this new value here that we want, and I'm sure there's a refactoring that'll do that for me. We'll put in myShipper there, and then the only other thing we need to do is say that myShipper.Freight = CalculateFreightForShipper, passing in the name, so myShipper.ShipperName, as well as the invoiceTable, which we have a reference to, and we'll generate this method. At this point we simply copy out the guts of our if-then logic, paste that into our new method, and say that we're going to return this calculation here. And we really only need the ShipperName at this location here, so that lets us do our computation and eliminates the need for this block. Now we want to write it out, so we want the ShipperName, as well as the result, (Typing) and then we can just return the result. That allows us to eliminate all of these if checks by doing all of that work inside of this one loop, which now is one line of code. So we've reduced our total lines of code significantly, and we're taking advantage of a loop that we already had. If we didn't have this loop, we could have simply created our own loop over each of the freightByShipperList entries to do this work. It's a common refactoring to take a number of if statements and replace them with some kind of a loop. And so now we should be able to run this application, and everything still works as it did before.
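A sketch of the extracted method and the simplified loop described above; the column names in the invoice table and the exact write-out format are assumptions.

```csharp
using System;
using System.Data;
using System.Linq;

static class FreightCalculations
{
    // Sums the freight for one shipper across the extracted invoice rows.
    public static decimal CalculateFreightForShipper(string shipperName,
                                                     DataTable invoiceTable)
    {
        decimal result = invoiceTable.Rows.Cast<DataRow>()
            .Where(row => row["ShipperName"].ToString() == shipperName)
            .Sum(row => Convert.ToDecimal(row["Freight"]));

        Console.WriteLine("{0}: {1:0.00}", shipperName, result);
        return result;
    }
}

// Inside the loop that builds freightByShipperList, the repeated if-then
// blocks collapse to something like:
//   var myShipper = new FreightByShipper { ShipperName = shipperName };
//   myShipper.Freight = FreightCalculations.CalculateFreightForShipper(
//       myShipper.ShipperName, invoiceTable);
//   freightByShipperList.Add(myShipper);
```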
-
Refactor Conditional w/Polymorphism
Another violation of the Don't Repeat Yourself Principle is caused by this conditional used instead of polymorphism. Here we see an example of the Flags Over Objects anti-pattern, which also violates the Tell, Don't Ask principle, because we're asking this particular object whether or not it's a manager and then using that answer to compute the bonus ourselves within this external calling code and setting it in a field within the employee object. Let's look at how we can refactor this to apply the DRY principle and make it use polymorphism instead of this if-then logic that will eventually be scattered throughout our code. So let's look at our Employee class. We can see that it has this Boolean for whether or not it's a manager. We can take that out of there and, using inheritance, create a new Manager class that is an Employee. The other thing we want, in addition to this Bonus property, is to be able to calculate the bonus based on the freight that's passed in, in our case. So we'll create a public SetBonus method, and we'll make the Bonus property itself only support get from the outside; its setter is protected, and the SetBonus method takes in a decimal freightUsedForBonus. We want to make SetBonus virtual, and then we'll say for an Employee that Bonus now equals freightUsedForBonus divided by 1,000, but for a Manager we'll override that and simply say that Bonus is equal to the freight divided by 10, which was the logic we used before. And SetBonus can actually return void, it turns out, because it's doing all the work. Now we simply need to create things as Managers or Employees as appropriate in our code. So if we look back at our program here, we're going to say that we get a new employee, and then based on whether or not it's got this particular row value, we're going to set it to one type or the other. We could certainly create a factory method of some sort that takes in this row data and generates the correct class for us, but in this case we can simply move these things around. We'll declare an Employee variable called employee here, and then in our if statement we'll say employee = new Manager with Name = row[0].ToString(), and that's all we really need. Otherwise, we'll say it equals a new Employee, using the block we just snagged, and add them to the list. Now, that doesn't change this logic significantly. Let's see, there we go. But our foreach logic now makes more sense, because instead of having to do this check-and-ask, which is a violation of the Tell, Don't Ask principle, where we ask the object whether or not it's a manager and then set something on it, a very common anti-pattern that you want to avoid, we just let our object do what it's responsible for. It should be responsible for calculating the bonus in this case, or we could create another object that does it, but certainly our main program shouldn't be doing this logic. So now we can simply call employee.SetBonus, passing in the totalFreight value, and all the rest of this can disappear. And it eliminates some repeated logic where, throughout our code, we might be doing this inspection of whether or not an employee is a manager and doing some kind of different behavior based on that.
Using inheritance and polymorphism, we're able to apply this right here in such a way that it does that generally and we can use that same pattern throughout our code. If we run the code, we see that we still get the same results that we got before. Here you can see managers get 20 grand, employees get $207.00, which is what we had to begin with.
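The refactored classes described in this clip look roughly like this; the divide-by-1,000 and divide-by-10 rules come from the demo, while the property shapes and the creation snippet are approximate.

```csharp
public class Employee
{
    public string Name { get; set; }

    // Readable from outside; only the class hierarchy can set it.
    public decimal Bonus { get; protected set; }

    public virtual void SetBonus(decimal freightUsedForBonus)
    {
        Bonus = freightUsedForBonus / 1000m;
    }
}

public class Manager : Employee
{
    public override void SetBonus(decimal freightUsedForBonus)
    {
        Bonus = freightUsedForBonus / 10m;
    }
}

// Creation decides the type once, and the calling loop just tells the object
// what to do (no more asking "is this a manager?"):
//   Employee employee = isManager
//       ? (Employee)new Manager { Name = name }
//       : new Employee { Name = name };
//   employeeList.Add(employee);
//   ...
//   employee.SetBonus(totalFreight);
```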
-
Summary
This wraps up part 1 of the Don't Repeat Yourself Principle, one of the fundamentals of software engineering and a principle of Object-Oriented Design. To summarize, repetition breeds errors and waste. You want to try and refactor your code to remove repetition using a number of known refactorings, design patterns, and principles. I mentioned two books that I recommend. The first one is the Pragmatic Programmer: From Journeyman to Master, available at the URL shown here, and 97 Things Every Programmer Should Know, available at this URL. Thank you very much. This has been the Don't Repeat Yourself Principle Part 1 by Steve Smith for Pluralsight on-demand. Stick around and view part 2 where we'll show some additional refactorings of the code that we were looking at, as well as show how we can apply the DRY Principle to processes in addition to just code.
-
The Don't Repeat Yourself Principle, Part 2
Introduction
Hi, this is Steve Smith, and this is part 2 of the Don't Repeat Yourself Principle, one of the software fundamentals and part of the Principles of Object-Oriented Design course. In part 1, we defined the Don't Repeat Yourself Principle, we went through a number of demos showing how there was repetition in our code, we analyzed the code and looked for ways that we could improve it, and we made a checklist of problems that we found. We went through a number of refactorings to apply the Don't Repeat Yourself Principle and we didn't get through them all, so in part 2 we're going to continue with those and with this part we're going to have an emphasis on testability. So at the end of this section, we'll have our code refactored to a state where we can easily apply Unit Tests to it and we'll also examine some testing concepts such as mocking. Next, in part 3, we'll talk a little bit about Code Generation, some tools that you can use to eliminate repetition in your code and to discover repeat blocks of code, as well as how you can use automation to eliminate repetition in your software design and development processes.
-
Analysis
In part 1, we went through these first four sections. We eliminated some magic strings and values, we found duplicate logic in several methods, we eliminated some repeated if-then logic, and we used polymorphism instead of conditionals. Go ahead and watch part 1 if you want to see some of those refactorings in action. Now we're going to jump down to the testing and static method section. In this section, we're going to be talking about how we can use automated tests instead of just the manual testing that we had previously, and we're also going to eliminate some static cling and get rid of a bunch of static methods, replacing them with methods that are more easily tested. Finally, in part 3, as I said, we'll come back and look at some additional refactorings, find some duplicate code using a tool, and also get into the automation of our processes. Without further ado, let's look at how we can add some testability to our code.
-
Tests and Static Cling
Wrapping up our analysis of the application, we see that it has no tests. You may have noticed inside the various solutions that I've been working with that there are projects there for Unit Tests and Integration Tests, but right now those have no actual tests within them. The only way that we've been testing our code is through manual tests; in this case we've been just running the code over and over again, hoping that we still get the same result that we got when we started. As we add functionality, it becomes more and more expensive to verify that the results are correct, especially as the data gets larger or the number of different operations grows, and if the time required to actually run the code gets longer and longer, our manual testing process gets more and more expensive as more and more time is wasted on these long-running queries. Another problem with this code when it comes to testing is the use of static methods. Static methods introduce tight coupling in our application because there's no way to replace, through Dependency Inversion or similar techniques, the code that's being executed within the static method. They are difficult to test because of these tightly coupled dependencies, which often tie the code to third-party or infrastructure concerns like a database, and they make it very difficult to change the behavior without actually going in and changing the code. They also limit our ability to use object-oriented design techniques such as inheritance and polymorphism, because static methods inherently don't support those, at least in the C# language. Now, it's worth noting that there are some third-party tools that will do some tricks within the code so that you can mock out or change what static methods do, but speaking only about what ships with the .NET Framework and the types of code that you would typically use to test, static methods are a problem.
-
Demo Adding Integration Tests
Let's look at a demo of how we can refactor this code to eliminate the static cling, or the use of static methods and properties, and also add in some tests; we'll look at some quick Unit Tests and Integration Tests. The easiest thing to do with a tightly coupled legacy application that doesn't have a lot of abstractions is to start out by adding some Integration Tests. So we'll look at our Integration Tests project, where I've got a Tests class, and the first thing that we're going to want to do is verify that Extraction loads up the correct values. Looking at Extract, we can see that it's trying to get invoices after a certain date, and then it's trying to get the employees, also from the Northwind database. So in this case, we can say something like ExtractShouldLoadInvoices; we'll make Should part of the class name, ExtractShould, and then put in a public void LoadInvoices method and mark it as a test so it will run. Now we can go and try to test our code. So, let's rename our class here and rename the file, and of course, there's no easy way to test our code at the moment because all the logic is inside the program, the Program class is marked as internal, and so it's very difficult to get to this code. The first thing that we need to do is try to extract the Extract method into something that we can work with. To start with that, we'll create a new class and we'll call it ExtractionService, and we'll simply move this entire block of code into that service. We'll keep it as a static method for now, and here we just call ExtractionService.Extract. Now, the problem is that our ExtractionService needs to actually pass back some of this data. This employeeTable needs to come back, this invoiceTable needs to come back, so instead of returning void, we're going to create a new result type. We'll call it ExtractionResult, and it has a public DataTable InvoiceTable and a public DataTable EmployeeTable. There are a number of different ways that we could do this, but this one will work. Then in our method we'll say that we need a var result = new ExtractionResult(), and then we'll set this to be result.InvoiceTable and replace that everywhere. Similarly, the EmployeeTable will be result.EmployeeTable, replaced everywhere. And then instead of void, we return an ExtractionResult, which means that down here at the end we return result, and here in our calling code we will also say var result is equal to this, and make the method public. Good, and then we need those values on our locals, so we'll say that employeeTable = result.EmployeeTable and invoiceTable = result.InvoiceTable. Now we hope that that still works, so we'll build and run. The code still runs with our manual test, so now we can go into our test class and do the same thing. We can say var result = ExtractionService.Extract, making sure we've got the right namespace. Now, you can imagine that eventually I'm going to want to talk to a different database, but for now let's go ahead and run this test and watch it fail. So, let's run our test, and it's failing right now because it could not load the assembly or one of its dependencies. Once that reference problem is fixed, we can actually run our test and it should work, because we're using hard-coded ConnectionStrings within our program. So when we talk to the database here, it's going to be the same database that our live application talks to.
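At this stage the extracted pieces look roughly like this; the query details are elided, and the Extract method is still static here (it becomes an instance method in the unit testing clip that follows).

```csharp
using System.Data;

public class ExtractionResult
{
    public DataTable InvoiceTable;
    public DataTable EmployeeTable;
}

public static class ExtractionService
{
    public static ExtractionResult Extract()
    {
        var result = new ExtractionResult();
        result.InvoiceTable = LoadInvoices();   // ADO.NET query against Invoices
        result.EmployeeTable = LoadEmployees(); // ADO.NET query against Employees
        return result;
    }

    // In the demo, the ADO.NET code stays inline in Extract; it's split out
    // here only to keep the sketch short.
    private static DataTable LoadInvoices()
    {
        return new DataTable();
    }

    private static DataTable LoadEmployees()
    {
        return new DataTable();
    }
}

// In Program.Extract, the call site becomes:
//   var result = ExtractionService.Extract();
//   invoiceTable = result.InvoiceTable;
//   employeeTable = result.EmployeeTable;
```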
Now obviously we've got another to-do item to pull this out into some kind of a configuration value, and then within our IntegrationTest we would change that out to talk to a test database, but for now let's go ahead and run this Unit Test, and here you can see that we got our output. The other thing that we can then do to make our test actually valuable is to start asserting something about the results, so we can say Assert.AreEqual(result.EmployeeTable.Count), or rather .Rows.Count. And for now, we're just trying to put our system under test so that we know that it does what it does. So, we expect it to be some number and we're going to verify whatever that number is here in just a moment, so we'll change this to be InvoiceTable. We know it's not 0, so we're going to set it to what it really is. So, 2155 is the number that it got back for invoices here. We'll set that. And we'll run it again, and we got 9 employees, so we'll set that here. And we've got a successful test. So this is our first IntegrationTest that we can run. And it basically verifies very little. It proves that we're getting back the number of rows that we expect, and the reason why we expect them is because it's the number of rows we get back currently. This is called a Fixing Test because it fixes what our application does and what we expect it to do at this point in time, and it lets us continue to make changes, confident in the knowledge that when we run this test it'll verify that the system's current behavior has not changed. These are nice to have when you're going in and doing a large refactoring, because the purpose of a refactoring is to change how your code is designed and structured, but not to change its behavior. So you would expect that whatever it was doing previously, it's still doing. And so these are valuable from that point of view; however, they are very slow tests. If we look at the time that this test took, let's run it one more time, and we can turn on timings here. You can see that this test took about 1 second to run. Now that doesn't sound too bad, but as we continue to add tests to this, we're going to have hundreds of tests, and if our test suite takes hundreds of seconds to run, it's going to get very, very slow to run these every time we want to do a check-in or every time we want to verify that our code still does what we expect.
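Pieced together from the steps just described, the ExtractionService, its result type, and the fixing test might look roughly like the following sketch. It assumes NUnit attributes (the build script later in this course runs NUnit), elides the original ADO.NET code, and uses the row counts quoted above:

using System.Data;
using NUnit.Framework;

public class ExtractionResult
{
    public DataTable InvoiceTable;
    public DataTable EmployeeTable;
}

public class ExtractionService
{
    public static ExtractionResult Extract()
    {
        var result = new ExtractionResult();
        // ... the original ADO.NET code goes here, filling
        // result.InvoiceTable and result.EmployeeTable ...
        return result;
    }
}

[TestFixture]
public class ExtractShould
{
    [Test]
    public void LoadInvoices()
    {
        var result = ExtractionService.Extract();

        // Fixing test: pin the behavior the system exhibits today
        // (2155 invoices and 9 employees from the demo's Northwind data).
        Assert.AreEqual(2155, result.InvoiceTable.Rows.Count);
        Assert.AreEqual(9, result.EmployeeTable.Rows.Count);
    }
}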
-
Demo: Adding Unit Tests
So the other thing that we want to do is add some Unit Tests to this code. If we look here now at our UnitTest code, we can change what this class is doing to say something like ExtractShould as well. And what we want Extract to do is simply get some data from these two Data Sources; we don't really care what that data is in this case. So we're going to do some additional refactoring in order to get to the point where we can write this test, but basically we want to have something like our ExtractionService. Well, first we need a method, so we'll create a method for our test and we'll say ExtractShould GetDataFromInvoicesAndEmployees, make this a test, and then we can say var service = new ExtractionService, add our reference. This is going to use a pattern called Arrange, Act, Assert, so for the Act we'll say var result = service. Actually we don't have that as a non-static method yet, but we will in a moment, so we'll just leave this out of here for now. And then we'll have our Assert. In our Assert we can verify the state of the code; in this case we'll say something like Assert.IsNotNull, and it's going to be our result. So for now we'll just say new Object, right. So there's not much here yet. Let's go make it so that we can actually run this UnitTest and verify that it's doing the interactions that we expect. Now the thing about the UnitTest is that it's not going to have any kind of a reference to System.Data, it's not going to talk to the database at all, and the way we're going to achieve that is through the use of something called mock objects, and we'll see how those work in just a moment, or through the use of a fake, which we could also use, so we'll show both of those. So let's go back to our program. We've got this ExtractionService. At this point, let's go ahead and move that to another file, and we've got our result, we'll move that to another file as well. And in our Service, we can see that it's talking to a database, it's doing these two commands. Basically it's going to need to talk to an InvoiceTable, and it's going to need to talk to an EmployeeTable to do these things, so I'm going to go ahead and say that this is a non-static method as our first refactoring, and let's see what that breaks. Well, first off in Program I'm going to have to have this be a new ExtractionService for that to work, and then the same thing in the IntegrationTest we just wrote, and then it's a good idea to run all of our tests to verify everything still works, and they still run. So we can continue with our refactoring. So we come back to our Service, which is now non-static, and we need to say that it depends on some things. So we're going to say, well, we need a constructor, and our constructor is going to say that we need an interface, something like an IInvoiceRepository, and an IInvoiceRepository needs to be able to return back, let's say, a DataTable to keep things easy, so it's going to have a DataTable ListInvoices. We could specify the date that we want, but for now let's leave that hard-coded and we'll do the simplest thing that works, and we'll also need a derived type of this. We'll call this a SqlInvoiceRepository and we'll implement the member, and then in here we basically want to refactor out all the stuff that's using a connection, grab all that, stick it into our ListInvoices, and once we get to the point where we have an InvoiceTable here, we want to return it. Actually that's not quite right. We need a Table there and then we need to Fill it.
So we'll say var table, and then we'll Fill that table. And then we'll return that table, right? Now once we have that, we can come back in here and we can say, you know what, I need an IInvoiceRepository called invoiceRepository. I need a field set to that, so I've now got an _invoiceRepository that's set to that. And I need this to work even if nothing is passed in, so that my code still works the way it always did. So we're going to have a public ExtractionService empty constructor, which calls this and passes in a new SqlInvoiceRepository, and then does nothing else. So this ensures that my code still works the way it always did. This is some poor man's Dependency Injection here. And then in our code we only need this foreach once we get our InvoiceTable, so pull this out. We need this result, and we need to say result.InvoiceTable = _invoiceRepository.ListInvoices. And we can now get rid of this big using block. Next, we want to do the same thing with the employees, so let's get these out of our file: we'll move this to another file, come back to our Service, move this to another file, come back to our Service, and create our new interface for employees. So we have a public interface IEmployeeRepository, and it'll have a DataTable for ListEmployees. Now we just need to create a derived type from that. We'll call it SqlEmployeeRepository. We'll implement the members, and we'll do the same thing we did before: we'll pull out all this code here and drop that in, and we'll need to have a table that we Fill, and of course, you can see there's some duplication here that we would get rid of through methods already shown in other demos. We'll Fill the table, we'll return that table, and then change our code here to take in that Repository. So we have an IEmployeeRepository employeeRepository. We're going to introduce and initialize a field EmployeeRepository, just as we did before. We're also going to change our poor man's injection to automatically inject a SqlEmployeeRepository, so our code continues to work as it did before. And then in our code here, we have to get rid of all the SQL stuff and those closing curly braces, and we just need to say that result.EmployeeTable = _employeeRepository.ListEmployees. Now another instance of Don't Repeat Yourself that you might notice here, which I might change at some point, is that I've got employeeRepository and then I've got ListEmployees; that gets to be kind of redundant, so it's pretty common to simply call this a List method instead, and that way you know it's listing employees because, well, it's an employeeRepository. For now, we'll leave it the way it is and verify that we can still run our code. It still runs. Of course, we should also be able to verify that by running our Unit Tests, which right now is just this IntegrationTest, and it still runs, which brings us now back to our UnitTest. So in our UnitTests, let's go ahead and rename our file, and that's good. Now we're able to actually call our Service, so call service.Extract and get some results back. And since we didn't pass in any kind of variables here, right now we're going to be doing the same thing as our IntegrationTest. So now we can verify result, and running this should run just like our IntegrationTest, and it does. And you can see we get all this data from our actual database. We don't really want to be talking to the database, though, at all.
We really just want to be verifying that our methods were called, so what we're going to do is create a fake repository for invoices. So that will be public class FakeInvoiceRepository that implements IInvoiceRepository, and we're going to implement that member. And in this case, we just want to verify that this was called. So we're going to return a new DataTable with nothing in it, but we're going to have a Boolean here, so we'll have a public bool WasCalled = false, and then we can set WasCalled = true. We'll do the same thing for our Employee class, so we'll have a FakeEmployeeRepository, which implements IEmployeeRepository, implement its member, and here we'll say WasCalled = true, return new DataTable. Now in our UnitTest, we can pass in these particular repositories. So we'll say var invoiceRepository = new FakeInvoiceRepository, var employeeRepository = new FakeEmployeeRepository, and we'll pass these into our code now. This is using the Strategy Pattern to inject these dependencies. Then once we go to Assert, if I were asserting against the result, that would be called state-based testing, where we're testing that the state of our system is what we expect. What we're going to be doing instead is something called behavior-based testing. We're going to verify that the code under Test, this Extract method, did the things that we expected it to do in terms of its interactions with other code. So all we want to Assert here is Assert.That(invoiceRepository.WasCalled), and Assert.That(employeeRepository.WasCalled) as well. And now we can run this test, and you can see that it ran in 3 milliseconds and came back as true, showing that these two things were, in fact, called and it did not talk to the database. So that's an example of refactoring our code to allow for behavior-based testing. Now if we looked at our Service again and said that maybe, based on the invoices, we would or would not call the EmployeeRepository, we can see how that might change our UnitTest. So we could say something like, if (result.InvoiceTable.Rows.Count > 0), then we expect to call this. This is the kind of logic that we would be testing with our behavior-based test, because now we can come back to our UnitTest and we know that we're passing back an empty DataTable for invoices, so we're going to expect this to be false in that case. So now we can run this UnitTest and see that it does what we expect. So that's an example of how we would test the behavior of the method under Test, the Extract method, separately from what the actual data is that it might be returning. And that wraps up our changing of static to instance methods, as well as some refactoring to get these interfaces. So we can move this interface to another file, go back to our Service, move our Repository to another file, and then before we're done here, let's look at what we have. Our program has now gotten a little bit smaller; it no longer has the extraction stuff in it, instead it's using this ExtractionService. We've got a couple of interfaces, which we'd probably move to a separate folder. We have a couple of implementations of those interfaces, which we would probably move to a separate folder or ideally to a separate assembly that would be responsible for our DataAccess. And now we've got both IntegrationTests that talk to the database, as well as UnitTests that verify the behavior of the particular method that we're looking at.
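Reconstructed from the narration above, and showing the final variant where the employee repository is only called when invoices come back, the interfaces, fakes, and behavior-based test might look something like this sketch (the parameterless constructor that chains in the SQL implementations, the poor man's dependency injection, is noted in a comment rather than shown):

using System.Data;
using NUnit.Framework;

public class ExtractionResult
{
    public DataTable InvoiceTable;
    public DataTable EmployeeTable;
}

public interface IInvoiceRepository { DataTable ListInvoices(); }
public interface IEmployeeRepository { DataTable ListEmployees(); }

public class ExtractionService
{
    private readonly IInvoiceRepository _invoiceRepository;
    private readonly IEmployeeRepository _employeeRepository;

    // A parameterless constructor could chain to this one with
    // new SqlInvoiceRepository() and new SqlEmployeeRepository()
    // (poor man's dependency injection) so the existing program keeps working.
    public ExtractionService(IInvoiceRepository invoiceRepository,
                             IEmployeeRepository employeeRepository)
    {
        _invoiceRepository = invoiceRepository;
        _employeeRepository = employeeRepository;
    }

    public ExtractionResult Extract()
    {
        var result = new ExtractionResult();
        result.InvoiceTable = _invoiceRepository.ListInvoices();

        // Only pull employees when some invoices actually came back.
        if (result.InvoiceTable.Rows.Count > 0)
        {
            result.EmployeeTable = _employeeRepository.ListEmployees();
        }
        return result;
    }
}

// Hand-rolled fakes that simply record whether they were called.
public class FakeInvoiceRepository : IInvoiceRepository
{
    public bool WasCalled = false;
    public DataTable ListInvoices() { WasCalled = true; return new DataTable(); }
}

public class FakeEmployeeRepository : IEmployeeRepository
{
    public bool WasCalled = false;
    public DataTable ListEmployees() { WasCalled = true; return new DataTable(); }
}

[TestFixture]
public class ExtractShould
{
    [Test]
    public void GetDataFromInvoicesAndEmployees()
    {
        // Arrange: inject the fakes (Strategy Pattern).
        var invoiceRepository = new FakeInvoiceRepository();
        var employeeRepository = new FakeEmployeeRepository();
        var service = new ExtractionService(invoiceRepository, employeeRepository);

        // Act
        service.Extract();

        // Assert: behavior-based checks on the interactions, not the data.
        // The fake returns an empty table, so the employee repository
        // should not have been called.
        Assert.That(invoiceRepository.WasCalled);
        Assert.That(employeeRepository.WasCalled, Is.False);
    }
}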
-
Demo: Mock Objects
Let's talk about how we could use a Mock object instead of having to write our own Fake objects to change how we would do this UnitTest, and then we'll be done with this section of the module. Using a mocking framework like Moq (pronounced "mock" or "mo-q"), you can easily create Mock versions of these interfaces that you can then use. So in the case here where we want to verify that when we get back an empty result from the InvoiceRepository, we do not call the EmployeeRepository, we would implement that something like this. First we need to change our invoiceRepository so that it is a Mock one, so we'll say this is a new Mock of IInvoiceRepository.
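Since the narration trails off here, the following is only a rough sketch of how the Moq version of that test might look, reusing the interfaces and ExtractionService from the previous demo and relying on Moq's Setup, Returns, Object, and Verify members:

using System.Data;
using Moq;
using NUnit.Framework;

[TestFixture]
public class ExtractShouldWithMoq
{
    [Test]
    public void NotCallEmployeesWhenNoInvoicesAreReturned()
    {
        // Arrange: Moq generates the fake implementations for us.
        var invoiceRepository = new Mock<IInvoiceRepository>();
        invoiceRepository.Setup(r => r.ListInvoices()).Returns(new DataTable());
        var employeeRepository = new Mock<IEmployeeRepository>();

        var service = new ExtractionService(invoiceRepository.Object,
                                            employeeRepository.Object);

        // Act
        service.Extract();

        // Assert: verify the interactions rather than the returned data.
        invoiceRepository.Verify(r => r.ListInvoices(), Times.Once());
        employeeRepository.Verify(r => r.ListEmployees(), Times.Never());
    }
}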
-
Summary
So to summarize, it's important to realize in our software development that repetition is a source of errors and waste. We should avoid repetition within our code by refactoring it so that there is only one canonical representation. One of the fundamentals that I've found to be the most useful in doing this is the use of certain design patterns, such as the Template Method Pattern or the Command Pattern (you can learn more about patterns from Pluralsight's Pattern Library), as well as the Dependency Inversion Principle, which we covered already in this Principles of Object-Oriented Design course. Some books that I recommend reading for additional learning about this topic are The Pragmatic Programmer: From Journeyman to Master, as well as 97 Things Every Programmer Should Know, which has a bunch of very useful tips and stories from experienced software developers. Thank you very much. This has been part 2 of the Don't Repeat Yourself Principle, part of the Principles of Object-Oriented Design course. I hope to see you in class soon.
-
The Don't Repeat Yourself Principle, Part 3
Introduction
Hi, this is Steve Smith, and this is Part 3 of the Don't Repeat Yourself Principle. I feel like I'm repeating myself a little bit with these title slides. This is one of the software fundamentals and it's the final module in the Principles of Object-Oriented Design that's related to the Don't Repeat Yourself Principle. In Part 1, we went through the definition and a bunch of demos on refactorings, we did some analysis of our code and we cleaned it up a bit. In Part 2, we continued that process with an emphasis on testability and adding tests and using Mocks to make it so that our code was easily tested in isolation. Now in Part 3, we're going to focus here on a couple of aspects, specifically Code Generation and automation of our processes so that we can eliminate repetition in those areas as well, which will result in a more consistent build, and better quality software that we're going to be able to deliver more consistently and frequently.
-
Analysis
So, we're going to start off with one more demo showing how we can eliminate some repetition in our code and then we're going to jump right into the tools that we talked about. So this is the section that we're going to be looking at right now.
-
Repeated Execution Patterns
So when we talk about Repeated Execution Patterns in our code, what we mean is areas where we see the same kind of pattern of code being repeated multiple times. It can be within the same method, as we see here at the top; this is part of our Main method, where we first do an Extract, then a Transform, and then a Load, but each one of these has a very similar Console or Logging statement saying when it began and when it completed. Or the repetition can be across multiple methods. For instance, here we see some very standard code for opening up a SqlConnection followed by the actual command that's being applied to the SQL database, and this same pattern is repeated in multiple places in our code as well.
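As a rough sketch of the "before" state being described, with the log text and method names approximated rather than copied from the demo, the repeated pattern in Main looks something like this:

using System;

class Program
{
    static void Main()
    {
        // The same log-call-log shape, repeated once per step.
        Console.WriteLine("Beginning Extract");
        Extract();
        Console.WriteLine("Finished Extract");

        Console.WriteLine("Beginning Transform");
        Transform();
        Console.WriteLine("Finished Transform");

        Console.WriteLine("Beginning Load");
        Load();
        Console.WriteLine("Finished Load");
    }

    // Stub bodies standing in for the real ETL steps.
    static void Extract() { }
    static void Transform() { }
    static void Load() { }
}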
-
Demo: Refactor using Action of T
Let's look at how we can refactor our application to eliminate this repetition. Looking at repeated execution patterns in our application, we immediately see that the Main method is doing the same thing over and over again: it initially writes out a Log message, calls some method, and then writes out another Log message. This is repeated three times in our Main method, and if this program continued to grow, we can imagine that as we added additional methods for it to call, it would continue to grow and we would have a lot of repetition in how this pattern is followed. The first step in refactoring this would be to extract out this method. So we can use the Extract Method refactoring and generate a method that we'll just call Command for now. And so in our Command, we're going to write out our WriteLine, then call the Extract command, and then finally write out the next WriteLine. The next step is to change this so that instead of being this particular command, we'll call it something else. For instance, we could call it StepInProcess, and then the actual command that it's going to follow is a particular Action. So we'll pass that in now, since we want to be able to vary this particular line while keeping these other lines more or less the same. So in this case, we'll just invoke the Action, and then here we would have to pass in our Extract command, and that will allow us to do pretty much the same thing we were doing before, and we can test this using our manual testing process, and we see that the application continues to run as it did before. At this point, the next step would be to add support for these next two lines that are repeated as well. And so in order to do that, we'll simply duplicate this call to StepInProcess and pass in the other method names, Transform and Load. With that, we've eliminated the repetition of how the logging is done, except that now we're seeing that the hard-coded Extract here, and Extract here, in the log messages needs to change as well. So we can change this by simply putting in a variable, and for that variable we'll use the action.Method.Name instead, and in this case we're going to get a slightly different formatted result, but that's okay because this is just logging. We'll save this and run it, and you can see that it still runs as it has before, and here we can see it Finished Extract, it Finished Transform, Beginning Load. You can see those messages are still being output. So now we have a little bit of repetition here, in that we see that we're calling this StepInProcess, that StepInProcess, the next StepInProcess. It would probably be better if we could refactor that as well, just so that if we did get a much larger application where we were having to orchestrate a whole lot of different steps in a particular workflow, we could do that in a more maintainable fashion. The easiest way to do that is to take those steps and apply a loop to them. So, we're simply moving this out of the Main section and saying something like foreach over, let's call it GetSteps, and we'll say this is a step. Then we'll just say StepInProcess(step). Right, and at that point I don't really like the name StepInProcess, so let's call it ExecuteStep. And then we need GetSteps to be created for us, so we'll create this method GetSteps, and it'll return an IEnumerable of Action, which is what we want. And here we can just say that we want to yield return Extract, and then we'll do yield return Transform, and yield return Load.
So with that, we'll be doing the exact same thing as we had before, but now we've moved this data, or rather this logic, from our Main method and put it into this GetSteps method that we could push into a different application assembly if we wanted to, and it's making it so that our Main method doesn't have the responsibility of determining which steps are run or what order the steps are done in; that's now the responsibility of this method here. Let's show that that still works, and it does. You can see we still get Finished Transform, Beginning Load, Finished Load, as before. Now this yield keyword, if you're not familiar with it, is a C# keyword that allows this particular method to yield its results back to the calling method, and then as the foreach enumerates through this method, it'll continue to call into it. So it'll first call this one, and then it will call this one, and then it will call this one. Now if that syntax is strange to you, another option would be to simply do something like this: return a new List where we're going to say that this contains Extract, Transform, Load, and since List is enumerable, this would run as well. So it's just a matter of your preference whether you want to create a list to return or simply use the yield syntax. The yield syntax is a little bit smaller, so I'm going to go with that, but it really would make no difference to the application.
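Assembled from the steps above (the log format and the empty step bodies are placeholders), the refactored Main, ExecuteStep, and GetSteps might end up looking roughly like this:

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // Main no longer decides what gets logged around each step;
        // it just runs whatever GetSteps hands back, in order.
        foreach (var step in GetSteps())
        {
            ExecuteStep(step);
        }
    }

    static void ExecuteStep(Action action)
    {
        Console.WriteLine("Beginning " + action.Method.Name);
        action();
        Console.WriteLine("Finished " + action.Method.Name);
    }

    static IEnumerable<Action> GetSteps()
    {
        // The order of the workflow lives here now.
        yield return Extract;
        yield return Transform;
        yield return Load;
    }

    // Stub bodies standing in for the real ETL steps.
    static void Extract() { }
    static void Transform() { }
    static void Load() { }
}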
-
Demo: Refactor Data Access
Now let's look at the next area where there's repetition in logic in this code, and that's our data access code. You can see here where we're using a Connection and in that same method we're using a connection again. In our Transform, we're not really doing anything with the database, but then in our Load we're using a Connection as well, using another connection as well. So you can see there's a lot of places where we're doing this kind of logic, the same thing with these commands. The easiest way to refactor that is going to be to provide some kind of an abstraction over our Data Layer. A tool that I've used for many years now is the SqlHelper tool here, and you can find this from the Data Access Application Block. You can reference that DLL directly or you can take the code and drop it into your own assembly, which I've done here. And so this is all code that I didn't have to write, that I pulled directly from the Data Access Application Block SqlHelper, along with a few other Helper classes that you can see here in this Data folder. And then within our program, anywhere we want to work with the SQL Database using the SqlHelper, we can refactor this block of code here that's doing all of this work to load up an invoiceTable, and we can refactor that to use the SqlHelper with pretty much one line of code. So we know we have an invoiceTable that we want to set to be full of this data, so we'll start off with that. And we'll say this equals SqlHelper, and we'll add our reference. And this is going to use the ExecuteDataSet method. In later versions I think there might be an ExecuteDataTable, but it's easy enough to just grab the first table off the DataSet. Now we need a Connection, Command, Command Text, and some Parameters. So for our Connection, we can just pass in the ConnectionString. So since the refactoring of the connectionString is a separate exercise, we're going to continue to just use our hard-coded magic string here, but of course, you would want to change that in your actual final version of this code. And then we want to pass in the Command Type and this is simply going to be CommandType.Text as our default. And to make this a little bit easier to read we'll put these each on their own line, right. Now we need the actual Command we want to run, so I'm going to pull that out so that it's still a separate string. So we'll grab this right here, throw it up above, and reference it here as a parameter. myQuery. And then we need some parameters. I could pull the parameters out separately as well or I can just put in here new SqlParameter array that includes a new SqlParameter, and that parameter is this guy right here. And of course now we need to make that a DataTable. So, as I said, we'll just grab the first Table off the DataSet and we can get rid of that because it can infer the type. And then with that we can get rid of all of this and these, and these, and reformat. Alright, so now this is all we have to make that call instead of what we had before. We could further, of course, get rid of the duplication of the call here and move some of this into a separate class, but just by doing this initial refactoring we've eliminated all of those using blocks, and the direct access to the ADO.NET code. If we were to apply this same change here as well, this is also a DataTable, so it would be pretty trivial to make that change. In fact, let's go ahead and do it just to show you. 
We've got a string myQuery that we should probably just reuse, so we'll grab this and put it right here, and I think the only thing that's going to change is the actual query itself. We'll paste that in. Our ConnectionString stays the same, and our command in this case doesn't even have any parameters, so we can eliminate the need to pass in any parameters. And then we simply get rid of all this code here that was working on filling up our Table. Of course, that's the employeeTable this time, so let's make that change as well. And with the employeeTable filled, we can do our work on it, we can get rid of a few more angle brackets, or curly brackets, rather, reformat our code a little bit, and now our logic is much, much simpler. We can fit the entire method pretty much on one screen here, which is always nice so you can see everything that's going on. And there's no more duplication, and there's much less nesting of our curly braces with those using statements. And we know now that we're never going to forget a using statement, because SqlHelper is already vetted and trusted as a Data Access wrapper that makes sure it handles those cases. So, this is the kind of duplication that you can eliminate through the use of a third-party library, or through your own utility class that you've extracted the duplicated logic into. Even better than using SqlHelper, in this case I would recommend using some kind of an object-relational mapper, for instance, LINQ to SQL or LINQ to Entities or a third-party one, because that will even further eliminate this tight coupling to the SQL internals. So right now we still have these magic strings for our SQL queries. I would much rather see that turned into something like a LINQ expression, where I have some actual type checking of the values and I can do refactorings against those property names and have them applied throughout my C# code, without having to go in and remember to look at what these SQL magic strings might mean, for instance, the Table name or the Column name. If those change, I'm not going to have a refactoring tool in my C# code that's going to be able to detect those changes and fix them.
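Assuming the SqlHelper class from the Data Access Application Block has been pulled into the project, and that its classic ExecuteDataset overload takes a connection string, a CommandType, the command text, and a params array of SqlParameters, the refactored data access might look like this sketch (the query, connection string, and parameter name here are placeholders, not the demo's actual values):

using System;
using System.Data;
using System.Data.SqlClient;

public class InvoiceData
{
    // Hypothetical wrapper method; the demo does this inline in its Extract code.
    public DataTable LoadInvoices(DateTime runDate)
    {
        // Placeholder query; in practice the connection string would come
        // from configuration rather than being hard-coded like this.
        string myQuery = "SELECT * FROM Invoices WHERE OrderDate >= @RunDate";

        // SqlHelper.ExecuteDataset handles opening, using, and closing the
        // connection internally, so all the using blocks disappear here.
        return SqlHelper.ExecuteDataset(
            "Server=(local);Database=Northwind;Integrated Security=SSPI;",
            CommandType.Text,
            myQuery,
            new SqlParameter("@RunDate", runDate)
            ).Tables[0];
    }
}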
-
Demo: Find Duplicate Code - Atomiq
In the original version of our code, we have a lot of duplicate, copy-pasted blocks of code. In fact, there are too many of them to simply list on a PowerPoint slide, so we're going to use a commercial tool called Atomiq, which also has a good free version. Now in the interest of full disclosure, I actually own the company that produces Atomiq, so take anything I say with, you know, a grain of salt, because I'm biased, but also realize that there is a free version of this tool that you can use, and it's really simple and easy to get started with it. Let's go ahead and see how we can apply this tool to find some duplication. So when you first run Atomiq, the first thing that you're going to need to do is open up the project that you want to use, and then create a new project where you specify the folder that you want to use, the types of files you want to look at, and the similarity threshold in terms of the number of lines of code that should be similar for it to pick something up, and then you can exclude certain files where you know there's going to be a lot of repetition; for instance, designer files are often generated code, so you can ignore those. I already have a project that I'm going to open up, and this is the DRY Demo project. And so here you see it opens up now. In the case of this demo, we pretty much only have one C# file, and that's our Program.cs. When we're done, we might have multiple different files, and in any real application you would, of course, have many different files, but in this case there's just one. Now one of the cool visual pieces of Atomiq is this Wheel view, and unfortunately with only one file, it makes a pretty little diagram, but it doesn't really show you quite the same value that you would get if you had many files. The way this works is we have all the different files arranged around the outside of the ring, and then these two items here represent different files, and then each one of these is a line running from one file to another. Now in this case, you can see they all run around the outside because they're all communicating from one file to that same file, and so you see each one says same file. I'll show you a real application and what this would look like when I'm done here. Looking at our code, we can see these are all the duplicates that it found. There were 28 blocks and 88 lines of code, total, within one file, and now we can go and explore our code. So we can auto-hide this to give us a little more room, and we can see that this section of code right here, where we're saying using, myConnection, ConnectionString, myQuery, is repeated, and it's repeated multiple times, so each one of these lines is showing us a different location where that's repeated, and then down here we can see those two blocks and how similar they are. Now in this case they're identical, but Atomiq will also detect sections where the code isn't identical, but it's very, very similar. And over here we can explore and see other areas where it found duplicate logic. For instance, when we're adding our parameters, that's duplicated as well, and for each one of these we can see here what the duplication is that was found. So in this case, you can see that the use of RunDate and ShipperName was the same in both of these instances. Let's take a look at what Atomiq would show us for a real application. I've loaded up the NServiceBus application. Let's re-analyze it. I've set the similarity length to 6, which is the default for Atomiq.
We'll say OK and it'll analyze the files. NServiceBus is an open source tool you can use for adding messaging to your application, and it's a very mature application; it's been around for many years. We can see here's the view now within NServiceBus and all the different places where it found duplicate blocks of code. It's actually a very well-designed application from what I've seen, and so you'll see there are very few large sections of duplication here. Most of these only have a small amount of duplication and, for instance, in this case you can see it's where we're setting some properties within a test, etc., but it will let us see a nicer view of the Wheel. So when we open up the Wheel now, you can see here, these are the different sections, namespaces, classes, and files within the application, and each one of these lines represents a set of duplicate code between one class and another. So here you see that we've duplicated code from the NHibernate tests and the SagaPersisters in the NHibernate tests. Both of those are coming out of test projects, so we're probably not too concerned with that duplication. These types of views here represent same-file duplication, just like we saw within our original app. But overall, what you want to avoid is a lot of lines passing through the center of the Wheel. And if you have a little bit of these running around the outside, where the duplication is within the same file, especially if they're inside of a testing folder, it's not a huge concern. Within our real application, we want to try and minimize how much duplication there is, however. So that's Atomiq; again, there's a free version you can get, or I think it's like 25 or 30 dollars or something if you buy it, and it's a nice tool that you can run in order to look for duplication within your code.
-
Code Generation
When looking at duplication in our code, one of the things that we could consider is the use of code generators. Visual Studio has great support for code generation through the use of T4 templates. These custom code generation templates are used within Visual Studio and are also built into several of the products that you can use, such as Entity Framework. So if you use Entity Framework, you're already using these T4 templates, and you can customize the kind of code they produce by changing the templates. If you use an object-relational mapper tool like LINQ to SQL or Entity Framework, as well as others like NHibernate or LLBLGen, these reduce repetitive data access code, eliminate a lot of the common errors, and are also a form of code generation. In most cases, they point at your database and generate a set of Entity Classes, as well as code to manipulate those entities in a way that eliminates your need to hand-code all of that data access logic. A number of commercial tools devoted to code generation are also available, including CodeSmith, CodeBreeze, and CodeHayStack, among others, which you might check out.
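As a tiny illustration of what a T4 template looks like, here is a generic sketch of a .tt file, not one of the product templates mentioned above, mixing literal output with C# statement and expression blocks:

<#@ template language="C#" #>
<#@ output extension=".cs" #>
// <auto-generated /> regenerate by saving the .tt file; do not edit by hand.
namespace MyApp.Generated
{
<# foreach (var name in new[] { "Invoice", "Employee" }) { #>
    public partial class <#= name #>Repository
    {
        // repetitive data access members would be generated here
    }
<# } #>
}

Saving the template causes Visual Studio to regenerate the .cs file, so the repetitive output only ever has one source, the template itself.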
-
Repetition in Process
The last thing that we want to discuss here in our Don't Repeat Yourself Course, is the types of repetition that occur in your process. If you look at your testing, performing testing by hand is tedious and wasteful. If you look at your builds, performing builds by hand is tedious and wasteful. If you look at how your deployment process works, you'll find that performing deployments by hand is tedious and wasteful. In fact, you might see some repetition in this particular slide where we could just say that performing whatever by hand is tedious and wasteful. So if we have things that can easily be automated, we should do so. And in our process, the things that can be automated include testing, building, and deploying our application. Let's look at a quick demo that shows how we can automate these processes.
-
Demo: Automation with MSBuild
When it comes to automation, MSBuild is a very powerful tool for automating a lot of the tasks that we have to do when it comes to building our project, testing our project, and even deploying our project. So first of all, let's look at a simple MSBuild project file and go through some of the properties that it has. Here at the top there's a PropertyGroup where I'm defining what my output path will be and also the name of the project, because I use the same template for many different ClickToBuild projects. Then within here, I've got a target named DebugBuild, which is simply going to do a Clean and a Build on my project using its project name and the Debug configuration. One of the things that makes MSBuild very powerful as a build tool is the fact that it supports dependencies between these various tasks or targets, and so the next target is called BuildAndTest, which depends on DebugBuild, and then it's going to do an NUnit call to run my Unit Tests, and then another NUnit call to run my Integration Tests. Then I have a ReleaseBuild that depends on BuildAndTest being successful, and it just does a Clean and a Build, in Release mode now, of my project. And lastly, if I want to Deploy, I can use this Deploy target, which depends on ReleaseBuild, so it will only work if all my tests pass and my ReleaseBuild is complete. And it will copy my output to whatever path I've set for the Deploy path. So if we look at this folder, let me pull it up here, we can see that we have our Build.proj file, as well as a ClickToBuild and a Build.bat. Let me show you those real quick. ClickToBuild simply calls Build.bat with the ReleaseBuild target being passed in, and then it pauses, which is going to require me to press Enter in order for the window to disappear. And Build.bat simply calls MSBuild and passes in the Build.proj and the /t parameter, which is for specifying the target, with whatever gets passed into it. So with all that, I can now come in here to my project and I can run ClickToBuild and we can watch it execute. And see that it succeeded. So what this did is it ran our Build, with the DebugBuild doing a Clean and then a Build. Next, it ran our Unit Tests, and then it finally ran the Integration Tests and then the ReleaseBuild as well. I don't have enough room in the buffer to show you all of that; we could pipe this output to a file if we wanted to see it, but you can see that it all ran with one click. And the nice thing about this is that I can pull this down from Source Control and just run it. In this particular application I'm using a Northwind database that would need to be set up, but assuming that Northwind was set up on the localhost of whatever computer I was on, this should work. Now if you're grabbing this sample file yourself and trying to run it, the one thing you're going to have to do is set up Northwind on your database server and run the SQL script that's there as well, and then this should work for you too, as long as you have Northwind and .NET 4 installed. ClickToDeploy, if we look at what it's doing, as you might expect is simply going to call that same Build.bat file, but now it's going to pass in Deploy, and we'll see that this produces an output of our application.
So when we run ClickToDeploy, it's going to do everything the other one did, including running our Unit Tests and Integration Tests, and when it's finished you can see that we now have a Deploy folder, and in our Deploy folder we have our application, built in Release mode, that we can go ahead and run and see that it works. So that's an example of how you can automate your process of compiling and testing your application using MSBuild. The next step would be to have this happen automatically whenever you do a check-in, and that's something you can do easily with Microsoft's Team Foundation Server or with third-party tools such as JetBrains TeamCity or CruiseControl.NET.
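For reference, a trimmed-down Build.proj along the lines described above might look like the following sketch; the project name, the paths, and the NUnit console command are placeholders rather than the demo's actual values:

<Project DefaultTargets="ReleaseBuild" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ProjectName>DryDemo</ProjectName>
    <OutputPath>build\</OutputPath>
    <DeployPath>Deploy\</DeployPath>
  </PropertyGroup>

  <Target Name="DebugBuild">
    <MSBuild Projects="$(ProjectName).sln" Targets="Clean;Build" Properties="Configuration=Debug" />
  </Target>

  <Target Name="BuildAndTest" DependsOnTargets="DebugBuild">
    <!-- Placeholder NUnit console path; adjust to wherever NUnit is installed. -->
    <Exec Command="nunit-console.exe $(ProjectName).UnitTests\bin\Debug\$(ProjectName).UnitTests.dll" />
    <Exec Command="nunit-console.exe $(ProjectName).IntegrationTests\bin\Debug\$(ProjectName).IntegrationTests.dll" />
  </Target>

  <Target Name="ReleaseBuild" DependsOnTargets="BuildAndTest">
    <MSBuild Projects="$(ProjectName).sln" Targets="Clean;Build" Properties="Configuration=Release" />
  </Target>

  <Target Name="Deploy" DependsOnTargets="ReleaseBuild">
    <ItemGroup>
      <ReleaseOutput Include="$(ProjectName)\bin\Release\**\*.*" />
    </ItemGroup>
    <!-- Only reached when the builds and both test runs have succeeded. -->
    <Copy SourceFiles="@(ReleaseOutput)" DestinationFolder="$(DeployPath)" />
  </Target>
</Project>

Build.bat then just invokes msbuild Build.proj with /t set to whichever target was passed in, which is how ClickToBuild and ClickToDeploy differ only in the target name they supply.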
-
Summary
So in summary, the Don't Repeat Yourself Principle basically says that repetition breeds errors and waste. There are several different types of repetition that we should watch out for in our software development; this includes repetition in our code, as well as in our processes. In this particular module, we saw how we can use tools such as an object-relational mapper to eliminate repeated data access code. We also used another tool to locate repeated code within our application. We found that it was possible to automate our Build and Test process and, of course, you would want to use a continuous integration server to make sure that happens every time you check in your code. Here are some recommended books: The Pragmatic Programmer, as well as 97 Things Every Programmer Should Know. You can get to those using the links shown here. The tool that we used, Atomiq, you can get from the URL shown below as well. Finally, I want to note that the final code, which includes all of the refactorings, including using Entity Framework as the object-relational mapper, is available as part of Part 3's Zip file. So download those demos and check out the final, totally refactored version. And with that, thank you very much. This has been another module from Pluralsight On-Demand. My name is Steve Smith. I hope to help you out with additional courses in the future. Talk to you later.