Going Dark
15 June 2008
Jeff Atwood has some inspiring things to say today: Don’t Go Dark. Seriously. Read it.
Even if it’s just stuff he’s copied and pasted from others, it’s still something every developer should take note of.
(For those of you who are too lazy to click the link, here’s the one line summary: “programmers should never work alone for more than a few days in a row”.)
The reason it’s such an inspiring piece to read is that sitting in a dark room churning out code is probably how we feel, deep down inside, developing software should be. In an ideal world, we don’t have managers changing the specifications around every single day, we don’t have colleagues complaining about how our interfaces don’t do what they expected them to do and we do have team leaders that take a look at what we came up with after weeks of hard coding and heap lavish praise upon us for our brilliance. We just sit in a room all day and produce reams of beautiful, elegant code.
Like communism, this ideal is nonsense.
The reason our managers change the specifications every single day is that they are incapable of telling us exactly what they want in terms we can understand, and that we, in turn, have difficulty translating into lines of code the politically motivated gibberish their own management requires them to shroud their desires in. Every time we finish a part of the project, it turns out to be nothing like what they had originally intended us to build.
Our colleagues, in turn, complain about our interfaces because they are working on a completely different piece of the same project, and what makes sense for us to supply in an interface is quite possibly completely useless to them.
And the team leader? Well, he’s right. We’re not brilliant. What seems an inspired stroke of genius to us is, to him, convoluted garbage that will be impossible for someone else to maintain once we’ve moved on to a different job in another part of the country.
The idea of programming as nothing more than a simple exercise in problem solving where the cleverest bit of code wins the day is completely and utterly wrong. In the real world, it’s not cleverness that makes a project successful, it’s communication. An elegantly constructed algorithm will not help you one bit if it doesn’t help your manager solve the business problem he’s trying to solve and it won’t help your colleagues if it makes it more difficult for them to interface their code to yours. It should therefore come as no surprise that your team leader is indifferent about it. Real world projects succeed because the people working on them communicate well. What exactly is your manager trying to achieve? What are your colleagues working on? How well do you keep up with everything that’s going on?
If you’re sitting in a dark room for three weeks, you’re not communicating and the project will fail.
Writing a framework [4]: the model
21 May 2008
Last time I talked about how the Model View Controller paradigm (MVC) and object orientation (OO) were a perfect fit for writing your own web application framework. In today’s installment I will discuss the Model part of this paradigm.
Now if you’ll recall, in part two of this series I decided that a web application framework needs to create, read, update and delete (CRUD) bits of data from some sort of storage mechanism, and I wrote a module (which was, in fact, an object, though it doesn’t need to be) that exposed abstract CRUD functions for a given type of data storage. Then, in part three I said the Model part of MVC is the part that determines how data is stored, retrieved and manipulated. Strictly speaking, therefore, we already have a base class for the model part of our web application framework. Every time we need a new data type (a list of books, for example), we can simply subclass this data abstraction class as presumably that will give us all the methods we need.
However, it might be a better idea to create a separate base model class instead of simply extending the data abstraction layer every time we build a new model. There are three reasons for this. First, when thinking about the model part of our application, we really don’t want to be bothered by the fact that the basic operations (create, read, update and delete) are in fact database operations. Now of course our data abstraction class is, in fact, very abstract (as you’ll recall, it contains methods that automatically compose queries for simple use cases), so this by itself is no reason to build a separate model object. The second reason isn’t terribly convincing either. It is the often-heard statement that you should abstract out your data storage mechanism so you can swap it for a different one, should it be decided at some point that the data currently in a MySQL database (for example) will from now on be stored in a series of flat text files. I have been developing web applications for a little over seven years now and never once have I run into such a situation. Should you ever come across it, however, I can assure you that rewriting a bunch of models to deal with a different data storage mechanism will be the least of your worries as you’re trying to coerce a database dump file into a set of flat files on the server.
There is, however, a third reason why a separate model class is preferable, and this one actually makes sense: a model in an MVC web application might need to do other things besides interact with the data storage mechanism. The prime example of this is logging: you don’t want to provide all sorts of fancy logging facilities in your data storage abstraction class, but you probably do want it in the model of a web application.
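As a rough illustration of that third reason, here is a minimal sketch of a base model that delegates the four CRUD operations to a storage object and adds logging on top. Everything in it is an assumption made for the example: the names `MemoryStore`, `Model` and `Books`, the method signatures, and the in-memory store that stands in for a real data abstraction class.

```python
import logging


def _matches(row, where):
    # A row matches when every filter field has the requested value.
    return all(row.get(key) == value for key, value in (where or {}).items())


class MemoryStore:
    """Hypothetical stand-in for the data abstraction layer."""

    def __init__(self):
        self.tables = {}

    def create(self, table, values):
        self.tables.setdefault(table, []).append(dict(values))

    def read(self, table, where=None):
        return [dict(r) for r in self.tables.get(table, []) if _matches(r, where)]

    def update(self, table, where, values):
        rows = [r for r in self.tables.get(table, []) if _matches(r, where)]
        for row in rows:
            row.update(values)
        return len(rows)

    def delete(self, table, where):
        rows = self.tables.get(table, [])
        kept = [r for r in rows if not _matches(r, where)]
        self.tables[table] = kept
        return len(rows) - len(kept)


class Model:
    """Base model: CRUD is delegated to the store, logging lives here."""

    table = None  # subclasses name the table they work on

    def __init__(self, store, logger=None):
        self.store = store
        self.log = logger or logging.getLogger(type(self).__name__)

    def create(self, values):
        # Only field names are logged, so values such as passwords stay out.
        self.log.info("create in %s: fields %s", self.table, sorted(values))
        return self.store.create(self.table, values)

    def read(self, where=None):
        self.log.debug("read from %s where %s", self.table, where)
        return self.store.read(self.table, where)

    def update(self, where, values):
        self.log.info("update %s: fields %s", self.table, sorted(values))
        return self.store.update(self.table, where, values)

    def delete(self, where):
        self.log.warning("delete from %s where %s", self.table, where)
        return self.store.delete(self.table, where)


class Books(Model):
    table = "books"
```

With this split, `Books(MemoryStore())` gets logging for free, while the storage class itself stays free of anything that is not storage.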
If logging isn’t enough reason for you to create a separate model class, by all means don’t. Ultimately, it isn’t very important. What is important is that you have a base class for your own models that provides create, read, update and delete methods. With that class we can start building controllers, which is what we’ll do in the next installment.
Writing a framework [3]: The Model View Controller paradigm and object orientation
18 May 2008
In the last installment of this series I talked about building a module for abstracting the data storage of your web application. In that installment, I purposely avoided mentioning explicitly whether this data abstraction layer was to be a collection of functions or a proper object oriented (hereafter: OO) class. The reason was that for a data abstraction layer it doesn’t really matter if you’re using OO or something else. After all, all it needs to do is provide some functions for talking to some data storage mechanism or other. In today’s installment, however, I’m going to be more explicit: we’re going to leave the old school Unix (“objects? we don’t need no stinking objects”) and beginner PHP programmer (“what’s a class?”) mindsets behind and dive into one of the finest (and oldest: it was originally invented for Smalltalk) design patterns the object oriented world has to offer: the Model-View-Controller paradigm.
In essence, the Model-View-Controller paradigm is, like most useful software architecture concepts, nothing more than good old common sense. It dictates that you separate out the storage of data (the model) from the presentation of data (the view). The part of your program that connects the presentation to the data is called the controller. Hence: Model, View, Controller (MVC for short).
MVC is a paradigm that is inordinately well suited for developing web applications. Think about it: you’re fetching and storing data from a database or some flat text files (the model) and displaying the results on a web page (the view). The part of your program that sends the data to the web page is the controller. Even if you don’t know you’re doing it, you’re automatically using MVC whenever you’re developing a database driven web site.
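To make the three roles concrete, here is a minimal sketch in plain Python. All names (`BOOKS`, `books_by_author`, `render_book_list`, `list_books`) and the sample records are made up for illustration; a list of dictionaries stands in for the database and a plain dictionary stands in for the incoming web request.

```python
from html import escape

# Model: owns the data and knows how to fetch it.
BOOKS = [
    {"title": "First Book", "author": "A. Author"},
    {"title": "Second Book", "author": "B. Writer"},
]


def books_by_author(author):
    return [book for book in BOOKS if book["author"] == author]


# View: turns data into markup and knows nothing about where it came from.
def render_book_list(books):
    items = "".join(f"<li>{escape(book['title'])}</li>" for book in books)
    return f"<ul>{items}</ul>"


# Controller: reads the request, asks the model, hands the result to the view.
def list_books(request):
    author = request.get("author", "")
    return render_book_list(books_by_author(author))
```

Calling `list_books({"author": "A. Author"})` returns the markup for a one-item list. Swapping the list of dictionaries for a real database would only touch the model function.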
The simple fact that MVC is a natural way to think about web application architecture and that MVC is an inherently object oriented paradigm is not the only reason to use OO for the framework you’re building, however. Another, more important reason is that OO jargon is perfect for describing the mechanism of using abstract functions as a base upon which you build more specialized functionality. The rather convoluted statement “the function ‘retrieve_list_of_books’ uses functions from the ‘retrieve stuff from the database’ file to build a list of hashes that contain information about specific books” in OO speak becomes the much simpler “the ‘books’ class extends the ‘data storage class’”. Since simple, clear speech is an expression of simple, clear thought (when it comes to technical matters, of course; in politics it’s rather different), you will want to build your web application framework as a small set of objects and classes. This way the underlying mechanism of your framework becomes a limited number of simple concepts that are easy to keep in the back of your head while you’re working on the complexities of actually building a specific web application using your framework.
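In code, “the ‘books’ class extends the ‘data storage class’” could look something like the sketch below. The class and method names are hypothetical, and the base class keeps its records in an in-memory list purely so the example runs without a database.

```python
class DataStorage:
    """Hypothetical generic storage class; a list stands in for a table."""

    def __init__(self):
        self.rows = []

    def create(self, values):
        self.rows.append(dict(values))

    def read(self, where=None):
        where = where or {}
        return [
            dict(row)
            for row in self.rows
            if all(row.get(key) == value for key, value in where.items())
        ]


class Books(DataStorage):
    """Specialized functionality built on the generic methods."""

    def by_author(self, author):
        return self.read({"author": author})
```

`Books` inherits `create` and `read` unchanged and only adds what is specific to books, which is exactly what the one-line OO description says.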
Which is the reason why, from now on, this series will go OO. There is no reason why the data abstraction layer from the last installment can’t remain a bunch of functions (if that is the way you built it) but if you built it in that way and you don’t actually have any experience working in an OO fashion, now would be the time to rewrite it as a proper object so you will be familiar with the OO concepts that will be thrown about in the next installment of this series, where we’ll decide whether or not we want to build a Model class.
Writing a framework [2]: abstracting out data storage functions
14 May 2008
The first thing to do when building a framework for web applications that fits your own method of working in a way no pre-made web application framework can is to abstract out functions for working with your data storage mechanism. If you don’t need to store data, you can of course skip this step, but in that case your application is probably simple enough that it doesn’t need a framework at all. However, if you do need to store and retrieve data from the server, here’s something to think about:
Are you going to be using a database?
There are two traditional methods for storing data on the server. One is to maintain a bunch of text files (the blogging tool Movable Type does this, for example), the other is to use a database (the blogging tool WordPress does this, for example, as do most other web applications). Recently some other storage options have become available, such as Amazon’s S3 storage service. Of course, if your application is a mash-up of someone else’s data, your “storage” will consist of the web or REST interfaces offered by the providers of that data. Only in that last case do you have no choice in the data storage mechanism you will be using. Assuming your application is not one of those special cases, however, I would suggest you simply use a database. The reason is that databases are mature and well-proven tools for storing and retrieving data, and it will be difficult to match their versatility and performance in, for example, a flat file storage mechanism you design yourself. The specific database engine you choose, however, is up to you: it doesn’t really make a difference (other than the fact that some cost money and others don’t). Unless you’re dealing with sensitive and/or financial data, in which case you will probably not get to determine which database you’ll be using yourself, I’d say just pick one. MySQL and PostgreSQL tend to be popular choices. For really lightweight applications, SQLite might also be an option.
Hide the fact that you’re using a database
Once you’ve chosen a database (or some other storage mechanism), it’s time to think about which operations you wish to perform on your data. Do this thinking in the most general way possible: if you’re deciding you need to perform INSERT queries you should abstract that into “you need to store bits of data”. The reason is that when you’re ready to design the parts of your application that your users will work with, you really don’t want to keep thinking about INSERT queries. What you want is to “store some data”.
Common operations to perform on data are creating bits of data, reading bits of data, updating bits of data and deleting bits of data. A web application that needs to do any more than that is a rare piece of software indeed and unless your application is really, really special (and it probably isn’t), create, read, update, delete (commonly abbreviated to CRUD) is all you need to worry about.
Once you’ve determined that you need to be able to do CRUD, write a module that exposes only those four features. This will be quite a bit of work, as you’ll need to abstract out a lot of the underlying data storage mechanism (connecting to a database server, composing queries, running queries, that sort of thing). From now on, those four features are the only things you’ll ever need to think about when writing code that interacts with your data.
Keep in mind that for reads, updates and deletes you will need to be able to filter your data set. For example, when updating it will be necessary to specify which conditions data items need to match before they’re updated. In practice this means you are going to have to expose more of your data storage mechanism than might seem sensible at first glance. My own data storage abstraction module (which I will not publish here in full as a tiny bit of it was developed on company time), for example, does in fact provide a method for supplying raw SQL code. It does not, however, require me to supply SQL, so for something simple like updating the password of a specific user I can simply do:
Data.update({'name': 'some_name'}, {'password':'some_password'})
In other words, update the “password” field to “some_password” for items where the “name” field has the value “some_name”.
In fact, the data abstraction module I’ve built also requires you to supply the table name (as I feel thinking about stored data in terms of tables and records is, in fact, quite a sensible way to think about such things), meaning the actual code(*) reads something like this:
Data.update('users', {'name': 'some_name'}, {'password':'some_password'})
What this should show you is that your own data storage module should strive to expose nothing but the simple CRUD features and that it should only expose more if not doing so would make your module unusable (as a database abstraction class without a notion of “tables” would be). As far as allowing raw SQL is concerned, I tend to think database abstraction classes that don’t allow this become inordinately complicated as they need to implement their own version of common SQL features such as LIKE clauses. Simply allowing raw SQL when it’s necessary (but not requiring it for the simple operations) is probably best(**).
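A minimal sketch of a module along these lines is shown below. It is not the module described above (that one is written in Perl and is not published); it is an illustration that assumes SQLite through Python’s standard `sqlite3` module, a hypothetical class called `Data`, equality-only filters passed as dictionaries, and a `query` method as the raw SQL escape hatch.

```python
import sqlite3


class Data:
    """Hypothetical CRUD wrapper: four operations plus a raw SQL escape hatch."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.row_factory = sqlite3.Row  # lets rows be converted to dicts

    @staticmethod
    def _name(identifier):
        # Table and column names cannot be bound as parameters, so only
        # plain identifiers are allowed into the SQL text.
        if not identifier.isidentifier():
            raise ValueError(f"unsafe identifier: {identifier!r}")
        return identifier

    def _where(self, where):
        # {'name': 'x'} becomes (" WHERE name = ?", ['x']).
        if not where:
            return "", []
        clause = " AND ".join(f"{self._name(key)} = ?" for key in where)
        return " WHERE " + clause, list(where.values())

    def create(self, table, values):
        columns = ", ".join(self._name(key) for key in values)
        marks = ", ".join("?" for _ in values)
        sql = f"INSERT INTO {self._name(table)} ({columns}) VALUES ({marks})"
        with self.conn:  # commits on success, rolls back on error
            return self.conn.execute(sql, list(values.values())).lastrowid

    def read(self, table, where=None):
        clause, params = self._where(where)
        sql = f"SELECT * FROM {self._name(table)}{clause}"
        return [dict(row) for row in self.conn.execute(sql, params)]

    def update(self, table, where, values):
        assignments = ", ".join(f"{self._name(key)} = ?" for key in values)
        clause, params = self._where(where)
        sql = f"UPDATE {self._name(table)} SET {assignments}{clause}"
        with self.conn:
            return self.conn.execute(sql, list(values.values()) + params).rowcount

    def delete(self, table, where):
        clause, params = self._where(where)
        sql = f"DELETE FROM {self._name(table)}{clause}"
        with self.conn:
            return self.conn.execute(sql, params).rowcount

    def query(self, sql, params=()):
        # Raw SQL for what the four methods cannot express, such as LIKE.
        with self.conn:
            return [dict(row) for row in self.conn.execute(sql, params)]
```

With `data = Data()` and a `users` table in place, the password example becomes `data.update('users', {'name': 'some_name'}, {'password': 'some_password'})`, and a LIKE search goes through `data.query(...)` with bound parameters. Values are always passed as parameters, never pasted into the SQL string.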
That’s it for now. Next time I will use the data storage abstraction to build objects for specific types of data and construct the beginnings of a simple Model View Controller framework.
* The real code is written in Perl, as that is what my company uses. For the examples in this series, however, I will be using Python for reasons of clarity: Python code often reads like pseudocode that even programmers who don’t use Python can understand reasonably well.
** In fact, my own database abstraction module also provides a function for executing complete queries. This violates the idea that such an abstraction needs to hide the fact that you’re working with a database, but it does allow you to make use of its features should you really need to.
(see also part one and part three of this series)
Writing a framework
10 May 2008
Unless you’ve been living under a rock for the past five years, you’ll be aware that frameworks (large collections of modules and libraries that abstract away the more mundane aspects of programming) are what all the cool developers use for creating web applications. You’ll probably also know that the most popular of these frameworks, in the sense that it’s the one that gets talked about the most, is called Ruby on Rails, and if you are, like me, somewhat of a web development junkie, you’ll probably also have tried building something small with Ruby on Rails to find out what all the excitement is about.
Unfortunately, these pre-made web programming frameworks tend to be highly complex and difficult to learn. For any but the simplest of use cases you will need an intimate and in-depth knowledge of how the framework actually works, and to obtain that knowledge you will have to digest vast amounts of documentation (if such documentation exists at all, which isn’t always the case). This will take an investment of time that is hard to justify unless you happen to be one of those rare people who actually make a living building one web application from scratch after another. From an economic point of view, it may, in fact, make more sense to simply copy over some code from a previous project every time you start a new one. The amount of time you waste doing this pales in comparison to the time it would take you to learn the intricacies of Ruby on Rails, Django or Catalyst that you need to know in order to be productive.
Being a developer, however, you detest (and if you don’t, you really should) copying and pasting code from one project into another. Code that is worth copying and pasting is code worth abstracting out into a library, after all. So the thing to do the next time you’re about to begin building a new web application is to spend some time designing components that can be reused easily in case you need to build yet another application from scratch at some point in the future.
In effect, what you’re doing is creating your own framework from scratch. This may seem like a bad idea because, after all, there are so many well designed (and tested!) frameworks already out there and it would seem wasteful to duplicate those efforts, but it really isn’t: a framework you design yourself will only do what you need, it will do it in a way that makes sense for you (one need only look at the terrible complications Ruby on Rails imposes on you when you’re trying to fetch data from two different tables into one model object to see why that is a bonus) and, best of all, you will become intimate with it with no effort at all while you’re busy implementing it.
Over the course of the next several days, weeks or months (depending on my desire to write) I will be talking about the steps I myself have taken to construct such a framework for a web application I needed to build. In no way am I going to suggest that those steps are steps that you should take as well. However, reading about how someone else tackled a specific problem may give you some ideas about how you yourself could go about building something that works for you. Stay tuned!
(see also part two and part three of this series)