"In an ideal object world, there would be no need for databases. Objects would serialize and deserialize itself." – David West [Object Thinking].
Introduction
When you design an application, you should think as if you were an object, not as if you were a computer. An object should describe itself and have control over itself; no other object should control it. An object should represent a "real-world" object, such as a person, a cat or a book. While we use an object it resides in the computer’s memory. When we no longer use it, we usually need to persist it to some kind of data source, most likely a database, so that it does not take up memory. That is not the only reason: if the computer suddenly loses power, all objects in memory are lost.
Almost every application needs some way to access persistent data. Most enterprise applications need to, or choose to, work against an RDBMS (Relational Database Management System) such as Microsoft SQL Server, Oracle or MySQL. Some applications use an OODBMS (Object Oriented Database Management System) instead. The idea of an OODBMS is to store an object as it is, but the market has not yet accepted OODBMSs as an alternative to RDBMSs. In some OO (Object Oriented) applications an OODBMS could be very useful, but before you choose one you must be ready for it: it’s a big step to go from relational database thinking to object oriented database thinking. You must also learn a new technology, and it can take time before you have enough skill to use it in the right way. Let’s leave the OODBMS behind and instead focus on how we can use an RDBMS in an OO application (I will use the word database instead of RDBMS in the rest of the article).
Introduction to O/R-Mapping
In most applications it’s natural to work directly with the relational model, using SQL statements. This is what Martin Fowler calls a Transaction Script [PoEAA]: "Organizes business logic by procedures where each procedure handles a single request from the presentation". Most of you have probably used ADO.NET to fill a DataSet with data from a database. Filling a DataSet from the tables of a data source is what I will call a Table Driven Design (I first heard this term from my brother Johan Normén). A DataSet is a collection of tables whose rows and columns reflect one or more database tables. It is what Martin Fowler calls a Record Set [PoEAA]: "An in-memory representation of tabular data". DataSets are known to be clumsy because of their collections and hierarchy of objects, and in most situations they are heavy because of the schema information they hold. With ADO.NET 2.0 it’s now possible to create a "lighter" DataSet. In an OO application you would in most cases use domain objects that represent objects from the "real world". A DataSet doesn’t represent a "real-world" object. You can of course use a typed DataSet to make it easier to navigate and get data based on the names of tables and columns, which makes it look more like a "real-world" object, but it’s not even close to one: it’s still a representation of one or more database tables.
Using a database from OO application code can be tricky. You have to create one or more tables with fields that reflect a domain object. In an OO application built with Domain Driven Design, you can use different patterns to fill a domain object with data from a database. One is Active Record [PoEAA], where every mapped row of a database table represents one instance of an object. Another is the Data Mapper pattern [PoEAA], which is used to "move data between objects and a database while keeping them independent of each other and the mapper itself". With a Data Mapper you can map one or more tables to a domain object; this can be done with an Object Relational Mapper (O/R-Mapper). The domain object focuses on data representation and domain logic, while the data mapper is concerned with persistence (transparent persistence).
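To make the Data Mapper idea a little more concrete, here is a minimal hand-written sketch. It is not how NHibernate or any other product works internally; the Person class, the FakePersonTable class (an in-memory stand-in for a database table) and the column names first_name and last_name are all invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Domain object: knows nothing about tables, rows or SQL.
public class Person
{
    public int Id { get; set; }
    public string FullName { get; set; }
}

// Hypothetical stand-in for a database table: primary key -> (column name -> value).
public class FakePersonTable
{
    public Dictionary<int, Dictionary<string, object>> Rows =
        new Dictionary<int, Dictionary<string, object>>();
}

// The mapper is the only class that knows both the object and the table layout.
public class PersonMapper
{
    private readonly FakePersonTable table;

    public PersonMapper(FakePersonTable table) { this.table = table; }

    // Object -> row: one property is split into two columns.
    public void Save(Person person)
    {
        string[] parts = person.FullName.Split(new[] { ' ' }, 2);
        var row = new Dictionary<string, object>();
        row["first_name"] = parts[0];
        row["last_name"] = parts.Length > 1 ? parts[1] : "";
        table.Rows[person.Id] = row;
    }

    // Row -> object: returns null when no row has the given key.
    public Person Find(int id)
    {
        Dictionary<string, object> row;
        if (!table.Rows.TryGetValue(id, out row)) return null;
        string fullName = ((string)row["first_name"] + " " + (string)row["last_name"]).Trim();
        return new Person { Id = id, FullName = fullName };
    }
}
```

The point of the sketch is that Person and the table can change independently: if the table layout changes, only PersonMapper has to follow.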
There are several products that implement the Data Mapper pattern, such as NHibernate (a port of the excellent Java Hibernate relational persistence tool), NPersist and WilsonORMapper. Microsoft planned to ship a data mapper tool called ObjectSpaces with .NET 2.0, but it will not be part of the framework; it will probably be released after .NET 2.0, together with WinFS (a new storage system built on top of NTFS). With NHibernate, for example, the mapping details can be kept either inside the assembly (as embedded resources) or in separate metadata files such as XML documents.
Data mappers usually use the Unit of Work pattern [PoEAA], which "Maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems", to keep track of the objects whose state has been changed by business operations in the current transaction. An object that has been changed during a transaction is said to be dirty. The unit of work can make sure that only changed data is updated, which makes commits highly efficient. NHibernate uses its Session object as a unit of work. The Session also serves as an Identity Map [PoEAA]: "Ensures that each object gets loaded only once by keeping every loaded object in a map". With an identity map you can get an object by its unique identifier. The unique id of a domain object is in most cases mapped to the primary key of the table that holds the data for the object; the object uses the Identity Field pattern [PoEAA]: "Saves a database ID field in an object to maintain identity between an in-memory object and a database row". In Domain Driven Design an object with a unique identifier is called an Entity.
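The Identity Map described above can be sketched in a few lines. This is only an illustration of the pattern, not the NHibernate Session; the UserIdentityMap class, its DatabaseHits counter and the loader delegate that stands in for a real database query are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public int Id { get; set; }
    public string UserName { get; set; }
}

// Identity Map: hands out one in-memory instance per identifier.
public class UserIdentityMap
{
    private readonly Dictionary<int, User> loaded = new Dictionary<int, User>();
    private readonly Func<int, User> loadFromDatabase;

    // Counts how often the (hypothetical) database loader was really called.
    public int DatabaseHits { get; private set; }

    public UserIdentityMap(Func<int, User> loadFromDatabase)
    {
        this.loadFromDatabase = loadFromDatabase;
    }

    public User Get(int id)
    {
        User user;
        if (loaded.TryGetValue(id, out user)) return user;   // already in the map
        user = loadFromDatabase(id);                          // first request only
        DatabaseHits++;
        loaded[id] = user;
        return user;
    }
}
```

Asking the map twice for the same id returns the very same instance, so two parts of the application can never hold two diverging copies of one database row.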
Unit of Work and Transactions
Sophisticated data mappers such as NHibernate also support transactions. A unit of work behaves in a similar way to a transaction, but it is not a substitute for one. When you create new objects and update existing ones, you add them to the unit of work. The inserts and updates are not executed against the database until you commit the work. If one of the operations fails during the commit, the ones that have already been executed remain, and you can end up with inconsistent data in the database. To make it possible to roll back the work and to protect against inconsistent data, you have to use a transaction. A unit of work should therefore be able to use a transaction so that everything is rolled back automatically if one operation fails. NHibernate’s Session class supports transactions.
The following example uses NHibernate to create a unit of work (Session) that uses a transaction. It adds two users to the Session and commits the inserts and the transaction:
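The relationship between a unit of work and a transaction can be shown with a small hand-rolled sketch. Everything in it is hypothetical: FakeDatabase is an in-memory list that can undo uncommitted writes, a null row simulates a failing insert, and the UnitOfWork class only handles inserts.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory "database" that can undo uncommitted writes.
public class FakeDatabase
{
    public List<string> Rows = new List<string>();
    private int rowsAtBegin;

    public void BeginTransaction() { rowsAtBegin = Rows.Count; }
    public void Rollback() { Rows.RemoveRange(rowsAtBegin, Rows.Count - rowsAtBegin); }

    public void Insert(string row)
    {
        if (row == null) throw new ArgumentNullException("row");  // simulated failure
        Rows.Add(row);
    }
}

// Unit of work: collects new objects and writes them only on Commit.
public class UnitOfWork
{
    private readonly FakeDatabase database;
    private readonly List<string> pendingInserts = new List<string>();

    public UnitOfWork(FakeDatabase database) { this.database = database; }

    public void RegisterNew(string row) { pendingInserts.Add(row); }

    // Nothing reaches the database before this call; a failure undoes every insert.
    public void Commit()
    {
        database.BeginTransaction();
        try
        {
            foreach (string row in pendingInserts) database.Insert(row);
            pendingInserts.Clear();
        }
        catch
        {
            database.Rollback();
            throw;
        }
    }
}
```

Without the BeginTransaction and Rollback calls, a failure on the second insert would leave the first one in the database, which is exactly the inconsistency a transaction protects against.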
using NHibernate;
using NHibernate.Cfg;
// Read the mapping files embedded in MyAssembly and build the session factory
Configuration cfg = new Configuration();
cfg.AddAssembly("MyAssembly");
ISessionFactory factory = cfg.BuildSessionFactory();
// The Session is the unit of work; the transaction makes the commit all-or-nothing
ISession session = factory.OpenSession();
ITransaction transaction = session.BeginTransaction();
try
{
    User newUser = new User();
    newUser.Id = 10;
    newUser.UserName = "Joseph Cool";
    User newUser2 = new User();
    newUser2.Id = 12;
    newUser2.UserName = "John Cool";
    // Tell NHibernate that both users should be saved
    session.Save(newUser);
    session.Save(newUser2);
    // Commit all of the changes to the DB
    transaction.Commit();
}
catch
{
    // Undo any insert that was already executed, then report the error
    transaction.Rollback();
    throw;
}
finally
{
    session.Close();
}
The code above uses the Configuration class to initialize NHibernate and to locate the XML files with the mapping details, which are added as embedded resources to the MyAssembly assembly. The ISessionFactory is used to create a Session.
Is Dirty
When an object’s state has changed, the object is dirty. If only one field of an object has changed, it can be advisable to update only that field in the database. Data mappers handle dirty objects in different ways. Hibernate, for example, uses a snapshot before an update to see if an object is dirty and which fields should be updated. Other data mappers require objects to implement an interface, which makes the object responsible for reporting whether it is dirty. Data mappers that support snapshots do not require the author of an object to implement IsDirty functionality. In most cases only the data layer needs to know whether an object is dirty, so a data mapper that handles this automatically can be advisable.
It’s important to know that data mappers that use snapshots can affect the performance of the application. Before an update is executed, the data mapper may make an extra roundtrip to the database and execute a select query to collect the original values of the object that is going to be updated. It then compares the returned values with the object to decide whether the object is dirty and which fields need to be updated. If it’s up to the developer to implement the IsDirty functionality in each object, the extra roundtrips are not needed, but the developers have more work to do. Another way to detect dirty objects automatically is to clone the requested object when it is loaded and put the clone in a cache; when the object is to be updated, the data mapper locates the original in the cache and compares it with the object that is going to be updated to see if anything has changed.
A sophisticated data mapper could use both an interface that lets developers implement "is dirty" methods and snapshots. During an update the data mapper checks whether the object implements the "is dirty" interface; if it does not, the data mapper falls back to a snapshot comparison to decide whether the object is dirty.
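The combined approach, where the interface is consulted first and a snapshot is the fallback, could look roughly like the sketch below. The IDirtyAware interface, the Product classes and the DirtyChecker class are made up for this example and do not come from Hibernate or any other mapper; the snapshot here is the cached in-memory copy variant, not the extra select query.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: objects that implement it report their own state.
public interface IDirtyAware
{
    bool IsDirty { get; }
}

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// A product whose author chose to track changes by hand.
public class SelfTrackingProduct : Product, IDirtyAware
{
    public bool IsDirty { get; set; }
}

public class DirtyChecker
{
    // Snapshots taken when an object was loaded: id -> (field name -> value).
    private readonly Dictionary<int, Dictionary<string, object>> snapshots =
        new Dictionary<int, Dictionary<string, object>>();

    private static Dictionary<string, object> Capture(Product p)
    {
        return new Dictionary<string, object> { { "Name", p.Name }, { "Price", p.Price } };
    }

    public void TakeSnapshot(Product p) { snapshots[p.Id] = Capture(p); }

    // Returns the names of the fields whose values differ from the snapshot.
    public List<string> ChangedFields(Product p)
    {
        var changed = new List<string>();
        Dictionary<string, object> original = snapshots[p.Id];
        foreach (KeyValuePair<string, object> field in Capture(p))
        {
            if (!object.Equals(original[field.Key], field.Value)) changed.Add(field.Key);
        }
        return changed;
    }

    // Ask the object first; fall back to the snapshot when it cannot answer.
    public bool IsDirty(Product p)
    {
        IDirtyAware aware = p as IDirtyAware;
        if (aware != null) return aware.IsDirty;
        return ChangedFields(p).Count > 0;
    }
}
```

ChangedFields is what would let a mapper update only the columns that actually changed, while the IDirtyAware path avoids the comparison altogether.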
Lazy Load
By default, a data mapper fills the whole requested object with its data. If the object has a hierarchy of other objects (called an Aggregate in Domain Driven Design), it may fill those objects as well. In situations where you only want to display some of the object’s properties, it can be advisable not to fill the whole object, in order to save resources. Instead, value objects or Aggregates can be filled when you first request them. To do this you can use Lazy Load [PoEAA]: "An object that doesn't contain all of the data you need but knows how to get it". Most popular data mappers support lazy loading. An object that uses Lazy Load gets a dependency on the data mapper. To solve this, some data mappers use the Proxy pattern [GoF], "Provide a surrogate or placeholder for another object to control access to it", and others use interfaces to remove most of the dependency. Lazy Load with Aspect Oriented Programming (AOP) could be another way to remove the dependency from the object.
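A Lazy Load built on a proxy might be sketched as follows. The Customer and CustomerProxy classes, the LoadCount counter and the loader delegate are assumptions for this illustration; real data mappers typically generate such proxies at runtime instead of having you write them.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string Name { get; set; }

    // Virtual so that a proxy subclass can intercept the first access.
    public virtual IList<string> Orders { get; set; }
}

// Proxy: looks like a Customer, but fetches Orders only when first asked for them.
public class CustomerProxy : Customer
{
    private readonly Func<IList<string>> loadOrders;   // hypothetical loader supplied by the mapper
    private IList<string> orders;

    // Counts how often the loader has run, to show that loading happens once.
    public int LoadCount { get; private set; }

    public CustomerProxy(Func<IList<string>> loadOrders) { this.loadOrders = loadOrders; }

    public override IList<string> Orders
    {
        get
        {
            if (orders == null)
            {
                orders = loadOrders();
                LoadCount++;
            }
            return orders;
        }
        set { orders = value; }
    }
}
```

The Customer class itself has no reference to a mapper; only the proxy does, which is how the Proxy pattern keeps the dependency out of the domain object.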
Caching
In a stateless and heavily loaded environment such as a web application, it can be advisable to cache objects efficiently. There are different ways to cache objects, for example in the application scope or session scope of a web application, or by using a data mapper that can be combined with a caching product. Some data mappers, such as Hibernate for Java, support several cache products. Hibernate also uses its Session class to temporarily cache objects until the unit of work is committed.
Query Language
There are probably situations where you want to get an object based on the values of its properties. In most cases the id of an object will not be enough to locate it. For example, you may want to get a Customer object that has a specific name and title.
Most data mappers support queries that locate objects based on the values of their properties. NHibernate uses a query language called HQL (Hibernate Query Language) to get objects. HQL is very much like SQL and supports joins, among other things. This gives Hibernate a great benefit over several other data mappers, which do not have a query language as developed as HQL.
Data Access Helper and Data mapping
O/R mapping is not so popular in the .NET world yet, but I definitely think it will be very popular in the future. The interest in Domain Driven Design among .NET developers has increased lately, and so has the use of data mappers. A data mapper is a suitable tool for working with objects and persisting them into a relational database. There are some cases where a data mapper is not suitable, for example when you want to execute batch queries that update several rows, or other kinds of operations against the data source that the data mapper cannot handle for you. In those situations you have to use another data access solution, such as the Microsoft DAAB (Data Access Application Block). Different databases have different dialects of the SQL language; in most cases they use the same statements, but not in all. NHibernate can work against different databases and uses the Plugin pattern [PoEAA], "Links classes during configuration rather than compilation", to make it possible to replace the data access provider and dialect. If you have to use a data access helper together with a data mapper in an environment with different databases (or where one database may be replaced with another), make sure that the data access helper can also work against different databases. The best solution would be a data mapper that uses the Plugin pattern and also provides a data access helper class, so that the data access helper uses the same database provider as the data mapper.
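As a rough illustration of the Plugin idea, the sketch below picks a SQL dialect from a configuration value instead of hard-coding it. The ISqlDialect interface, the two dialect classes and the DialectFactory are hypothetical and much simpler than NHibernate's own dialect classes. A fuller Plugin implementation would usually read a class name from a configuration file and create the instance with reflection; a fixed registry is used here only to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// What the mapper needs from a database-specific dialect.
public interface ISqlDialect
{
    string LimitRows(string select, int count);
}

public class SqlServerDialect : ISqlDialect
{
    public string LimitRows(string select, int count)
    {
        // SQL Server puts TOP right after the SELECT keyword.
        return "SELECT TOP " + count + select.Substring("SELECT".Length);
    }
}

public class MySqlDialect : ISqlDialect
{
    public string LimitRows(string select, int count)
    {
        // MySQL appends a LIMIT clause at the end.
        return select + " LIMIT " + count;
    }
}

// Plugin factory: the concrete class is picked by a configuration value, not at compile time.
public static class DialectFactory
{
    private static readonly Dictionary<string, Func<ISqlDialect>> registered =
        new Dictionary<string, Func<ISqlDialect>>
        {
            { "sqlserver", () => new SqlServerDialect() },
            { "mysql", () => new MySqlDialect() }
        };

    public static ISqlDialect Create(string configuredName)
    {
        Func<ISqlDialect> create;
        if (!registered.TryGetValue(configuredName, out create))
            throw new ArgumentException("Unknown dialect: " + configuredName);
        return create();
    }
}
```

If both the data mapper and the data access helper ask the same factory for their dialect, switching databases becomes a change of one configuration value.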
Summary
In this article I have given you an introduction to O/R mappers and some of the patterns a data mapper uses, such as Unit of Work, Identity Map and Plugin. I hope it has given you some valuable information to keep in mind when you decide how to persist objects.
References
[PoEAA] Patterns of Enterprise Application Architecture – Martin Fowler, ISBN 0-321-12742-0
[Object Thinking] Object Thinking – David West, ISBN 0-7356-1965-4
[GoF] Design Patterns – Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides (Gang of Four), ISBN 0-201-63361-2