Jul

2

Unit testing using in-memory databases

Posted by Keith McMillan

July 2, 2009 | 6 Comments

Unit testing software tends to break down at the boundaries of a system.  It’s difficult to test the graphical user interfaces (although with some of the new technologies like Selenium, testing browser-based GUIs is easier…) and the database access. I’ll save the UI for another time, today’s topic is the database access.

[Update – Read Some unfortunate behavior in DBUnit for important gotchas in this approach]

What’s so difficult about database testing?  It mostly comes down to the question of what data can you expect to be in the database at a given time. Because databases are designed to keep your data safe, running a test typically involves reading and writing data, then checking the results.  Because it’s in a database, this means that the order of your tests can affect what data is present when your test runs.  Putting that another way, if you run test A, which inserts data, before you run test B, which reads it, then test B is going to see different data than if you didn’t run test A. In short, your tests can become dependent on one another.

It would be very convenient to find a way to re-initialize a database to a known state before every test, and to do that with an in-memory database so we can just throw it away when we’re done with the testing pass.  Last week at my current client’s site, we came up with a way to do just that.

I’d found unitils last year, and begun fiddling around with it.  I’ve been using it primarily for it’s extensions to DBUnit, the database testing framework.  With unitils, I can have per-test-class or even per-method data sets loaded into the database, and get convenient access to them using the @DataSet annotation that Unitils provides.  This means I don’t have to write a test that simply checks to see if there’s more than one row in a table, or even a before-count and an after-count. Since I have complete control of the database contents on a per-test basis, I can say it has 6 rows after I’m done.  This level of confidence makes me happier. It’s really pretty simple, if I have a class, say FooTest, and I want to set up the database for that test, I just do the following:

@DataSet
public class FooTest extends UnitilsJUnit4 {
...

When Unitils sees that @DataSet annotation, it looks for a file called “FooTest.xml” in the same directory as the test case, and uses it to set up the table data as I direct. Here’s the FooTest.xml file:

<?xml version='1.0' encoding='UTF-8' ?>
<dataset>
<bar id="1" name="my bar"/>
<baz/>
</dataset>

What’s all that mean? It says the bar table should have a single row, with a 1 in the “ID” column, and “my bar” in the name column.  The baz table should be empty.  Unitils and DBUnit take care of making it so before any test in this class is run.  What about any tables that aren’t mentioned? They’re left unmolested by this, so if you want a table to be empty you need to say so, like we did with “baz” above.

Unitils works with a bunch of different databases, but I wanted to make it work with HSQLDB, which was packaged along with Unitils, and in particular, with the in-memory database option of HSQLDB.  Setting up the database itself was pretty simple. Unitils has a properties file, in which you specify the URL of the database, and the schema name. Mine looks like this:

database.url=jdbc:hsqldb:mem:PUBLIC
database.dialect=hsqldb
database.driverClassname=org.hsqldb.jdbcDriver
database.schemaNames=PUBLIC

The key for an in-memory database is the URL. Another thing of note here: with an in-memory database, HSQLDB doesn’t seem to pay any attention to the schema name, creating all the tables in the PUBLIC schema, but Unitils does pay attention. Unitils consequently can’t see the tables we’re going to create in a minute unless you say “PUBLIC” for the schema name here.

So that sets up a connection to an in-memory database, how do we get access to that database in our tests?  We’re going to need to use the same connection that Unitils/DBUnit uses  All we need to do is create a field in our test class that has the correct annotation:

@TestDataSource
protected DataSource testDataSource;

Unitils will set that field with the connection after it’s created the data source, and we can then use it to pass along to our classes that mutate data in the database.

The Secret Sauce

So far, we’ve got an in-memory database, and unitils prepared to use it.  The problem we now encounter is that because it’s an in-memory database, we have to find a way to create the tables before Unitils can use them. Otherwise, when Unitils tries to insert data, the table won’t exist yet.

The way we found to do this was to create a custom module for Unitils that creates the schema before we insert data.  We had to feel our way through doing this, but it turned out to be pretty simple.  In order to hook into the Unitils lifecycle, we need a DbUnitListener.  With one of these, we can hook into the beforeTestSetUp method. The out-of-the-box version of this class is packaged in the DbUnitModule.java file, as an inner class, and we’ll stick with that pattern.

It looks like this (I’ve edited out some of the fluff, the complete class is available Here):

package com.adeptechllc.unitils.dbunit;
...
public class InMemoryDbUnitModule extends DbUnitModule {

...
    private boolean isInitialized = false;
    /**
     * @return The TestListener object that implements Unitils' DbUnit support
     */
    @Override
    public TestListener getTestListener() {
        return new DbUnitListener();
    }
    public void initializeSchema() {
        try {
            String schemaName = configuration.getProperty(DATABASE_SCHEMA_NAME,
                    DEFAULT_SCHEMA_NAME);
            DbUnitDatabaseConnection connection = getDbUnitDatabaseConnection(schemaName);
            Statement statement = connection.getConnection().createStatement();
            for (String tableString : getTableStatements())
            {
                statement.executeUpdate(tableString);
            }
            connection.close();
            isInitialized = true;
        } catch (SQLException sqle) {
            System.err.println(sqle);
        }
    }
...
    /**
     *
     * Test listener that is called while the test framework is running tests
     */
    protected class DbUnitListener extends
            org.unitils.dbunit.DbUnitModule.DbUnitListener {
        @Override
        public void beforeTestSetUp(Object testObject, Method testMethod) {
            if (!isInitialized) {
                initializeSchema();
            }
            super.beforeTestSetUp(testObject, testMethod);
        }
    }
}

The important parts are our custom listener, which calls “initializeSchema” if we haven’t done so already, and the override of “getTestListener,” which returns our custom listener.  The only thing we need to do now is to get Unitils to use our custom module instead of the stock one. In order to do that, we add one more setting to that unitils.properties file:

database.url=jdbc:hsqldb:mem:PUBLIC
database.dialect=hsqldb
database.driverClassname=org.hsqldb.jdbcDriver
database.schemaNames=PUBLIC
unitils.module.dbunit.className=com.adeptechllc.unitils.InMemoryDbUnitModule

Now, when unitils loads up, it will use our custom DbUnitModule instead of it’s own. Most of the functions are exactly the same, except our listener will create the database the first time beforeTestSetUp is run.

It’s amazing how fast this works, by the way: the first few times we ran the DBUnit tests, we had to make the tests fail by inserting wrong data to convince ourselves it was really doing what we asked that fast…

I hope this helps you put better tests into palce for your data accessor code, I know I’m going to be using it quite a bit… We’ve been using it to enable TDD on our data accessors, and it’s working out quite well.


Comments

RSS feed | Trackback URI

6 Comments »

2009-07-07 07:24:30

[…] Unit testing using in-memory databases […]

 
Comment by jaime
2009-10-13 16:00:59

hi, thank you for the article, now i’m using your approach, but I have I problem trying to use another schema instead of public, say “TESTSCHEMA”

Comment by Keith McMillan
2009-10-13 20:50:12

Hi Jaime,

I also tried a schema other than PUBLIC when I first started trying to use HSQLDB, but if I recall correctly, you must use PUBLIC for an in-memory database. If I’m remembering correctly, this is because that’s hard-coded in HSQLDB for the in-memory driver.

Is it important that you use a name other than PUBLIC? The database will be destroyed when your JVM exits after all.

Hope this helps.

Keith

 
 
Comment by Felipe
2010-04-27 12:31:45

Hey, Keith,
I’ve tried using this class you’ve created but for some reason it doesn’t seem to be creating the schema.
My create_database.sql (maybe you could post an example of create_database.sql) has a single line:
CREATE SCHEMA UM5P AUTHORIZATION DBA
My hibernate entities are annotated as the example:
@Entity
@Table(name=”PTC”,schema=”UM5P”)
public class Ptc …

Funny thing is if I use HSQLDB in file mode, and put the UM5P schema creation in the .script file generated by HSQLDB, it works fine, but I can’t seem to make it work in mem mode. Any wild guesses?

Thank you

Comment by Keith McMillan
2010-04-28 10:00:34

Hello Felipe,

I have a pretty good guess what’s going on. HSQL with the in-memory database creates files in the PUBLIC schema, regardless of what you tell it to use. Since you’re using annotations, it may be difficult for you to swap out what schema hibernate is using in production vs unit testing, but that’s what you need to do. If you externalize your Hibernate mappings into XML, you might have an easier time.

Good luck!
Keith

 
 
Comment by Siddharth
2011-11-10 14:56:24

Thanks a lot. This saved my life.

 
Name (required)
E-mail (required - never shown publicly)
URI
Your Comment (smaller size | larger size)
You may use <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> in your comment.

Blogroll