Unit testing can be a blessing and a curse at the same time. Once you start doing it on a regular basis, it can become an addiction. You test everything, and you feel the satisfaction of 110% test coverage giving you confidence in your code. But after a while, testing suddenly seems to slow you down. Every time you make a change in your code, you have to adapt several unrelated tests. Adding a <ul> tag to your template becomes a matter of minutes rather than seconds. And, worst of all, your test suite is slow.

What happened?

Unit vs. System Tests

To begin with, you need to understand the difference between unit and system tests and the impact both have on testing performance.

Unit tests are atomic. They test classes and components in isolation from any other service they depend on. Such services can be external services, like mailing systems, databases or web services, or internal services, like a calculator, a data store or similar service classes. Running these tests in isolation gives you several advantages:

  • Speed of execution
  • Speed of development
  • Test confidence

Speed of execution, because the test doesn’t need to wait for the related service to process its internal logic. Speed of development, because you (hopefully) don’t have to adapt the test when you refactor the internals of the related service class or add new features to it. And test confidence, because the test will only fail when the tested class is erroneous, not if any of the related services provide wrong data.

System tests (acceptance and functional tests are kinds of system tests) test whether all the classes in your application collaborate correctly. Because you unit tested these classes in isolation, you had to assume how collaborating services should behave. If this assumption is wrong, no unit test will tell you. And that’s the gap that system tests fill. The big, big disadvantage of system tests is that they are slow. They have to initialize the application, read its configuration files, initialize its database with fresh data (it doesn’t make a lot of sense to disconnect system tests from the real database) and more. All this takes time.

The conclusion from the last paragraphs is that features of your application should always be tested in unit tests. These tests are fast and guide you to the error if things go wrong. There is nothing more harmful to test performance than testing whether a form works correctly by means of a system test. If the form is a class, write a unit test. If the form is not a class, make it one.

Decouple Your Code

If you reach the point where you have lots of unit tests and few system tests, you have taken the first step towards becoming a test speed king. The second step is to decouple your classes.

Highly coupled code negatively affects your development in several ways. Because your code depends on other classes, these other classes (the depended-on components or DOCs) all have to be instantiated during testing. Depending on their nature, this process may take time and slow your tests down. Introducing new features becomes difficult because it results in a lot of test code that needs to be adapted. If one of your classes breaks, not only does the test of that class fail, but so do several other tests of dependent classes. In short: extending and maintaining the application becomes a nightmare.

To decouple your code, get a clear picture of the responsibilities inside of your application. If a class does too much, tear it apart like a wild monkey smelling a banana in a big pile of waste paper.

The following code samples are based on the PHP5 frameworks symfony, Doctrine and Lime 2. The same concepts can be applied to other frameworks and languages just as well.

Let’s look at a (slightly modified) example I recently saw in a code review:

class sfValidatorUsername
{
  public function clean($value)
  {
    $user = Doctrine_Query::create()
      ->from('User u')
      ->where('u.name = ? AND u.active = 1', $value)
      ->fetchOne();
 
    if (is_null($user))
    {
      throw new sfValidatorError('invalid', $this);
    }
 
    return $value;
  }
}

It should be pretty obvious what this code is doing. What is the problem? The validator depends on the database. This is not a problem per se, but querying the database for User objects is beyond the validator’s responsibilities. This task should rather be done by your data access objects (DAOs), entity stores, tables or whatever you call them. And these objects can easily be injected into the validator using Dependency Injection.

Happy that we have identified this code smell, we can already start removing it:

class sfValidatorUsername
{
  protected $table;
 
  public function __construct(UserTable $table)
  {
    $this->table = $table;
  }
 
  public function clean($value)
  {
    if (is_null($this->table->findActive($value)))
    {
      throw new sfValidatorError('invalid', $this);
    }
 
    return $value;
  }
}

We removed the database code and instead inject the table object into the constructor. The responsibility for fetching an active user by its name was outsourced to the table.

An even faster solution is to only count the matching user objects in the database instead of querying and hydrating them. But that’s not in the scope of this article.

So, mummy, is our test faster already? Is it?

Replace Dependencies During Testing

No, kid. We are still using the UserTable when testing the validator – of course, because we need it. So what can we do?

Now that the validator only depends on the UserTable, and not on some mystical database query magic, we can replace the UserTable – which is tested somewhere else and assumed to be working – with a fake implementation. This is usually called stubbing (because you are replacing it with a non-functional stub). With Lime 2, you can do it like this:

$t = new LimeTest();
 
// @Test: clean() returns the cleaned value if the user is found
 
  $table = $t->stub('UserTable');
  $table->findActive('bob')->returns(new User());
  $table->replay();
  $validator = new sfValidatorUsername($table);
 
  $t->is($validator->clean('bob'), 'bob');

The stub acts as if it were the real UserTable object, but in reality it has no internal logic at all. So compared to the real table, the stub is very fast.
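
For readers without Lime 2 at hand, the same idea can be sketched with a hand-written stub in plain PHP. This is only an illustration under a few assumptions: UsernameValidator and StubUserTable are hypothetical names, the UserTable below is a bare stand-in for the real Doctrine table, and an InvalidArgumentException takes the place of sfValidatorError.

```php
<?php

// Stand-in for the real table class; in the application this would query the database.
class UserTable
{
  public function findActive($name)
  {
    throw new RuntimeException('The real table needs a database connection');
  }
}

// Hand-written stub (hypothetical): returns canned results and never touches a database.
class StubUserTable extends UserTable
{
  private $activeUsers;

  public function __construct(array $activeUsers)
  {
    $this->activeUsers = $activeUsers;
  }

  public function findActive($name)
  {
    return isset($this->activeUsers[$name]) ? $this->activeUsers[$name] : null;
  }
}

// Simplified validator (hypothetical, modelled on sfValidatorUsername above).
class UsernameValidator
{
  private $table;

  public function __construct(UserTable $table)
  {
    $this->table = $table;
  }

  public function clean($value)
  {
    if (is_null($this->table->findActive($value)))
    {
      throw new InvalidArgumentException('invalid');
    }

    return $value;
  }
}

// The validator only sees the stub, so no database is involved.
$validator = new UsernameValidator(new StubUserTable(array('bob' => new stdClass())));
$cleaned = $validator->clean('bob');
```

A stubbing library saves you from writing a class like StubUserTable by hand, but the principle is the same: the validator receives whatever object is injected and cannot tell the difference.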

More documentation about Lime 2 and its stubbing/mocking capabilities can currently be found on the Lime 2 GitHub repository.

A word of caution: you should only replace services (classes that process some data and return a result) with stubs, not entities (classes that simply contain data). Otherwise changing your domain model (the entities) will soon become a nightmare.

Testing With Databases

Hurray! The validator test is so fast, I can execute it 10 times in a row and it’s still fun. One more time! Nice.

But wait – we can’t decouple everything from the database, can we? Of course, we can’t. Code that is responsible for collaborating with the database, like the above UserTable, needs to access the database. In order for that test to work, the database has to be bootstrapped and filled with data. Because tests should run in isolation from each other, the database further needs to be reset before every single test.

A common mistake that severely affects testing performance can often be seen in symfony projects. Many developers use a test database on their development DBMS, like a MySQL server. This database is filled with the content of fixtures.yml, which usually contains all fixture data commonly used to test the application in the browser. There are good reasons why this is very slow:

  • Most DBMS store data on the file system. Hard disk access is an expensive operation.
  • Filling the database with all the test data takes a lot of time.
  • Parsing YAML files is slow.

So ideally we don’t use the database and don’t use any test data. Obviously, that’s not possible.

In-Memory Databases

What is possible is to use an in-memory database. While an access to a 7200 RPM hard disk usually takes around 4 170 000 ns (nanoseconds), a memory access only takes around 14 ns! Thus, using an in-memory database is a very cheap way to increase your testing performance immensely. Unfortunately, not all database vendors offer in-memory alternatives for their file-based DBMS (MySQL being one example). In such cases I use SQLite in-memory databases, because SQLite supports most of the SQL-92 standard, which is generally sufficient for testing purposes.
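
The disk figure above is consistent with the average rotational latency of a 7200 RPM drive, that is, the time the platter needs for half a revolution. Reading the number this way is an assumption, since the text does not say how it was obtained, and the function name below is purely illustrative.

```php
<?php

// Average rotational latency: on average the platter has to turn half a revolution
// before the requested sector passes under the head.
function averageRotationalLatencyNs($rpm)
{
  $nsPerRevolution = 60 * 1000000000 / $rpm; // one minute in nanoseconds, divided by revolutions

  return $nsPerRevolution / 2;
}

$diskNs = averageRotationalLatencyNs(7200); // roughly 4.17 million ns
$memoryNs = 14;                             // memory access figure quoted above
$ratio = $diskNs / $memoryNs;               // memory accesses that fit into one disk access
```

Under that reading, a single disk access costs about as much as nearly 300 000 memory accesses.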

If you use symfony with Doctrine, you can place the following code snippet into your bootstrap/unit.php file which initializes the SQLite in-memory database:

// initialize Doctrine
$cacheFile = sprintf('sf_autoload_unit_doctrine_%s.data', md5(__FILE__));
$autoload = sfSimpleAutoload::getInstance(sys_get_temp_dir().'/'.$cacheFile);
$autoload->addDirectory(ROOT_DIR.'/lib/model');
$autoload->addDirectory(ROOT_DIR.'/lib/form');
$autoload->addDirectory(ROOT_DIR.'/lib/filter');
$autoload->register();
 
$database = new sfDoctrineDatabase(array(
  'name' => 'doctrine',
  'dsn' => 'sqlite::memory:'
));
 
// load all missing model files
Doctrine::loadModels(ROOT_DIR.'/lib/model');
 
/**
 * Reloads the database.
 */
function reload()
{
  // close the connection to the in-memory database to recreate the database
  Doctrine_Manager::getInstance()->getCurrentConnection()->close();
 
  // create the database tables from the loaded models
  Doctrine::createTablesFromModels();
 
  // clear the Doctrine cache
  foreach (Doctrine::getLoadedModels() as $model)
  {
    Doctrine::getTable($model)->clear();
  }
}

Make sure the constant ROOT_DIR is defined and points to the root of the tested project (your real project or the fixture project).

You might wonder why the $cacheFile is required. This parameter allows us to run multiple tests in parallel while still leveraging autoloading performance by using a cache.

The really interesting piece of code is the reload() function. It allows you to recreate the database on demand. Whenever you write a test that requires a fresh database, call reload() and you are good to go.
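
Outside of symfony and Doctrine, the same reset-by-reconnecting idea can be sketched with plain PDO. This assumes the pdo_sqlite extension is available; freshDatabase() and the user table schema are made up for the illustration and are not part of the bootstrap file above.

```php
<?php

// Returns a brand-new in-memory database with the schema already created.
// Every call opens a separate connection, and each connection to
// sqlite::memory: gets its own empty database.
function freshDatabase()
{
  $pdo = new PDO('sqlite::memory:');
  $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  $pdo->exec('CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT NOT NULL, active INTEGER NOT NULL)');

  return $pdo;
}

// First test: insert one row.
$db = freshDatabase();
$db->exec("INSERT INTO user (username, active) VALUES ('bob', 1)");
$countBefore = (int) $db->query('SELECT COUNT(*) FROM user')->fetchColumn();

// Second test: start from scratch again; the previous data is gone.
$db = freshDatabase();
$countAfter = (int) $db->query('SELECT COUNT(*) FROM user')->fetchColumn();
```

Because nothing is written to disk, throwing the database away and recreating it costs little more than running the CREATE TABLE statements.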

Fixture Management

As described above, it is important to keep an eye on the test data that is loaded into your database. The more data you load, the more time it takes. Loading the whole fixtures.yml is a recipe for slow tests (not only because it contains a lot of data, but also because the YAML file must be parsed).

A much more efficient solution is to create and load test data programmatically on the fly. Only create what you need, and nothing more. Below you can find a sample test for the UserTable::findActive() method used above:

$t = new LimeTest();
 
// @Test: findActive() returns an active user by username
 
  reload();
  $user = new User();
  $user->username = 'bob';
  $user->active = true;
  $user->save();
  $table = Doctrine::getTable('User');
  $t->is($table->findActive('bob'), $user);

This test is really fast now. The database is created in-memory, exactly one single User object is inserted and that’s it. The downside is that creating all the fixture data inline becomes a little messy, especially if you have to fill a lot of properties only because they are defined as NOT NULL in the database.

Creation Methods

To work around this issue, I prefer the Creation Method pattern by Gerard Meszaros[1]:

We should use a Creation Method whenever constructing a Fresh Fixture requires significant complexity and we value Tests as Documentation.

The pattern is very simple. Encapsulate the code required to create an object within a utility function that can be reused across your project. My rule of thumb is to write creation methods for all tested Doctrine records and initialize all NOT NULL fields and relations with default values so that the object can be saved without further modification. Because this leads to a lot of code duplication, I also use a base function createObject().

function createUser(array $properties = array())
{
  return createObject('User', $properties, array(
    // NOT NULL properties
    'username' => 'francis',
    // ...
    // NOT NULL relations
    'Group' => createGroup(),
    // ...
  ));
}
 
function createObject($class, array $properties, array $defaultProperties)
{
  $properties = array_merge($defaultProperties, $properties);
 
  $object = new $class();
  foreach ($properties as $property => $value)
  {
    $object->set($property, $value);
  }
 
  return $object;
}

Now you can call createUser() every time you need a new User object that is ready to be saved. Optionally, you can pass property values to the creation method to override the defaults.

$t = new LimeTest();
 
// @Test: findActive() returns an active user by username
 
  reload();
  $user = createUser(array(
    'username' => 'bob',
    'active' => true,
  ));
  $user->save();
  $table = Doctrine::getTable('User');
  $t->is($table->findActive('bob'), $user);
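
The key detail in createObject() is the order of the arguments to array_merge(): for string keys, values from later arrays win, so explicitly passed properties override the defaults. Here is a minimal stand-alone sketch of that behaviour, using a hypothetical PlainUser class and createPlainUser() function instead of a real Doctrine record:

```php
<?php

// Hypothetical stand-in for a Doctrine record with a set() method.
class PlainUser
{
  public $values = array();

  public function set($property, $value)
  {
    $this->values[$property] = $value;
  }
}

function createObject($class, array $properties, array $defaultProperties)
{
  // Later arrays win for string keys, so $properties overrides the defaults.
  $properties = array_merge($defaultProperties, $properties);

  $object = new $class();
  foreach ($properties as $property => $value)
  {
    $object->set($property, $value);
  }

  return $object;
}

function createPlainUser(array $properties = array())
{
  return createObject('PlainUser', $properties, array(
    'username' => 'francis',
    'active'   => false,
  ));
}

$default = createPlainUser();                        // all defaults
$bob = createPlainUser(array('username' => 'bob'));  // username overridden, active kept
```

A test therefore only has to mention the properties it actually cares about, which keeps it readable as documentation.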

A Few More Utility Methods

In the following snippet you can find some more utility methods that I find very useful when dealing with fresh fixtures. Put them into your bootstrap/unit.php file or wherever they suit you best.

/**
 * Saves all objects passed as arguments.
 * @param Doctrine_Record $object1
 * @param Doctrine_Record ...
 */
function save()
{
  foreach (func_get_args() as $object)
  {
    $object->save();
  }
}
 
/**
 * Creates a collection containing the passed objects.
 * @param Doctrine_Record $object1
 * @param Doctrine_Record ...
 */
function collection()
{
  if (func_num_args() == 0)
  {
    throw new LogicException('You must pass at least one object');
  }
 
  $objects = func_get_args();
  $collection = new Doctrine_Collection(get_class($objects[0]));
  foreach ($objects as $object)
  {
    $collection->add($object);
  }
 
  return $collection;
}

The functions can be used like this:

$user1 = createUser();
$user2 = createUser();
$user3 = createUser();
save($user1, $user2, $user3);
 
$coll = collection($user1, $user3);
// etc.

Pulling All Levers

Now that we have optimized our test code to the highest degree, there is one last thing we can do: Increase the hardware power of the test computers and optimize the test suite to utilize its power.

Because most CPUs nowadays feature multiple cores, you should design your test suite for parallel processing. Make sure that no two tests are using the same resources (another benefit of the in-memory databases) and use a testing framework of your choice that supports multiprocessing.

Lime 2, for example, supports multiprocessing by adding the --processes option:

$ php lime --processes=16

While most of my CPU cores are pretty bored when running the test suite in one process, using multiple processes sparks a sudden increase in activity. Not only that, I measured the suite running more than three times as fast (15 minutes in one process, 4.5 minutes in 16 processes).
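
One way to keep parallel test processes from sharing file-based resources is to derive a scratch directory from the process ID. This is a sketch and not a Lime 2 feature; processScratchDir() is a hypothetical helper.

```php
<?php

// Returns a scratch directory that is unique to the current process,
// creating it on first use.
function processScratchDir($prefix = 'tests')
{
  $dir = sys_get_temp_dir().DIRECTORY_SEPARATOR.$prefix.'_'.getmypid();

  if (!is_dir($dir) && !mkdir($dir, 0777, true) && !is_dir($dir))
  {
    throw new RuntimeException(sprintf('Could not create "%s"', $dir));
  }

  return $dir;
}

// Logs, caches and temporary files of this process go below $scratch.
$scratch = processScratchDir();
$logFile = $scratch.DIRECTORY_SEPARATOR.'test.log';
file_put_contents($logFile, "started\n");
```

Since every process has its own ID, two test processes running at the same time never write to the same directory.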

Conclusion

As I have tried to portray in this article, writing tests alone is not enough to improve your development. If tests are slow or prevent change, your developers won’t run and use them. Fortunately there are several ways to make changes easier and to increase test speed.

To summarize:

  1. Test single features in unit tests. System tests should only be used to verify whether the classes collaborate correctly within different processes.
  2. Decouple your classes. This allows introduction of changes without having to adapt multiple tests.
  3. Only classes that really need to should deal with remote services. Replace these classes with stubs in all other tests.
  4. Choose the fastest version of a remote service available. When you test database classes, use in-memory databases.
  5. Only create databases during testing when you really need them.
  6. Only load data into the database that you really need. Do so programmatically without taking the overhead of parsing XML or YAML files.
  7. Make use of multiprocessing, if supported by your testing framework.

How is your testing experience? Are you satisfied with the performance of your tests?

References

[1] Gerard Meszaros: xUnit Test Patterns. Refactoring Test Code. Addison-Wesley, 2007. Page 415

Posted Thursday, March 11th, 2010 at 18:55
Filed under category: Best Practices

13 Responses to “Writing Efficient Tests”

Very nice post !

One important thing when dealing with multi-process tests is that you might get conflicts between concurrent processes. So if your tests generate or use temporary files, you need to make sure the configuration/log/temp folders are unique per process.

Bernhard

Thanks for the comment, Thomas. You are absolutely right. I think I will publish another post that will deal exclusively with concurrent testing.

Matthieu

Great post!

What do you think about transactions?

Between the execution of each function in my (Lime2) test class, a beginTransaction() and endTransaction() is called to revert the database to its previous state. This is really fast I think.

Bernhard

@Matthieu: Sounds like a good solution, too. I probably need to experiment a little more with that approach.

One more very good article on this blog.

The transaction trick is a good one, but it’s still very slow – at least regarding my quick personal testing.

The in-memory SQLite database trick is fantastic, unfortunately hardly usable as soon as you’re using RDBMS-specific features such as the ones provided by MySQL or PostgreSQL (they’re providing so many useful ones it’d be a shame not to benefit from them… I know ORMs are not intended to be used this way but the shared features across RDBMS implementations are really thin actually.)

I think at some point System Tests will always end up being slow, just by design: if the system to be tested is complex, related tests will test this complexity; and testing complexity naturally takes time. But of course I agree we should always try to reduce the amount of unnecessary executions as much as possible to gain maximum efficiency, and rapid feedback to keep agility.

Personally, I’m a big fan of continuous integration, the tests are run periodically so I can keep concentrating on the code and I get alerts as soon as something breaks (so you need of course to commit atomically to reduce the surprise effect ;) I didn’t find the perfect compromise yet, but will look forward to giving lime2 a try because its mocking api is really interesting by itself :)

Keep up the good work!

Very interesting article! I think I can use a lot of this even when I’m testing with PHPUnit.

But I wonder about Lime2. So far it has been mentioned that Symfony 2 is going to use PHPUnit, symfony 1.4 is not supposed to switch base technologies, Lime2 development became low priority, ..
What is the purpose and goal of Lime2? I mean I do think your ideas so far have been inspiring, but what will be the outcome?
Shouldn’t there be a Lime2 website finally?

@NiKo Apart from the initial part (Filling the database with all the test data from YAML files) that I do just once (or each time I change my fixtures), it doesn’t seem to me that using transactions is slow.
Anyway I guess it is faster than reloading the test data.

This is an excellent article. Developers knowing what is a unit test is rare nowadays…

[...] paying more attention to unit testing and writing tests in my symfony apps. But not until this awesome post from Bernhard Schussek, I really understood how you can make your testing life easier and get tests done better, faster [...]

Nicolae

Thank you very much. That made my life easier.

Arnau

I just landed here and happily found that some ideas I had to improve my tests are defined here as best practices, like the creation methods or the use of transactions to clean the database for every test.

I was digging into the use of memory databases some months ago when I started creating tests for the database classes in Symfony2 and realized how easily the suite became slow. Finally I achieved it but, as the schema of the database was created for every test in the SetUp() function, the result was even slower than using MySQL.
Reading the article I realized that I was doing it wrong according to that sentence:
“If you use symfony with Doctrine, you can place the following code snippet into your bootstrap/unit.php file which initializes the SQLite in-memory database”.
As the structure of the framework has changed a lot, how would you do that now in Symfony2?

Bernhard

@Arnau: I haven’t tried that, but you would probably create a common base class with a setUpBeforeClass() method that contains that logic.
