Thoughts on Preparing Test Data

Oct 30, 2010

As a developer ,testing is already a part of life.Every day we write tests ,follow the "Red-Green-Refactoring" rhythm,yeah life is a thing of  beauty.Recently i am on a project in TWU,while,things are going well.This week ,i have paired with Jonathan,and there are lots of fun along with many insights gained from him. One of those is about how to prepare test data.
A Traditional Way
It reminds me that ,on previous project,i also ran into that scenario that there is an object called "Alarm",whose constructor receive three parameters:a "Timezone"  representing the timezone that alarm is in, a "Date" carrying out time but without timezone info,a "String" showing the name of user who holds this alarm.  The thing is that when writing test either unit- or integration- , some guy need a local timezone alarm ,so he/she just wrote this line :
Alarm alarm  = new Alarm(TimezoneInfo.Local , Datetime.Now , "Tina");
this is pretty simple ,while another coder said "hey Tina need a alarm with Beijing time" ,so she/he just wrote this in his/her test code:
Alarm alarm  = new Alarm(timezoneOfBeiJing ,DateTime.Now ,  "Tina");
At that time ,things just accumulated but  went smoothly.
Potential Problem
Well ,life is always up-and-down.There is a requirement saying alarm need a create time to show when this alarm is created. You probably said "this is easy ,just add a new constructor that take four pararmeters." Yeah,not bad, it wouldn't break your unit-tests ".Thinking about this:alarm is an entity ,which gotta be saved to database and hibernate wouldn't allow map null createDatetime to database. And that is the real-world case in previous project. And all the integration-tests are broken,because  create datetime is need and it can't be null. So what you gotta do ?  how about adding setCreatedTime(Datetime time) to alarm and call this method below alarm instantiation in all tests?  Again,not bad ,except coming at price of nightmare.  This thing can be overwhelming, because you got ,first, find all places that needed , then  manually add this line. That is not good,dude! I have been through that nightmare before , and there is no doubt that it is like torturing.So let's unveil that monster.The first point is that ,like forementioned . 1. it is very painful when you change the signature of constructor or change sth of that object ,which possibly caused a butter-fly effect. Yeah ,a chain of pain. 2. it is not straightforward. What that means is that you must infer what it does from navigating into constructor and checking the parameters , definitely not intention-revealing. 3.duplication.  You can see  that  instantiation codes have to declare "Tina"  and "Datetime.Now" twice. So there must be some way to handle this .Here it is ,the Object Mother.
Solution One: Object Mother
What basically Object Mother does is that it encapsulates all the creation code in one place.It is full of Factory Method ,which is very self-explanatory. What is really good about this is that it centralizes all the creation logics and make tests more manageable. Here is what it looks like when applying Object Mother to Alarm:
Alarm alarm = AlarmMother.createLocalNowAlarmToTina();
the second one is :
Alarm alarm = AlarmMother.createBeiJingNowAlarmToTina();
As you can see , it is more intention-revealing and you don't even need go to constructor of alarm to guess what it really does-----that is what tests really care. The tests just say "I need what kinda of data and just give me ,and i don't care how it got constructed". Object Mother uses an approach that stands on a higher-level point.It is more abstract and it fits to what tests should serve for perfectly. Test is a document,it should be clear,straightforward and should not be too detailed that confuse people or let them spend time inferring what test means. Well the downside of Object Mother is that it is slightly rigid and it is easy to get it bloated. It can't deal with variations of tests very well.What that means  is that ,if you need slightly  different test data ,it turns out you just keep adding method to Object Mother.
Alarm alarm = AlarmMother.createLocalNowAlarmToTina();

Alarm alarm = AlarmMother.createLocalMidNightAlarmToTina();
See , as time goes by ,the AlarmMother could easily get out of controll and unmanageable.And there are more and more fine-grained little methods flying around. There is another situation you may encounter:(i quote from Chris Richardson's slide "Improving Tests With Object Mothers And Internal Dsls") But overall, the shinning side of Object Mother way far weigh over the downsides.
Solution Two: Test Data Builder
Here it is:Test Data Builder. Every time you want use a class in tests ,create a Builder for that:(Quote from "Test Data Builders: an alternative to the Object Mother pattern")
1.Has an instance variable for each constructor parameter 2.Initialises its instance variables to commonly used or safe values 3.Has a `build` method that creates a new object using the values in its instance variables 4.Has "chainable" public methods for overriding the values in its instance variables
Following code is builder for alarm:
public class AlarmBuilder {
Timezone timezone ;
Datetime time ;
String username ;
Datetime createdTime = Datetime.now;

public AlarmBuilder withTimezone(Timezone timezone) {
this.timezone = timezone;
return this;
}

public AlarmBuilder withTime(Datetime time) {
this.time = time;
return this;
}

public AlarmBuilder withUsername(String username) {
this.username = username;
return this;
}

public AlarmBuilder withCreatedTime(Datetime createTime) {
this.createdTime = createdTime;
return this;
}

public Alarm build() {
return new Alarm(timezone , time , username ,createdTime);
}
}
So the instantiation process could be :
Alarm alarm = new AlarmBuilder().withTimezone(TimezoneInfo.Local)
.withTime(Datetime.Now).withUsername("Tina").build();

Alarm alarm = new  AlarmBuilder().withTimezone(beijingTimezone).withTime(Datetime.Now)
.withUsername("Tina").build();
As you can see ,it is pretty easy and manageable.To some extent ,it is not more intention-revealing than Object Mother. Consider this: suppose you wanna control all the parameters on object ,in object mother ,you probably can't use a very long method name to handle that ,because it is very easy to get Object Mother bloated:
alarm = AlarmMother.createAlarm(bangaloreTimezone,
DateTimeHelper.convert("2010-10-30 21:03:03") , "TuoHuang" ,
DateTimeHelper.convert("2010-09-30 21:03:03"));
And look at it under Builder :
Alarm alarm = new AlarmBuilder().withTimezone(bangaloreTimezone)
.withTime( DateTimeHelper.convert("2010-10-3021:03:03"))
.withUsername("TuoHuang")
.withCreatedTime(DateTimeHelper.convert("2010-09-30 21:03:03")).build();
The transcendence of builder is that you can distinguish that the later datetime is for created time and the former is for alarm time . But you can't get that info from using Object Mother. But Test Data Builder sometimes focus  on lower-level detail that it probably can't show the real meanings behind those data ,instead Object Mother handles this very well.
An Interesting Case
Put it more concrete , let's take an example from HuKai's blog . Suppose KungFuPanda must have following features simultaneously :
createPanda() matureAnimal() heavyWeight() animalLivesInUSA() animalAcrobat()
To create an KungFuPanda , if you use Test Data Builder to do this ,it probably looks like following :
new PandaBuilder().withMature().withHeavyWeight()
.withCountry("USA").withAcrobat();
When you first look at the code , you maybe can't figure out that that code is to create a "KungFu" panda,because its code is too low-level detail,which force you to infer from those code---probably not good idea. But instead  ,if look at the code implemented by Object Mother
PandaMother.createKungFuPanda();
It takes just one line to let you know "what it is"  without inferring it from "How it does".  This way is probably is more favorable for tests:straightforward , clear. More importantly ,it offers you a way to unaware the twisted implementation details behind the scene,which means a lot when you review the code.
Go Beyond the Surface
Here is another philosophy behind that: you shouldn't scatter the "new" keyword on your production code ,instead you need to either let Dependency Injection framework handles this or centralizes those creation logic to one place like "Factory" or sth. Yeah ,it is also true for test code. Centralized creation logic for tests is also important,for more info check out  Misko Hevery 's blog How to Think About the “new” Operator with Respect to Unit Testing
Which approach i should take?
In this blog ,probaActually, it is up to you. One thing i learned from software  is that there is no specific way belongs to  sovereign remedy. You should consider the context ,and adopt the way that best fits to that situation.What is really important is that you really gain deep insights behind the scene ,so you can tweak it to apply to different cases.