Steve McConnell says it in a nutshell (Code Complete, p. 618, italics added) -
Suppose that you've tested a product thoroughly and found no errors. Suppose that
the product is then changed in one area and you want to be sure that it still
passes all the tests it did before the change
- that the change didn't introduce any new defects. Testing to make sure the software
hasn't taken a step backwards, or "regressed", is called "regression testing".
[...] If you run different tests after each change, you have no way of knowing
for sure that no new defects were introduced. Consequently, regression testing must
run the same tests each time. Sometimes new tests are added as the product matures,
but the old tests are kept too.
The only practical way to manage regression testing is to automate it.
People become numbed from running the same tests many times and seeing the same
test results many times. It becomes too easy to overlook errors, which defeats the
purpose of regression testing.
The main tools used to support automatic testing generate input, capture output,
and compare actual output with expected output.
Now, suppose we do have automated software to take care of the task, so that we
can define the concept rigorously, without care for non-automated approximations -
From the start of the software project, every new capability is accompanied by a
short test battery. This battery tests out the new capability as thoroughly as the
designers could want. It is easy to write because it has no other concern than the
new capability, and that capability is fresh-coded, or better, yet to be coded.
As a test battery is applied, needless tests are weeded out and new ones added for
forgotten corners.
Once the battery looks good (normally, this takes less than an hour), and once the
software meets all of it (this may take longer, but fixing things is easiest when
the code is freshest), correct results are garnered for all the tests, and stored
as files (text, data, screen images, etc.).
Anytime a new capability is added, with its new test battery, all previous,
validated tests are run, and the results compared with the standard results already
stored on file. This is precisely what is called a regression test. Computer
time is the cheapest resource around. Anything that goes wrong with the old tests
can be traced to something done between the last time the regression test was run,
and the time the latest one has run. Normally, that should be twenty-four hours.
This truly narrows down bug searching.
The same full regression test is run whenever the implementation is changed, even
if no new capability is introduced.
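The cycle just described (store validated results, then re-run everything and compare) can be sketched in a few lines of Python. This is only an illustration, not how any particular tool does it: the function name run_regression, the one-text-file-per-test layout and the "new" verdict are all assumptions made for the example. In particular, the sketch stores a result as the standard on the first run, whereas in practice someone validates the result before it is kept.

```python
import tempfile
from pathlib import Path


def run_regression(tests, standards_dir):
    """Run every test and compare its output with the stored standard.

    `tests` maps a test name to a zero-argument callable returning text.
    Returns a dict mapping each test name to "ok", "not ok" or "new".
    """
    standards_dir = Path(standards_dir)
    standards_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for name, test in sorted(tests.items()):
        actual = test()
        standard = standards_dir / f"{name}.txt"
        if not standard.exists():
            # Simplification: the first result is taken as the standard.
            standard.write_text(actual, encoding="utf-8")
            results[name] = "new"
        elif standard.read_text(encoding="utf-8") == actual:
            results[name] = "ok"
        else:
            results[name] = "not ok"
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tests = {"add": lambda: str(2 + 3), "upper": lambda: "abc".upper()}
        print(run_regression(tests, tmp))   # first run stores the standards
        tests["add"] = lambda: str(2 + 4)   # simulate a change that regresses
        print(run_regression(tests, tmp))   # "add" is now reported "not ok"
```

The same call serves both cases in the text: run it after adding a capability (the new tests show up as "new") or after changing the implementation only (every test should come back "ok").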
You can quickly write small applications simply to test a portion of your project.
For instance, to test one specific dialog (perhaps using internal variables as "output"),
or to run through a specific sequence of operations.
If this is done every day (perhaps in the evening), then the "unintended results"
found can be traced out quickly (say, at the start of the next day), fixed and re-tested
(full regression test, as always). At that point, you know that your code, in its
current state, passes every single test you ever thought up for it and found
useful. All these little tests have been written quickly, each to try one aspect
of one capability. It is the sum of them that creates the super-solid overall test.
By the time the project is into its third month, tens of thousands of impossibly
boring is-same verifications will have been run by the automated testing software,
with binary reliability.
There is a programming method that takes this one step further - the complete
regression test runs several times a day. The method is called Extreme Programming.
(See Extreme Programming Explained, Kent Beck, Addison-Wesley, 2000 - short, well thought-out
and well-written.)
So, what should automated software do to support regression testing?
It's a duh-point that it should record macros, both for mouse and for keyboard.
More importantly, it should record them by default as Windows input commands
(toggling a check box, modifying an edit box, etc.), not as absolute, blind, screen-relative
actions. A test should not break because the user interface is tweaked! In fact,
not only should the default recording be relative to controls, but it should locate
them by the window they belong to, and identify this window by its window class,
instance number and, optionally, by its caption. All automatically, of course. Then,
there should be the option to record blind, with absolute screen positions, precisely
to test whether the UI has changed accidentally.
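To make the idea of control-relative recording concrete, here is a rough Python sketch of what one recorded step might carry. Nothing here talks to Windows: Window, WindowLocator and RecordedAction are hypothetical names invented for the example, and the list of windows is a stand-in for whatever the tool would enumerate at playback time. The sketch also assumes that the instance number counts windows of the same class (after the optional caption check).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Window:
    """A stand-in for one top-level window as the tool might see it."""
    window_class: str
    caption: str


@dataclass(frozen=True)
class WindowLocator:
    """Identifies a window by class, instance number and optional caption."""
    window_class: str
    instance: int = 1               # 1 = first matching window
    caption: Optional[str] = None   # checked only when given

    def find(self, windows):
        """Return the matching window, or None if it is not on screen."""
        matches = [w for w in windows
                   if w.window_class == self.window_class
                   and (self.caption is None or w.caption == self.caption)]
        if len(matches) < self.instance:
            return None
        return matches[self.instance - 1]


@dataclass(frozen=True)
class RecordedAction:
    """One recorded step, relative to a control rather than to the screen."""
    window: WindowLocator
    control: str     # e.g. the name of a check box or edit box
    command: str     # e.g. "toggle" or "set_text"
    value: str = ""


if __name__ == "__main__":
    windows = [Window("TOptionsDialog", "Options"),
               Window("TEditor", "a.txt"),
               Window("TEditor", "b.txt")]
    step = RecordedAction(WindowLocator("TEditor", instance=2),
                          "WordWrap", "toggle")
    print(step.window.find(windows).caption)   # the second editor window
```

Because the step names a window and a control, moving the check box to another corner of the dialog leaves the recording valid. A blind recording would store screen coordinates instead, which is exactly why it breaks (usefully) when the layout changes.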
This automatic recording should output a human-editable script, in a decent script
language. This is essential to the three requirements that follow.
Most tests are not functional (user-level) tests; they are unit tests. They test
the interfaces of libraries before they are integrated into the code
base. Unit tests use test harnesses (applets making the required library calls)
and test-data files. Most often, their one human input is "go". Within a framework
of automated support for many small tests, the testing tool will not be doing half
its job for input, if the best it can do for a unit test is click "go". Therefore,
its script language should be supported by an interfacing library and by more advanced
tools that will allow the scripts to "look into" library interfaces and call them
directly. It's not possible to do everything a test harness can do with program source
code, but the automated tool should at least allow its scripts to do most of the
"harness" work. This of course isn't done by recording, but by hand coding in the
script language.
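As an illustration of a script doing "harness" work, the Python sketch below calls library functions directly from a table of cases instead of clicking "go" on a separate applet. The helper run_unit_case and the CASES table are invented for the example, and two Python standard-library functions stand in for the library under test. A real testing tool's script language would have its own way of reaching into a library.

```python
import importlib


def run_unit_case(module_name, function_name, args, expected):
    """Call module_name.function_name(*args) directly and check the result."""
    function = getattr(importlib.import_module(module_name), function_name)
    try:
        actual = function(*args)
    except Exception as exc:   # a crash is a failed test, not a crashed run
        return f"not ok ({type(exc).__name__}: {exc})"
    return "ok" if actual == expected else f"not ok (got {actual!r})"


# Each case: module, function, arguments, expected result.
CASES = [
    ("math", "gcd", (12, 18), 6),
    ("posixpath", "basename", ("/a/b.txt",), "b.txt"),
    ("math", "gcd", (12, 18), 5),   # deliberately wrong expectation
]

if __name__ == "__main__":
    for module_name, function_name, args, expected in CASES:
        verdict = run_unit_case(module_name, function_name, args, expected)
        print(f"{module_name}.{function_name}{args}: {verdict}")
```

Keeping the cases as data means a new unit test is one more line in the table, which fits the "many small tests" approach.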
What we've said up to now for test input goes double for test output. The script
language should allow output to be read off the screen in Windows terms
(as well as in pixel terms for special cases). It should also be able to get and
read output files. And, if it has support for "internal" access, then of course
the script language will be able to deal with the output of units as well as it
deals with the input.
Once all of this is done, we still have not dealt with the real tough nut of regression
testing - endless checking of test output for identity against a standard output.
What our script language must also support, then, is automated output analysis.
The test isn't done until it says "ok" or "not ok". The script language must support
comparisons, take decisions and signal its results to the automated testing tool.
All of this is easily coded in a standard programming language; it's only to be expected
of a modern testing tool. But how far have we moved beyond simple macro recording!
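A minimal sketch of such output analysis, in Python, might look like the following. The function name analyse_output is an assumption for the example. It compares text output with a stored standard and returns a verdict plus, when the two differ, a line-by-line diff from the standard difflib module, so that the person reading the report sees what changed without opening both files.

```python
import difflib


def analyse_output(actual, standard, name="output"):
    """Return ("ok", []) or ("not ok", diff_lines) for two text outputs."""
    if actual == standard:
        return "ok", []
    diff = difflib.unified_diff(
        standard.splitlines(), actual.splitlines(),
        fromfile=f"{name} (standard)", tofile=f"{name} (actual)",
        lineterm="")
    return "not ok", list(diff)


if __name__ == "__main__":
    standard = "total: 10\nitems: 3"
    verdict, details = analyse_output("total: 11\nitems: 3", standard, "invoice")
    print(verdict)
    print("\n".join(details))
```

Exact comparison is the simplest rule. Output that legitimately varies from run to run (dates, timings) would need to be masked or normalised before comparing, which this sketch does not attempt.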
Now we get to the tool itself. Another duh-point: it must manage the test structure.
It must know what tests to run, and how to report the results. Practically, it must at
least have a good set of optional filters, since after some months one regression
test involves hundreds of small tests, and the "human overload" problem will occur
if the tool forces the user to look over all results to find the one "not ok" among
643 "ok"s.
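The kind of filter meant here can be sketched in Python as follows. The function name summarise and its show parameter are assumptions for the example. It prints one line of totals, then only the results the user asked to see, which by default means the failures.

```python
from collections import Counter


def summarise(results, show=("not ok",)):
    """Return report lines: one line of totals, then only the filtered results.

    `results` maps a test name to its verdict, e.g. "ok" or "not ok".
    """
    counts = Counter(results.values())
    totals = ", ".join(f"{counts[verdict]} {verdict}" for verdict in sorted(counts))
    selected = [f"{name}: {verdict}"
                for name, verdict in sorted(results.items())
                if verdict in show]
    return [totals] + selected


if __name__ == "__main__":
    results = {f"test_{i:03d}": "ok" for i in range(643)}
    results["test_dialog"] = "not ok"
    for line in summarise(results):
        print(line)
```

With 643 passing tests and one failure, the report is two lines long instead of 644.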
Another necessary part of automated test management is file management. Just as it
keeps a record of all the scripts needed, the software must also record
all the files needed, where they are and for what test they're needed. Then, it
must also keep a record of all files, in whatever format, that are kept as standards
against which to compare in the second phase of each test, output analysis.
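One possible shape for such a record is sketched below in Python. The manifest layout and the names fingerprint, build_manifest and changed_files are assumptions for the example. Each test lists its files by role ("inputs", "standards"), and a SHA-256 hash is stored alongside each path so that a file that has changed or gone missing can be spotted later.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fingerprint(path):
    """SHA-256 of a file's contents, used to notice unannounced changes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(tests):
    """Record every file each test needs, with its role and its hash.

    `tests` maps a test name to {"inputs": [...], "standards": [...]}.
    The result contains only strings and dicts, so it can be saved as JSON.
    """
    return {
        name: {role: {str(p): fingerprint(p) for p in paths}
               for role, paths in files.items()}
        for name, files in tests.items()
    }


def changed_files(manifest):
    """List (test, role, path) for files that are missing or have changed."""
    problems = []
    for name, roles in manifest.items():
        for role, files in roles.items():
            for path, digest in files.items():
                if not Path(path).exists() or fingerprint(path) != digest:
                    problems.append((name, role, path))
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "orders.csv"
        data.write_text("id,qty\n1,5\n", encoding="utf-8")
        manifest = build_manifest(
            {"order_totals": {"inputs": [data], "standards": []}})
        print(json.dumps(manifest, indent=2))
        data.write_text("id,qty\n1,6\n", encoding="utf-8")  # unannounced change
        print(changed_files(manifest))
```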
Finally, the software must be a good failure manager. Whenever an executable or a
library is not found, an input file appears to have
changed without warning, a comparison standard has gone missing, the software must
report this concisely, skip what needs skipping, and go on with the work that can
still be done. The last thing we need is automated test software that forces us
to figure out why the tests could not be run, rather than why their results
were wrong.
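In Python, the behaviour asked for might be sketched like this. The function run_all, its input format and the report wording are assumptions for the example. A test whose files are missing is skipped with a one-line reason, a test that blows up is reported as "could not run", and in both cases the run carries on with the remaining tests.

```python
from pathlib import Path


def run_all(tests):
    """Run each test that can be run; report concisely on those that cannot.

    `tests` is a list of (name, required_files, action), where `action` is a
    zero-argument callable returning True for "ok" and False for "not ok".
    """
    report = []
    for name, required_files, action in tests:
        missing = [str(p) for p in required_files if not Path(p).exists()]
        if missing:
            report.append(f"{name}: skipped (missing: {', '.join(missing)})")
            continue   # go on with the work that can still be done
        try:
            verdict = "ok" if action() else "not ok"
        except Exception as exc:
            # A problem running the test is kept apart from a wrong result.
            verdict = f"could not run ({type(exc).__name__}: {exc})"
        report.append(f"{name}: {verdict}")
    return report


if __name__ == "__main__":
    absent = Path("no_such_directory") / "orders.csv"   # assumed not to exist
    tests = [
        ("order_totals", [absent], lambda: True),
        ("arithmetic", [], lambda: 2 + 2 == 4),
        ("division", [], lambda: 1 / 0 == 0),
    ]
    for line in run_all(tests):
        print(line)
```

The report keeps three things visibly separate: tests that passed or failed on their results, tests skipped for lack of files, and tests that could not run at all.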
AutomatedQA's coming AQtest does all of the above, as you might expect. It also does
much more; this is only one side of it. But that isn't the topic of this white paper.
Good Luck
Philippe Ranger
AutomatedQA Corp