TestComplete 4 Sneak Peek - OCR
By Jeff Young @ 12:50 AM
Here's a sneak preview of a new feature that will debut in the forthcoming TestComplete 4:
(No release date available yet. You'll be the first to know when it's announced.)
For most applications, TestComplete can 'read' the words and characters
displayed on-screen. This makes it easy to create tests that detect the
state of the application and react accordingly. Some applications are
difficult to test because they write text to the screen in a way that a
testing tool cannot read. We call them 'black-box' applications. Usually
they are hard to read because they 'paint' the text to the screen
themselves instead of using Windows to display the characters. Text
rendered this way is just dots to the computer and to testing
applications. That makes it difficult or impossible to create robust,
reliable tests for black-box applications. TestComplete 4 will include a
new feature that solves this problem for many applications: Optical
Character Recognition, or OCR.
What is OCR?
OCR translates images of printed text into computer-readable text.
TestComplete 4 can capture an image of a black-box application screen
and use OCR to 'read' the text on it and convert it to usable ASCII or
Unicode text. This text can be used to create solid, reliable tests.
OCR Is Scriptable
TestComplete 4's OCR feature is, of course, completely scriptable using the new elements, OCR and OCRObject. The new script object, OCR, has just one method: CreateObject. Pass OCR.CreateObject a captured screen image that contains the text to be recognized and it returns a new 'OCRObject' which is used to perform the text recognition.
A little bit about OCRObject
To start the character recognition process, we can call OCRObject.GetText or OCRObject.FindRectByText.
OCRObject.GetText takes no parameters and returns all OCR-readable text from the image.
OCRObject.FindRectByText takes a string parameter and tries to locate that text in the image. If it finds the text, it returns the image coordinates of the region where the text was found.
OCRObject.OCROptions provides access to OCR customization settings.
How about a script example?
Here is a simple, one-line TestComplete 4 script that takes an image of
the active window and posts all readable text found in it to the test
log. Examples for all five of TestComplete's scripting languages are
included.
VBScript:
Log.Message OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText
JScript:
Log.Message(OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText());
DelphiScript:
Log.Message(OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText);
C#Script and C++Script:
Log["Message"](OCR["CreateObject"](Sys["Desktop"]["ActiveWindow"])["GetText"]());
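Logging the text is only the first step. To make a test 'react accordingly', the recognized text usually has to be checked for an expected phrase. Here is a minimal sketch of such a check. The helper names normalizeText and containsText are made up for this example and are not part of TestComplete, and the assumption that OCR output may contain uneven spacing or line breaks is ours. A literal string stands in for the GetText result so the helpers can be tried outside TestComplete.

```javascript
// Hypothetical helpers, not part of the TestComplete API.

// Collapse runs of whitespace so spacing and line breaks in the
// recognized text do not affect the comparison.
function normalizeText(text) {
  return String(text).replace(/\s+/g, " ").replace(/^ | $/g, "");
}

// Return true when `expected` appears anywhere in the recognized text.
function containsText(recognized, expected) {
  return normalizeText(recognized).indexOf(normalizeText(expected)) !== -1;
}

// Inside TestComplete 4 the text would come from the call shown above:
//   var recognized = OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText();
// A sample string is used here instead.
var recognized = "Order  Entry\r\nStatus:   Ready";
var isReady = containsText(recognized, "Status: Ready"); // true
```

In a real test the result would decide what the script does next, for example whether to continue or to log an error.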
How does it work?
TestComplete 4 can recognize 52 lower-case and upper-case Latin letters,
10 digits, and 31 special characters in almost any font, size or style.
We've shown it can be done in a script with as little as one line of
code, but what's going on inside of TestComplete 4? How does it read
the text in the image? To successfully read on-screen text,
TestComplete 4 has to create a common ground between the installed
Windows fonts and the captured image of the black-box application text.
Trying to recognize every installed font on a Windows PC with dozens or
even hundreds of installed fonts would waste valuable processing time.
TestComplete 4 creates and uses 'font collections' to limit the
readable fonts to just the ones needed for the tested application. The
default font collection is Arial, Courier New, Times Roman, Fixed Sys,
System, and MS Sans Serif, each in five sizes and five styles. You'll
be able to create custom font collections with any combination of
installed fonts, sizes and styles.
To prepare the font collection for recognition, TestComplete 4 generates
an image of every recognizable character in all designated sizes and
styles for each font in the collection. It stores these character images
in a master character table that is later compared with the on-screen
image.
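To get a feel for why font collections matter, here is a rough count of the character images the default collection would need. The figures come from this article. The assumption that every font is rendered in all 25 size and style combinations, with one stored image per character, is ours and may not match what TestComplete 4 does internally.

```javascript
// Figures quoted in the article.
var characterCount = 52 + 10 + 31; // letters + digits + special characters
var fonts = ["Arial", "Courier New", "Times Roman",
             "Fixed Sys", "System", "MS Sans Serif"];
var sizesPerFont = 5;
var stylesPerFont = 5;

// Assumption: one image per (font, size, style, character) combination.
var fontVariants = fonts.length * sizesPerFont * stylesPerFont; // 150
var tableEntries = fontVariants * characterCount;               // 13950
```

Under these assumptions each font added to a collection contributes another 2,325 images, which is why limiting the collection to the fonts the tested application really uses pays off.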
A process called 'fragmentation' is used to prepare the screen image of
the black-box application for comparison. Fragmentation simplifies the
internal representation of the image, identifies the recognizable
elements, and helps separate the text fragments. It locates the
rectangular regions within the screen image and tries to find several
non-intersecting rectangular fragments, each with its own predominant
color. Then TestComplete 4 transforms the 'fragmented' screen image to a
binary representation: every pixel becomes either completely black or
completely white, with no shades of gray. The simple black-and-white
image of each character is the common ground used to compare the
contents of the font collection to the contents of the black-box
application screen. TestComplete 4 simplifies the elements and compares
every possible item. When a match is found, a character is 'read'.
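The idea of reducing a fragment to black and white and comparing it with stored character images can be illustrated with a toy example. This is only a sketch and not TestComplete 4's actual algorithm. It assumes the predominant color of a fragment is its background, that character images are the same size as the fragment, and that the closest template wins. All function names and the 3x3 'font' are invented for the illustration.

```javascript
// Most frequent pixel value in a fragment (assumed to be the background).
function predominantColor(fragment) {
  var counts = {}, best = null, bestCount = 0;
  for (var y = 0; y < fragment.length; y++) {
    for (var x = 0; x < fragment[y].length; x++) {
      var c = fragment[y][x];
      counts[c] = (counts[c] || 0) + 1;
      if (counts[c] > bestCount) { bestCount = counts[c]; best = c; }
    }
  }
  return best;
}

// Binary representation: 0 for background pixels, 1 for everything else.
function binarize(fragment) {
  var background = predominantColor(fragment), rows = [];
  for (var y = 0; y < fragment.length; y++) {
    var row = [];
    for (var x = 0; x < fragment[y].length; x++) {
      row.push(fragment[y][x] === background ? 0 : 1);
    }
    rows.push(row);
  }
  return rows;
}

// Number of positions at which two equally sized bitmaps differ.
function countDifferences(a, b) {
  var diff = 0;
  for (var y = 0; y < a.length; y++) {
    for (var x = 0; x < a[y].length; x++) {
      if (a[y][x] !== b[y][x]) diff++;
    }
  }
  return diff;
}

// Return the character whose template differs least from the bitmap.
function matchCharacter(bitmap, table) {
  var bestChar = null, bestDiff = Infinity;
  for (var ch in table) {
    if (!Object.prototype.hasOwnProperty.call(table, ch)) continue;
    var d = countDifferences(bitmap, table[ch]);
    if (d < bestDiff) { bestDiff = d; bestChar = ch; }
  }
  return bestChar;
}

// Toy 3x3 character table standing in for the master character table.
var table = {
  "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
  "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
  "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]]
};

// A fragment with navy text painted on a gray background.
var G = 0xC0C0C0, N = 0x000080;
var fragment = [[G, N, G], [G, N, G], [G, N, G]];
var read = matchCharacter(binarize(fragment), table); // "I"
```

A real engine also has to deal with characters of different sizes, anti-aliased edges, and fragments that contain more than one character, none of which this sketch attempts.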
The hard work needed to make OCR tick goes on inside the
TestComplete 4 engine, so we can just write a couple of lines of script
and get back all of the readable text or search for a specific string.
OCR in TestComplete 4 is going to make black-box application testing
much easier.