Pattern Hatching
Type Laundering
C++ Report, February 1997
Something's wrong. Things are getting ugly.
You're jumping through hoops trying to recover lost type information.
Over and over you find yourself using the
D-word---dynamic_cast. But you've got no choice, because the
silly framework you're using doesn't know about the extensions you've
made to its interfaces. This kind of suffering is a sure sign of a
design bug. The good news is that bugs have a way of turning into
features, in this case a rather useful one.
Picture a real-time control framework that defines an abstract base
class Event. An application based on this framework uses
subclasses of Event to model domain-specific events.
Different applications will need different kinds of events: those
generated by, say, a cruise missile will be rather different from
those of a vending machine (unless it's designed for particularly
tough neighborhoods).
Given the diversity of domain-specific events, the framework's designer
didn't even try to come up with the be-all and end-all Event
interface. Instead, Event defines just a couple of operations
that make sense for any and all kinds of events:
    virtual long timestamp() = 0;
    virtual const char* rep() = 0;
where timestamp defines the precise time of an event's
occurrence, and rep returns a low-level representation of the
event, perhaps a packet of bytes straight from the network or device
under control. It's the job of subclasses to define and implement more
specific and application-friendly operations.
Take the vending machine. Its
CoinInsertedEvent subclass
adds a Cents getCoin() operation that returns the value of
the coin a customer deposited. Another kind of event,
CoinReleaseEvent, gets instantiated when the customer wants
his or her money back. These operations and others would be
implemented using rep. Clients of these events could of
course use rep directly, assuming it's public. But there's
little reason to make it so: rep offers almost no
abstraction, and it makes clients work pretty hard to get at the
information they need. More likely, rep would be
protected---of interest only to subclasses, which use it to
implement more specialized interfaces.
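Here's a minimal sketch of that arrangement. The Cents typedef, the one-character packet layout, and the CoinInsertedEvent constructor are assumptions made up for illustration; a real framework would dictate its own representation.

```cpp
// Sketch only: Cents, the packet layout, and the constructor are
// illustrative assumptions, not part of any real framework.
typedef int Cents;

class Event {
public:
    virtual ~Event() { }
    virtual long timestamp() = 0;
protected:
    virtual const char* rep() = 0;   // raw packet, visible to subclasses only
};

class CoinInsertedEvent : public Event {
public:
    CoinInsertedEvent(long time, char coinCode) : _time(time) {
        _packet[0] = coinCode;       // pretend the device sent one byte
        _packet[1] = '\0';
    }
    virtual long timestamp() { return _time; }

    // Application-friendly operation implemented in terms of rep().
    Cents getCoin() {
        switch (rep()[0]) {
        case 'N': return 5;          // nickel
        case 'D': return 10;         // dime
        case 'Q': return 25;         // quarter
        default:  return 0;          // unrecognized coin
        }
    }
protected:
    virtual const char* rep() { return _packet; }
private:
    long _time;
    char _packet[2];
};
```

Clients call getCoin() and never see the packet; only subclasses can reach rep().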
DEALING WITH LOSS
There's a fundamental problem in all of this, however. It stems from
the inability to define a universal interface for events. The
framework doesn't and can't know anything about domain-specific
Event subclasses, because they're defined by application
programmers long after the framework was designed, developed, and
stamped onto CD-ROM. All the framework knows about events is that
they implement a bare-bones interface comprising timestamp
and rep operations.
That raises two questions:
- How does the framework create instances of domain-specific
  subclasses?
- How does application code access subclass-specific operations
  when all it gets from the framework is objects of type Event?
An answer to the first question lies in any one of several creational
patterns described in Design Patterns. For example, the
framework can define factory methods (from the Factory Method pattern)
that return instances of domain-specific Event subclasses.
Whenever it needs a new instance, it uses a factory method instead of
calling new. An application overrides these factory methods
to return domain-specific instances.
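A rough sketch of the Factory Method approach follows. EventSource, VendingEventSource, makeEvent, and the name() operation are hypothetical names chosen for this example; the point is only that framework code calls a virtual operation where it would otherwise call new.

```cpp
#include <string>

class Event {
public:
    virtual ~Event() { }
    virtual std::string name() const = 0;   // hypothetical, for demonstration
};

// Framework side: knows Event and nothing more specific.
class EventSource {                          // hypothetical framework class
public:
    virtual ~EventSource() { }
    Event* nextEvent() { return makeEvent(); }   // no `new SomeSubclass` here
protected:
    virtual Event* makeEvent() = 0;              // the factory method
};

// Application side: written long after the framework shipped.
class CoinReleaseEvent : public Event {
public:
    virtual std::string name() const { return "CoinRelease"; }
};

class VendingEventSource : public EventSource {
protected:
    virtual Event* makeEvent() { return new CoinReleaseEvent; }
};
```

The caller of nextEvent() owns the returned event in this sketch.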
If you don't want to have to subclass just to return domain-specific
events, then use the Prototype pattern. Prototype offers a
compositional alternative to Factory Method. By adding a
virtual
Event* copy() operation to the Event base class,
framework code can use event objects to create copies of themselves.
Then instead of writing
    Event* e = new CoinReleaseEvent;
(which the framework can't possibly do because it refers to a
domain-specific class), we write
    Event* e = prototype->copy();
where prototype is an instance of a type known to the
framework, namely Event. Because
copy is a
polymorphic operation, e can be an instance of any
Event subclass, domain-specific or not. The framework
implementor just has to make sure the prototype variable gets
initialized to an instance of the right Event subclass before
prototype gets used. That's something that an application
can do in its initialization phase or at any other time before the
framework calls prototype->copy().
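Sketched in code, the Prototype alternative might look like this. EventSource and setPrototype are hypothetical; the text above specifies only the copy() operation and a prototype variable.

```cpp
class Event {
public:
    virtual ~Event() { }
    virtual Event* copy() = 0;       // each subclass clones itself
};

// Application side.
class CoinReleaseEvent : public Event {
public:
    virtual Event* copy() { return new CoinReleaseEvent(*this); }
};

// Framework side: holds a prototype typed as plain Event.
class EventSource {                  // hypothetical framework class
public:
    EventSource() : _prototype(0) { }
    void setPrototype(Event* p) { _prototype = p; }   // done during initialization
    Event* nextEvent() {
        return _prototype ? _prototype->copy() : 0;   // polymorphic "new"
    }
private:
    Event* _prototype;               // not owned by the framework in this sketch
};
```

No subclass of EventSource is needed; the application just hands over an instance of the right Event subclass before the first call to nextEvent().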
So much for creating subclass-specific instances. Now for the second
question. Are there patterns for recovering type information from an
instance? More specifically, if the framework provides an operation
like
    virtual Event* nextEvent();
how does the application know which kind of event it gets so that it can
call the right subclass-specific operations?
Well, there's always the brute-force approach:
    Event* e = nextEvent();
    CoinInsertedEvent* ie;
    CoinReleaseEvent* re;
    // similar declarations for other kinds of events
    if (ie = dynamic_cast<CoinInsertedEvent*>(e)) {
        // call CoinInsertedEvent-specific operations on ie
    } else if (re = dynamic_cast<CoinReleaseEvent*>(e)) {
        // call CoinReleaseEvent-specific operations on re
    } else if (...) {
        // ...you get the idea
    }
It would be painful indeed to have to do this wherever the
application needs to handle an event from the framework. And that's
not the end of it: the pain intensifies later if and when we define a
new subclass of Event. There's got to be a better way.
The Visitor pattern [GHJV95] is the classic
technique for recovering lost type information without resorting to
dynamic casts. The first step in applying the pattern adds a
void
accept(EventVisitor*) operation to the Event base class,
where EventVisitor is the base class for objects that can visit
events. Since the framework defines the Event class, it must
also define the EventVisitor class---at which point we stumble
across another dilemma: What does EventVisitor's interface look
like?
According to the Visitor pattern, the abstract Visitor interface must
define visit operations for each kind of object a visitor can
visit. Okay, but what if the types of these objects are unknown to
the framework? Our visitor of vending machine events would need
operations like
    virtual void visit(CoinInsertedEvent*);
    virtual void visit(CoinReleaseEvent*);
    // and so forth, a visit operation for each domain-specific event
Obviously, these operations can't be defined by a framework class like
EventVisitor. Looks like even the Visitor pattern can't save
us from the dreaded dynamic_cast.
Sigh.
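For the record, here is roughly what Visitor would look like if every class lived in the application, where all the event types are known up front. EventCounter is a hypothetical visitor invented for the example. Notice that EventVisitor has to name both concrete event classes, which is exactly what a framework header can't do.

```cpp
class CoinInsertedEvent;
class CoinReleaseEvent;

class EventVisitor {
public:
    virtual ~EventVisitor() { }
    virtual void visit(CoinInsertedEvent*) = 0;   // one operation per
    virtual void visit(CoinReleaseEvent*) = 0;    // concrete event type
};

class Event {
public:
    virtual ~Event() { }
    virtual void accept(EventVisitor*) = 0;
};

class CoinInsertedEvent : public Event {
public:
    // `this` has static type CoinInsertedEvent*, which selects the overload.
    virtual void accept(EventVisitor* v) { v->visit(this); }
};

class CoinReleaseEvent : public Event {
public:
    virtual void accept(EventVisitor* v) { v->visit(this); }
};

class EventCounter : public EventVisitor {        // hypothetical visitor
public:
    EventCounter() : inserted(0), released(0) { }
    virtual void visit(CoinInsertedEvent*) { ++inserted; }
    virtual void visit(CoinReleaseEvent*) { ++released; }
    int inserted;
    int released;
};
```

The type comes back without a single cast, but only because EventVisitor and the events were written together.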
CELEBRATE ADVERSITY
Despite appearances, the point of this article is not to bemoan the
loss of type information but to use it to our advantage. Forget about
Event for now and consider a seemingly unrelated issue in the Memento
pattern from Design Patterns. (Not to worry---I'll examine a
radically different solution to Event's travails next time.)
Memento's intent is to capture and externalize an object's state so
that the object can be restored to that state at a later time. That
may sound easy, but I've left out an important constraint:
externalizing the state must be done without violating the object's
encapsulation. In other words, the object's internal state
should be available but not visible to other
objects. Contradictory, no?
Nope. A simple example will illustrate the distinction. As we
describe in the Iterator pattern [GHJV95], a
cursor is an iterator that does nothing but mark a position
in a traversal. During traversal, the structure being traversed
"advances" the cursor (that is, makes it point to the next element in
the traversal) and can "dereference" it (that is, return the element
it points to) on behalf of a client, like so:
    Structure s;
    Cursor c;
    for (s.first(c); s.more(c); s.next(c)) {
        Element e = s.element(c);
        // use Element e
    }
The cursor has no client-accessible operations. Only the structure
being traversed can access the cursor's internals. It deserves
exclusive privilege because the information in the cursor is actually
part of the structure's internal state. As such it must remain
encapsulated, and that's why the cursor is a memento. As for the
other participants in the pattern, the structure is the memento's
originator, and the client acts as caretaker.
The key to pulling this off is in implementing what amounts to a
two-faced object. The structure sees a wide interface that allows
access to state information. Other clients see a narrow or even
non-existent memento interface; to give them any access to the state
inside the memento would be to compromise the structure's
encapsulation. But how do we give an object two different interfaces
in C++?
A NEED IN FRIEND
The Memento pattern suggests using the
friend keyword. The
originator is a friend of the memento, permitting access to a wide
interface while denying access to other classes.
    class Cursor {
    public:
        Cursor () { _current = 0; }   // public so that clients can declare a Cursor
        virtual ~Cursor () { }
    private:
        friend class Structure;
        ListElem* current () { return _current; }    // gets _current
        void current (ListElem* e) { _current = e; } // sets _current
    private:
        ListElem* _current;
    };
In this scenario, Cursor keeps just a li'l ol' pointer. It
points to a ListElem, a class that
Structure uses
internally to represent nodes in a doubly-linked list.
ListElem objects maintain a pointer to the predecessor and
successor in the list along with a pointer to an Element
object. Structure operations manipulate
_current to
keep track of the point in the traversal. For example:
    class Structure {
        // ...
        virtual void first (Cursor& c) {
            c.current(_head);  // _head is the head of the linked list,
                               // which Structure keeps internally
        }
        virtual bool more (Cursor& c) {
            return c.current()->_next != 0;
        }
        virtual void next (Cursor& c) {
            c.current(c.current()->_next);  // set current to next ListElem*
        }
        virtual Element& element (Cursor& c) {
            return *c.current()->_element;  // _element points to an Element
        }
        // ...
    };
In sum, the Memento pattern lets the structure squirrel away just enough
sensitive information in the Cursor memento to mark the
current stage of a traversal.
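Pulling the pieces together, a compilable sketch of the friend-based version might read as follows. Several things here are assumptions of mine: Element is a typedef for int, the list is singly linked, the list ends in a sentinel node (so that more() can test _next as shown above), and prepend and sum are hypothetical helpers.

```cpp
typedef int Element;                 // stand-in element type (assumption)

struct ListElem {                    // list node, singly linked for brevity
    ListElem(Element* e, ListElem* n) : _element(e), _next(n) { }
    Element*  _element;
    ListElem* _next;
};

class Cursor {                       // the memento
public:
    Cursor() : _current(0) { }
    virtual ~Cursor() { }
private:
    friend class Structure;          // only the originator sees the wide interface
    ListElem* current() { return _current; }
    void current(ListElem* e) { _current = e; }
    ListElem* _current;
};

class Structure {                    // the originator
public:
    Structure() : _head(new ListElem(0, 0)) { }   // starts as just the sentinel
    virtual ~Structure() {
        while (_head) { ListElem* n = _head->_next; delete _head; _head = n; }
    }
    void prepend(Element* e) { _head = new ListElem(e, _head); }  // hypothetical

    virtual void first(Cursor& c) { c.current(_head); }
    virtual bool more(Cursor& c) { return c.current()->_next != 0; }
    virtual void next(Cursor& c) { c.current(c.current()->_next); }
    virtual Element& element(Cursor& c) { return *c.current()->_element; }
private:
    ListElem* _head;
};

// The caretaker: holds a Cursor but cannot look inside it.
int sum(Structure& s) {
    int total = 0;
    Cursor c;
    for (s.first(c); s.more(c); s.next(c)) {
        total += s.element(c);
    }
    return total;
}
```

Inside sum, writing c.current() would be a compile-time error; the cursor is opaque to everyone but Structure.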
Connoisseurs of friend may notice a potentially serious
shortcoming of this approach. Because friendship isn't inherited, a
Substructure subclass of
Structure doesn't have the
access privileges of its parent. In other words,
Substructure code can't access
Cursor's secret
interface.
This is no big deal if Substructure merely inherits its
cursor-handling operations from Structure. But if it needs
to override them or to implement other cursor-dependent functionality,
it won't be able to call the private Cursor operations.
Suppose Substructure keeps its own linked list of subelements
that should be included in traversals. In other words, when
next reaches the end of
Structure's linked list, it
advances to the head of Substructure's list transparently.
That'll require overriding next and setting the cursor's
_current member appropriately.
One work-around would be to define protected operations in
Structure that parallel
Cursor's interface, except
that they delegate their implementation to a cursor:
    class Structure {
        // ...
    protected:
        ListElem* current (Cursor& c) { return c.current(); }
        void current (Cursor& c, ListElem* e) { c.current(e); }
        // ...
    };
That effectively extends
Structure's privileges to its
subclasses. But parallel interfaces are usually a mistake; they're
ugly, they're redundant, and they make changing an interface that much
more work. If we can avoid doing it---or better yet, avoid
friend altogether---we'll probably end up thanking ourselves
in the long run.
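To make the work-around concrete, here's a sketch of a Substructure that splices its own list onto the end of its parent's traversal. As assumptions for the example, each list ends in a sentinel node, Element is an int, and prepend, prependSub, hop, destroy, and sum are hypothetical helpers.

```cpp
typedef int Element;

struct ListElem {
    ListElem(Element* e, ListElem* n) : _element(e), _next(n) { }
    Element*  _element;
    ListElem* _next;
};

class Cursor {
public:
    Cursor() : _current(0) { }
private:
    friend class Structure;          // friendship stops at Structure
    ListElem* current() { return _current; }
    void current(ListElem* e) { _current = e; }
    ListElem* _current;
};

class Structure {
public:
    Structure() : _head(new ListElem(0, 0)) { }
    virtual ~Structure() { destroy(_head); }
    void prepend(Element* e) { _head = new ListElem(e, _head); }
    virtual void first(Cursor& c) { c.current(_head); }
    virtual bool more(Cursor& c) { return c.current()->_next != 0; }
    virtual void next(Cursor& c) { c.current(c.current()->_next); }
    virtual Element& element(Cursor& c) { return *c.current()->_element; }
protected:
    // The parallel interface: forwards to Cursor on behalf of subclasses.
    ListElem* current(Cursor& c) { return c.current(); }
    void current(Cursor& c, ListElem* e) { c.current(e); }
    static void destroy(ListElem* e) {
        while (e) { ListElem* n = e->_next; delete e; e = n; }
    }
private:
    ListElem* _head;
};

class Substructure : public Structure {
public:
    Substructure() : _subTail(new ListElem(0, 0)), _subHead(_subTail) { }
    virtual ~Substructure() { destroy(_subHead); }
    void prependSub(Element* e) { _subHead = new ListElem(e, _subHead); }
    virtual void first(Cursor& c) { Structure::first(c); hop(c); }
    virtual void next(Cursor& c) { Structure::next(c); hop(c); }
private:
    // If the cursor sits on the parent's sentinel, continue in our own list.
    void hop(Cursor& c) {
        ListElem* e = current(c);    // c.current() would not compile here
        if (e->_next == 0 && e != _subTail) current(c, _subHead);
    }
    ListElem* _subTail;              // our sentinel; declared first so it is
    ListElem* _subHead;              // initialized before _subHead
};

int sum(Structure& s) {
    int total = 0;
    Cursor c;
    for (s.first(c); s.more(c); s.next(c)) total += s.element(c);
    return total;
}
```

It works, but every change to Cursor's private interface now has to be mirrored in Structure's protected one.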
EVENT'S LOSS IS CURSOR'S GAIN
Here's where we turn a design bug into a feature. I even have a name
for it: "type laundering." The idea is to define an abstract base
class for Cursor that includes only those aspects of its
interface that should be public, which in this case is just the
destructor:
    class Cursor {
    public:
        virtual ~Cursor () { }
    protected:
        Cursor () { }
    };
We protect the constructor to preclude instantiation, that is, to
ensure Cursor acts as an abstract class. We could do the
same thing merely by declaring the destructor pure virtual, but that
forces subclasses to define the destructor even if they don't need to.
Either way, subclasses define the privileged interface:
    class ListCursor : public Cursor {
    public:
        ListCursor () { _current = 0; }
        ListElem* current () { return _current; }
        void current (ListElem* e) { _current = e; }
    private:
        ListElem* _current;
    };
This arrangement means that
Structure operations that take a
Cursor as an argument must downcast it to a
ListCursor before they can access the extended interface:
    class Structure {
        // ...
        virtual void first (Cursor& c) {
            ListCursor* lc;
            if (lc = dynamic_cast<ListCursor*>(&c)) {
                lc->current(_head);  // _head is the head of the linked list,
                                     // which Structure keeps internally
            }
        }
        // ...
    };
The dynamic cast ensures that the structure will access and modify
ListCursor objects and only
ListCursor objects.
The final flourish to this design is how cursors get instantiated.
Obviously, clients can no longer instantiate Cursor or its
subclasses directly, since only a Structure (subclass) knows
the kind of cursor it uses. Instead we use a variation on Factory
Method to abstract the instantiation process:
    class Structure {
    public:
        // ...
        virtual Cursor* cursor () { return new ListCursor; }
        // ...
    };
Because cursor() returns something of type
Cursor*,
clients can't access the subclass-specific operation unless they start
(dynamic) casting randomly to figure out the type---and even that's
not an option if ListCursor isn't exported in a header file.
Meanwhile, Structure subclasses are free to redefine
cursor-manipulating operations like more,
next, and
element.
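Here is one way the whole type-laundered design could fit together, as a sketch rather than a definitive implementation. My assumptions: Element is an int, the list is singly linked with a sentinel at the end, and prepend, sum, and ForeignCursor are hypothetical. I've also chosen to have element() use the reference form of dynamic_cast, which throws std::bad_cast when handed the wrong kind of cursor.

```cpp
#include <typeinfo>                  // std::bad_cast

typedef int Element;

struct ListElem {
    ListElem(Element* e, ListElem* n) : _element(e), _next(n) { }
    Element*  _element;
    ListElem* _next;
};

class Cursor {                       // the laundered type: nothing to call
public:
    virtual ~Cursor() { }
protected:
    Cursor() { }
};

class ListCursor : public Cursor {   // ConcreteMemento; need not be exported
public:
    ListCursor() : _current(0) { }
    ListElem* current() { return _current; }
    void current(ListElem* e) { _current = e; }
private:
    ListElem* _current;
};

class ForeignCursor : public Cursor { };   // hypothetical: some other structure's cursor

class Structure {
public:
    Structure() : _head(new ListElem(0, 0)) { }
    virtual ~Structure() {
        while (_head) { ListElem* n = _head->_next; delete _head; _head = n; }
    }
    void prepend(Element* e) { _head = new ListElem(e, _head); }

    virtual Cursor* cursor() { return new ListCursor; }   // factory method

    virtual void first(Cursor& c) {
        if (ListCursor* lc = dynamic_cast<ListCursor*>(&c)) lc->current(_head);
    }
    virtual bool more(Cursor& c) {
        ListCursor* lc = dynamic_cast<ListCursor*>(&c);
        return lc && lc->current() && lc->current()->_next != 0;
    }
    virtual void next(Cursor& c) {
        ListCursor* lc = dynamic_cast<ListCursor*>(&c);
        if (lc && lc->current()) lc->current(lc->current()->_next);
    }
    virtual Element& element(Cursor& c) {
        return *dynamic_cast<ListCursor&>(c).current()->_element;
    }
private:
    ListElem* _head;
};

// The caretaker sees only Cursor* and must remember to delete it.
int sum(Structure& s) {
    int total = 0;
    Cursor* c = s.cursor();
    for (s.first(*c); s.more(*c); s.next(*c)) total += s.element(*c);
    delete c;
    return total;
}
```

A cursor of the wrong concrete type is simply ignored by first() and reported as exhausted by more(), and there's no friend declaration anywhere.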
To see how the type laundering-based implementation differs from that
shown in the Memento pattern, here's a revised Structure diagram:
The main difference here is the introduction of a ConcreteMemento
subclass that adds the privileged interface to the bare-bones Memento
interface. Originators know they're dealing with concrete
mementos---they instantiate them, after all. But caretakers can do
next to nothing with mementos, because all they see are the bare
bones. Although this diagram can't show it, type laundering absolves
a C++ implementation from using friend and having to work
around its shortcomings.
Amazing how a little type laundering can clean up your design.
(Cough!)
THANKS FOR THE MEMORY LEAKS
Okay, so I lied when I described the implementation of
cursor() as the "final flourish" to this design. If we were
implementing this approach in a garbage collected language, I would
have been telling the truth. But in the context of C++, I've made the
client responsible for deleting the cursor that cursor()
creates---providing ample opportunities for memory leaks. Returning a
Cursor* rather than a
Cursor& also makes for
unsightly dereferences when we pass the heap-allocated cursor to
operations like first and
more.
We can get around both problems by applying Dijkstra's panacea: adding
a level of indirection. Specifically, we'll implement a variant of
Cope's Envelope-Letter idiom [Coplien92].
Instead of having Cursor as our abstract Memento base class,
we'll define a "letter" class CursorImp to act in its place:
    class CursorImp {
    public:
        virtual ~CursorImp () { }
        void ref () { ++_count; }
        void unref () { if (--_count == 0) delete this; }
    protected:
        CursorImp () { _count = 0; }
    private:
        int _count;
    };
Like most "letters" in the Envelope-Letter idiom,
CursorImp
objects are reference-counted. Concrete subclasses of
CursorImp are ConcreteMementos; that is, they define
privileged interfaces. Thus our ListCursor example becomes
    class ListCursorImp : public CursorImp {
    public:
        ListCursorImp () { _current = 0; }
        ListElem* current () { return _current; }      // same privileged
        void current (ListElem* e) { _current = e; }   // operations as before
    private:
        ListElem* _current;
    };
Now comes the key difference between this approach and the original:
Clients don't deal with CursorImp objects directly. Instead,
we introduce a concrete Cursor class to act as an "envelope"
for our CursorImp "letter":
    class Cursor {
    public:
        Cursor (CursorImp* i) { _imp = i; _imp->ref(); }
        // takes a const reference so that a Cursor can be copied from
        // the temporary that cursor() returns
        Cursor (const Cursor& c) { _imp = c.imp(); _imp->ref(); }
        ~Cursor () { _imp->unref(); }
        CursorImp* imp () const { return _imp; }
    private:
        static void* operator new (size_t) { return 0; }
        static void operator delete (void *) { }
        Cursor& operator = (Cursor& c) { return c; }
        // disallow heap allocation and assignment for
        // simplicity and to avert common mishaps
    private:
        CursorImp* _imp;
    };
As an envelope, Cursor aggregates an instance of a
CursorImp subclass.
Cursor also sees to it that the
instance is reference-counted correctly. An originator employs these
classes to return what appears to be a purely stack-allocated
Cursor object:
    class Structure {
    public:
        // ...
        virtual Cursor cursor () { return Cursor(new ListCursorImp); }
        // ...
    };
cursor() returns a
Cursor, not a reference thereto,
thereby ensuring that clients will invoke the copy constructor:
    Structure s;
    Cursor c = s.cursor();  // sole modification to earlier example
    for (s.first(c); s.more(c); s.next(c)) {
        Element e = s.element(c);
        // use Element e
    }
Note there's no need to dereference
c, as is the case with
the original, pointer-returning version of cursor().
The only other change we need to make is in the code that does the
dynamic_cast to recover the ConcreteMemento:
    class Structure {
        // ...
        virtual void first (Cursor& c) {
            ListCursorImp* imp;
            if (imp = dynamic_cast<ListCursorImp*>(c.imp())) {
                imp->current(_head);
            }
        }
        // ...
    };
True, this is a bit more complicated for the Memento implementer than
the non-reference-counted version. But it makes the type
laundering-based version as easy for clients to use as the
friend-based version---which had some implementation
intricacies of its own.
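For anyone who wants to watch the reference counts at work, here's a self-contained sketch of the envelope-letter version. The static live counter is instrumentation I've added for the example, as are the int Element, the sentinel-terminated list, and the prepend, list, and sum helpers. To keep it short I've left out the private operator new and operator delete and merely declared the assignment operator without defining it.

```cpp
#include <typeinfo>                  // std::bad_cast

typedef int Element;

struct ListElem {
    ListElem(Element* e, ListElem* n) : _element(e), _next(n) { }
    Element*  _element;
    ListElem* _next;
};

class CursorImp {                    // the "letter"
public:
    virtual ~CursorImp() { --live; }
    void ref()   { ++_count; }
    void unref() { if (--_count == 0) delete this; }
    static int live;                 // letters currently alive (sketch only)
protected:
    CursorImp() : _count(0) { ++live; }
private:
    int _count;
};
int CursorImp::live = 0;

class ListCursorImp : public CursorImp {
public:
    ListCursorImp() : _current(0) { }
    ListElem* current() { return _current; }
    void current(ListElem* e) { _current = e; }
private:
    ListElem* _current;
};

class Cursor {                       // the "envelope"
public:
    Cursor(CursorImp* i) : _imp(i) { _imp->ref(); }
    Cursor(const Cursor& c) : _imp(c._imp) { _imp->ref(); }
    ~Cursor() { _imp->unref(); }
    CursorImp* imp() const { return _imp; }
private:
    Cursor& operator=(const Cursor&);   // declared, never defined: no assignment
    CursorImp* _imp;
};

class Structure {
public:
    Structure() : _head(new ListElem(0, 0)) { }
    virtual ~Structure() {
        while (_head) { ListElem* n = _head->_next; delete _head; _head = n; }
    }
    void prepend(Element* e) { _head = new ListElem(e, _head); }
    virtual Cursor cursor() { return Cursor(new ListCursorImp); }
    virtual void first(Cursor& c) {
        if (ListCursorImp* i = list(c)) i->current(_head);
    }
    virtual bool more(Cursor& c) {
        ListCursorImp* i = list(c);
        return i && i->current() && i->current()->_next != 0;
    }
    virtual void next(Cursor& c) {
        ListCursorImp* i = list(c);
        if (i && i->current()) i->current(i->current()->_next);
    }
    virtual Element& element(Cursor& c) {
        return *dynamic_cast<ListCursorImp&>(*c.imp()).current()->_element;
    }
protected:
    static ListCursorImp* list(Cursor& c) {
        return dynamic_cast<ListCursorImp*>(c.imp());
    }
private:
    ListElem* _head;
};

int sum(Structure& s) {
    int total = 0;
    Cursor c = s.cursor();           // no delete needed anywhere
    for (s.first(c); s.more(c); s.next(c)) total += s.element(c);
    return total;
}
```

Once sum returns, CursorImp::live is back to zero: the last envelope to go out of scope takes the letter with it.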
Still, I've never been wild about taking out the garbage. ;-)
FEEDBACK
This month brings us (belated) words of wisdom from Kevlin Henney:
I meant to write this a lot sooner ("Visiting Rights," in fact!), but
time got the better of me. The "quirk" you mentioned in the footnote
on page 24 of [the September 1996 C++ Report regarding] C++'s
overloading does not require having to overload all versions of
Visit or that you abandon overloading the
Visit
member.
As well as supporting namespace concepts, the
using
declaration allows you to inject names from a base class into the
current class for overloading:
    class NewPresenter : public Presenter {
    public:
        using Presenter::Visit;        // pull in all Visit functions
                                       // for overloading
        virtual void Visit(Subject*);  // override Subject* variant
    };
This maintains the regularity that overloading offers. It is
non-invasive as users are not forced to remember what names or
convention to use for the Visitor function; it allows a newer
release of Presenter to incorporate changes without affecting
client code.
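A small sketch shows the effect Kevlin describes. The Document class and the last member are hypothetical additions of mine, there only so the example has a second overload and something to observe.

```cpp
class Subject { };
class Document { };                  // hypothetical second visited type

class Presenter {
public:
    Presenter() : last(0) { }
    virtual ~Presenter() { }
    virtual void Visit(Subject*)  { last = 1; }
    virtual void Visit(Document*) { last = 2; }
    int last;                        // records which Visit ran (sketch only)
};

class NewPresenter : public Presenter {
public:
    using Presenter::Visit;          // re-expose every base Visit overload
    virtual void Visit(Subject*) { last = 10; }   // override just this one
};
```

Remove the using declaration and a call to Visit with a Document* on a NewPresenter object no longer compiles, because the Subject* override hides every base-class Visit.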
ACKNOWLEDGMENTS
Thanks to Erich and Ralph for their brutally incisive commentary!
REFERENCES
[Coplien92] Coplien, J. Advanced C++ Programming Styles and
Idioms, Addison-Wesley, Reading, MA, 1992.
[GHJV95] Gamma, E., R. Helm, R. Johnson, J. Vlissides. Design
Patterns: Elements of Reusable Object-Oriented Software,
Addison-Wesley, Reading, MA, 1995.
©1997 by John Vlissides. All Rights Reserved.