Section 20:
[20.4] What happens in the hardware when I call a virtual function? How many layers of indirection are there? How much overhead is there?
This is a drill-down of the previous FAQ. The answer is entirely compiler-dependent, so your mileage may vary, but most C++ compilers use a scheme similar to the one presented here. Let's work an example. Suppose class Base has 5 virtual functions: virt0() through virt4().
// Your original C++ source code
class Base {
public:
  virtual arbitrary_return_type virt0(...arbitrary params...);
  virtual arbitrary_return_type virt1(...arbitrary params...);
  virtual arbitrary_return_type virt2(...arbitrary params...);
  virtual arbitrary_return_type virt3(...arbitrary params...);
  virtual arbitrary_return_type virt4(...arbitrary params...);
  ...
};
Step #1: the compiler builds a static table containing 5 function-pointers,
burying that table into static memory somewhere. Many (not all) compilers
define this table while compiling the .cpp that defines Base's first
non-inline virtual function. We call that table the v-table; let's pretend
its technical name is Base::__vtable. If a function pointer fits into
one machine word on the target hardware platform, Base::__vtable will
end up consuming 5 hidden words of memory. Not 5 per instance, not 5 per
function; just 5. It might look something like the following pseudo-code:
// Pseudo-code (not C++, not C) for a static table defined within file Base.cpp
// Pretend FunctionPtr is a generic pointer to a generic member function
// (Remember: this is pseudo-code, not C++ code)
FunctionPtr Base::__vtable[5] = {
  &Base::virt0, &Base::virt1, &Base::virt2, &Base::virt3, &Base::virt4
};
Step #2: the compiler adds a hidden pointer (typically also a machine-word) to
each object of class Base. This is called the v-pointer. Think of
this hidden pointer as a hidden data member, as if the compiler rewrites your
class to something like this:
// Your original C++ source code, plus the member the compiler adds
class Base {
public:
  ...
  FunctionPtr* __vptr;  // ← supplied by the compiler, hidden from the programmer
  ...
};
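The per-object cost of step #2 can be observed with sizeof. The following is a minimal sketch, not a guarantee: the three types (Plain, OneVirtual, FiveVirtuals) are hypothetical, and the expectations in check_typical_layout() hold on typical compilers that use the scheme described here, but nothing in the language requires them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types, used only to compare object sizes.
struct Plain {
    int x;
    void f0() {}                 // non-virtual: adds nothing to the object
};

struct OneVirtual {
    int x;
    virtual void v0() {}         // first virtual: object gains a hidden v-pointer
};

struct FiveVirtuals {
    int x;
    virtual void v0() {}
    virtual void v1() {}
    virtual void v2() {}
    virtual void v3() {}
    virtual void v4() {}         // more virtuals grow the v-table, not the object
};

// Typical expectations (compiler-dependent, as everything in this FAQ).
void check_typical_layout() {
    // Going from 1 to 5 virtual functions does not change the object size.
    assert(sizeof(OneVirtual) == sizeof(FiveVirtuals));
    // The hidden v-pointer makes the object larger than the plain version.
    assert(sizeof(OneVirtual) > sizeof(Plain));
}
```

The exact difference between sizeof(OneVirtual) and sizeof(Plain) is usually one pointer plus any alignment padding, so it varies by platform.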
Step #3: the compiler initializes this->__vptr within each
constructor. The idea is to cause each object's v-pointer to point at its
class's v-table, as if it adds the following instruction in each constructor's
init-list:
Base::Base(...arbitrary params...)
  : __vptr(&Base::__vtable[0])  // ← supplied by the compiler, hidden from the programmer
  ...
{
  ...
}
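Steps #1 through #3 can be imitated by hand in ordinary C++, which may make the pseudo-code easier to follow. This is only a sketch under stated assumptions: BaseEmu, FunctionPtr, kBaseVtable, the base_virtN functions and call_slot are all hypothetical names, each "virtual function" takes no parameters beyond the object and simply returns its slot number, and a real compiler is free to do something different.

```cpp
#include <cassert>

struct BaseEmu;                              // hypothetical stand-in for class Base
using FunctionPtr = int (*)(BaseEmu*);       // each slot receives the object as "this"

// Stand-ins for Base::virt0() through Base::virt4(); each returns its slot number.
int base_virt0(BaseEmu*) { return 0; }
int base_virt1(BaseEmu*) { return 1; }
int base_virt2(BaseEmu*) { return 2; }
int base_virt3(BaseEmu*) { return 3; }
int base_virt4(BaseEmu*) { return 4; }

// Step #1: one static table for the whole class, 5 function-pointers in total.
const FunctionPtr kBaseVtable[5] = {
    base_virt0, base_virt1, base_virt2, base_virt3, base_virt4
};

struct BaseEmu {
    const FunctionPtr* vptr;                 // Step #2: one hidden pointer per object
    BaseEmu() : vptr(&kBaseVtable[0]) {}     // Step #3: constructor points it at the table
};

// Calls the function stored in the given slot of the object's table.
int call_slot(BaseEmu* p, int slot) {
    assert(0 <= slot && slot < 5);
    return p->vptr[slot](p);
}
```

Two BaseEmu objects share the same table: their vptr members compare equal, and each object carries only that one pointer.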
Now let's work out a derived class. Suppose your C++ code defines class
Der that inherits from class Base. The compiler repeats steps
#1 and #3 (but not #2). In step #1, the compiler creates a hidden v-table,
keeping the same function-pointers as in Base::__vtable but replacing
those slots that correspond to overrides. For instance, if Der
overrides virt0() through virt2() and inherits the others
as-is, Der's v-table might look something like this (pretend
Der doesn't add any new virtuals):
// Pseudo-code (not C++, not C) for a static table defined within file Der.cpp
// Pretend FunctionPtr is a generic pointer to a generic member function
// (Remember: this is pseudo-code, not C++ code)
FunctionPtr Der::__vtable[5] = {
  &Der::virt0, &Der::virt1, &Der::virt2,   // overridden in Der
  &Base::virt3, &Base::virt4               // inherited as-is
};
In step #3, the compiler adds a similar pointer-assignment at the beginning of
each of Der's constructors. The idea is to change each Der
object's v-pointer so it points at its class's v-table. (This is not a second
v-pointer; it's the same v-pointer that was defined in the base class,
Base; remember, the compiler does not repeat step #2 in class
Der.)
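One visible consequence of step #3 running first in Base's constructor and then again in Der's constructor is that a virtual call made during construction reaches the version belonging to the class whose constructor is currently running. That behavior is guaranteed by the language; re-pointing the v-pointer is simply how typical compilers achieve it. The sketch below uses simplified stand-ins for Base and Der, and name() is a hypothetical virtual function added only to make the dispatch target visible.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the FAQ's Base (hypothetical members).
struct Base {
    std::string seen_in_base_ctor;
    Base() { seen_in_base_ctor = name(); }  // runs before Der's constructor body
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

// Simplified stand-in for the FAQ's Der (hypothetical members).
struct Der : Base {
    std::string seen_in_der_ctor;
    Der() { seen_in_der_ctor = name(); }    // by now the object dispatches as a Der
    std::string name() const override { return "Der"; }
};

void check_construction_order() {
    Der d;
    Base* p = &d;
    assert(d.seen_in_base_ctor == "Base");  // Base's constructor reached Base::name()
    assert(d.seen_in_der_ctor == "Der");    // Der's constructor reached Der::name()
    assert(p->name() == "Der");             // fully constructed object: Der::name()
}
```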
Finally, let's see how the compiler implements a call to a virtual function. Your code might look like this:
// Your original C++ code
void mycode(Base* p)
{
  p->virt3();
}
The compiler has no idea whether this is going to call Base::virt3()
or Der::virt3() or perhaps the virt3() method of another
derived class that doesn't even exist yet. It only knows for sure that you
are calling virt3() which happens to be the function in slot #3 of the
v-table. It rewrites that call into something like this:
// Pseudo-code that the compiler generates from your C++
void mycode(Base* p)
{
  p->__vptr[3](p);
}
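The rewritten call can be tried out with a hand-built emulation of both v-tables. Again this is a sketch, not what any particular compiler emits: Obj, FunctionPtr, kBaseVtable, kDerVtable, the base_virtN and der_virtN functions and mycode_emulated are hypothetical names, and each stand-in function just returns a string naming itself.

```cpp
#include <cassert>
#include <string>

struct Obj;                                  // hypothetical stand-in for a Base or Der object
using FunctionPtr = std::string (*)(Obj*);   // each slot receives the object as "this"

std::string base_virt0(Obj*) { return "Base::virt0"; }
std::string base_virt1(Obj*) { return "Base::virt1"; }
std::string base_virt2(Obj*) { return "Base::virt2"; }
std::string base_virt3(Obj*) { return "Base::virt3"; }
std::string base_virt4(Obj*) { return "Base::virt4"; }
std::string der_virt0(Obj*)  { return "Der::virt0"; }
std::string der_virt1(Obj*)  { return "Der::virt1"; }
std::string der_virt2(Obj*)  { return "Der::virt2"; }

// One table per class; Der replaces slots 0..2 and keeps slots 3..4.
const FunctionPtr kBaseVtable[5] = {
    base_virt0, base_virt1, base_virt2, base_virt3, base_virt4
};
const FunctionPtr kDerVtable[5] = {
    der_virt0, der_virt1, der_virt2, base_virt3, base_virt4
};

struct Obj {
    const FunctionPtr* vptr;                 // the emulated v-pointer
};

// Emulates the rewritten mycode(): p->virt3() becomes p->vptr[3](p).
std::string mycode_emulated(Obj* p) {
    const FunctionPtr* table = p->vptr;      // load #1: the v-pointer, from the object
    FunctionPtr fn = table[3];               // load #2: slot 3, from the table
    return fn(p);                            // indirect call, passing p as "this"
}
```

The caller never needs to know which class the object belongs to: the same three operations reach Base::virt3 for either table here, and slot 0 reaches Der::virt0 only when the object's vptr points at kDerVtable.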
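On typical hardware, the machine code is two loads plus a call:
1. The first load fetches the v-pointer from the object that p points to.
2. The second load fetches the function pointer stored in slot #3 of the v-table that the v-pointer points to.
3. The call instruction jumps to the address just loaded, passing p as the this pointer.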
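Conclusions:
- The space cost is one hidden v-pointer per object (regardless of how many virtual functions the class has) plus one v-table per class.
- A virtual call costs two loads plus an indirect call, no matter how deep the inheritance goes: Der's v-table has the same layout as Base's, so the lookup is always the same.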
Caveat: I've intentionally ignored multiple inheritance, virtual inheritance and RTTI. Depending on the compiler, these can make things a little more complicated. If you want to know about these things, DO NOT EMAIL ME, but instead ask comp.lang.c++.
Caveat: Everything in this FAQ is compiler-dependent. Your mileage may vary.