Document Number: | JTC 1/SC22/WG21/N1751 J16/05-0011 |
Date: | 2005-01-14 |
Project: | JTC1.22.32 Programming Language C++ Evolution Working Group |
Reference: | ISO/IEC IS 14882:2003(E) |
Abstract
Reflection is a useful technique in many applications. Quite a number of people therefore think that some mechanisms supporting reflection should be added to the next version of C++.
This paper presents several typical applications of reflection, several different aspects of reflection and several different currently existing approaches to reflection in C++ and other languages. Then it describes how the different applications can be implemented based on the different kinds of reflection.
Note: This paper was put together in a rush. It is not a fully researched academic paper or a complete survey of all options and existing approaches. It does not meet formal academic standards and sometimes uses sloppy terminology.
Also, this paper is not an actual proposal. It provides some background on reflection and is intended to serve as a starting point for the discussion about reflection and how it can be added to C++.
This section simply presents some typical applications of reflection. It does not present any solutions or realizations, but is intended to outline the problem domain for which reflection is assumed to be a solution. The later sections then discuss whether and how these applications can be realized using the different kinds of reflection.
A typical application of reflection is externalization of all kinds:
(De-)Serialization to/from a flat file
Transmission through a stream connection
Persistence on an external DB
The latter example has some interesting details. If the external DB is table based, the externalization often consists of two parts: a first part to define the DB schema, i.e. to create the table definitions in some form, and a second part to actually write objects to a table (or read from a table). As example, let's have
    class JournalEntry
    {
        // ...
    private:
        string description;
        Date valueDate;
        Money amount;
    };
For the first part, we would have something like createTableDef(ofile, JournalEntry), which then should write into ofile something like
    create table journal_entry
    (
        description varchar(80),
        valueDate date,
        amount numeric(17,2)
    )
In the second part, a call like
    JournalEntry curEntry;
    // ...
    myDb.store(curEntry);
should actually store the user object in the DB.
While this second part typically poses no special problems, given the usual reflection facilities, the first part requires information not directly available from the C++ code above. For example, the limit of 80 characters for description in the table definition has to be provided somehow. Also, the mapping of the C++ class Date to the DB type date, though it seems simple, requires a lot of information not directly available to the compiler; the same holds for the mapping from Money to numeric.
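The first part can be sketched without any reflection facility if the missing information is supplied by hand. The following is a minimal sketch under that assumption, not a proposed interface: the type ColumnDef, the function journalEntryColumns and the three-argument form of createTableDef are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical hand-written meta-data for one data member.
// A reflection facility could derive the member name, but the
// DB type (e.g. the length limit) still has to come from the programmer.
struct ColumnDef {
    std::string name;    // name of the data member and of the column
    std::string dbType;  // DB type, e.g. "varchar(80)"
};

// Hypothetical description of JournalEntry, maintained by hand.
std::vector<ColumnDef> journalEntryColumns() {
    std::vector<ColumnDef> cols;
    cols.push_back(ColumnDef{"description", "varchar(80)"});
    cols.push_back(ColumnDef{"valueDate", "date"});
    cols.push_back(ColumnDef{"amount", "numeric(17,2)"});
    return cols;
}

// Writes a "create table" statement for the given columns.
void createTableDef(std::ostream& out, const std::string& table,
                    const std::vector<ColumnDef>& cols) {
    assert(!cols.empty());  // a table needs at least one column
    out << "create table " << table << " ( ";
    for (std::size_t i = 0; i < cols.size(); ++i) {
        if (i != 0) out << ", ";
        out << cols[i].name << ' ' << cols[i].dbType;
    }
    out << " )";
}

// Convenience wrapper returning the statement as a string.
std::string journalEntryTableDef() {
    std::ostringstream out;
    createTableDef(out, "journal_entry", journalEntryColumns());
    return out.str();
}
```

The hand-maintained table is exactly the part that has to be kept in sync with the class definition manually; a reflection mechanism combined with programmer-supplied meta-data would aim to remove that duplication.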
Quite a lot of general Java applications exist that provide a static analysis of the source code. They all build on published, standardized interfaces to collect their information. For C++, the number of such tools is much smaller, one of the reasons being the lack of such an interface. Some of these tools provide simple metrics information, others create graphical representations of the code, and still others create automated tests based on class definitions, like JCrasher.
Another category of tools provides not only useful information based on the source code, but also insight into runtime structures. Such tools help to explore unknown program code and often even support runtime debugging. An example of such a tool is eDOBS. Other tools create test cases based on the analysis of the runtime behaviour of a program, like Sabicu.
So-called aspect-oriented mechanisms provide interesting solutions to some programming problems. Common to these approaches is that they provide mechanisms to act on the edges of a call graph, i.e. some specific actions are performed before a function is called or after it returns. Well-known and very general examples of this approach are AspectJ and AspectC++, but the same approach is found in tools like profilers or tracers.
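Acting on the edges of a call graph can be imitated by hand with a wrapper that runs actions before and after a call. The following is only a sketch of the idea and not how AspectJ or AspectC++ work: those tools weave such actions into existing code, whereas here the caller has to go through the wrapper explicitly. All names (traceLog, TraceAdvice, callWithAdvice) are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical trace log; a real tracer or profiler would record more.
std::vector<std::string>& traceLog() {
    static std::vector<std::string> log;
    return log;
}

// Guard object: the constructor plays the role of the "before" action,
// the destructor the role of the "after" action.
class TraceAdvice {
public:
    explicit TraceAdvice(const std::string& name) : name_(name) {
        assert(!name.empty());
        traceLog().push_back("enter " + name_);
    }
    ~TraceAdvice() { traceLog().push_back("leave " + name_); }
private:
    std::string name_;
};

// Calls f(args...) with the actions wrapped around the call.
template <typename F, typename... Args>
auto callWithAdvice(const std::string& name, F f, Args&&... args)
    -> decltype(f(std::forward<Args>(args)...)) {
    TraceAdvice advice(name);
    return f(std::forward<Args>(args)...);
}

// Example function to be traced.
int add(int a, int b) { return a + b; }
```

A call such as callWithAdvice("add", add, 2, 3) records one entry before and one after the call. What a reflection mechanism with modification facilities would add is the ability to apply this to all calls of a function without touching the call sites.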
Reflection can be categorized along two orthogonal design principles:
the time at which reflection is provided, and
the functionality provided by the reflection mechanism.
Compile-time reflection generally provides mechanisms at the source code level. These mechanisms provide information that can be directly derived from the source code, i.e. information about types and their definitions, variables and executable code like expressions or control structures.
This information can then be used to inject or transform code, to instantiate templates or to generate external data: from simple metrics information to full-fledged representations of the source code entities. This external data together with some injected code can then be used to provide such information at runtime.
An interesting question in C++ is: When is compile-time? A rough distinction of different phases during compilation could be made by separating preprocessing, template instantiation and code generation. (This is not really an accurate description of the C++ compilation model, but sufficient for this discussion.) Is information available on preprocessor macros or physical locations, i.e. source file names and line numbers? More interestingly, is information available about template instantiations and which function of an overload set is called in an expression, or what automatic conversions are applied? A related question is: Is reflection recursive, i.e. is e.g. injected code itself a possible object for reflection?
Most existing approaches for C++ (see below) are realized as pre-compilers that first run an external preprocessor. So while physical location information is typically preserved, information about preprocessor macros is not available. With this approach, information about template instantiations or overload resolution is not available either. A native reflection mechanism could provide such information, though some of it only at link time (due to export).
Runtime reflection provides mechanisms during program execution. Information available generally includes what objects currently exist of a specific class or which variables have which values. But it also comprises typical debugging information like stack frames or even contents of processor registers.
The manipulation mechanisms include changing values, calling functions or creation of new instances of a type, but also modification of existing functions or classes or addition of new ones. Related to debugging, runtime reflection also provides notification mechanisms for events like calling or leaving a function or creation or deletion of an object.
In many existing approaches, (limited) runtime reflection is provided by creating a meta-data repository and injecting respective code using compile-time reflection.
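A very reduced form of such a meta-data repository can be sketched as a registry that maps class names to factory functions. In this sketch the registration code is written by hand; it stands in for what a compile-time tool would inject for each class. The names Reflectable, ClassRepository and makeJournalEntry are assumptions made for illustration only.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Common base so the repository can hand out objects of any registered class.
struct Reflectable {
    virtual ~Reflectable() {}
    virtual std::string className() const = 0;
};

// Hypothetical meta-data repository: maps class names to factory functions.
class ClassRepository {
public:
    typedef std::unique_ptr<Reflectable> (*Factory)();

    static ClassRepository& instance() {
        static ClassRepository repo;  // created on first use
        return repo;
    }
    void add(const std::string& name, Factory f) {
        assert(f != nullptr);
        factories_[name] = f;
    }
    // Creates a new object by class name; returns null for unknown names.
    std::unique_ptr<Reflectable> create(const std::string& name) const {
        std::map<std::string, Factory>::const_iterator it = factories_.find(name);
        if (it == factories_.end()) return std::unique_ptr<Reflectable>();
        return it->second();
    }
private:
    std::map<std::string, Factory> factories_;
};

// Everything below stands in for code a compile-time tool would inject.
struct JournalEntry : Reflectable {
    std::string className() const override { return "JournalEntry"; }
};

std::unique_ptr<Reflectable> makeJournalEntry() {
    return std::unique_ptr<Reflectable>(new JournalEntry);
}

// Registration runs during static initialization, before main().
struct JournalEntryRegistration {
    JournalEntryRegistration() {
        ClassRepository::instance().add("JournalEntry", &makeJournalEntry);
    }
};
static JournalEntryRegistration journalEntryRegistration;
```

With this in place, ClassRepository::instance().create("JournalEntry") creates an object from a name known only at runtime. The reflection is limited to exactly what has been registered.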
As already noted above, compile-time and runtime are neither exactly specified here nor do they describe all possible points at which reflection is possible. At least two more phases can be usefully distinguished, namely link time and load time.
Reflection can either be a purely informational facility or can provide manipulation facilities. Introspection is generally the informational part of reflection, the mechanism to examine all kinds of structure of a program. At compile-time, introspection provides information such as what base classes or what members a class has, but also what statements are in a function definition. At runtime, introspection allows asking an object about its type, querying the values of an object's data members or examining the call stack at a specific point in execution. Also at runtime, introspection provides notification of specific events, such as the invocation or exiting of a function, the creation of an object or the throwing of an exception.
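Standard C++ already offers a small part of runtime introspection through typeid and dynamic_cast, which allow asking a polymorphic object about its dynamic type. The following sketch shows only this existing facility; the classes Shape, Circle and Square are hypothetical examples.

```cpp
#include <cassert>
#include <typeinfo>

// Polymorphic base class; the virtual destructor enables RTTI queries.
struct Shape { virtual ~Shape() {} };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 1.0; };

// True if the dynamic type of s is exactly Circle.
bool isExactlyCircle(const Shape& s) {
    return typeid(s) == typeid(Circle);
}

// True if s is a Circle or derived from Circle.
bool isCircle(const Shape& s) {
    return dynamic_cast<const Circle*>(&s) != nullptr;
}

// Returns the radius of s; the caller must have checked isCircle(s).
double radiusOf(const Shape& s) {
    const Circle* c = dynamic_cast<const Circle*>(&s);
    assert(c != nullptr);
    return c->radius;
}
```

Note what is missing compared with the description above: there is no way to enumerate the data members of Circle or to query a member by name, and no notification of events.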
Self-modification generally provides mechanisms to manipulate the reflected entity. At compile-time it allows the deletion, transformation or injection of source code, e.g. the addition of members to a class or the replacement of the target of a function call. At runtime, self-modification allows invoking functions, changing the values of objects or creating new objects. More importantly, self-modification allows the introduction of new classes and functions at runtime, making them an inherent part of the running executable.
This section presents several (proposed) approaches to provide reflection facilities for C++.
A number of tools exist for analysis and transformation of C++ source code, e.g. OpenC++, PUMA or The Pivot. They generally work at compile-time on preprocessed source code providing an introspection interface and a modification interface. The latter can be used to build a repository with meta-data with which a related library can provide runtime introspection and even (limited) runtime self-modification.
Daveed Vandevoorde's Metacode extension is a (not yet officially) proposed extension for C++0x. While the main objective (according to the presentation given at the Oxford meeting) is a simpler replacement for template metaprogramming, with appropriate functions in stdmeta it can serve as a complete replacement for the above source code tools. It can possibly even provide more functionality, as it is part of the compiler and can therefore provide information about template instantiations and overload resolutions.
Arne Adams provides a reflection library mainly targeted at database schema generation. Though the current definition interface and implementation are (very) ugly, it provides some interesting functionality, e.g. a compile-time iteration over the data members of a class (and therefore, through overload resolution, easily distinguished actions on differently typed members).
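The idea of iterating over data members and letting overload resolution select the action per member type can be sketched as follows. This is not the interface of the library mentioned above; it is a hand-written illustration in which the member function forEachMember stands in for what a reflection facility would generate. The definitions of Date and Money are placeholders, and the members are public here only to keep the sketch short.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Placeholder definitions; the real classes are not given in this paper.
struct Date { int year, month, day; };
struct Money { long cents; };

struct JournalEntry {
    std::string description;
    Date valueDate;
    Money amount;

    // Hand-written stand-in for generated code:
    // presents each data member, together with its name, to the visitor.
    template <typename Visitor>
    void forEachMember(Visitor& v) const {
        v("description", description);
        v("valueDate", valueDate);
        v("amount", amount);
    }
};

// Visitor whose overloads select the DB column type by member type.
class ColumnTypeVisitor {
public:
    void operator()(const char* name, const std::string&) { add(name, "varchar(80)"); }
    void operator()(const char* name, const Date&)        { add(name, "date"); }
    void operator()(const char* name, const Money&)       { add(name, "numeric(17,2)"); }
    std::string result() const { return out_.str(); }
private:
    void add(const char* name, const char* dbType) {
        assert(name != nullptr);
        if (!out_.str().empty()) out_ << ", ";
        out_ << name << ' ' << dbType;
    }
    std::ostringstream out_;
};

// Builds the column list of the table definition for a JournalEntry.
std::string columnList(const JournalEntry& e) {
    ColumnTypeVisitor v;
    e.forEachMember(v);
    return v.result();
}
```

The calls inside forEachMember are resolved at compile-time, so a member of a type without a matching overload is rejected by the compiler rather than detected at runtime.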
The author has presented an approach to implement a full runtime meta-object protocol on top of compile-time reflection. This approach can be used to roll one's own application-level meta-object facility, but it cannot really replace a built-in runtime self-modification facility.
Java provides a number of reflection mechanisms.
The standard Java Reflection provides essentially compile-time introspection information and some runtime manipulation through a runtime interface.
The Java Debug Interface defines a superset of the functionality of the standard Java reflection and provides it through an external interface. Its main purpose is to reflect an executing Java program from an external program, but it can also be used by a program to reflect on itself. It has a comprehensive runtime introspection interface, including notifications of all kinds of events, but has only limited runtime modification functionality.
The third-party Java ByteCode Engineering Library is essentially the Java counterpart to the C++ source code transformation tools. It works not at the source code level but on byte code, which nevertheless contains enough information to provide full compile-time introspection and modification functionality.
An interesting recent addition to the Java language is Java Annotations. These provide a mechanism for the programmer to define additional meta-data for a class that can be accessed through the Java reflection mechanisms.
Mirrors are an interesting approach for a reflection mechanism in a language. They can be used for all kinds of reflection: compile-time, runtime, introspection and modification. A particularly interesting property of mirrors is that they are not intrusive, i.e. they are not part of the interface of a class, and that even runtime reflection can be realized separately from the running executable.
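The non-intrusive property can be illustrated in C++ with a separate class template that is specialized per reflected type. This is only a rough sketch of the separation between a class and its mirror, not the design from the mirrors paper; the template Mirror and its member functions are hypothetical, and the specialization is written by hand here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// An ordinary class without any reflection-related members.
struct Point { int x; int y; };

// Primary mirror template, intentionally left undefined:
// only types with an explicit specialization can be reflected.
template <typename T> class Mirror;

// Hypothetical mirror for Point, defined apart from Point itself.
template <> class Mirror<Point> {
public:
    explicit Mirror(Point& target) : target_(&target) {}

    // Introspection: names of the class and of its data members.
    std::string className() const { return "Point"; }
    std::vector<std::string> memberNames() const {
        std::vector<std::string> names;
        names.push_back("x");
        names.push_back("y");
        return names;
    }
    // Introspection: read a data member by name.
    int get(const std::string& member) const {
        assert(member == "x" || member == "y");
        return member == "x" ? target_->x : target_->y;
    }
    // Modification: write a data member by name.
    void set(const std::string& member, int value) {
        assert(member == "x" || member == "y");
        (member == "x" ? target_->x : target_->y) = value;
    }
private:
    Point* target_;  // the reflected object; not owned by the mirror
};
```

Point itself knows nothing about Mirror, so the reflection interface can be changed or omitted without touching the interface of the reflected class.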
This section looks again at the applications presented in the first section and discusses some approaches to implement them.
The externalization applications can generally be implemented based on inspection only. While the DB schema generation can be done with compile-time introspection only, the actual externalization/storing/reading requires either runtime introspection or a facility to inject additional code at compile-time. The problem where the DB schema generation requires additional information can be solved with a mechanism like the Java annotations.
Code analysis can generally be done using compile-time introspection only. This poses no specific problems.
Aspect-oriented programming can generally be done using compile-time reflection only. But it requires not only introspection but full-fledged modification mechanisms. This is how most AOP implementations work today: the Java-based ones use byte code manipulation, and AspectC++ uses PUMA.
The runtime development tools described in the first section can generally be implemented using runtime introspection, sometimes with limited modification facilities such as invocation of functions or manipulation of object values.
This paper tried to present some aspects of a reflection facility in C++. Actually, most reflection mechanisms can be based on or implemented using compile-time reflection. That can probably be realized using the Metacode approach, for which an official proposal is expected from Daveed Vandevoorde.
Nevertheless, an additional Standard Library interface for runtime reflection, including compile-time introspection information, should be defined for C++0x. Such an interface should probably be based on the mirror approach.
[JCrasher] http://www.cc.gatech.edu/jcrasher/
[AspectC++] http://www.aspectc.org/
[AspectJ] http://eclipse.org/aspectj/
[Annotations] http://java.sun.com/j2se/1.5.0/docs/guide/language/annotations.html
[Java Debug Interface] http://java.sun.com/j2se/1.4.2/docs/guide/jpda/jdi/index.html
[Java Reflection] http://java.sun.com/j2se/1.4.2/docs/guide/reflection/spec/java-reflectionTOC.doc.html
[Java ByteCode Engineering Library] http://jakarta.apache.org/bcel/
[Mirrors] http://bracha.org/mirrors.pdf
[OpenC++] http://opencxx.sourceforge.net/
[PUMA] http://ivs.cs.uni-magdeburg.de/~puma/
[Arne Adams] http://www.arneadams.com/reflection_doku/index.html
[metacode] ISO/IEC JTC1/SC22/WG21/N1471
[Meta-Object Protocol] http://www.vollmann.ch/en/pubs/meta/index.html