Author: | Lawrence Crowl and Thorsten Ottosen |
---|---|
Contact: | lawrence.crowl@sun.com and tottosen@dezide.com |
Organization: | Sun Microsystems and Dezide Aps |
Date: | 2006-02-25 |
Number: | WG21/N1962 and J16/06-0032. This proposal is a revision of paper n1866. |
Working Group: | Evolution |
Abstract
Contract Programming is about providing the programmer with stronger tools for expressing correctness arguments directly in the source code. Moreover, the contracts enable new and powerful optimization opportunities. The benefits are among others a higher degree of self-documenting source code, better design tools, and easier debugging and testing. The proposal consists of largely orthogonal mechanisms which may be pursued in isolation.
Fact 31: Error removal is the most time-consuming phase of the[software] life cycle.Fact 36: Programmer-created built-in debug code, preferably optionallyincluded in the object code based on compiler parameters,is an important supplement to testing tools.Excerpted from Robert L. Glass' fascinating book "Facts and Fallacies of Software Engineering".
Language support for Contract Programming in C++ has several benefits:
Please refer to n1800 and n1613 for an extensive discussion of these benefits.
Already with the current proposal, it is quite remarkable how much of an interface that can be documented. For a good example, please see how std::vector is declared in vector_example.hpp; there is only a small portion of the specification of std::vector that cannot be described in the contracts.
We remind the reader that the proposal consists of largely orthogonal mechanisms which may be pursued in isolation. See Alternative ways to realize this proposals for a discussion.
Work on Contract Programming exists in many other languages and the most important may be found below.
This proposal should be viewed a collection of different proposals each building on the basic notion of assertions and adding more and more features.
This lead us to the following possible proposals:
Of course, it wouldn't make much sense to pursue only postconditions but not preconditions; clearly preconditions are more essential than any of the other mechanisms.
The idea is to extend C++
These new contract scopes can contain assertions that may be evaluated either at compile-time or at run-time as part of the program to verify their correctness.
The basic building blocks of contracts are assertions.
There are two flavors of assertions:
default assertions
- syntax:
- assertion : boolean-expression ;
- example:
std::cout.good();
select assertions
- syntax:
- assertion-sequence : assertion
- assertion-sequence : assertion-sequence assertion
- assertion-body : assertion
- assertion-body : { assertion-sequence }
- if-assertion : if( condition ) assertion-body else-assertion opt
- else-assertion : else assertion-body
- example:
- if( empty() )
result == end();
The following general points are worth noting:
Default assertions defaults to calling terminate(), but the behavior can be customized.
The select assertions remove the need for the implication operator that was discussed in n1613.
The syntactic freedom has been kept minimal to ensure a good documentation effect.
All visible functions and variables can be used within contract scope.
Member reference within contract scope is subject to the same constraints as member reference within const member functions.
The function declaration and definition are changed into:
- precontracted-function : function-head
- precontracted-function : function-head precondition-contract
- postcontracted-function : precontracted-function
- postcontracted-function : precontracted-function postcondition-contract
- function-declaration : postcontracted-function ;
- function-definition : postcontracted-function { function-body }
- precondition-contract : precondition { assertion-sequence }
- postcondition-contract : postcondition return-value-declarationopt { assertion-sequence }
- return-value-declaration: ( identifier )
example:
double sqrt( double r ) precondition { r > 0.; } postcondition( result ) { equal_within_precision( result * result, r ); }
In the return-value-declaration we declare a local variable that acts as a const reference to the computed return value. It follows that a return-value-declaration is illegal if the return type is void.
The following general rules apply:
If a function has both a declaration and a definition, the contracts must appear on the declaration.
The execution of a function is now
- evaluate the precondition assertion by assertion in order starting from the top,
- evaluate oldof occurrences in the postcondition,
- evaluate the function body,
- evaluate the postcondition assertion by assertion in order starting from the top [11].
The execution of a member function is now as in 2, but
- 2.a is followed by a call to the class invariant [9]
- 2.d is preceded by a call to the class invariant,
Function-try-blocks can appear together with pre- and postconditions on function definitions.
Pre- and postconditions are evaluated outside any function-try-block.
During execution of the postcondition other contracts (recursively encountered) are not evaluated.
If the function F is virtual, we require that
Section 3.5 of n1613 explains how little redefinition of preconditions is used. Even though subcontracting is theoretically sound, it ends up being fairly useless in practice [3].
Postconditions can be made stronger in a derived class as seen in the subcontracting_example.hpp. We can summarize the remaining rules for virtual functions as
The last rule ensures that the function in the derived class has a stronger postcondition by design. A programmer can via a downcast obtain the stronger guarantee, but the runtime check must still follow rule 3 above.
In connection with multiple inheritance we require special rules. if F overrides more than one virtual function due to multiple inheritance, it holds that
Constructors behave much like member functions. This means that a constructor can have a precondition and a postcondition. The precondition may not reference member variables.
If the address of a function is taken, it holds that
Remark: A compiler might be able to determine that the precondition is satisfied, in that case the compiler may generate a call directly to the function body (thus sharing implementation with the function pointer call).
It is very useful to be able to take a copy of a variable to compare it with other values later. In particular, this is true in function postconditions.
Assume i is a visible int, then the compiler translates
{ oldof i == 42; }
into
{ const int __old_i( i ); __old_i == 42; }
A set of rules govern how oldof behaves within a certain contract:
The precondition should be evaluated first to allow it to report its errors first. Notice that we can only apply oldof to objects of a type that defines a copy-constructor taking a const reference (ie., std::auto_ptr cannot be used).
Within class scope it is possible to define a class invariant:
- syntax:
- class-invariant : invariant { assertion-sequence }
- example:
- see vector_example.hpp
The class invariant has the following properties:
It must appear in the declaration of a class.
It is inherited in the sense that the invariant of a derived class implicitly will have the base-class invariant(s) ANDed to it (see also item 6-7).
example:
struct base { invariant { ... } }; struct derived : base { invariant { base::invariant(); // implicitly generated ... } };(Remark: the above is pseudo-code and is not affected by the disabling of assertions within assertions.)
It is called implicitly from public
constructors,
- call the class invariant before the postcondition,
destructors,
- call the class invariant before the destructor body.
member functions that exit abnormally via an exception [5]
It is always called as if surrounded by a try-catch block:
try { class_invariant(); } catch( ... ) { std::class_invariant_broken(); }
- rationale: this ensures that exceptions thrown in the invariant cannot escape from functions and in particular not from the destructor. [6]
For its execution during construction [7] holds that:
each assertion is executed in order starting from the top excluding any part inherited,
- rationale: if we do not exclude inherited parts, those parts would be evaluated multiple times.
For its execution after construction holds that:
- each assertion is executed in order starting from the top including any part inherited,
- if the class has any sub-classes, then the inherited portion is executed first,
- if the class has several sub-classes the order of the classes in the base-clause is followed.
During execution of the invariant other contracts (recursively encountered) are not evaluated.
To better support assertions at block scope we use block invariants:
- syntax:
- block-invariant: invariant { assertion-sequence }
example:
void foo() { int i = 0; for(;;) { invariant { i < 10; } ... } }
The block invariants use the same syntax as the class invariant. The properties of block invariants are:
they can appear multiple times in the same scope,
they are always called as if surrounded by a try-catch block:
try { block_invariant(); // pseudo code } catch( ... ) { std::block_invariant_broken(); }
the order of evaluation of the assertions in an invariant is from top to bottom,
The default behavior for all default assertions is to call terminate() via a call to their respective handler. As seen below, we have a handler for each type of contract. If the default behavior is undesirable, the standard library provides the following handlers:
namespace std { void precondition_broken(); void postcondition_broken(); void class_invariant_broken(); void block_invariant_broken(); typedef void (*broken_contract_handler)(); broken_contract_handler set_precondition_broken_handler( broken_contract_handler r ) throw(); broken_contract_handler set_postcondition_broken_handler( broken_contract_handler r ) throw(); broken_contract_handler set_class_invariant_broken_handler( broken_contract_handler r ) throw(); broken_contract_handler set_block_invariant_handler( broken_contract_handler r ) throw(); }
This should provide plenty of room for customization. The precondition for all the set_XX_handler functions should be r != 0.
We suggest that no required changes should be made to the standard library. An implementation is free to implement all standard guarantees in terms of contract code, but not required to.
This section discusses how contracts can help the compiler to generate faster code.
A limited form of contract programming is the use of the assert() macro. Its use is standard practice in many software projects. Unfortunately, it suffers from several problems. We will evaluate these problems by distinguishing between a library and its client.
As a consequence, the enabling or disabling of assertion checking is a compile-time decision of the library. The result is generally either
In contrast, preconditions may be evaluated in the client. The client decides on the acceptable tradeoff between performance and checking.
In well-tested production code, validating computation is rarely needed, because the function has proven itself. On the other hand, checking arguments has continuing need because of the evolution of callers. With only a single form, programmers that turn off validating computation for performance, will also turn off checking arguments.
In contrast, the proposed syntactic distinction between preconditions, postconditions and invariants preserves the very real need to enable them separately.
As a consequence it is very difficult for a compiler to make use of the contents of an assert() macro.
In contrast, preconditions, postconditions and invariants are very visible to the compiler. The compiler can use preconditions and postconditions as a constraint on the value-space of the code, which in turn enables optimization.
Explicit preconditions and postconditions enable optimization in two ways.
The compiler can eliminate conditions known statically.
For example, given
char *foo() postcondition( result ) { result != 0; }; void bar( char *arg ) precondition { arg != 0; } ... bar( foo() )the compiler can propagate the postcondition of foo() to the precondition of bar() and entirely eliminate the precondition checking in bar().
The compiler can use conditions to guide code generation.
For example, given
double sum( int count, double array[] ) precondition { count % 4 == 0; } { double accum = 0.0; for ( int i = 0; i < count; i++ ) accum += array[i]; return accum; }the compiler can unroll the loop four times, and avoid generating the stuttering at the start of the loop.
Furthermore, given the same code but with a precondition of count > 1000, the compiler could chose to implement the loop in parallel. Likewise, with count< 100, the compiler may forgo any checking for the possibility of parallelism.
This section discusses the feasibility of implementation. We provide an implementation model that is reasonable and consistent with the above proposal.
Consider the function
int factorial( int n ) precondition { 0 <= n && n <= 12; } postcondition( result ) { result >= 1; } { if ( n < 2 ) return 1; else return n * factorial( n - 1 ); }
When all conditions are enabled, the compiler generates three functions:
The core function without any surrounding condition checking.
Call this function when compiler switches disable all condition evaluation.
A postcondition evaluator, which
- saves oldof values
- evaluates class invariant
- calls core function
- evaluates class invariant
evaluates the postcondition.
Call this function from normal code when the caller evaluates the preconditions.
A precondition evaluator, which executes the precondition and then calls the postcondition evaluator.
Call this function when the caller does not or cannot evaluate the preconditions. (Remark: the precondition evaluator would for example be used when a function is called via a function-pointer.)
When postconditions and invariants are disabled, probably a library-build-time decision, the first two functions above are identical, and may be implemented as a single function with two labels.
When preconditions are disabled, a client-build-time decision, the client simply calls the second function.
As an additional performance option, compilers can turn of invariant evaluation while still enabling other conditions. Furthermore, a compiler might choose to provide a switch for generating a single call to the invariant in the constructor.
We expect that it will take about 3 man months to implement the proposal without dedicating any time to optimizations.
In Lillehammer it was suggested to use normal C++ syntax for separation of assertions. In particular it was suggested to write an assertion-sequence as
{ foo() && bar() && true; }
instead of
{ foo(); bar(); true; }
It is quite obvious that this syntax has the following drawbacks:
/////////////////////////////////////// // Tools /////////////////////////////////////// template< class Iter, class T > bool all_equals( Iter first, Iter last, const T& val ) { /* for simplicity, let us assume T's can be compared */ } template< class Iter, class Size > bool equal_distance( Iter first, Iter last, Size size ) { /* internal tagging mechnism so even input iterators can be passed */ } /////////////////////////////////////// // New vector interface /////////////////////////////////////// template< class T, class Alloc = allocator<T> > class vector { invariant { ( size() == 0 ) == empty(); size() == std::distance( begin(), end() ); size() == std::distance( rbegin(), rend() ); size() <= capacity(); capacity() <= max_size(); } public: typedef Alloc allocator_type; typedef typename Alloc::pointer pointer; typedef typename Alloc::const_pointer const_pointer; typedef typename Alloc::reference reference; typedef typename Alloc::const_reference const_reference; typedef typename Alloc::value_type value_type; typedef ... iterator; typedef ... const_iterator; typedef ... size_type; typedef ... difference_type; typedef reverse_iterator<iterator> reverse_iterator; typedef reverse_iterator<const_iterator> const_reverse_iterator; vector() postcondition { empty(); } explicit vector( const Alloc& al ) postcondition { empty(); al == get_allocator(); } explicit vector( size_type count ) postcondition { size() == count; all_equals( begin(), end(), T() ); } vector( size_type count, const T& val ) postcondition { size() == count; all_equals( begin(), end(), val ); } vector( size_type count, const T& val, const Alloc& al ) postcondition { size == count; all_equals( begin(), end(), val ); al == get_allocator(); }: vector( const vector& right ) postcondition { right == *this; } template< class InIt > vector( InIt first, InIt last ) postcondition { equal_distance( first, last, size() ); } template< class InIt > vector( InIt first, InIt last, const Alloc& al ); postcondition { equal_distance( first, last, size() ); al == get_allocator(); } void reserve( size_type count ) precondition { count < max_size(); } postcondition { capacity() >= count; } size_type capacity() const; postcondition( result ) { result >= size(); } iterator begin() postcondition( result ) { if( empty() ) result == end(); } const_iterator begin() const postcondition( result ) { if( empty() ) result == end(); } iterator end(); const_iterator end() const; reverse_iterator rbegin() postcondition( result ) { if( empty() ) result == rend(); } const_reverse_iterator rbegin() const postcondition( result ) { if( empty() ) result == rend(); } reverse_iterator rend(); const_reverse_iterator rend() const; void resize( size_type newsize ) postcondition // version 1: with an if { size() == newsize; if( newsize > oldof size() ) all_equals( begin() + oldof size(), end(), T() ); } void resize( size_type newsize, T val ) postcondition // version 2: with ternary operator { size() == newsize; newsize > oldof size()) ? all_equals( begin() + oldof size(), end(), val ) : true; } size_type size() const postcondition( result ) { result <= capacity(); } size_type max_size() const postcondition( result ) { result >= capacity(); } bool empty() const postcondition( result ) { result == ( size() == 0 ); } Alloc get_allocator() const; reference at( size_type off ); precondition { off < size(); } const_reference at( size_type off ) const; precondition { off < size(); } reference operator[]( size_type off ) precondition { off < size(); } const_reference operator[]( size_type off ) precondition { off < size(); } reference front(); precondition { not empty(); } const_reference front() const; precondition { not empty(); } reference back() precondition { not empty(); } const_reference back() const; precondition { not empty(); } void push_back( const T& val ) precondition { size() < max_size(); } postcondition { back() == val; size() == oldof size() + 1; capacity() >= oldof capacity(); } void pop_back() precondition { not empty(); } postcondition { size() == oldof size() - 1; } template< class InIt > void assign( InIt first, InIt last ) precondition {"The range [first,last) is not a sub-range of [begin(),end())";} postcondition { equal_distance( first, last, size() ); } void assign( size_type count, const T& val ) precondition { count <= max_size(); } postcondition { all_equals( begin(), end(), val ); } iterator insert( iterator where, const T& val ) precondition { size() < max_size(); } postcondition( result ) { *result == val; size() == oldof size() + 1; capacity() >= oldof capacity(); if( capacity() > oldof capacity() ) "All iterators have been invalidated"; else "All iterators in the range [where,end()) have been invalidated"; } void insert( iterator where, size_type count, const T& val ) precondition { size() + count <= max_size(); } postcondition { size() == oldof size() + count; capacity() >= oldof capacity(); if( capacity() == oldof capacity() ) { all_equals( oldof prior(where), oldof prior(where) + count, val ); "All iterators in the range [where,end()) have been invalidated"; } else "All iterators have been invalidated"; } template< class InIt > void insert( iterator where, InIt first, InIt last ) precondition { "The range [first,last) is not a sub-range of [begin(),end())"; size() + count <= max_size(); } postcondition { size() == oldof size() + count; capacity() >= oldof capacity(); if( capacity() == oldof capacity() ) { all_equals( oldof prior(where), oldof prior(where) + count, val ); "All iterators in the range [where,end()) have been invalidated"; } else "All iterators have been invalidated"; } iterator erase( iterator where ); precondition { not empty(); where != end(); } postcondition( result ) { size() == oldof size() - 1; if( empty() ) result == end(); "All iterators in the range [where,end()) have been invalidated"; } iterator erase( iterator first, iterator last ) precondition { size() >= std::distance( first, last ); } postcondition( result ) { size() == oldof size() - std::distance( first, last ); if( empty() ) result == end(); "All iterators in the range [first,end()) have been invalidated"; } void clear(); postcondition { empty(); } void swap( vector& right ) postcondition { oldof *this == right; oldof right == *this; } }; // class 'vector'
// // This example shows how Contract Programming // allows one to keep the virtual function public // and still enforce the contract of the function // class shape { public: virtual ~shape() {} virtual int compute_area() const = 0 postcondition( result ) { result > 0; } }; class circle : public shape { public: int radius() const; virtual int compute_area() const postcondition( result ) { result == pi * radius() * radius(); } };
Consider the relationship between these two comparison operators:
bool operator==( T l, T r ) postcondition( result ) { result == !( l != r ); } bool operator!=( T l, T r ) postcondition( result ) { result == !( l == r ); }
There is no easy way we can express that using an assert() in the body of the functions.
The table below contains a comparison of how Contract Programming is implemented in D, Eiffel and C++.
Feature | ISE Eiffel 5.4 | D | C++ Proposal |
keywords | require, ensure, do, require else, ensure then, invariant, old, result, variant | in, out, body, invariant, assert, static | precondition, postcondition, invariant, oldof |
on failure | throws exception | throws exception | defaults to terminate(), defaults can be customized, might throw |
expression copying in postconditions | yes, old keyword | no | yes, oldof keyword |
subcontracting | yes | yes | yes, but only considers postconditions |
contracts on abstract functions | yes | no (planned) | yes |
arbitrary code contracts | yes | yes | no, must be const correct |
function code ordering | pre -> body -> post | pre -> post -> body | pre -> post -> body |
compile-time assertions | no | yes | no |
loop invariants | yes | no | no |
loop variants | yes | no | no |
const-correct | no | no | yes |
invariant calls |
|
|
|
disabling of checks during assertions | yes | no | yes, but not in preconditions |
when public func. call public func. | disable all checks | disable nothing | disable nothing |
removable from object code | yes | yes | yes |
This section provides links to a few of the many resources on Contract Programming.
Earlier proposals:
C++ Contract Programming:
Contract Programming in other languages:
The following people has been provided feedback and comments throughout the proposals history: Reece Dunn, Douglas Gregor, Alf Steinbach, Matthew Wilson, Berend de Boer, Darren Cook, Pavel Vozenilek, Scott Meyers, Attila Fehér, Walter Bright, Per Madsen, Kevlin Henney, Sergey Vlasov, Bob Bell, John Nagle, Daveed Vandevoorde, Jaakko Jarvi, Christopher Diggins, John Torjo, Robert Kawulak, Dave Harris, Michael Wong, Thomas Witt, Tim Rowe, Scott Meyers and Andrei Alexandrescu.
Special thanks goes to David Abrahams and James Widman.
[0] | For example, if the compiler can determine that a precondition is satisfied, it can call a function where the precondition is not checked. Also, the compiler can always assume contracts to be true; hence the compiler can take advantage of the precondition in the function body. |
[1] | Even experienced programmers can choke on the word "invariant" and what it means for a class to have one. There are a few simple rules we want all programmers to learn:
|
[2] | A good compiler will emit warnings whenever a mutation is detected within a contract. This is a major advantage compared to the ad hoc solutions programmers use today; there is no special scope and hence no way the compiler can emit those warnings. |
[3] | A weaker precondition can be taken advantage of if we know the particular type of the object. If weaker preconditions should be allowed, then there exists two alternatives: to allow reuse of an existing contract or to require a complete redefinition. The former favours expressiveness, the latter favours overview. |
[4] | In some cases the compiler will be able to determine overlapping expression and hence optimize some assertions away. |
[5] | To ensure that the function gives the basic guarantee of exception-safety. |
[6] | One can imagine an approach where exceptions could escape from an invariant by not checking invariants when functions exit via an exception. The benefits of such an approach are not obvious and it will complicate flow analysis unnecessarily. |
[7] | During construction means until the post-condition of the constructor has ended. |
[8] | An unfortunate problem is in OO-programming is that a newly added virtual function in a base class B may suddenly be overridden by an otherwise non-virtual member function in the derived class D. With a contract-enabled compiler we can avoid some of these problems:
|
[9] | Strictly speaking the invariant needs to hold before the precondition is executed because the contracts might call public functions of the class which require the invariant to be established. Currently we do not expect that to be a practical problem though, and the current specification is easier to implement. |
[10] | (1, 2) In many contexts it is useful to inform the compiler of certain properties. This is true for efficient garbage collection as well as Contract Programming. The same mechanism may be reused to tag a piece of code with some property; in this case we would like to tag individual assertions. |
[11] | Note that if an exception is thrown in the function body, the postcondition is not evaluated. |