Doc No: | N3415 = 12-0105 |
Date: | 2012-09-13 |
Reply to: | Bill Seymour <stdbill.h@pobox.com> |
As this is probably not something that we would want to require of every standard library implementation, this library is proposed for a Technical Report (the former Type III TR) rather than a Technical Specification (née Type II TR).
One possible design might be something that looks like the
In general, constructors and destructors are not shown below unless there’s something interesting about them.
The author will ask “What’s your interest?” in Portland; and he’ll have a working model ready for Bristol if he’s encouraged to build one.
class connection { public: connection(const std::string& protocol, const std::string& database, const std::string& user_id, const std::string& password); connection(); bool has_scrolling_cursors() const noexcept; void commit(); void rollback(); bool auto_commit() const noexcept; bool auto_commit(bool) noexcept; };
At a minimum, we’ll want to be able to specify which database we want to connect to and supply a user ID and password.
The constructor’s protocol argument
specifies the mechanism for communicating with the database; and acceptable strings
should include at least "SQL/CLI" which specifies the mechanism described
in
What goes in the database argument is implementation-defined and probably depends on the protocol. For example, if the protocol is "Oracle Call Interface", database might be a string that gets looked up in a file called tnsnames.ora.
The default constructor will create a connection object from some sort of connection pool. This needs to be explored further.
In general, we won’t know what features
we have available until after we’ve connected to a particular database.
The
Transaction control typically happens at the connection level, not the statement
level; so
Some databases have an “auto-commit” feature that treats every statement as a separate transaction. At a minimum, we need a way to turn that off.
Note that, unlike a
class statement { public: typedef /* some unsigned integer type */ position_type; typedef /* some integer type, possibly bool */ indicator; explicit statement(connection&); statement(connection&, const std::string& sql); virtual ~statement() noexcept; virtual void set_sql(const std::string& sql); void prepare(); void prepare(const std::string& sql) { set_sql(sql); prepare(); } template<class T> void set(position_type pos, const T& val); template<class T> void set_null(position_type pos); template<class T> void set_null(position_type pos, const T&); template<class T> void bind(position_type pos, T* val, indicator* ind = 0); void bind(position_type pos, char* val, std::size_t siz, indicator* ind = 0); void execute(); void execute(const std::string& sql) { set_sql(sql); execute(); } };This class provides all the functionality required to execute SQL statements that don’t return results; and statements that do return results can be derived from this. (Alternatively, this could be an abstract base class and we could derive from it a class called, say, ddl_statement, that provides no additional functionality; but that seems needlessly fussy.)
We use “dynamic SQL”; that is, the SQL statement that we intend to execute is just a string that gets interpreted by the database engine at run time. That’s fundamentally how ODBC and similar connection mechanisms work; and we can’t change that even though we might want to.
The SQL can be specified eagerly at construction time
The SQL statement can have placeholders for data that change between executions; and such statements typically need to be “prepared”. Exactly what that means depends on the database engine.
After such statements have been prepared, but before they’re executed,
the placeholders need to be replaced with the actual data; and that’s what
We sometimes need to set placeholders to a null value; and that’s what the set_null template does. The second argument in the two-argument version isn’t used for anything except inferring the template argument.
Both placeholders and returned values can be hooked to program variables. This is variously called “binding” or “describing” depending on whether you’re talking about input or output data. Both are just called “binding” in this paper.
For example, let’s say I have a placeholder of some integer type.
Rather than calling
int val; // ... my_statement.bind(1, &val);and then just assign a value to val before each execution.
This works for data returned by a query as well: the bound variable will have the new value after each fetch of a query’s cursor. (More about queries and cursors later.)
There’s a complication, however, if the value can be null. Each bound program variable optionally has associated with it an “indicator variable”, an integer that serves as a boolean indicating whether the bound variable contains real data or rather should be considered null.
We’ll also have a non-template
Note that this paper doesn’t address an interesting problem: inferring
SQL types given C++ types. That might not even be possible in general and might require
Do we want to do this with template specialization rather than subclassing? (The template argument could be the result type with void being a reasonable choice.)
If we do use template specialization and have something like:
typedef basic_statement<void> ddl_statement;should we move the
On the other hand, having separate
class dml_statement : public statement { public: explicit dml_statement(connection&); dml_statement(connection&, const std::string& sql); typedef std::size_t result_type; result_type results() const; };These are INSERT, UPDATE and DELETE statements. They return the number of rows affected. If you don’t care how many rows are affected, you can just use an object of the statement type.
class row; class call_statement : public statement { public: explicit call_statement(connection&); call_statement(connection&, const std::string& sql); void set_sql(const std::string&); typedef row result_type; result_type results(); };Some databases allow storing procedures and functions, typically written in some vendor-lock-in language, for execution by the database itself; and such routines can return zero or more data elements. (A row is a possibly empty collection of data elements. More about that shortly.)
An overload of
If the routine doesn’t return any values, you might be able to just use a statement. It depends on whether we need to do any translation of the string that contains the call.
If the database doesn’t support such stored routines, the constructor should probably throw an exception.
template<class Cursor> class table; template<class Cursor> class query : public statement { public: explicit query(connection&); query(connection&, const std::string& sql); void prefetch_count(std::size_t) noexcept; typedef table<Cursor> result_type; result_type results(); };These are SELECT statements. They return whole tables. (A table is a possibly empty collection of rows. More about that shortly.)
Some databases allow specifying the maximum number of rows that will be
returned in a single burst of communication from the database box.
The
class column { public: // copyable, moveable, swappable bool is_null() const noexcept; template<class T> void get(T&) const; template<class T> T get() const; }; bool operator==(const column&, const column&); bool operator!=(const column&, const column&); bool operator< (const column&, const column&); bool operator> (const column&, const column&); bool operator<=(const column&, const column&); bool operator>=(const column&, const column&);This is a kind of “any” type. Unfortunately, we really don’t know the actual type until run time.
They need to be copyable and moveable so that they can be stored in standard-library containers.
The get templates are the equivalent of a Java ResultSet’s
We can have just these two member templates instead of a whole zoo
of
(We should probably also support LOBs and BLOBs somehow, but this author has no experience with them and doesn’t understand the issues.)
Note that, in the database world, “null” is a possible value for anything; and any operation involving a null value yields a null. This is a problem with comparisons and is usually resolved by saying that any comparison involving a null is false. Do we want to mimic that behavior? It can sometimes have surprising results. (For example, == and != can both be false.)
At a minimum, it needs to be moveable since
class row { public: typedef implementation-detail cols; // the underlying container type typedef cols::size_type size_type; typedef cols::value_type value_type; typedef cols::const_reference reference; typedef cols::const_reference const_reference; // copyable, moveable, swappable size_type size() const; bool empty() const; reference operator[](size_type index) const; reference at(size_type index) const; reference operator[](const std::string& name) const; reference at(const std::string& name) const; // // And just in case anybody actually cares: // typedef cols::difference_type difference_type; typedef cols::const_pointer pointer; typedef cols::const_pointer const_pointer; typedef cols::const_iterator iterator; typedef cols::const_iterator const_iterator; typedef cols::const_reverse_iterator reverse_iterator; typedef cols::const_reverse_iterator const_reverse_iterator; reference front() const; reference back() const; iterator begin() const; iterator end() const; reverse_iterator rbegin() const; reverse_iterator rend() const; const_iterator cbegin() const; const_iterator cend() const; const_reverse_iterator crbegin() const; const_reverse_iterator crend() const; }; bool operator==(const row&, const row&); bool operator!=(const row&, const row&); bool operator< (const row&, const row&); bool operator> (const row&, const row&); bool operator<=(const row&, const row&); bool operator>=(const row&, const row&);
A table does not satisfy the requirements for a container. Indeed, it’s basically just an iterator factory.
template<class Cursor> class table { public: typedef const row value_type; typedef value_type* pointer; typedef value_type& reference; typedef Cursor iterator; typedef Cursor const_iterator; // copyable, moveable, swappable const_iterator begin() const; const_iterator end() const; const_iterator cbegin() const; const_iterator cend() const; };Should we make it reversible?
template<> class table<scrolling_cursor> { public: typedef const row value_type; typedef value_type* pointer; typedef value_type& reference; typedef scrolling_cursor iterator; typedef scrolling_cursor const_iterator; typedef std::reverse_iterator<iterator> reverse_iterator; typedef std::reverse_iterator<const_iterator> const_reverse_iterator; // copyable, moveable, swappable const_iterator begin() const; const_iterator end() const; const_iterator cbegin() const; const_iterator cend() const; const_reverse_iterator rbegin() const; const_reverse_iterator rend() const; const_reverse_iterator crbegin() const; const_reverse_iterator crend() const; };
struct cursor_tag : public std::input_iterator_tag { }; struct scrolling_cursor_tag : public cursor_tag { };Unfortunately, even a scrolling cursor, despite being able to move back and forth through the data, is still just an input iterator. The reason is that we’re dealing with read-only data. Furthermore, typically only one row is in memory at a time, and so every use potentially fetches a new row (very much like an istream_iterator).
But we can still have a scrolling_cursor_tag so that algorithms that want bidirectional or random-access iterators, but only need the moving-back-and-forth behavior, can be written to use scrolling_cursors as well.
class cursor_base { public: typedef /* signed integer type */ difference_type; typedef /* unsigned integer type */ position_type; typedef const row value_type; typedef value_type* pointer; typedef value_type& reference; // copyable, moveable, swappable bool fetch_next(); reference operator*() const; pointer operator->() const; }; bool operator==(const cursor_base&, const cursor_base&); bool operator!=(const cursor_base&, const cursor_base&);This is a base class into which we refactor all the stuff that doesn’t depend on whether it’s a scrolling cursor.
For two cursors, lhs and rhs,
class cursor : public cursor_base { public: typedef cursor_tag iterator_category; // copyable, moveable, swappable cursor& operator++() cursor operator++(int); };Given cursor_base, all our non-scrolling cursor needs to add are iterator_category and the
Note that the copy returned by the postfix
class scrolling_cursor : public cursor_base { public: typedef scrolling_cursor_tag iterator_category; // copyable, moveable, swappable bool fetch_prior(); bool fetch_first(); bool fetch_last(); bool fetch_relative(difference_type); bool fetch_absolute(position_type); scrolling_cursor& operator++(); scrolling_cursor operator++(int); scrolling_cursor& operator--(); scrolling_cursor operator--(int); scrolling_cursor& operator+=(difference_type); scrolling_cursor& operator-=(difference_type); scrolling_cursor operator+(difference_type) const; scrolling_cursor operator-(difference_type) const; difference_type operator-(const scrolling_cursor&) const; reference operator[](position_type); // does fetch_absolute }; void advance(scrolling_cursor&, scrolling_cursor::difference_type); scrolling_cursor::difference_type distance(const scrolling_cursor&, const scrolling_cursor&); bool operator< (const scrolling_cursor&, const scrolling_cursor&); bool operator> (const scrolling_cursor&, const scrolling_cursor&); bool operator<=(const scrolling_cursor&, const scrolling_cursor&); bool operator>=(const scrolling_cursor&, const scrolling_cursor&);As stated before, despite the ability to move back and forth through the data, this is still just an input iterator: the data is read-only, the
For two scrolling_cursors, lhs and rhs, where
lhs most recently fetched the
class sql_error : public std::logic_error { int cd; char st[6]; public: sql_error(const std::string& msg, int code, const char* state = 0); sql_error(const char* msg, int code, const char* state = 0); int sqlcode() const noexcept { return cd; } const char* sqlstate() const noexcept { return st; } const char* errmsg() const noexcept { return logic_error::what(); } };The constructors’ third argument can be a null pointer, and defaults to a null pointer, for the benefit of sub-standard database engines that don’t support SQLSTATE.
Since SQLSTATE, when it’s supported at all, is always at most five characters,
this can be implemented as a
extern std::ostream* error_stream; extern std::ostream* warning_stream; bool write_error_message(const char*) noexcept; bool write_error_message(const std::string&) noexcept; bool write_warning_message(const char*) noexcept; bool write_warning_message(const std::string&) noexcept;This is mainly an implementation detail in the author’s proof of concept (in development…probably ready for Bristol); but there doesn’t seem to be any compelling reason to keep it secret; and the user might find it useful to control where error and warning messages get written. Indeed, many users might want to turn off warnings entirely.
Both of the above pointers are initialized to