Doc. no. | N2070=06-0140 |
Date: | 2006-09-08 |
Project: | Programming Language C++ |
Reply to: | Martin Sebor |
time_get facet for POSIX® compatibility
The time_get
and time_put
facets
provide a low-level asymmetric interface for the parsing
and formatting of time values. The interfaces are
asymmetric because the time_put
facet is
capable of producing a much larger set of sequences than
the time_get
facet is capable of parsing.
The time_put
interface can also readily
expose useful implementation-defined extensions by
recognizing additional formatting specifiers and modifiers
while the time_get
interface provides no such
flexibility. The behavior of the time_put
facet is specified in terms of the C standard library
function strftime
and the facet's interface
allows programs to take advantage of the rich set of
60 or so strftime
conversion specifiers
(including their optional modifiers). In contrast, the
behavior of time_get
is restricted to parsing
a limited set of time and date sequences produced by a
handful of formatting specifiers, namely the
locale-independent and trivial %T
(which is
the same as "%H:%M:%S"
, the 24 hour time
representation), the locale-specific and less trivial
%x
(the locale's date representation), and to
parsing simple weekday names (%a
and
%A
) and the names of calendar months
(%b
and %B
). Presumably, this
restriction exists only because the C standard library
provides no function for parsing time sequences. Such a
function is, however, specified by the ISO/IEC
9945 standard (also known as POSIX) -- see strptime
.
Thus, C++ programs that need to process date and time
sequences produced by any of the other 56 or so
formatting specifiers cannot do so by relying on
the parsing functionality of time_get, even
though much of that functionality often exists in implementations that
parse non-trivial date sequences; it is simply not exposed in the
interface of the facet. For instance, even the simple
task of parsing a 12 hour time representation is beyond
the ability of the facet, as is the often needed ability
to recognize and interpret time zones.
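To make the limitation concrete, the following minimal sketch uses only the existing time_get interface. The helper name parse_24h_time is hypothetical, and the sketch assumes the classic "C" locale, in which get_time accepts the %H:%M:%S form. The existing interface offers no comparable member for a 12 hour representation or for a time zone.

```cpp
#include <cassert>
#include <ctime>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not part of any library): parses "HH:MM:SS"
// with the existing time_get::get_time member in the classic locale.
// Returns true when the facet did not report a failure.
inline bool parse_24h_time(const std::string& text, std::tm* t)
{
    assert(t != nullptr);
    typedef std::istreambuf_iterator<char> iter;

    std::istringstream in(text);
    in.imbue(std::locale::classic());

    const std::time_get<char, iter>& tg =
        std::use_facet<std::time_get<char, iter> >(in.getloc());

    std::ios_base::iostate err = std::ios_base::goodbit;
    tg.get_time(iter(in), iter(), in, err, t);

    // eofbit alone only means the input was consumed completely.
    return (err & std::ios_base::failbit) == 0;
}
```

A caller would zero a std::tm object, pass its address together with a string such as "14:30:05", and read tm_hour, tm_min and tm_sec on success.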
This paper proposes to extend the time_get
facet interface to permit the parsing of most of
the date and time sequences produced by
time_put, thus providing a subset of the
functionality of POSIX strptime. Specifically, we propose to add two
get
and one do_get
member
functions to class time_get
to parallel those
declared by time_put
.
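As an illustration of how the proposed format-string overload might be called, here is a minimal sketch. It assumes a standard library that provides a get member with the shape proposed here (C++11 and later libraries ship one); the helper name parse_with_format and the choice of the classic locale are assumptions made for the example only.

```cpp
#include <cassert>
#include <ctime>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the proposal): parses text according
// to a strptime-style format string via the format-string get overload.
// Returns true when the facet did not report a failure.
inline bool parse_with_format(const std::string& text,
                              const std::string& fmt, std::tm* t)
{
    assert(t != nullptr);
    typedef std::istreambuf_iterator<char> iter;

    std::istringstream in(text);
    in.imbue(std::locale::classic());

    const std::time_get<char, iter>& tg =
        std::use_facet<std::time_get<char, iter> >(in.getloc());

    std::ios_base::iostate err = std::ios_base::goodbit;
    // The last two arguments delimit the format, mirroring time_put::put.
    tg.get(iter(in), iter(), in, err, t,
           fmt.data(), fmt.data() + fmt.size());

    return (err & std::ios_base::failbit) == 0;
}
```

With a zeroed std::tm, a call such as parse_with_format("2006-09-08 17:45", "%Y-%m-%d %H:%M", &t) is expected to fill in the year, month, day, hour and minute members in a single pass.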
Add to the declaration of class time_get
in
[lib.locale.time.get], immediately below the declaration
of the member function get_year
, the
following:
iter_type get (iter_type s, iter_type end, ios_base& f,
ios_base::iostate& err, tm* t,
char format, char modifier = 0) const;
iter_type get (iter_type s, iter_type end, ios_base& f,
ios_base::iostate& err, tm* t,
const char_type* fmt, const char_type *end) const;
Add to the declaration of class time_get
,
immediately below the declaration of the virtual member
function do_get_year
, the following:
virtual iter_type do_get (iter_type s, iter_type end,
ios_base& f,
ios_base::iostate& err, tm* t,
char format, char modifier) const;
Add to the end of [lib.locale.time.get.members] the following text:
iter_type get (iter_type s, iter_type end, ios_base& f, ios_base::iostate& err, tm* t, char format, char modifier = 0) const;
Returns:
do_get(s, end, f, err, t, format, modifier)
iter_type get (iter_type s, iter_type end, ios_base& f, ios_base::iostate& err, tm* t, const char_type* fmt, const char_type* end) const;
Requires: [fmt, end) is a valid range.
Effects: The function starts by evaluating err = ios_base::goodbit. It then enters a loop, reading zero or more characters from s at each iteration. Unless otherwise specified below, the loop terminates when the first of the following conditions holds:
- The expression (fmt == end) evaluates to true.
- The expression (err == ios_base::goodbit) evaluates to false.
- The expression (s == end) evaluates to true, in which case the function evaluates err = ios_base::eofbit | ios_base::failbit.
- The next element of fmt is equal to '%', optionally followed by a modifier character, followed by a conversion specifier character, format, together forming a conversion specification valid for the ISO/IEC 9945 function strptime. If the number of elements in the range [fmt, end) is not sufficient to unambiguously determine whether the conversion specification is complete and valid, the function evaluates err = ios_base::failbit. Otherwise, the function evaluates s = do_get(s, end, f, err, t, format, modifier), where the value of modifier is '\0' when the optional modifier is absent from the conversion specification. If (err == ios_base::goodbit) holds after the evaluation of the expression, the function increments fmt to point just past the end of the conversion specification and continues looping.
- The expression isspace(*fmt, f.getloc()) evaluates to true, in which case the function first increments fmt until (fmt == end || !isspace(*fmt, f.getloc())) evaluates to true, then advances s until (s == end || !isspace(*s, f.getloc())) is true, and then resumes looping.
- The next character read from s matches the element pointed to by fmt in a case-insensitive comparison, in which case the function evaluates ++fmt, ++s and continues looping. Otherwise, the function evaluates err = ios_base::failbit.
Note: The function uses the ctype<charT> facet installed in f's locale to determine valid whitespace characters. It is unspecified by what means the function performs case-insensitive comparison or whether multi-character sequences are considered while doing so.
Returns: s.
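The loop described above can be sketched as a free function. This is an illustration under stated assumptions, not the reference implementation: the name get_by_format_sketch is hypothetical, the format end pointer is called fmtend to keep it distinct from the input end iterator, only the 'E' and 'O' modifiers are recognized, and per-conversion parsing is delegated to the single-specifier get (and through it to do_get) of a library that provides one.

```cpp
#include <cassert>
#include <ctime>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

typedef std::istreambuf_iterator<char> sketch_iter;

// Hypothetical sketch of the format-driven loop; not library code.
inline sketch_iter get_by_format_sketch(
    sketch_iter s, sketch_iter end, std::ios_base& f,
    std::ios_base::iostate& err, std::tm* t,
    const char* fmt, const char* fmtend)
{
    assert(t != nullptr);
    const std::locale loc = f.getloc();
    const std::time_get<char, sketch_iter>& tg =
        std::use_facet<std::time_get<char, sketch_iter> >(loc);
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);

    err = std::ios_base::goodbit;
    while (fmt != fmtend && err == std::ios_base::goodbit) {
        if (s == end) {
            // Input ran out while format elements remain.
            err = std::ios_base::eofbit | std::ios_base::failbit;
        } else if (*fmt == '%') {
            const char* p = fmt + 1;
            char modifier = '\0';
            if (p != fmtend && (*p == 'E' || *p == 'O'))
                modifier = *p++;
            if (p == fmtend) {
                // Conversion specification is incomplete.
                err = std::ios_base::failbit;
            } else {
                s = tg.get(s, end, f, err, t, *p, modifier);
                if (err == std::ios_base::goodbit)
                    fmt = p + 1;  // step past the whole specification
            }
        } else if (ct.is(std::ctype_base::space, *fmt)) {
            // Whitespace in the format matches any run of whitespace.
            while (fmt != fmtend && ct.is(std::ctype_base::space, *fmt))
                ++fmt;
            while (s != end && ct.is(std::ctype_base::space, *s))
                ++s;
        } else if (ct.tolower(*s) == ct.tolower(*fmt)) {
            ++fmt;  // literal character matched, ignoring case
            ++s;
        } else {
            err = std::ios_base::failbit;
        }
    }
    return s;
}
```

Note that when the final conversion consumes the last input character, the conversion itself sets eofbit, so the loop stops with eofbit set but without failbit; callers should test failbit rather than compare err against goodbit.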
Add the following paragraphs to the end of [lib.locale.time.get.virtuals]:
iter_type do_get (iter_type s, iter_type end, ios_base& f, ios_base::iostate& err, tm* t, char format, char modifier) const;
Requires: t is a valid pointer.
Effects: The function starts by evaluating err = ios_base::goodbit. It then reads characters starting at s until it encounters an error, or until it has extracted those struct tm members, and any remaining format characters, corresponding to a conversion directive appropriate for the ISO/IEC 9945 function strptime formed by concatenating '%', the modifier character, when non-NUL, and the format character. When the concatenation fails to yield a valid complete directive, the function leaves the object pointed to by t unchanged and evaluates err |= ios_base::failbit. When (s == end) evaluates to true after reading a character, the function evaluates err |= ios_base::eofbit.
Note: It is unspecified whether multiple calls to do_get() with the address of the same struct tm object will update the current contents of the object or simply overwrite its members. Portable programs must zero out the object before invoking the function.
Returns: An iterator pointing immediately beyond the last character recognized as possibly part of a valid input sequence for the given format and modifier.
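The advice in the note about zeroing the struct tm object can be illustrated with a small sketch. The helper name parse_one is hypothetical, and the sketch assumes a library that provides the single-specifier get overload, which forwards to do_get.

```cpp
#include <cassert>
#include <ctime>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the proposal): applies a single
// conversion, e.g. format 'm' for "%m", to the given text.
// Returns true when the facet did not report a failure.
inline bool parse_one(const std::string& text, char format,
                      std::tm* out, char modifier = '\0')
{
    assert(out != nullptr);
    typedef std::istreambuf_iterator<char> iter;

    // Zero the object first so that the result does not depend on
    // whether the implementation updates or overwrites its members.
    *out = std::tm();

    std::istringstream in(text);
    in.imbue(std::locale::classic());

    const std::time_get<char, iter>& tg =
        std::use_facet<std::time_get<char, iter> >(in.getloc());

    std::ios_base::iostate err = std::ios_base::goodbit;
    tg.get(iter(in), iter(), in, err, out, format, modifier);

    return (err & std::ios_base::failbit) == 0;
}
```

For example, parse_one("09", 'm', &t) is expected to leave t.tm_mon equal to 8, since struct tm counts months from zero.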
A reference implementation of this extension is available for review in the Open Source Apache C++ Standard Library. The same extension has been implemented in the Rogue Wave® C++ Standard Library and shipped since 2001. See this page for the latest documentation of the feature.
The proposed extensions are largely source compatible with
the existing interface of the time_get
facet
(there is a very small chance that the introduction of a
new base class member function might affect the
well-formedness or even the behavior of a program that
calls a function with the same name in a class derived
from the base). Adding a new virtual member function is a
binary incompatible change.