Doc. no. J16/04-0016=WG21/N1576
Date: 6 February 2004
Project: Programming Language C++
Reply to: Beman Dawes <bdawes@acm.org>
This paper is a query to determine interest by the Library Working Group in a future proposal for a C++ filesystem component based on the Boost Filesystem Library. Such a component would be suitable for a future standard or a future TR. This paper is not itself such a proposal.
The Boost Filesystem Library (www.boost.org/libs/filesystem) provides portable facilities to query and manipulate paths, files, and directories. The library is widely used. It would be a pure addition to the C++ standard, leaving in place existing standard library functionality where there is overlap.
The motivation for the library is the desire to perform portable, safe, script-like filesystem operations from within C++ programs. Because the C++ Standard Library contains no facilities for such filesystem tasks as directory iteration or directory creation, programmers currently must rely on operating-system-specific C-style interfaces, making it difficult to write portable programs.
The intent is not to compete with Python, Perl, or shell scripting languages, but rather to provide filesystem operations where C++ is already the language of choice. The design encourages, but does not require, safe and portable filesystem usage.
#include "boost/filesystem/operations.hpp"
#include <iostream>

namespace fs = boost::filesystem;
using std::cout;

int main( int argc, char* argv[] )
{
  fs::path p( argc <= 1 ? "." : argv[1] );

  if ( !fs::exists( p ) ) // does not exist
    cout << "Not found: " << argv[1] << '\n';
  else if ( fs::is_directory( p ) ) // is a directory
  {
    for ( fs::directory_iterator dir_itr( p );
          dir_itr != fs::directory_iterator(); ++dir_itr )
    {
      // display only the rightmost name in the path
      cout << dir_itr->leaf() << '\n';
    }
  }
  else // is a file
    cout << "Found: " << argv[1] << '\n';

  return 0;
}
Users say they prefer the Filesystem library's interface to native operating system or POSIX APIs, even in code without portability requirements.
The library provides only functionality and behavior which can be supported uniformly on many different operating systems. As a practical matter, this means functionality and behavior which can be specified to work uniformly on POSIX and Windows. Since modern versions of legacy operating systems such as OS/390 and System/z provide POSIX support, the library can be implemented on these systems. An example of behavior which is not supported because of portability concerns is manipulation of file and directory attributes. The emphasis on portable behavior drove many design choices.
Consider this code:
if ( !exists( "foobar/cheese" ) ) cout << "Something is rotten in foobar\n";
The exists() function returns true if the indicated file or directory is present in the external file system. The signature is:
bool exists( const path & );
The "foobar/cheese" argument is written according to a portable generic path grammar and is converted to an object of class path, which the implementation translates into the operating system's native format for use in operating system calls. For example, if the operating system uses colons as path element separators, the path above would be passed to the operating system as "foobar:cheese".
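The translation from generic to native format can be illustrated with a minimal sketch. The helper to_native and its native_separator parameter are hypothetical names introduced here for illustration; they are not part of the Boost Filesystem interface, and a real implementation must also deal with root names and other native conventions.

```cpp
#include <string>

// Hypothetical helper, not part of the Boost Filesystem interface:
// rewrite a generic path (elements separated by '/') using the
// separator that the target operating system expects.
std::string to_native(const std::string& generic, char native_separator)
{
    std::string native(generic);
    for (std::string::size_type i = 0; i < native.size(); ++i)
    {
        if (native[i] == '/')
            native[i] = native_separator;
    }
    return native;
}
```

Under these assumptions, to_native("foobar/cheese", ':') yields the colon-separated form described above, while a single-element path is returned unchanged.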
Class path has much useful and interesting functionality for manipulating
filesystem paths, and for ensuring that names in paths meet application specific
requirements. Non-portable (native) path grammar is also supported.
Because of the desire to support simple "script-like" usage, use cases often drove design choices. For example, class path has conversion constructors from const char * and const std::string &, allowing users to write if (exists("foo")) rather than if (exists(path("foo"))).
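The effect of conversion constructors can be sketched with a small stand-in class. Everything in namespace sketch below is hypothetical and exists only to show the mechanism; in particular, the function echo is an illustrative placeholder for a library operation such as exists() and does not touch the file system.

```cpp
#include <string>

namespace sketch {

// Hypothetical stand-in for class path. Only the converting
// constructors matter for this illustration.
class path
{
public:
    path(const char* s) : text_(s) {}          // deliberately not explicit
    path(const std::string& s) : text_(s) {}   // deliberately not explicit
    const std::string& string() const { return text_; }
private:
    std::string text_;
};

// Illustrative placeholder for an operation taking const path &.
// It simply returns the text it was given.
std::string echo(const path& p)
{
    return p.string();
}

} // namespace sketch
```

Because neither constructor is explicit, sketch::echo("foo") and sketch::echo(std::string("foo")) both compile without the caller naming path.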
Like all I/O, filesystem operations often encounter runtime errors both expected and unexpected. The library reports runtime errors via C++ exceptions.
Filesystem operations often encounter errors such as "File not found" which must be reported to human users. To ensure that the exceptions thrown for such errors contain sufficient information for users to resolve the error, and to eliminate the need for programs to include numerous try/catch blocks, the library throws relatively heavyweight exceptions. There is a single filesystem_error type, with two error codes, two paths, and two messages. While the details could certainly change a great deal, two underlying needs have to be met one way or another: avoiding try/catch blocks after every operation, and allowing detailed user customization based on error details.
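The following sketch shows how a single, information-rich exception type lets one handler serve a whole sequence of operations. The class layout, the member names, the error_code enumerators, and the function failing_rename are all assumptions made for illustration; they do not reproduce the Boost Filesystem interface, and the native code value 2 is only a placeholder.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical portable error categories; names are illustrative only.
enum error_code { no_error, not_found_error, already_exists_error, other_error };

// Hypothetical heavyweight exception: it carries a native and a portable
// error code, up to two paths, and the name of the failing operation
// alongside the message returned by what().
class filesystem_error : public std::runtime_error
{
public:
    filesystem_error(const std::string& who, const std::string& message,
                     const std::string& path1, const std::string& path2,
                     int native_code, error_code portable_code)
        : std::runtime_error(who + ": " + message),
          who_(who), path1_(path1), path2_(path2),
          native_code_(native_code), portable_code_(portable_code) {}

    const std::string& who() const { return who_; }
    const std::string& path1() const { return path1_; }
    const std::string& path2() const { return path2_; }
    int native_code() const { return native_code_; }
    error_code portable_code() const { return portable_code_; }

private:
    std::string who_, path1_, path2_;
    int native_code_;
    error_code portable_code_;
};

// Stand-in for a real operation: it always fails, so that the
// reporting path can be demonstrated without touching the file system.
void failing_rename(const std::string& from, const std::string& to)
{
    throw filesystem_error("rename", "source not found", from, to,
                           2, not_found_error);
}

// One handler around a script-like sequence is enough to build a
// report that a human user can act on.
std::string run_script()
{
    try
    {
        failing_rename("foobar/cheese", "foobar/cheddar");
        return "ok";
    }
    catch (const filesystem_error& e)
    {
        return std::string(e.what()) + " [" + e.path1() + " -> " + e.path2() + "]";
    }
}

} // namespace sketch
```

A program that needs finer control could instead branch on portable_code() in the handler, which is the kind of customization based on error details mentioned above.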
Because there is no such thing as absolute portability for names of files and directories, the design uses a relative portability approach which allows the user to specify which name portability rules are desired. Default, global user-specified, and per-constructor user-specified portability checking allows an application to perform as much or as little portability checking as desired. The experience with automatic checking is that it often identifies programmer oversights before they become serious problems.
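The three levels of checking (default, global user-specified, and per-constructor user-specified) might be arranged as in the sketch below. All names here (name_check, portable_name, no_check, default_name_check) and the particular rule enforced by portable_name are assumptions chosen for illustration, not a description of the library's actual checking functions.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

namespace sketch {

// A checking function decides whether one path element is acceptable.
typedef bool (*name_check)(const std::string& name);

// Accept every name.
bool no_check(const std::string&) { return true; }

// Hypothetical rule: letters, digits, '.', '_' and '-' only,
// non-empty, and not starting with '-'.
bool portable_name(const std::string& name)
{
    if (name.empty() || name[0] == '-')
        return false;
    for (std::string::size_type i = 0; i < name.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (!std::isalnum(c) && c != '.' && c != '_' && c != '-')
            return false;
    }
    return true;
}

// Global setting: starts at the default rule, and may be reassigned.
name_check& default_name_check()
{
    static name_check checker = portable_name;
    return checker;
}

class path
{
public:
    // Uses the current global checker.
    path(const std::string& s) : text_(s) { verify(default_name_check()); }
    // Per-constructor override.
    path(const std::string& s, name_check checker) : text_(s) { verify(checker); }
    const std::string& string() const { return text_; }

private:
    // Apply the checker to each non-empty '/'-separated element.
    void verify(name_check checker) const
    {
        std::string::size_type start = 0;
        while (start <= text_.size())
        {
            std::string::size_type end = text_.find('/', start);
            if (end == std::string::npos)
                end = text_.size();
            const std::string element = text_.substr(start, end - start);
            if (!element.empty() && !checker(element))
                throw std::invalid_argument("non-portable name: " + element);
            start = end + 1;
        }
    }

    std::string text_;
};

} // namespace sketch
```

With this arrangement, a name containing a space is rejected by default, accepted when no_check is passed to the constructor, and accepted everywhere once the global checker is reassigned.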
The Filesystem library includes several components which are essentially new
versions of components already in the current C++ Standard Library.
Specifically: remove, rename, basic_filebuf, filebuf, wfilebuf,
basic_ifstream, ifstream, wifstream, basic_ofstream, ofstream, wofstream,
basic_fstream, fstream, and wfstream. The primary difference for the
iostream (clause 27) classes is that seven constructors and open functions now
take arguments of type const path &. Specifications and implementation
simply reference the equivalent components in clause 27 of the current standard.
remove and rename differ in the type of their arguments, their return
types, and how they handle errors. Note that there is no intent to deprecate any
components in the current standard; these are in use in millions of lines of
existing code and must be preserved.
The versioning problem this creates is not unique to the Filesystem library; it is simply the first place where the C++ committee must face the problem.
Two choices were considered: to give the components completely different names or to place them in a sub-namespace. My thinking for the Boost library was that new names would cause serious confusion, so the new components were placed in sub-namespace filesystem. For the standard library, a filesystem component should use the same versioning approach used by other standard library components.
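A minimal sketch of the sub-namespace approach follows. The enclosing namespace sketch, the reduced path class, and the choice of bool as the return type of remove are assumptions made for illustration; only the general idea (same names, distinguished by namespace, taking const path & arguments and forwarding to the existing components) comes from the discussion above.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

namespace sketch {
namespace filesystem {

// Reduced stand-in for class path.
class path
{
public:
    path(const char* s) : text_(s) {}
    path(const std::string& s) : text_(s) {}
    const std::string& string() const { return text_; }
private:
    std::string text_;
};

// Same name as std::ifstream, distinguished only by namespace.
// The constructor and open() take const path & and forward to the base.
class ifstream : public std::ifstream
{
public:
    ifstream() {}
    explicit ifstream(const path& p,
                      std::ios_base::openmode mode = std::ios_base::in)
        : std::ifstream(p.string().c_str(), mode) {}

    void open(const path& p,
              std::ios_base::openmode mode = std::ios_base::in)
    {
        std::ifstream::open(p.string().c_str(), mode);
    }
};

// Same name as std::remove, but with a different argument type and,
// as an assumption of this sketch, a bool result: true if the file
// was removed.
bool remove(const path& p)
{
    return std::remove(p.string().c_str()) == 0;
}

} // namespace filesystem
} // namespace sketch
```

Existing code that calls std::remove or constructs a std::ifstream is unaffected, since the new versions are found only when the sub-namespace is named or brought into scope.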
The Boost Filesystem library is not currently internationalized; that work is underway. The approach being prototyped uses a basic_path template, with path and wpath typedefs, similar to strings and iostreams. Paths will need the ability to imbue a locale, to handle the conversion between internal and external representations.
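The shape of such a template can be sketched as follows. The template parameter, the member functions, and the operator/= shown here are assumptions made for illustration, since the prototype's interface is not specified in this paper; locale imbuing and the conversion between internal and external representations are omitted entirely.

```cpp
#include <string>

namespace sketch {

// Hypothetical basic_path, parameterized on character type in the
// manner of std::basic_string.
template <class charT>
class basic_path
{
public:
    typedef std::basic_string<charT> string_type;

    basic_path() {}
    basic_path(const string_type& s) : text_(s) {}
    basic_path(const charT* s) : text_(s) {}

    // Append another path, inserting the generic separator when
    // both sides are non-empty.
    basic_path& operator/=(const basic_path& rhs)
    {
        if (!text_.empty() && !rhs.text_.empty())
            text_ += charT('/');
        text_ += rhs.text_;
        return *this;
    }

    const string_type& string() const { return text_; }

private:
    string_type text_;
};

// Narrow and wide typedefs, mirroring string/wstring.
typedef basic_path<char> path;
typedef basic_path<wchar_t> wpath;

} // namespace sketch
```

Code written against sketch::path and sketch::wpath then differs only in the character type of its literals, much as code using std::string and std::wstring does.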
© Copyright Beman Dawes, 2004
Revised February 09, 2004