2.2 The Architecture of Iostreams
This section will introduce you to iostreams: what they are, how they work, what kinds of problems they help solve, and how they are structured. Section 2.2.4 provides an overview of the class templates in iostreams. If you want to skip over the software architecture of iostreams, please go on to Section 2.3 on formatted input/output.
2.2.1 What Are the Standard Iostreams?
The Standard C++ Library includes classes for text stream input/output. Before the current ANSI/ISO standard, most C++ compilers were delivered with a class library commonly known as the iostreams library. In this section, we refer to this library as the traditional iostreams, in contrast to the standard iostreams that are now part of the ANSI/ISO Standard C++ Library. The standard iostreams are to some extent compatible with the traditional iostreams, in that the overall architecture and the most commonly used interfaces are retained. Section 2.14 describes the incompatibilities in greater detail.
We can compare the standard iostreams not only with the traditional C++ iostreams library, but also with the I/O support in the Standard C Library. Many former C programmers still prefer the input/output functions offered by the C library, often referred to as C stdio. Familiarity with the C library is one reason to use C stdio instead of C++ iostreams, but there are others as well. For example, calls to printf() and scanf() are admittedly more concise than the equivalent iostreams expressions. However, C stdio has drawbacks, too, such as a lack of type safety and no consistent way to extend it for user-defined classes. We'll discuss these in more detail in the following sections.
2.2.1.1 Type Safety
Let us compare a call to stdio functions with the use of standard iostreams. The stdio call reads as follows:
int i = 25;
char name[50] = "Janakiraman";
fprintf(stdout, "%d %s", i, name);
It correctly prints: 25 Janakiraman.
But what if we inadvertently switch the arguments to fprintf? The error will not be detected until run time, and anything can happen, from peculiar output to a system crash. This is not the case with the standard iostreams:
cout << i << ' ' << name << '\n';
Since there are overloaded versions of the shift operator operator<<(), the right operator will always be called. The expression cout << i calls operator<<(int), and cout << name calls operator<<(const char*). Hence, the standard iostreams are typesafe.
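The overload selection described above can be sketched as follows. The helpers format_record and format_swapped are hypothetical names introduced for illustration, and a string stream stands in for cout so that the result can be inspected.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: inserts an int, then a C string.
std::string format_record(int i, const char* name) {
    std::ostringstream out;
    out << i << ' ' << name << '\n';   // operator<<(int), then operator<<(const char*)
    return out.str();
}

// Swapping the operands is harmless: the compiler still picks the
// overload that matches each operand's type.
std::string format_swapped(int i, const char* name) {
    std::ostringstream out;
    out << name << ' ' << i << '\n';
    return out.str();
}
```

Unlike the fprintf call, there is no format string that could disagree with the argument types.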
2.2.1.2 Extensibility to New Types
Another advantage of the standard iostreams is that user-defined types can be made to fit in seamlessly. Consider a type Pair that we want to print:
struct Pair { int x; string y; };
All we need to do is overload operator<<() for this new type Pair, and we can output pairs this way:
Pair p = {5, "May"}; cout << p;
The corresponding operator<<() can be implemented as:
basic_ostream<char>& operator<<(basic_ostream<char>& o, const Pair& p) { return o << p.x << ' ' << p.y; }
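Put together, the example might look like the sketch below. It spells basic_ostream<char> as std::ostream, which is the same type. The extractor and the helper pair_to_text are assumptions added for illustration and do not appear in the text above.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

struct Pair {
    int x;
    std::string y;
};

// Inserter: writes the two members separated by a blank.
std::ostream& operator<<(std::ostream& o, const Pair& p) {
    return o << p.x << ' ' << p.y;
}

// Assumed counterpart (not shown in the text): an extractor that
// reads the members back in the same order.
std::istream& operator>>(std::istream& in, Pair& p) {
    return in >> p.x >> p.y;
}

// Hypothetical helper that formats a Pair into a string.
std::string pair_to_text(const Pair& p) {
    std::ostringstream out;
    out << p;
    return out.str();
}
```

Because both operators return the stream, a Pair can be chained with other insertions and extractions like any built-in type.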
2.2.2 How Do the Standard Iostreams Work?
The main purpose of the standard iostreams is to serve as a tool for input and output of text. Generally, input and output are the transfer of data between a program and any kind of external device, as illustrated in Figure 1 below:
Figure 1. Data transfer supported by iostreams
The internal representation of such data is meant to be convenient for data processing in a program. On the other hand, the external representation can vary quite a bit: it might be a display in human-readable form, or a portable data exchange format. The intent of a representation, such as conserving space for storage, can also influence the representation.
Text I/O involves the external representation of a sequence of characters; every other case involves binary I/O. Traditionally, iostreams are used for text processing. Such text processing through iostreams involves two processes: formatting and code conversion.
Formatting is the transformation from a byte sequence representing internal data into a human-readable character sequence; for example, from a floating point number, or an integer value held in a variable, into a sequence of digits. Figure 2 below illustrates the formatting process:
Figure 2. Formatting program data
Code conversion is the process of translating one character representation into another; for example, from wide characters held internally to a sequence of multibyte characters for external use. Wide characters are all the same size, and thus are convenient for internal data processing. Multibyte characters have different sizes and are stored more compactly. They are typically used for data transfer, or for storage on external devices such as files. Figure 3 below illustrates the conversion process:
Figure 3. Code conversion between multibytes and wide characters
2.2.2.1 The Iostream Layers
The iostreams facility has two layers: one that handles formatting, and another that handles code conversion and transport of characters to and from the external device. The layers communicate through a buffer, as illustrated in Figure 4 below:
Figure 4. The iostreams layers
Let's take a look at the function of each layer in more detail:
The Formatting Layer. Here the transformation between a program's internal data representation and a readable representation as a character sequence takes place. This formatting and parsing may involve, among other things:
Precision and notation of floating point numbers;
Hexadecimal, octal, or decimal representation of integers;
Skipping of white space in the input;
Field width for output;
Adapting of number formatting to local conventions.
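A few of the formatting controls listed above can be sketched with the standard manipulators. The function names are hypothetical, and the default "C" locale is assumed.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Precision and notation, integer base, and field width for output.
std::string format_samples() {
    std::ostringstream out;
    out << std::fixed << std::setprecision(2) << 3.14159 << '|'   // two decimals
        << std::hex << 255 << '|'                                  // hexadecimal
        << std::dec << std::setw(6) << 42;                         // width 6, right-aligned
    return out.str();
}

// Skipping of white space in the input: operator>> ignores leading
// blanks, tabs, and newlines by default.
int sum_two(const std::string& text) {
    std::istringstream in(text);
    int a = 0, b = 0;
    in >> a >> b;
    return a + b;
}
```

Under these assumptions format_samples() yields "3.14|ff|    42", and sum_two("   7 \t 8") yields 15.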
The Transport Layer. This layer is responsible for producing and consuming characters. It encapsulates knowledge about the properties of a specific external device. Among other things, this involves:
Block-wise output to files through system calls;
Code conversion to multibyte encodings.
To reduce the number of accesses to the external device, a buffer is used. For output, the formatting layer sends sequences of characters to the transport layer, which stores them in a stream buffer. The actual transport to the external device typically happens only when the buffer is full or the stream is explicitly flushed. For input, the transport layer reads from the external device and fills the buffer. The formatting layer receives characters from the buffer. When the buffer is empty, the transport layer is responsible for refilling it.
Locales. Both the formatting and the transport layers use the stream's locale. (See the section on internationalization and locales.) The formatting layer delegates the handling of numeric entities to the locale's numeric facets. The transport layer uses the locale's code conversion facet for character-wise transformation between the buffer content and characters transported to and from the external device. Figure 5 below shows how locales are used with iostreams:
Figure 5. Use of locales in iostreams
2.2.2.2 File and In-Memory I/O
Iostreams support two kinds of I/O: file I/O and in-memory I/O.
File I/O involves the transfer of data to and from an external device. The device need not necessarily be a file in the usual sense of the word. It could just as well be a communication channel, or another construct that conforms to the file abstraction.
In contrast, in-memory I/O involves no external device. Thus code conversion and transport are not necessary; only formatting is performed. The result of such formatting is maintained in memory, and can be retrieved in the form of a character string.
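As a minimal sketch of in-memory I/O, the following formats a value into a string and parses values out of one. The helper names describe and parse_size, and the "<width>x<height>" input format, are assumptions made for this example.

```cpp
#include <sstream>
#include <string>

// Formatting only: the result stays in memory and is retrieved as a string.
std::string describe(double celsius) {
    std::ostringstream out;
    out << "temp=" << celsius << "C";
    return out.str();
}

// Parsing from a string: expects "<width>x<height>", e.g. "640x480".
// Returns false if the text does not have that shape.
bool parse_size(const std::string& text, int& width, int& height) {
    std::istringstream in(text);
    char sep = 0;
    return (in >> width >> sep >> height) && sep == 'x';
}
```

No external device is involved in either direction; the string itself plays that role.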
2.2.3 How Do the Standard Iostreams Help Solve Problems?
There are many situations in which iostreams are useful:
File I/O. Iostreams can still be used for input and output to files, although file I/O has lost some of its former importance. In the past, alphanumeric user interfaces were often built using file input/output to the standard input and output channels. Today almost all applications have graphical user interfaces.
Nevertheless, iostreams are still useful for input and output to files other than the standard input and output channels, and to all other kinds of external media that fit into the file abstraction. For example, the Rogue Wave class library for network communications programming, Net.h++, uses iostreams for input and output to various kinds of communication streams like sockets and pipes.
In-Memory I/O. Iostreams can perform in-memory formatting and parsing. Even with a graphical user interface, you have to format the text you want to display. The standard iostreams offer internationalized in-memory I/O, which is a great help for text processing tasks like formatting. The formatting of numeric values, for example, depends on cultural conventions. The formatting layer uses a locale's numeric facets to adapt its formatting and parsing to cultural conventions.
Internationalized Text Processing. This function is actively supported by iostreams.
Iostreams use locales. As locales are extensible, any kind of facet can be carried by a locale, and thus used by a stream. By default, iostreams use only the numeric and the code conversion facets of a locale. However, date, time, and monetary facets are available in the Standard C++ Library. Other cultural dependencies can be encapsulated in unique facets and made accessible to a stream. You can easily internationalize your use of iostreams to meet your needs.
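One way a stream picks up a facet from its locale can be sketched as follows. The facet continental_punct and the helper format_continental are hypothetical; they replace the numeric punctuation facet of the classic locale rather than using any named system locale.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical numeric punctuation facet: '.' groups thousands,
// ',' marks the decimal point.
class continental_punct : public std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }   // groups of three digits
};

std::string format_continental(double value) {
    std::ostringstream out;
    // Build a locale that carries the facet; the locale takes ownership of it.
    out.imbue(std::locale(std::locale::classic(), new continental_punct));
    out << std::fixed << std::setprecision(2) << value;
    return out.str();
}
```

With this facet imbued, the formatting layer renders 1234567.891 as "1.234.567,89" without any change to the insertion code itself.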
Binary I/O. The traditional iostreams suffer from a number of limitations. The biggest is the lack of conversion abilities: if you insert a double into a stream, for example, you do not know what format will be used to represent this double on the external device. There is no portable way to insert it as binary.
Standard iostreams are far more flexible. The code conversion performed on transfer of internal data to external devices can be customized: the transport layer delegates the task of converting to a code conversion facet. If you provide a stream with a suitable code conversion facet for binary output, you can insert a double into a file stream in a portable binary data exchange format. No such code conversion facets are provided by the Standard Library, however, and implementing such a facet is not trivial. As an alternative, you might consider implementing an entire stream buffer layer that can handle binary I/O.
Extending Iostreams. In a way, you can think of iostreams as a framework that can be extended and customized. You can add input and output operators for user-defined types, or create your own formatting elements, the manipulators. You can specialize entire streams, usually in conjunction with specialized stream buffers. You can provide different locales to represent different cultural conventions, or to contain special purpose facets. You can instantiate iostreams classes for new character types, other than char or wchar_t.
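The simplest of these extensions, a manipulator without arguments, is just a function that takes and returns a stream. The names comma and join3 below are hypothetical and serve only as illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical manipulator: inserts a comma followed by a blank.
std::ostream& comma(std::ostream& os) {
    return os << ", ";
}

// The stream's operator<< accepts a pointer to such a function and calls it.
std::string join3(int a, int b, int c) {
    std::ostringstream out;
    out << a << comma << b << comma << c;
    return out.str();
}
```

Here join3(1, 2, 3) produces "1, 2, 3"; the manipulator is used exactly like the predefined ones such as endl.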
2.2.4 The Internal Structure of the Iostreams Layers
As explained earlier, iostreams have two layers, one for formatting, and another for code conversion and transport of characters to and from the external device. For convenience, let's repeat here in Figure 6 the illustration of the iostreams layers given in Figure 4 of Section 2.2.2:
Figure 6. The iostreams layers
This section will give a more detailed description of the iostreams software architecture, including the classes and their inheritance relationship and respective responsibilities. If you would rather start using iostreams directly, go on to Section 2.3.
2.2.4.1 The Internal Structure of the Formatting Layer
Classes that belong to the formatting layer are often referred to as the stream classes. Figure 7 illustrates the class hierarchy of all the stream classes:
Figure 7. Internal class hierarchy of the formatting layer [14]
Let us discuss in more detail the components and characteristics of the class hierarchy given in the figure:
The Iostreams Base Class ios_base. This class is the base class of all stream classes. Independent of character type, it encapsulates information that is needed by all streams. This information includes:
Control information for parsing and formatting;
Additional information for the user's special needs (a way to extend iostreams, as we will see later on);
The locale imbued on the stream.
Additionally, ios_base defines several types that are used by all stream classes, such as format flags, status bits, open mode, exception class, etc.
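As a small sketch of the control information kept in ios_base, the following saves the format flags, changes them, and restores them. The helper name hex_then_decimal is an assumption made for this example.

```cpp
#include <ios>
#include <sstream>
#include <string>

std::string hex_then_decimal(int value) {
    std::ostringstream out;
    const std::ios_base::fmtflags saved = out.flags();   // remember the current flags

    out.setf(std::ios_base::hex, std::ios_base::basefield);   // switch the base field to hex
    out.setf(std::ios_base::showbase);                        // prefix with "0x"
    out << value << ' ';

    out.flags(saved);                                         // restore the original flags
    out << value;
    return out.str();
}
```

For the value 255 this yields "0xff 255". The type fmtflags and the constants hex, basefield, and showbase are all members of ios_base, so they are available on every stream regardless of character type.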
The Iostreams Character Type-Dependent Base Class. Here is the virtual base class for the stream classes:
basic_ios<class charT, class traits=char_traits<charT> >
The class holds a pointer to the stream buffer, and state information that reflects the integrity of the stream buffer.
Note that basic_ios<> is a class template taking two parameters, the type of character handled by the stream, and the character traits.
The type of character can be type char for single-byte characters, or type wchar_t for wide characters, or any other user-defined character type. There are instantiations for char and wchar_t provided by the Standard C++ Library.
For convenience, there are typedefs for these instantiations:
typedef basic_ios<char> ios;
typedef basic_ios<wchar_t> wios;
Note that ios is not a class anymore, as it was in the traditional iostreams. If you have existing programs that use the old iostreams, they may no longer compile with the standard iostreams. (See the list of incompatibilities in Section 2.14.)
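Because the state information lives in basic_ios<>, code that inspects it can be written once for every character type. The following is a sketch with hypothetical helper names; it assumes narrow and wide string streams as the concrete streams.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Works for ios, wios, and any other instantiation of basic_ios<>.
template <class charT, class traits>
std::string describe_state(const std::basic_ios<charT, traits>& s) {
    if (s.bad())  return "bad";
    if (s.fail()) return "fail";
    if (s.eof())  return "eof";
    return "good";
}

std::string state_after_reading_int(const std::string& text) {
    std::istringstream in(text);
    int n = 0;
    in >> n;
    return describe_state(in);
}

std::string wide_state_after_reading_int(const std::wstring& text) {
    std::wistringstream in(text);
    int n = 0;
    in >> n;
    return describe_state(in);
}
```

Reading an int from "42 rest" leaves the stream good, from "42" it reaches the end of the input, and from "abc" the extraction fails.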
Character Traits. These describe the properties of a character type. Many things change with the character type, such as:
The end-of-file value. For type char, the end-of-file value is represented by an integral constant called EOF. For type wchar_t, there is a constant defined that is called WEOF. For an arbitrary user-defined character type, the associated character traits define what the end-of-file value for this particular character type is.
The type of the EOF value. This needs to be a type that can hold the EOF value. For example, for single-byte characters, this type is int, different from the actual character type char.
The equality of two characters. For an exotic user-defined character type, the equality of two characters might mean something different from just bit-wise equality. Here you can define it.
A complete list of character traits is given in the string section that explains character traits.
There are specializations defined for type char and wchar_t. In general, this class template is not meant to be instantiated for a character type. You should always define class template specializations.
Fortunately, the Standard C++ Library is designed to make the most common cases the easiest. The traits template parameter has a sensible default value, so usually you don't have to bother with character traits at all.
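To make the role of character traits concrete, here is a sketch of a traits class that redefines character equality to ignore case. It derives from char_traits<char> and overrides a few members, which is a common shortcut and an assumption of this example, not the full specialization recommended above. The names ci_char_traits, ci_string, and the two helper functions are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <cwchar>
#include <string>

// Hypothetical traits class: characters compare equal regardless of case.
struct ci_char_traits : public std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char a) {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], a)) return s + i;
        return nullptr;
    }
};

typedef std::basic_string<char, ci_char_traits> ci_string;

bool same_ignoring_case(const char* a, const char* b) {
    return ci_string(a) == ci_string(b);
}

// The predefined traits report the end-of-file values named in the text.
bool eof_values_match() {
    return std::char_traits<char>::eof() == EOF &&
           std::char_traits<wchar_t>::eof() == WEOF;
}
```

With these traits, "Hello" and "hELLO" compare equal, while the default char_traits<char> would treat them as different.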
The Input and Output Streams. The three stream classes for input and output are:
basic_istream <class charT, class traits=char_traits<charT> >
basic_ostream <class charT, class traits=char_traits<charT> >
basic_iostream<class charT, class traits=char_traits<charT> >
Class istream handles input, and class ostream handles output. Class iostream deals with both input and output; such a stream is called a bidirectional stream.
The three stream classes define functions for parsing and formatting, which are overloaded versions of operator>>() for input, called extractors, and overloaded versions of operator<<() for output, called inserters.
Additionally, there are member functions for unformatted input and output, like get(), put(), etc.
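The difference between formatted and unformatted input can be sketched with two small counting functions. Both names are hypothetical, and any istream can be passed in.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Formatted input: the extractor skips white space and reads one word at a time.
std::size_t count_words(std::istream& in) {
    std::size_t n = 0;
    std::string word;
    while (in >> word) ++n;
    return n;
}

// Unformatted input: get() delivers every character, white space included.
std::size_t count_chars(std::istream& in) {
    std::size_t n = 0;
    char c;
    while (in.get(c)) ++n;
    return n;
}
```

For the input "one  two\nthree" the first function counts 3 words, while the second counts all 14 characters.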
The File Streams. The file stream classes support input and output to and from files. They are:
basic_ifstream<class charT, class traits=char_traits<charT> >
basic_ofstream<class charT, class traits=char_traits<charT> >
basic_fstream<class charT, class traits=char_traits<charT> >
There are functions for opening and closing files, similar to the C functions fopen() and fclose(). Internally they use a special kind of stream buffer, called a file buffer, to control the transport of characters to/from the associated file. The function of the file streams is illustrated in Figure 8:
Figure 8. File I/O
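A minimal sketch of file stream usage follows. The helper name and the file path passed to it are assumptions; the function writes a small file, reads its first line back, and removes the file again.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Writes two lines to `path`, reads the first one back, and deletes the file.
// Returns an empty string if the file cannot be opened for writing.
std::string write_and_read_first_line(const std::string& path) {
    {
        std::ofstream out(path.c_str());        // opens the file, like fopen()
        if (!out) return "";
        out << "first line\n" << "second line\n";
    }   // the destructor flushes the file buffer and closes the file

    std::string line;
    {
        std::ifstream in(path.c_str());
        if (in) std::getline(in, line);
    }
    std::remove(path.c_str());                  // clean up the scratch file
    return line;
}
```

The explicit open() and close() member functions could be used instead of relying on the constructor and destructor.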
The String Streams. The string stream classes support in-memory I/O; that is, reading and writing to a string held in memory. They are:
basic_istringstream<class charT, class traits=char_traits<charT> >
basic_ostringstream<class charT, class traits=char_traits<charT> >
basic_stringstream<class charT, class traits=char_traits<charT> >
There are functions for getting and setting the string to be used as a buffer. Internally a specialized stream buffer is used. In this particular case, the buffer and the external device are the same. Figure 9 below illustrates how the string stream classes work:
Figure 9. In-memory I/O
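The functions for getting and setting the string are both called str(). The sketch below, with the hypothetical name reuse_stream, writes to a bidirectional string stream, parses the text back, and then replaces the underlying string.

```cpp
#include <sstream>
#include <string>

std::string reuse_stream() {
    std::stringstream ss;
    ss << 12 << ' ' << 30;        // write into the in-memory buffer

    int a = 0, b = 0;
    ss >> a >> b;                 // parse back what was just written

    ss.clear();                   // reading hit the end; reset the state bits
    ss.str("");                   // set: replace the underlying string
    ss << "sum=" << (a + b);
    return ss.str();              // get: a copy of the current contents
}
```

The call returns "sum=42". Note that str("") alone does not reset the end-of-file state, which is why clear() is called first.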
2.2.4.2 The Transport Layer's Internal Structure
Classes of the transport layer are often referred to as the stream buffer classes. Figure 10 gives the class hierarchy of all stream buffer classes:
Figure 10. Hierarchy of the transport layer
The stream buffer classes are responsible for transfer of characters from and to external devices.
The Stream Buffer. This class represents an abstract stream buffer:
basic_streambuf<class charT, class traits=char_traits<charT> >
It does not have any knowledge about the external device. Instead, it defines two virtual functions, overflow() and underflow(), to perform the actual transport. These two functions have knowledge of the peculiarities of the external device they are connected to. They have to be overridden by all concrete stream buffer classes, like file and string buffers.
The stream buffer class maintains two character sequences: the get area, which represents the input sequence read from an external device, and the put area, which is the output sequence to be written to the device. There are functions for providing the next character from the buffer, such as sgetc(). They are typically called by the formatting layer in order to receive characters for parsing. Accordingly, there are also functions for placing the next character into the buffer, such as sputc().
A stream buffer also carries a locale object.
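To illustrate how overflow() performs the actual transport, here is a sketch of a stream buffer that has no put area of its own, so every character reaches overflow(), which converts it to upper case and forwards it to another stream buffer. The class upper_buf and the helper shout are hypothetical names; the destination buffer plays the part of the external device.

```cpp
#include <cctype>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical stream buffer: upper-cases characters on their way out.
class upper_buf : public std::streambuf {
public:
    explicit upper_buf(std::streambuf* dest) : dest_(dest) {}

protected:
    // Called for each character because no put area was set up.
    int_type overflow(int_type c) override {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);              // nothing to write
        const char ch = traits_type::to_char_type(c);
        const char up = static_cast<char>(
            std::toupper(static_cast<unsigned char>(ch)));
        return dest_->sputc(up);                         // hand over to the "device"
    }

private:
    std::streambuf* dest_;
};

std::string shout(const std::string& text) {
    std::ostringstream sink;
    upper_buf filter(sink.rdbuf());
    std::ostream out(&filter);       // any ostream can sit on top of the buffer
    out << text << ' ' << 42;
    return sink.str();
}
```

The formatting layer is unchanged: shout("hello") still formats the number 42 as usual, and only the transport step turns the text into "HELLO 42".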
The File Buffer. The file buffer classes associate the input and output sequences with a file. A file buffer takes the form:
basic_filebuf<class charT, class traits=char_traits<charT> >
The file buffer has functions like open() and close(). The file buffer class inherits a locale object from its stream buffer base class. It uses the locale's code conversion facet for transforming the external character encoding to the encoding used internally. Figure 11 shows how the file buffer works:
Figure 11. Character code conversion performed by the file buffer
The String Stream Buffer. These classes implement the in-memory I/O:
basic_stringbuf<class charT, class traits=char_traits<charT> >
With string buffers, the internal buffer and the external device are one and the same. The internal buffer is dynamic, in that it is extended if necessary to hold all the characters written to it. You can obtain copies of the internally held buffer, and you can provide a string to be copied into the internal buffer.
2.2.4.3 Collaboration of Streams and Stream Buffers
The base class basic_ios<> holds a pointer to a stream buffer. The derived stream classes, like file and string streams, contain a file or string buffer object. The stream buffer pointer of the base class refers to this embedded object. This architecture is illustrated in Figure 12 below:
Figure 12. How an input file stream uses a file buffer
Stream buffers can be used independently of streams, for example for unformatted I/O. However, streams always need a stream buffer.
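As a sketch of using a stream buffer without any stream on top of it, the following functions work directly on a string buffer. The function names are hypothetical.

```cpp
#include <sstream>
#include <string>

// Reads characters straight from the buffer up to the first blank.
std::string first_word_via_buffer(const std::string& text) {
    std::stringbuf buf(text);
    typedef std::stringbuf::traits_type traits;

    std::string word;
    // sgetc() peeks at the next character; sbumpc() consumes it.
    while (!traits::eq_int_type(buf.sgetc(), traits::eof()) &&
           traits::to_char_type(buf.sgetc()) != ' ') {
        word += traits::to_char_type(buf.sbumpc());
    }
    return word;
}

// Writes characters straight into the buffer.
std::string write_via_buffer() {
    std::stringbuf buf;
    buf.sputc('a');          // one character
    buf.sputn("bc", 2);      // a sequence of characters
    return buf.str();
}
```

No formatting or parsing takes place here; the characters are moved as they are.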
2.2.4.4 Collaboration of Locales and Iostreams
The base class ios_base contains a locale object. The formatting and parsing functions defined by the derived stream classes use the numeric facets of that locale.
The class basic_ios<charT> holds a pointer to the stream buffer. This stream buffer has a locale object, too, usually a copy of the same locale object used by the functions of the stream classes. The stream buffer's input and output functions use the code conversion facet of the attached locale. Figure 13 below illustrates the architecture:
Figure 13. How an input file stream uses locales
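The relationship between the two locale objects can be observed with getloc(). In the sketch below, the facet plain_punct exists only to build a locale that differs from the default, and the function name is hypothetical.

```cpp
#include <locale>
#include <sstream>

// Hypothetical facet used only to obtain a distinguishable locale.
struct plain_punct : std::numpunct<char> {};

bool imbue_reaches_buffer() {
    const std::locale custom(std::locale::classic(), new plain_punct);
    std::ostringstream out;
    out.imbue(custom);
    // Imbuing the stream also imbues its stream buffer, so both
    // report a copy of the same locale.
    return out.getloc() == custom && out.rdbuf()->getloc() == custom;
}
```

A stream buffer can also be given a different locale on its own through pubimbue(), which is why the text says the buffer's locale is usually, not always, a copy of the stream's.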
©Copyright 1996, Rogue Wave Software, Inc.