Libraries and Packages: The VOS Interface

2.5 File I/O -- fio


File I/O takes place using a stream, that is, an I/O channel available to the SPP program. The standard streams, referred to as STDIN, STDOUT, and STDERR (macros for integer values specifying a stream), are always open. That is, you need not call open() to access them. STDIN and STDOUT read from and write to the user terminal when working interactively but may be redirected or piped. STDERR is for warning or error messages. The fio library permits input from and output to binary or text files.

Before any I/O can be done on a file, the file must be opened. The open() procedure may be used to access ordinary files containing either text or binary data. To access a file on one of the special devices such as magnetic tape, a special open procedure must be used.

To conserve resources (file descriptors, buffer space), a file should be closed when it is no longer needed. When a file is closed, any file buffers that were created and written into are flushed before being deallocated. close() ignores any attempt to close STDIN. Attempts to close STDOUT or STDERR cause the respective output byte stream to be flushed but are otherwise ignored. An error results if one attempts to close a file that is not open.

File I/O functions are listed in Table 2.37; if you are working with binary data, Table 2.42, "Binary File I/O Functions," on page 98 lists additional functions.

Table 2.37: File I/O Functions.

The access modes (the mode argument to open()) are:

Table 2.38: File Access Modes.

The file types (the type argument to open()) are:

Table 2.39: File Types.



Table 2.40: File Manipulation Commands.

In the above procedures, the common calling sequence variables are declared as follows:

Table 2.41: File Variables.

Any file may be accessed after specifying only the filename, access mode, and file type parameters in the open() call. Occasionally, however, it is desirable to change the default file control parameters to optimize I/O to the file. The fset() procedure is used to set the FIO parameters for a particular file, while fget() is used to inspect the values of these parameters. The special value DEFAULT restores the default value of the indicated parameter.

The procedure seek() is used to move the file pointer (the offset in a file at which the next data transfer will occur). With text files, one can only seek to the start of a line, the position of which must have been determined by a prior call to note(). For binary files, seek() merely sets the logical offset within the file. The logical offset is the character offset in the file at which the next I/O transfer will occur. In general, there is no simple relationship between the logical offset and the actual physical offset in the file.
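The rule that a text file can only be repositioned to a place recorded earlier has a close analog in other I/O libraries. The sketch below is Python, not SPP, and does not use the fio procedures: tell() stands in for note(), seek() for seek(), and the helper names index_lines and line_at are made up for illustration. It records the start of each line while reading, then returns to those positions.

```python
import os
import tempfile


def index_lines(f):
    """Record the position of the start of every line while reading the file."""
    starts = []
    while True:
        pos = f.tell()            # plays the role of note()
        if not f.readline():      # empty string means end of file
            break
        starts.append(pos)
    return starts


def line_at(f, position):
    """Reposition to a previously recorded line start and read that line."""
    f.seek(position)              # only positions obtained from tell() are used
    return f.readline()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w", newline="\n") as f:
        f.write("first\nsecond\nthird\n")
    with open(path, "r", newline="\n") as f:
        starts = index_lines(f)
        second = line_at(f, starts[1])
        first = line_at(f, starts[0])
```

The recorded positions are treated as opaque values: they are only handed back to seek(), never computed, which mirrors the statement that the logical offset has no simple relationship to the physical offset.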

Binary File I/O

The minimum size addressable SPP data item is a character, usually implemented as a short (two byte) integer. Therefore, in binary file I/O, the size of the buffer is specified in units of chars, or shorts. It is possible to pack bit and byte data into chars. See the osb procedures described in "Bit & Byte Operations -- osb" on page 123.

Table 2.42: Binary File I/O Functions.

The read() procedure reads a maximum of nch characters from the file with descriptor fd into the user supplied memory buffer. The following example (Example 2.25) illustrates reading a binary file and extracting values. This is a straightforward example because all of the desired values are short integers at the beginning of the file.
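The idea of a read that returns at most nch values, counted in 2-byte chars rather than bytes, can be modeled outside SPP. The sketch below is Python, not SPP, and is not Example 2.25. It assumes a 2-byte char and native byte order; read_chars, BYTES_PER_CHAR, and the file contents are hypothetical names and data chosen for illustration.

```python
import os
import struct
import tempfile

BYTES_PER_CHAR = 2   # assumption: a char is a 2-byte short integer


def read_chars(f, nch):
    """Read at most nch 2-byte values; an empty list means end of file."""
    raw = f.read(nch * BYTES_PER_CHAR)
    count = len(raw) // BYTES_PER_CHAR
    return list(struct.unpack("=%dh" % count, raw[:count * BYTES_PER_CHAR]))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "image.dat")
    # Write five short integers; the first three act as a small header.
    with open(path, "wb") as f:
        f.write(struct.pack("=5h", 512, 256, 16, 7, 9))
    with open(path, "rb") as f:
        header = read_chars(f, 3)     # exactly the three header values
        rest = read_chars(f, 10)      # fewer values than requested remain
        at_eof = read_chars(f, 10)    # nothing left to read
```

The second call asks for ten values but receives only two, which is the sense in which the count passed to a read is a maximum.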



The next, slightly more complicated example demonstrates extracting individual bytes from a binary file. The code fragment reads a single word consisting of four bytes and assigns the individual byte values to separate short integers using the osb bytmov() procedure.
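The byte extraction described above can be sketched in Python as a model of the idea, not as SPP code. Here byte_move is a hypothetical stand-in loosely patterned on a byte-move routine; its argument order and 1-based offsets are assumptions and are not taken from the osb documentation. Each byte of a four-byte word is moved into the first byte of a zeroed 2-byte cell, which is then interpreted as a little-endian short.

```python
import struct


def byte_move(src, src_off, dst, dst_off, nbytes):
    """Copy nbytes from src to dst; offsets are 1-based (an assumption)."""
    dst[dst_off - 1:dst_off - 1 + nbytes] = src[src_off - 1:src_off - 1 + nbytes]


def unpack_bytes(word):
    """Return the four bytes of word as four separate short integer values."""
    if len(word) != 4:
        raise ValueError("expected a four-byte word")
    values = []
    for i in range(1, 5):
        cell = bytearray(2)               # one zeroed 2-byte short
        byte_move(word, i, cell, 1, 1)    # byte i goes into the first byte
        # "<h" treats the first byte as the low-order byte of the short.
        values.append(struct.unpack("<h", bytes(cell))[0])
    return values


shorts = unpack_bytes(bytearray([1, 2, 200, 255]))
```

Because the high-order byte of each cell stays zero, byte values above 127 come out as positive shorts rather than being sign-extended.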



Text Character I/O

The procedures getc() and putc() read and write character data, a single character at a time.

Table 2.43: Text Character I/O Operations.

Note that getchar() and putchar() deal with STDIN and STDOUT, respectively, so they do not require a file descriptor. The other procedures require a file descriptor from a previous call to open(), or one of the standard streams STDIN, STDOUT, or STDERR. The newline character is returned as part of a line read by getline(). The maximum size of a line (the size of a line buffer) is set at compile time by the system-wide constant SZ_LINE, so getline() reads at most SZ_LINE characters. To read more in one call, use getlline(), which includes an argument specifying how many characters to read.
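The two properties of line input noted above (the newline is kept, and a line buffer has a fixed maximum size) can be illustrated with a Python model. This is not SPP: get_line is a hypothetical helper built on Python's readline(), the value 8 for SZ_LINE is deliberately tiny and not the real system constant, and returning the remainder of an overlong line on later calls is an assumption of the model.

```python
import io

SZ_LINE = 8   # hypothetical, deliberately small; not the real system value


def get_line(stream, maxch=SZ_LINE):
    """Return the next line (newline included), at most maxch characters long.

    Returns None at end of file.
    """
    line = stream.readline(maxch)
    return line if line else None


text = "short\nthis line is long\n"

# With the default limit the long line arrives in several pieces.
stream = io.StringIO(text)
pieces = []
while True:
    piece = get_line(stream)
    if piece is None:
        break
    pieces.append(piece)

# With a larger explicit limit the long line arrives in one call.
stream = io.StringIO(text)
first = get_line(stream)
whole = get_line(stream, 64)
```

Passing the limit as an argument, as in the second half of the sketch, corresponds to the role the text gives to getlline().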

Pushback

Characters and strings (and even binary data) may be pushed back into the input stream. ungetc() pushes a single character. Subsequent calls to getc(), getline(), read(), etc. will read out the characters in the order in which they were pushed (first in, first out). When all of the pushback data have been read, reading resumes at the preceding file position, which may either be in one of the primary buffers, or an earlier state in the pushback buffer.

Table 2.44: Pushback Text Functions.

ungets() differs from ungetc() in that it pushes back whole strings, in a last in, first out fashion. ungets() is used to implement recursive macro expansions. The amount of recursion permitted may be specified after the file is opened, and before any data are pushed back. Recursion is limited by the size of the input pointer stack, and pushback capacity by the size of the pushback buffer.
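How string pushback supports recursive macro expansion can be sketched with a small Python model. This is not the fio implementation: PushbackReader, its methods, and the single-character macro names are hypothetical, and the model has no limit on recursion depth or pushback size. The most recently pushed string is read first, and reading resumes at the previous position once it is exhausted.

```python
class PushbackReader:
    """Toy input stream with string pushback (a model, not the fio API)."""

    def __init__(self, text):
        self._text = text     # contents of the underlying "file"
        self._pos = 0         # current position in the underlying text
        self._stack = []      # pushed-back strings; last pushed is read first

    def ungets(self, s):
        """Push a whole string back onto the input."""
        if s:
            self._stack.append([s, 0])    # [string, read offset]

    def getc(self):
        """Return the next character, or None at end of input."""
        while self._stack:
            entry = self._stack[-1]
            s, i = entry
            if i < len(s):
                entry[1] = i + 1
                return s[i]
            self._stack.pop()             # this pushback string is used up
        if self._pos < len(self._text):
            ch = self._text[self._pos]
            self._pos += 1
            return ch
        return None


def expand(text, macros):
    """Expand single-character macros, rescanning each replacement."""
    reader = PushbackReader(text)
    out = []
    while True:
        ch = reader.getc()
        if ch is None:
            break
        if ch in macros:
            reader.ungets(macros[ch])     # replacement is read next
        else:
            out.append(ch)
    return "".join(out)


expanded = expand("A+1", {"A": "B*2", "B": "x"})
```

Expanding A pushes "B*2"; reading B from that string pushes "x" on top of it, so the nested replacement is consumed before the rest of "B*2" and before the remaining input "+1".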

Filename Templates

The filename template package contains routines to expand a filename template string into a list of filenames, and to access the individual elements of the list. It is primarily a convenience that lets users specify wildcards in filenames and refer to files containing lists of names. The template is a comma-separated list of filenames, patterns, or list-file names. The concatenation operator (//) may be used within input list elements to form new output filenames. String substitution may also be used to form new filenames.

A sample template string is:

alpha, *.x, data* // .pix, [a-m]*, @list_file

This template would be expanded as the file alpha, followed in successive calls by all the files in the current directory whose names end in .x, followed by all files whose names begin with data with the extension .pix appended, and so on. The @ character signifies a list file, that is, a file containing regular filenames.
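The expansion just described can be modeled in a few lines of Python. This is a sketch of the idea, not the template package: expand_template is a hypothetical function, the directory contents and list file are passed in as plain Python data, pattern matching uses Python's fnmatch rules (which may differ from the real pattern syntax), and sorting the matches of each pattern is an assumption.

```python
from fnmatch import fnmatchcase


def expand_template(template, directory, list_files):
    """Expand a comma-separated filename template into a list of names.

    directory:  names assumed to exist in the current directory
    list_files: mapping from list-file name to the names it contains
    """
    names = []
    for element in template.split(","):
        element = element.strip()
        if not element:
            continue
        # "pattern // suffix" appends the suffix to every expanded name.
        pattern, _, suffix = element.partition("//")
        pattern, suffix = pattern.strip(), suffix.strip()
        if pattern.startswith("@"):
            matches = list(list_files[pattern[1:]])       # names from a list file
        elif any(c in pattern for c in "*?["):
            matches = sorted(n for n in directory if fnmatchcase(n, pattern))
        else:
            matches = [pattern]                           # plain name, passed through
        names.extend(m + suffix for m in matches)
    return names


directory = ["alpha", "main.x", "util.x", "data1", "data2", "zeta"]
list_files = {"list_file": ["one", "two"]}
expanded = expand_template(
    "alpha, *.x, data* // .pix, [a-m]*, @list_file", directory, list_files
)
```

Note that a name can appear more than once: alpha is listed explicitly and also matches [a-m]*, and the model makes no attempt to remove duplicates.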

String substitution uses the first string given for the template, expands the template, and for each filename generated by the template, substitutes the second string to generate a new filename. Some examples follow.

Table 2.45: String Substitution Characters.

The following procedures (with a b suffix) are the highest level and most convenient to use.

Table 2.46: High-Level Template Functions.

The remaining lower level routines expand a template on the fly and do not permit sorting or determination of the length of the list.

Table 2.47: Low-Level Template Routines.

