Data Formats


HST data sets are stored in Generic Edited Information Set (GEIS) format. Like IRAF format, images are stored in two files:

The two biggest differences between OIF and GEIS files, from the user's point of view, are:

Data Structure Overview

GEIS data files have a structure that is based on, but significantly different from, the FITS group format.*1 The GEIS group format was designed to accommodate data such as time-resolved spectroscopy, where many small data arrays share common header information, but also have a certain amount of array-specific information. For example, the FOS and HRS instruments aboard HST have observing modes for time-resolved spectroscopy which can produce hundreds of spectra, all of the same size (in pixels), and all for the same observation set. If these were each stored as separate files, the overhead in file size and management would be horrendous. These data are, therefore, combined into single observation sets in GEIS group format. In this case, the descriptors that are common to all the spectra (such as the size of each spectrum, the instrument configuration, etc.) are stored in the header file, but the group-specific descriptors (e.g., the offsets for the GIMP correction) are stored with the binary data.

Parameters that are specific to a subset of the data are called group parameters, and they are stored in a reserved area of the binary data file called the group parameter block. Each group*2 of binary data has a reserved area for the group parameter block, allowing each group to have different values for group-dependent parameters. A representation of the structure is shown in Figure 3.3.

Note that in GEIS format the group parameter block follows each group of binary data, and that the group parameters can be any supported data type. But as with FITS groups, each image must be of the same data type.
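The layout described above can be sketched as simple byte arithmetic. This is a minimal illustration, not STSDAS code: the function name and the sample numbers are hypothetical, and it assumes that each group is its data array followed immediately by its group parameter block, with no padding between groups and a parameter block size given in bits.

```python
from math import prod


def group_layout(axis_lengths, bits_per_pixel, gpb_bits, group_count):
    """Return (data_offset, gpb_offset) in bytes for every group.

    Assumption: a group is its data array immediately followed by its
    group parameter block, and groups are packed back to back.
    """
    data_bytes = prod(axis_lengths) * bits_per_pixel // 8
    gpb_bytes = gpb_bits // 8
    stride = data_bytes + gpb_bytes  # bytes from one group to the next
    return [(g * stride, g * stride + data_bytes) for g in range(group_count)]


# Hypothetical data set: 3 spectra of 512 32-bit pixels each,
# with a 256-bit (32-byte) group parameter block per group.
layout = group_layout([512], 32, 256, 3)
for number, (data_at, gpb_at) in enumerate(layout, start=1):
    print(f"group {number}: data at byte {data_at}, parameters at byte {gpb_at}")
```

Because every group has the same size, the position of any group can be computed directly from its number, which is what makes it practical to keep hundreds of spectra in one file.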

Figure 3.3: Structure of an STSDAS Group Format File

Image Descriptors--Header Keywords

Header files are composed solely of ASCII text, and can be printed using standard text-handling facilities (type, page, etc.). However, the lines in GEIS headers are always 80-character, fixed-length ASCII records. Any change you make in the text must not change the length of a line from 80 characters. The images.hedit task will preserve the 80-character line width. There are also tasks such as chcalpar (in the ctools package) which will change the calibration switches if you wish to recalibrate a dataset.
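The fixed-length rule can be checked mechanically before and after any change to a header. The following is a minimal sketch under the assumption that the header has been read into a string with one record per line; the function name and the sample cards are hypothetical, and it is a check only, not a replacement for hedit.

```python
CARD_LENGTH = 80  # every GEIS header record must be exactly this long


def bad_cards(header_text):
    """Return (line_number, length) for each line that is not 80 characters."""
    return [
        (number, len(line))
        for number, line in enumerate(header_text.splitlines(), start=1)
        if len(line) != CARD_LENGTH
    ]


# Hypothetical header: the first card is padded correctly, the second
# was shortened by a text editor that stripped trailing blanks.
good_card = "SIMPLE  =                    F /".ljust(CARD_LENGTH)
short_card = "GCOUNT  = 4"
sample_header = good_card + "\n" + short_card + "\n"

for number, length in bad_cards(sample_header):
    print(f"line {number} has {length} characters, expected {CARD_LENGTH}")
```

An empty result means every record has the required length; any entry in the list points at a line that would corrupt the header.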

Always edit headers using tasks like hedit. Editing headers with a text editor may corrupt the files by creating incorrect line lengths.

Group Descriptors--Group Parameter Block

Because the actual values of group parameters are stored in the group parameter block with the binary data, you cannot change them by editing the ASCII header; you must use the tasks images.hedit or toolbox.headers.groupmod. Similarly, you cannot view the values of group parameters simply by looking at the ASCII text header. Rather, you must use a header listing tool such as images.imheader, which will list the full set of header parameters, including both the main header and the group parameter block. For example, the first several lines of the file hrs.hhh might appear as shown in Figure 3.4.

Figure 3.4: Header Listing from imheader Task

Note that the six group parameters are listed as part of the normal header, by name (DATAMIN, DATAMAX, CRPIX1, CRVAL1, CTYPE1, CD1_1), and their values have been extracted from the binary data for the listing. It is not immediately obvious which header parameters are from the main ASCII header, and which are from the group parameter block, but from the user's (and programmer's) point of view, it does not matter. If you need to know the origin of a parameter, just search the file header for the string "PTYPE", and you will find the names of all group parameters as shown in Figure 3.5. For each group parameter there is a corresponding keyword PDTYPEn (where n is the parameter number concatenated to PTYPE) which gives the data type of the parameter and PSIZEn which gives its size (in bits). PCOUNT gives the total number of group parameters. GCOUNT gives the total number of groups in the binary file.

Figure 3.5: Finding Group Parameters
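The search for "PTYPE" described above can also be done programmatically. The sketch below is illustrative only: the function names, the sample cards, and their values are hypothetical, and it assumes cards of the form `KEYWORD = value / comment` with string values in single quotes.

```python
def parse_cards(header_text):
    """Map keyword -> value string for 'KEYWORD = value / comment' cards."""
    values = {}
    for card in header_text.splitlines():
        keyword, separator, rest = card.partition("=")
        keyword = keyword.strip()
        if not separator or " " in keyword:
            continue  # not a keyword card (e.g. free-text comment lines)
        # Simplification: a "/" inside a quoted string would be cut short here.
        values[keyword] = rest.split("/", 1)[0].strip().strip("'").strip()
    return values


def group_parameters(header_text):
    """Return (name, data_type, size_in_bits) for each group parameter."""
    cards = parse_cards(header_text)
    count = int(cards.get("PCOUNT", 0))
    return [
        (cards[f"PTYPE{n}"], cards[f"PDTYPE{n}"], int(cards[f"PSIZE{n}"]))
        for n in range(1, count + 1)
    ]


# Hypothetical header excerpt with two group parameters.
sample_cards = "\n".join([
    "GCOUNT  =                    4 /",
    "PCOUNT  =                    2 /",
    "PTYPE1  = 'DATAMIN '           /",
    "PDTYPE1 = 'REAL*4  '           /",
    "PSIZE1  =                   32 /",
    "PTYPE2  = 'CRVAL1  '           /",
    "PDTYPE2 = 'REAL*8  '           /",
    "PSIZE2  =                   64 /",
])

for name, data_type, bits in group_parameters(sample_cards):
    print(f"{name}: {data_type}, {bits} bits")
```

Any keyword that appears in this list takes its value from the group parameter block rather than from the ASCII header.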

Disk Binary Formats--Physical Implementation

The binary data are stored in the host computer's natural format for all supported data types. There is no conversion to IEEE floating point format on systems such as VMS where the natural floating point format is not IEEE. The physical structure of GEIS files is therefore machine dependent. On VAX/VMS systems, a GEIS file header is a sequential file consisting of fixed-length 80-byte records. The binary data are written as VAX reals, VAX integers (i.e., byte-swapped, or little-endian format), or whatever the default machine format is for the given data type.

The Unix file system is much simpler than the VMS file system, and all files are treated as byte streams. The GEIS header file is still composed of 80-character card images, with each card image terminated by a "\n" (newline). The binary data is a byte stream, just as on VMS, but with the data written in the natural binary format of the host system.

On most Unix systems the floating point format is the IEEE floating point standard, and data are not byte-swapped as they are on the VAX; the byte order is big-endian. On the DECStation family (based on the MIPS RISC chip) the floating point format is IEEE, but all data are byte-swapped (little-endian).
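The effect of byte order can be seen with a few bytes. This small sketch uses Python's standard `struct` module and covers only IEEE data; VAX floating point is a different format that `struct` does not decode. The sample values are arbitrary.

```python
import struct
import sys

# The IEEE single-precision value 1.5 written in big-endian byte order.
raw_real = struct.pack(">f", 1.5)
real_as_big = struct.unpack(">f", raw_real)[0]     # read with the order it was written
real_as_little = struct.unpack("<f", raw_real)[0]  # read with the wrong order

# The 16-bit integer 1 written in big-endian byte order.
raw_short = struct.pack(">h", 1)
short_as_big = struct.unpack(">h", raw_short)[0]
short_as_little = struct.unpack("<h", raw_short)[0]

print("host byte order:", sys.byteorder)
print("real bytes:", raw_real.hex(), "->", real_as_big, "or", real_as_little)
print("short bytes:", raw_short.hex(), "->", short_as_big, "or", short_as_little)
```

The same bytes give a correct value only when read with the byte order they were written in, which is why a GEIS binary file written on one machine cannot be assumed readable on another.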

STSDAS data files may contain data in any standard Fortran data type, as well as several non-standard types. The allowed data types are shown in Table 3.1.

Table 3.1: Data Types Allowed in STSDAS Files

In practice, HST observers are likely to only encounter short integer and single-precision real data. IRAF and STSDAS programs dynamically determine the data type of an input file, and will convert data into whatever internal representation is needed.
