eng2tab -- Extract information from engineering data and place in tables
eng2tab input output
eng2tab, using a supplied "definition table", extracts bit-packed information from Hubble Space Telescope (HST) observations and places that information into a table.
Observations from all of HST's instruments produce images that contain some sort of engineering data, usually in the Unique Data Log (UDL) and Standard Header Packet (SHP) files. The image data in these files contain bit-packed information. Specifically, individual bits or small numbers of bits contain specific information. However, since these information bits are packed into full words, accessing that information is tedious. eng2tab, through the use of the "definition table", helps to automate the process of retrieving such information.
- The problem is to get the value of a piece of information from a
bit-packed array. This item may be just a single bit, a few bits, a
whole word, or multiple words, in the image data. Thus, to get the
value of an item, some extraction method needs to be defined. The
extract expression, specified for each item, is a function that
transforms the image data into the value for the desired item.
Very often, the value of an item itself encodes information. For example, a bit flag which can have values 0 or 1 may mean that a shutter is open or closed, or a calibration lamp is on or off. To translate these "meaningless" numerical values into something more human-understandable, one can define a format expression. The format expression defines a function to transform an item's value into some other, hopefully more useful, value.
Thus, the whole extraction process can be expressed by the formula:
item's value = format (extract (image data))
The definition file contains the extract and format information for each possible item in a particular image type.
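The two-stage formula above can be sketched in Python as a composition of two functions. This is only an illustration, not how eng2tab is implemented: the helper name item_value, the sample data, and the shutter flag are all hypothetical, and Python lists are indexed from 0.

```python
from typing import Callable, Sequence, Union

Value = Union[int, float, str, bool]


def item_value(data: Sequence[int],
               extract: Callable[[Sequence[int]], int],
               fmt: Callable[[int], Value] = lambda v: v) -> Value:
    """Apply the extract function to the image data, then format the result."""
    return fmt(extract(data))


# Hypothetical item: a shutter flag stored in the lowest bit of the first pixel.
data = [5, 0, 0]
shutter = item_value(data,
                     extract=lambda d: d[0] & 1,
                     fmt=lambda v: ("closed", "open")[v])
print(shutter)
```

Omitting the format function leaves the extracted integer unchanged, which mirrors running the task with format set to "no".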
- DEFINITION FILE
- The expressions to extract and format items are specified in a
"definition file". This file is an IRAF Text DataBase.
The definition file is a plain text file in which each item is defined by a series of lines. The format of each item definition is as follows:

    begin item_name
        extract extract_expression
        format  format_expression
        units   units_description
        descrip general_description
An item begins with a line that has "begin" followed by the name of the item. An item name can be anything, as long as it is no more than 19 characters (SZ_COLNAME from tables) long. The following lines define the separate "fields" of the item definition. They can appear in any order and do not all need to be defined. Each line can be up to 1023 characters long (SZ_COMMAND). The fields are as follows:
- The extract expression is found by a line beginning with the word "extract" followed by an arbitrary expression used to extract the item's value from the parent data.
- The format expression is found by a line beginning with the word "format" followed by an arbitrary expression used to format the item's value.
- The units field, defined by a line beginning with the word "units", is used when writing a BU object out to a table. The value of the units field is placed in the column's "units" descriptor. The length of the units field should be no more than 19 characters (SZ_COLUNITS).
- The descrip field, defined by a line beginning with the word "descrip", simply contains a description of the item. Currently, eng2tab makes no use of this field.
- Expressions are used to define the extract and format functions.
An expression is a simple mathematical formula.
The following Fortran-type arithmetic operators are supported. If the second argument of the exponentiation is not an integer, the result is undefined unless the first argument is positive. Note that integer division truncates.

    +     addition
    -     subtraction
    *     multiplication
    /     division
    -     negation
    **    exponentiation
    //    concatenation
The following logical operators are supported. Logical operators will return a value of 1 if true or 0 if false.
    ||    logical or
    &&    logical and
    =     equality
    !=    inequality
    <     less than
    >     greater than
    <=    less or equal
    >=    greater or equal
    !     not
The following functions are supported. These functions all take a single argument, which may be an expression. The arguments and results of trigonometric functions are in radians.

    abs     absolute value
    acos    arc cosine
    asin    arc sine
    atan    arc tangent
    cos     cosine
    str     convert to string
    exp     E raised to power
    int     convert to integer
    log     natural logarithm
    log10   common logarithm
    nint    nearest integer
    real    convert to real
    sin     sine
    tan     tangent
    sqr     second power
    sqrt    square root
The following functions take two arguments.
    atan2   arc tangent
    max     maximum
    min     minimum
    mod     modulus
    and     bit-wise AND
    shift   bit-wise SHIFT
The function "lin" implements piece-wise linear interpolation. The function looks like:

    lin (v, pairs)

where v is the X value for which to interpolate the Y value, and pairs is a string containing x/y pairs of numbers which define the points to interpolate/extrapolate from.
The following functions implement decision-making concepts:
    x ? true:false
        If 'x' is true, return the 'true' expression.
        If 'x' is false, return the 'false' expression.

    switch (key,val0,val1,...,val14)
        If key is an integer between 0 and 14, return the value of the
        key'th expression. Note that not all arguments need to be defined.

    word_find (key,strval)
        'strval' is a space/comma separated list of words.
        Return the key'th word in the list.
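The behavior of switch and word_find can be sketched in Python as follows. This is a rough model rather than the task's own code. It assumes that keys count from 0 (consistent with the "off,on" example later in this help) and that an out-of-range key is an error; the real task's handling of that case is not documented here.

```python
import re


def word_find(key: int, strval: str) -> str:
    """Return the key'th word (counting from 0) of a space/comma separated list."""
    words = [w for w in re.split(r"[ ,]+", strval.strip()) if w]
    if not 0 <= key < len(words):
        raise IndexError("key outside the word list (assumed to be an error)")
    return words[key]


def switch(key: int, *values):
    """Return the key'th value (counting from 0); at most 15 values are accepted."""
    if len(values) > 15:
        raise ValueError("switch takes at most 15 values (val0..val14)")
    if not 0 <= key < len(values):
        raise IndexError("key outside the supplied values (assumed to be an error)")
    return values[key]


print(word_find(1, "off,on"))
print(switch(2, "low", "medium", "high"))
```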
For the extract expression, there is only one variable allowed:

    d[i]

This variable is the value of the 'i'th element of the image array.
For the format expression, the d[i] variable is available, along with the variable:

    v

The value of v is the value of the item after it has been extracted from the image data.
If the name of an item appears in another item's extract or format expression, the name is treated as a variable whose value is the result of that item's extract expression. Care must be taken not to create recursive or circular definitions; such definitions will cause the item not to be given a value or formatted.
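One way to picture item references and circular definitions is the following Python sketch. It is not the task's algorithm: the table layout (a tuple of referenced names plus a function) and the function evaluate are hypothetical, and None stands in for "no value".

```python
from typing import Callable, Dict, Optional, Set, Tuple

# Hypothetical layout: item name -> (names it refers to, function of their values)
Item = Tuple[Tuple[str, ...], Callable[..., int]]


def evaluate(name: str, items: Dict[str, Item],
             active: Optional[Set[str]] = None) -> Optional[int]:
    """Evaluate an item, returning None if its definition is circular."""
    active = set() if active is None else active
    if name in active:
        return None  # the item refers back to itself: no value
    deps, func = items[name]
    active.add(name)
    try:
        values = [evaluate(dep, items, active) for dep in deps]
    finally:
        active.discard(name)
    if any(value is None for value in values):
        return None
    return func(*values)


items: Dict[str, Item] = {
    "page1": ((), lambda: 3),
    "page2": ((), lambda: 7),
    "syson1": ((), lambda: 1),
    "page": (("syson1", "page1", "page2"), lambda s, a, b: a if s == 1 else b),
    "a": (("b",), lambda b: b),  # circular pair: a -> b -> a
    "b": (("a",), lambda a: a),
}
print(evaluate("page", items))
print(evaluate("a", items))
```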
- ITEM MATCHING
- The parameter items determines which items in the definition database are extracted from the data. The value of items is a comma/space separated list of "regular expressions". A regular expression is a pattern that specifies which item names to match. See the help for the match task for a description of regular expression pattern matching.
- OUTPUT TABLE
- The columns in the output table are the item names as found in the
definition file. The units associated with each column come from the
units field in the definition file. The datatype of each column
depends on whether the format expression has been used. If the
parameter format is "no" or a format expression does not exist for
a particular item, the column will be an integer column.
However, if the format parameter is "yes" and a format field exists for an item, the datatype of the column depends on the result of the format expression.
There is also a column, file, which contains the file name of the source image data.
- PREDEFINED DEFINITION FILES
- As of the current release of STSDAS, the following definition files are
distributed. They can be found in the directory "ctools$eng2tab".
- Definitions for the GHRS UDL file. For HST observations, the image for which this definition file is relevant has the extension ".ulh".
- Definitions for the GHRS extracted data file. For HST observations, the image for which this definition file is relevant has the extension ".x0h".
- Definitions to retrieve various temperature monitors from the Standard
Header Packet (SHP) image for the GHRS. For HST observations, the image
for which this definition file is relevant has the extension ".shh".
For HST observations, the SHP is common among all instruments and contains much more information than is defined in this definition file. The definitions included here relate only to the temperature monitors found in the SHP header keywords under the section "CDBS KEYWORDS" for GHRS observations.
- input [file name template]
- A list of images to extract the information from.
- output [file name]
- Table in which to place the extracted information.
- (definition [file name])
- Name of the text database file containing the definitions of the items to extract from the input images. See the discussion for details on the format.
- (items = "*") [string]
- The items to extract from the input data. See the help for the task match for a discussion of the syntax for pattern matching.
- (defext = "") [string]
- Extension to place onto the input file names, if the input file names do not have a specified extension.
- (format = yes) [boolean]
- If yes, the format expression, found in the definition file, will be used to translate item values before writing to the table. If no, only the integer values of the extracted items will be written.
The following examples demonstrate how to run the task and use its various parameters.
1. Using the GHRS UDL definition file supplied with STSDAS, decode the information in the GHRS observation z0x20107m UDL file.
    cl> eng2tab z0x20107m.ulh ulh.tab \
        definition=ctools$eng2tab/ghrs_uld_def.db
2. Same as example 1, but do not format the table. Note that the table contents are now only integer columns.
    cl> eng2tab z0x20107m.ulh ulh.tab \
        definition=ctools$eng2tab/ghrs_uld_def.db format-
3. To always use the ULH extension on file names, set the defext to "ulh".
    cl> eng2tab z0x20107m ulh.tab definition=ghrs_uld_def.db \
        defext="ulh"
4. Some examples of matching items. Only extract the information for the zconof_det1 and zconof_det2 items from the GHRS UDL definition file.
    cl> eng2tab z0x20107m.ulh ulh.tab definition=ghrs_uld_def.db \
        items="zconof_det1,zconof_det2"
5. Retrieve all items that have to do with BIN ID from a GHRS UDL file.
    cl> eng2tab z0x20107m.ulh ulh.tab definition=ghrs_uld_def.db items="zdefid"
The following examples demonstrate how to construct and use the definition files.
6. A basic entry. This entry retrieves the 6th element out of an integer image. No other formatting or processing is done to it.
    begin basic
        extract d[6]
        format  v
        descr   Just get the 6th pixel out.
        units   pixel units
7. The following entry is from the ghrs_udl_def.db file. The item zfacolim resides in the 8th pixel of the UDL image. Its value can range from 0 to 65535. Since the pixels in a UDL image are only 16 bits (short), any task that reads this pixel may interpret it as a signed value, turning any valid value greater than 32767 into a negative number. To prevent this, use the bit-wise function and to force it into a positive number.

    begin zfacolim
        extract and(d[8],65535)
        descr   Maximum allowable coincidence counts per integration period, 0-65535
        format  v
        units   counts/integration
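The effect of masking with 65535 can be checked with a short Python sketch. The helper name as_unsigned16 is hypothetical, and Python's & operator is used here as a stand-in for the task's and function.

```python
def as_unsigned16(pixel: int) -> int:
    """Mask a possibly sign-extended 16-bit pixel into the range 0..65535."""
    return pixel & 65535


# A stored value of 40000 read back as a signed short appears as -25536.
print(as_unsigned16(-25536))
# Values that were already positive pass through unchanged.
print(as_unsigned16(1234))
```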
8. The following entry is from the ghrs_udl_def.db file. The item zflim is a 25 bit value. Since each pixel in the UDL image is only 16 bits, the zflim value is split over two pixels of the image. The 9 most significant bits reside in pixel 9 while the least significant 16 bits reside in pixel 10. Combine these to get the actual value.
A note about the two 'and's: The first guarantees that the rest of the bits in pixel 9 are zero. The second makes sure that the value of pixel 10 is taken as a positive number.

    begin zflim
        extract and(d[9],511)*65536 + and(d[10], 65535)
        descr   On-the-fly adder limit
        format  v
        units   counts
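The arithmetic of combining the two pixels can be traced in Python. This is only a sketch: the function name combine25 and the sample pixel values are made up for illustration.

```python
def combine25(high_pixel: int, low_pixel: int) -> int:
    """Join the 9 low bits of one pixel with all 16 bits of the next."""
    high = high_pixel & 511      # keep only the 9 bits that belong to the item
    low = low_pixel & 65535      # treat the second pixel as unsigned
    return high * 65536 + low


# Unrelated upper bits in the first pixel are discarded.
print(combine25(0xFE01, 0x0002))
# Largest possible 25-bit value, with both pixels read as signed -1.
print(combine25(-1, -1))
```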
9. The following entry is from the ghrs_udl_def.db file. The item zfconof_det2 is a bit-flag, residing in bit 8 of the first pixel in the UDL. The flag indicates whether detector 2 for the GHRS is 1 (on) or 0 (off). The extract expression simply isolates that bit. The format expression uses the function word_find to translate the value into a string. If the value is 0, word_find will return "off", and if 1, word_find will return "on".
    begin zfconof_det2
        extract and(shift(d[1],-7),1)
        descr   Detector 2 ON/OFF
        format  word_find(v,"off,on")
        units   on/off
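A Python sketch of the same bit isolation follows. It assumes, based on the entry above, that a negative count in shift moves bits to the right and that bits are numbered from 1 at the least significant end; the helper names are hypothetical.

```python
def shift(value: int, count: int) -> int:
    """Shift left for a positive count, right for a negative one (assumed)."""
    return value << count if count >= 0 else value >> -count


def bit_flag(pixel: int, bit: int) -> int:
    """Return bit number 'bit' of the pixel, counting from 1 at the low end."""
    return shift(pixel, -(bit - 1)) & 1


pixel = 0b10000000                # only bit 8 is set
flag = bit_flag(pixel, 8)
print(("off", "on")[flag])        # same mapping as word_find(v,"off,on")
```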
10. The following entry is from the ghrs_udl_def.db file. The item zfix1_present is a bit flag indicating whether an address code is present for address 1. Instead of using word_find as the format expression, a boolean expression is used to create a boolean column in the table, as opposed to the string column that would be created in example 9. If the extracted value is equal to 1, the boolean value in the column will be true or yes.

    begin zfix1_present
        extract and(shift(d,-15),1)
        descr   Address code present for address #1
        format  v==1
        units   boolean
11. Example of a conditional expression. The following entry is from the ghrs_udl_def.db file. The item zfscidmp is a two's complement number. Since the input image may or may not be unsigned, the extract expression forces the value to be unsigned (since the default data type of UDL images is unsigned). The format expression is then used to re-sign the value. This is done by checking how large the unsigned value is. If greater than 32767, subtract 65536, else just use the value.

    begin zfscidmp
        extract and(d,65535)
        descr   Number of substep patterns between data dumps
        format  v>32767?v-65536:v
        units   substep pattern
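The re-signing step can be verified with a small Python sketch. The function names are hypothetical; the logic follows the extract and format expressions of the entry above.

```python
def resign16(v: int) -> int:
    """Interpret an unsigned 16-bit value as a two's complement number."""
    return v - 65536 if v > 32767 else v


def zfscidmp_value(pixel: int) -> int:
    """Extract (force unsigned), then format (re-sign), as in the entry above."""
    return resign16(pixel & 65535)


# The result is the same whether the pixel was read as signed or unsigned.
print(zfscidmp_value(-2), zfscidmp_value(65534))
```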
12. The following entry is from the ghrs_shp_temps_def.db file and demonstrates the use of the lin linear interpolation function. The monitor zpcc1 is a photocathode current monitor. The value of the monitor as returned in the image is a value from 0 to 255. However, the current is actually measured in milliamps. To map the image value to milliamps, linear interpolation is used. The interpolation is based on two pairs of values: when the pixel is 0, the current is 0. If the value is 255, the current is 196.154 milliamps. The entry looks like:
    begin zpcc1
        extract d
        descr   Det. 1 photocathode current
        format  lin(v,"0 0 255 +1.96154e2")
        units   mAmp
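A minimal Python model of piece-wise linear interpolation over a string of x/y pairs is sketched below. It is not the task's implementation: how the real lin function separates numbers, orders points, and extrapolates beyond the end points is assumed here (whitespace or commas, sorted by x, and extension of the nearest segment).

```python
import math


def lin(v: float, pairs: str) -> float:
    """Interpolate y at x = v from a string of x/y pairs (at least two points)."""
    nums = [float(tok) for tok in pairs.replace(",", " ").split()]
    if len(nums) < 4 or len(nums) % 2:
        raise ValueError("need an even count of numbers, at least two x/y pairs")
    points = sorted(zip(nums[0::2], nums[1::2]))
    # Use the first segment whose right end is at or beyond v; if none is,
    # the loop finishes on the last segment, which is then extrapolated.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if v <= x1:
            break
    return y0 + (y1 - y0) * (v - x0) / (x1 - x0)


current = lin(255, "0 0 255 +1.96154e2")
print(math.isclose(current, 196.154))
```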
See entry zpabt1 for an example of linear interpolation using more than two pairs of values.
13. The following entries are from the ghrs_shp_temps_def.db file. This example demonstrates the use of other items in an item's expression and is a demonstration of how well the definition database handles complicated data dependencies and expressions.
A feature of the SHP data, at least for the GHRS, is that a pixel can be multiplexed, i.e. a single pixel will hold different values which mean different things depending on certain conditions. For this example, pixel 357 from the SHP can hold any of five temperature monitor values.
Which temperature is represented at any one time is determined by a formula that involves information from four other pixels within the SHP. The formula is:
pid10 = page - int (12 * page / real (nminor + 1 - 10))
The value of this formula, pid10, determines which temperature resides at pixel 357. So, starting from the bottom, we first find the value of nminor by the record:
    begin nminor
        extract d
        descr   Minor frame number
        format  v
        units   frame number
However, the value of page is not so easily determined. There are two sources of page in the SHP, depending on which GHRS detector is on. The page sources for the individual detectors are found from the entries page1 and page2:

    begin page1
        extract (and(d,15))
        descr   System 1 page I.D.
        format  v
        units   id

    begin page2
        extract (and(d,15))
        descr   System 2 page I.D.
        format  v
        units   id
Which value of page should be used is determined by knowing which detector is on. There are two items which indicate detector status. However, for this case, one only needs to know whether detector 1 is on; it will be assumed that if detector 1 is off, detector 2 is on. The record for the status of detector 1 is:
    begin syson1
        extract and(shift(d,-7),1)
        descr   System 1 main electronics box on
        format  word_find(v,"off,on")
        units   on/off
Now, the value of page can be defined relative to these three definitions:
    begin page
        extract syson1=1?page1:page2
        descr   Page number of the current SHP dump
        format  v
        units   page
Note the use of the above items in the extract field of page. Also note that the page entry itself does not represent any direct data from the SHP image; it is more of a function definition than an "item" definition in the SHP datastream.
To carry the idea further, one can now define an item which embodies the function:
    begin pid10
        extract page - int (12 * page / real (nminor + 1 - 10))
        descr   Page id calculations (n=10)
        format  v
        units   page id
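The chain from syson1 through page to pid10 can be followed in Python. This is a sketch with made-up input values; the function names are hypothetical, and Python's int is assumed to truncate the same way as the task's int function. Note that nminor equal to 9 would divide by zero in this sketch; how the task treats that case is not stated here.

```python
def select_page(syson1: int, page1: int, page2: int) -> int:
    """Mirror of the page entry: use page1 when detector 1 is on, else page2."""
    return page1 if syson1 == 1 else page2


def pid10(page: int, nminor: int) -> int:
    """Mirror of the pid10 entry; int() truncates toward zero."""
    return page - int(12 * page / float(nminor + 1 - 10))


page = select_page(syson1=1, page1=3, page2=7)   # hypothetical values
print(pid10(page, nminor=45))
```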
Finally, one can now extract a temperature monitor from pixel 357 depending on the value of the pid10 item, for example, the temperature monitor zsct1:
    begin zsct1
        extract d[357]
        descr   Det. 1 spectral calibration lamp temperature
        format  pid10!=0?INDEF:-5.51991e1 + v*(1.46379 + v*(-2.31820e-2 + v*(2.45424e-4 + v*(-1.41260e-6 + v*(4.13073e-9 + v*(-4.74151e-12))))))
        units   degrees Celsius
If the value of pid10 is 0, then the value of item zsct1 is a polynomial expression based on the extracted value from pixel 357. If it is not 0, then the value is declared undefined, INDEF.
For further comparison, examine entries ztst11, zobt11, zet11, and zet31.
Note that this entry could have also been written as:
    begin zsct1
        extract pid10!=0?INDEF:d[357]
        descr   Det. 1 spectral calibration lamp temperature
        format  -5.51991e1 + v*(1.46379 + v*(-2.31820e-2 + v*(2.45424e-4 + v*(-1.41260e-6 + v*(4.13073e-9 + v*(-4.74151e-12))))))
        units   degrees Celsius
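The nested polynomial in the zsct1 entry is a Horner-form evaluation, which the following Python sketch reproduces. The coefficients are copied from the entry; the function name zsct1 and the use of None to stand in for INDEF are choices made for this illustration only.

```python
import math
from typing import Optional

# Coefficients from the zsct1 entry, lowest order first.
COEFFS = (-5.51991e1, 1.46379, -2.31820e-2, 2.45424e-4,
          -1.41260e-6, 4.13073e-9, -4.74151e-12)


def zsct1(v: int, pid10: int) -> Optional[float]:
    """Return the temperature in degrees Celsius, or None (INDEF) if pid10 != 0."""
    if pid10 != 0:
        return None
    acc = 0.0
    for coeff in reversed(COEFFS):   # Horner's rule, highest order first
        acc = acc * v + coeff
    return acc


# The nested form agrees with a direct sum of powers.
direct = sum(c * 100 ** k for k, c in enumerate(COEFFS))
print(math.isclose(zsct1(100, 0), direct))
print(zsct1(100, 1))
```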
STSDAS Contact: Jonathan Eisenhamer
For information about the GHRS and tasks related to it, explore the STSDAS package hst_calib.hrs.