tstat -- Get statistics for a table column.
tstat intable column
This task gets the mean, standard deviation, median, minimum and maximum values for a table column. The output will be written to cl parameters and may also be written either to the standard output (STDOUT) or to a table. When more than one table is specified as intable, the statistics are determined for each table separately, not cumulatively. The values in the cl parameters therefore refer to the last table in the list.
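The per-table behavior can be pictured with a short Python sketch. This is not tstat code; the function name, the table names, and the data are hypothetical, and only the mean is shown. It illustrates that each table gets its own result and that the value left behind afterwards (here the second return value, standing in for the cl parameters) is the one from the last table.

```python
from statistics import mean

def per_table_means(tables):
    """Compute the mean of each table's column separately (never pooled).

    `tables` is a list of (name, values) pairs; both are made up for
    this illustration. Returns the per-table results and the value for
    the last table, which mimics what is left in the cl parameters.
    """
    results = [(name, mean(values)) for name, values in tables]
    last = results[-1][1] if results else None
    return results, last

tables = [("a.tab", [1.0, 2.0, 3.0]), ("b.tab", [10.0, 20.0])]
results, last = per_table_means(tables)
print(results)  # [('a.tab', 2.0), ('b.tab', 15.0)]
print(last)     # 15.0
```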
If an input table contains only one column (either in fact or due to the use of a column selector with the table name), then the column parameter is ignored, and statistics are computed for that one column. If intable includes more than one table, the column parameter may be required for some tables (those with more than one column) but not for others.
The range of rows to use for statistics may be restricted either by the rows parameter or by use of a row selector with the table name. Both may be used, in which case rows is interpreted to mean selected row numbers, rather than rows in the underlying table. That is, the row selector with the table name is applied first, then the rows parameter is used to further restrict the rows.
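The order of the two restrictions can be sketched in Python. This is only an illustration, not the task's implementation; the function name, the sample data, and the choice to ignore row numbers beyond the selected set are assumptions. The row selector is applied first, and the rows values are then treated as 1-based positions within the selected rows.

```python
def select_rows(table_rows, selector, rows):
    """Apply a row selector, then pick 1-based row numbers among the
    rows that survived the selector (not rows of the underlying table).

    Row numbers outside the selected set are silently skipped here;
    that choice is an assumption of this sketch.
    """
    selected = [r for r in table_rows if selector(r)]
    return [selected[i - 1] for i in rows if 1 <= i <= len(selected)]

# Hypothetical table of (v, flux) pairs.
table = [(10, 1.0), (13, 2.0), (11, 3.0), (12, 4.0), (14, 5.0)]

# The selector keeps v <= 12, leaving three rows; rows 2 and 3 of
# those three are then used.
picked = select_rows(table, lambda r: r[0] <= 12, rows=[2, 3])
print(picked)  # [(11, 3.0), (12, 4.0)]
```

Note that rows 2 and 3 of the underlying table would have been (13, 2.0) and (11, 3.0), which is not what is selected.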
For a column that contains arrays, this task reads all elements of all selected rows and computes statistics on all those elements together. Typical usage for array columns would be to specify just one row, but any number of rows may be included, limited only by memory.
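As a rough Python illustration of the array case (the function name and data are hypothetical, and this is not how the task is coded), the elements of every selected row are pooled into one list before any statistic is computed:

```python
from statistics import mean, median

def array_column_values(rows):
    """Flatten the array stored in each selected row into one list."""
    return [x for row in rows for x in row]

# Two selected rows, each holding a three-element array.
rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
values = array_column_values(rows)

# Statistics are computed over all six elements together.
print(len(values), mean(values), median(values), min(values), max(values))
# 6 3.5 3.5 1.0 6.0
```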
Lower and upper limits may be set using the parameters lowlim and highlim, so that table values outside that range are not used when computing the statistics. Either limit may be set on its own. If no values fall within the specified limits and within the rows given by the rows parameter, the mean and the other statistics will be printed as INDEF.
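The effect of the limits can be sketched in Python. This is an illustration under stated assumptions, not the task's code: the function name is hypothetical, None stands in for both an unset limit and an INDEF result, and values exactly equal to a limit are assumed to be kept.

```python
from statistics import mean

def limited_mean(values, lowlim=None, highlim=None):
    """Return (count, mean) of the values inside the given limits.

    A limit of None means "not set". Values equal to a limit are kept,
    which is an assumption of this sketch. When nothing is left, the
    mean is None, standing in for INDEF.
    """
    kept = [v for v in values
            if (lowlim is None or v >= lowlim)
            and (highlim is None or v <= highlim)]
    return len(kept), (mean(kept) if kept else None)

print(limited_mean([1.0, 5.0, 9.0], lowlim=2.0))   # (2, 7.0)
print(limited_mean([1.0, 5.0, 9.0], lowlim=10.0))  # (0, None)
```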
For some tables, one can get statistics on the data in a row by using tdump and piping the output to tstat. See the examples for more information.
1. Get statistics on column "flux" in all tables, putting the output (assuming outtable="STDOUT") in the ASCII file flux.lis:
tt> tstat *.tab flux > flux.lis
2. In order to get statistics on the data in a row rather than a column, you can use tdump for one row and specify pwidth to be so small that each value will be printed on a separate line. The output of tdump will then be a one-column table containing the row from the input table, and tstat can be run on that one-column table. Since the input is redirected, we don't specify the table name. Note also that in this case the input contains only one column, so we don't specify the column name either. In this example, we get statistics on row 17 of "bs.fits":
tt> tdump bs.fits cdfile="" pfile="" \
>>> row=17 pwidth=15 | tstat
3. When the input is redirected and has multiple columns, the command-line argument should be the column name to use, not the table name. The table name in this case will internally be set to "STDIN".
tt> dir l+ | tstat c3
4. The statistics on column "flux" in hr465.tab are put in parameters tstat.nrows, tstat.mean, etc., and are not written to STDOUT or to a table. We only include rows for which column V is no larger than 12.
tt> tstat "hr465.tab[r:v=:12][c:flux]" outtable=""
5. The output statistics are written to a table. The default column name for the mean value is overridden:
tt> tstat hr465.tab flux outtable=hr465s.tab n_mean="mean_flux"
6. Get statistics on column "flux" in table hr465.tab, but only for rows 17 through 116, row 271, and row 952:
tt> tstat hr465.tab[c:flux] outtable="STDOUT" row="17-116,271,952"
This task was written by Phil Hodge.
See also: thistogram, ranges
Type "help tables opt=sys" for a higher-level description of the tables package.