| fitprofs | noao.onedspec | fitprofs |
fitprofs -- Fit 1D profiles to features in image vectors
fitprofs input
The following are the fitting parameters.
The following parameters are used for error estimates as described below in the ERROR ESTIMATES section.
sigma**2 = sigma0**2 + invgain * I
where I is the pixel value and "**2" means the square of the quantity. If either parameter is specified as INDEF or with a value less than zero then no sigma estimates are made and so no error estimates for the measured parameters is made.
The following parameters determine the output of the task.
Fitprofs fits one dimensional profile functions to image vectors and outputs the fitting parameters, plots, and model or residual image vectors. This is done noninteractively using a file of initial profile positions and widths. Interactive profile fitting may be done with the deblending option of splot or stsdas.fitting.ngaussfit .
The input consists of images in a variety of formats. These include all the spectral formats as well as standard images. For two dimensional images (or the first 2D plane of higher dimensional images) either the lines or columns may be fit with possible summing of adjacent lines or columns to increase the signal-to-noise. A subset of the image apertures, lines, or columns may be specified or all image vectors may be fit.
The fitting parameters consist of a fitting region, a list of initial positions, peaks, and widths, initial background endpoints, the fitting function, and the parameters to be fit or constrained. The coordinates and units used for the positions and widths are those defined by the standard linear coordinate header parameters. For dispersion corrected spectra these are generally wavelengths in Angstroms and otherwise they are generally pixels. A fitting region must be specified by a pair of numbers.
The background parameter may be left empty to select the pixel values at the endpoints of the fitting region for defining the initial linear background. Or values at the endpoints of the fitting region may be given explicitly in pixel space order (i.e. the first value is for the edge of the fitting region which has smaller pixel coordinate0 Values can also be computed from the data using the functions "avg(w1,w2)" or "med(w1,w2)" where w1 and w2 are dispersion coordinates. The pixels in the specified range are average or medianed and the dispersion point for the linear background is the average of the dispersion coordinates of the pixels.
The position list file consists of one or more columns. The format of this file has one or more columns. The columns are the wavelength, the peak value (relative to the continuum with negative values being absorption), the profile type (gaussian, lorentzian, or voigt), and the gaussian and/or lorentzian FWHM. End columns may be missing or INDEF values may be specified to use the default parameter values (the profile and widths) or determine the peak from the data. Below are examples of the file line formats
wavelength wavelength peak wavelength peak (gaussian|lorenzian|voigt) wavelength peak gaussian gfwhm wavelength peak lorentzian lfwhm wavelength peak voigt gfwhm wavelength peak voigt gfwhm lfwhm 1234.5 <- Wavelength only 1234.5 -100 <- Wavelength and peak 1234.5 INDEF v <- Wavelength and profile type 1234.5 INDEF g 12 <- Wavelength and gaussian FWHM
where peak is the peak value, gfwhm is the gaussian FWHM, and lfwhm is the lorentzian FWHM. This format is the same as used by splot and also by artdata.mk1dspec (except in the latter case the peak is normalized to a continuum of 1).
The profile parameters fit are the central position, the peak amplitude, and the profile widths. The fitting may be constrained in number of ways. The linear background may be fixed or simultaneously fit with the profiles. The profile positions may be fixed, the relative separations fixed but a single zero point shift fit, or all positions may be fit simultaneously. The profile widths may also be fixed, the relative ratios of the widths fixed while fitting a single scale factor, or all widths fit simultaneously. The profile amplitudes are always fit.
The fitting technique uses a nonlinear iterative Levenberg-Marquardt algorithm to reduce the Chi-square of the fit. The execution time increases rapidly with the number of profiles fit so there is an effective limit to the number of profiles that can be fit at once.
The output includes a number of formats. The fitted parameters are recorded in a logfile (if specified) and printed on the standard output (if the verbose flag is set). This output includes the date, image vector, fitting parameters used, and a table of fitted or derived quantities. The parameters included some quantities relevant to spectral lines but others apply to any image data. The quantities are the profile center, the background or continuum at the center of the profile, the integral or flux of the profile (which is negative for profiles below the background), the equivalent width, the profile peak amplitude or core value, and the profile full width at half maximum. Pure gaussian and lorentzian profiles will have one of the widths set to zero while voigt profiles will have both values.
Summary plots are recored in a plotfile (if specified). The plots show the data with the total fit, individual profiles, and residuals overplotted. The plotfile may be examined and printed using the task gkimosaic as well as other tasks which interpret GKI metacode.
The final output consists of images in the same format as the input. The images may be of the total fit (sum of profiles without background) or of the difference (residuals) of the data minus the model.
Error estimates may be computed for the fitted parameters. This requires a model for the pixel sigmas. Currently this model is based on a Poisson statistics model of the data. The model parameters are a constant Gaussian sigma and an "inverse gain" as specified by the parameters sigma0 and invgain . These parameters are used to compute the pixel value sigma from the following formula:
sigma**2 = sigma0**2 + invgain * I
where I is the pixel value and "**2" means the square of the quantity.
If either the constant sigma or the inverse gain are specified as INDEF or with values less than zero then no noise model is applied and no error estimates are computed. Also if the number of error samples is less than 10 then no error estimates are computed. Note that for processed spectra this noise model will not generally be the same as the detector readout noise and gain. These parameters would need to be estimated in some way using the statistics of the spectrum. The use of an inverse gain rather than a direct gain was choosed to allow a value of zero for this parameters. This provides a model with constant uncertainties.
The error estimates are computed by Monte-Carlo simulation. The model is fit to the data (using the noise sigmas) and this model is used to describe the noise-free spectrum. A number of simulations, given by the nerrsample , are created in which random Gaussian noise is added to the noise-free spectrum based on the pixel sigmas from the noise model. The model fitting is done for each simulation and the absolute deviation of each fitted parameter to model parameter is recorded. The error estimate for the each parameter is then the absolute deviation containing 68.3% of the parameter estimates. This corresponds to one sigma if the distribution of parameter estimates is Gaussian though this method does not assume this.
The Monte-Carlo technique automatically includes all effects of parameter correlations and does not depend on any approximations. However the computation of the errors does take a significant amount of time. The amount of time and the accuracy of the error estimates depend on how many simulations are done. A small number of samples (of order 10) is fast but gives crude estimates. A large number (greater than 100) is slow but gives very good estimates. A compromise value of 50 is recommended for many applications.
1. The following example creates an artificial spectrum and fits it. It requires the artdata and proto packages be loaded.
cl> mk1dspec test slope=1 temp=0 lines=testlines nl=20
cl> mknoise test rdnoise=10 poisson=yes
cl> fields testlines fields=1,3 > fitlines
cl> fitprofs test reg="4000 8000" pos=fitlines
# Jul 27 17:49 test - Ap 1:
# Nfit=20, background=YES, positions=all, gfwhm=all, lfwhm=all
# center cont flux eqw core gfwhm lfwhm
6832.611 1363.188 -13461.8 9.875 -408.339 30.97 0.
7963.674 1507.641 -8193.58 5.435 -395.207 19.48 0.
5688.055 1217.01 -7075.11 5.814 -392.006 16.96 0.
6831.3 1363.02 -7102.01 5.21 -456.463 14.62 0.
7217.335 1412.323 -10110. 7.158 -427.797 22.2 0.
6709.286 1347.437 -4985.06 3.7 -225.346 20.78 0.
6434.317 1312.319 -7121.03 5.426 -342.849 19.51 0.
6130.415 1273.506 -6164. 4.84 -224.146 25.83 0.
4569.375 1074.138 -3904.6 3.635 -183.963 19.94 0.
5656.645 1212.999 -8202.81 6.762 -303.617 25.38 0.
4219.53 1029.458 -5161.64 5.014 -241.135 20.11 0.
4551.424 1071.845 -3802.61 3.548 -139.39 25.63 0.
4604.649 1078.643 -5539.15 5.135 -264.654 19.66 0.
6966.557 1380.294 -11717.5 8.489 -600.581 18.33 0.
4259.019 1034.501 -4280.38 4.138 -213.446 18.84 0.
5952.958 1250.843 -8006.98 6.401 -318.313 23.63 0.
4531.89 1069.351 -712.598 0.6664 -155.197 4.313 0.
7814.418 1488.579 -2926.49 1.966 -164.891 16.67 0.
5310.929 1168.846 -10132.2 8.669 -487.502 19.53 0.
5022.948 1132.066 -7532.8 6.654 -325.594 21.73 0.
2. Suppose there is no obvious continuum level near the fitting region but you want to specify a flat continuum level as the average of pixels in a specified wavelength region. The background region would be specified as
background = "avg(4250,4425.3) avg(4250,4425.3)"
Note that the value must be given twice to get a flat continuum.
The following CPU times were obtained with a Sun Sparcstation I. The number of pixels in the fitting region and the number of lines fit were varied. The worst case of fitting all parameters and a background was considered as well as the constrained case of fitting line positions and a single width with fixed background.
Npixels Nprofs Fitbkg Fitpos Fitsig CPU(sec) 100 5 yes all all 1.9 100 10 yes all all 3.3 100 15 yes all all 5.6 100 20 yes all all 9.0 512 5 yes all all 4.7 512 10 yes all all 10.0 512 15 yes all all 17.6 512 20 yes all all 27.8 1000 5 yes all all 8.0 1000 10 yes all all 18.0 1000 15 yes all all 31.8 1000 20 yes all all 50.2 1000 25 yes all all 72.8 1000 30 yes all all 100.2 512 5 no all single 2.8 512 10 no all single 5.3 512 15 no all single 8.6 512 20 no all single 12.8
Crudely this implies CPU time goes as the 1.4 power of the number of profiles and the 0.75 power of the number of pixels.
splot, stsdas.fitting.ngaussfit