gtlike

Synopsis:

Performs unbinned and binned likelihood analysis of the LAT data.

Usage: gtlike irfs expcube srcmdl statistic optimizer evfile
scfile expmap cmap bexpmap

The application of the likelihood method to photon-counting experiments was described by Cash (Cash W. 1979, ApJ 228, 939). The maximum likelihood method was applied to the analysis of EGRET data for parameter estimation (Mattox J. R. et al. 1996, ApJ 461, 396), and it will be applied to the analysis of GLAST data as well.

The likelihood statistic L is the probability of obtaining the observational data given an input model. In our case, the input model is the distribution of gamma-ray sources on the sky, including their intensities and spectra. We use this likelihood to find the best-fit model parameters, which include the description of a source's spectrum, its position, and its intensity.

During its lifetime the GLAST LAT will detect hundreds of millions of photons (counts), but most analyses will be concerned with a subset of only a few hundred or a few thousand. Such data are too sparse to allow the use of the chi-square statistic, so a full Poisson likelihood is needed for model parameter estimation.
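
For reference, the unbinned log-likelihood that gtlike maximizes has the standard form for Poisson-distributed event data (the notation here is schematic; see the Cicerone Manual for the full derivation):

    log L = sum_i log M(E_i, x_i) - Npred

where the sum runs over the individual events, M is the model rate evaluated at the energy and direction of event i, and Npred is the total number of counts the model predicts within the ROI and observation period.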

For a small number of counts the unbinned likelihood can be calculated rapidly, but as the number of counts increases the time to calculate the likelihood becomes prohibitive, and the binned likelihood must be used.

For an overview of the mathematical and statistical rationale for applying likelihood analysis to GLAST data, it is highly recommended to read the Cicerone Manual.

The next section gives the definitions of the source region and the region of interest. After that, the generation of the source model for the likelihood analysis is explained. The optimizers available for fitting are described in the MODEL FITTING section. Finally, the steps for performing unbinned and binned likelihood analyses are outlined.

SOURCE REGION AND REGION OF INTEREST

Due to the large LAT point spread function at low energies (e.g., 68% of the counts will be within 3.5 degrees at 100 MeV; see GLAST LAT Performance for a review of LAT performance), to analyze a single source the counts within a region around the source have to be included. We call that region the "region of interest" (ROI). The ROI is selected from the original event file using the gtselect tool (see the gtselect documentation). The ROI should be several times the characteristic PSF size in order to satisfy the restrictions of the Likelihood package. Nearby sources will contribute counts to that region, so they have to be included in the model as well. The region that includes those sources is called the "Source Region". All of these sources must be in the source model file that is input to gtlike (see THE SOURCE MODEL section in this help). The Source Region is centered on the ROI, with a radius that is larger than the ROI radius by several PSF length scales. For example, when fitting a single point source, an ROI with a radius of 10 degrees and a Source Region radius of 20 degrees would be appropriate. Since the size of the LAT PSF goes roughly as (PSF_100MeV) x (E/100)^{-0.8} (with E in MeV), if you are considering only higher energy photons (e.g., > 1 GeV) smaller ROI and Source Region radii of just a few degrees may be used.
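
For example, a 10-degree ROI centered near 3C279 could be extracted from an event file roughly as follows (file names and cut values are illustrative; other prompted cuts, such as the time range, are omitted here):

gtselect infile=events.fits outfile=filtered.fits ra=193.98
dec=-5.82 rad=10 emin=100 emax=300000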

THE SOURCE MODEL

The gtlike tool reads the source model from an XML file. The SAE has a GUI tool, called "ModelEditor", to create this file. With that tool you can create the XML source model files without any knowledge of XML.

For information about the types of sources you can create, the spatial models, and the spectral functions, we recommend reading the ModelEditor documentation and/or the Cicerone Manual, Model Selection.
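
For orientation, a minimal source model file for a single point source with a power-law spectrum looks roughly as follows (the source name, coordinates, and parameter values are illustrative; a real model for the examples below would also include the Galactic and extragalactic diffuse components):

<source_library title="source library">
  <source name="3C279" type="PointSource">
    <spectrum type="PowerLaw">
      <!-- dN/dE = Prefactor * scale * (E/Scale)^Index, energies in MeV -->
      <parameter name="Prefactor" value="10" scale="1e-09" min="0.001" max="1000" free="1"/>
      <parameter name="Index" value="-2.1" scale="1" min="-5" max="-1" free="1"/>
      <parameter name="Scale" value="100" scale="1" min="30" max="2000" free="0"/>
    </spectrum>
    <spatialModel type="SkyDirFunction">
      <parameter name="RA" value="193.98" scale="1" min="0" max="360" free="0"/>
      <parameter name="DEC" value="-5.82" scale="1" min="-90" max="90" free="0"/>
    </spatialModel>
  </source>
</source_library>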

MODEL FITTING

In the GLAST Science Tools there are five optimizers available to maximize the log-likelihood function: DRMNGB, DRMNFB, NEWMINUIT, MINUIT and LBFGS.

The optimizers find the best-fit spectral parameters, but not the location (the fitting tool does not fit the source coordinates). However, a tool is provided that performs a grid search, mapping out the maximum likelihood value over a grid of locations: gtfindsrc (see the gtfindsrc help for details).

DRMNGB finds a local minimum of a continuously differentiable function subject to simple upper and lower bound constraints. It uses a variant of Newton's method with a quasi-Newton Hessian updating method and a model/trust-region technique to aid convergence from poor starting values. The original code, obtained from Netlib, is in Fortran, but it was converted to C++. It has some convergence problems.

DRMNFB uses many of the same subroutines as DRMNGB, but handles the derivative information differently and seems not to suffer from some of the convergence problems encountered with DRMNGB.

MINUIT is a well-known package from CERN. Originally written in Fortran, it has also been translated to C++. In the GLAST Science Tools only a few of MINUIT's capabilities are used: all variables are treated as bounded, no user interaction is allowed, and only the MIGRAD algorithm is implemented. For more information about MINUIT see the MINUIT Function Minimization and Error Analysis Reference Manual.

NEWMINUIT is another interface to the C++ version of MINUIT. It uses only a few of MINUIT's features: the MIGRAD and HESSE algorithms. All variables are treated as bounded, no user interaction is allowed, and there is no limit on the number of free parameters.

LBFGS was also originally obtained from Netlib. The original code is in Fortran but, like the others, it was translated to C++. The "L" in the name stands for "limited memory", which means that the full approximate Hessian is not available.

Generally speaking, the fastest way to estimate parameters in the GLAST Science Tools likelihood analysis is to use DRMNGB (or DRMNFB) to find initial values and then use MINUIT (or NEWMINUIT) to obtain more accurate results.
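
For instance, assuming your version of gtlike provides a hidden sfile parameter for writing the fitted model to an XML file (check with plist gtlike), a two-pass unbinned fit could look like the following, with the first pass's output model used as the starting point for the second (file names are illustrative):

gtlike statistic=UNBINNED optimizer=DRMNFB evfile=filtered.fits
scfile=FT2.fits expmap=expMap.fits expcube=expCube.fits
srcmdl=src_model.xml irfs=HANDOFF sfile=fit_pass1.xml

gtlike statistic=UNBINNED optimizer=NEWMINUIT evfile=filtered.fits
scfile=FT2.fits expmap=expMap.fits expcube=expCube.fits
srcmdl=fit_pass1.xml irfs=HANDOFF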

PROCEDURE FOR LIKELIHOOD ANALYSIS

There are two possible ways to perform a likelihood analysis of the LAT data: unbinned and binned. A summary of each case is given in the following subsections; further explanation can be obtained from the workbook.

1) UNBINNED LIKELIHOOD ANALYSIS

An event data file and the spacecraft data file are needed to perform a likelihood analysis; for real data both can be obtained from the GSSC website. For simulated data the spacecraft data file can be produced by gtorbsim (see the gtorbsim help). You can also perform a likelihood analysis on simulated LAT data (see the gtobssim help for how to generate simulated data).

To perform an unbinned likelihood analysis, several GLAST Science Tools have to be run before gtlike. The following paragraphs describe the steps needed to obtain the significance of a given detection:

a) The first step is to select the ROI (see the SOURCE REGION AND REGION OF INTEREST section in this document) using gtselect (see the gtselect help). You may also need to use gtmktime to select the Good Time Intervals (GTIs), i.e., the time ranges when the data can be considered valid (see the gtmktime help for details).
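
A sketch of the gtmktime step (the filter expression depends on the data release and your quality requirements; the one below, which only removes passages through the South Atlantic Anomaly, is illustrative):

gtmktime scfile=FT2.fits filter="IN_SAA!=T" roicut=yes
evfile=filtered.fits outfile=filtered_gti.fits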

b) Create the exposure map needed by gtlike for the unbinned likelihood analysis.

This requires two steps:

1) Create the livetime cube. (See the gtltcube help for more information about how to run this tool.)

2) Create the exposure map: the exposure map needed for the unbinned likelihood analysis can be generated using the gtexpmap tool (see the gtexpmap help). The event file input to gtexpmap is in general a filtered version of an original event file obtained from the GSSC website or produced by Monte Carlo simulations; the original event file can be filtered using, for example, the gtselect tool (see the gtselect help). Both steps are sketched below.
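
A minimal sketch of the two steps, with illustrative file names and binning values:

gtltcube evfile=filtered_gti.fits scfile=FT2.fits
outfile=expCube.fits dcostheta=0.025 binsz=1

gtexpmap evfile=filtered_gti.fits scfile=FT2.fits
expcube=expCube.fits outfile=expMap.fits irfs=HANDOFF srcrad=20
nlong=120 nlat=120 nenergies=20

Here the Source Region radius (srcrad, in degrees) is taken to be twice the 10-degree ROI radius, following the guideline in the SOURCE REGION AND REGION OF INTEREST section.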

c) Run gtdiffrsp to compute the diffuse response for each event. (See the gtdiffrsp help for more information about how to run this tool.)
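
For example (file names are illustrative; the source model must contain the diffuse components):

gtdiffrsp evfile=filtered_gti.fits scfile=FT2.fits
srcmdl=src_model.xml irfs=HANDOFF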

d) Run gtlike. gtlike will give you the TS value for each source in the source model, as well as the fitted values of the parameters in the source model file (see THE SOURCE MODEL section above). The source model file should be provided as input to gtlike. Other inputs are the exposure cube and the exposure map created in the previous steps. You can also choose among the different optimizers for the fitting process (see the MODEL FITTING section in this help).

2) BINNED LIKELIHOOD ANALYSIS

To perform a binned likelihood analysis, the event data file and the spacecraft data file are needed. Both files can be obtained from the GSSC website. There are several steps to perform a binned likelihood analysis; the full description of all of them is in the workbook. A summary is given below:

a) Create a 3D Counts Map. The first step in a binned likelihood analysis is to create a counts map of the ROI (see the SOURCE REGION AND REGION OF INTEREST section in this document). In the binned analysis the ROI is defined by the boundaries of the counts map, so it is not appropriate to apply acceptance cone cuts (see the gtselect help) on the event data.
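
The gtbin tool can produce such a counts cube. A sketch with illustrative geometry (a 140 x 140 pixel map at 0.2 degrees per pixel with 20 logarithmic energy bins; the input event file should have been filtered without an acceptance cone cut, as noted above):

gtbin evfile=filtered.fits scfile=FT2.fits outfile=ccube.fits
algorithm=CCUBE nxpix=140 nypix=140 binsz=0.2 coordsys=CEL
xref=193.98 yref=-5.82 axisrot=0 proj=AIT ebinalg=LOG emin=100
emax=300000 enumbins=20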

b) Identify Candidate Sources and Define the Model. The second step for the binned likelihood analysis is to create the source model file. See THE SOURCE MODEL section in this document.

c) Compute Livetimes. To speed up the exposure calculations performed by Likelihood, it is helpful to pre-compute the livetime as a function of sky position and off-axis angle. The gtltcube application does this for the whole sky (see the gtltcube help for details).

d) Compute Source Maps. These are model counts maps for each source, i.e., the source model multiplied by the exposure and convolved with the effective PSF. The gtsrcmaps application performs this task. (See the gtsrcmaps help.)
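
A sketch of a gtsrcmaps run (file names are illustrative; hidden parameters are left at their defaults):

gtsrcmaps scfile=FT2.fits expcube=expCube.fits cmap=ccube.fits
srcmdl=src_model.xml outfile=srcMaps.fits irfs=HANDOFF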

e) Run gtlike. The BINNED option of the "statistic" parameter has to be chosen, and the optimizer has to be chosen as well (see the MODEL FITTING section in this help). As output, gtlike will give you the TS and the predicted number of counts, along with the fitted spectral parameters (the spectral index, for example, in the case of a power-law spectrum) for each source in your source model file.

f) Create a Model Map. For comparison to the counts map data, you can create a model map of the region based on the fit parameters. This map is essentially an infinite-statistics counts map of the ROI based on your model fit. The gtmodel application reads in the fitted model, applies the proper scaling to the source maps, and adds them together to get the final map. (See the gtmodel help.)
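
For example (fitted_model.xml is a hypothetical name for the fitted model XML written after the gtlike run; additional hidden parameters may be prompted depending on the version):

gtmodel srcmaps=srcMaps.fits srcmdl=fitted_model.xml
outfile=model_map.fits irfs=HANDOFF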

Examples:

Parameters are passed following the FTOOLS model (i.e., they can be entered interactively in response to prompts, as a list on the command line, or by editing the parameter file).

To run gtlike, enter (at the command line): gtlike

You will be prompted for parameter values.

Note: "Hidden" parameters are not prompted. If you want to override one of the "hidden" parameters, specify the values in the command line .For example, if you want to change the relative fit tolerance, enter (at the command line): gtlike ftol=1e-6

gtlike has specific parameters for each of the statistics selected; for example, the cmap parameter is only used if you choose the binned likelihood.

Example 1: unbinned likelihood analysis

An example of how to run the tool for unbinned likelihood analysis is given below:

> gtlike
Statistic to use <BINNED|UNBINNED> [UNBINNED] :
Spacecraft file [FT2.fits] :
Event file [_3C279_3C273_back_filtered.fits] :
Unbinned exposure map [expMap_120_120.fits] : expMap.fits
Exposure hypercube file [expCube_3C279_3C273_back.fits] :
Source model file [src_model.xml] :
Response functions to use [HANDOFF] :
Optimizer <DRMNFB|NEWMINUIT|MINUIT|DRMNGB|LBFGS> [MINUIT] :

In the example above the event file is called _3C279_3C273_back_filtered.fits. It contains simulated data for 3C279, 3C273, and the Galactic and extragalactic backgrounds. It was generated using gtobssim, an SAE tool to generate simulated LAT data (see the gtobssim help). The source model file, src_model.xml, contains the spectral and spatial information for 3C279, 3C273, and the Galactic and extragalactic backgrounds. The exposure cube and exposure map were produced previously with gtltcube and gtexpmap, respectively. MINUIT was chosen as the optimizer, FT2.fits was selected as the spacecraft data file, and the HANDOFF response functions were selected. The results of the likelihood analysis are printed to standard output and also saved in an ASCII file (results.dat) and in a FITS file with information on the number of counts and fluxes in each energy bin for each source.

The above example can also be run from the command line as follows:

gtlike irfs=HANDOFF expcube=expCube_3C279_3C273_back.fits
srcmdl=src_model.xml statistic=UNBINNED optimizer=MINUIT
evfile=_3C279_3C273_back_filtered.fits scfile=FT2.fits
expmap=expMap.fits

Example 2: Binned likelihood analysis

An example of how to run the tool for a binned likelihood analysis is given below:

> gtlike
Statistic to use <BINNED|UNBINNED> [BINNED] :
Counts map file [srcMaps_v1.fits] : srcMaps.fits
Binned exposure map [none] : binned_exposure.fits
Exposure hypercube file [expCube_3C279_3C273_back.fits] :
Source model file [src_model.xml] :
Response functions to use [HANDOFF] :
Optimizer <DRMNFB|NEWMINUIT|MINUIT|DRMNGB|LBFGS> [DRMNFB] : MINUIT

In this example the livetime cube was previously generated by gtltcube (see the gtltcube help), the source maps were generated by gtsrcmaps (see the gtsrcmaps documentation), MINUIT was chosen as the optimizer, and the HANDOFF response functions were selected.

The above example can also be run from the command line as follows:

gtlike irfs=HANDOFF expcube=expCube_3C279_3C273_back.fits
srcmdl=src_model.xml statistic=BINNED optimizer=MINUIT
cmap=srcMaps.fits bexpmap=none