APM example - Handout from 10/18/2000 course
The following is a file prepared by Richa Agarwala describing a
simple use of APM on a real data set. I have removed references
to the data set to avoid revealing any research information.
This describes one usage. Other usages are possible.
APM comes in two flavors:
- apm -- two-locus disease vs. marker analysis (analog of MLINK)
- apmmult -- multilocus analysis (analog of LINKMAP)
In each case, the main program must be followed by a run of the
corresponding simulation program (sim for apm, simmult for apmmult)
to obtain empirical p-values.
Running APM
Required input files:
- LINKAGE format locus file
- LINKAGE format pedigree file
Required intermediate files:
- Input files:
  - For apm - region.apm
  - For apmmult - region..apmmult
  Note: The intermediate files are created from LINKAGE format
  files using the APM utility program chapm.
- Output files:
  - For apm - table.apm
  - For sim - sim.apm
  - For apmmult - table..apmmult
  - For simmult - sim..apmmult
- results: Summary of results
Note: Other file names may be specified.
APM defines its own file formats for internal use.
apm needs its input file in ML format and apmmult needs its input file in
MULT format. The APM package includes a utility program, chapm, which
converts LINKAGE format pedigree and locus files to ML or MULT format.
VERY IMPORTANT:
Due to theoretical considerations, APM is much more stringent about
allele frequencies. None of the allele frequencies should be 0.0 or too small.
The allele frequencies at a locus should sum to a value in the interval
(0.999, 1.001).
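The rules above can be checked before running chapm. The following is a minimal sketch, not part of the APM package: the function name check_allele_frequencies is hypothetical, and because the handout does not say what counts as "too small", the min_freq cutoff of 0.001 is only an assumed placeholder that should be adjusted.

```python
import math


def check_allele_frequencies(freqs, min_freq=0.001, low=0.999, high=1.001):
    """Return a list of problems found in the allele frequencies of one locus.

    min_freq is an ASSUMED cutoff for "too small"; the handout gives no value.
    low and high are the bounds on the sum quoted in the handout.
    """
    problems = []
    for i, f in enumerate(freqs, start=1):
        if f <= 0.0:
            problems.append(f"allele {i}: frequency {f} is zero or negative")
        elif f < min_freq:
            problems.append(f"allele {i}: frequency {f} is below {min_freq}")
    # fsum avoids accumulating rounding error when there are many alleles
    total = math.fsum(freqs)
    if not (low < total < high):
        problems.append(f"frequencies sum to {total:.6f}, outside ({low}, {high})")
    return problems


if __name__ == "__main__":
    # Hypothetical frequencies for a single four-allele marker
    for message in check_allele_frequencies([0.40, 0.35, 0.25, 0.0]):
        print(message)
```

An empty list means the locus passed both checks. Run the function once per locus, since the sum rule applies to each locus separately.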
Using the chapm utility program
Usage: chapm
Steps:
- File format to convert from: L (for LINKAGE)
- File format to convert to: ML (for apm), MULT (for apmmult)
- Input pedigree and locus file names.
- Number of disease locus: 1
- Label for affecteds: 2
- Choose marker loci to keep
- Input name of output file
Using apm
Usage: apm
Steps:
- Input the data file name. VERY IMPORTANT:
This is the output file from chapm.
- Input the limitation on memory use in megabytes. Entering 0 will set it to about 20 megabytes.
- Now enter the name of the file of coefficients if it exists (if not, just press Return). Note: In cookbook usage, there is no coefficients file.
- If you wish to create a new file of coefficients, enter the name (if not, just press Return). Note: In cookbook usage, there is no coefficients file.
This creates table.out, out1.dat, out1p.dat, and outsqr.dat
Using sim
Usage: sim
Steps:
- Input integer i, 0 <= i <= 30,000. Note: This initializes the random number generator.
- Input data file name: It will be one of out1.dat, out1p.dat, outsqr.dat depending on the function you want to keep.
- Input the desired number of iterations (1000 is good): 1000
Creates sim.out and one tstat*.out for every marker used.
VERY IMPORTANT:
The p-values are at the top of sim.out.
Just above the p-values are the means and variances for the simulated runs.
If the mean is far from 0.0 and/or the variance is far from 1.0,
one should rerun with more replicates.
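To make the check on the simulated runs concrete, here is a minimal sketch of the same reasoning applied to a list of simulated statistics. It is not how sim itself is implemented. The function name summarize_replicates is hypothetical, the tolerances mean_tol and var_tol are assumed placeholders for "far from" (the handout gives no thresholds), and the empirical p-value is taken to be the plain proportion of replicates at least as large as the observed statistic, which is one common definition.

```python
from statistics import fmean, variance


def summarize_replicates(simulated, observed, mean_tol=0.1, var_tol=0.2):
    """Summarize simulated statistics for one marker.

    mean_tol and var_tol are ASSUMED cutoffs for "far from 0.0" and
    "far from 1.0"; choose values appropriate to your own analysis.
    """
    mean = fmean(simulated)
    var = variance(simulated)  # sample variance; needs at least 2 replicates
    # Proportion of replicates at least as extreme as the observed statistic
    exceed = sum(1 for s in simulated if s >= observed)
    p_value = exceed / len(simulated)
    needs_more = abs(mean) > mean_tol or abs(var - 1.0) > var_tol
    return {
        "mean": mean,
        "variance": var,
        "p_value": p_value,
        "needs_more_replicates": needs_more,
    }


if __name__ == "__main__":
    # Hypothetical replicate values, for illustration only
    print(summarize_replicates([-1.2, 0.3, 0.8, -0.4, 1.5, -0.9], observed=1.0))
```

If needs_more_replicates comes back true, the advice above applies: rerun the simulation with a larger number of replicates and compare again.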
Using apmmult
Usage: apmmult
Steps:
- Input the data file name. As with apm, this is the output file from chapm.
- Input the limitation on memory use in megabytes. Entering 0 will set it to about 20 megabytes.
- Input thetas between loci
This creates table.out, out1.dat, out1p.dat, and outsqr.dat
Using simmult
Usage: simmult
Steps:
- Print statistics for each family and each marker? (y/n): n
- Make a file full of statistics? (y/n): n
- Change seed for random number generator? (y/n): n
- Input data file name: outsqr.dat Note: Cookbook usage is to always use outsqr.dat here.
- Input thetas
- Input number of replicates
Creates XSIMSUM.OUT which contains simulation results.
BEWARE of WARNINGS about allele frequencies and their sums.