The errormodel command allows the user to specify an error distribution. CAFE corrects for this error before calculating ancestral family sizes and estimating λ values. The errormodel command is also used by caferror.py to estimate error in the input data set.
-model error model file: This option specifies the error model file used to correct the input data for errors. The error model file format should be as follows:
In this file, maxcnt is the largest family size observed in the dataset. The second line, cntdiff, defines the error classes, which label the columns of the error distributions in all following rows. Error classes must be space-delimited integers (positive, negative, or 0). Error class 0 corresponds to no change in gene family size due to error. After the first two lines, each possible family size in the dataset (0 to maxcnt) should have an error distribution defined; any omitted family size uses the distribution of the previous row. The error distribution for each family size should be given as space-delimited probabilities whose columns correspond to the error classes defined on line two.
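The rules above can be sketched as a small Python reader. This is an illustration only, not CAFE's own parser. It assumes the first line is written as "maxcnt: N", the second line starts with the token cntdiff, and each later row starts with its family size followed by the probabilities. The function name parse_error_model and the example values are hypothetical.

```python
def parse_error_model(text):
    """Read an error model laid out as described above (assumed syntax)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # Line 1 (assumed form "maxcnt: N"): largest observed family size.
    key, _, value = lines[0].partition(":")
    if key.strip() != "maxcnt":
        raise ValueError("first line must define maxcnt")
    maxcnt = int(value)

    # Line 2 (assumed form "cntdiff c1 c2 ..."): integer error classes.
    header = lines[1].split()
    if header[0] != "cntdiff":
        raise ValueError("second line must define cntdiff")
    classes = [int(tok) for tok in header[1:]]

    # Remaining rows (assumed form "size p1 p2 ..."), one per family size.
    rows = {}
    for ln in lines[2:]:
        tokens = ln.split()
        probs = [float(tok) for tok in tokens[1:]]
        if len(probs) != len(classes):
            raise ValueError("row width must match the number of error classes")
        rows[int(tokens[0])] = probs

    # An omitted family size reuses the distribution of the previous row.
    model = {}
    previous = None
    for size in range(maxcnt + 1):
        previous = rows.get(size, previous)
        if previous is None:
            raise ValueError("no distribution defined for family size 0")
        model[size] = dict(zip(classes, previous))
    return maxcnt, classes, model


# Hypothetical example: sizes 2 and 4 are omitted on purpose.
EXAMPLE = """\
maxcnt: 4
cntdiff -1 0 1
0 0.0 0.8 0.2
1 0.2 0.6 0.2
3 0.1 0.8 0.1
"""

maxcnt, classes, model = parse_error_model(EXAMPLE)
```

In this example, model[2] takes the distribution given for size 1 and model[4] takes the one given for size 3, which is the fallback behaviour described for omitted family sizes.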
Default: No error model is applied.
-sp: This option is required and specifies the species to which the error model will be applied. Species names must be identical to those in the data file and the input tree. Different species can be assigned the same or different error model files by issuing a separate errormodel command for each species. Alternatively, a single error model file can be applied to all species in one errormodel command by using -all as the species option.
-all (see above)
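The two ways of assigning error models can be illustrated by a short Python helper that builds the command lines. This is a sketch under assumptions: the helper name errormodel_command, the file names, and the species names are hypothetical, and the option order (-model before -sp or -all) simply follows the order in which the options are listed here.

```python
def errormodel_command(model_file, species=None):
    """Build one errormodel command line; species=None selects -all."""
    target = "-all" if species is None else f"-sp {species}"
    return f"errormodel -model {model_file} {target}"


# One command per species, each with its own (hypothetical) error model file.
per_species = [
    errormodel_command("errors_a.txt", "human"),
    errormodel_command("errors_b.txt", "chimp"),
]

# A single command applying one error model file to every species.
all_species = errormodel_command("errors_a.txt")

for line in per_species + [all_species]:
    print(line)
```

The species names passed to the helper would need to match those in the data file and the input tree exactly.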