trtools.dumpSTR module

trtools.dumpSTR.ApplyCallFilters(record, call_filters, sample_info, sample_names)

Apply call-level filters to a record.

Returns a TRRecord object with the FILTER (or DUMPSTR_FILTER) format field updated for each sample. Also updates sample_info with sample-level stats.

Parameters

record (trtools.utils.tr_harmonizer.TRRecord) – The record to apply filters to. Note: once this method has been run, this object will be in an inconsistent state. All further use should be directed towards the returned TRRecord object.
call_filters (List[trtools.dumpSTR.filters.FilterBase]) – List of call filters to apply
sample_info (Dict[str, numpy.ndarray]) – Dictionary of sample stats to keep updated, from name of filter to array of length nsamples which counts the number of times that filter has been applied to each sample across all loci
sample_names (List[str]) – Names of all the samples in the vcf. Used for formatting error messages.

Returns

A TRRecord that refers to the same underlying cyvcf2.Variant object, which has now been modified to contain all the new call-level filters.

Return type

trh.TRRecord
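
The per-sample bookkeeping described for sample_info can be illustrated with a minimal sketch. It does not call trtools: the helper count_filtered_calls and the filter names "MinDP" and "MinQ" are hypothetical, and plain lists stand in for the numpy arrays the real function expects.

```python
from typing import Dict, List, Optional


def count_filtered_calls(
    sample_info: Dict[str, List[int]],
    call_reasons: List[Optional[str]],
) -> None:
    """Update per-sample filter counts for one locus (illustrative only).

    call_reasons[i] is the name of the filter that removed sample i's call
    at this locus, or None if the call passed.
    """
    for sample_idx, reason in enumerate(call_reasons):
        if reason is not None:
            sample_info[reason][sample_idx] += 1


# Hypothetical filter names; three samples, two loci.
sample_info = {"MinDP": [0, 0, 0], "MinQ": [0, 0, 0]}
count_filtered_calls(sample_info, [None, "MinDP", "MinQ"])   # locus 1
count_filtered_calls(sample_info, ["MinDP", "MinDP", None])  # locus 2
print(sample_info)
```

Each array keeps one slot per sample, so the counts accumulate across loci without needing to store any per-locus state.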

trtools.dumpSTR.ApplyLocusFilters(record, locus_filters, loc_info, drop_filtered)

Apply locus-level filters to a record.

If not drop_filtered, then the input record’s FILTER field is set as either PASS or the names of the filters which filtered it.

Parameters

record (trtools.utils.tr_harmonizer.TRRecord) – The record to apply filters to.
locus_filters (List[trtools.dumpSTR.filters.FilterBase]) – List of locus filters to apply
loc_info (Dict[str, int]) – Dictionary of locus stats to keep updated, from name of filter to count of times the filter has been applied
drop_filtered (bool) – If True, filtered loci are dropped from the output VCF. If False, they are written with their FILTER field set.

Returns

locus_filtered – True if this locus was filtered

Return type

bool
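
As a rough illustration of how the FILTER value and loc_info relate, here is a minimal sketch that does not use trtools. The function locus_filter_field, the (name, predicate) filter representation, and the filter names are all hypothetical, and counting "PASS" in loc_info this way is an assumption.

```python
from typing import Callable, Dict, List, Tuple

# A hypothetical stand-in for a locus filter: a name plus a predicate that
# returns True when the locus should be filtered.
LocusFilter = Tuple[str, Callable[[dict], bool]]


def locus_filter_field(
    locus: dict,
    locus_filters: List[LocusFilter],
    loc_info: Dict[str, int],
) -> str:
    """Return the FILTER value for a locus and update loc_info counts."""
    failed = [name for name, is_filtered in locus_filters if is_filtered(locus)]
    for name in failed:
        loc_info[name] = loc_info.get(name, 0) + 1
    if not failed:
        loc_info["PASS"] = loc_info.get("PASS", 0) + 1
        return "PASS"
    # The VCF specification separates multiple filter names with semicolons.
    return ";".join(failed)


filters = [
    ("LowCallRate", lambda locus: locus["callrate"] < 0.8),
    ("TooLong", lambda locus: locus["ref_len"] > 100),
]
loc_info: Dict[str, int] = {}
first = locus_filter_field({"callrate": 0.95, "ref_len": 40}, filters, loc_info)
second = locus_filter_field({"callrate": 0.50, "ref_len": 150}, filters, loc_info)
print(first, second, loc_info)
```

A caller that drops filtered loci would simply skip writing any record whose result is not "PASS".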

trtools.dumpSTR.BuildCallFilters(args)

Build list of call-level filters to include

Parameters: args (argparse namespace) – User input arguments used to decide on filters
Returns: filter_list – List of call-level filters to apply
Return type: list of filters.Filter

trtools.dumpSTR.BuildLocusFilters(args)

Build list of locus-level filters to include.

In general, these filters should not be tool-specific.

Parameters: args (argparse namespace) – User input arguments used to decide on filters
Returns: filter_list – List of locus-level filters
Return type: list of filters.Filter

trtools.dumpSTR.CheckAdVNTRFilters(format_fields, args)

Check adVNTR call-level filters

Parameters

format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckEHFilters(format_fields, args)

Check ExpansionHunter call-level filters

Parameters

format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckFilters(format_fields, args, vcftype, is_beagle)

Perform checks on user input for filters.

Assert that user input matches the type of the input vcf.

Parameters

format_fields (Set[str]) – The format fields used in this VCF
args (argparse.Namespace) – Contains user arguments
vcftype (trtools.utils.tr_harmonizer.VcfTypes) – Specifies which tool this VCF came from.
is_beagle (bool) – Was this VCF generated by Beagle imputation?

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool
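
The idea of checking that user input matches the type of the input VCF can be sketched as follows. This is not the dumpSTR implementation: the function check_filters_sketch, the option names, the prefix convention, and the use of a plain string in place of the VcfTypes enum are all assumptions made for illustration.

```python
import argparse


def check_filters_sketch(args: argparse.Namespace, vcftype: str) -> bool:
    """Return False if a tool-specific option is set for the wrong VCF type."""
    # Hypothetical mapping from argument-name prefix to the tool it belongs to.
    tool_prefixes = {"hipstr_": "hipstr", "gangstr_": "gangstr"}
    for name, value in vars(args).items():
        if value is None:
            continue  # option not supplied by the user
        for prefix, tool in tool_prefixes.items():
            if name.startswith(prefix) and tool != vcftype:
                return False
    return True


# Hypothetical arguments: only a HipSTR-specific option has been set.
args = argparse.Namespace(hipstr_min_call_DP=10, gangstr_min_call_DP=None)
print(check_filters_sketch(args, "hipstr"))
print(check_filters_sketch(args, "gangstr"))
```

Returning a boolean rather than raising lets the caller decide how to report the problem and exit.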

trtools.dumpSTR.CheckGangSTRFilters(format_fields, args)

Check GangSTR call-level filters

Parameters

format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckHipSTRFilters(format_fields, args)

Check HipSTR call-level filters

Parameters

format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckLocusFilters(args, vcftype, is_beagle)

Perform checks on user inputs for locus-level filters

Parameters

args (argparse namespace) – Contains user arguments
vcftype (trtools.utils.tr_harmonizer.VcfTypes) – Specifies which tool this VCF came from.
is_beagle (bool) – Was this VCF generated by Beagle imputation?

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckPopSTRFilters(format_fields, args)

Check PopSTR call-level filters

Parameters

format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.GetAllCallFilters(call_filters)

List all possible call filters

Parameters: call_filters (list of filters.Reason) – List of all call-level filters
Returns: reasons – A list of call-level filter reason strings
Return type: list of str

trtools.dumpSTR.MakeWriter(outfile, invcf, command)

Create a VCF writer with a dumpSTR header

Adds a header line with the dumpSTR command used

Parameters

outfile (str) – Name of the output file
invcf (vcf.Reader object) – Input VCF. Used to grab header info
command (str) – String command used to run dumpSTR

Returns

writer – VCF writer initialized with the header of the input VCF. Set to None if we had a problem writing the file

Return type

vcf.Writer object

trtools.dumpSTR.WriteLocLog(loc_info, fname)

Write locus-level features to log file

Parameters

loc_info (dict of str->value) – Dictionary containing locus-level stats. Must have at least keys: ‘totalcalls’, ‘PASS’
fname (str) – Output log filename

Returns

success – Set to True if outputting the log was successful

Return type

bool
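
A minimal sketch of a writer with the same contract (a loc_info dictionary in, a success flag out) might look like this. The function write_loc_log_sketch, the tab-separated layout, the file name, and the "LowCallRate" key are assumptions. The actual log format written by dumpSTR may differ.

```python
import os
import tempfile
from typing import Dict


def write_loc_log_sketch(loc_info: Dict[str, int], fname: str) -> bool:
    """Write one 'key<TAB>value' line per statistic; return True on success."""
    try:
        with open(fname, "w") as f:
            for key in sorted(loc_info):
                f.write(f"{key}\t{loc_info[key]}\n")
    except OSError:
        # Covers unwritable paths, missing directories, and similar failures.
        return False
    return True


loc_info = {"totalcalls": 12, "PASS": 3, "LowCallRate": 1}
with tempfile.TemporaryDirectory() as tmpdir:
    log_path = os.path.join(tmpdir, "example.loclog.tab")
    ok = write_loc_log_sketch(loc_info, log_path)
    with open(log_path) as f:
        lines = f.read().splitlines()
print(ok, lines)
```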

trtools.dumpSTR.WriteSampLog(sample_info, sample_names, fname)

Write sample-level features to log file.

Parameters

sample_info (Dict[str, numpy.ndarray]) – Mapping from statistic name to 1D array of values per sample
sample_names (List[str]) – List of sample names, same length as above arrays
fname (str) – Output filename
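
To make the relationship between sample_info and sample_names concrete, the sketch below lays the statistics out as one row per sample. It does not reproduce the dumpSTR log format: the function format_sample_log, the column layout, and the statistic names are hypothetical, and plain lists stand in for 1D numpy arrays.

```python
from typing import Dict, List, Sequence


def format_sample_log(
    sample_info: Dict[str, Sequence[int]],
    sample_names: List[str],
) -> List[str]:
    """Build tab-separated rows: a header, then one row per sample."""
    stats = list(sample_info)  # column order follows dict insertion order
    rows = ["\t".join(["sample"] + stats)]
    for i, name in enumerate(sample_names):
        values = [str(sample_info[stat][i]) for stat in stats]
        rows.append("\t".join([name] + values))
    return rows


# Hypothetical statistic names; each array has one entry per sample.
sample_info = {"numcalls": [10, 8], "MinDP": [1, 3]}
rows = format_sample_log(sample_info, ["NA12878", "NA12891"])
print("\n".join(rows))
```

Index i in every array refers to sample_names[i], which is why the arrays and the name list must have the same length.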

trtools.dumpSTR.getargs()

trtools.dumpSTR.main(args)

trtools.dumpSTR.run()