trtools.dumpSTR module
- trtools.dumpSTR.ApplyCallFilters(record, call_filters, sample_info, sample_names)
Apply call-level filters to a record.
Returns a TRRecord object with the FILTER (or DUMPSTR_FILTER) format field updated for each sample. Also updates sample_info with sample-level stats.
- Parameters
record (trtools.utils.tr_harmonizer.TRRecord) – The record to apply filters to. Note: once this method has been run, this object will be in an inconsistent state. All further use should be directed towards the returned TRRecord object.
call_filters (List[trtools.dumpSTR.filters.FilterBase]) – List of call filters to apply
sample_info (Dict[str, numpy.ndarray]) – Dictionary of sample stats to keep updated, from name of filter to array of length nsamples which counts the number of times that filter has been applied to each sample across all loci
sample_names (List[str]) – Names of all the samples in the vcf. Used for formatting error messages.
- Returns
A reference to the same underlying cyvcf2.Variant object, which has now been modified to contain all the new call-level filters.
- Return type
trh.TRRecord
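The sample_info bookkeeping described above can be illustrated with a minimal sketch. It does not use trtools: the helper update_sample_info and the filter names "LowDepth" and "LowQual" are hypothetical, and plain Python lists stand in for the numpy arrays of length nsamples.

```python
from typing import Dict, List, Optional


def update_sample_info(sample_info: Dict[str, List[int]],
                       applied: List[Optional[str]]) -> None:
    """Hypothetical helper: count which filter hit each sample at one locus.

    applied[i] is the name of the filter applied to sample i at this locus,
    or None if the call for sample i was not filtered.
    """
    for idx, name in enumerate(applied):
        if name is None:
            continue
        # One counter per sample for every filter name seen so far
        # (a stand-in for a numpy array of length nsamples).
        if name not in sample_info:
            sample_info[name] = [0] * len(applied)
        sample_info[name][idx] += 1


sample_names = ["S1", "S2", "S3"]
sample_info: Dict[str, List[int]] = {}

# Two loci: counts accumulate across loci, one slot per sample.
update_sample_info(sample_info, ["LowDepth", None, "LowDepth"])
update_sample_info(sample_info, [None, None, "LowQual"])
```

Because the input record is left in an inconsistent state, calling code would normally rebind the name to the return value, for example record = ApplyCallFilters(record, call_filters, sample_info, sample_names).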
- trtools.dumpSTR.ApplyLocusFilters(record, locus_filters, loc_info, drop_filtered)
Apply locus-level filters to a record.
If not drop_filtered, then the input record’s FILTER field is set as either PASS or the names of the filters which filtered it.
- Parameters
record (trtools.utils.tr_harmonizer.TRRecord) – The record to apply filters to.
locus_filters (List[trtools.dumpSTR.filters.FilterBase]) – List of locus filters to apply
loc_info (Dict[str, int]) – Dictionary of locus stats to keep updated, from name of filter to count of times the filter has been applied
drop_filtered (bool) – Whether or not filtered loci should be written to or dropped from the output vcf.
- Returns
locus_filtered – True if this locus was filtered
- Return type
bool
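The behavior described for ApplyLocusFilters can be sketched roughly as follows. This is not the trtools implementation: the record is a plain dict, each filter is a hypothetical (name, predicate) pair, the filter names and thresholds are invented, and joining several filter names with ";" is an assumption borrowed from the VCF FILTER column convention.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a locus filter: a name plus a predicate that
# returns True when the record should be filtered.
LocusFilter = Tuple[str, Callable[[dict], bool]]


def apply_locus_filters_sketch(record: dict,
                               locus_filters: List[LocusFilter],
                               loc_info: Dict[str, int],
                               drop_filtered: bool) -> bool:
    """Return True if any locus-level filter applies to the record."""
    failed = [name for name, is_filtered in locus_filters if is_filtered(record)]
    # Keep a running count of how often each filter has been applied.
    for name in failed:
        loc_info[name] = loc_info.get(name, 0) + 1
    # Only annotate FILTER when filtered loci are kept in the output.
    if not drop_filtered:
        record["FILTER"] = ";".join(failed) if failed else "PASS"
    return bool(failed)


filters: List[LocusFilter] = [
    ("LowCallRate", lambda r: r["callrate"] < 0.8),  # hypothetical threshold
    ("HWE", lambda r: r["hwep"] < 0.01),             # hypothetical threshold
]
loc_info: Dict[str, int] = {}
record = {"FILTER": None, "callrate": 0.5, "hwep": 0.5}
locus_filtered = apply_locus_filters_sketch(record, filters, loc_info,
                                            drop_filtered=False)
```

With drop_filtered set to True, the sketch leaves FILTER untouched and the caller would skip writing the record when the return value is True.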
- trtools.dumpSTR.BuildCallFilters(args)
Build list of call-level filters to include
- Parameters
args (argparse namespace) – User input arguments used to decide on filters
- Returns
filter_list – List of call-level filters to apply
- Return type
list of filters.Filter
- trtools.dumpSTR.BuildLocusFilters(args)
Build list of locus-level filters to include.
In general, these filters should not be tool-specific.
- Parameters
args (argparse namespace) – User input arguments used to decide on filters
- Returns
filter_list – List of locus-level filters
- Return type
list of filters.Filter
- trtools.dumpSTR.CheckAdVNTRFilters(format_fields, args)
Check adVNTR call-level filters
- Parameters
format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments
- Returns
checks – Set to True if all filters look ok. Set to False if filters are invalid
- Return type
bool
- trtools.dumpSTR.CheckEHFilters(format_fields, args)
Check ExpansionHunter call-level filters
- Parameters
format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments
- Returns
checks – Set to True if all filters look ok. Set to False if filters are invalid
- Return type
bool
- trtools.dumpSTR.CheckFilters(format_fields, args, vcftype, is_beagle)
Perform checks on user input for filters.
Assert that user input matches the type of the input vcf.
- Parameters
format_fields (Set[str]) – The format fields used in this VCF
args (argparse.Namespace) – Contains user arguments
vcftype (trtools.utils.tr_harmonizer.VcfTypes) – Specifies which tool this VCF came from.
is_beagle (bool) – Was this VCF generated by Beagle imputation?
- Returns
checks – Set to True if all filters look ok. Set to False if filters are invalid
- Return type
bool
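The check functions in this module share one pattern: inspect the user arguments against the format fields present in the VCF and return a bool instead of raising. A minimal sketch of that pattern follows, assuming a hypothetical option named min_call_dp that requires a DP format field. Neither the option name nor the rule is taken from dumpSTR itself.

```python
import argparse
from typing import Set


def check_depth_filter_sketch(format_fields: Set[str],
                              args: argparse.Namespace) -> bool:
    """Return True if the (hypothetical) depth filter arguments look ok."""
    if args.min_call_dp is None:
        # Filter not requested, so there is nothing to validate.
        return True
    if args.min_call_dp < 0:
        # A negative depth threshold is never meaningful.
        return False
    if "DP" not in format_fields:
        # The filter needs a format field this VCF does not provide.
        return False
    return True


args = argparse.Namespace(min_call_dp=10)
ok_with_dp = check_depth_filter_sketch({"GT", "DP"}, args)
ok_without_dp = check_depth_filter_sketch({"GT"}, args)
```

A caller following this pattern would stop before processing any records when a check returns False.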
- trtools.dumpSTR.CheckGangSTRFilters(format_fields, args)
Check GangSTR call-level filters
- Parameters
format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments
- Returns
checks – Set to True if all filters look ok. Set to False if filters are invalid
- Return type
bool
- trtools.dumpSTR.CheckHipSTRFilters(format_fields, args)
Check HipSTR call-level filters
- Parameters
format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments
- Returns
checks – Set to True if all filters look ok. Set to False if filters are invalid
- Return type
bool
- trtools.dumpSTR.CheckLocusFilters(args, vcftype, is_beagle)
Perform checks on user inputs for locus-level filters
- Parameters
args (argparse namespace) – Contains user arguments
vcftype (enum) – Specifies which tool this VCF came from. Must be included in trh.VCFTYPES.
is_beagle (bool) – Was this VCF generated by Beagle imputation?
- Returns
checks – Set to True if all filters look ok. Set to False if filters are invalid
- Return type
bool
- trtools.dumpSTR.CheckLongTRFilters(format_fields, args)
Check LongTR call-level filters
- Parameters
format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments
- Returns
checks – Set to True if all filters look ok. Set to False if filters are invalid
- Return type
bool
- trtools.dumpSTR.CheckPopSTRFilters(format_fields, args)
Check PopSTR call-level filters
- Parameters
format_fields – The format fields used in this VCF
args (argparse namespace) – Contains user arguments
- Returns
checks – Set to True if all filters look ok. Set to False if filters are invalid
- Return type
bool
- trtools.dumpSTR.GetAllCallFilters(call_filters)
List all possible call filters
- Parameters
call_filters (list of filters.Reason) – List of all call-level filters
- Returns
reasons – A list of call-level filter reason strings
- Return type
list of str
- trtools.dumpSTR.MakeWriter(outfile, invcf, command)
Create a VCF writer with a dumpSTR header
Adds a header line with the dumpSTR command used
- Parameters
outfile (str) – Name of the output file
invcf (vcf.Reader object) – Input VCF. Used to grab header info
command (str) – String command used to run dumpSTR
- Returns
writer – VCF writer initialized with header of input VCF. Set to None if we had a problem writing the file
- Return type
vcf.Writer object
- trtools.dumpSTR.WriteLocLog(loc_info, fname)
Write locus-level features to log file
- Parameters
loc_info (dict of str->value) – Dictionary containing locus-level stats. Must have at least keys: ‘totalcalls’, ‘PASS’
fname (str) – Output log filename
- Returns
success – Set to True if outputting the log was successful
- Return type
bool
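To make the loc_info contract concrete, here is a minimal sketch of a writer that requires the 'totalcalls' and 'PASS' keys and reports success as a bool. The one-line-per-key, tab-separated layout and the "HWE" key are assumptions for illustration and may not match the log format dumpSTR actually produces.

```python
import os
import tempfile
from typing import Dict


def write_loc_log_sketch(loc_info: Dict[str, int], fname: str) -> bool:
    """Write locus-level stats to fname; return True on success."""
    # The documented minimum set of keys.
    for key in ("totalcalls", "PASS"):
        if key not in loc_info:
            return False
    try:
        with open(fname, "w") as f:
            # Hypothetical layout: one "key<TAB>value" line per stat.
            for key in sorted(loc_info):
                f.write(f"{key}\t{loc_info[key]}\n")
    except OSError:
        return False
    return True


log_path = os.path.join(tempfile.mkdtemp(), "example.loclog.tab")
success = write_loc_log_sketch({"totalcalls": 30, "PASS": 8, "HWE": 2}, log_path)
```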
- trtools.dumpSTR.WriteSampLog(sample_info, sample_names, fname)
Write sample-level features to log file.
- Parameters
sample_info (Dict[str, numpy.ndarray]) – Mapping from statistic name to 1D array of values per sample
sample_names (List[str]) – List of sample names, same length as above arrays
fname (str) – Output filename
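The relationship between sample_info and sample_names can be sketched as a table with one row per sample and one column per statistic. The column layout, the header name "sample", and the statistic names below are assumptions rather than the format dumpSTR writes, and plain lists stand in for the 1D numpy arrays.

```python
from typing import Dict, List


def format_sample_log_sketch(sample_info: Dict[str, List[float]],
                             sample_names: List[str]) -> str:
    """Render per-sample stats as tab-separated text, one row per sample."""
    stats = list(sample_info)
    for stat in stats:
        # Every array must have exactly one value per sample.
        if len(sample_info[stat]) != len(sample_names):
            raise ValueError(f"{stat} does not have one value per sample")
    lines = ["\t".join(["sample"] + stats)]
    for idx, name in enumerate(sample_names):
        row = [name] + [str(sample_info[stat][idx]) for stat in stats]
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


table = format_sample_log_sketch(
    {"numcalls": [10, 12], "LowDepth": [1, 0]},  # hypothetical stat names
    ["S1", "S2"],
)
```

Writing the returned string to fname would complete the sketch.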
- trtools.dumpSTR.getargs()
- trtools.dumpSTR.main(args)
- trtools.dumpSTR.run()