trtools.dumpSTR module

trtools.dumpSTR.ApplyCallFilters(record, call_filters, sample_info, sample_names)

Apply call-level filters to a record.

Returns a TRRecord object with the FILTER (or DUMPSTR_FILTER) format field updated for each sample. Also updates sample_info with sample-level stats.

Parameters
  • record (trtools.utils.tr_harmonizer.TRRecord) – The record to apply filters to. Note: once this method has been run, this object will be in an inconsistent state. All further use should be directed towards the returned TRRecord object.

  • call_filters (List[trtools.dumpSTR.filters.FilterBase]) – List of call filters to apply

  • sample_info (Dict[str, numpy.ndarray]) – Dictionary of sample stats to keep updated, from name of filter to array of length nsamples which counts the number of times that filter has been applied to each sample across all loci

  • sample_names (List[str]) – Names of all the samples in the vcf. Used for formatting error messages.

Returns

A TRRecord wrapping the same underlying cyvcf2.Variant object, which has now been modified to contain all the new call-level filters.

Return type

trh.TRRecord
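The sample_info bookkeeping described above can be illustrated with a minimal sketch. It does not call trtools: record_filter_hits and the filter names are hypothetical, and plain lists stand in for the numpy arrays so that the example has no dependencies.

```python
from typing import Dict, List, Optional


def record_filter_hits(
    sample_info: Dict[str, List[int]],
    reasons: List[Optional[str]],
) -> None:
    """Increment per-sample counters for one locus.

    reasons[i] is the name of the filter that removed sample i's call
    at this locus, or None if the call passed. (Hypothetical helper,
    not part of trtools.)
    """
    for idx, reason in enumerate(reasons):
        if reason is not None:
            sample_info[reason][idx] += 1


sample_names = ["S1", "S2", "S3"]
# One counter list per filter name, one slot per sample.
sample_info = {name: [0] * len(sample_names) for name in ("MinDP", "MaxDP")}

# Two loci: at the first, S1 fails MinDP; at the second, S1 fails MinDP
# again and S3 fails MaxDP.
record_filter_hits(sample_info, ["MinDP", None, None])
record_filter_hits(sample_info, ["MinDP", None, "MaxDP"])

print(sample_info)
```

After both loci, each entry holds the number of times that filter was applied to each sample, which is the shape of data that sample_info is documented to carry.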

trtools.dumpSTR.ApplyLocusFilters(record, locus_filters, loc_info, drop_filtered)

Apply locus-level filters to a record.

If drop_filtered is False, the input record’s FILTER field is set to either PASS or the names of the filters that filtered it.

Parameters
  • record (trtools.utils.tr_harmonizer.TRRecord) – The record to apply filters to.

  • locus_filters (List[trtools.dumpSTR.filters.FilterBase]) – List of locus filters to apply

  • loc_info (Dict[str, int]) – Dictionary of locus stats to keep updated, from name of filter to count of times the filter has been applied

  • drop_filtered (bool) – Whether or not filtered loci should be written to or dropped from the output vcf.

Returns

locus_filtered – True if this locus was filtered

Return type

bool
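The behaviour described for ApplyLocusFilters can be sketched as follows. This is an illustration under stated assumptions, not the trtools implementation: the two filter classes, their fails method, and the dictionary standing in for a record are all hypothetical, and joining filter names with a semicolon follows the general VCF convention for the FILTER column.

```python
from typing import Dict, List


class MinCallRate:
    """Hypothetical locus filter: fails loci whose call rate is too low."""

    name = "MinCallRate"

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def fails(self, locus: Dict) -> bool:
        return locus["call_rate"] < self.threshold


class MaxAlleles:
    """Hypothetical locus filter: fails loci with too many alleles."""

    name = "MaxAlleles"

    def __init__(self, limit: int) -> None:
        self.limit = limit

    def fails(self, locus: Dict) -> bool:
        return locus["n_alleles"] > self.limit


def apply_locus_filters_sketch(
    locus: Dict, locus_filters: List, loc_info: Dict[str, int], drop_filtered: bool
) -> bool:
    """Update loc_info, optionally set FILTER, and report whether the locus failed."""
    failed = [f.name for f in locus_filters if f.fails(locus)]
    for name in failed:
        loc_info[name] = loc_info.get(name, 0) + 1
    if not failed:
        loc_info["PASS"] = loc_info.get("PASS", 0) + 1
    if not drop_filtered:
        # Only annotate the record when filtered loci are kept in the output.
        locus["FILTER"] = ";".join(failed) if failed else "PASS"
    return bool(failed)


filters = [MinCallRate(0.8), MaxAlleles(10)]
loc_info: Dict[str, int] = {}

good = {"call_rate": 0.95, "n_alleles": 4}
bad = {"call_rate": 0.50, "n_alleles": 12}

good_filtered = apply_locus_filters_sketch(good, filters, loc_info, drop_filtered=False)
bad_filtered = apply_locus_filters_sketch(bad, filters, loc_info, drop_filtered=False)
```

With drop_filtered=True the sketch would leave FILTER untouched and the caller would use the returned flag to decide whether to write the record at all.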

trtools.dumpSTR.BuildCallFilters(args)

Build list of call-level filters to include

Parameters

args (argparse namespace) – User input arguments used to decide on filters

Returns

filter_list – List of call-level filters to apply

Return type

list of filters.Filter

trtools.dumpSTR.BuildLocusFilters(args)

Build list of locus-level filters to include.

These filters should in general not be tool-specific.

Parameters

args (argparse namespace) – User input arguments used to decide on filters

Returns

filter_list – List of locus-level filters

Return type

list of filters.Filter

trtools.dumpSTR.CheckAdVNTRFilters(format_fields, args)

Check adVNTR call-level filters

Parameters
  • format_fields – The format fields used in this VCF

  • args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckEHFilters(format_fields, args)

Check ExpansionHunter call-level filters

Parameters
  • format_fields – The format fields used in this VCF

  • args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckFilters(format_fields, args, vcftype, is_beagle)

Perform checks on user input for filters.

Assert that user input matches the type of the input vcf.

Parameters
  • format_fields (Set[str]) – The format fields used in this VCF

  • args (argparse.Namespace) – Contains user arguments

  • vcftype (trtools.utils.tr_harmonizer.VcfTypes) – Specifies which tool this VCF came from.

  • is_beagle (bool) – Was this VCF generated by Beagle imputation?

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool
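The kind of check these functions perform can be sketched as below. The argument name min_call_depth and the helper check_min_depth_filter are hypothetical and are not dumpSTR options. The sketch only shows the documented pattern of returning True when the requested filters are usable with the FORMAT fields present in the VCF, and False otherwise.

```python
import argparse
from typing import Set


def check_min_depth_filter(format_fields: Set[str], args: argparse.Namespace) -> bool:
    """Return False if the (hypothetical) depth filter cannot be applied."""
    if args.min_call_depth is None:
        return True  # filter not requested, nothing to validate
    if args.min_call_depth < 0:
        return False  # a negative depth threshold is meaningless
    if "DP" not in format_fields:
        return False  # the VCF has no per-call depth to filter on
    return True


requested = argparse.Namespace(min_call_depth=10)
not_requested = argparse.Namespace(min_call_depth=None)

ok_with_dp = check_min_depth_filter({"GT", "DP"}, requested)
ok_without_dp = check_min_depth_filter({"GT"}, requested)
ok_unrequested = check_min_depth_filter({"GT"}, not_requested)
```

Returning a boolean rather than raising lets the caller run every check and report a single failure status.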

trtools.dumpSTR.CheckGangSTRFilters(format_fields, args)

Check GangSTR call-level filters

Parameters
  • format_fields – The format fields used in this VCF

  • args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckHipSTRFilters(format_fields, args)

Check HipSTR call-level filters

Parameters
  • format_fields – The format fields used in this VCF

  • args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckLocusFilters(args, vcftype, is_beagle)

Perform checks on user inputs for locus-level filters

Parameters
  • args (argparse namespace) – Contains user arguments

  • vcftype (trtools.utils.tr_harmonizer.VcfTypes) – Specifies which tool this VCF came from.

  • is_beagle (bool) – Was this VCF generated by Beagle imputation?

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.CheckPopSTRFilters(format_fields, args)

Check PopSTR call-level filters

Parameters
  • format_fields – The format fields used in this VCF

  • args (argparse namespace) – Contains user arguments

Returns

checks – Set to True if all filters look ok. Set to False if filters are invalid

Return type

bool

trtools.dumpSTR.GetAllCallFilters(call_filters)

List all possible call filters

Parameters

call_filters (list of filters.Reason) – List of all call-level filters

Returns

reasons – A list of call-level filter reason strings

Return type

list of str

trtools.dumpSTR.MakeWriter(outfile, invcf, command)

Create a VCF writer with a dumpSTR header

Adds a header line with the dumpSTR command used

Parameters
  • outfile (str) – Name of the output file

  • invcf (vcf.Reader object) – Input VCF. Used to grab header info

  • command (str) – String command used to run dumpSTR

Returns

writer – VCF writer initialized with the header of the input VCF. Set to None if we had a problem writing the file.

Return type

vcf.Writer object

trtools.dumpSTR.WriteLocLog(loc_info, fname)

Write locus-level features to log file

Parameters
  • loc_info (dict of str->value) – Dictionary containing locus-level stats. Must have at least keys: ‘totalcalls’, ‘PASS’

  • fname (str) – Output log filename

Returns

success – Set to True if the log was written successfully

Return type

bool
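A minimal sketch of a locus-log writer with the same contract (a dictionary in, a success flag out) might look like the following. The one-line-per-key, tab-separated layout and the name write_loc_log_sketch are assumptions made for illustration. The actual log format written by dumpSTR may differ.

```python
import os
import tempfile
from typing import Dict


def write_loc_log_sketch(loc_info: Dict[str, int], fname: str) -> bool:
    """Write one 'key<TAB>value' line per statistic; return False on I/O errors."""
    try:
        with open(fname, "w") as f:
            for key, value in loc_info.items():
                f.write(f"{key}\t{value}\n")
    except OSError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example_loclog.tab")
    ok = write_loc_log_sketch({"totalcalls": 10, "PASS": 7, "MinCallRate": 3}, path)
    with open(path) as f:
        written = f.read()
    # A path inside a directory that does not exist cannot be opened.
    failed = write_loc_log_sketch({}, os.path.join(tmpdir, "missing", "x.tab"))
```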

trtools.dumpSTR.WriteSampLog(sample_info, sample_names, fname)

Write sample-level features to log file.

Parameters
  • sample_info (Dict[str, numpy.ndarray]) – Mapping from statistic name to 1D array of values per sample

  • sample_names (List[str]) – List of sample names, same length as above arrays

  • fname (str) – Output filename
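The relationship between sample_info and sample_names can be sketched as a table with one row per sample and one column per statistic. The column layout, the header row, and the name write_samp_log_sketch are assumptions for illustration, and plain lists are used in place of numpy arrays.

```python
import os
import tempfile
from typing import Dict, List, Sequence


def write_samp_log_sketch(
    sample_info: Dict[str, Sequence[int]],
    sample_names: List[str],
    fname: str,
) -> None:
    """Write a tab-separated table: one row per sample, one column per statistic."""
    stat_names = list(sample_info)
    with open(fname, "w") as f:
        f.write("\t".join(["sample"] + stat_names) + "\n")
        for idx, sample in enumerate(sample_names):
            # The idx-th entry of every array belongs to the idx-th sample.
            values = [str(sample_info[stat][idx]) for stat in stat_names]
            f.write("\t".join([sample] + values) + "\n")


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example_samplog.tab")
    write_samp_log_sketch(
        {"numcalls": [5, 4], "MinDP": [0, 1]},
        ["S1", "S2"],
        path,
    )
    with open(path) as f:
        rows = f.read().splitlines()
```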

trtools.dumpSTR.getargs()
trtools.dumpSTR.main(args)
trtools.dumpSTR.run()