trtools.prancSTR module

trtools.prancSTR.ComputePvalue(reads, A, B, best_C, best_f, stutter_probs)

Compute the p-value for testing H0: f=0

Parameters
  • reads (list of int) – list of repeat lengths seen in each read

  • A (int) – First allele of the genotype

  • B (int) – Second allele of the genotype

  • best_C (int) – Estimated mosaic allele

  • best_f (float) – Estimated mosaic fraction

  • stutter_probs (list of floats) – stutter probs for each delta

Returns

pval – P-value testing H0: f=0

Return type

float
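A test of H0: f=0 places f on the boundary of its allowed range, which is commonly handled with a likelihood ratio test whose null distribution is an equal mixture of a point mass at 0 and a chi-square with one degree of freedom. The sketch below illustrates that idea only. It is not the trtools source: the function names, the use of precomputed log-likelihoods as inputs, and the convention chosen at exactly 0 are all assumptions.

```python
import math

def point_mass_sf(x):
    """P(X > x) for a variable that is always 0 (boundary convention assumed)."""
    return 1.0 if x < 0 else 0.0

def chi2_1_sf(x):
    """Survival function of a chi-square with 1 degree of freedom."""
    if x <= 0:
        return 1.0
    # For Z ~ N(0, 1): P(Z^2 > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(x / 2.0))

def lrt_pvalue(loglik_null, loglik_alt):
    """Hypothetical helper: mixture p-value from two log-likelihoods.

    loglik_null is the log-likelihood with f fixed at 0,
    loglik_alt is the log-likelihood at the fitted (C, f).
    """
    stat = -2.0 * (loglik_null - loglik_alt)
    return 0.5 * point_mass_sf(stat) + 0.5 * chi2_1_sf(stat)

# A large improvement in log-likelihood gives a small p-value.
print(lrt_pvalue(-16.25, -6.05))
```

Under this convention a test statistic of about 3.84 gives a p-value of about 0.025, half of the usual chi-square value.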

trtools.prancSTR.ConfineRange(x, minval, maxval)

Confine a number to lie between minval and maxval

Parameters
  • x (numeric) – The value to be constrained

  • minval (numeric) – The minimum value the output can take

  • maxval (numeric) – The maximum value the output can take

Returns

x_cons – New value, which cannot exceed maxval or go below minval

Return type

numeric
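The clamping behaviour described above can be sketched in a few lines. This is an illustration rather than the trtools implementation; the name confine_range is hypothetical.

```python
def confine_range(x, minval, maxval):
    """Clamp x to the closed interval [minval, maxval] (illustrative helper)."""
    return max(minval, min(x, maxval))

print(confine_range(1.2, 0.0, 1.0))   # 1.0
print(confine_range(-0.3, 0.0, 1.0))  # 0.0
print(confine_range(0.4, 0.0, 1.0))   # 0.4
```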

trtools.prancSTR.ExtractAB(trrecord)

Extract the list of (A, B) genotypes for each sample

Parameters

trrecord (trh.TRRecord) – TRRecord object for the locus

Returns

genotypes – [(A, B), ...] genotypes for each sample, given as bp differences from the reference

Return type

list of list of ints

trtools.prancSTR.ExtractReadVector(mallreads, period)

Extract the reads vector from MALLREADS. MALLREADS has the format: allele1|readcount1;allele2|readcount2

Parameters
  • mallreads (str) – MALLREADS string from HipSTR output

  • period (int) – STR unit length

Returns

reads – List with one entry per read, given as the difference in repeats from the reference

Return type

list of int
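The MALLREADS format above can be expanded into a per-read list along the following lines. This is a sketch under stated assumptions, not the trtools source: it assumes each allele in the string is a bp difference from the reference that is an exact multiple of period, and the name parse_mallreads is hypothetical.

```python
def parse_mallreads(mallreads, period):
    """Expand 'allele|count;allele|count' into one entry per read.

    Assumption: alleles are bp differences from the reference and are
    converted to repeat units by integer division by the period.
    """
    reads = []
    for item in mallreads.split(";"):
        if not item:
            continue  # tolerate a trailing or doubled separator
        allele_bp, count = item.split("|")
        repeats = int(allele_bp) // period
        reads.extend([repeats] * int(count))
    return reads

print(parse_mallreads("-4|2;0|3;2|1", 2))  # [-2, -2, 0, 0, 0, 1]
```

Alleles that are not a whole number of repeat units are not handled specially here; how the library treats them is not stated in this reference.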

trtools.prancSTR.Just_C_Pred(reads, A, B, f, stutter_probs)

Predict C, holding f constant

Parameters
  • reads (list of int) – list of repeat lengths seen in each read

  • A (int) – First allele of the genotype

  • B (int) – Second allele of the genotype

  • f (float) – Mosaic fraction

  • stutter_probs (list of floats) – stutter probs for each delta

Returns

C – mosaic allele

Return type

int

trtools.prancSTR.Just_F_Pred(reads, A, B, C, stutter_probs)

Predict f, holding C constant

Parameters
  • reads (list of int) – list of repeat lengths seen in each read

  • A (int) – First allele of the genotype

  • B (int) – Second allele of the genotype

  • C (int) – Mosaic allele

  • stutter_probs (list of floats) – stutter probs for each delta

Returns

f – mosaic fraction

Return type

float

trtools.prancSTR.Likelihood_mosaic(A, B, C, f, reads, stutter_probs)

Compute the likelihood of observing the reads, given true genotype (A, B), mosaic allele C, and mosaic fraction f

Parameters
  • reads (list of int) – list of repeat lengths seen in each read

  • A (int) – First allele of the genotype

  • B (int) – Second allele of the genotype

  • C (int) – Mosaic allele

  • stutter_probs (list of floats) – stutter probs for each delta

  • f (float) – mosaic fraction

Returns

sum_likelihood – sum of max likelihood calculated for each read

Return type

float
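One way to picture a likelihood of this kind is as a three-component mixture: each read comes from allele A, allele B, or the mosaic allele C, and is then perturbed by stutter. The sketch below is hypothetical and is not taken from the trtools source. The mixture weights ((1-f)/2 for each of A and B, f for C), the use of a log scale, the stutter callable, and all function names are assumptions.

```python
import math

def toy_stutter(delta, u=0.05, d=0.05, rho=0.9):
    """Assumed error model: geometric step sizes for up/down stutter."""
    if delta == 0:
        return 1.0 - u - d
    direction = u if delta > 0 else d
    return direction * rho * (1.0 - rho) ** (abs(delta) - 1)

def mosaic_log_likelihood(A, B, C, f, reads, stutter=toy_stutter):
    """Sum of per-read log-likelihoods under the assumed mixture."""
    total = 0.0
    for r in reads:
        p = ((1 - f) / 2) * stutter(r - A) \
            + ((1 - f) / 2) * stutter(r - B) \
            + f * stutter(r - C)
        total += math.log(p + 1e-12)  # small constant guards against log(0)
    return total

# Eight reads support the germline allele 0, two support a candidate allele 3.
reads = [0] * 8 + [3] * 2
ll_null = mosaic_log_likelihood(0, 0, 3, 0.0, reads)
ll_mosaic = mosaic_log_likelihood(0, 0, 3, 0.2, reads)
print(ll_mosaic > ll_null)  # True
```

In this toy example, allowing a mosaic fraction of 0.2 explains the two outlying reads far better than stutter alone, so the log-likelihood increases.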

trtools.prancSTR.MaximizeMosaicLikelihoodBoth(reads, A, B, stutter_probs, maxiter=100, locname='None', quiet=False)

Find the maximum likelihood values of C (mosaic allele) and f (mosaic fraction)

Parameters
  • reads (list of int) – list of repeat lengths seen in each read

  • A (int) – First allele of the genotype

  • B (int) – Second allele of the genotype

  • stutter_probs (list of floats) – stutter probs for each delta

  • maxiter (int (optional)) – Maximum number of iterations to run the estimation procedure. Default=100

  • locname (str (optional)) – String identifier of the locus. For warning message purposes. Default: “None”

  • quiet (bool (optional)) – Don’t print out any messages. Default=False

Returns

  • C (int) – Estimated mosaic allele

  • f (float) – Estimated mosaic fraction
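Joint estimation of C and f can be illustrated with an alternating scheme: choose the best C with f held fixed, then the best f with C held fixed, and repeat until neither changes or the iteration cap is reached. The sketch below is hypothetical and does not reproduce the trtools optimizer. It substitutes a plain grid search for f, draws candidate values of C from the observed reads, and relies on an assumed mixture likelihood and stutter model; all names are illustrative.

```python
import math

def toy_stutter(delta, u=0.05, d=0.05, rho=0.9):
    """Assumed error model: geometric step sizes for up/down stutter."""
    if delta == 0:
        return 1.0 - u - d
    return (u if delta > 0 else d) * rho * (1.0 - rho) ** (abs(delta) - 1)

def loglik(A, B, C, f, reads):
    """Assumed mixture: weight (1-f)/2 on A and on B, weight f on C."""
    return sum(
        math.log((1 - f) / 2 * toy_stutter(r - A)
                 + (1 - f) / 2 * toy_stutter(r - B)
                 + f * toy_stutter(r - C) + 1e-12)
        for r in reads
    )

def fit_mosaic(reads, A, B, maxiter=100):
    """Alternate between updating C and f until both stop changing."""
    candidates = sorted(set(reads) - {A, B})
    if not candidates:
        return A, 0.0  # arbitrary choice: no read supports a third allele
    grid = [i / 100 for i in range(0, 51)]  # f searched on 0.00 .. 0.50
    C, f = candidates[0], 0.01
    for _ in range(maxiter):
        new_C = max(candidates, key=lambda c: loglik(A, B, c, f, reads))
        new_f = max(grid, key=lambda g: loglik(A, B, new_C, g, reads))
        if new_C == C and new_f == f:
            break  # converged
        C, f = new_C, new_f
    return C, f

reads = [0] * 80 + [3] * 20
C_hat, f_hat = fit_mosaic(reads, A=0, B=0)
print(C_hat, f_hat)
```

With 20 of 100 reads supporting allele 3, the sketch recovers C = 3 and a mosaic fraction close to 0.2.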

trtools.prancSTR.SF(x)

Survival function of a point mass at 0

Parameters

x (float) – Observed value

Returns

sf – Survival function result

Return type

float

trtools.prancSTR.StutterProb(delta, stutter_u, stutter_d, stutter_rho)

Compute P(r_i | genotype; error model)

Parameters
  • delta (int) – Difference in repeat length between observed and underlying allele, given in copy number (r_i-genotype)

  • stutter_u (float) – Probability to see an expansion stutter error

  • stutter_d (float) – Probability to see a deletion stutter error

  • stutter_rho (float) – Step size parameter

Returns

prob – P(r_i|genotype)

Return type

float
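The three parameters suggest a stutter model in which a read is error-free with probability 1 - stutter_u - stutter_d, and otherwise the size of the error follows a geometric distribution with parameter stutter_rho. The sketch below implements that model as an assumption; it is not confirmed by this reference to be the exact formula used by trtools, and the function name is hypothetical.

```python
import math

def stutter_prob_sketch(delta, stutter_u, stutter_d, stutter_rho):
    """Assumed model: P(delta) with geometric step sizes.

    delta == 0 : no stutter error
    delta  > 0 : expansion error of |delta| repeat units
    delta  < 0 : deletion error of |delta| repeat units
    """
    if delta == 0:
        return 1.0 - stutter_u - stutter_d
    direction = stutter_u if delta > 0 else stutter_d
    return direction * stutter_rho * (1.0 - stutter_rho) ** (abs(delta) - 1)

# Under this model the probabilities over all deltas sum to 1.
total = sum(stutter_prob_sketch(k, 0.05, 0.05, 0.9) for k in range(-60, 61))
print(math.isclose(total, 1.0, abs_tol=1e-9))  # True
```

A stutter_rho close to 1 concentrates almost all error mass on single-unit steps, while smaller values allow larger jumps.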

trtools.prancSTR.getargs()
trtools.prancSTR.main(args)
trtools.prancSTR.run()