trtools.simTR module

trtools.simTR.CreateAlleleFasta(newseq, delta, tmpdir)

Create a fasta file for this allele and return the path to it.

Parameters
  • newseq (str) – New repeat allele sequence

  • delta (int) – Change in repeat units compared to ref

  • tmpdir (str) – Path to create the fasta in

Returns

fname – Path to created fasta file

Return type

str
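As an illustration of the behavior described above, here is a minimal sketch of writing an allele sequence to a single-record fasta file. The function name create_allele_fasta_sketch, the file name pattern, and the record header are all assumptions made for this example; they are not taken from the trtools source.

```python
import os


def create_allele_fasta_sketch(newseq, delta, tmpdir):
    """Write newseq as a one-record fasta in tmpdir and return the file path."""
    # Hypothetical naming scheme: the real file name and header may differ.
    fname = os.path.join(tmpdir, "allele_%d.fa" % delta)
    with open(fname, "w") as f:
        f.write(">allele_%d\n" % delta)  # header line
        f.write(newseq + "\n")           # sequence on a single line
    return fname
```

Including delta in the file name keeps the fastas for different alleles from overwriting each other inside the same temporary directory.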

trtools.simTR.GetAlleleSeq(seq_preflank, seq_postflank, seq_repeat, repeat_unit, delta)

Generate a new allele with a change of delta repeat units

Parameters
  • seq_preflank (str) – Sequence upstream of the STR

  • seq_postflank (str) – Sequence downstream of the STR

  • seq_repeat (str) – Sequence of the STR region

  • repeat_unit (str) – Repeat unit sequence

  • delta (int) – Change in repeat units compared to ref

Returns

newseq – New repeat allele sequence. Return None if there was a problem.

Return type

str
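The following sketch shows one way an allele with delta extra (or fewer) repeat units could be built from the pieces listed above. It is not the trtools implementation: the name get_allele_seq_sketch is hypothetical, and it assumes that expansions append copies of repeat_unit to the end of the repeat region, that contractions trim bases from the end, that the result includes both flanks, and that an impossible contraction is the "problem" case that yields None.

```python
def get_allele_seq_sketch(seq_preflank, seq_postflank, seq_repeat,
                          repeat_unit, delta):
    """Return flank + modified repeat + flank, or None if delta is impossible."""
    if delta >= 0:
        # Expansion (or no change): append delta copies of the repeat unit.
        new_repeat = seq_repeat + repeat_unit * delta
    else:
        # Contraction: remove |delta| repeat units' worth of bases.
        n_remove = len(repeat_unit) * (-delta)
        if n_remove > len(seq_repeat):
            return None  # cannot delete more than the repeat region holds
        new_repeat = seq_repeat[:len(seq_repeat) - n_remove]
    return seq_preflank + new_repeat + seq_postflank
```

For example, under these assumptions a CAG repeat of three units with delta=1 becomes four units, while delta=-4 returns None.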

trtools.simTR.GetMaxDelta(sprob, rho, pthresh)

Compute the max delta for which the frequency would be greater than pthresh, based on freq = sprob*rho*(1-rho)**(delta-1)

Parameters
  • sprob (float) – Stutter probability

  • rho (float) – Stutter step size parameter

  • pthresh (float) – Minimum frequency threshold

Returns

delta – Highest delta for which freq > pthresh. Return 0 if no such delta exists, which can happen e.g. with low rho.

Return type

int
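To make the formula concrete, the sketch below searches for the largest delta whose frequency sprob*rho*(1-rho)**(delta-1) still exceeds the threshold. It is a simple linear search written for illustration, not the library's code; the name get_max_delta_sketch and the limit safety cap are assumptions added here.

```python
def get_max_delta_sketch(sprob, rho, pthresh, limit=1000):
    """Largest delta >= 1 with sprob*rho*(1-rho)**(delta-1) > pthresh, else 0."""
    delta = 0
    while delta < limit:  # hypothetical cap so the loop always terminates
        # Frequency of the next candidate, delta + 1.
        freq = sprob * rho * (1 - rho) ** delta
        if freq <= pthresh:
            break
        delta += 1
    return delta


print(get_max_delta_sketch(0.1, 0.5, 0.01))   # 3
print(get_max_delta_sketch(0.1, 0.01, 0.01))  # 0
```

With sprob=0.1 and rho=0.5 the frequencies for delta = 1, 2, 3, 4 are 0.05, 0.025, 0.0125 and 0.00625, so the last delta above 0.01 is 3. With rho=0.01 even delta = 1 has frequency 0.001, which is the low-rho case where 0 is returned.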

trtools.simTR.GetTempDir(debug=False, dir=None)

Create a temporary directory to store intermediate fastas and fastqs

Parameters
  • debug (bool) – Ignored for now

  • dir (str) – Directory in which to create the temporary directory

Returns

dirname – Path to the temporary directory. Return None if there was a problem creating the directory.

Return type

str
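A minimal sketch of this behavior using the standard library's tempfile.mkdtemp is shown below. Whether trtools uses mkdtemp internally is an assumption; the name get_temp_dir_sketch is hypothetical, and the debug argument is omitted because it is documented as ignored.

```python
import tempfile


def get_temp_dir_sketch(dir=None):
    """Create a temporary directory under dir; return its path or None."""
    try:
        # mkdtemp creates the directory and returns its absolute path.
        return tempfile.mkdtemp(dir=dir)
    except OSError:
        # Assumed failure mode, e.g. dir does not exist or is not writable.
        return None
```

The caller is responsible for removing the directory when the intermediate files are no longer needed, for instance with shutil.rmtree.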

trtools.simTR.ParseCoordinates(coords)

Extract chrom, start, end from coordinate string

Parameters

coords (str) – Coordinate string in the form chrom:start-end

Returns

  • chrom (str) – Chromosome name

  • start (int) – start coordinate

  • end (int) – end coordinate

If an error is encountered while parsing, chrom, start, and end are all None.
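The parsing described above can be sketched with a regular expression. This is an illustrative stand-in rather than the library's parser: the name parse_coordinates_sketch is hypothetical, and the pattern assumes that the chromosome name contains no colon and that start and end are non-negative integers.

```python
import re


def parse_coordinates_sketch(coords):
    """Split 'chrom:start-end' into (chrom, start, end); Nones on failure."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", coords)
    if match is None:
        return None, None, None
    chrom, start, end = match.groups()
    return chrom, int(start), int(end)


print(parse_coordinates_sketch("chr1:100-200"))  # ('chr1', 100, 200)
print(parse_coordinates_sketch("chr1_100_200"))  # (None, None, None)
```

Returning a three-element tuple in both cases lets callers always unpack the result and then test chrom for None.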

trtools.simTR.SimulateReads(newfasta, coverage, read_length, single, insert, sd, tmpdir, delta, art_cmd)

Run ART on our dummy fasta file with specified parameters

Parameters
  • newfasta (str) – Path to dummy fasta file

  • coverage (int) – Desired coverage level (ART -f)

  • read_length (int) – Read length (ART -l)

  • single (bool) – Use single-end read mode

  • insert (float) – Mean fragment length (ART -m)

  • sd (float) – Std dev of fragment length distribution (ART -s)

  • tmpdir (str) – Path to create the fasta in

  • delta (int) – Difference in repeat units from reference. Used for naming files

  • art_cmd (str) – Command to run ART

Returns

fq1file, fq2file – Paths to the fastq output files for the two reads of each pair. Return None, None if failed. In single-end mode, fq2file is None.

Return type

str, str
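To show how the parameters above map onto ART options, the sketch below only assembles an art_illumina argument list and does not run it. The -f, -l, -m and -s flags are the ones named in the parameter descriptions; -i (input fasta), -o (output prefix) and -p (paired-end mode) are standard art_illumina options. The function name build_art_command_sketch, the out_prefix argument, and the choice to pass -m and -s only in paired-end mode are assumptions for this example, not details of the trtools implementation.

```python
def build_art_command_sketch(newfasta, coverage, read_length, single,
                             insert, sd, out_prefix, art_cmd="art_illumina"):
    """Return the ART command as a list suitable for subprocess.run."""
    cmd = [
        art_cmd,
        "-i", newfasta,          # input fasta
        "-l", str(read_length),  # read length
        "-f", str(coverage),     # fold coverage
        "-o", out_prefix,        # prefix for output fastq files
    ]
    if not single:
        # Paired-end mode needs the fragment length mean and std dev.
        cmd += ["-p", "-m", str(insert), "-s", str(sd)]
    return cmd
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces; it could then be passed to subprocess.run with check=True so that a failed ART run is detected.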

trtools.simTR.WriteCombinedFastqs(fqfiles, fname)

Concatenate fastq files to output

Parameters
  • fqfiles (list of str) – List of paths to fastq files to concatenate

  • fname (str) – Name of final output file
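Concatenation of this kind can be sketched with the standard library as follows. The name write_combined_fastqs_sketch is hypothetical, and the sketch assumes uncompressed fastq files that each end with a newline, so that copying bytes back to back yields a valid fastq.

```python
import shutil


def write_combined_fastqs_sketch(fqfiles, fname):
    """Append the contents of each file in fqfiles, in order, to fname."""
    with open(fname, "wb") as out:
        for fq in fqfiles:
            with open(fq, "rb") as src:
                # Stream in chunks so large fastq files are not loaded
                # into memory all at once.
                shutil.copyfileobj(src, out)
```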

trtools.simTR.getargs()
trtools.simTR.main(args)
trtools.simTR.run()