trtools.simTR module
- trtools.simTR.CreateAlleleFasta(newseq, delta, tmpdir)
Create a fasta file for this allele and return the path to it.
- Parameters
newseq (str) – New repeat allele sequence
delta (int) – Change in repeat units compared to ref
tmpdir (str) – Path to create the fasta in
- Returns
fname – Path to created fasta file
- Return type
str
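The following is a minimal sketch of what a helper with this signature might do, not the trtools implementation. The function name `create_allele_fasta_sketch`, the output file name, and the record header are all assumptions made for illustration.

```python
import os
import tempfile


def create_allele_fasta_sketch(newseq, delta, tmpdir):
    """Write newseq as a single-record fasta in tmpdir and return its path."""
    # Hypothetical naming scheme: the real file name and header may differ.
    fname = os.path.join(tmpdir, "allele_%d.fa" % delta)
    with open(fname, "w") as f:
        f.write(">allele_%d\n" % delta)
        f.write(newseq + "\n")
    return fname


# Usage: write one allele and read it back.
with tempfile.TemporaryDirectory() as tmpdir:
    path = create_allele_fasta_sketch("ACGTCAGCAGCAGTTGA", 1, tmpdir)
    with open(path) as f:
        fasta_lines = f.read().splitlines()
```

Including delta in the file name keeps the fasta files for different alleles from overwriting each other inside the same temporary directory.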
- trtools.simTR.GetAlleleSeq(seq_preflank, seq_postflank, seq_repeat, repeat_unit, delta)
Generate a new allele with a change of delta repeat units
- Parameters
seq_preflank (str) – Sequence upstream of the STR
seq_postflank (str) – Sequence downstream of the STR
seq_repeat (str) – Sequence of the STR region
repeat_unit (str) – Repeat unit sequence
delta (int) – Change in repeat units compared to ref
- Returns
newseq – New repeat allele sequence. Returns None if there was a problem.
- Return type
str
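As an illustration of how an allele with a changed repeat count could be assembled, here is a hedged sketch. It assumes that a positive delta appends copies of the repeat unit, that a negative delta trims units from the end of the repeat region, and that asking to remove more sequence than exists counts as "a problem". The name `allele_seq_sketch` is hypothetical and the actual trtools logic may differ.

```python
def allele_seq_sketch(seq_preflank, seq_postflank, seq_repeat, repeat_unit, delta):
    """Return flank + modified repeat + flank, or None if delta cannot be applied."""
    if delta >= 0:
        # Expansion (or no change): append delta copies of the repeat unit.
        new_repeat = seq_repeat + repeat_unit * delta
    else:
        # Contraction: drop |delta| repeat units from the end of the repeat.
        n_remove = len(repeat_unit) * (-delta)
        if n_remove > len(seq_repeat):
            return None  # assumed failure mode: not enough repeat to remove
        new_repeat = seq_repeat[:len(seq_repeat) - n_remove]
    return seq_preflank + new_repeat + seq_postflank


expanded = allele_seq_sketch("AA", "TT", "CAGCAG", "CAG", 1)
contracted = allele_seq_sketch("AA", "TT", "CAGCAG", "CAG", -1)
too_short = allele_seq_sketch("AA", "TT", "CAGCAG", "CAG", -3)
```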
- trtools.simTR.GetMaxDelta(sprob, rho, pthresh)
Compute the maximum delta for which the frequency would be greater than pthresh,
based on freq = sprob*rho*(1-rho)**(delta-1)
- Parameters
sprob (float) – Stutter probability
rho (float) – Stutter step size parameter
pthresh (float) – Minimum frequency threshold
- Returns
delta – Highest delta for which freq > pthresh. Returns 0 if no such delta exists, which can happen e.g. with low rho.
- Return type
int
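The formula freq = sprob*rho*(1-rho)**(delta-1) can be evaluated directly to find the threshold. The sketch below is one way to do it, assuming 0 <= rho <= 1 and pthresh > 0. The name `max_delta_sketch` and the `cap` safety limit are hypothetical and not part of trtools.

```python
def max_delta_sketch(sprob, rho, pthresh, cap=1000):
    """Largest delta >= 1 with sprob*rho*(1-rho)**(delta-1) > pthresh, else 0."""
    if pthresh <= 0 or not 0 <= rho <= 1:
        raise ValueError("expected pthresh > 0 and 0 <= rho <= 1")
    delta = 0
    # freq is non-increasing in delta, so step up until the next value
    # (delta + 1) would no longer exceed the threshold.
    # cap is a hypothetical guard against very long loops for tiny rho.
    while delta < cap and sprob * rho * (1 - rho) ** delta > pthresh:
        delta += 1
    return delta


# sprob=0.1, rho=0.5: freq is 0.05, 0.025, 0.0125, 0.00625 for delta = 1..4
typical = max_delta_sketch(0.1, 0.5, 0.01)
# Low rho: freq at delta=1 is already 0.005, below the threshold
low_rho = max_delta_sketch(0.1, 0.05, 0.01)
```

With a low rho, even a single-unit change has frequency sprob*rho below pthresh, which is the case where 0 is returned.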
- trtools.simTR.GetTempDir(debug=False, dir=None)
Create a temporary directory to store intermediate fastas and fastqs
- Parameters
debug (bool) – Ignored for now
dir (str) – Directory in which to create the temporary directory
- Returns
dirname – Path to the temporary directory. Returns None if there was a problem creating the directory.
- Return type
str
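A helper with this contract can be sketched on top of the standard library's `tempfile.mkdtemp`. This is an assumption about a reasonable approach, not the trtools source. The name `get_temp_dir_sketch` and the `simtr_` prefix are hypothetical.

```python
import os
import shutil
import tempfile


def get_temp_dir_sketch(dir=None):
    """Create a temporary directory under dir; return its path or None on failure."""
    try:
        return tempfile.mkdtemp(prefix="simtr_", dir=dir)
    except OSError:
        # For example, the parent directory does not exist or is not writable.
        return None


dirname = get_temp_dir_sketch()
created_ok = dirname is not None and os.path.isdir(dirname)

# A parent directory that does not exist triggers the None return path.
missing_parent = os.path.join(dirname, "no", "such", "parent")
failed = get_temp_dir_sketch(dir=missing_parent)

shutil.rmtree(dirname)  # the caller is responsible for cleanup
```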
- trtools.simTR.ParseCoordinates(coords)
Extract chrom, start, end from coordinate string
- Parameters
coords (str) – Coordinate string in the form chrom:start-end
- Returns
chrom (str) – Chromosome name
start (int) – Start coordinate
end (int) – End coordinate
If parsing fails, chrom, start, and end are all None.
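Parsing a chrom:start-end string can be sketched with a regular expression. This is only an illustration of the documented behavior. The name `parse_coordinates_sketch` is hypothetical, and the pattern is stricter than necessary in some respects (for example, it rejects commas in the numbers).

```python
import re


def parse_coordinates_sketch(coords):
    """Return (chrom, start, end), or (None, None, None) if coords is malformed."""
    # Chromosome name: anything without a colon. Coordinates: plain digits.
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", coords)
    if match is None:
        return None, None, None
    return match.group(1), int(match.group(2)), int(match.group(3))


good = parse_coordinates_sketch("chr1:100-200")
bad = parse_coordinates_sketch("chr1_100_200")
```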
- trtools.simTR.SimulateReads(newfasta, coverage, read_length, single, insert, sd, tmpdir, delta, art_cmd)
Run ART on our dummy fasta file with specified parameters
- Parameters
newfasta (str) – Path to dummy fasta file
coverage (int) – Desired coverage level (ART -f)
read_length (int) – Read length (ART -l)
single (bool) – Use single-end read mode
insert (float) – Mean fragment length (ART -m)
sd (float) – Std dev of fragment length distribution (ART -s)
tmpdir (str) – Path to create the fasta in
delta (int) – Difference in repeat units from the reference. Used for naming files.
art_cmd (str) – Command to run ART
- Returns
fq1file, fq2file – Paths to the fastq output files for the two reads of each pair. Returns None, None on failure. In single-end mode, fq2file is None.
- Return type
str, str
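To show how these parameters might map onto an ART command line, the sketch below only builds the argument list and does not run anything. The -f, -l, -m, and -s flags come from the parameter descriptions above. The -i (input), -o (output prefix), and -p (paired-end) flags are assumed from ART's usual `art_illumina` usage. The function name and the output prefix scheme are hypothetical.

```python
import os


def build_art_command_sketch(newfasta, coverage, read_length, single,
                             insert, sd, tmpdir, delta, art_cmd):
    """Return an argument list for ART; nothing is executed here."""
    # Hypothetical output prefix: delta is included so alleles do not collide.
    prefix = os.path.join(tmpdir, "reads_%d_" % delta)
    cmd = [art_cmd,
           "-i", newfasta,          # assumed ART input flag
           "-f", str(coverage),
           "-l", str(read_length),
           "-o", prefix]            # assumed ART output-prefix flag
    if not single:
        # Fragment length settings only apply to paired-end simulation.
        cmd += ["-p", "-m", str(insert), "-s", str(sd)]
    return cmd


paired_cmd = build_art_command_sketch(
    "allele.fa", 50, 100, False, 350.0, 50.0, "tmp", 2, "art_illumina")
single_cmd = build_art_command_sketch(
    "allele.fa", 50, 100, True, 350.0, 50.0, "tmp", 2, "art_illumina")
```

A list built this way could be passed to `subprocess.run` without a shell, which avoids quoting problems in file paths.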
- trtools.simTR.WriteCombinedFastqs(fqfiles, fname)
Concatenate fastq files to output
- Parameters
fqfiles (list of str) – List of paths to the fastq files to concatenate
fname (str) – Name of final output file
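Concatenating fastq files is a plain byte-level append, which can be sketched with `shutil.copyfileobj`. The name `write_combined_fastqs_sketch` is hypothetical, and this is an assumed approach rather than the trtools source.

```python
import os
import shutil
import tempfile


def write_combined_fastqs_sketch(fqfiles, fname):
    """Append the contents of each file in fqfiles, in order, to fname."""
    with open(fname, "wb") as out:
        for fq in fqfiles:
            with open(fq, "rb") as f:
                shutil.copyfileobj(f, out)  # streams in chunks, low memory use


# Usage: combine two one-record fastq files.
with tempfile.TemporaryDirectory() as tmpdir:
    fq_a = os.path.join(tmpdir, "a.fq")
    fq_b = os.path.join(tmpdir, "b.fq")
    with open(fq_a, "w") as f:
        f.write("@read1\nACGT\n+\nIIII\n")
    with open(fq_b, "w") as f:
        f.write("@read2\nTTGA\n+\nIIII\n")
    combined = os.path.join(tmpdir, "combined.fq")
    write_combined_fastqs_sketch([fq_a, fq_b], combined)
    with open(combined) as f:
        combined_text = f.read()
```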
- trtools.simTR.getargs()
- trtools.simTR.main(args)
- trtools.simTR.run()