pyani.scripts.subcommands.subcmd_anim module
Provides the anim subcommand for pyani.
class pyani.scripts.subcommands.subcmd_anim.ComparisonJob
Bases: tuple
Pairwise comparison job for the SQLAlchemy implementation.
- filtercmd – Alias for field number 2
- job – Alias for field number 5
- nucmercmd – Alias for field number 3
- outfile – Alias for field number 4
- query – Alias for field number 0
- subject – Alias for field number 1
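The "Alias for field number N" entries give each field's position in the underlying tuple. As a minimal sketch, the structure can be rebuilt with `collections.namedtuple` rather than imported from pyani; the field values below are hypothetical placeholders, not real pyani objects.

```python
from collections import namedtuple

# Stand-in for ComparisonJob, rebuilt from the field numbers listed above.
ComparisonJob = namedtuple(
    "ComparisonJob", "query subject filtercmd nucmercmd outfile job"
)

# Hypothetical placeholder values, for illustration only.
cjob = ComparisonJob(
    query="genome_A",
    subject="genome_B",
    filtercmd="<delta-filter command>",
    nucmercmd="<nucmer command>",
    outfile="A_vs_B.filter",
    job=None,
)

# Each field can be read by name or by its position in the tuple.
assert cjob.filtercmd == cjob[2]
assert cjob.outfile == cjob[4]
print(cjob._fields)
```

The other structs on this page (ComparisonResult, ProgData, ProgParams, RunData) follow the same pattern, with their own field orders.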
class pyani.scripts.subcommands.subcmd_anim.ComparisonResult
Bases: tuple
Convenience struct for a single nucmer comparison result.
- aln_length – Alias for field number 2
- pid – Alias for field number 4
- qcov – Alias for field number 7
- qid – Alias for field number 0
- qlen – Alias for field number 5
- scov – Alias for field number 8
- sid – Alias for field number 1
- sim_errs – Alias for field number 3
- slen – Alias for field number 6
class pyani.scripts.subcommands.subcmd_anim.ProgData
Bases: tuple
Convenience struct for comparison program data/info.
- program – Alias for field number 0
- version – Alias for field number 1
class pyani.scripts.subcommands.subcmd_anim.ProgParams
Bases: tuple
Convenience struct for comparison parameters.
Use a default of zero for fragsize; otherwise database queries will not work, because SQLite and Python nulls do not match up well.
- fragsize – Alias for field number 0
- maxmatch – Alias for field number 1
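The note about fragsize is consistent with how SQL treats NULL: an equality test against NULL never matches, so a filter comparing a column to a Python `None` returns no rows. The sketch below illustrates this with the standard-library `sqlite3` module. The table and column layout are hypothetical and are not pyani's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only (not the pyani schema).
conn.execute("CREATE TABLE comparisons (id INTEGER PRIMARY KEY, fragsize INTEGER)")
conn.execute("INSERT INTO comparisons (fragsize) VALUES (?)", (None,))  # stored as NULL
conn.execute("INSERT INTO comparisons (fragsize) VALUES (?)", (0,))


def count_matching(fragsize):
    """Count rows whose fragsize equals the given value, using a plain '=' test."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM comparisons WHERE fragsize = ?", (fragsize,)
    )
    return cur.fetchone()[0]


null_matches = count_matching(None)  # 'fragsize = NULL' is never true in SQL
zero_matches = count_matching(0)     # an explicit zero matches as expected
print(null_matches, zero_matches)
```

With zero as the default, the same equality filter works whether or not a fragment size is meaningful for the method.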
class pyani.scripts.subcommands.subcmd_anim.RunData
Bases: tuple
Convenience struct describing an analysis run.
- cmdline – Alias for field number 3
- date – Alias for field number 2
- method – Alias for field number 0
- name – Alias for field number 1
pyani.scripts.subcommands.subcmd_anim.generate_joblist(comparisons: List[Tuple], existingfiles: List[pathlib.Path], args: argparse.Namespace) → List[pyani.scripts.subcommands.subcmd_anim.ComparisonJob]
Return list of ComparisonJobs.
Parameters:
- comparisons – list of (Genome, Genome) tuples
- existingfiles – list of pre-existing nucmer output files
- args – Namespace of command-line arguments for the run
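A rough sketch of how such a job list might be assembled is shown below. It is not pyani's implementation: `sketch_joblist`, its `outdir` and `maxmatch` parameters, the file naming, and the command strings are all hypothetical, and it assumes that a pre-existing output file means the comparison does not need to be rerun. It uses plain paths in place of Genome objects.

```python
from collections import namedtuple
from pathlib import Path

ComparisonJob = namedtuple(
    "ComparisonJob", "query subject filtercmd nucmercmd outfile job"
)


def sketch_joblist(comparisons, existingfiles, outdir, maxmatch=False):
    """Build one ComparisonJob per (query, subject) pair of FASTA paths."""
    existing = set(existingfiles)
    mode = "--maxmatch" if maxmatch else "--mum"
    joblist = []
    for query, subject in comparisons:
        prefix = outdir / f"{query.stem}_vs_{subject.stem}"
        delta = Path(f"{prefix}.delta")
        outfile = Path(f"{prefix}.filter")
        # Illustrative command strings only; real commands depend on the run's args.
        nucmercmd = f"nucmer {mode} -p {prefix} {query} {subject}"
        filtercmd = f"delta-filter -1 {delta} > {outfile}"
        # Assumption: an existing output file means no new job is scheduled.
        job = None if outfile in existing else f"{nucmercmd} && {filtercmd}"
        joblist.append(
            ComparisonJob(query, subject, filtercmd, nucmercmd, outfile, job)
        )
    return joblist


pairs = [(Path("A.fna"), Path("B.fna")), (Path("B.fna"), Path("A.fna"))]
jobs = sketch_joblist(pairs, [Path("out/A_vs_B.filter")], Path("out"))
for cjob in jobs:
    print(cjob.outfile, "reuse existing" if cjob.job is None else "run")
```

Keeping the commands and output path on the job tuple, even when nothing is scheduled, lets later steps locate and parse the output in the same way for new and reused comparisons.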
pyani.scripts.subcommands.subcmd_anim.run_anim_jobs(joblist: List[pyani.scripts.subcommands.subcmd_anim.ComparisonJob], args: argparse.Namespace) → None
Pass ANIm nucmer jobs to the scheduler.
Parameters:
- joblist – list of ComparisonJob namedtuples
- args – command-line arguments for the run
pyani.scripts.subcommands.subcmd_anim.subcmd_anim(args: argparse.Namespace) → None
Perform ANIm on all genome files in an input directory.
Parameters: args – Namespace, command-line arguments
Finds ANI by the ANIm method, as described in Richter et al. (2009) Proc Natl Acad Sci USA 106: 19126-19131 doi:10.1073/pnas.0906412106.
All FASTA format files (selected by suffix) in the input directory are compared against each other, pairwise, using NUCmer (whose path must be provided).
For each pairwise comparison, the NUCmer .delta file output is parsed to obtain an alignment length and similarity error count for every unique region alignment between the two organisms, as represented by sequences in the FASTA files. These are processed to calculate aligned sequence lengths, average nucleotide identity (ANI) percentages, coverage (aligned percentage of whole genome - forward direction), and similarity error count for each pairwise comparison.
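The arithmetic described here can be sketched as follows. The formulas are assumptions consistent with the description rather than quotations from the pyani source: identity is taken as the error-free fraction of aligned bases, and coverage as the aligned fraction of each genome. The function name and input numbers are hypothetical, and values are returned as fractions rather than percentages.

```python
def summarise_alignments(regions, qlen, slen):
    """Summarise (aligned_length, similarity_errors) pairs for one comparison."""
    aln_length = sum(length for length, _ in regions)
    sim_errs = sum(errs for _, errs in regions)
    # Assumed formula: identity is 1 minus the error rate over aligned bases.
    pid = 1 - sim_errs / aln_length if aln_length else 0.0
    return {
        "aln_length": aln_length,
        "sim_errs": sim_errs,
        "pid": pid,
        "qcov": aln_length / qlen,  # aligned fraction of the query genome
        "scov": aln_length / slen,  # aligned fraction of the subject genome
    }


# Hypothetical numbers: two aligned regions between a 10 kb and an 8 kb genome.
result = summarise_alignments([(3000, 30), (1000, 10)], qlen=10_000, slen=8_000)
print(result)
```

The keys mirror the ComparisonResult fields aln_length, sim_errs, pid, qcov and scov.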
The calculated values are deposited in the SQLite3 database being used for the analysis.
For each pairwise comparison, the NUCmer output is kept in the output directory long enough to extract summary information, and the output of each run is gzip-compressed. Once all runs are complete, the outputs for each comparison are concatenated into a single gzip archive.
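The text does not specify how the archive is built, and pyani may do this differently. One possible approach, sketched here with Python's standard `gzip` module, relies on the fact that gzip members appended to the same file read back as a single stream. The helper name and file names are hypothetical.

```python
import gzip
import tempfile
from pathlib import Path


def compress_and_append(outputs, archive):
    """Append each output file to `archive` as its own gzip member."""
    for path in outputs:
        with gzip.open(archive, "ab") as handle:  # "ab" adds a new gzip member
            handle.write(path.read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    tmpdir = Path(tmp)
    # Hypothetical stand-ins for per-comparison NUCmer output files.
    files = []
    for name, data in [("A_vs_B.delta", b"first\n"), ("B_vs_A.delta", b"second\n")]:
        path = tmpdir / name
        path.write_bytes(data)
        files.append(path)
    archive = tmpdir / "nucmer_output.gz"
    compress_and_append(files, archive)
    # Reading the archive yields the members' contents back to back.
    with gzip.open(archive, "rb") as handle:
        combined = handle.read()

print(combined)
```

A plain concatenation like this does not record where one output ends and the next begins, so a real implementation that needs to recover individual files would need a container format or an index.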
pyani.scripts.subcommands.subcmd_anim.update_comparison_results(joblist: List[pyani.scripts.subcommands.subcmd_anim.ComparisonJob], run, session, nucmer_version: str, args: argparse.Namespace) → None
Update the Comparison table with the completed result set.
Parameters:
- joblist – list of ComparisonJob namedtuples
- run – Run ORM object for the current ANIm run
- session – active pyanidb session via ORM
- nucmer_version – version of nucmer used for the comparison
- args – command-line arguments for this run
The Comparison table stores individual comparison results, one per row.