pyani.scripts.subcommands.subcmd_fastani module

class pyani.scripts.subcommands.subcmd_fastani.ComparisonJob

Bases: tuple

Pairwise comparison job for the SQLAlchemy implementation.

fastcmd: Alias for field number 2

fragLen: Alias for field number 4

job: Alias for field number 7

kmerSize: Alias for field number 5

minFraction: Alias for field number 6

outfile: Alias for field number 3

query: Alias for field number 0

ref: Alias for field number 1
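The alias numbers give the positional order of the tuple's fields. The following is a minimal sketch of an equivalent namedtuple, reconstructed only from those numbers; the values passed in are placeholders for illustration and are not taken from pyani itself (in practice the query and ref fields presumably hold Genome objects rather than filenames).

```python
from collections import namedtuple

# Field order reconstructed from the alias numbers in the listing:
# query=0, ref=1, fastcmd=2, outfile=3, fragLen=4, kmerSize=5,
# minFraction=6, job=7.
ComparisonJob = namedtuple(
    "ComparisonJob",
    "query ref fastcmd outfile fragLen kmerSize minFraction job",
)

# Placeholder values only; none of these come from the pyani source.
example_job = ComparisonJob(
    query="genome_A.fna",
    ref="genome_B.fna",
    fastcmd="<fastANI command line>",
    outfile="genome_A_vs_genome_B.fastani",
    fragLen=3000,
    kmerSize=16,
    minFraction=0.2,
    job=None,
)

# Fields can be read by name or by the documented position.
assert example_job.outfile == example_job[3]
assert example_job.job is example_job[7]
```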

class pyani.scripts.subcommands.subcmd_fastani.ComparisonResult

Bases: tuple

Convenience struct for a single fastANI comparison result.

aln_length: Alias for field number 2

pid: Alias for field number 4

qcov: Alias for field number 7

qid: Alias for field number 0

qlen: Alias for field number 5

rcov: Alias for field number 8

rid: Alias for field number 1

rlen: Alias for field number 6

sim_errs: Alias for field number 3
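Because the result is a plain tuple, it can be unpacked positionally or converted to a dictionary keyed by field name. The sketch below assumes the field order implied by the alias numbers; the type annotations and the example values are assumptions made for illustration, not part of the documented interface.

```python
from typing import NamedTuple


class ComparisonResult(NamedTuple):
    """Stand-in with the documented field order; types are assumed."""

    qid: int
    rid: int
    aln_length: int
    sim_errs: int
    pid: float
    qlen: int
    rlen: int
    qcov: float
    rcov: float


# Made-up values, for illustration only.
result = ComparisonResult(
    qid=1, rid=2, aln_length=4_000_000, sim_errs=120_000,
    pid=0.97, qlen=5_000_000, rlen=4_800_000, qcov=0.8, rcov=0.83,
)

# Convert to a name -> value mapping, e.g. for logging or tabulation.
as_mapping = result._asdict()

# Positional unpacking follows the alias numbers.
qid, rid, *rest = result
```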

class pyani.scripts.subcommands.subcmd_fastani.RunData

Bases: tuple

Convenience struct describing an analysis run.

cmdline: Alias for field number 3

date: Alias for field number 2

method: Alias for field number 0

name: Alias for field number 1

pyani.scripts.subcommands.subcmd_fastani.generate_joblist(comparisons: List[T], existingfiles: List[T], args: argparse.Namespace) → List[pyani.scripts.subcommands.subcmd_fastani.ComparisonJob]

Return a list of ComparisonJobs.

Parameters:
    comparisons – list of (Genome, Genome) tuples for which comparisons are needed
    existingfiles – list of pre-existing fastANI outputs
    args – Namespace, command-line arguments
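The existingfiles argument presumably lets comparisons with output already on disk be skipped instead of rerun. The sketch below illustrates that filtering step under stated assumptions: the helper name pending_comparisons, the use of plain filenames in place of Genome objects, and the "<query>_vs_<ref>.fastani" naming scheme are all hypothetical and are not taken from pyani.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def pending_comparisons(
    comparisons: Iterable[Tuple[str, str]],
    existingfiles: Iterable[str],
    outdir: Path,
) -> List[Tuple[str, str, Path]]:
    """Return (query, ref, outfile) for comparisons lacking an output file."""
    existing = {Path(name).name for name in existingfiles}
    pending = []
    for query, ref in comparisons:
        # Hypothetical naming scheme for the per-comparison output file.
        outfile = outdir / f"{Path(query).stem}_vs_{Path(ref).stem}.fastani"
        if outfile.name not in existing:
            pending.append((query, ref, outfile))
    return pending


todo = pending_comparisons(
    comparisons=[("A.fna", "B.fna"), ("B.fna", "A.fna")],
    existingfiles=["A_vs_B.fastani"],
    outdir=Path("output"),
)
```

Here only the B-versus-A comparison remains in todo, because output for A versus B is already listed as existing.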

pyani.scripts.subcommands.subcmd_fastani.run_fastani_jobs(joblist: List[pyani.scripts.subcommands.subcmd_fastani.ComparisonJob], args: argparse.Namespace) → None

Pass fastANI jobs to the scheduler.

Parameters:
    joblist – list of ComparisonJob namedtuples
    args – command-line arguments for the run

pyani.scripts.subcommands.subcmd_fastani.subcmd_fastani(args: argparse.Namespace) → None

Perform fastANI on all genome files in an input directory.

Parameters:	args – Namespace, command-line arguments

Finds ANI by the fastANI method, as described in Jain et al. (2018) Nature Communications 9, 5114. doi:10.1038/s41467-018-07641-9.

All FASTA format files (selected by suffix) in the input directory are compared against each other, pairwise, using fastANI (whose path must be provided).

For each pairwise comparison, the .fastani output file written by fastANI is parsed to obtain an alignment length and similarity error count for the two organisms, as represented by the sequences in the FASTA files. These values are processed to calculate aligned sequence lengths, average nucleotide identity (ANI) percentages, coverage (the aligned percentage of the whole genome), and similarity error count for each pairwise comparison.
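As a rough illustration of the parsing step, the sketch below reads one line of fastANI output, assuming the tab-separated, five-column layout described in fastANI's own documentation (query, reference, ANI estimate, number of mapped fragments, total query fragments). The names FastANILine and parse_fastani_line are hypothetical, and the way pyani derives its alignment length and similarity error count from these columns is not shown here.

```python
from typing import NamedTuple


class FastANILine(NamedTuple):
    """One record of fastANI output (assumed five-column layout)."""

    query: str
    ref: str
    ani: float
    matched_frags: int
    total_frags: int


def parse_fastani_line(line: str) -> FastANILine:
    """Split a tab-separated fastANI output line into typed fields."""
    query, ref, ani, matched, total = line.rstrip("\n").split("\t")
    return FastANILine(query, ref, float(ani), int(matched), int(total))


# Made-up record, for illustration only.
record = parse_fastani_line("genome_A.fna\tgenome_B.fna\t97.5\t1200\t1500\n")

# Fraction of query fragments that mapped; one plausible coverage proxy.
fraction_mapped = record.matched_frags / record.total_frags
```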

The calculated values are deposited in the SQLite3 database being used for the analysis.

For each pairwise comparison, the fastANI output is kept in the output directory only long enough to extract summary information, and the output of each run is gzip-compressed. Once all runs are complete, the outputs for all comparisons are concatenated into a single gzip archive.
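One way to collect per-comparison outputs into a single compressed archive is a gzip-compressed tar file, sketched below with the standard library. This is an assumption about the general approach, not pyani's actual code: the function name archive_outputs, the archive filename, and the choice to delete the originals afterwards are all hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_outputs(outdir: Path, archive: Path, suffix: str = ".fastani") -> int:
    """Bundle every *suffix file in outdir into one .tar.gz, then remove them."""
    files = sorted(outdir.glob(f"*{suffix}"))
    with tarfile.open(archive, "w:gz") as tar:
        for path in files:
            # Store each file under its bare name, without directory parts.
            tar.add(path, arcname=path.name)
    for path in files:
        path.unlink()
    return len(files)


# Demonstration in a throwaway directory with placeholder output files.
with tempfile.TemporaryDirectory() as tmp:
    outdir = Path(tmp)
    for name in ("A_vs_B.fastani", "B_vs_A.fastani"):
        (outdir / name).write_text("placeholder\n")
    archive_path = outdir / "fastani_output.tar.gz"
    n_archived = archive_outputs(outdir, archive_path)
    with tarfile.open(archive_path) as tar:
        members = sorted(tar.getnames())
    leftovers = list(outdir.glob("*.fastani"))
```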

pyani.scripts.subcommands.subcmd_fastani.update_comparison_results(joblist: List[pyani.scripts.subcommands.subcmd_fastani.ComparisonJob], run, session, fastani_version: str, args: argparse.Namespace) → None

Update the Comparison table with the completed result set.

Parameters:
    joblist – list of ComparisonJob namedtuples
    run – Run ORM object for the current fastANI run
    session – active pyanidb session via ORM
    fastani_version – version of fastANI used for the comparison
    args – command-line arguments for this run

The Comparison table stores individual comparison results, one per row.
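To make the one-row-per-comparison layout concrete, the sketch below uses the standard-library sqlite3 module in place of the SQLAlchemy ORM that pyani uses. The table name, column names, and values are hypothetical and chosen only for illustration; they do not reproduce the real pyani schema.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical, simplified schema: one row per pairwise comparison.
conn.execute(
    """
    CREATE TABLE comparisons (
        query_id INTEGER NOT NULL,
        ref_id   INTEGER NOT NULL,
        identity REAL,
        program  TEXT,
        version  TEXT
    )
    """
)

# Made-up results; each tuple becomes one row.
rows = [
    (1, 2, 0.975, "fastANI", "<version>"),
    (2, 1, 0.974, "fastANI", "<version>"),
]
conn.executemany("INSERT INTO comparisons VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

n_rows = conn.execute("SELECT COUNT(*) FROM comparisons").fetchone()[0]
forward = conn.execute(
    "SELECT identity FROM comparisons WHERE query_id = 1 AND ref_id = 2"
).fetchone()[0]
```

Storing the A-versus-B and B-versus-A results as separate rows keeps each direction of a comparison independently queryable.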