Using non-NCBI genomes
It is common to want to include, or work only with, genomes that were generated locally or that were not downloaded from NCBI using pyani download. To use these genomes with the pyani analysis subcommands, the genomes must first be indexed.
To index a set of genomes, run the pyani index subcommand and pass the input directory to the -i argument. For example, to index the directory mygenomes:
pyani index -i mygenomes
This will create a .md5 file (containing the hash) for each genome, as well as class and label files listing all the input genomes.
All genomes in the mygenomes directory will now be available for use in pyani.
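To illustrate what a hash of a genome file looks like, here is a minimal Python sketch; it is not pyani's own code. It assumes the hash is an MD5 digest of the file's raw bytes (suggested by the .md5 extension) and that genome files end in .fna, .fasta or .fa. The function names md5_of_file and hash_directory are hypothetical, and pyani may compute or store its hashes differently.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large genome files are not loaded into memory at once
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_directory(indir, suffixes=(".fna", ".fasta", ".fa")):
    """Map each sequence file name in `indir` to its MD5 digest.

    The suffix list is an assumption made for this sketch.
    """
    return {
        path.name: md5_of_file(path)
        for path in sorted(Path(indir).iterdir())
        if path.suffix in suffixes
    }
```

Calling hash_directory("mygenomes") would return a dictionary of file names and 32-character hexadecimal digests. Two files with identical contents yield the same digest, which is what allows a hash to identify a genome regardless of its file name.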
The pyani download command creates two files, by default classes.txt and labels.txt, containing identifiers for each input genome that are used in later analysis and visualisation. These files are also created when pyani index is used as above.
The location of the labels and classes files may be changed using the --labels and --classes arguments, for example:
pyani index -i mygenomes --classes myclasses.txt --labels mylabels.txt
Indexing here refers to constructing a hash of the genome: a short representation of the entire genome's contents that can be used to identify it uniquely.