pyani download
The download subcommand controls the download of genome files from the NCBI Assembly database for input to pyani.
usage: pyani download [-h] [-l LOGFILE] [-v] [--debug] [--disable_tqdm] [--version]
[--citation] -o OUTDIR -t TAXON --email EMAIL
[--api_key API_KEYPATH] [--retries RETRIES]
[--batchsize BATCHSIZE] [--timeout TIMEOUT] [-f]
[--noclobber] [--labels LABELFNAME] [--classes CLASSFNAME]
[--kraken] [--dry-run]
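As a rough illustration of the usage line above, the following Python sketch assembles a `pyani download` command from the three required options (`-o`, `-t`, `--email`). The helper name `build_download_command` and all example values (directory, taxon ID, email address) are hypothetical placeholders. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_download_command(outdir, taxon, email, extra_args=()):
    """Return the argument list for a `pyani download` call.

    Hypothetical helper for illustration; it only uses options that
    appear in the usage line.
    """
    cmd = ["pyani", "download", "-o", outdir, "-t", str(taxon), "--email", email]
    cmd.extend(extra_args)
    return cmd


# Placeholder values: substitute your own directory, taxon ID, and email.
command = build_download_command(
    "genomes", 12345, "my.email@my.domain", extra_args=["--dry-run", "-v"]
)
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it, assuming pyani is installed and available on the `PATH`.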
Positional arguments
outdir
- The outdir argument should be the path to a directory into which genome files will be downloaded. If the directory exists, a warning will be given and the download will not proceed, to avoid overwriting existing data. To force writing into an existing directory, use the -f option.
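The interaction between an existing output directory, `-f`, and `--noclobber` (both described under the flagged arguments) can be summarised as a small decision procedure. The Python sketch below illustrates the documented behaviour only; it is not pyani's actual implementation, and the function names `prepare_outdir` and `should_write` are hypothetical.

```python
from pathlib import Path


def prepare_outdir(outdir, force=False):
    """Create outdir, refusing to reuse an existing one unless force is set."""
    path = Path(outdir)
    if path.exists() and not force:
        # Documented behaviour: do not proceed, to avoid overwriting data.
        raise FileExistsError(f"{path} exists; use -f to write into it")
    path.mkdir(parents=True, exist_ok=True)
    return path


def should_write(target, noclobber=False):
    """Return False when noclobber is set and the target file already exists."""
    return not (noclobber and Path(target).exists())
```

Under these assumptions, a forced run with `noclobber=True` reuses the directory but leaves any file that is already present untouched.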
Flagged arguments
--api_key PATH_TO_API_KEY
- The program will attempt to use an NCBI API key located at PATH_TO_API_KEY. Default: ~/.ncbi/api_key
--batchsize BATCHSIZE
- The download process will attempt to download assemblies in batches of BATCHSIZE. Default: 10000
--classes CLASSFNAME
- Write a set of classes (one per downloaded genome) to the file CLASSFNAME in outdir. Default: classes.txt
--disable_tqdm
- Disable the tqdm progress bar while the download process runs. This is useful when testing, to avoid aesthetic problems with test output.
--dry-run
- Perform all actions of the download process except for downloading files.
--email EMAIL
- COMPULSORY. Provide the email address EMAIL to NCBI so that they can track problems.
-f, --force
- Force use of the OUTDIR directory when downloading genomes, even if it already exists.
-h, --help
- Display usage information for pyani download.
--kraken
- Add taxonomy information to the FASTA file headers of downloaded genomes. This allows the genomes to be readily used to construct databases for the Kraken software package.
-l LOGFILE, --logfile LOGFILE
- Provide the location LOGFILE to which a logfile of the download process will be written.
--labels LABELFNAME
- Write a set of labels (one per downloaded genome) to the file LABELFNAME in outdir. Default: labels.txt
--noclobber
- Do not overwrite individual files in the outdir directory when used with -f.
-o OUTDIR, --outdir OUTDIR
- The OUTDIR argument should be the path to a directory into which genome files will be downloaded. If the directory exists, a warning will be given and the download will not proceed, to avoid overwriting existing data. To force writing into an existing directory, use the -f option.
--retries RETRIES
- The download process will attempt to download each batch of assemblies a maximum of RETRIES times. Default: 20
-t TAXON, --taxon TAXON
- COMPULSORY. All genomes below the node with taxon ID TAXON in the NCBI Taxonomy database will be downloaded to the location specified by outdir.
--timeout TIMEOUT
- The download process will wait a maximum of TIMEOUT seconds before abandoning a URL connection attempt. Default: 10
-v, --verbose
- Provide verbose output to STDOUT.
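To make the roles of `--batchsize` and `--retries` more concrete, here is a minimal Python sketch of batched downloading with a retry limit. It is an illustration under stated assumptions, not pyani's code: `fetch_batch` is a hypothetical callable standing in for the real NCBI request, failures are assumed to surface as `OSError`, and the default values mirror those documented above.

```python
def batches(items, batchsize=10000):
    """Yield consecutive slices of items, each at most batchsize long."""
    for start in range(0, len(items), batchsize):
        yield items[start:start + batchsize]


def download_all(ids, fetch_batch, batchsize=10000, retries=20):
    """Fetch every batch, trying each one at most `retries` times."""
    results = []
    for batch in batches(ids, batchsize):
        for attempt in range(1, retries + 1):
            try:
                results.extend(fetch_batch(batch))
                break  # this batch succeeded; move on to the next one
            except OSError:
                if attempt == retries:
                    raise  # give up after the final permitted attempt
    return results
```

In this sketch, a `--timeout` equivalent would belong inside `fetch_batch`, where each individual connection attempt is made.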