pyani.run_multiprocessing module

Code to run a set of command-line jobs using multiprocessing.

For parallelisation on multi-core desktop/laptop systems and similar hardware, we use Python's multiprocessing module to distribute command-line jobs.
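
The core pattern looks roughly like the following. This is an illustrative sketch of the general approach, not pyani's exact implementation; the echo command lines are placeholders:

    import multiprocessing
    import subprocess

    def run_cmdline(cmdline: str) -> int:
        """Run a single command line in a subshell and return its exit code."""
        return subprocess.run(cmdline, shell=True).returncode

    if __name__ == "__main__":
        cmdlines = ["echo job_1", "echo job_2", "echo job_3"]
        # Pool(processes=None) sizes the pool to the number of available cores
        with multiprocessing.Pool(processes=None) as pool:
            exit_codes = pool.map(run_cmdline, cmdlines)
        print(sum(exit_codes))  # 0 if every job succeeded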

pyani.run_multiprocessing.multiprocessing_run(cmdlines: List[T], workers: Optional[int] = None, logger: Optional[logging.Logger] = None) → int

Distributes passed command-line jobs using multiprocessing.

Parameters:
  • cmdlines – iterable of command-line strings
  • workers – int, number of worker processes to use for multiprocessing
  • logger – a logging.Logger object (optional)

Returns the sum of exit codes from each job that was run. If all goes well, this should be 0; any other value indicates that at least one job failed, and the calling function should act accordingly.

Sends a warning to the logger if a comparison fails; the warning reports the specific command line that failed.
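
A hypothetical usage sketch (the echo command lines are placeholders for real comparison commands):

    import logging

    from pyani.run_multiprocessing import multiprocessing_run

    logger = logging.getLogger(__name__)
    cmdlines = ["echo comparison_1", "echo comparison_2"]

    cumulative = multiprocessing_run(cmdlines, workers=2, logger=logger)
    if cumulative != 0:
        logger.error("Sum of exit codes was %d: at least one job failed", cumulative)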

pyani.run_multiprocessing.populate_cmdsets(job: pyani.pyani_jobs.Job, cmdsets: List[T], depth: int) → List[T]

Create a list of command sets at each depth of the dependency tree.

Parameters:
  • job – a pyani.pyani_jobs.Job object at the current node of the dependency tree
  • cmdsets – list of sets of command lines, one set per depth of the dependency tree
  • depth – int, the current depth in the dependency tree

This is a recursive function (is there something quicker in the itertools module?) that descends each ‘root’ job in turn, populating each level of cmdsets with the commands found at that depth of the dependency tree.
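
A minimal sketch of that recursion, assuming (as in pyani.pyani_jobs) that each Job exposes a .command string and a .dependencies list:

    def populate_cmdsets(job, cmdsets, depth):
        """Add job's command to the set at this depth, then recurse
        into the job's dependencies one level deeper."""
        if len(cmdsets) < depth:
            cmdsets.append(set())  # grow the list out to this depth
        cmdsets[depth - 1].add(job.command)
        if not job.dependencies:
            return cmdsets
        for dep in job.dependencies:
            cmdsets = populate_cmdsets(dep, cmdsets, depth + 1)
        return cmdsets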

pyani.run_multiprocessing.run_dependency_graph(jobgraph, workers: Optional[int] = None, logger: Optional[logging.Logger] = None) → int

Create and run pools of jobs based on the passed jobgraph.

Parameters:
  • jobgraph – list of jobs, which may have dependencies.
  • workers – int, number of workers to use with multiprocessing
  • logger – a logging.Logger object (optional)

The strategy here is to loop over each job in the list of jobs (jobgraph), create and populate a series of sets of commands ordered by dependency depth, and then run those sets in reverse order (deepest first) with multiprocessing_run as asynchronous pools, so that each job's dependencies complete before the job itself is run.
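
A hypothetical end-to-end sketch, assuming pyani.pyani_jobs.Job(name, command) and Job.add_dependency() (placeholder echo commands stand in for real analysis steps):

    import logging

    from pyani.pyani_jobs import Job
    from pyani.run_multiprocessing import run_dependency_graph

    logger = logging.getLogger(__name__)

    index = Job("index_genomes", "echo indexing")
    compare = Job("compare_genomes", "echo comparing")
    compare.add_dependency(index)  # indexing must finish before the comparison runs

    # Dependencies are executed first: the deepest command sets run earliest.
    result = run_dependency_graph([compare], workers=2, logger=logger)
    assert result == 0, "a nonzero sum of exit codes signals a failed job"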