kb_python.utils

Module Contents

kb_python.utils.logger
kb_python.utils.TECHNOLOGY_PARSER
kb_python.utils.VERSION_PARSER
exception kb_python.utils.NotImplementedException

Bases: Exception

exception kb_python.utils.UnmetDependencyException

Bases: Exception

kb_python.utils.open_as_text(path, mode)

Open a text file or gzip file in text mode.

Parameters:
  • path (str) – path to text file or gzip file
  • mode (str) – mode to open the file, either w for write or r for read
Returns:

file object

Return type:

file object
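
As a minimal sketch of the behaviour described above, the following opens either kind of file in text mode. The name open_text_or_gzip, the helper roundtrip, and the rule that a .gz suffix identifies gzip files are assumptions made for illustration, not the module's actual implementation.

```python
import gzip
import os
import tempfile


def open_text_or_gzip(path, mode):
    """Hypothetical stand-in for open_as_text: open plain or gzip files as text."""
    # Assumption: a '.gz' suffix marks a gzip file.
    if path.endswith('.gz'):
        # gzip.open needs an explicit 't' to yield str instead of bytes.
        return gzip.open(path, mode + 't')
    return open(path, mode)


def roundtrip(filename, text):
    """Write text to a fresh temporary file, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, filename)
        with open_text_or_gzip(path, 'w') as f:
            f.write(text)
        with open_text_or_gzip(path, 'r') as f:
            return f.read()


plain = roundtrip('reads.txt', 'ACGT\n')
zipped = roundtrip('reads.txt.gz', 'ACGT\n')
```

Both calls return the same string, so callers do not need to know whether the input was compressed.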

kb_python.utils.decompress_gzip(gzip_path, out_path)

Decompress a gzip file to provided file path.

Parameters:
  • gzip_path (str) – path to gzip file
  • out_path (str) – path to decompressed file
Returns:

path to decompressed file

Return type:

str

kb_python.utils.compress_gzip(file_path, out_path)

Compress a file into gzip.

Parameters:
  • file_path (str) – path to file
  • out_path (str) – path to compressed file
Returns:

path to compressed file

Return type:

str
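
The compress and decompress helpers are inverses of each other. The sketch below shows one way such a pair could be written with the standard library; gzip_compress and gzip_decompress are hypothetical names, and the real functions may differ in detail.

```python
import gzip
import os
import shutil
import tempfile


def gzip_compress(file_path, out_path):
    """Compress file_path into a gzip file at out_path and return out_path."""
    with open(file_path, 'rb') as src, gzip.open(out_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)  # copies in chunks, so large files are fine
    return out_path


def gzip_decompress(gzip_path, out_path):
    """Decompress gzip_path into out_path and return out_path."""
    with gzip.open(gzip_path, 'rb') as src, open(out_path, 'wb') as dst:
        shutil.copyfileobj(src, dst)
    return out_path


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, 'barcodes.txt')
    with open(original, 'w') as f:
        f.write('AAAC\nAAAG\n')
    compressed = gzip_compress(original, os.path.join(tmp, 'barcodes.txt.gz'))
    restored = gzip_decompress(compressed, os.path.join(tmp, 'restored.txt'))
    with open(restored) as f:
        restored_text = f.read()
```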

kb_python.utils.run_executable(command, stdin=None, stdout=sp.PIPE, stderr=sp.PIPE, wait=True, stream=True, quiet=False, returncode=0)

Execute a single shell command.

Parameters:
  • command (list) – a list representing a single shell command
  • stdin (stream, optional) – object to pass into the stdin argument for subprocess.Popen, defaults to None
  • stdout (stream, optional) – object to pass into the stdout argument for subprocess.Popen, defaults to subprocess.PIPE
  • stderr (stream, optional) – object to pass into the stderr argument for subprocess.Popen, defaults to subprocess.PIPE
  • wait (bool, optional) – whether to wait until the command has finished, defaults to True
  • stream (bool, optional) – whether to stream the output to the command line, defaults to True
  • quiet (bool, optional) – whether to suppress all output to the command line and skip the return code check, defaults to False
  • returncode (int, optional) – the return code expected if the command runs as intended, defaults to 0
Returns:

the spawned process

Return type:

subprocess.Popen
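
The following is a rough sketch of the wait-and-check pattern the parameters describe, using only subprocess.Popen. The function run_and_check is hypothetical and omits the streaming and quiet options; it is meant to illustrate the idea rather than reproduce run_executable.

```python
import subprocess as sp
import sys


def run_and_check(command, returncode=0):
    """Hypothetical sketch: run one command, wait, and verify its return code."""
    # The command is a list, so no shell is involved in launching it.
    p = sp.Popen(command, stdout=sp.PIPE, stderr=sp.PIPE, universal_newlines=True)
    out, err = p.communicate()  # blocks until the process exits
    if p.returncode != returncode:
        raise sp.CalledProcessError(p.returncode, command, output=out, stderr=err)
    return p, out


# The current Python interpreter stands in for an external binary.
process, output = run_and_check([sys.executable, '-c', 'print("ok")'])
```

Passing the expected return code as a parameter lets callers accept tools that signal success with a non-zero code.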

kb_python.utils.run_chain(*commands, stdin=None, stdout=sp.PIPE, wait=True, stream=False)

Execute multiple shell commands, piping the output of each command into the input of the next.

Parameters:
  • commands (list) – lists of shell commands
  • stdin (stream, optional) – object to pass into the stdin argument for subprocess.Popen, defaults to None
  • stdout (stream, optional) – object to pass into the stdout argument for subprocess.Popen, defaults to subprocess.PIPE
  • wait (bool, optional) – whether to wait until the commands have finished, defaults to True
  • stream (bool, optional) – whether to stream the output to the command line, defaults to False
Returns:

list of spawned subprocesses

Return type:

list
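
Chaining can be sketched by handing each process's stdout to the next process as stdin. The function pipe_commands below is a hypothetical illustration of that technique and ignores the wait and stream options.

```python
import subprocess as sp
import sys


def pipe_commands(*commands):
    """Hypothetical sketch: connect each command's stdout to the next one's stdin."""
    processes = []
    previous_stdout = None
    for command in commands:
        p = sp.Popen(command, stdin=previous_stdout, stdout=sp.PIPE)
        if previous_stdout is not None:
            # Close the parent's copy so only the child holds the read end.
            previous_stdout.close()
        previous_stdout = p.stdout
        processes.append(p)
    return processes


# Two small Python one-liners stand in for real command line tools.
producer = [sys.executable, '-c', 'print("acgt")']
upper = [sys.executable, '-c',
         'import sys; sys.stdout.write(sys.stdin.read().upper())']
procs = pipe_commands(producer, upper)
chain_output = procs[-1].communicate()[0].decode()
for p in procs:
    p.wait()
```

Only the last process's output is read by the caller; intermediate output flows directly between the child processes.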

kb_python.utils.get_kallisto_version()

Get the provided Kallisto version.

This function parses the help text by executing the included Kallisto binary.

Returns:

tuple of major, minor, patch versions

Return type:

tuple

kb_python.utils.get_bustools_version()

Get the provided Bustools version.

This function parses the help text by executing the included Bustools binary.

Returns:

tuple of major, minor, patch versions

Return type:

tuple

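
Both version getters reduce to extracting a major.minor.patch triple from help text. The sketch below shows one way to do that with a regular expression. The pattern, the sample line, and the choice to return integers are assumptions for illustration; the module's own VERSION_PARSER is not shown in this reference and may differ.

```python
import re

# Assumption: an illustrative pattern, not the module's actual VERSION_PARSER.
VERSION_PATTERN = re.compile(r'(\d+)\.(\d+)\.(\d+)')


def parse_version(text):
    """Return (major, minor, patch) as integers, or None if no version is found."""
    match = VERSION_PATTERN.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# Hypothetical help-text line; real tool output is not reproduced here.
version = parse_version('exampletool 0.46.2')
```

Returning a tuple makes versions directly comparable, for example version >= (0, 46, 0).
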
kb_python.utils.parse_technologies(lines)

Parse a list of strings into a list of supported technologies.

This function parses the technologies printed by running kallisto bus --list.

Parameters:
  • lines (list) – the output of kallisto bus --list split into lines
Returns:

list of technologies

Return type:

list

kb_python.utils.get_supported_technologies()

Runs 'kallisto bus --list' to fetch a list of supported technologies.

Returns:

list of technologies

Return type:

list

kb_python.utils.whitelist_provided(technology)

Determine whether or not the whitelist for a technology is provided.

Parameters:
  • technology (str) – the name of the technology
Returns:

whether the whitelist is provided

Return type:

bool

kb_python.utils.copy_whitelist(technology, out_dir)

Copies provided whitelist for specified technology.

Parameters:
  • technology (str) – the name of the technology
  • out_dir (str) – directory to put the whitelist
Returns:

path to whitelist

Return type:

str

kb_python.utils.concatenate_files(*paths, out_path, temp_dir='tmp')

Concatenates an arbitrary number of files into one TEXT file.

Only supports text and gzip files.

Parameters:
  • paths (str) – an arbitrary number of paths to files
  • out_path (str) – path to place concatenated file
  • temp_dir (str, optional) – temporary directory, defaults to tmp
Returns:

path to concatenated file

Return type:

str
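
A minimal sketch of concatenating mixed text and gzip inputs into one plain-text file follows. The function concatenate_as_text is hypothetical: it detects gzip files by their .gz suffix and does not use a temporary directory, both of which are simplifying assumptions.

```python
import gzip
import os
import tempfile


def concatenate_as_text(*paths, out_path):
    """Hypothetical sketch: append text and gzip inputs into one plain-text file."""
    with open(out_path, 'w') as out:
        for path in paths:
            # Assumption: a '.gz' suffix marks a gzip file.
            opener = gzip.open if path.endswith('.gz') else open
            with opener(path, 'rt') as f:
                for line in f:
                    # Terminate every line so adjacent files do not run together.
                    out.write(line if line.endswith('\n') else line + '\n')
    return out_path


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, 'a.txt')
    second = os.path.join(tmp, 'b.txt.gz')
    with open(first, 'w') as f:
        f.write('line1')  # deliberately missing its trailing newline
    with gzip.open(second, 'wt') as f:
        f.write('line2\n')
    merged = concatenate_as_text(first, second,
                                 out_path=os.path.join(tmp, 'all.txt'))
    with open(merged) as f:
        merged_text = f.read()
```

Because out_path is keyword-only, as in the documented signature, it cannot be mistaken for one of the input paths.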

kb_python.utils.stream_file(url, path)

Creates a FIFO file to use for piping remote files into processes.

This function spawns a new thread to download the remote file into a FIFO file object. FIFO file objects are only supported on unix systems.

Parameters:
  • url (str) – url to the file
  • path (str) – path to place FIFO file
Raises:

UnsupportedOSException – if the OS is Windows

Returns:

path to FIFO file

Return type:

str
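
The FIFO technique described above can be sketched as follows for Unix systems. The function stream_to_fifo is a hypothetical illustration; it returns the worker thread as well as the path, which the documented function does not, and the example uses a local file:// URL in place of a remote file so that it runs offline.

```python
import os
import shutil
import tempfile
import threading
import urllib.request
from pathlib import Path


def stream_to_fifo(url, path):
    """Hypothetical sketch: create a FIFO at path and fill it from url in a thread."""
    os.mkfifo(path)  # available on Unix only

    def download():
        # Opening a FIFO for writing blocks until a reader opens the other end.
        with urllib.request.urlopen(url) as response, open(path, 'wb') as fifo:
            shutil.copyfileobj(response, fifo)

    thread = threading.Thread(target=download, daemon=True)
    thread.start()
    return path, thread


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / 'remote.txt'
    source.write_text('streamed\n')
    # A file:// URL stands in for a remote file.
    fifo_path, worker = stream_to_fifo(source.as_uri(), os.path.join(tmp, 'pipe'))
    with open(fifo_path, 'rb') as fifo:
        received = fifo.read()  # returns once the writer closes the FIFO
    worker.join()
```

The consumer reads the FIFO like an ordinary file, so the data never has to be written to disk in full.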

kb_python.utils.import_matrix_as_anndata(matrix_path, barcodes_path, genes_path)

Import a matrix as an AnnData object.

Parameters:
  • matrix_path (str) – path to the matrix ec file
  • barcodes_path (str) – path to the barcodes txt file
  • genes_path (str) – path to the genes txt file
Returns:

a new AnnData object

Return type:

anndata.AnnData

kb_python.utils.overlay_anndatas(adata_spliced, adata_unspliced)

'Overlays' two AnnData objects by taking the intersection of their obs and var.

Parameters:
  • adata_spliced (anndata.AnnData) – an AnnData object
  • adata_unspliced (anndata.AnnData) – an AnnData object
Returns:

a new AnnData object

Return type:

anndata.AnnData

kb_python.utils.sum_anndatas(adata_spliced, adata_unspliced)

Sum the counts in two AnnData objects by taking the intersection of both matrices and adding the values together.

Parameters:
  • adata_spliced (anndata.AnnData) – an AnnData object
  • adata_unspliced (anndata.AnnData) – an AnnData object
Returns:

a new AnnData object

Return type:

anndata.AnnData
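
To make the intersect-then-add step concrete, here is a small sketch that uses plain dictionaries in place of anndata objects. The structure (keys 'obs', 'var', and 'X' holding a list of rows) and the name sum_on_intersection are assumptions chosen for illustration; the real function operates on anndata objects.

```python
def sum_on_intersection(a, b):
    """Add two labeled matrices over their shared row and column labels.

    Each argument is a dict with 'obs' (row labels), 'var' (column labels),
    and 'X' (a list of rows), a simplified stand-in for an anndata object.
    """
    b_obs, b_var = set(b['obs']), set(b['var'])
    # Keep the ordering of the first matrix for the shared labels.
    obs = [o for o in a['obs'] if o in b_obs]
    var = [v for v in a['var'] if v in b_var]

    def lookup(m):
        row = {o: i for i, o in enumerate(m['obs'])}
        col = {v: j for j, v in enumerate(m['var'])}
        return lambda o, v: m['X'][row[o]][col[v]]

    get_a, get_b = lookup(a), lookup(b)
    X = [[get_a(o, v) + get_b(o, v) for v in var] for o in obs]
    return {'obs': obs, 'var': var, 'X': X}


spliced = {'obs': ['AAAC', 'AAAG', 'AAAT'], 'var': ['g1', 'g2'],
           'X': [[1, 0], [2, 3], [0, 5]]}
unspliced = {'obs': ['AAAG', 'AAAT', 'AACC'], 'var': ['g2', 'g3'],
             'X': [[4, 1], [1, 0], [7, 7]]}
total = sum_on_intersection(spliced, unspliced)
```

Only barcodes AAAG and AAAT and gene g2 appear in both inputs, so the result is a 2 by 1 matrix holding 3 + 4 and 5 + 1.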