kb_python.utils
Module Contents¶
Classes¶
TqdmLoggingHandler – Custom logging handler so that logging does not affect progress bars.
Functions¶
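update_filename(filename, code) – Update the provided path with the specified code.
open_as_text(path, mode) – Open a textfile or gzip file in text mode.
decompress_gzip(gzip_path, out_path) – Decompress a gzip file to provided file path.
compress_gzip(file_path, out_path) – Compress a file into gzip.
make_directory(path) – Quietly make the specified directory (and any subdirectories).
remove_directory(path) – Quietly remove the specified directory (and any subdirectories).
run_executable(command, stdin=None, stdout=sp.PIPE, stderr=sp.PIPE, wait=True, stream=True, quiet=False, returncode=0, alias=True, record=True) – Execute a single shell command.
get_kallisto_version() – Get the provided Kallisto version.
get_bustools_version() – Get the provided Bustools version.
parse_technologies(lines) – Parse a list of strings into a list of supported technologies.
get_supported_technologies() – Runs ‘kallisto bus --list’ to fetch a list of supported technologies.
whitelist_provided(technology) – Determine whether or not the whitelist for a technology is provided.
move_file(source, destination) – Move a file from source to destination, overwriting the file if the destination exists.
copy_whitelist(technology, out_dir) – Copies provided whitelist for specified technology.
copy_map(technology, out_dir) – Copies provided feature-to-cell barcode mapping for the specified technology.
concatenate_files(*paths, out_path, temp_dir='tmp') – Concatenates an arbitrary number of files into one TEXT file.
download_file(url, path) – Download a remote file to the provided path while displaying a progress bar.
stream_file(url, path) – Creates a FIFO file to use for piping remote files into processes.
get_temporary_filename(temp_dir=None) – Create a temporary file in the provided temporary directory.
read_t2g(t2g_path) – Given a transcript-to-gene mapping path, read it into a dictionary.
import_tcc_matrix_as_anndata(matrix_path, barcodes_path, ec_path, txnames_path, threads=8) – Import a TCC matrix as an Anndata object.
import_matrix_as_anndata(matrix_path, barcodes_path, genes_path, t2g_path=None, name='gene') – Import a matrix as an Anndata object.
overlay_anndatas(adata_spliced, adata_unspliced) – ‘Overlays’ anndata objects by taking the intersection of the obs and var of each anndata.
sum_anndatas(adata_spliced, adata_unspliced) – Sum the counts in two anndata objects by taking the intersection of both matrices and adding the values together.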
-
kb_python.utils.
logger
-
kb_python.utils.
TECHNOLOGY_PARSER
-
kb_python.utils.
VERSION_PARSER
-
exception
kb_python.utils.
NotImplementedException
¶ Bases:
Exception
Common base class for all non-exit exceptions.
-
exception
kb_python.utils.
UnmetDependencyException
¶ Bases:
Exception
Common base class for all non-exit exceptions.
-
class
kb_python.utils.
TqdmLoggingHandler
(level=logging.NOTSET)¶ Bases:
logging.Handler
Custom logging handler so that logging does not affect progress bars.
-
emit
(self, record)¶ Do whatever it takes to actually log the specified logging record.
The base logging.Handler.emit is intended to be implemented by subclasses and raises a NotImplementedError; this class provides its own emit.
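A minimal sketch of how a progress-bar-friendly handler could look, assuming the progress bars are drawn with tqdm (suggested by the class name, not stated in this page). The class name `ProgressSafeHandler` and its `write` parameter are hypothetical and exist only for illustration; this is not the library's implementation.

```python
import logging


class ProgressSafeHandler(logging.Handler):
    """Hypothetical handler that sends formatted records to a write callable."""

    def __init__(self, write=None, level=logging.NOTSET):
        super().__init__(level)
        if write is None:
            # Assumption: tqdm is installed. tqdm.write prints a line
            # without corrupting any progress bar currently on screen.
            from tqdm import tqdm
            write = tqdm.write
        self._write = write

    def emit(self, record):
        try:
            self._write(self.format(record))
        except Exception:
            # Delegate to the standard logging error hook.
            self.handleError(record)


# Usage: capture messages in a list instead of printing them.
lines = []
demo_logger = logging.getLogger("progress_safe_demo")
demo_logger.propagate = False
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(ProgressSafeHandler(write=lines.append))
demo_logger.info("indexing finished")
```

Passing `write` explicitly keeps the sketch usable without tqdm installed; leaving it out falls back to `tqdm.write`.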
-
-
kb_python.utils.
update_filename
(filename, code)¶ Update the provided path with the specified code.
For instance, if the path is ‘output.bus’ and code is s (for sort), this function returns output.s.bus.
- Parameters
filename (str) – filename (NOT path)
code (str) – code to append to filename
- Returns
path updated with provided code
- Return type
str
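The renaming described above can be sketched with `os.path.splitext`. The function name `update_filename_sketch` is hypothetical, and the sketch assumes the code is always inserted before the last extension.

```python
import os


def update_filename_sketch(filename, code):
    """Insert `code` between the base name and the final extension."""
    name, extension = os.path.splitext(filename)
    return f"{name}.{code}{extension}"


print(update_filename_sketch("output.bus", "s"))  # prints output.s.bus
```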
-
kb_python.utils.
open_as_text
(path, mode)¶ Open a textfile or gzip file in text mode.
- Parameters
path (str) – path to textfile or gzip
mode (str) – mode to open the file, either w for write or r for read
- Returns
file object
- Return type
file object
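One way to get this behaviour, shown as a sketch only: it assumes gzip files are recognised by a `.gz` suffix, which this page does not confirm. The name `open_as_text_sketch` is hypothetical.

```python
import gzip


def open_as_text_sketch(path, mode):
    """Open `path` in text mode; `mode` is 'r' or 'w'."""
    if path.endswith(".gz"):
        # Assumption: a .gz suffix marks a gzip file.
        # Appending 't' makes gzip.open return a text-mode file object.
        return gzip.open(path, mode + "t")
    return open(path, mode)
```

Both branches return an object that works as a context manager, so callers can use `with open_as_text_sketch(path, "r") as f:` regardless of compression.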
-
kb_python.utils.
decompress_gzip
(gzip_path, out_path)¶ Decompress a gzip file to provided file path.
- Parameters
gzip_path (str) – path to gzip file
out_path (str) – path to decompressed file
- Returns
path to decompressed file
- Return type
str
-
kb_python.utils.
compress_gzip
(file_path, out_path)¶ Compress a file into gzip.
- Parameters
file_path (str) – path to file
out_path (str) – path to compressed file
- Returns
path to compressed file
- Return type
str
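The two gzip helpers are mirror images of each other. A minimal sketch using only the standard library follows; the function names are hypothetical and the streaming copy via `shutil.copyfileobj` is an assumption about a reasonable approach, not the library's code.

```python
import gzip
import shutil


def compress_gzip_sketch(file_path, out_path):
    """Stream `file_path` into a gzip file at `out_path`."""
    with open(file_path, "rb") as src, gzip.open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def decompress_gzip_sketch(gzip_path, out_path):
    """Stream the decompressed contents of `gzip_path` into `out_path`."""
    with gzip.open(gzip_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

Copying in chunks avoids loading a large FASTQ or matrix file into memory at once.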
-
kb_python.utils.
make_directory
(path)¶ Quietly make the specified directory (and any subdirectories).
This function is a wrapper around os.makedirs. It is used so that the appropriate mkdir command can be printed for dry runs.
- Parameters
path (str) – path to directory to make
-
kb_python.utils.
remove_directory
(path)¶ Quietly remove the specified directory (and any subdirectories).
This function is a wrapper around shutil.rmtree. It is used so that the appropriate rm command can be printed for dry runs.
- Parameters
path (str) – path to directory to remove
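The dry-run idea behind make_directory and remove_directory can be sketched as follows. The `dry_run` keyword is hypothetical and is used here only to make the behaviour visible; this page does not say how the library switches dry-run mode on, nor the exact commands it prints.

```python
import os
import shutil


def make_directory_sketch(path, dry_run=False):
    """Create `path` and any missing parents, or print the equivalent command."""
    if dry_run:
        print(f"mkdir -p {path}")  # assumed shell equivalent
        return
    os.makedirs(path, exist_ok=True)


def remove_directory_sketch(path, dry_run=False):
    """Remove `path` recursively, or print the equivalent command."""
    if dry_run:
        print(f"rm -rf {path}")  # assumed shell equivalent
        return
    shutil.rmtree(path, ignore_errors=True)
```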
-
kb_python.utils.
run_executable
(command, stdin=None, stdout=sp.PIPE, stderr=sp.PIPE, wait=True, stream=True, quiet=False, returncode=0, alias=True, record=True)¶ Execute a single shell command.
- Parameters
command (list) – a list representing a single shell command
stdin (stream, optional) – object to pass into the stdin argument for subprocess.Popen, defaults to None
stdout (stream, optional) – object to pass into the stdout argument for subprocess.Popen, defaults to subprocess.PIPE
stderr (stream, optional) – object to pass into the stderr argument for subprocess.Popen, defaults to subprocess.PIPE
wait (bool, optional) – whether to wait until the command has finished, defaults to True
stream (bool, optional) – whether to stream the output to the command line, defaults to True
quiet (bool, optional) – whether to not display anything to the command line and not check the return code, defaults to False
returncode (int, optional) – the return code expected if the command runs as intended, defaults to 0
alias (bool, optional) – whether to use the basename of the first element of command, defaults to True
record (bool, optional) – whether to record the call statistics, defaults to True
- Returns
the spawned process
- Return type
subprocess.Popen
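A reduced sketch of the core idea: spawn the command with `subprocess.Popen`, wait for it, and compare the exit status with the expected `returncode`. The function name `run_command_sketch` is hypothetical, and the streaming, aliasing, and recording options listed above are deliberately left out.

```python
import subprocess as sp
import sys


def run_command_sketch(command, returncode=0):
    """Run `command` (a list), wait for it, and check its exit status."""
    process = sp.Popen(command, stdout=sp.PIPE, stderr=sp.PIPE, text=True)
    out, err = process.communicate()
    if process.returncode != returncode:
        raise sp.CalledProcessError(
            process.returncode, command, output=out, stderr=err
        )
    return process, out


# Usage: run the current Python interpreter as a stand-in for a real binary.
process, out = run_command_sketch([sys.executable, "-c", "print('hello')"])
```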
-
kb_python.utils.
get_kallisto_version
()¶ Get the provided Kallisto version.
This function parses the help text by executing the included Kallisto binary.
- Returns
tuple of major, minor, patch versions
- Return type
tuple
-
kb_python.utils.
get_bustools_version
()¶ Get the provided Bustools version.
This function parses the help text by executing the included Bustools binary.
- Returns
tuple of major, minor, patch versions
- Return type
tuple
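Extracting a version tuple from help text can be sketched with a regular expression. The pattern below and the sample help text are assumptions made for illustration; the actual VERSION_PARSER pattern and the binaries' real output are not shown on this page.

```python
import re

# Hypothetical pattern: the first "major.minor.patch" triple in the text.
VERSION_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version_sketch(help_text):
    """Return (major, minor, patch) as integers from the given help text."""
    match = VERSION_PATTERN.search(help_text)
    if match is None:
        raise ValueError("no version string found in help text")
    return tuple(int(part) for part in match.groups())


# Made-up help text, used only to exercise the parser.
sample = "sometool 1.2.3\n\nUsage: sometool <CMD> [arguments] .."
version = parse_version_sketch(sample)
```

Returning integers rather than strings lets callers compare versions directly, for example `version >= (1, 2, 0)`.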
-
kb_python.utils.
parse_technologies
(lines)¶ Parse a list of strings into a list of supported technologies.
This function parses the technologies printed by running kallisto bus --list.
- Parameters
lines (list) – the output of kallisto bus --list split into lines
- Returns
list of technologies
- Return type
list
-
kb_python.utils.
get_supported_technologies
()¶ Runs ‘kallisto bus --list’ to fetch a list of supported technologies.
- Returns
list of technologies
- Return type
list
-
kb_python.utils.
whitelist_provided
(technology)¶ Determine whether or not the whitelist for a technology is provided.
- Parameters
technology (str) – the name of the technology
- Returns
whether the whitelist is provided
- Return type
bool
-
kb_python.utils.
move_file
(source, destination)¶ Move a file from source to destination, overwriting the file if the destination exists.
- Parameters
source (str) – path to source file
destination (str) – path to destination
- Returns
path to moved file
- Return type
str
-
kb_python.utils.
copy_whitelist
(technology, out_dir)¶ Copies provided whitelist for specified technology.
- Parameters
technology (str) – the name of the technology
out_dir (str) – directory to put the whitelist
- Returns
path to whitelist
- Return type
str
-
kb_python.utils.
copy_map
(technology, out_dir)¶ Copies provided feature-to-cell barcode mapping for the specified technology.
- Parameters
technology (str) – the name of the technology
out_dir (str) – directory to put the map
- Returns
path to map
- Return type
str
-
kb_python.utils.
concatenate_files
(*paths, out_path, temp_dir='tmp')¶ Concatenates an arbitrary number of files into one TEXT file.
Only supports text and gzip files.
- Parameters
paths (str) – an arbitrary number of paths to files
out_path (str) – path to place concatenated file
temp_dir (str, optional) – temporary directory, defaults to tmp
- Returns
path to concatenated file
- Return type
str
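A sketch of concatenating mixed plain-text and gzip inputs into one text file. It assumes gzip inputs are identified by a `.gz` suffix and omits the `temp_dir` argument; the name `concatenate_files_sketch` is hypothetical.

```python
import gzip
import shutil


def concatenate_files_sketch(*paths, out_path):
    """Append the text contents of every path, in order, to `out_path`."""
    with open(out_path, "w") as out:
        for path in paths:
            # Assumption: a .gz suffix marks a gzip file.
            opener = gzip.open if path.endswith(".gz") else open
            with opener(path, "rt") as f:
                shutil.copyfileobj(f, out)
    return out_path
```

Because `out_path` follows `*paths` in the signature, it is keyword-only and must be passed as `out_path=...`.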
-
kb_python.utils.
download_file
(url, path)¶ Download a remote file to the provided path while displaying a progress bar.
- Parameters
url (str) – remote url
path (str) – local path to download the file to
- Returns
path to downloaded file
- Return type
str
-
kb_python.utils.
stream_file
(url, path)¶ Creates a FIFO file to use for piping remote files into processes.
This function spawns a new thread to download the remote file into a FIFO file object. FIFO file objects are only supported on unix systems.
- Parameters
url (str) – url to the file
path (str) – path to place FIFO file
- Raises
UnsupportedOSException – if the OS is Windows
- Returns
path to FIFO file
- Return type
str
-
kb_python.utils.
get_temporary_filename
(temp_dir=None)¶ Create a temporary file in the provided temporary directory.
The caller is responsible for deleting the file.
- Parameters
temp_dir (str, optional) – path to temporary directory, defaults to None
- Returns
temporary filename
- Return type
str
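A sketch of the "caller deletes the file" contract using `tempfile.mkstemp`, which is an assumed choice rather than something this page states. The name `get_temporary_filename_sketch` is hypothetical.

```python
import os
import tempfile


def get_temporary_filename_sketch(temp_dir=None):
    """Create an empty temporary file and return its path."""
    # mkstemp creates the file and returns an open OS-level descriptor.
    fd, path = tempfile.mkstemp(dir=temp_dir)
    os.close(fd)  # only the name is needed, so release the descriptor
    return path


# Usage: the caller removes the file when finished with it.
tmp_path = get_temporary_filename_sketch()
created = os.path.exists(tmp_path)
os.remove(tmp_path)
```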
-
kb_python.utils.
read_t2g
(t2g_path)¶ Given a transcript-to-gene mapping path, read it into a dictionary. The first column is always assumed to be the transcript IDs.
- Parameters
t2g_path (str) – path to t2g
- Returns
dictionary containing transcript IDs as keys and all other columns as a tuple as values
- Return type
dict
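The mapping described above can be sketched as follows, assuming the file is tab-separated with one transcript per line (the delimiter is not stated on this page). The name `read_t2g_sketch` is hypothetical.

```python
def read_t2g_sketch(t2g_path):
    """Map each transcript ID to a tuple of the remaining columns."""
    t2g = {}
    with open(t2g_path, "r") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            columns = line.split("\t")  # assumption: tab-separated
            t2g[columns[0]] = tuple(columns[1:])
    return t2g
```

For a line such as `ENST0001<TAB>ENSG0001<TAB>GeneA` (made-up IDs), the resulting entry is `{"ENST0001": ("ENSG0001", "GeneA")}`.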
-
kb_python.utils.
import_tcc_matrix_as_anndata
(matrix_path, barcodes_path, ec_path, txnames_path, threads=8)¶ Import a TCC matrix as an Anndata object.
- Parameters
matrix_path (str) – path to the matrix ec file
barcodes_path (str) – path to the barcodes txt file
ec_path (str) – path to the ec txt file
txnames_path (str) – path to transcripts.txt generated by kallisto bus
- Returns
a new Anndata object
- Return type
anndata.AnnData
-
kb_python.utils.
import_matrix_as_anndata
(matrix_path, barcodes_path, genes_path, t2g_path=None, name='gene')¶ Import a matrix as an Anndata object.
- Parameters
matrix_path (str) – path to the matrix ec file
barcodes_path (str) – path to the barcodes txt file
genes_path (str) – path to the genes txt file
t2g_path (str, optional) – path to transcript-to-gene mapping. If this is provided, the third column of the mapping is appended to the anndata var, defaults to None
name (str, optional) – name of the columns, defaults to “gene”
- Returns
a new Anndata object
- Return type
anndata.AnnData
-
kb_python.utils.
overlay_anndatas
(adata_spliced, adata_unspliced)¶ ‘Overlays’ anndata objects by taking the intersection of the obs and var of each anndata.
- Parameters
adata_spliced (anndata.AnnData) – an Anndata object
adata_unspliced (anndata.AnnData) – an Anndata object
- Returns
a new Anndata object
- Return type
anndata.AnnData
-
kb_python.utils.
sum_anndatas
(adata_spliced, adata_unspliced)¶ Sum the counts in two anndata objects by taking the intersection of both matrices and adding the values together.
- Parameters
adata_spliced (anndata.AnnData) – an Anndata object
adata_unspliced (anndata.AnnData) – an Anndata object
- Returns
a new Anndata object
- Return type
anndata.AnnData