kb_python.utils
Module Contents
Functions
| Function | Description |
| --- | --- |
| update_filename | Update the provided path with the specified code. |
| make_directory | Quietly make the specified directory (and any subdirectories). |
| remove_directory | Quietly remove the specified directory (and any subdirectories). |
| run_executable | Execute a single shell command. |
| get_kallisto_version | Get the provided Kallisto version. |
| get_bustools_version | Get the provided Bustools version. |
| parse_technologies | Parse a list of strings into a set of supported technologies. |
| get_supported_technologies | Runs 'kallisto bus --list' to fetch a list of supported technologies. |
| whitelist_provided | Determine whether or not the whitelist for a technology is provided. |
| move_file | Move a file from source to destination, overwriting the file if the destination exists. |
| copy_whitelist | Copies provided whitelist for specified technology. |
| create_10x_feature_barcode_map | Create a feature-barcode map for the 10x Feature Barcoding technology. |
| stream_file | Creates a FIFO file to use for piping remote files into processes. |
| read_t2g | Given a transcript-to-gene mapping path, read it into a dictionary. |
| collapse_anndata | Collapse the given Anndata by summing duplicate rows. |
| import_tcc_matrix_as_anndata | Import a TCC matrix as an Anndata object. |
| import_matrix_as_anndata | Import a matrix as an Anndata object. |
| overlay_anndatas | 'Overlays' anndata objects by taking the intersection of the obs and var of each anndata. |
| sum_anndatas | Sum the counts in two anndata objects by taking the intersection of both matrices and adding the values together. |
| restore_cwd | Function decorator to decorate functions that change the current working directory. |
Attributes
- kb_python.utils.TECHNOLOGY_PARSER
- kb_python.utils.VERSION_PARSER
- kb_python.utils.open_as_text
- kb_python.utils.decompress_gzip
- kb_python.utils.compress_gzip
- kb_python.utils.concatenate_files
- kb_python.utils.download_file
- kb_python.utils.get_temporary_filename
- kb_python.utils.update_filename(filename: str, code: str) -> str
Update the provided path with the specified code.
For instance, if the path is 'output.bus' and the code is 's' (for sort), this function returns 'output.s.bus'.
- Parameters
filename – filename (NOT path)
code – code to append to filename
- Returns
Path updated with provided code
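As a minimal sketch of the behaviour described above, the code can be spliced in just before the final extension. The helper name `update_filename_sketch` is hypothetical, and this is not necessarily how kb_python implements the function; it only reproduces the documented 'output.bus' example.

```python
import os


def update_filename_sketch(filename: str, code: str) -> str:
    """Insert `code` before the last extension of `filename`.

    Assumption: the filename has a single trailing extension such as '.bus'.
    """
    name, extension = os.path.splitext(filename)
    return f'{name}.{code}{extension}'


print(update_filename_sketch('output.bus', 's'))  # output.s.bus
```

Because only the last extension is split off, a name such as 'output.s.bus' would become 'output.s.c.bus' when updated again with code 'c'.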
- kb_python.utils.make_directory(path: str)
Quietly make the specified directory (and any subdirectories).
This function is a wrapper around os.makedirs. It is used so that the appropriate mkdir command can be printed for dry runs.
- Parameters
path – Path to directory to make
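The dry-run wrapper pattern described above might look roughly like the following. The `dry_run` argument and the helper name are assumptions made for illustration; the documentation here does not show how kb_python decides that a dry run is in effect.

```python
import os


def make_directory_sketch(path: str, dry_run: bool = False) -> None:
    """Create `path` and any missing parents, or only print the command.

    Assumption: dry-run mode is passed in explicitly as a flag.
    """
    if dry_run:
        # Show the equivalent shell command instead of touching the disk.
        print(f'mkdir -p {path}')
        return
    # exist_ok=True makes the call quiet when the directory already exists.
    os.makedirs(path, exist_ok=True)
```

The same shape would apply to a removal wrapper around `shutil.rmtree` that prints an `rm` command during dry runs.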
- kb_python.utils.remove_directory(path: str)
Quietly remove the specified directory (and any subdirectories).
This function is a wrapper around shutil.rmtree. It is used so that the appropriate rm command can be printed for dry runs.
- Parameters
path – Path to directory to remove
- kb_python.utils.run_executable(command: List[str], stdin: Optional[int] = None, stdout: int = sp.PIPE, stderr: int = sp.PIPE, wait: bool = True, stream: bool = True, quiet: bool = False, returncode: int = 0, alias: bool = True, record: bool = True) -> Union[Tuple[subprocess.Popen, str, str], subprocess.Popen]
Execute a single shell command.
- Parameters
command – A list representing a single shell command
stdin – Object to pass into the stdin argument for subprocess.Popen, defaults to None
stdout – Object to pass into the stdout argument for subprocess.Popen, defaults to subprocess.PIPE
stderr – Object to pass into the stderr argument for subprocess.Popen, defaults to subprocess.PIPE
wait – Whether to wait until the command has finished, defaults to True
stream – Whether to stream the output to the command line, defaults to True
quiet – Whether to suppress all output to the command line and skip the return code check, defaults to False
returncode – The return code expected if the command runs as intended, defaults to 0
alias – Whether to use the basename of the first element of command, defaults to True
record – Whether to record the call statistics, defaults to True
- Returns
(the spawned process, list of strings printed to stdout, list of strings printed to stderr) if wait=True. Otherwise, the spawned process.
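To illustrate the wait/return-code contract described above, here is a reduced sketch built directly on `subprocess.Popen`. It covers only `command`, `wait`, and `returncode`; streaming, aliasing, and call recording are left out, and the helper name is hypothetical rather than part of kb_python.

```python
import subprocess as sp
import sys
from typing import List


def run_command_sketch(command: List[str], wait: bool = True, returncode: int = 0):
    """Spawn `command`; optionally wait and check its return code."""
    process = sp.Popen(command, stdout=sp.PIPE, stderr=sp.PIPE, text=True)
    if not wait:
        # The caller is responsible for waiting on the process later.
        return process
    out, err = process.communicate()
    if process.returncode != returncode:
        raise sp.CalledProcessError(process.returncode, command, output=out, stderr=err)
    # Return the captured output as lists of lines, as the docs describe.
    return process, out.splitlines(), err.splitlines()


process, out, err = run_command_sketch([sys.executable, '-c', 'print("hello")'])
print(out)  # ['hello']
```

Passing a non-zero `returncode` is useful for tools that exit with a non-zero status even on success, such as some binaries when printing their help text.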
- kb_python.utils.get_kallisto_version() -> Optional[Tuple[int, int, int]]
Get the provided Kallisto version.
This function parses the help text by executing the included Kallisto binary.
- Returns
Major, minor, patch versions
- kb_python.utils.get_bustools_version() -> Optional[Tuple[int, int, int]]
Get the provided Bustools version.
This function parses the help text by executing the included Bustools binary.
- Returns
Major, minor, patch versions
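Both version functions reduce to pulling a major.minor.patch triple out of help text. The sketch below shows one way to do that; the regular expression and the sample help text are assumptions, since the actual VERSION_PARSER pattern is not shown in this documentation.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: the first "X.Y.Z" triple found anywhere in the text.
VERSION_PATTERN = re.compile(r'(\d+)\.(\d+)\.(\d+)')


def parse_version_sketch(help_text: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch), or None when no version is found."""
    match = VERSION_PATTERN.search(help_text)
    if match is None:
        return None
    major, minor, patch = (int(group) for group in match.groups())
    return major, minor, patch


# Assumed shape of the first lines of a help message.
print(parse_version_sketch('kallisto 0.46.1\n\nUsage: kallisto <CMD> [arguments] ..'))  # (0, 46, 1)
```

Returning a tuple of integers rather than a string lets callers compare versions directly, for example `version >= (0, 46, 0)`.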
- kb_python.utils.parse_technologies(lines: List[str]) -> Set[str]
Parse a list of strings into a set of supported technologies.
This function parses the technologies printed by running kallisto bus --list.
- Parameters
lines – The output of kallisto bus --list split into lines
- Returns
Set of technologies
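A rough sketch of this kind of parsing follows. It assumes the listing is a two-column table whose rows come after a dashed separator line, with the short technology name in the first column; that layout, the sample lines, and the helper name are assumptions, and the real parser relies on TECHNOLOGY_PARSER, whose pattern is not shown here.

```python
from typing import List, Set


def parse_technologies_sketch(lines: List[str]) -> Set[str]:
    """Collect the first column of every table row after the dashed line."""
    technologies = set()
    in_table = False
    for line in lines:
        if line.startswith('---'):
            # Assumed separator between the table header and its rows.
            in_table = True
            continue
        if in_table and line.strip():
            technologies.add(line.split()[0])
    return technologies


# Assumed layout of the listing, for illustration only.
sample = [
    'List of supported single-cell technologies',
    '',
    'short name       description',
    '----------       -----------',
    '10xv2            10x version 2 chemistry',
    '10xv3            10x version 3 chemistry',
    '',
]
print(sorted(parse_technologies_sketch(sample)))  # ['10xv2', '10xv3']
```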
- kb_python.utils.get_supported_technologies() -> Set[str]
Runs 'kallisto bus --list' to fetch a list of supported technologies.
- Returns
Set of technologies
- kb_python.utils.whitelist_provided(technology: str) -> bool
Determine whether or not the whitelist for a technology is provided.
- Parameters
technology – The name of the technology
- Returns
Whether the whitelist is provided
- kb_python.utils.move_file(source: str, destination: str) -> str
Move a file from source to destination, overwriting the file if the destination exists.
- Parameters
source – Path to source file
destination – Path to destination
- Returns
Path to moved file
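The overwrite-on-move behaviour can be sketched with the standard library as below. This is an illustrative assumption about one way to get that behaviour portably, not kb_python's actual implementation, and the helper name is hypothetical.

```python
import os
import shutil


def move_file_sketch(source: str, destination: str) -> str:
    """Move `source` to `destination`, replacing an existing file."""
    if os.path.isfile(destination):
        # Remove the old file first so the move behaves the same on every OS.
        os.remove(destination)
    # shutil.move returns the final destination path.
    return shutil.move(source, destination)
```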
- kb_python.utils.copy_whitelist(technology: str, out_dir: str) -> str
Copies provided whitelist for specified technology.
- Parameters
technology – The name of the technology
out_dir – Directory to put the whitelist
- Returns
Path to whitelist
- kb_python.utils.create_10x_feature_barcode_map(out_path: str) -> str
Create a feature-barcode map for the 10x Feature Barcoding technology.
- Parameters
out_path – Path to the output mapping file
- Returns
Path to map
- kb_python.utils.stream_file(url: str, path: str) -> str
Creates a FIFO file to use for piping remote files into processes.
This function spawns a new thread to download the remote file into a FIFO file object. FIFO file objects are only supported on Unix systems.
- Parameters
url – URL to the file
path – Path to place FIFO file
- Returns
Path to FIFO file
- Raises
UnsupportedOSError – If the OS is Windows
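The FIFO-plus-thread pattern described above can be sketched as follows. This is a simplified illustration using `os.mkfifo` and `urllib.request`; it raises a plain `OSError` on Windows in place of kb_python's UnsupportedOSError, and the helper name is hypothetical.

```python
import os
import shutil
import threading
import urllib.request


def stream_file_sketch(url: str, path: str) -> str:
    """Create a FIFO at `path` and fill it from `url` in a background thread."""
    if os.name == 'nt':
        # Stand-in for UnsupportedOSError, which is not defined in this sketch.
        raise OSError('FIFO files are not supported on Windows')
    os.mkfifo(path)

    def download() -> None:
        # Opening the FIFO for writing blocks until a reader opens it.
        with urllib.request.urlopen(url) as response, open(path, 'wb') as fifo:
            shutil.copyfileobj(response, fifo)

    threading.Thread(target=download, daemon=True).start()
    return path
```

A consumer then opens the returned path like a regular file; data flows through as it is downloaded, so the remote file never has to be stored on disk in full.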
- kb_python.utils.read_t2g(t2g_path: str) -> Dict[str, Tuple[str, ...]]
Given a transcript-to-gene mapping path, read it into a dictionary. The first column is always assumed to be the transcript IDs.
- Parameters
t2g_path – Path to t2g
- Returns
Dictionary containing transcript IDs as keys and all other columns as a tuple as values
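A minimal sketch of reading such a mapping is shown below. It assumes the file is tab-separated with one transcript per line; the delimiter and the helper name are assumptions, not details confirmed by this documentation.

```python
from typing import Dict, Tuple


def read_t2g_sketch(t2g_path: str) -> Dict[str, Tuple[str, ...]]:
    """Map each transcript ID to a tuple of its remaining columns."""
    t2g = {}
    with open(t2g_path, 'r') as f:
        for line in f:
            if line.isspace():
                continue  # Skip blank lines.
            # Assumption: columns are separated by tabs.
            transcript, *rest = line.strip().split('\t')
            t2g[transcript] = tuple(rest)
    return t2g
```

For a line such as `ENST1<TAB>ENSG1<TAB>GENE1`, the resulting entry is `{'ENST1': ('ENSG1', 'GENE1')}`.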
- kb_python.utils.collapse_anndata(adata: anndata.AnnData, by: Optional[str] = None) -> anndata.AnnData
Collapse the given Anndata by summing duplicate rows. The by argument specifies which column to use. If not provided, the index is used.
Note
This function also collapses any existing layers. Additionally, the returned AnnData will have the values used to collapse as the index.
- Parameters
adata – The Anndata to collapse
by – The column to collapse by. If not provided, the index is used. When this column contains missing values (i.e. nan or None), these columns are removed.
- Returns
A new collapsed Anndata object. All matrices are sparse, regardless of whether or not they were in the input Anndata.
- kb_python.utils.import_tcc_matrix_as_anndata(matrix_path: str, barcodes_path: str, ec_path: str, txnames_path: str, threads: int = 8) -> anndata.AnnData
Import a TCC matrix as an Anndata object.
- Parameters
matrix_path – Path to the matrix ec file
barcodes_path – Path to the barcodes txt file
ec_path – Path to the ec txt file
txnames_path – Path to transcripts.txt generated by kallisto bus
threads – Number of threads, defaults to 8
- Returns
A new Anndata object
- kb_python.utils.import_matrix_as_anndata(matrix_path: str, barcodes_path: str, genes_path: str, t2g_path: Optional[str] = None, name: str = 'gene', by_name: bool = False) -> anndata.AnnData
Import a matrix as an Anndata object.
- Parameters
matrix_path – Path to the matrix ec file
barcodes_path – Path to the barcodes txt file
genes_path – Path to the genes txt file
t2g_path – Path to transcript-to-gene mapping. If this is provided, the third column of the mapping is appended to the anndata var, defaults to None
name – Name of the columns, defaults to “gene”
by_name – Aggregate counts by name instead of ID. t2g_path must be provided and contain names.
- Returns
A new Anndata object
- kb_python.utils.overlay_anndatas(adata_spliced: anndata.AnnData, adata_unspliced: anndata.AnnData) -> anndata.AnnData
‘Overlays’ anndata objects by taking the intersection of the obs and var of each anndata.
Note
Matrices generated by kallisto | bustools always contain all genes, even if they have zero counts. Therefore, taking the intersection is not entirely necessary but is done as a sanity check.
- Parameters
adata_spliced – An Anndata object
adata_unspliced – An Anndata object
- Returns
A new Anndata object
- kb_python.utils.sum_anndatas(adata_spliced: anndata.AnnData, adata_unspliced: anndata.AnnData) -> anndata.AnnData
Sum the counts in two anndata objects by taking the intersection of both matrices and adding the values together.
Note
Matrices generated by kallisto | bustools always contain all genes, even if they have zero counts. Therefore, taking the intersection is not entirely necessary but is done as a sanity check.
- Parameters
adata_spliced – An Anndata object
adata_unspliced – An Anndata object
- Returns
A new Anndata object
- kb_python.utils.restore_cwd(func: Callable) -> Callable
Function decorator to decorate functions that change the current working directory. When such a function is decorated with this function, the current working directory is restored to its previous state when the function exits.
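A decorator with this behaviour can be sketched in a few lines. The name `restore_cwd_sketch` is hypothetical, and the version below is an illustration of the described contract rather than kb_python's exact code.

```python
import functools
import os
from typing import Callable


def restore_cwd_sketch(func: Callable) -> Callable:
    """Restore the working directory after `func` returns or raises."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        old_cwd = os.getcwd()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on normal return and on exceptions alike.
            os.chdir(old_cwd)

    return wrapper


@restore_cwd_sketch
def list_directory(path: str):
    os.chdir(path)
    return os.listdir('.')
```

Using `try`/`finally` means the directory is restored even when the wrapped function fails partway through.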