alphapepttools.io.read_psm_table
- alphapepttools.io.read_psm_table(file_paths, search_engine, level='proteins', *, intensity_column=None, feature_id_column=None, sample_id_column=None, var_columns=None, obs_columns=None, **reader_kwargs)
Read peptide spectrum match tables into the anndata.AnnData format.
Read peptide spectrum match (PSM) tables from proteomics search engines into the anndata.AnnData format (observations x features). By default, raw protein intensities are returned. Custom columns can additionally be selected for retention in the resulting AnnData object.
Note: The underlying pivoting function aggregates metadata in a “first” manner, meaning that information is lost if the metadata is finer grained than the feature level. For example, setting feature_id_column="protein_ids" and including peptide sequences in var_columns produces a protein-level AnnData object with only one peptide sequence per protein, which is likely not desired. Therefore, ensure that the metadata you want to retain actually applies at the feature level.
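The information loss described in the note can be reproduced without the library. The following is a minimal plain-Python sketch of “first” aggregation, not the library's actual pivoting code; the rows and the helper name aggregate_first are made up for illustration.

```python
# Hypothetical long-format PSM rows: two peptides map to the same protein.
rows = [
    {"protein_ids": "P1", "sequence": "PEPTIDEA"},
    {"protein_ids": "P1", "sequence": "PEPTIDEB"},
    {"protein_ids": "P2", "sequence": "PEPTIDEC"},
]


def aggregate_first(rows, feature_key, meta_key):
    """Keep only the first metadata value seen for each feature."""
    first_seen = {}
    for row in rows:
        # setdefault stores the value only if the feature is not present yet,
        # so later values for the same feature are silently dropped.
        first_seen.setdefault(row[feature_key], row[meta_key])
    return first_seen


var_metadata = aggregate_first(rows, "protein_ids", "sequence")
print(var_metadata)  # {'P1': 'PEPTIDEA', 'P2': 'PEPTIDEC'}
```

The second peptide of P1 (PEPTIDEB) does not appear in the result, which is the loss the note warns about.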
Supported formats include:
- AlphaDIA (alphadia)
- AlphaPept (alphapept)
- DIANN (diann)
- MaxQuant (maxquant)
- Spectronaut (spectronaut, parquet + tsv)
- Parameters:
  - file_paths (str | list[str]) – Path to peptide spectrum match reports. If a list of reports is passed, all must be from the same search engine.
  - search_engine (str) – Name of the search engine that generated the output; pass the method name of the corresponding reader.
  - level (str (default: 'proteins')) – Level of quantification to read. One of “proteins”, “precursors”, or “genes”. Defaults to “proteins”.
  - intensity_column (Optional[str] (default: None)) – Column that holds the quantified intensities in the PSM table. Defaults to the pre-configured protein intensities value in alphabase.
  - feature_id_column (Optional[str] (default: None)) – Column that holds the feature identifier in the PSM table. Defaults to proteins and the pre-configured value in alphabase.
  - sample_id_column (Optional[str] (default: None)) – Column that holds the sample identifier in the PSM table. Defaults to the pre-configured value in alphabase.
  - var_columns (Union[str, list[str], None] (default: None)) – Additional columns to annotate features in the adata.var table. Can be a single column name or a list of column names. Defaults to None.
  - obs_columns (Union[str, list[str], None] (default: None)) – Additional columns to annotate observations in the adata.obs table. Can be a single column name or a list of column names. Defaults to None.
  - **reader_kwargs – Keyword arguments passed to alphabase.psm_reader.psm_reader_provider.get_reader()
- Return type: anndata.AnnData
- Returns: AnnData object that can be further processed with scVerse packages.
  - adata.X: Stores the values of the intensity column in the report, with shape observations x features.
  - adata.obs: Stores observations, with the protein group matrix sample names as the sample_id column.
  - adata.var: Stores features and feature metadata with standardized alphabase names.
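To make the observations x features layout concrete, here is a minimal plain-Python sketch of pivoting a long-format report into a matrix. It only illustrates the shape of the result and is not how the library builds the AnnData object; the data, the helper name pivot_long_table, and the use of NaN for missing values are assumptions.

```python
import math

# Hypothetical long-format report: one row per (sample, protein, intensity).
long_rows = [
    ("sample_1", "P1", 10.0),
    ("sample_1", "P2", 20.0),
    ("sample_2", "P1", 30.0),
]


def pivot_long_table(rows):
    """Pivot (sample, feature, intensity) rows into observations x features."""
    samples = sorted({sample for sample, _, _ in rows})
    features = sorted({feature for _, feature, _ in rows})
    row_index = {sample: i for i, sample in enumerate(samples)}
    col_index = {feature: j for j, feature in enumerate(features)}
    # Assumption for this sketch: unobserved combinations are filled with NaN.
    matrix = [[math.nan] * len(features) for _ in samples]
    for sample, feature, intensity in rows:
        matrix[row_index[sample]][col_index[feature]] = intensity
    return samples, features, matrix


# obs_names plays the role of adata.obs, var_names of adata.var, X of adata.X.
obs_names, var_names, X = pivot_long_table(long_rows)
print(obs_names)  # ['sample_1', 'sample_2']
print(var_names)  # ['P1', 'P2']
```

Each row of X corresponds to one sample and each column to one feature, matching the observations x features shape described above.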
Example
import alphapepttools as at
alphadia_path = ...
adata = at.io.read_psm_table(alphadia_path, search_engine="alphadia")
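As the file_paths description states, several reports from the same search engine can be passed in one call. The following is a minimal sketch, assuming DIANN reports; the wrapper name read_diann_reports is hypothetical, and the import is deferred so that defining the function does not require the package to be installed.

```python
def read_diann_reports(report_paths):
    """Read one or more DIANN reports into a protein-level AnnData object.

    Hypothetical convenience wrapper; report_paths is a str or a list of str.
    """
    # Deferred import: alphapepttools is only needed when the function is called.
    import alphapepttools as at

    return at.io.read_psm_table(
        report_paths,
        search_engine="diann",  # all reports must come from this search engine
        level="proteins",  # the documented default, spelled out for clarity
    )
```

Spelling out level makes the quantification level explicit at the call site, even though it matches the default.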
See also
alphabase.psm_reader