alphapepttools.tl.prepare_scree_data_to_plot
- alphapepttools.tl.prepare_scree_data_to_plot(adata, n_pcs, dim_space, embeddings_name=None)
Prepare scree plot data from an AnnData object.
- Parameters:
- Return type:
DataFrame
- Returns:
pd.DataFrame: DataFrame with PC numbers and explained variance values.
Examples
Prepare data for a scree plot after running PCA:
    import anndata as ad
    import pandas as pd
    import numpy as np
    import alphapepttools as at

    # Create a 5x5 dataset where 4 proteins are core (no missing values)
    X = np.array(
        [
            [10.5, 12.3, 11.8, 9.2, np.nan],  # Sample 1
            [11.2, 13.1, 12.5, 10.1, 7.5],  # Sample 2
            [9.8, 11.9, 10.2, 8.9, np.nan],  # Sample 3
            [12.1, 14.2, 13.3, 11.3, 8.2],  # Sample 4
            [10.9, 12.7, 11.5, 9.8, np.nan],  # Sample 5
        ]
    )
    adata = ad.AnnData(
        X=X,
        obs=pd.DataFrame({"sample": ["S1", "S2", "S3", "S4", "S5"]}),
        var=pd.DataFrame({"protein": ["P1", "P2", "P3", "P4", "P5"], "is_core": [True, True, True, True, False]}),
    )

    # Run PCA on observation space (samples)
    at.tl.pca(adata, meta_data_mask_column_name="is_core", n_comps=2, dim_space="obs")

    # Prepare scree plot data
    scree_data = at.tl.prepare_scree_data_to_plot(adata, n_pcs=2, dim_space="obs")
    display(scree_data)

    # DataFrame contains:
    # - PC: Principal component number (1, 2)
    # - explained_variance: Proportion of variance explained (0-1)
    # - explained_variance_percent: Variance explained as percentage (0-100)
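To make the three output columns more concrete, the following is a minimal sketch of how such a table could be built by hand with NumPy and pandas. It is not the alphapepttools implementation: `scree_table` is a hypothetical helper, and it assumes a standard PCA (column-wise mean centering followed by an SVD) on the four core proteins of the example above. The values returned by `at.tl.pca` may differ if the library scales or preprocesses the data differently.

```python
import numpy as np
import pandas as pd

# The four complete ("core") protein columns from the example above.
X_core = np.array(
    [
        [10.5, 12.3, 11.8, 9.2],
        [11.2, 13.1, 12.5, 10.1],
        [9.8, 11.9, 10.2, 8.9],
        [12.1, 14.2, 13.3, 11.3],
        [10.9, 12.7, 11.5, 9.8],
    ]
)


def scree_table(X, n_pcs):
    """Hypothetical helper (not part of alphapepttools).

    Builds a scree table from a centered SVD, assuming standard PCA.
    """
    # Center each column so the SVD describes variance around the mean.
    X_centered = X - X.mean(axis=0)
    # Singular values come back sorted in descending order.
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    # Variance carried by each component, then its share of the total.
    variances = singular_values**2 / (X.shape[0] - 1)
    ratios = variances / variances.sum()
    return pd.DataFrame(
        {
            "PC": np.arange(1, n_pcs + 1),
            "explained_variance": ratios[:n_pcs],
            "explained_variance_percent": ratios[:n_pcs] * 100,
        }
    )


scree_sketch = scree_table(X_core, n_pcs=2)
print(scree_sketch)
```

The column names mirror those listed in the example's comments, so a table like this can stand in for `scree_data` when experimenting with plotting code without running the full pipeline.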