Format uniprot annotation

format_uniprot_annotation[source]

format_uniprot_annotation(uniprot_ann:DataFrame, uniprot_feature_dict:dict)

Function to format uniprot annotation for plotting.

Args:
    uniprot_ann (pd.DataFrame): Formatted uniprot annotations from alphamap.
    uniprot_feature_dict (dict): Uniprot feature dictionary defined by alphamap.

Returns: pd.DataFrame: Uniprot annotation with a combined structure entry for helix, strand and turn.
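The "combined structure entry" can be pictured with a small dependency-free sketch. It is not alphamap's implementation: the row keys (`feature`, `note`), the feature labels and the helper name `combine_structure_features` are all hypothetical, and plain dicts stand in for DataFrame rows.

```python
# Hypothetical feature labels and row keys, chosen for illustration only.
SECONDARY_STRUCTURE = {"HELIX", "STRAND", "TURN"}


def combine_structure_features(rows):
    """Return new rows in which helix, strand and turn share one 'STRUCTURE' feature."""
    combined = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if row["feature"] in SECONDARY_STRUCTURE:
            new_row["note"] = row["feature"].capitalize()  # remember the original type
            new_row["feature"] = "STRUCTURE"
        combined.append(new_row)
    return combined


annotation = [
    {"feature": "HELIX", "start": 5, "end": 12},
    {"feature": "DOMAIN", "start": 1, "end": 80},
    {"feature": "TURN", "start": 13, "end": 15},
]
formatted = combine_structure_features(annotation)
```

Keeping the original type in a separate field means the three kinds of secondary structure can share one plot track and still be told apart.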

Format data for plotting

get_plot_data[source]

get_plot_data(protein, df, fasta)

Function to format experimental data for plotting.

Args:
    protein (str): Uniprot protein accession.
    df (pd.DataFrame): Experimental data imported and formatted according to alphamap standards.
    fasta (fasta): Fasta file imported by pyteomics 'fasta.IndexedUniProt'.

Returns: pd.DataFrame: Formatted dataframe for plotting.
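A sequence plot needs, for each peptide, the position it occupies in the protein sequence. The sketch below shows one way such positions could be derived. It is an assumption about the kind of information involved, not a description of what get_plot_data does internally, and the helper name `peptide_positions` and its output keys are hypothetical.

```python
def peptide_positions(sequence, peptides):
    """Locate each peptide in the protein sequence (0-based, inclusive end).

    Peptides that do not occur in the sequence are skipped.
    Only the first occurrence of each peptide is reported.
    """
    rows = []
    for peptide in peptides:
        start = sequence.find(peptide)
        if start == -1:
            continue  # peptide not found in this protein
        rows.append({"peptide": peptide, "start": start, "end": start + len(peptide) - 1})
    return rows


# Toy sequence and peptides, not taken from a real dataset.
positions = peptide_positions("MKTAYIAKQR", ["TAYIAK", "XXX"])
```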

Function to plot a single dataset

plot_single_peptide_traces[source]

plot_single_peptide_traces(df_plot, protein, fasta)

Function to plot single peptide trace.

Args:
    df_plot (pd.DataFrame): Formatted dataframe for plotting, generated by get_plot_data.
    protein (str): Uniprot protein accession.
    fasta (fasta): Fasta file imported by pyteomics 'fasta.IndexedUniProt'.

Returns: go.Figure: Figure data for a single dataset.

Plotting function for the full sequence plot

get_quality_category[source]

get_quality_category(s)
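This function is undocumented here. Its name suggests that it bins an AlphaFold per-residue confidence score (pLDDT) into a category. The sketch below uses the confidence bands shown in the AlphaFold Protein Structure Database (above 90, 70 to 90, 50 to 70, below 50). Whether alphamap uses the same cut-offs and labels, and what exactly the argument `s` holds, are assumptions. `plddt_category` is a hypothetical helper.

```python
def plddt_category(plddt):
    """Map a pLDDT score (0-100) to a confidence band (assumed cut-offs)."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


categories = [plddt_category(score) for score in (95.2, 81.0, 63.5, 12.0)]
```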

get_exposure_category[source]

get_exposure_category(s)

get_alphafold_annotation[source]

get_alphafold_annotation(protein:str, selected_features:list, download_folder:str='/var/folders/zx/n29r0swn0hddt1qgy159sjrw0000gn/T')

# af_annotation = get_alphafold_annotation(protein= "Q00266", 
#                              selected_features= ["AlphaFold confidence",  
#                                                    "AlphaFold exposure", 
#                                                    "AlphaFold IDR", 
#                                                    "AlphaFold secondary structures"], 
#                              download_folder= '/Users/isabell/Downloads/') 

plot_peptide_traces[source]

plot_peptide_traces(df:DataFrame, name:str, protein:str, fasta:fasta, uniprot:DataFrame, selected_features:list, uniprot_feature_dict:dict, uniprot_color_dict:dict, selected_proteases:list=[], selected_alphafold_features:list=[], dashboard:bool=False, trace_colors:list=[], download_folder:str='/var/folders/zx/n29r0swn0hddt1qgy159sjrw0000gn/T')

Function to generate the sequence plot.

Args:
    df (pd.DataFrame/list): Single dataframe or list of dataframes containing the datasets to plot.
    name (str/list): Single string or list of strings containing the names for each dataset in df.
    protein (str): Uniprot protein accession.
    fasta (fasta): Fasta file imported by pyteomics 'fasta.IndexedUniProt'.
    uniprot (pd.DataFrame): Uniprot annotations formatted by alphamap.
    selected_features (list): List of uniprot features to plot.
    uniprot_feature_dict (dict): Uniprot feature dictionary.
    uniprot_color_dict (dict): Uniprot color dictionary.
    selected_proteases (list, optional): List of proteases to plot. Default is an empty list.
    dashboard (bool, optional): Flag if the function is called from the dashboard. Default is 'False'.
    trace_colors (list, optional): List of manually selected colors for each dataset in df. Default is an empty list.

Returns: go.Figure: Sequence plot.
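Because df and name each accept either a single value or a list, a caller (or the function itself) has to bring both into the same shape before pairing datasets with their labels. The following is a minimal sketch of that normalization. The helpers `as_list` and `pair_datasets` are hypothetical and are not part of alphamap.

```python
def as_list(value):
    """Wrap a single value in a list; turn lists and tuples into a new list."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def pair_datasets(df, name):
    """Pair each dataset with its name, accepting single values or lists."""
    datasets, names = as_list(df), as_list(name)
    if len(datasets) != len(names):
        raise ValueError(
            f"Got {len(datasets)} dataset(s) but {len(names)} name(s); they must match."
        )
    return list(zip(names, datasets))


# Placeholder objects stand in for pandas DataFrames.
single = pair_datasets("dataframe_a", "proteome")
several = pair_datasets(["dataframe_a", "dataframe_b"], ["proteome", "phospho"])
```

Checking the lengths up front gives a clear error instead of silently dropping a dataset when the two lists disagree.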

3D visualization

extract_annotation[source]

extract_annotation(df:DataFrame)

manipulate_cif[source]

manipulate_cif(protein:str, MS_data:DataFrame, download_folder:str='/var/folders/zx/n29r0swn0hddt1qgy159sjrw0000gn/T')

adjust_html[source]

adjust_html(protein:str, coloring:str, data_name:str)

get_ms_concensus[source]

get_ms_concensus(ms_list)

plot_3d_structure[source]

plot_3d_structure(df:DataFrame, name:str, protein:str, fasta:fasta, selected_coloring:str, dashboard:bool=False, download_folder:str='/var/folders/zx/n29r0swn0hddt1qgy159sjrw0000gn/T')

Function to generate the 3D sequence plot.

Args:
    df (pd.DataFrame/list): Single dataframe or list of dataframes containing the datasets to plot.
    name (str): Single string containing the name of the MS dataset in df.
    protein (str): Uniprot protein accession.
    fasta (fasta): Fasta file imported by pyteomics 'fasta.IndexedUniProt'.
    selected_coloring (str): Coloring to show.
    dashboard (bool, optional): Flag if the function is called from the dashboard. Default is 'False'.

Returns: go.Figure: 3D plot.

format_for_3Dviz[source]

format_for_3Dviz(df:DataFrame, ptm_dataset:str)

Function to format data for 3D visualization.

Args:
    df (pd.DataFrame): Single dataframe containing PTM data for visualization, formatted according to StructureMap.
    ptm_dataset (str): Single string containing the name of the target PTM column in df.

Returns: pd.DataFrame: DataFrame containing the formatted PTM data.

visualize_structure_in_panel[source]

visualize_structure_in_panel(plot_html:str, js_path:str, cif_path:str)

# mod_html, js_path, cif_path = plot_3d_structure(df = formatted_diann_data,
#                     name = 'proteome',
#                     protein = "P23284", #Q9H7C9, P23284
#                     fasta = human_fasta,
#                     selected_coloring = 'AlphaFold quality', #'AlphaFold exposure', #'AlphaFold quality', 'MS data'
#                     dashboard = False)

# visualize_structure_in_panel(mod_html, js_path, cif_path)
# 'alphafold_quality'
# 'alphafold_exposure'
# 'alphafold_IDR'
# 'alphafold_secondary_structures'
# 'MS data'
# 'MS PTMs'
# 'MS amino acids'

plot_3d_structuremap[source]

plot_3d_structuremap(df:DataFrame, organism:str, protein:str, ptm_type:str)

Function to return a single 3D plot.

Args:
    df (pd.DataFrame): Single dataframe containing PTM data for visualization.
    organism (str): String specifying the organism.
    protein (str): String specifying the UniProt protein accession.
    ptm_type (str): String of the PTM type to visualize.

Returns: go.Figure: 3D plot.

Create a pdf report

create_pdf_report[source]

create_pdf_report(proteins:list, df:DataFrame, name:str, fasta:fasta, uniprot:DataFrame, selected_features:list, uniprot_feature_dict:dict, uniprot_color_dict:dict, selected_proteases:list=[], trace_colors:list=[])

Function to write pdf reports for selected proteins.

Args:
    proteins (list): List of uniprot protein accessions.
    df (pd.DataFrame/list): Single dataframe or list of dataframes containing the datasets to plot.
    name (str/list): Single string or list of strings containing the names for each dataset in df.
    fasta (fasta): Fasta file imported by pyteomics 'fasta.IndexedUniProt'.
    uniprot (pd.DataFrame): Uniprot annotations formatted by alphamap.
    selected_features (list): List of uniprot features to plot.
    uniprot_feature_dict (dict): Uniprot feature dictionary.
    uniprot_color_dict (dict): Uniprot color dictionary.
    selected_proteases (list, optional): List of proteases to plot. Default is an empty list.
    trace_colors (list, optional): List of manually selected colors for each dataset in df. Default is an empty list.

Returns: BytesIO: BytesIO object for writing a pdf report.
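Since the report is returned as an in-memory BytesIO object rather than written to disk, the caller decides where it ends up. A minimal sketch of saving such a buffer to a file follows. `save_report` is a hypothetical helper, and a placeholder byte string stands in for the real output of create_pdf_report.

```python
import io
import tempfile
from pathlib import Path


def save_report(buffer, path):
    """Write the full contents of an in-memory buffer to path; return bytes written."""
    # getvalue() returns the whole buffer regardless of the current read position.
    return Path(path).write_bytes(buffer.getvalue())


# Stand-in for the object returned by create_pdf_report (not a valid PDF).
report = io.BytesIO(b"%PDF-1.4 placeholder")

with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "report.pdf"
    n_written = save_report(report, target)
    round_trip = target.read_bytes()
```

The same buffer could instead be handed to a web framework's download widget, which is presumably why the function returns bytes rather than a file path.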