PathIntegrate
PathIntegrate(
omics_data: dict, metadata, pathway_source, sspa_scoring = sspa.sspa_SVD,
min_coverage = 3
)
PathIntegrate class for multi-omics pathway integration.
Args
- omics_data (dict) : Dictionary of omics data. Keys are omics names, values are pandas DataFrames in which rows are samples and columns are features.
- metadata (pandas.Series) : Metadata for samples. Index is sample names, values are class labels.
- pathway_source (pandas.DataFrame) : GMT style pathway source data. Must contain column 'Pathway_name'.
- sspa_scoring (object, optional) : Scoring method for ssPA. Defaults to sspa.sspa_SVD. Options are sspa.sspa_SVD, sspa.sspa_ssGSEA, sspa.sspa_KPCA, sspa.sspa_ssClustPA, sspa.sspa_zscore.
- min_coverage (int, optional) : Minimum number of molecules required in a pathway. Defaults to 3.
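To illustrate how `pathway_source` and `min_coverage` might interact, the sketch below builds a pathway dictionary from GMT-style rows and keeps only pathways with at least `min_coverage` measured molecules. It uses plain Python rather than pandas. The helper name `filter_pathways`, the `molecules` key, and the exact counting rule are assumptions made for illustration, not the library's implementation.

```python
def filter_pathways(gmt_rows, measured, min_coverage=3):
    """Keep pathways with at least `min_coverage` measured molecules.

    gmt_rows: list of dicts with a 'Pathway_name' entry and a hypothetical
              'molecules' entry holding the pathway's member identifiers.
    measured: iterable of feature identifiers present in the omics data.
    """
    measured = set(measured)
    kept = {}
    for row in gmt_rows:
        # Ignore empty padding cells, then keep only measured molecules
        hits = [m for m in row["molecules"] if m and m in measured]
        if len(hits) >= min_coverage:
            kept[row["Pathway_name"]] = hits
    return kept


rows = [
    {"Pathway_name": "Glycolysis",
     "molecules": ["glucose", "pyruvate", "lactate", "ATP"]},
    {"Pathway_name": "TCA cycle",
     "molecules": ["citrate", "succinate", "", ""]},
]
features = ["glucose", "pyruvate", "lactate", "citrate"]

filtered = filter_pathways(rows, features, min_coverage=3)
print(filtered)
```

With these toy inputs only the first pathway has three measured molecules, so the second is dropped. Lowering `min_coverage` keeps more pathways at the cost of scores based on very few features.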
Attributes
- omics_data (dict) : Dictionary of omics data. Keys are omics names, values are pandas DataFrames.
- omics_data_scaled (dict) : Dictionary of omics data scaled to mean 0 and unit variance. Keys are omics names, values are pandas DataFrames.
- metadata (pandas.Series) : Metadata for samples. Index is sample names, values are class labels.
- pathway_source (pandas.DataFrame) : Pathway source data.
- pathway_dict (dict) : Dictionary of pathways. Keys are pathway names, values are lists of molecules.
- sspa_scoring (object) : Scoring method for ssPA.
- min_coverage (int) : Minimum number of molecules required to cover a pathway.
- sspa_method (object) : ssPA scoring method.
- sspa_scores_mv (dict) : Dictionary of ssPA scores for each omics data. Keys are omics names, values are pandas DataFrames.
- sspa_scores_sv (pandas.DataFrame) : ssPA scores for all omics data concatenated.
- coverage (dict) : Dictionary of pathway coverage. Keys are pathway names, values are number of omics covering the pathway.
- mv (object) : Fitted MultiView model.
- sv (object) : Fitted SingleView model.
- labels (pandas.Series) : Class labels for samples. Index is sample names, values are class labels.
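The `omics_data_scaled` attribute is described as each omics table scaled to mean 0 and unit variance. A minimal sketch of that transformation on plain Python lists follows. The helper name `scale_columns`, the use of the population standard deviation, and the handling of constant features are assumptions, not details confirmed by the source.

```python
from statistics import mean, pstdev


def scale_columns(table):
    """Scale each feature (column) to mean 0 and unit variance.

    table: dict mapping feature name -> list of values, one per sample.
    """
    scaled = {}
    for feature, values in table.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (assumption)
        # Constant features have zero variance; map them to 0.0 instead of dividing by zero
        scaled[feature] = [(v - mu) / sigma if sigma else 0.0 for v in values]
    return scaled


omics_data = {
    "metabolomics": {"glucose": [1.0, 2.0, 3.0], "lactate": [4.0, 4.0, 4.0]},
}
omics_data_scaled = {name: scale_columns(t) for name, t in omics_data.items()}
print(omics_data_scaled["metabolomics"]["glucose"])
```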
Methods:
.get_multi_omics_coverage
.get_multi_omics_coverage()
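The `coverage` attribute is described as mapping each pathway name to the number of omics covering it. The sketch below shows one way such a count could be computed, assuming a pathway counts as covered by an omics layer when that layer has a score for it. The function name `count_omics_coverage` and the input layout are hypothetical, and whether `get_multi_omics_coverage` works this way is not confirmed by the source.

```python
from collections import Counter


def count_omics_coverage(scores_by_omics):
    """Count, for each pathway, how many omics layers include it.

    scores_by_omics: dict mapping omics name -> iterable of pathway names
                     scored in that omics layer.
    """
    coverage = Counter()
    for pathways in scores_by_omics.values():
        # set() ensures a pathway is counted once per omics layer
        coverage.update(set(pathways))
    return dict(coverage)


scores_by_omics = {
    "metabolomics": ["Glycolysis", "TCA cycle"],
    "proteomics": ["Glycolysis", "Apoptosis"],
}
coverage = count_omics_coverage(scores_by_omics)
print(coverage)
```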
.MultiView
.MultiView(
ncomp = 2
)
Fits a PathIntegrate MultiView model using MBPLS.
Args
- ncomp (int, optional) : Number of components. Defaults to 2.
Returns
- object : Fitted PathIntegrate MultiView model.
.SingleView
.SingleView(
model = sklearn.linear_model.LogisticRegression, model_params = None
)
Fits a PathIntegrate SingleView model using a scikit-learn-compatible predictive model.
Args
- model (object, optional) : scikit-learn prediction model class. Defaults to sklearn.linear_model.LogisticRegression.
- model_params (dict, optional) : Model-specific hyperparameters. Defaults to None.
Returns
- object : Fitted PathIntegrate SingleView model.
.SingleViewClust
.SingleViewClust(
model = sklearn.cluster.KMeans, n_clusters_range = (2, 10), model_params = None,
use_pca = True, pca_params = None, consensus_clustering = False, n_runs = 10,
auto_n_clusters = False, subsample_fraction = 0.8, return_plot = False,
return_ground_truth_plot = False, return_confusion_matrix = False,
return_metrics_table = False
)
Fits an unsupervised PathIntegrate SingleView model using a scikit-learn-compatible clustering model (KMeans by default). Credit: Jude Popham
Args
- model (object, optional) : scikit-learn clustering model class. Defaults to sklearn.cluster.KMeans.
- model_params (dict, optional) : Model-specific hyperparameters. Defaults to None.
- use_pca (bool, optional) : Whether to perform PCA before clustering. Defaults to True.
- pca_params (dict, optional) : PCA-specific hyperparameters. Defaults to None.
- consensus_clustering (bool, optional) : Whether to perform consensus clustering. Defaults to False.
- n_runs (int, optional) : Number of runs for consensus clustering. Defaults to 10.
- auto_n_clusters (bool, optional) : Automatically determine the optimal number of clusters. Defaults to False.
- n_clusters_range (tuple, optional) : Range of cluster numbers to evaluate for optimal clusters. Defaults to (2, 10).
- subsample_fraction (float, optional) : Fraction of samples to use for each consensus clustering run. Defaults to 0.8.
- return_plot (bool, optional) : Whether to return a plot of the clustering result. Defaults to False.
- return_ground_truth_plot (bool, optional) : Whether to return a plot comparing the clustering result with ground truth. Defaults to False.
- return_confusion_matrix (bool, optional) : Whether to return a plot comparing different clustering algorithms. Defaults to False.
- return_metrics_table (bool, optional) : Whether to return a table of clustering evaluation metrics. Defaults to False.
Returns
- object : Fitted PathIntegrate SingleView Clustering model with various plots saved inside.
.SingleViewDimRed
.SingleViewDimRed(
model = sklearn.decomposition.PCA, model_params = None, return_pca_plot = False,
return_tsne_plot = False, return_biplot = False, return_loadings_plot = False,
return_tsne_density_plot = False, metadata_continuous = False
)
Applies a dimensionality reduction technique to the input data. Credit: Jude Popham
Args
- model (object, optional) : The dimensionality reduction model to use. Defaults to sklearn.decomposition.PCA.
- model_params (dict, optional) : Model-specific hyperparameters. Defaults to None.
- return_pca_plot (bool, optional) : Whether to return a PCA scatter plot of the first two principal components. Defaults to False.
- return_tsne_plot (bool, optional) : Whether to return a t-SNE scatter plot of the first two components. Defaults to False.
- return_biplot (bool, optional) : Whether to return a biplot (PCA plot with loadings). Defaults to False.
- return_loadings_plot (bool, optional) : Whether to return a plot of the top loadings for each component. Defaults to False.
- return_tsne_density_plot (bool, optional) : Whether to return a t-SNE scatter plot with a density overlay. Defaults to False.
- metadata_continuous (bool, optional) : Whether metadata is continuous (True) or categorical (False). Defaults to False.
Returns
- object : Fitted dimensionality reduction model with reduced data and optional plots.
.convert_range_to_midpoint
.convert_range_to_midpoint(
value
)
Converts a range like '10-20' into its midpoint value '15'.
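A minimal sketch of the conversion described above, written as a standalone function. The name `range_to_midpoint` is hypothetical, and two behaviours are assumptions rather than confirmed details of the library: the midpoint is returned as a float, and values that are not numeric ranges are passed through unchanged.

```python
def range_to_midpoint(value):
    """Return the midpoint of a 'low-high' string; pass other values through."""
    if isinstance(value, str) and "-" in value:
        low, high = value.split("-", 1)
        try:
            return (float(low) + float(high)) / 2
        except ValueError:
            # Not a numeric range (for example 'non-smoker'); leave it untouched
            return value
    return value


print(range_to_midpoint("10-20"))
print(range_to_midpoint("non-smoker"))
print(range_to_midpoint(42))
```

A conversion like this is typically useful when metadata such as age is recorded in bins and a continuous value is needed downstream.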
.SingleViewCV
.SingleViewCV(
model = sklearn.linear_model.LogisticRegression, model_params = None,
cv_params = None
)
Cross-validation for SingleView model.
Args
- model (object, optional) : scikit-learn prediction model class. Defaults to sklearn.linear_model.LogisticRegression.
- model_params (dict, optional) : Model-specific hyperparameters. Defaults to None.
- cv_params (dict, optional) : Cross-validation parameters. Defaults to None.
Returns
- object : Cross-validation results.
.SingleViewGridSearchCV
.SingleViewGridSearchCV(
param_grid, model = sklearn.linear_model.LogisticRegression,
grid_search_params = None
)
Grid search cross-validation for SingleView model.
Args
- param_grid (dict) : Hyperparameter grid to search over.
- model (object, optional) : scikit-learn prediction model class. Defaults to sklearn.linear_model.LogisticRegression.
- grid_search_params (dict, optional) : Grid search parameters. Defaults to None.
Returns
- object : GridSearchCV object.
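Assuming `param_grid` follows the scikit-learn `GridSearchCV` convention (a dictionary mapping hyperparameter names to lists of candidate values), the sketch below enumerates the combinations such a grid implies. It uses only the standard library, the helper name `expand_grid` is hypothetical, and the hyperparameter names are example values for a logistic regression model.

```python
from itertools import product


def expand_grid(param_grid):
    """List every hyperparameter combination in a name -> values grid."""
    names = sorted(param_grid)
    value_lists = (param_grid[name] for name in names)
    # The Cartesian product yields one candidate setting per combination
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


param_grid = {"C": [0.1, 1.0, 10.0], "penalty": ["l1", "l2"]}
candidates = expand_grid(param_grid)
print(len(candidates))
print(candidates[0])
```

The number of candidates is the product of the list lengths, so each added hyperparameter multiplies the number of model fits per cross-validation fold.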
.MultiViewCV
.MultiViewCV()
Cross-validation for MultiView model.
Returns
- object : Cross-validation results.
.MultiViewGridSearchCV
.MultiViewGridSearchCV()