Input data for PathIntegrate
Omics data
Each multi-omics dataset should be in the form of a pandas DataFrame with samples as rows and molecules as columns. The index of the DataFrame should be the sample IDs, and the columns should be the molecule IDs.
The molecule IDs should match those used by the chosen pathway database (for example, ChEBI, UniProt, and Ensembl gene IDs for Reactome, or KEGG IDs for KEGG). The values in the DataFrame should be the molecule abundances for each sample.
Note
The omics data should be pre-processed before it is passed to PathIntegrate. PathIntegrate standardises the data automatically, but log-transforming the data beforehand is recommended.
sample_id | 1372 | 16610 | 72665 | 30915 | 37373 | Group |
---|---|---|---|---|---|---|
INCOV092-BL | 1.541009 | 1.228611 | 1.224076 | 1.962028 | 0.652984 | COVID |
INCOV107-BL | 0.910486 | 2.169111 | 2.819585 | 1.234384 | 1.453066 | COVID |
INCOV020-BL | 0.831297 | 0.23298 | 2.126393 | 0.861793 | 2.877589 | COVID |
INCOV035-BL | 1.862011 | 0.792962 | 1.434183 | 1.223473 | 0.706152 | COVID |
INCOV122-BL | 1.416927 | 2.493762 | 1.77004 | 0.888144 | 0.693444 | Non-covid |
INCOV084-BL | 1.622171 | 1.021112 | 2.323956 | 0.573877 | 0.764003 | Non-covid |
INCOV086-BL | 1.610941 | 1.205343 | 0.83498 | 2.600065 | 1.700068 | Non-covid |
INCOV133-BL | 0.83727 | 2.144127 | 1.24222 | 1.035411 | 2.037335 | Non-covid |
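As a minimal sketch of the layout described above, the snippet below builds a small omics table with pandas, separates the `Group` label column from the abundances, and applies a log transform. The sample IDs and abundance values are made up for illustration, and the choice of log2 with a pseudocount of 1 is an assumption rather than a PathIntegrate requirement.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: samples as rows, ChEBI IDs as columns,
# plus a "Group" column holding the sample labels.
data = pd.DataFrame(
    {
        "1372": [120.0, 85.5, 64.2],
        "16610": [30.1, 410.7, 12.9],
        "72665": [55.0, 230.4, 150.8],
        "Group": ["COVID", "COVID", "Non-covid"],
    },
    index=pd.Index(["S1", "S2", "S3"], name="sample_id"),
)

# Keep the labels separate so that only numeric columns are transformed.
groups = data["Group"]
abundances = data.drop(columns="Group")

# Store molecule IDs as strings so they compare equal to IDs read from a pathway file.
abundances.columns = abundances.columns.astype(str)

# Log-transform (assumption: log2 with a pseudocount of 1 to avoid log(0)).
omics = np.log2(abundances + 1)

print(omics.round(3))
```

Keeping the labels in a separate Series that shares the sample index makes it easy to check that every sample has both abundances and a label.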
Pathways
Pathways should be in the form of a pandas DataFrame with one pathway per row. The index of the DataFrame should be the pathway IDs, and the first column should be the pathway names or descriptions. The remaining columns hold the IDs of the molecules that belong to each pathway; pathways with fewer molecules leave the trailing columns empty.
Pathways can be from any pathway database, but the molecule IDs should match those of the omics data.
Each pathway can either contain molecules from a single omics, or a combination of omics.
Note
Pathways can be downloaded using the sspa package.
 | Pathway_name | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|---|
1 | Pathway_52 | Q13554 | P61289 | P05114 | P62081 | P54760 |
2 | Pathway_53 | Q9Y243 | P17252 | | | |
3 | Pathway_54 | 16708 | P06732 | P61289 | O00220 | O75914 |
4 | Pathway_55 | O15264 | P25786 | | | |
5 | Pathway_56 | P07858 | P62979 | Q9Y625 | P14778 | P12314 |
6 | Pathway_57 | P18510 | P15260 | Q13557 | P32942 | P04818 |
7 | Pathway_58 | P00738 | P37023 | P01588 | P63098 | P05362 |
8 | Pathway_59 | P52798 | P15498 | | | |
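Because the pathway molecule IDs must match the omics column names, it can help to count how many members of each pathway were actually measured before running an analysis. The following is a rough sketch using only pandas; the pathway IDs, pathway names, and measured ID lists are hypothetical, and this check is not part of the PathIntegrate API.

```python
import pandas as pd

# Hypothetical pathway table: pathway IDs as the index, the pathway name
# in the first column, and molecule IDs in the remaining columns.
pathways = pd.DataFrame(
    [
        ["Pathway A", "Q13554", "P61289", "16708"],
        ["Pathway B", "Q9Y243", "P17252", None],
    ],
    index=["PW1", "PW2"],
    columns=["Pathway_name", 0, 1, 2],
)

# Hypothetical column names of two omics DataFrames (IDs stored as strings).
proteomics_ids = ["Q13554", "P61289", "Q9Y243"]
metabolomics_ids = ["16708", "1372"]
measured = set(proteomics_ids) | set(metabolomics_ids)

# Count, per pathway, how many member molecules appear in the omics data.
# Empty cells never match, so they do not contribute to the count.
members = pathways.drop(columns="Pathway_name")
coverage = members.isin(measured).sum(axis=1)

print(coverage)
```

A pathway with a coverage of zero shares no molecules with the omics data, which usually points to an identifier mismatch, such as integer versus string IDs or a different ID namespace.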