spatialfusion.utils.embed_gcn_utils
Utility functions for extracting GCN embeddings and metadata for downstream analysis.
This module provides:

- `extract_gcn_embeddings_with_metadata`: Extracts GCN embeddings and merges them with cell type, spatial, and ligand-receptor metadata.
extract_gcn_embeddings_with_metadata(model, graphs, sample_list, base_path, z_joint, device='cuda:0', spatial_key='spatial_px', celltype_key='celltypes', adata_by_sample=None)
Extracts GCN embeddings along with metadata such as spatial coordinates and cell types. Supports both in-memory AnnData inputs and on-disk loading via base_path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | Trained GCN model with an | required |
| `graphs` | `list[DGLGraph]` | List of DGL graphs, one per sample, containing node features in | required |
| `sample_list` | `list[str]` | List of sample IDs corresponding to the graphs. | required |
| `base_path` | `str or Path` | Root directory containing per-sample subdirectories. Used only if | required |
| `z_joint` | `DataFrame` | Joint embeddings from the autoencoder step. Used to align indices. | required |
| `device` | `str` | Device string (e.g., | `'cuda:0'` |
| `spatial_key` | `str` | Key name for spatial coordinates stored in | `'spatial_px'` |
| `celltype_key` | `str` | Column name in | `'celltypes'` |
| `adata_by_sample` | `Optional[Dict[str, AnnData]]` | Optional mapping from sample IDs to AnnData objects already loaded in memory. If provided, this takes precedence over disk loading. | `None` |
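
The precedence between `adata_by_sample` and `base_path` can be pictured with the sketch below. It is not the library's code: `resolve_sample_source` is a hypothetical helper, the `base_path/<sample_id>` layout is inferred only from the mention of per-sample subdirectories, and the per-sample fallback to disk for IDs missing from the mapping is an assumption.

```python
from pathlib import Path
from typing import Any, Dict, Optional, Tuple, Union


def resolve_sample_source(
    sample_id: str,
    base_path: Union[str, Path],
    adata_by_sample: Optional[Dict[str, Any]] = None,
) -> Tuple[str, Any]:
    """Hypothetical helper: decide where a sample's data would come from."""
    # In-memory objects take precedence over anything on disk.
    if adata_by_sample is not None and sample_id in adata_by_sample:
        return "memory", adata_by_sample[sample_id]
    # Assumption: otherwise fall back to the per-sample subdirectory.
    return "disk", Path(base_path) / sample_id


# Stand-in objects replace real AnnData instances for illustration.
in_memory = {"sampleA": object()}
kind_a, _ = resolve_sample_source("sampleA", "/data/run1", in_memory)
kind_b, path_b = resolve_sample_source("sampleB", "/data/run1", in_memory)
```

Here `sampleA` resolves to the in-memory object, while `sampleB` resolves to a path under the (illustrative) root `/data/run1`.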
Returns:
| Type | Description |
|---|---|
| `DataFrame` | A concatenated DataFrame of GCN embeddings across all samples, including metadata: `sample_id` and `cell_id`; `celltype` (and optional subtype/niche labels); spatial coordinates (`X_coord`, `Y_coord`); optional ligand–receptor features if present. |
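
To make the shape of the returned table concrete, the sketch below builds equivalent rows with plain Python dictionaries and concatenates them across two samples. It only illustrates the layout: `build_rows` is a hypothetical helper, the embedding column names (`gcn_0`, `gcn_1`, ...) are assumptions, and all values are made up for the example.

```python
from typing import Any, Dict, List, Sequence, Tuple


def build_rows(
    sample_id: str,
    cell_ids: Sequence[str],
    embeddings: Sequence[Sequence[float]],
    coords: Sequence[Tuple[float, float]],
    celltypes: Sequence[str],
) -> List[Dict[str, Any]]:
    """Hypothetical helper: one row per cell, metadata plus embedding."""
    rows = []
    for cell_id, emb, (x, y), celltype in zip(cell_ids, embeddings, coords, celltypes):
        row = {
            "sample_id": sample_id,
            "cell_id": cell_id,
            "celltype": celltype,
            "X_coord": x,
            "Y_coord": y,
        }
        # Assumed naming scheme for the embedding dimensions.
        row.update({f"gcn_{i}": value for i, value in enumerate(emb)})
        rows.append(row)
    return rows


# Concatenate rows from two samples, mirroring the across-sample output.
table = build_rows(
    "s1", ["c1", "c2"], [[0.1, 0.2], [0.3, 0.4]],
    [(10.0, 20.0), (11.0, 21.0)], ["T cell", "B cell"],
)
table += build_rows("s2", ["c1"], [[0.5, 0.6]], [(5.0, 6.0)], ["Macrophage"])
```

Because `cell_id` values can repeat across samples (as `c1` does here), the pair (`sample_id`, `cell_id`) is what identifies a row in this sketch.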