spatialfusion.models.multi_ae

Multi-modal autoencoder models for paired datasets.

This module provides:

- PairedDataset: PyTorch Dataset for paired samples.
- PairedAE: Standard autoencoder for paired modalities.
- EncoderAE, Decoder: Building blocks for autoencoders.
- build_mlp: Utility to build MLP networks.

Decoder

Bases: Module

Decoder network for autoencoders.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| latent_dim | int | Latent space dimension. | required |
| hidden_dims | list | Hidden layer sizes. | required |
| output_dim | int | Output feature dimension. | required |

forward(z)

Forward pass for decoder.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| z | Tensor | Latent vector. | required |

Returns: torch.Tensor: Reconstructed output.
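Example (a minimal sketch; the layer sizes are hypothetical, and the constructor is assumed to accept the keyword arguments documented above):

```python
import torch

from spatialfusion.models.multi_ae import Decoder

# Map a 32-dimensional latent vector back to 256 output features
# through two hidden layers (hypothetical sizes).
decoder = Decoder(latent_dim=32, hidden_dims=[64, 128], output_dim=256)

z = torch.randn(8, 32)   # batch of 8 latent vectors
recon = decoder(z)       # -> torch.Size([8, 256])
```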

EncoderAE

Bases: Module

Encoder network for autoencoder models.

This encoder maps input features to a latent representation using a feedforward neural network.

Parameters:

| Name | Description | Default |
| --- | --- | --- |
| input_dim | Input feature dimensionality. | required |
| hidden_dims | List of hidden layer sizes. | required |
| latent_dim | Dimensionality of the latent space. | required |

forward(x)

Forward pass for AE encoder.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| x | Tensor | Input tensor. | required |

Returns: z (torch.Tensor): Latent vector.
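Example (a minimal sketch mirroring the Decoder example; feature sizes are hypothetical):

```python
import torch

from spatialfusion.models.multi_ae import EncoderAE

# Compress 256 input features to a 32-dimensional latent code
# (hypothetical sizes).
encoder = EncoderAE(input_dim=256, hidden_dims=[128, 64], latent_dim=32)

x = torch.randn(8, 256)  # batch of 8 samples
z = encoder(x)           # -> torch.Size([8, 32])
```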

PairedAE

Bases: Module

Autoencoder model for paired modalities.

This model learns a shared latent representation for two input modalities and supports:

- Within-modality reconstruction
- Cross-modality reconstruction
- Single-modality (UNI-only) operation

Parameters:

| Name | Description | Default |
| --- | --- | --- |
| d1_dim | Input feature dimension for modality 1. | required |
| d2_dim | Input feature dimension for modality 2. | required |
| latent_dim | Dimensionality of the latent space. | required |
| enc_hidden_dims | Hidden layer sizes for the encoders. | None |
| dec_hidden_dims | Hidden layer sizes for the decoders. | None |

forward(d1, d2=None)

Forward pass of the paired autoencoder.

If both modalities are provided, the model computes:

- Latent representations for both modalities
- Within-modality reconstructions
- Cross-modality reconstructions

If only d1 is provided, the model runs in single-modality mode.

Parameters:

| Name | Description | Default |
| --- | --- | --- |
| d1 | Input tensor for modality 1. | required |
| d2 | Optional input tensor for modality 2. | None |

Returns:

A dictionary containing:

- "z1": Latent embedding for modality 1
- "z2": Latent embedding for modality 2 (or None)
- "recon1": Reconstruction of modality 1
- "recon2": Reconstruction of modality 2 (or None)
- "cross12": Reconstruction of modality 2 from z1 (or None)
- "cross21": Reconstruction of modality 1 from z2 (or None)
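Example (a minimal sketch of both forward modes; the feature sizes are hypothetical, and the hidden dimensions are left at their defaults):

```python
import torch

from spatialfusion.models.multi_ae import PairedAE

# Two modalities with 256 and 128 features, sharing a 32-dim latent space
# (hypothetical sizes).
model = PairedAE(d1_dim=256, d2_dim=128, latent_dim=32)

d1 = torch.randn(8, 256)
d2 = torch.randn(8, 128)

out = model(d1, d2)          # paired mode: all keys populated
print(out["recon1"].shape)   # within-modality: torch.Size([8, 256])
print(out["cross12"].shape)  # cross-modality:  torch.Size([8, 128])

out_single = model(d1)       # single-modality mode
print(out_single["z2"])      # None
```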

PairedDataset

Bases: Dataset

PyTorch Dataset for paired samples from two modalities.

Each item returned by the dataset is a tuple (X1, X2) corresponding to aligned samples from two feature matrices.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df1 | DataFrame | First modality data (samples × features). | required |
| df2 | DataFrame | Second modality data (samples × features). | required |

Raises:

| Type | Description |
| --- | --- |
| AssertionError | If the indices of df1 and df2 do not match. |

__getitem__(idx)

Return a tuple of paired samples (X1, X2) at index idx.

__len__()

Return the number of samples.
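Example (a minimal sketch with hypothetical toy data; the two DataFrames must share the same sample index):

```python
import pandas as pd
import torch
from torch.utils.data import DataLoader

from spatialfusion.models.multi_ae import PairedDataset

# Two aligned feature matrices: 4 samples, 3 and 2 features.
idx = ["s1", "s2", "s3", "s4"]
df1 = pd.DataFrame(torch.randn(4, 3).numpy(), index=idx)
df2 = pd.DataFrame(torch.randn(4, 2).numpy(), index=idx)

dataset = PairedDataset(df1, df2)
x1, x2 = dataset[0]  # one aligned (X1, X2) pair

loader = DataLoader(dataset, batch_size=2, shuffle=True)
for b1, b2 in loader:
    ...  # b1: batch from modality 1, b2: aligned batch from modality 2
```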

build_mlp(layer_dims, activation_fn=nn.ReLU)

Build a multi-layer perceptron (MLP) network.

The MLP consists of linear layers with an activation function applied after each hidden layer. No activation is applied to the output layer.

Parameters:

| Name | Description | Default |
| --- | --- | --- |
| layer_dims | List of layer dimensions, including input and output. | required |
| activation_fn | Activation function class used between layers. | nn.ReLU |

Returns:

A torch.nn.Sequential MLP model.
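Example (a minimal sketch; the layer sizes and the Tanh variant are hypothetical choices):

```python
import torch
from torch import nn

from spatialfusion.models.multi_ae import build_mlp

# 16 -> 32 -> 8: ReLU after the hidden layer, no activation on the
# output layer (per the description above).
mlp = build_mlp([16, 32, 8])
print(mlp)  # Sequential(Linear(16, 32), ReLU, Linear(32, 8))

# Swapping in another activation class:
mlp_tanh = build_mlp([16, 32, 8], activation_fn=nn.Tanh)

out = mlp(torch.randn(4, 16))  # -> torch.Size([4, 8])
```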