CosAE: Convolutional Harmonic Autoencoder

This is a pretrained Convolutional Harmonic Autoencoder (CosAE). It encodes an image into amplitude and phase maps for a set of harmonics, then decodes those harmonics back into an RGB image.

Usage

import torch
from transformers import AutoModel

# Load the model; trust_remote_code=True is required because the
# architecture is defined by custom code shipped in the repository
model = AutoModel.from_pretrained(
    "vedant-jumle/cosae",
    trust_remote_code=True,
)
model.eval()

# Example input: tensor of shape [B, 9, H, W] (RGB + FFT) or [B, 3, H, W] (RGB only)
x = torch.randn(1, 9, 256, 256)
with torch.no_grad():
    recon = model(x)
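
The card does not say how the six FFT channels of the 9-channel input are computed. The following is a minimal sketch, assuming they are the real and imaginary parts of each RGB channel's 2D FFT. The helper name `rgb_to_9ch`, the channel ordering, and the `norm="ortho"` scaling are illustrative assumptions, not the repository's preprocessing, so check the remote code before relying on it.

```python
import torch


def rgb_to_9ch(rgb: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: stack RGB with real/imag FFT planes.

    rgb: float tensor of shape [B, 3, H, W].
    Returns a tensor of shape [B, 9, H, W] ordered as
    [R, G, B, Re(R), Re(G), Re(B), Im(R), Im(G), Im(B)] (assumed layout).
    """
    if rgb.ndim != 4 or rgb.shape[1] != 3:
        raise ValueError(f"expected [B, 3, H, W], got {tuple(rgb.shape)}")
    # 2D FFT over the last two (spatial) dimensions; result is complex.
    spec = torch.fft.fft2(rgb, norm="ortho")
    # Split the complex spectrum into 3 real + 3 imaginary channels.
    fft_feats = torch.cat([spec.real, spec.imag], dim=1)
    return torch.cat([rgb, fft_feats], dim=1)


rgb = torch.rand(2, 3, 256, 256)
x = rgb_to_9ch(rgb)
print(x.shape)
```

If the model was trained with a different FFT layout (for example amplitude and phase instead of real and imaginary parts), only the `fft_feats` line would need to change.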

Model Details

Architecture: Convolutional encoder (ResBlocks + optional attention), Harmonic Construction Module, upsampling decoder
Input channels: 9 (3 RGB + 6 FFT) or 3 (RGB only)
Image size: 256×256 (configurable)
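
The architecture line names a Harmonic Construction Module without describing it. As a rough illustration of the idea, the sketch below expands low-resolution amplitude and phase maps into cosine patches, assuming each latent location produces a T×T patch of the form A·cos(2π/T·(u·x + v·y) − φ). The function name `build_harmonics`, the frequency set, and the patch size `T` are hypothetical and are not taken from the repository.

```python
import math
import torch


def build_harmonics(amp: torch.Tensor, phase: torch.Tensor,
                    freqs: torch.Tensor, T: int) -> torch.Tensor:
    """Illustrative harmonic expansion (assumed formulation).

    amp, phase: [B, C, h, w] amplitude and phase maps.
    freqs:      [C, 2] per-channel (u, v) frequency pairs.
    T:          side length of the patch generated per latent location.
    Returns a tensor of shape [B, C, h*T, w*T].
    """
    B, C, h, w = amp.shape
    t = torch.arange(T, dtype=amp.dtype, device=amp.device)
    ys = t.view(1, 1, 1, T, 1, 1)  # local row coordinate inside a patch
    xs = t.view(1, 1, 1, 1, 1, T)  # local column coordinate inside a patch
    u = freqs[:, 0].to(amp.dtype).view(1, C, 1, 1, 1, 1)
    v = freqs[:, 1].to(amp.dtype).view(1, C, 1, 1, 1, 1)
    angle = (2 * math.pi / T) * (u * xs + v * ys)  # [1, C, 1, T, 1, T]
    # Broadcast amplitude and phase over the local patch coordinates.
    a = amp[:, :, :, None, :, None]
    p = phase[:, :, :, None, :, None]
    out = a * torch.cos(angle - p)  # [B, C, h, T, w, T]
    return out.reshape(B, C, h * T, w * T)


amp = torch.rand(1, 4, 32, 32)
phase = torch.rand(1, 4, 32, 32) * 2 * math.pi
freqs = torch.tensor([[0, 0], [1, 0], [0, 1], [1, 1]])
harmonics = build_harmonics(amp, phase, freqs, T=8)
print(harmonics.shape)
```

With `h = w = 32` and `T = 8`, the harmonics come out at 256×256, matching the default image size listed above. In a full model, a decoder would then map these harmonic channels to RGB.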

References

Liu, S., De Mello, S., and Kautz, J. (2024). CosAE: Learnable Fourier Series for Image Restoration. NeurIPS 2024. NVIDIA AMRI. https://research.nvidia.com/labs/amri/publication/sifei2024cosae/

License

This model is released under the MIT License. See the repository LICENSE for details.

Model size: 18.2M params (Safetensors, tensor type F32)
