SCENIC: A JAX Library for Computer Vision Research and Beyond
Abstract
Scenic is an open-source JAX library with a focus on Transformer-based models for computer vision research and beyond. The goal of this toolkit is to facilitate rapid experimentation, prototyping, and research of new vision architectures and models. Scenic supports a diverse range of vision tasks (e.g., classification, segmentation, detection) and facilitates working on multi-modal problems, along with GPU/TPU support for multi-host, multi-device large-scale training. Scenic also offers optimized implementations of state-of-the-art research models spanning a wide range of modalities. Scenic has been successfully used for numerous projects and published papers and continues serving as the library of choice for quick prototyping and publication of new research ideas.
Community
Introduces Scenic, a JAX library of Transformer-centric computer vision models for research and prototyping. It has ready-made implementations of ViT, DETR, MLP-Mixer, ResNet, and U-Net. It relies on JAX and Flax for implementation, on TFDS and DMVR (TensorFlow Datasets and DeepMind Video Readers) for data pipelines, on OTT (Optimal Transport Tools, a Wasserstein/bipartite matching toolbox), and on training facilities from CLU (Common Loop Utilities). It is portable across GPU and TPU and scales to multi-accelerator, multi-node training. From Google (and DeepMind).
Links: GitHub (JAX, Flax; CLU, DMVR, OTT); see also TFDS (TensorFlow Datasets), Optax (gradient processing and optimization in JAX), Paxml (experimentation and parallelism), Chex (readable JAX code)