---
license: cc
datasets:
- speechlab/SPRING_INX_R1
tags:
- ASR
- speech-recognition
---

# Fairseq Inference Setup and Usage

This repository provides a streamlined setup and guide for performing inference with Fairseq models, tailored for automatic speech recognition (ASR).

## Table of Contents

1. [Setup Instructions](#setup-instructions)
2. [Download Required Models](#download-required-models)
3. [Running Inference](#running-inference)
4. [Getting Transcripts](#getting-transcripts)

---

### Setup Instructions

To set up the environment and install the dependencies needed for Fairseq inference, follow these steps.

#### 1. Create and Activate a Virtual Environment

Choose between Python's `venv` or Conda for environment management.

Using `venv`:

```bash
python3.8 -m venv lm_env  # use python3.8 or adjust for your preferred version
source lm_env/bin/activate
```

Using Conda:

```bash
conda create -n fairseq_inference python=3.8.10
conda activate fairseq_inference
```

#### 2. Install PyTorch with CUDA Support

Install a PyTorch build that matches your CUDA version. For Python 3.8 with CUDA 11.3:

```bash
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
```

If you are using Python 3.10.15 and CUDA 12.4:

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```

#### 3. Install Additional Packages

```bash
pip install wheel soundfile editdistance pyarrow tensorboard tensorboardX
```

#### 4. Clone the Fairseq Inference Repository

```bash
git clone https://github.com/Speech-Lab-IITM/Fairseq-Inference.git
cd Fairseq-Inference/fairseq-0.12.2
pip install --editable ./
python setup.py build develop
```

---

### Download Required Models

Download the models required for your ASR task and place them in a directory of your choice; that directory is passed to the inference script as `model_path` below.

### Running Inference

Once setup is complete and the models are downloaded, run inference with:

```bash
python3 infer.py model_path audio_path
```

The script loads the model from the `model_path` directory and generates a transcription for the audio file at `audio_path`.

### Getting Transcripts

After the inference script finishes, the transcript for the provided audio file appears in its output.
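
If you need transcripts for a whole directory of recordings rather than a single file, a simple shell loop over `infer.py` is enough. The sketch below assumes the recordings live in a `recordings/` directory as `.wav` files and that the script writes the transcript to standard output; adjust the paths and extension to match your data:

```bash
# Transcribe every WAV file under recordings/, saving each transcript
# next to its audio file with a .txt extension.
for audio in recordings/*.wav; do
    python3 infer.py model_path "$audio" > "${audio%.wav}.txt"
done
```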
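
On the model download step above: if the checkpoints are published on the Hugging Face Hub, the `huggingface-cli` tool from the `huggingface_hub` package can fetch a whole model repository into `model_path`. The repository id below is a placeholder, not the actual repo name:

```bash
pip install -U huggingface_hub
# <namespace>/<model-repo> is a placeholder; substitute the real repository id.
huggingface-cli download <namespace>/<model-repo> --local-dir model_path
```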
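
Finally, if inference fails with import or CUDA errors, it is worth confirming the environment itself before debugging the models. A quick sanity check, assuming the setup steps above completed without errors:

```bash
# Confirm a CUDA-enabled PyTorch build is installed and a GPU is visible.
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# Confirm the editable fairseq install is importable.
python3 -c "import fairseq; print(fairseq.__version__)"
```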