---
license: cc
datasets:
- speechlab/SPRING_INX_R1
tags:
- ASR
- speech-recognition
---

# Fairseq Inference Setup and Usage

This repository provides a streamlined setup and guide for performing inference with Fairseq models, tailored for automatic speech recognition (ASR).

## Table of Contents

1. [Setup Instructions](#setup-instructions)
2. [Download Required Models](#download-required-models)
3. [Running Inference](#running-inference)
4. [Getting Transcripts](#getting-transcripts)

---

### Setup Instructions

To set up the environment and install the dependencies needed for Fairseq inference, follow these steps.

#### 1. Create and Activate a Virtual Environment

Choose between Python's `venv` or Conda for environment management.

Using `venv`:

```bash
python3.8 -m venv lm_env  # use python3.8 or adjust for your preferred version
source lm_env/bin/activate
```

Using Conda:

```bash
conda create -n fairseq_inference python=3.8.10
conda activate fairseq_inference
```

#### 2. Install PyTorch with CUDA Support

Install a PyTorch build that matches your CUDA version. For Python 3.8 with CUDA 11.3:

```bash
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
```

If you are using Python 3.10.15 and CUDA 12.4:

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```

#### 3. Install Additional Packages

```bash
pip install wheel soundfile editdistance pyarrow tensorboard tensorboardX
```

#### 4. Clone the Fairseq Inference Repository

```bash
git clone https://github.com/Speech-Lab-IITM/Fairseq-Inference.git
cd Fairseq-Inference/fairseq-0.12.2
pip install --editable ./
python setup.py build develop
```

---

### Download Required Models

Download the models required for your ASR task and place them in a directory of your choice; that directory is passed to the inference script as `model_path` below.

### Running Inference

Once setup is complete and the models are downloaded, run inference with:

```bash
python3 infer.py model_path audio_path
```

The script loads the model from the `model_path` directory and generates a transcription for the audio file at `audio_path`.

### Getting Transcripts

After the inference script finishes, the transcript for the provided audio file appears in its output.
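
If you need transcripts for a whole directory of recordings rather than a single file, a simple shell loop over `infer.py` is enough. The sketch below assumes the recordings live in a `recordings/` directory as `.wav` files and that the script writes the transcript to standard output; adjust the paths and extension to match your data:

```bash
# Transcribe every WAV file under recordings/, saving each transcript
# next to its audio file with a .txt extension.
for audio in recordings/*.wav; do
    python3 infer.py model_path "$audio" > "${audio%.wav}.txt"
done
```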
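
On the model download step above: if the checkpoints are published on the Hugging Face Hub, the `huggingface-cli` tool from the `huggingface_hub` package can fetch a whole model repository into `model_path`. The repository id below is a placeholder, not the actual repo name:

```bash
pip install -U huggingface_hub
# <namespace>/<model-repo> is a placeholder; substitute the real repository id.
huggingface-cli download <namespace>/<model-repo> --local-dir model_path
```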
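
Finally, if inference fails with import or CUDA errors, it is worth confirming the environment itself before debugging the models. A quick sanity check, assuming the setup steps above completed without errors:

```bash
# Confirm a CUDA-enabled PyTorch build is installed and a GPU is visible.
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# Confirm the editable fairseq install is importable.
python3 -c "import fairseq; print(fairseq.__version__)"
```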