# BLEU Score Comparison for English-to-Japanese Translations

## Overview

This project demonstrates the calculation and visualization of BLEU scores for English-to-Japanese translation. The scores compare two models, an LSTM-based model and a Seq2Seq model, on how well each translates English input sentences into Japanese.

## Models Evaluated

1. **LSTM-based Model**
   - A simpler model that predicts translations token by token from a sequential structure.
   - Tends to perform moderately well but struggles with complex language patterns.
2. **Seq2Seq Model**
   - A more advanced encoder-decoder model designed for sequence-to-sequence tasks.
   - Expected to perform better because it can learn complex patterns and sentence-level context.

## Key Features

- Calculates BLEU scores using the SacreBLEU library.
- Visualizes BLEU scores as a bar chart for easy comparison.
- Saves the BLEU scores to a CSV file for further analysis.

## Implementation

### Steps in the Code

1. **Dataset Preparation**
   - The dataset contains English sentences and their corresponding Japanese translations (used as references).
   - Predictions from both the LSTM and Seq2Seq models are compared against these references.
2. **BLEU Score Calculation**
   - BLEU scores are computed with SacreBLEU to quantify the overlap between the model predictions and the ground-truth references (see the sketch after this list).
3. **Visualization**
   - BLEU scores are plotted as a bar chart for an intuitive comparison of model performance.
4. **Saving Results**
   - The BLEU scores for both models are saved to a CSV file named `bleu_scores.csv`.
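Below is a minimal sketch of steps 2–4, assuming corpus-level scoring with `sacrebleu.corpus_bleu`. The example sentences and the variable names `lstm_predictions` and `seq2seq_predictions` are hypothetical placeholders; the actual `main.py` may load data and structure the code differently.

```python
import csv

import matplotlib.pyplot as plt
import sacrebleu

# Hypothetical example data; main.py loads real model outputs instead.
lstm_predictions = ["猫はマットに座った。"]
seq2seq_predictions = ["猫がマットの上に座っている。"]
references = [["猫がマットの上に座っている。"]]  # one inner list per reference stream

# Corpus-level BLEU via SacreBLEU. The default tokenizer assumes
# space-delimited text; for raw Japanese, tokenize="ja-mecab" is
# usually more appropriate if MeCab is installed.
lstm_bleu = sacrebleu.corpus_bleu(lstm_predictions, references).score
seq2seq_bleu = sacrebleu.corpus_bleu(seq2seq_predictions, references).score

print("BLEU Score Comparison (English-to-Japanese):")
print(f"LSTM Model BLEU Score: {lstm_bleu:.2f}")
print(f"Seq2Seq Model BLEU Score: {seq2seq_bleu:.2f}")

# Bar chart comparing the two scores.
plt.bar(["LSTM", "Seq2Seq"], [lstm_bleu, seq2seq_bleu])
plt.ylabel("BLEU score")
plt.title("BLEU Score Comparison (English-to-Japanese)")
plt.show()

# Save the scores to a CSV file for further analysis.
with open("bleu_scores.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["model", "bleu"])
    writer.writerow(["LSTM", f"{lstm_bleu:.2f}"])
    writer.writerow(["Seq2Seq", f"{seq2seq_bleu:.2f}"])
```

Note that SacreBLEU reports scores on a 0–100 scale, so the values can be written to the CSV and plotted directly without rescaling.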

## Files

- `main.py`: The primary Python script containing the code for BLEU score calculation, visualization, and saving of results.
- `bleu_scores.csv`: Output file containing the BLEU scores for both models.

## Requirements

### Dependencies

- Python 3.x
- Libraries:
  - `sacrebleu`
  - `matplotlib`
  - `csv` (part of the Python standard library, so it needs no installation)

To install the required dependencies, run:

```bash
pip install sacrebleu matplotlib
```
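Note: SacreBLEU's default tokenizer is designed for space-delimited languages. When scoring raw Japanese text, the optional `ja-mecab` tokenizer (passed as `tokenize="ja-mecab"`) is usually more appropriate; assuming the standard SacreBLEU extras layout, it can be installed with `pip install "sacrebleu[ja]"`.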

## Usage

1. Clone this repository and navigate to the project directory.
2. Run the script:

   ```bash
   python main.py
   ```

3. View the BLEU scores printed in the console and the generated bar chart.
4. Check the `bleu_scores.csv` file for the saved results.

## Results

- The BLEU scores for both models are displayed in the console and visualized in the bar chart.
- Example output:

  ```text
  BLEU Score Comparison (English-to-Japanese):
  LSTM Model BLEU Score: 45.32
  Seq2Seq Model BLEU Score: 70.25
  BLEU scores have been saved to bleu_scores.csv
  ```

## Acknowledgments

This project uses the SacreBLEU library for BLEU score calculation and Matplotlib for visualization.