import streamlit as st

# Custom CSS for better styling
st.markdown("""
""", unsafe_allow_html=True)

# Main Title
st.markdown('Wav2Vec2 for Speech Recognition', unsafe_allow_html=True)

# Description
st.markdown("""

Wav2Vec2 is an Automatic Speech Recognition (ASR) model that learns speech representations directly from raw audio. Because most of that learning is self-supervised, it reaches high accuracy with relatively little labeled data, which makes it well suited to low-resource settings. Adapted for Spark NLP, Wav2Vec2 can be used in scalable, production-ready ASR applications.

""", unsafe_allow_html=True)

# Why, Where, and When to Use Wav2Vec2
st.markdown('Why, Where, and When to Use Wav2Vec2', unsafe_allow_html=True)
st.markdown("""

Use Wav2Vec2 when you need a robust ASR solution that performs well with limited labeled data. It suits speech-to-text applications where scalability and accuracy are critical, such as transcription services and voice-activated systems.

""", unsafe_allow_html=True)

# How to Use the Model
st.markdown('How to Use the Model', unsafe_allow_html=True)
st.code('''
from pyspark.ml import Pipeline
from sparknlp.annotator import Wav2Vec2ForCTC
from sparknlp.base import AudioAssembler

# Convert the raw audio column into Spark NLP's audio annotation format
audio_assembler = AudioAssembler() \\
    .setInputCol("audio_content") \\
    .setOutputCol("audio_assembler")

# Load the pretrained English Wav2Vec2 model
speech_to_text = Wav2Vec2ForCTC \\
    .pretrained("asr_wav2vec2_large_xlsr_53_english_by_jonatasgrosman", "en") \\
    .setInputCols("audio_assembler") \\
    .setOutputCol("text")

pipeline = Pipeline(stages=[
    audio_assembler,
    speech_to_text,
])

# audioDf is a DataFrame with an "audio_content" column, defined elsewhere
pipelineModel = pipeline.fit(audioDf)
pipelineDF = pipelineModel.transform(audioDf)
''', language='python')

# Best Practices & Tips
st.markdown('Best Practices & Tips', unsafe_allow_html=True)
st.markdown("""
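
One practical tip concerns the input audio. As far as we know, `AudioAssembler` expects the `audio_content` column to hold the waveform as an array of floats, and Wav2Vec2 checkpoints such as this one are generally trained on 16 kHz audio. The sketch below is a minimal, standard-library-only illustration of decoding a 16-bit mono PCM WAV file into such a list. The helper name `wav_to_floats` and its `expected_rate` parameter are hypothetical and not part of Spark NLP.

```python
import array
import io
import sys
import wave


def wav_to_floats(wav_bytes, expected_rate=16000):
    # Hypothetical helper: decode 16-bit mono PCM WAV bytes
    # into a list of floats in the range [-1.0, 1.0).
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        if wav.getframerate() != expected_rate:
            raise ValueError("unexpected sample rate, resample the audio first")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is stored little-endian

    # Scale integer samples to floats
    return [sample / 32768.0 for sample in samples]
```

Assuming an active SparkSession named `spark`, the returned list could then be wrapped in a one-row DataFrame, for example `spark.createDataFrame([[floats]]).toDF("audio_content")`, and passed to the pipeline as `audioDf`. For other encodings or sample rates, an audio library that can resample is likely a better choice than this sketch.
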
""", unsafe_allow_html=True)

# Model Information
st.markdown('Model Information', unsafe_allow_html=True)
st.markdown("""
| Attribute | Description |
|---|---|
| Model Name | asr_wav2vec2_large_xlsr_53_english_by_jonatasgrosman |
| Compatibility | Spark NLP 4.2.0+ |
| License | Open Source |
| Edition | Official |
| Input Labels | [audio_assembler] |
| Output Labels | [text] |
| Language | en |
| Size | 1.2 GB |

""", unsafe_allow_html=True)

# Data Source Section
st.markdown('Data Source', unsafe_allow_html=True)
st.markdown("""

The Wav2Vec2 model is available on Hugging Face. This model, trained by jonatasgrosman, has been adapted for use with Spark NLP so that it can run in large-scale applications.

""", unsafe_allow_html=True)

# Conclusion
st.markdown('Conclusion', unsafe_allow_html=True)
st.markdown("""

Wav2Vec2 is a versatile ASR model that performs well even when labeled data is limited. Its integration with Spark NLP allows scalable, efficient, and accurate deployment in real-world applications, from transcription services to voice-activated systems.

""", unsafe_allow_html=True)

# References
st.markdown('References', unsafe_allow_html=True)
st.markdown("""
""", unsafe_allow_html=True)

# Community & Support
st.markdown('Community & Support', unsafe_allow_html=True)
st.markdown("""
""", unsafe_allow_html=True)