import streamlit as st
from PIL import Image
from fast_text_summarizer import FastTextSummarizer

# Streamlit UI
# Left column: Text input and summary
sample_paragraphs = [
    """በውይይቱ የመስኖና ቆላማ አካባቢ ሚኒስትር አብርሃም በላይ (ዶ/ር) በውይይቱ ላይ ተሳትፈዋል። አብርሃም በላይ (ዶ/ር) በማህበራዊ ትስስር ገጻቸው ባሰፈሩት መልዕክት፤ ከፋውንዴሽኑ ፕሬዚዳንት ጋር በተለያዩ ጉዳዮች ላይ ውይይት መደረጉን ገልጸዋል። በዚህም ነባር ፕሮግራሞች ላይ እንዲሁም ወደፊት ሊኖሩ ስለሚችሉት የአየር ንብረት መቋቋም ኢንሼቲቮች፣ የግሉ ዘርፍ ተሳትፎ እና የመስኖ ልማት የፋይናንስ እድሎች መዳሰሳቸውን ገልጸዋል፡፡ የሮክፌለር ፋውንዴሽን በኢትዮጵያ እንደ ኢነርጂ፣ ግብርና እና ጤና ባሉ ፕሮጀክቶች ላይ ድጋፍ የሚያደርግ ዓለም አቀፍ ተቋም ነው፡፡""",
    """ታንዛኒያ ዳሬሰላም ይህን አስመልክቶ የአፍሪካ ሀገራት መሪዎችና የዓለም አቀፍ ተቋማት የሥራ ሃላፊዎች መክረዋል፡፡ ከገንዘቡ ውስጥ ግማሽ ያህሉ የኤሌክትሪክ አቅርቦት ለሌላቸው ማህበረሰቦች አስተማማኝ የኃይል አቅርቦት ለሚያቀርቡ ታዳሽ የኃይል ምንጭ (የፀሃይ ሃይል ሚኒግሪድ) እንደሚሆንም ተጠቁሟል፡፡ ለዚህ የሚሆን ብድርም በአነስተኛ የወለድ መጠን ይገኛልም ነው የተባለው፡፡ የዓለም ባንክ ፕሬዚዳንት አጃይ ባንጋ፥ ኤሌክትሪክ ከሌለን ሥራ፣ የጤና እንክብካቤና ሌሎች እድሎችን ለማግኘት ከባድ ነው ሲሉ ተናግረዋል፡፡ የመሪዎች ጉባኤው በስድስት ዓመታት ውስጥ ብቻ ከ600 ሚሊየን የአፍሪካ ዜጎች መካከል ግማሹን የኤሌክትሪክ አገልግሎት ተጠቃሚ የሚያደርግ ኃይል ለማመንጨት ቃል መግባቱን የዘገበው ኒው ዮርክ ታይምስ ነው፡፡""",
]

st.title("Amharic Text Summarizer")
st.write("This app uses a trained FastText model to summarize your input text.")
st.markdown(
    """

Summarization works by vectorizing the words in each input sentence, computing the cosine similarity between each sentence and the overall document, and selecting the sentences that are most similar to the document as a whole.

The FastText word embedding model used in this project reached an average loss of 0.3215. It was trained with a dimensionality of 50 for 30 epochs on the GPAC Amharic dataset, which contains 82 million words and 534,123 unique words. For more details, see the original paper for the dataset used in this project: [Click here to access the paper](https://doi.org/10.3390/info12010020).

Note: The dimensionality of the model was reduced to shrink its size so that it could be uploaded to Hugging Face, which has a maximum file size limit of 1GB. This reduced the model size from 1.4GB to 598.53MB. To view the code for this project, click Files on the platform.

For comparison, a FastText word embedding model trained with a dimensionality of 100, a learning rate of 0.05, and 30 epochs achieved an average loss of 0.541634. However, the resulting model size is 1.4GB.

""",
    unsafe_allow_html=True,
)

# Load and display the image
image = Image.open("avgllose_epoch.png")  # Replace with the actual path to your image
st.image(image, caption="Epoch vs Average Loss", use_container_width=True)
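
# --- Illustrative sketch (not used by the app) --------------------------------
# The description shown in the UI above says sentences are vectorized and then
# ranked by cosine similarity to the overall document. The helpers below are a
# minimal sketch of that idea, under two assumptions that the text does not
# confirm: each sentence has already been reduced to one vector (for example,
# the mean of its word vectors), and the document vector is the mean of the
# sentence vectors. The names `cosine_similarity` and `select_top_sentences`
# are hypothetical; the actual FastTextSummarizer implementation may differ.
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_top_sentences(sentence_vectors, top_k=2):
    """Indices of the top_k sentences closest to the document mean, in original order."""
    if not sentence_vectors:
        return []
    n = len(sentence_vectors)
    dim = len(sentence_vectors[0])
    # Assumed document representation: element-wise mean of the sentence vectors.
    doc_vector = [sum(vec[i] for vec in sentence_vectors) / n for i in range(dim)]
    scores = [cosine_similarity(vec, doc_vector) for vec in sentence_vectors]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    # Restore original sentence order so the summary reads naturally.
    return sorted(ranked[:top_k])


# Example with toy 2-dimensional vectors (made up for illustration):
# select_top_sentences([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], top_k=2)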