Space: billyxx/Sprouts_Assignment

billyxx committed 16 days ago

Commit d7a756f · verified · 1 Parent(s): 9929265

Upload 2 files

Files changed (2)

README.md CHANGED

@@ -4,7 +4,7 @@ emoji: 📌
 colorFrom: blue
 colorTo: green
 sdk: gradio
-sdk_version: 5.41.1
 app_file: app.py
 pinned: false
 ---
@@ -95,4 +95,30 @@ making the ranking of candidates more accurate and relevant.
 - Cosine similarity does not account for specific skill weights (all terms treated equally).
 - Large file uploads may impact performance on free hosting tiers.
----

 colorFrom: blue
 colorTo: green
 sdk: gradio
+sdk_version: "3.45.0"  # (or your Gradio version)
 app_file: app.py
 pinned: false
 ---
 - Cosine similarity does not account for specific skill weights (all terms treated equally).
 - Large file uploads may impact performance on free hosting tiers.
+## Potential Improvements
+Here are some ideas to enhance the engine’s functionality and performance:
+### 1. Model and Embeddings
+- Experiment with larger or more recent language models for improved summarization quality.
+- Fine-tune embedding models on domain-specific data for better candidate-job matching.
+- Cache embeddings to speed up repeated queries.
+### 2. Ranking & Recommendation
+- Incorporate additional ranking criteria like experience level, skills match weighting, or recency of resume updates.
+- Use a hybrid approach combining semantic similarity with keyword matching for more accurate recommendations.
+### 3. Scalability and Deployment
+- Containerize the app using Docker for easier deployment and scalability.
+- Integrate with cloud storage (e.g., AWS S3) for resume and job description management.
+- Use asynchronous processing or batch jobs to handle large volumes of resumes efficiently.
+## Install dependencies:
+pip install -r requirements.txt
+## Running the Engine
+python app.py
+This will launch the engine locally, typically at http://127.0.0.1:7860/ (Gradio's default port).
+---
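
The hybrid ranking idea listed under "Ranking & Recommendation" could look roughly like the sketch below. It is not the Space's implementation: it uses a plain term-frequency cosine as a stand-in for the embedding similarity, and the names `hybrid_score`, `keyword_coverage`, and the `alpha` weight are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple tokens (keeps '+' and '#')."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_coverage(resume_tokens, keywords):
    """Fraction of required keywords found in the resume (single-token keywords only)."""
    if not keywords:
        return 0.0
    present = set(resume_tokens)
    return sum(1 for k in keywords if k.lower() in present) / len(keywords)


def hybrid_score(job_description, resume_text, keywords, alpha=0.7):
    """Blend similarity and keyword coverage; alpha is an assumed, tunable weight."""
    jd_tokens = tokenize(job_description)
    resume_tokens = tokenize(resume_text)
    similarity = cosine_similarity(Counter(jd_tokens), Counter(resume_tokens))
    coverage = keyword_coverage(resume_tokens, keywords)
    return alpha * similarity + (1 - alpha) * coverage


if __name__ == "__main__":
    job = "Python developer with SQL experience"
    resumes = {
        "a.txt": "Python and SQL developer, five years of experience",
        "b.txt": "Java engineer focused on mobile apps",
    }
    ranked = sorted(
        resumes,
        key=lambda name: hybrid_score(job, resumes[name], ["python", "sql"]),
        reverse=True,
    )
    print(ranked)
```

In the real app the similarity term would presumably come from the existing embedding model, with the keyword term acting as an explicit skill weight, which addresses the limitation noted above that cosine similarity treats all terms equally.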

app.py CHANGED

@@ -16,15 +16,13 @@ def process_resumes(job_description, uploaded_files):
     resume_texts = []
     for uploaded_file in uploaded_files:
-        # uploaded_file is a file path string from gr.Files
         filename = os.path.basename(uploaded_file)
         ext = filename.lower().split(".")[-1]
-        # Copy the file from Gradio temp folder to your uploads folder
         file_path = os.path.join(UPLOAD_FOLDER, filename)
         shutil.copy(uploaded_file, file_path)
-        # Read content based on extension
         if ext == "txt":
             with open(file_path, "r", encoding="utf-8") as f:
                 text = f.read()
@@ -54,13 +52,17 @@ def process_resumes(job_description, uploaded_files):
     # Rank resumes and generate summaries
     results = rank_resumes(job_description, resume_texts)
     for candidate in results:
         candidate["summary"] = summarize_resume_flan(candidate["text"], job_description)
     table_data = [
         [
-            candidate.get("applicant_name", extract_applicant_name(candidate["text"], candidate.get("name", "Unknown"))),
-            candidate.get("name", "Unknown"),
             f"{candidate['score']:.4f}",
             candidate["summary"]
         ] for candidate in results

     resume_texts = []
     for uploaded_file in uploaded_files:
         filename = os.path.basename(uploaded_file)
         ext = filename.lower().split(".")[-1]
+        # Copying the file from Gradio temp folder to uploads folder
         file_path = os.path.join(UPLOAD_FOLDER, filename)
         shutil.copy(uploaded_file, file_path)
         if ext == "txt":
             with open(file_path, "r", encoding="utf-8") as f:
                 text = f.read()
     # Rank resumes and generate summaries
     results = rank_resumes(job_description, resume_texts)
+    # Attach filename to each candidate for display
+    for i, candidate in enumerate(results):
+        candidate["filename"] = resume_texts[i][0]
     for candidate in results:
         candidate["summary"] = summarize_resume_flan(candidate["text"], job_description)
     table_data = [
         [
+            candidate.get("applicant_name", extract_applicant_name(candidate["text"], candidate.get("filename", "Unknown"))),
+            candidate.get("filename", "Unknown"),
             f"{candidate['score']:.4f}",
             candidate["summary"]
         ] for candidate in results
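
One thing worth checking in the new filename loop: it pairs `results[i]` with `resume_texts[i]` by position, which is only correct if `rank_resumes` returns candidates in the same order as its input. If it sorts by score (not visible in this diff, so this is an assumption), the filenames could end up attached to the wrong candidates. A minimal sketch of an order-independent alternative, assuming `resume_texts` is a list of `(filename, text)` pairs and each candidate's `"text"` is the unmodified resume text; the helper name `attach_filenames` is hypothetical:

```python
def attach_filenames(results, resume_texts):
    """Attach the source filename to each ranked candidate by matching text.

    Assumes resume_texts is a list of (filename, text) pairs and each
    candidate dict has a "text" key; both are inferred from the diff.
    """
    filename_by_text = {}
    for filename, text in resume_texts:
        # Keep the first filename seen for a given text.
        filename_by_text.setdefault(text, filename)
    for candidate in results:
        candidate["filename"] = filename_by_text.get(candidate["text"], "Unknown")
    return results


if __name__ == "__main__":
    resume_texts = [("a.txt", "alpha resume"), ("b.txt", "beta resume")]
    # Simulated ranked output whose order differs from the upload order.
    results = [
        {"text": "beta resume", "score": 0.9},
        {"text": "alpha resume", "score": 0.1},
    ]
    for candidate in attach_filenames(results, resume_texts):
        print(candidate["filename"], candidate["score"])
```

If `rank_resumes` preprocesses the text before returning it, matching on text would fail and fall back to "Unknown"; in that case carrying the filename through `rank_resumes` itself would be the more reliable option.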