samu commited on
Commit
77e56ff
·
1 Parent(s): 8e70df9

optimized backend

Browse files
backend/__pycache__/cache.cpython-310.pyc ADDED
Binary file (1.15 kB). View file
 
backend/__pycache__/cache.cpython-312.pyc ADDED
Binary file (2.24 kB). View file
 
backend/__pycache__/config.cpython-310.pyc CHANGED
Binary files a/backend/__pycache__/config.cpython-310.pyc and b/backend/__pycache__/config.cpython-310.pyc differ
 
backend/__pycache__/config.cpython-312.pyc CHANGED
Binary files a/backend/__pycache__/config.cpython-312.pyc and b/backend/__pycache__/config.cpython-312.pyc differ
 
backend/__pycache__/main.cpython-310.pyc CHANGED
Binary files a/backend/__pycache__/main.cpython-310.pyc and b/backend/__pycache__/main.cpython-310.pyc differ
 
backend/__pycache__/main.cpython-312.pyc CHANGED
Binary files a/backend/__pycache__/main.cpython-312.pyc and b/backend/__pycache__/main.cpython-312.pyc differ
 
backend/cache.py ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import asyncio
2
+ from typing import Any, Callable, Dict, Tuple
3
+
4
+ class AsyncLRUCache:
5
+ def __init__(self, maxsize=100_000): # Optimized for 16GB RAM
6
+ self.cache: Dict[Tuple, Any] = {}
7
+ self.order = []
8
+ self.maxsize = maxsize
9
+ self.lock = asyncio.Lock()
10
+
11
+ async def get_or_set(self, key: Tuple, coro: Callable, *args, **kwargs):
12
+ async with self.lock:
13
+ if key in self.cache:
14
+ # Move key to end to show it was recently used
15
+ self.order.remove(key)
16
+ self.order.append(key)
17
+ return self.cache[key]
18
+ # Not cached, compute result
19
+ result = await coro(*args, **kwargs)
20
+ async with self.lock:
21
+ self.cache[key] = result
22
+ self.order.append(key)
23
+ if len(self.order) > self.maxsize:
24
+ oldest = self.order.pop(0)
25
+ del self.cache[oldest]
26
+ return result
27
+
28
+ # Initialize cache with optimized size for 16GB RAM
29
+ cache = AsyncLRUCache() # Uses default maxsize=100_000
backend/config.py CHANGED
@@ -142,15 +142,12 @@ flashcard_mode_instructions = """
142
  # Native language: {native_language}
143
  # Target language: {target_language}
144
  # Proficiency level: {proficiency}
145
-
146
  You are a highly adaptive vocabulary tutor capable of teaching any language. Your goal is to help users learn rapidly by generating personalized flashcards from lesson-based content.
147
-
148
  ### Input Format
149
  You will receive a structured lesson as input (text, dialogue, or vocabulary list). Use this input to:
150
  - Identify new or useful vocabulary terms.
151
  - Extract contextually relevant and domain-specific language.
152
  - Ensure that flashcards reflect the lesson's language, style, and purpose.
153
-
154
  ### Generation Guidelines
155
  When generating flashcards:
156
  1. **Use the provided metadata**:
@@ -160,15 +157,12 @@ When generating flashcards:
160
  - *Beginner*: High-frequency, essential words.
161
  - *Intermediate*: Broader, topic-specific terms and common collocations.
162
  - *Advanced*: Nuanced, idiomatic, or technical vocabulary.
163
-
164
  2. **Contextual relevance**:
165
  - Flashcards should reflect the themes, activities, or domain of the lesson input (e.g., cooking, business, travel).
166
  - Ensure that example sentences are directly related to the input content and sound natural in use.
167
-
168
  3. **Avoid redundancy**:
169
  - Select terms that are novel, useful, or not overly repetitive within the lesson.
170
  - Prioritize terms that learners are likely to encounter again in real-world usage.
171
-
172
  ### Flashcard Format
173
  Generate exactly **10 flashcards** as a **valid JSON array**, with each flashcard containing:
174
  - `"word"`: A key word or phrase in {target_language} drawn from the lesson.
 
142
  # Native language: {native_language}
143
  # Target language: {target_language}
144
  # Proficiency level: {proficiency}
 
145
  You are a highly adaptive vocabulary tutor capable of teaching any language. Your goal is to help users learn rapidly by generating personalized flashcards from lesson-based content.
 
146
  ### Input Format
147
  You will receive a structured lesson as input (text, dialogue, or vocabulary list). Use this input to:
148
  - Identify new or useful vocabulary terms.
149
  - Extract contextually relevant and domain-specific language.
150
  - Ensure that flashcards reflect the lesson's language, style, and purpose.
 
151
  ### Generation Guidelines
152
  When generating flashcards:
153
  1. **Use the provided metadata**:
 
157
  - *Beginner*: High-frequency, essential words.
158
  - *Intermediate*: Broader, topic-specific terms and common collocations.
159
  - *Advanced*: Nuanced, idiomatic, or technical vocabulary.
 
160
  2. **Contextual relevance**:
161
  - Flashcards should reflect the themes, activities, or domain of the lesson input (e.g., cooking, business, travel).
162
  - Ensure that example sentences are directly related to the input content and sound natural in use.
 
163
  3. **Avoid redundancy**:
164
  - Select terms that are novel, useful, or not overly repetitive within the lesson.
165
  - Prioritize terms that learners are likely to encounter again in real-world usage.
 
166
  ### Flashcard Format
167
  Generate exactly **10 flashcards** as a **valid JSON array**, with each flashcard containing:
168
  - `"word"`: A key word or phrase in {target_language} drawn from the lesson.
backend/main.py CHANGED
@@ -3,13 +3,12 @@ from fastapi.responses import JSONResponse
3
  from fastapi.middleware.cors import CORSMiddleware
4
  from pydantic import BaseModel
5
  from backend.utils import generate_completions
 
6
  from backend import config
7
- from backend.database import get_db_connection
8
- import psycopg2
9
- from psycopg2.extras import RealDictCursor
10
  from typing import Union, List, Literal, Optional
11
  import logging
12
  import json
 
13
 
14
  logging.basicConfig(level=logging.INFO)
15
 
@@ -24,14 +23,6 @@ app.add_middleware(
24
  allow_headers=["*"], # Allows all headers
25
  )
26
 
27
- # Dependency to get database connection
28
- async def get_db():
29
- conn = await get_db_connection()
30
- try:
31
- yield conn
32
- finally:
33
- conn.close()
34
-
35
  class Message(BaseModel):
36
  role: Literal["user", "assistant"]
37
  content: str
@@ -46,11 +37,6 @@ class GenerationRequest(BaseModel):
46
  class MetadataRequest(BaseModel):
47
  query: str
48
 
49
- # Global metadata variables
50
- native_language: Optional[str] = None
51
- target_language: Optional[str] = None
52
- proficiency: Optional[str] = None
53
-
54
  @app.get("/")
55
  async def root():
56
  return {"message": "Welcome to the AI Learning Assistant API!"}
@@ -59,15 +45,13 @@ async def root():
59
  async def extract_metadata(data: MetadataRequest):
60
  logging.info(f"Query: {data.query}")
61
  try:
62
- response_str = await generate_completions.get_completions(
 
 
63
  data.query,
64
  config.language_metadata_extraction_prompt
65
  )
66
  metadata_dict = json.loads(response_str)
67
- # Update globals for other endpoints
68
- globals()['native_language'] = metadata_dict.get('native_language', 'unknown')
69
- globals()['target_language'] = metadata_dict.get('target_language', 'unknown')
70
- globals()['proficiency'] = metadata_dict.get('proficiency', 'unknown')
71
  return JSONResponse(
72
  content={
73
  "data": metadata_dict,
@@ -81,109 +65,32 @@ async def extract_metadata(data: MetadataRequest):
81
 
82
  @app.post("/generate/curriculum")
83
  async def generate_curriculum(data: GenerationRequest):
84
- try:
85
- # Use metadata from request or fallback to globals
86
- nl = data.native_language or native_language or "unknown"
87
- tl = data.target_language or target_language or "unknown"
88
- prof = data.proficiency or proficiency or "unknown"
89
- instructions = (
90
- config.curriculum_instructions
91
- .replace("{native_language}", nl)
92
- .replace("{target_language}", tl)
93
- .replace("{proficiency}", prof)
94
- )
95
- response = await generate_completions.get_completions(
96
- data.query,
97
- instructions
98
- )
99
- return JSONResponse(
100
- content={
101
- "data": response,
102
- "type": "curriculum",
103
- "status": "success"
104
- },
105
- status_code=200
106
- )
107
- except Exception as e:
108
- raise HTTPException(status_code=500, detail=str(e))
109
 
110
  @app.post("/generate/flashcards")
111
  async def generate_flashcards(data: GenerationRequest):
112
- try:
113
- nl = data.native_language or native_language or "unknown"
114
- tl = data.target_language or target_language or "unknown"
115
- prof = data.proficiency or proficiency or "unknown"
116
- instructions = (
117
- config.flashcard_mode_instructions
118
- .replace("{native_language}", nl)
119
- .replace("{target_language}", tl)
120
- .replace("{proficiency}", prof)
121
- )
122
- response = await generate_completions.get_completions(
123
- data.query,
124
- instructions
125
- )
126
- return JSONResponse(
127
- content={
128
- "data": response,
129
- "type": "flashcards",
130
- "status": "success"
131
- },
132
- status_code=200
133
- )
134
- except Exception as e:
135
- raise HTTPException(status_code=500, detail=str(e))
136
 
137
  @app.post("/generate/exercises")
138
  async def generate_exercises(data: GenerationRequest):
139
- try:
140
- nl = data.native_language or native_language or "unknown"
141
- tl = data.target_language or target_language or "unknown"
142
- prof = data.proficiency or proficiency or "unknown"
143
- instructions = (
144
- config.exercise_mode_instructions
145
- .replace("{native_language}", nl)
146
- .replace("{target_language}", tl)
147
- .replace("{proficiency}", prof)
148
- )
149
- response = await generate_completions.get_completions(
150
- data.query,
151
- instructions
152
- )
153
- return JSONResponse(
154
- content={
155
- "data": response,
156
- "type": "exercises",
157
- "status": "success"
158
- },
159
- status_code=200
160
- )
161
- except Exception as e:
162
- raise HTTPException(status_code=500, detail=str(e))
163
 
164
  @app.post("/generate/simulation")
165
  async def generate_simulation(data: GenerationRequest):
166
- try:
167
- nl = data.native_language or native_language or "unknown"
168
- tl = data.target_language or target_language or "unknown"
169
- prof = data.proficiency or proficiency or "unknown"
170
- instructions = (
171
- config.simulation_mode_instructions
172
- .replace("{native_language}", nl)
173
- .replace("{target_language}", tl)
174
- .replace("{proficiency}", prof)
175
- )
176
- response = await generate_completions.get_completions(
177
- data.query,
178
- instructions
179
- )
180
- return JSONResponse(
181
- content={
182
- "data": response,
183
- "type": "simulation",
184
- "status": "success"
185
- },
186
- status_code=200
187
- )
188
- except Exception as e:
189
- raise HTTPException(status_code=500, detail=str(e))
 
3
  from fastapi.middleware.cors import CORSMiddleware
4
  from pydantic import BaseModel
5
  from backend.utils import generate_completions
6
+ from backend.utils.handlers import handle_generation_request, INSTRUCTION_TEMPLATES
7
  from backend import config
 
 
 
8
  from typing import Union, List, Literal, Optional
9
  import logging
10
  import json
11
+ from backend.cache import cache
12
 
13
  logging.basicConfig(level=logging.INFO)
14
 
 
23
  allow_headers=["*"], # Allows all headers
24
  )
25
 
 
 
 
 
 
 
 
 
26
  class Message(BaseModel):
27
  role: Literal["user", "assistant"]
28
  content: str
 
37
  class MetadataRequest(BaseModel):
38
  query: str
39
 
 
 
 
 
 
40
  @app.get("/")
41
  async def root():
42
  return {"message": "Welcome to the AI Learning Assistant API!"}
 
45
  async def extract_metadata(data: MetadataRequest):
46
  logging.info(f"Query: {data.query}")
47
  try:
48
+ response_str = await cache.get_or_set(
49
+ (str(data.query), config.language_metadata_extraction_prompt),
50
+ generate_completions.get_completions,
51
  data.query,
52
  config.language_metadata_extraction_prompt
53
  )
54
  metadata_dict = json.loads(response_str)
 
 
 
 
55
  return JSONResponse(
56
  content={
57
  "data": metadata_dict,
 
65
 
66
  @app.post("/generate/curriculum")
67
  async def generate_curriculum(data: GenerationRequest):
68
+ return await handle_generation_request(
69
+ data=data,
70
+ mode="curriculum",
71
+ instructions_template=INSTRUCTION_TEMPLATES["curriculum"]
72
+ )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
73
 
74
  @app.post("/generate/flashcards")
75
  async def generate_flashcards(data: GenerationRequest):
76
+ return await handle_generation_request(
77
+ data=data,
78
+ mode="flashcards",
79
+ instructions_template=INSTRUCTION_TEMPLATES["flashcards"]
80
+ )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
81
 
82
  @app.post("/generate/exercises")
83
  async def generate_exercises(data: GenerationRequest):
84
+ return await handle_generation_request(
85
+ data=data,
86
+ mode="exercises",
87
+ instructions_template=INSTRUCTION_TEMPLATES["exercises"]
88
+ )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
89
 
90
  @app.post("/generate/simulation")
91
  async def generate_simulation(data: GenerationRequest):
92
+ return await handle_generation_request(
93
+ data=data,
94
+ mode="simulation",
95
+ instructions_template=INSTRUCTION_TEMPLATES["simulation"]
96
+ )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
backend/utils/__pycache__/generate_completions.cpython-310.pyc CHANGED
Binary files a/backend/utils/__pycache__/generate_completions.cpython-310.pyc and b/backend/utils/__pycache__/generate_completions.cpython-310.pyc differ
 
backend/utils/__pycache__/generate_completions.cpython-312.pyc CHANGED
Binary files a/backend/utils/__pycache__/generate_completions.cpython-312.pyc and b/backend/utils/__pycache__/generate_completions.cpython-312.pyc differ
 
backend/utils/__pycache__/handlers.cpython-310.pyc ADDED
Binary file (1.94 kB). View file
 
backend/utils/__pycache__/handlers.cpython-312.pyc ADDED
Binary file (2.71 kB). View file
 
backend/utils/generate_completions.py CHANGED
@@ -40,32 +40,6 @@ def process_input(data: Union[str, List[Dict[str, str]]]) -> Union[str, List[Dic
40
  else:
41
  raise TypeError("Input must be a string or a list of dictionaries with a 'content' field")
42
 
43
-
44
- # async def get_completions(
45
- # prompt: Union[str, List[Dict[str, str]]],
46
- # instructions: str
47
- # ) -> str:
48
- # processed_prompt = process_input(prompt) # Ensures the input format is correct
49
-
50
- # if isinstance(processed_prompt, str):
51
- # messages = [
52
- # {"role": "system", "content": instructions},
53
- # {"role": "user", "content": processed_prompt}
54
- # ]
55
- # elif isinstance(processed_prompt, list):
56
- # messages = [{"role": "system", "content": instructions}] + processed_prompt
57
- # else:
58
- # raise TypeError("Unexpected processed input type.")
59
-
60
- # response = await client.chat.completions.create(
61
- # model=os.getenv("MODEL"),
62
- # messages=messages,
63
- # response_format={"type": "json_object"}
64
- # )
65
-
66
- # output: str = response.choices[0].message.content
67
- # return output
68
-
69
  async def get_completions(
70
  prompt: Union[str, List[Dict[str, str]]],
71
  instructions: str
 
40
  else:
41
  raise TypeError("Input must be a string or a list of dictionaries with a 'content' field")
42
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
43
  async def get_completions(
44
  prompt: Union[str, List[Dict[str, str]]],
45
  instructions: str
backend/utils/handlers.py ADDED
@@ -0,0 +1,65 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from fastapi import HTTPException
from fastapi.responses import JSONResponse
from typing import Callable, Dict, Any
from backend import config
from backend.cache import cache
from backend.utils import generate_completions


async def handle_generation_request(
    data: Any,
    mode: str,
    instructions_template: str
) -> JSONResponse:
    """
    Shared handler for all generation endpoints (curriculum, flashcards, exercises, simulation).

    Args:
        data: The GenerationRequest object containing query and metadata
        mode: The type of generation (curriculum, flashcards, exercises, simulation)
        instructions_template: The template string from config to use

    Returns:
        JSONResponse with the generated content

    Raises:
        HTTPException: 400 if required metadata is missing; 500 if generation fails
    """
    # Validate required metadata
    if not (data.native_language and data.target_language and data.proficiency):
        raise HTTPException(
            status_code=400,
            detail="native_language, target_language, and proficiency are required. Please extract metadata first."
        )

    # Format instructions with metadata. str.replace (not str.format) is used
    # because the templates contain literal JSON braces in their examples.
    instructions = (
        instructions_template
        .replace("{native_language}", data.native_language)
        .replace("{target_language}", data.target_language)
        .replace("{proficiency}", data.proficiency)
    )

    # Get response from cache or generate new. Surface failures as a 500
    # (matching the per-endpoint try/except the previous endpoints used)
    # rather than letting an unhandled exception escape to the ASGI layer.
    try:
        response = await cache.get_or_set(
            (str(data.query), instructions),
            generate_completions.get_completions,
            data.query,
            instructions
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e)) from e

    return JSONResponse(
        content={
            "data": response,
            "type": mode,
            "status": "success"
        },
        status_code=200
    )


# Mapping of modes to their instruction templates (defined in backend.config)
INSTRUCTION_TEMPLATES: Dict[str, str] = {
    "curriculum": config.curriculum_instructions,
    "flashcards": config.flashcard_mode_instructions,
    "exercises": config.exercise_mode_instructions,
    "simulation": config.simulation_mode_instructions
}
prev_backend_v3/backend/__pycache__/config.cpython-310.pyc ADDED
Binary file (12 kB). View file
 
prev_backend_v3/backend/__pycache__/config.cpython-312.pyc ADDED
Binary file (17.4 kB). View file
 
prev_backend_v3/backend/__pycache__/database.cpython-310.pyc ADDED
Binary file (10.1 kB). View file
 
prev_backend_v3/backend/__pycache__/database.cpython-312.pyc ADDED
Binary file (12.6 kB). View file
 
prev_backend_v3/backend/__pycache__/main.cpython-310.pyc ADDED
Binary file (3.28 kB). View file
 
prev_backend_v3/backend/__pycache__/main.cpython-312.pyc ADDED
Binary file (8.49 kB). View file
 
prev_backend_v3/backend/config.py ADDED
@@ -0,0 +1,226 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ language_metadata_extraction_prompt = """
2
+ You are a language learning assistant. Your task is to analyze the user's input and infer their:
3
+ - Native language (use the language of the input as a fallback if unsure)
4
+ - Target language (the one they want to learn)
5
+ - Proficiency level (beginner, intermediate, or advanced)
6
+ - Title (a brief title summarizing the user's language learning context, written in the user's native language)
7
+ - Description (a catchy, short description of their learning journey, written in the user's native language)
8
+
9
+ Respond ONLY with a valid JSON object using the following format:
10
+
11
+ {
12
+ "native_language": "<user's native language>",
13
+ "target_language": "<language the user wants to learn>",
14
+ "proficiency": "<beginner | intermediate | advanced>",
15
+ "title": "<brief title summarizing the learning context, in the native language>",
16
+ "description": "<catchy, short description of the learning journey, in the native language>"
17
+ }
18
+
19
+ Guidelines:
20
+ - If the user's native language is not explicitly stated, assume it's the same as the language used in the query.
21
+ - If the target language is mentioned indirectly (e.g., "my Dutch isn't great"), infer that as the target language.
22
+ - Make a reasonable guess at proficiency based on clues like "isn't great" → beginner or "I want to improve" → intermediate.
23
+ - If you cannot infer something at all, write "unknown" for native_language, target_language, or proficiency.
24
+ - After inferring the native language, ALWAYS generate the title and description in that language, regardless of the query language or any other context.
25
+ - For title, create a concise phrase (e.g., "Beginner Dutch Adventure" or "Improving Spanish Skills") based on the inferred target language and proficiency, and write it in the user's native language.
26
+ - For description, craft a catchy, short sentence (10-15 words max) that captures the user's learning journey, and write it in the user's native language.
27
+ - If target_language or proficiency is "unknown," use generic but engaging phrases for title and description (e.g., "Language Learning Quest," "Embarking on a new linguistic journey!"), but always in the user's native language.
28
+ - Do not include any explanations, comments, or formatting — only valid JSON.
29
+
30
+ Example:
31
+ User query: "i want to improve my english"
32
+ Expected output:
33
+ {
34
+ "native_language": "english",
35
+ "target_language": "english",
36
+ "proficiency": "intermediate",
37
+ "title": "Improving English Skills",
38
+ "description": "A journey to perfect English for greater fluency and confidence!"
39
+ }
40
+ """
41
+
42
+ curriculum_instructions = """
43
+ # Metadata:
44
+ # Native language: {native_language}
45
+ # Target language: {target_language}
46
+ # Proficiency level: {proficiency}
47
+
48
+ You are an AI-powered language learning assistant tasked with generating an extensive, personalized curriculum. Your goal is to help the user learn {target_language} by designing a 25-lesson curriculum that reflects the user's goals, interests, and proficiency level. All outputs should be written in {native_language}.
49
+
50
+ ### Curriculum Goals:
51
+ - Provide 25 lessons.
52
+ - Ensure logical progression from basic to advanced topics (according to {proficiency}).
53
+ - Align each lesson with a practical communication goal.
54
+ - Tailor vocabulary and sub-topics to the user’s intended use (e.g., work, travel, hobbies, daily life).
55
+
56
+ ### Instructions:
57
+
58
+ 1. **Define the Lesson Series (Overall Theme):**
59
+ - Choose a main theme relevant to the user's motivation for learning {target_language} (e.g., "Living in a new country", "Professional communication", "Traveling in {target_language}-speaking regions").
60
+ - The theme should guide the tone, content, and scope of the entire 25-lesson sequence.
61
+
62
+ 2. **Divide the Curriculum into 25 Thematic Lessons:**
63
+ - Each lesson should have a clear focus (e.g., asking for help, describing your job, booking accommodation).
64
+ - Sequence lessons to build from foundational topics to more complex, specialized language use.
65
+ - Vary grammar, vocabulary, and communication functions across lessons to avoid repetition and ensure comprehensive coverage.
66
+
67
+ 3. **Describe Each Lesson Clearly and Concisely:**
68
+ For each of the 25 lessons, provide:
69
+ - "sub_topic": A clear and practical lesson title in {native_language}.
70
+ - "keywords": A list of 1–3 high-level categories in {native_language} that describe the lesson focus (e.g., "directions", "daily routine", "formal conversation").
71
+ - "description": One sentence in {native_language} that explains what the learner will achieve or be able to do after completing the lesson. Be specific and learner-oriented.
72
+
73
+ ### Output Format:
74
+ Return a valid JSON object with:
75
+ - "lesson_topic": The overall learning theme (in {native_language}).
76
+ - "sub_topics": A list of 25 items. Each item must include:
77
+ - "sub_topic": A short title of the lesson (in {native_language}).
78
+ - "keywords": A list of 1–3 general-purpose categories (in {native_language}).
79
+ - "description": One clear sentence (in {native_language}) describing the purpose of the lesson.
80
+
81
+ Avoid:
82
+ - Using overly generic or repetitive titles or descriptions.
83
+ - Keyword lists with only one-word entries (e.g., use "ordering in a restaurant" instead of "food").
84
+ - Abstract lessons with no real-world relevance.
85
+
86
+ Ensure the curriculum builds toward user fluency in relevant contexts.
87
+ """
88
+
89
+ exercise_mode_instructions = """
90
+ # Metadata:
91
+ # Native language: {native_language}
92
+ # Target language: {target_language}
93
+ # Proficiency level: {proficiency}
94
+
95
+ You are a smart, context-aware language exercise generator. Your task is to create personalized cloze-style exercises that help learners reinforce vocabulary and grammar through realistic, domain-specific practice. You support any language.
96
+
97
+ ### Input Format
98
+ You will receive a structured lesson or topic description (e.g., text excerpt, dialogue, thematic scenario). For example, this could be a short paragraph about daily routines, a dialogue between a customer and a shopkeeper, or a scenario involving travel planning. Use it to:
99
+ - Identify 5 concrete vocabulary items or grammar points suited to the learner’s immediate needs.
100
+ - Ground each exercise in a specific, vivid scenario.
101
+ - Reflect real-world tasks or conversations the learner will encounter.
102
+
103
+ ### Generation Guidelines
104
+ 1. **Metadata usage**
105
+ - **Native language**: Use {native_language} for all explanations.
106
+ - **Target language**: Use {target_language} for sentences, answers, and choices.
107
+ - **Proficiency**:
108
+ - *Beginner*: Focus on high-frequency vocabulary and simple grammar structures, such as present tense, basic prepositions, and common nouns and verbs.
109
+ - *Intermediate*: Incorporate a mix of common and thematic vocabulary, and introduce one new tense or grammatical structure per exercise.
110
+ - *Advanced*: Use domain-specific terminology, idiomatic expressions, and complex syntax to challenge learners.
111
+
112
+ 2. **Sentence specificity**
113
+ - Craft each sentence around a concrete action, object, or event (e.g., “At the café counter, she ___ her order,” not “I want to ___”). To make exercises more engaging, consider adding details that paint a vivid picture, such as specific locations, times, or characters. For instance, use "On a sunny Saturday morning, Maria is heading to the local farmers' market to buy fresh produce" instead of "I am going to the store."
114
+ - Avoid “template” prompts like “I am going to ___” or “I like to ___” without added context.
115
+ - Each sentence must clearly point to one—and only one—correct word or structure.
116
+
117
+ 3. **Unique, unambiguous answers**
118
+ - Design each prompt so distractors could be grammatically plausible but contextually impossible. For example, if the sentence is "She ___ the book on the table," and the correct answer is "put," ensure only "put" fits the context, while distractors like "placed," "set," or "laid" are plausible but incorrect here.
119
+ - Ensure there is no secondary interpretation that could validate another choice.
120
+
121
+ 4. **Plausible distractors**
122
+ - Provide four total options: one correct, three context-related but incorrect.
123
+ - Distractors must belong to the same word class (noun, verb, adjective, etc.) and semantic field.
124
+ - Shuffle answer positions randomly.
125
+ - Ensure distractors are not too similar to the correct answer to avoid confusion.
126
+
127
+ 5. **Explanations**
128
+ - Offer a concise 1–2-sentence rationale in {native_language}, explaining why the correct answer fits this very context and briefly noting why each distractor fails. If space allows, consider adding a brief example or analogy to reinforce the learning point.
129
+
130
+ ### Output Format
131
+ Return exactly **5** cloze-style exercises as a **JSON array**, each element with:
132
+ - `"sentence"`: A fully contextualized sentence in {target_language} containing one blank (`___`).
133
+ - `"answer"`: The single correct fill-in, in {target_language}.
134
+ - `"choices"`: A list of four total options (in randomized order), all in {target_language}.
135
+ - `"explanation"`: A concise note in {native_language} clarifying the correct answer and why others don’t fit.
136
+
137
+ _Do not wrap the array in any additional objects or metadata—output only the raw JSON array._
138
+ """
139
+
140
+ flashcard_mode_instructions = """
141
+ # Metadata:
142
+ # Native language: {native_language}
143
+ # Target language: {target_language}
144
+ # Proficiency level: {proficiency}
145
+
146
+ You are a highly adaptive vocabulary tutor capable of teaching any language. Your goal is to help users learn rapidly by generating personalized flashcards from lesson-based content.
147
+
148
+ ### Input Format
149
+ You will receive a structured lesson as input (text, dialogue, or vocabulary list). Use this input to:
150
+ - Identify new or useful vocabulary terms.
151
+ - Extract contextually relevant and domain-specific language.
152
+ - Ensure that flashcards reflect the lesson's language, style, and purpose.
153
+
154
+ ### Generation Guidelines
155
+ When generating flashcards:
156
+ 1. **Use the provided metadata**:
157
+ - **Native language**: Use {native_language} for definitions.
158
+ - **Target language**: Extract and present vocabulary and examples in {target_language}.
159
+ - **Proficiency level**: Adjust vocabulary complexity based on {proficiency}:
160
+ - *Beginner*: High-frequency, essential words.
161
+ - *Intermediate*: Broader, topic-specific terms and common collocations.
162
+ - *Advanced*: Nuanced, idiomatic, or technical vocabulary.
163
+
164
+ 2. **Contextual relevance**:
165
+ - Flashcards should reflect the themes, activities, or domain of the lesson input (e.g., cooking, business, travel).
166
+ - Ensure that example sentences are directly related to the input content and sound natural in use.
167
+
168
+ 3. **Avoid redundancy**:
169
+ - Select terms that are novel, useful, or not overly repetitive within the lesson.
170
+ - Prioritize terms that learners are likely to encounter again in real-world usage.
171
+
172
+ ### Flashcard Format
173
+ Generate exactly **10 flashcards** as a **valid JSON array**, with each flashcard containing:
174
+ - `"word"`: A key word or phrase in {target_language} drawn from the lesson.
175
+ - `"definition"`: A learner-friendly explanation in {native_language}.
176
+ - `"example"`: A clear, natural sentence in {target_language} demonstrating the word **in context with the lesson**.
177
+ """
178
+
179
+ simulation_mode_instructions = """
180
+ # Metadata:
181
+ # Native language: {native_language}
182
+ # Target language: {target_language}
183
+ # Proficiency level: {proficiency}
184
+
185
+ You are a **creative, context-aware storytelling engine**. Your task is to generate short, engaging stories or dialogues in **any language** to make language learning enjoyable, memorable, and relevant. Stories must reflect the user's interests, profession, or hobbies, and align with their learning level.
186
+
187
+ ### Input Format
188
+ You will receive a user-provided **lesson topic, theme, or domain of interest** (e.g., “a courtroom drama for a law student” or “space mission dialogue for a space enthusiast”). Use this input to:
189
+ - Personalize characters, setting, and vocabulary.
190
+ - Make the story both educational and entertaining.
191
+ - Ensure the language reflects real-world use in that context.
192
+
193
+ ### Story Generation Task
194
+ 1. **Use the provided metadata**:
195
+ - **Native language**: Present explanations, setup, and translations in {native_language}.
196
+ - **Target language**: Write dialogue and narration in {target_language}.
197
+ - **Proficiency level**: Match language complexity to {proficiency}:
198
+ - *Beginner*: Simple grammar, short sentences, high-frequency vocabulary.
199
+ - *Intermediate*: Natural sentence flow, basic narrative devices, slightly challenging vocabulary.
200
+ - *Advanced*: Complex structures, idiomatic expressions, domain-specific language.
201
+
202
+ 2. **Domain relevance**:
203
+ - Base the story or dialogue on the user’s interests or specified topic.
204
+ - Integrate relevant vocabulary and situations (e.g., a chef character using cooking terms, or a pilot discussing navigation).
205
+
206
+ 3. **Engagement and originality**:
207
+ - Make the story fun, dramatic, or surprising to increase engagement.
208
+ - Avoid clichés and repetition—each story should be fresh and imaginative.
209
+ - Vary tone and structure depending on the theme (e.g., suspenseful for a mystery, humorous for a slice-of-life scene).
210
+
211
+ 4. **Educational value**:
212
+ - Use natural-sounding language learners would benefit from hearing or using.
213
+ - Provide translations and (where helpful) phonetic transcription to support pronunciation and comprehension.
214
+
215
+ ### Output Format
216
+ Return a valid **JSON object** with the following structure:
217
+ - `"title"`: An engaging title in {native_language}.
218
+ - `"setting"`: A brief setup paragraph in {native_language} explaining the story’s background and relevance to the user’s interest.
219
+ - `"content"`: A list of **10 segments**, each structured as:
220
+ - `"speaker"`: A named or role-based character label in {native_language} (e.g., "Narrator", "Captain Li", "The Botanist").
221
+ - `"target_language_text"`: The sentence or dialogue line in {target_language}.
222
+ - `"phonetics"`: A phonetic transcription (IPA, Pinyin, etc.), only if helpful or relevant for the target language.
223
+ - `"base_language_translation"`: A simple, clear translation in {native_language}.
224
+
225
+ Ensure that all entries are structured cleanly and consistently. Do not wrap the result in additional containers or metadata.
226
+ """
{backend → prev_backend_v3/backend}/database.py RENAMED
File without changes
prev_backend_v3/backend/main.py ADDED
@@ -0,0 +1,189 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from fastapi import FastAPI, HTTPException
2
+ from fastapi.responses import JSONResponse
3
+ from fastapi.middleware.cors import CORSMiddleware
4
+ from pydantic import BaseModel
5
+ from backend.utils import generate_completions
6
+ from backend import config
7
+ from backend.database import get_db_connection
8
+ import psycopg2
9
+ from psycopg2.extras import RealDictCursor
10
+ from typing import Union, List, Literal, Optional
11
+ import logging
12
+ import json
13
+
14
+ logging.basicConfig(level=logging.INFO)
15
+
16
+ app = FastAPI()
17
+
18
+ # Add CORS middleware
19
+ app.add_middleware(
20
+ CORSMiddleware,
21
+ allow_origins=["*"], # Allows all origins
22
+ allow_credentials=True,
23
+ allow_methods=["*"], # Allows all methods
24
+ allow_headers=["*"], # Allows all headers
25
+ )
26
+
27
# Dependency that hands out a database connection per request.
async def get_db():
    """FastAPI dependency: yield a DB connection, always close it afterwards."""
    connection = await get_db_connection()
    try:
        yield connection
    finally:
        # Runs whether the request handler succeeded or raised.
        connection.close()
34
+
35
class Message(BaseModel):
    # One chat turn sent by the client. Only user/assistant roles are accepted;
    # system instructions are injected server-side from config templates.
    role: Literal["user", "assistant"]  # author of the turn
    content: str  # turn text
38
+
39
class GenerationRequest(BaseModel):
    # Request body shared by all /generate/* endpoints.
    user_id: int
    # Either a single prompt string or a chat history of Message objects.
    query: Union[str, List[Message]]
    # Optional language metadata; when omitted, endpoints fall back to the
    # module-level globals populated by /extract/metadata.
    native_language: Optional[str] = None
    target_language: Optional[str] = None
    proficiency: Optional[str] = None
45
+
46
class MetadataRequest(BaseModel):
    # Request body for /extract/metadata: the raw user text to analyze.
    query: str
48
+
49
# Global metadata variables
# NOTE(review): module-level mutable state shared by every request in this
# process. /extract/metadata overwrites these, so concurrent users can clobber
# each other's values — confirm whether per-request metadata should be required
# instead. Kept as-is for backward compatibility.
native_language: Optional[str] = None
target_language: Optional[str] = None
proficiency: Optional[str] = None
53
+
54
@app.get("/")
async def root():
    """Landing endpoint returning a static welcome payload."""
    welcome = {"message": "Welcome to the AI Learning Assistant API!"}
    return welcome
57
+
58
@app.post("/extract/metadata")
async def extract_metadata(data: MetadataRequest):
    """Extract language metadata from free-form text and cache it in module globals.

    Returns a JSON payload with the parsed metadata; any failure (model call or
    JSON parse) is surfaced as a 500 with the error text.
    """
    logging.info(f"Query: {data.query}")
    try:
        raw = await generate_completions.get_completions(
            data.query,
            config.language_metadata_extraction_prompt,
        )
        metadata = json.loads(raw)
        # Update globals for other endpoints (missing keys default to "unknown").
        for field in ("native_language", "target_language", "proficiency"):
            globals()[field] = metadata.get(field, "unknown")
        payload = {
            "data": metadata,
            "type": "language_metadata",
            "status": "success",
        }
        return JSONResponse(content=payload, status_code=200)
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
81
+
82
@app.post("/generate/curriculum")
async def generate_curriculum(data: GenerationRequest):
    """Generate a curriculum from the query, using request metadata or cached globals."""
    try:
        # Use metadata from request or fallback to globals; "unknown" is the
        # final fallback so the template always gets a value.
        replacements = {
            "{native_language}": data.native_language or native_language or "unknown",
            "{target_language}": data.target_language or target_language or "unknown",
            "{proficiency}": data.proficiency or proficiency or "unknown",
        }
        instructions = config.curriculum_instructions
        for placeholder, value in replacements.items():
            instructions = instructions.replace(placeholder, value)
        result = await generate_completions.get_completions(data.query, instructions)
        return JSONResponse(
            content={"data": result, "type": "curriculum", "status": "success"},
            status_code=200,
        )
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
109
+
110
@app.post("/generate/flashcards")
async def generate_flashcards(data: GenerationRequest):
    """Generate flashcards from the query, using request metadata or cached globals."""
    try:
        # Resolve each template placeholder: request value, then global, then "unknown".
        replacements = {
            "{native_language}": data.native_language or native_language or "unknown",
            "{target_language}": data.target_language or target_language or "unknown",
            "{proficiency}": data.proficiency or proficiency or "unknown",
        }
        instructions = config.flashcard_mode_instructions
        for placeholder, value in replacements.items():
            instructions = instructions.replace(placeholder, value)
        result = await generate_completions.get_completions(data.query, instructions)
        return JSONResponse(
            content={"data": result, "type": "flashcards", "status": "success"},
            status_code=200,
        )
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
136
+
137
@app.post("/generate/exercises")
async def generate_exercises(data: GenerationRequest):
    """Generate exercises from the query, using request metadata or cached globals."""
    try:
        # Request metadata wins; otherwise fall back to globals, then "unknown".
        replacements = {
            "{native_language}": data.native_language or native_language or "unknown",
            "{target_language}": data.target_language or target_language or "unknown",
            "{proficiency}": data.proficiency or proficiency or "unknown",
        }
        instructions = config.exercise_mode_instructions
        for placeholder, value in replacements.items():
            instructions = instructions.replace(placeholder, value)
        result = await generate_completions.get_completions(data.query, instructions)
        return JSONResponse(
            content={"data": result, "type": "exercises", "status": "success"},
            status_code=200,
        )
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
163
+
164
@app.post("/generate/simulation")
async def generate_simulation(data: GenerationRequest):
    """Generate a story/dialogue simulation, using request metadata or cached globals."""
    try:
        # Fill the instruction template: request value, global value, or "unknown".
        replacements = {
            "{native_language}": data.native_language or native_language or "unknown",
            "{target_language}": data.target_language or target_language or "unknown",
            "{proficiency}": data.proficiency or proficiency or "unknown",
        }
        instructions = config.simulation_mode_instructions
        for placeholder, value in replacements.items():
            instructions = instructions.replace(placeholder, value)
        result = await generate_completions.get_completions(data.query, instructions)
        return JSONResponse(
            content={"data": result, "type": "simulation", "status": "success"},
            status_code=200,
        )
    except Exception as exc:
        raise HTTPException(status_code=500, detail=str(exc))
prev_backend_v3/backend/utils/__pycache__/generate_completions.cpython-310.pyc ADDED
Binary file (2.55 kB). View file
 
prev_backend_v3/backend/utils/__pycache__/generate_completions.cpython-312.pyc ADDED
Binary file (3.73 kB). View file
 
prev_backend_v3/backend/utils/generate_completions.py ADDED
@@ -0,0 +1,107 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from openai import AsyncOpenAI, OpenAI
2
+ import asyncio
3
+ import json
4
+ from typing import AsyncIterator
5
+ from typing import Union, List, Dict, Literal
6
+ from dotenv import load_dotenv
7
+ import os
8
+ from pydantic import BaseModel
9
+ load_dotenv()
10
+
11
# Initialize the async client
# Reads BASE_URL / API_KEY from the environment (loaded via load_dotenv above),
# so the same code can target any OpenAI-compatible endpoint.
client = AsyncOpenAI(
    base_url=os.getenv("BASE_URL"),
    api_key=os.getenv("API_KEY"),
)
16
+
17
class Message(BaseModel):
    # One chat turn; mirrors the Message model declared in the API layer (main.py).
    role: Literal["user", "assistant"]  # author of the turn
    content: str  # turn text
20
+
21
# Helper that collapses a chat history into a single prompt string.
def flatten_messages(messages: List[Message]) -> str:
    """Render each message as "role: content" and join them with newlines."""
    lines = [f"{message.role}: {message.content}" for message in messages]
    return "\n".join(lines)
24
+
25
def process_input(data: Union[str, List[Dict[str, str]]]) -> Union[str, List[Dict[str, str]]]:
    """Normalize prompt input before it is sent to the model.

    (The previous docstring claimed strings were uppercased; the code only
    trims whitespace.)

    Args:
        data: A raw prompt string, or a list of chat-message dicts.

    Returns:
        For a string: the string with leading/trailing whitespace removed.
        For a list: a new list keeping only dict items that contain a
        'content' key, each with its 'content' value stripped. Non-dict
        items and dicts lacking 'content' are silently dropped.

    Raises:
        TypeError: if ``data`` is neither a string nor a list.
    """
    if isinstance(data, str):
        return data.strip()

    if isinstance(data, list):
        # Keep only well-formed message dicts and trim their content.
        return [
            {**item, "content": item["content"].strip()}
            for item in data
            if isinstance(item, dict) and "content" in item
        ]

    raise TypeError("Input must be a string or a list of dictionaries with a 'content' field")
42
+
43
+
44
+ # async def get_completions(
45
+ # prompt: Union[str, List[Dict[str, str]]],
46
+ # instructions: str
47
+ # ) -> str:
48
+ # processed_prompt = process_input(prompt) # Ensures the input format is correct
49
+
50
+ # if isinstance(processed_prompt, str):
51
+ # messages = [
52
+ # {"role": "system", "content": instructions},
53
+ # {"role": "user", "content": processed_prompt}
54
+ # ]
55
+ # elif isinstance(processed_prompt, list):
56
+ # messages = [{"role": "system", "content": instructions}] + processed_prompt
57
+ # else:
58
+ # raise TypeError("Unexpected processed input type.")
59
+
60
+ # response = await client.chat.completions.create(
61
+ # model=os.getenv("MODEL"),
62
+ # messages=messages,
63
+ # response_format={"type": "json_object"}
64
+ # )
65
+
66
+ # output: str = response.choices[0].message.content
67
+ # return output
68
+
69
async def get_completions(
    prompt: Union[str, List[Dict[str, str]]],
    instructions: str
) -> str:
    """Call the chat-completions API and return the raw JSON-object response text.

    Args:
        prompt: A raw prompt string, or a chat history. List inputs are
            flattened to a single "role: content" transcript string.
            NOTE(review): flatten_messages accesses ``.role``/``.content``
            attributes, so list items are expected to be Message-like objects
            despite the ``List[Dict]`` hint — confirm against callers.
        instructions: System prompt injected as the first message.

    Returns:
        The model's message content (a JSON-object string, per response_format).
    """
    if isinstance(prompt, list):
        formatted_query = flatten_messages(prompt)
    else:
        formatted_query = prompt

    # process_input always receives a string here (lists were flattened above),
    # so it returns a stripped string. The original list-handling branch after
    # this point could never execute and has been removed as dead code.
    processed_prompt = process_input(formatted_query)

    messages = [
        {"role": "system", "content": instructions},
        {"role": "user", "content": processed_prompt},
    ]

    response = await client.chat.completions.create(
        model=os.getenv("MODEL"),
        messages=messages,
        response_format={"type": "json_object"}
    )

    return response.choices[0].message.content