{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/ryanrodriguez/src/Simplify/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] } ], "source": [ "from ragas.llms import LangchainLLMWrapper\n", "from ragas.embeddings import LangchainEmbeddingsWrapper\n", "from langchain_openai import ChatOpenAI\n", "from langchain_openai import OpenAIEmbeddings\n", "\n", "generator_llm = LangchainLLMWrapper(ChatOpenAI(model=\"gpt-4o\"))\n", "generator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "mkdir: static/: File exists\n", "mkdir: static/training_data: File exists\n", " % Total % Received % Xferd Average Speed Time Time Time Current\n", " Dload Upload Total Spent Left Speed\n", "100 340k 100 340k 0 0 2188k 0 --:--:-- --:--:-- --:--:-- 2196k\n" ] }, { "data": { "text/plain": [ "1" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "!mkdir static/\n", "!mkdir static/training_data\n", "!curl https://python.langchain.com/docs/tutorials/rag/ -o static/training_data/langchain_rag_tutorial.html\n", "\n", "from langchain_community.document_loaders import DirectoryLoader\n", "from langchain_community.document_loaders import BSHTMLLoader\n", "\n", "path = \"static/training_data/\"\n", "text_loader = DirectoryLoader(path, glob=\"*.html\", loader_cls=BSHTMLLoader)\n", "docs = text_loader.load()\n", "len(docs)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Generating personas: 100%|██████████| 1/1 [00:01<00:00, 1.24s/it] 
\n", "Generating Scenarios: 100%|██████████| 2/2 [00:05<00:00, 2.76s/it]\n", "Generating Samples: 100%|██████████| 10/10 [00:56<00:00, 5.68s/it]\n" ] } ], "source": [ "from ragas.testset import TestsetGenerator\n", "\n", "generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)\n", "dataset = generator.generate_with_langchain_docs(docs, testset_size=10)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "df = dataset.to_pandas()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "|  | user_input | reference_contexts | reference | synthesizer_name |\n",
"|---|---|---|---|---|\n",
"| 0 | Cud yu explane how Pydantic is used in LangCha... | [Build a Retrieval Augmented Generation (RAG) ... | The context mentions 'How to use LangChain wit... | single_hop_specifc_query_synthesizer |\n",
"| 1 | how langsmith help when buildin rag apps with ... | [the most powerful applications enabled by LLM... | LangSmith can help trace and understand your a... | single_hop_specifc_query_synthesizer |\n",
"| 2 | Wht is RAG in the context of AI applcations? | [Retrieval and generation: the actual RAG chai... | RAG, or Retrieval and Generation, is a process... | single_hop_specifc_query_synthesizer |\n",
"| 3 | How does LangChain facilitate document retriev... | [Detailed walkthrough Let’s go through the ab... | LangChain facilitates document retrieval and g... | single_hop_specifc_query_synthesizer |\n",
"| 4 | How does LangGraph enhance the development of ... | [a TypedDict, but can also be a Pydantic BaseM... | LangGraph enhances the development of RAG appl... | single_hop_specifc_query_synthesizer |\n",
"| 5 | How does the LangGraph platform enhance the de... | [<1-hop>\\n\\na TypedDict, but can also be a Pyd... | The LangGraph platform enhances the developmen... | multi_hop_specific_query_synthesizer |\n",
"| 6 | How does LangChain utilize document loaders an... | [<1-hop>\\n\\nRetrieval and generation: the actu... | LangChain utilizes document loaders, such as t... | multi_hop_specific_query_synthesizer |\n",
"| 7 | How does LangChain, Inc. facilitate the develo... | [<1-hop>\\n\\nstr query: Search context: List[Do... | LangChain, Inc. facilitates the development of... | multi_hop_specific_query_synthesizer |\n",
"| 8 | How does the RAG technique facilitate sophisti... | [<1-hop>\\n\\nthe most powerful applications ena... | The RAG (Retrieval Augmented Generation) techn... | multi_hop_specific_query_synthesizer |\n",
"| 9 | How can LangChain JS/TS be utilized to build a... | [<1-hop>\\n\\nRetrieval and generation: the actu... | LangChain JS/TS can be utilized to build a Ret... | multi_hop_specific_query_synthesizer |\n", "