{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# settings\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. install neollm\n", "\n", "[Document インストール方法](https://www.notion.so/c760d96f1b4240e6880a32bee96bba35)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# githubのssh接続してね\n", "# versionは適宜変更してね\n", "%pip install git+https://github.com/neoAI-inc/neo-llm-module.git@v1.2.6\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2 環境変数の設定方法\n", "\n", "[Document env ファイルの作り方](https://www.notion.so/env-32ebb04105684a77bbc730c39865df34)\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "環境変数読み込み成功\n" ] } ], "source": [ "from dotenv import load_dotenv\n", "\n", "env_path = \".env\" # .envのpath 適宜変更\n", "if load_dotenv(env_path):\n", " print(\"環境変数読み込み成功\")\n", "else:\n", " print(\"path違うよ〜\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# neoLLM  使い方\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "neollm は、前処理・LLM のリクエスト・後処理を 1 つのクラスにした、Pytorch 的な記法で書ける neoAI の LLM 統一ライブラリ。\n", "\n", "大きく 2 種類のクラスがあり、MyLLM は 1 つのリクエスト、MyL3M2 は複数のリクエストを受け持つことができる。\n", "\n", "![概観図](../asset/external_view.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### モデルの定義\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[43mWARNING: AZURE_API_BASEではなく、AZURE_OPENAI_ENDPOINTにしてね\u001b[0m\n", "\u001b[43mWARNING: AZURE_API_VERSIONではなく、OPENAI_API_VERSIONにしてね\u001b[0m\n" ] } ], "source": [ "from neollm import MyLLM\n", "\n", "# 例: 翻訳をするclass\n", "# _preprocess, _postprocessを必ず書く\n", "\n", "\n", "class Translator(MyLLM):\n", " # _preprocessは、前処理をしてMessageを作る関数\n", " def _preprocess(self, inputs: str):\n", " messages = [\n", " {\"role\": \"system\", \"content\": \"英語を日本語に翻訳するAIです。\"},\n", " {\"role\": \"user\", \"content\": inputs},\n", " ]\n", " return messages\n", "\n", " # _postprocessは、APIのResponseを後処理をして、欲しいものを返す関数\n", " def _postprocess(self, response):\n", " text_translated: str = str(response.choices[0].message.content)\n", " return text_translated" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### モデルの呼び出し\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[41mPARENT\u001b[0m\n", "MyLLM(Translator) ----------------------------------------------------------------------------------\n", "\u001b[34m[inputs]\u001b[0m\n", "\"Hello, We are neoAI.\"\n", "\u001b[34m[messages]\u001b[0m\n", " \u001b[32msystem\u001b[0m\n", " 英語を日本語に翻訳するAIです。\n", " \u001b[32muser\u001b[0m\n", " Hello, We are neoAI.\n", " \u001b[32massistant\u001b[0m\n", " こんにちは、私たちはneoAIです。\n", "\u001b[34m[outputs]\u001b[0m\n", "\"こんにちは、私たちはneoAIです。\"\n", "\u001b[34m[client_settings]\u001b[0m -\n", "\u001b[34m[llm_settings]\u001b[0m {'platform': 'azure', 'temperature': 1, 'model': 'gpt-3.5-turbo-0613', 'engine': 'neoai-free-swd-gpt-35-0613'}\n", "\u001b[34m[metadata]\u001b[0m 1.6s; 45(36+9)tokens; $6.8e-05; ¥0.0095\n", "----------------------------------------------------------------------------------------------------\n", "こんにちは、私たちはneoAIです。\n" ] } ], "source": [ "# 初期化 (platformやmodelなど設定をしておく)\n", "# 詳細: https://www.notion.so/neollm-MyLLM-581cd7562df9473b91c981d88469c452?pvs=4#ac5361a5e3fa46a48441fdd538858fee\n", 
"translator = Translator(\n", " platform=\"azure\", # azure or openai\n", " model=\"gpt-3.5-turbo-0613\", # gpt-3.5-turbo-1106, gpt-4-turbo-1106\n", " llm_settings={\"temperature\": 1}, # llmの設定 dictで渡す\n", ")\n", "\n", "# 呼び出し\n", "# preprocessでinputsとしたものを入力として、postprocessで処理したものを出力とする。\n", "translated_text = translator(inputs=\"Hello, We are neoAI.\")\n", "print(translated_text)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "時間 1.5658628940582275\n", "token数 TokenInfo(input=36, output=9, total=45)\n", "token数合計 45\n", "値段(USD) PriceInfo(input=5.4e-05, output=1.8e-05, total=6.75e-05)\n", "値段数合計(USD) 6.75e-05\n" ] } ], "source": [ "# 処理時間\n", "print(\"時間\", translator.time)\n", "# トークン数\n", "print(\"token数\", translator.token)\n", "print(\"token数合計\", translator.token.total)\n", "# 値段の取得\n", "print(\"値段(USD)\", translator.price)\n", "print(\"値段数合計(USD)\", translator.price.total)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "inputs Hello, We are neoAI.\n", "messages [{'role': 'system', 'content': '英語を日本語に翻訳するAIです。'}, {'role': 'user', 'content': 'Hello, We are neoAI.'}]\n", "response ChatCompletion(id='chatcmpl-8T5MkidV9bhqewdzcUwO1PioHOSHi', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content='こんにちは、私たちはneoAIです。', role='assistant', function_call=None, tool_calls=None), content_filter_results={'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}})], created=1701942830, model='gpt-35-turbo', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=9, prompt_tokens=36, total_tokens=45), prompt_filter_results=[{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}])\n", "outputs こんにちは、私たちはneoAIです。\n", "chat_history [{'role': 'system', 'content': '英語を日本語に翻訳するAIです。'}, {'role': 'user', 'content': 'Hello, We are neoAI.'}, {'content': 'こんにちは、私たちはneoAIです。', 'role': 'assistant'}]\n" ] } ], "source": [ "# その他property\n", "print(\"inputs\", translator.inputs)\n", "print(\"messages\", translator.messages)\n", "print(\"response\", translator.response)\n", "print(\"outputs\", translator.outputs)\n", "\n", "print(\"chat_history\", translator.chat_history)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# neoLLM  例\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1-1 MyLLM (ex. 翻訳)\n" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "from neollm import MyLLM\n", "from neollm.utils.preprocess import optimize_token\n", "from neollm.utils.postprocess import strip_string\n", "\n", "\n", "class Translator(MyLLM):\n", " def _preprocess(self, inputs):\n", " system_prompt = (\n", " \"You are a good translator. 
Translate Japanese into English or English into Japanese.\\n\"\n", " \"# output_format:\\n\\n{translated text in English or Japanese}\"\n", " )\n", " user_prompt = \"\\n\" f\"'''{inputs['text'].strip()}'''\"\n", " messages = [\n", " {\"role\": \"system\", \"content\": optimize_token(system_prompt)},\n", " {\"role\": \"user\", \"content\": optimize_token(user_prompt)},\n", " ]\n", " return messages\n", "\n", " def _ruleprocess(self, inputs):\n", " # 例外処理\n", " if inputs[\"text\"].strip() == \"\":\n", " return {\"text_translated\": \"\"}\n", " # APIリクエストを送る場合はNone\n", " return None\n", "\n", " def _update_settings(self):\n", " # 入力によってAPIの設定を変更する\n", "\n", " # トークン数: self.llm.count_tokens(self.messsage)\n", "\n", " # モデル変更: self.model = \"gpt-3.5-turbo-16k\"\n", "\n", " # パラメータ変更: self.llm_settings = {\"temperature\": 0.2}\n", "\n", " # 入力が多い時に16kを使う(1106の場合はやらなくていい)\n", " if self.messages is not None:\n", " if self.llm.count_tokens(self.messages) >= 1600:\n", " self.model = \"gpt-3.5-turbo-16k-0613\"\n", " else:\n", " self.model = \"gpt-3.5-turbo-0613\"\n", "\n", " def _postprocess(self, response):\n", " text_translated: str = str(response.choices[0].message.content)\n", " text_translated = strip_string(text=text_translated, first_character=[\"\", \"\"])\n", " outputs = {\"text_translated\": text_translated}\n", " return outputs" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[41mPARENT\u001b[0m\n", "MyLLM(Translator) ----------------------------------------------------------------------------------\n", "\u001b[34m[inputs]\u001b[0m\n", "{\n", " \"text\": \"大規模LLMモデル\"\n", "}\n", "\u001b[34m[messages]\u001b[0m\n", " \u001b[32msystem\u001b[0m\n", " You are a good translator. Translate Japanese into English or English into Japanese.\n", " # output_format:\n", " \n", " {translated text in English or Japanese}\n", " \u001b[32muser\u001b[0m\n", " \n", " '''大規模LLMモデル'''\n", " \u001b[32massistant\u001b[0m\n", " \n", " \"Large-Scale LLM Model\"\n", "\u001b[34m[outputs]\u001b[0m\n", "{\n", " \"text_translated\": \"Large-Scale LLM Model\"\n", "}\n", "\u001b[34m[client_settings]\u001b[0m -\n", "\u001b[34m[llm_settings]\u001b[0m {'platform': 'azure', 'temperature': 1, 'model': 'gpt-3.5-turbo-0613', 'engine': 'neoai-free-swd-gpt-35-0613'}\n", "\u001b[34m[metadata]\u001b[0m 1.5s; 66(55+11)tokens; $9.9e-05; ¥0.014\n", "----------------------------------------------------------------------------------------------------\n", "{'text_translated': 'Large-Scale LLM Model'}\n" ] } ], "source": [ "translator = Translator(\n", " llm_settings={\"temperature\": 1}, # defaultは、{\"temperature\": 0}\n", " model=\"gpt-3.5-turbo-0613\", # defaultは、DEFAULT_MODEL_NAME\n", " platform=\"azure\", # defaultは、LLM_PLATFORM\n", " verbose=True,\n", " silent_list=[], # 表示しないもの\n", ")\n", "output_1 = translator(inputs={\"text\": \"大規模LLMモデル\"})\n", "print(output_1)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[43mWARNING: model_nameに日付を指定してください\u001b[0m\n", "model_name: gpt-3.5-turbo -> gpt-3.5-turbo-0613\n", "\u001b[41mPARENT\u001b[0m\n", "MyLLM(Translator) ----------------------------------------------------------------------------------\n", "\u001b[34m[inputs]\u001b[0m\n", "{\n", " \"text\": \"Large LLM Model\"\n", "}\n", "\u001b[34m[messages]\u001b[0m\n", " \u001b[32msystem\u001b[0m\n", " You are a good translator. 
Translate Japanese into English or English into Japanese.\n", " # output_format:\n", " \n", " {translated text in English or Japanese}\n", " \u001b[32muser\u001b[0m\n", " \n", " '''Large LLM Model'''\n", "\u001b[43mWARNING: model_nameに日付を指定してください\u001b[0m\n", "model_name: gpt-3.5-turbo -> gpt-3.5-turbo-0613\n", " \u001b[32massistant\u001b[0m\n", " \n", " 大きなLLMモデル\n", "\u001b[34m[outputs]\u001b[0m\n", "{\n", " \"text_translated\": \"大きなLLMモデル\"\n", "}\n", "\u001b[34m[client_settings]\u001b[0m -\n", "\u001b[34m[llm_settings]\u001b[0m {'platform': 'openai', 'temperature': 0, 'model': 'gpt-3.5-turbo-0613'}\n", "\u001b[34m[metadata]\u001b[0m 0.9s; 61(49+12)tokens; $9.2e-05; ¥0.013\n", "----------------------------------------------------------------------------------------------------\n", "{'text_translated': '大きなLLMモデル'}\n" ] } ], "source": [ "translator = Translator(\n", " platform=\"openai\", # <- try a different platform\n", " verbose=True,\n", ")\n", "output_1 = translator(inputs={\"text\": \"Large LLM Model\"})\n", "print(output_1)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[43mWARNING: model_nameに日付を指定してください\u001b[0m\n", "model_name: gpt-3.5-turbo -> gpt-3.5-turbo-0613\n", "\u001b[41mPARENT\u001b[0m\n", "MyLLM(Translator) ----------------------------------------------------------------------------------\n", "\u001b[34m[inputs]\u001b[0m\n", "{\n", " \"text\": \"\"\n", "}\n", "\u001b[34m[outputs]\u001b[0m\n", "{\n", " \"text_translated\": \"\"\n", "}\n", "\u001b[34m[client_settings]\u001b[0m -\n", "\u001b[34m[llm_settings]\u001b[0m {'platform': 'azure', 'temperature': 0, 'model': 'gpt-3.5-turbo-0613', 'engine': 'neoai-free-swd-gpt-35-0613'}\n", "\u001b[34m[metadata]\u001b[0m 0.0s; 0(0+0)tokens; $0; ¥0\n", "----------------------------------------------------------------------------------------------------\n" ] }, { "data": { "text/plain": [ "{'text_translated': ''}" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The rule-based branch (_ruleprocess) handles this input, so no API request is sent\n", "data = {\"text\": \"\"}\n", "translator = Translator(verbose=True)\n", "translator(data)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[43mWARNING: model_nameに日付を指定してください\u001b[0m\n", "model_name: gpt-3.5-turbo -> gpt-3.5-turbo-0613\n", "\u001b[41mPARENT\u001b[0m\n", "MyLLM(Translator) ----------------------------------------------------------------------------------\n", "\u001b[34m[inputs]\u001b[0m\n", "{\n", " \"text\": \"こんにちは!!\\nこんにちは?こんにちは?\"\n", "}\n", "\u001b[34m[messages]\u001b[0m\n", " \u001b[32msystem\u001b[0m\n", " You are a good translator. Translate Japanese into English or English into Japanese.\n", " # output_format:\n", " \n", " {translated text in English or Japanese}\n", " \u001b[32muser\u001b[0m\n", " \n", " '''こんにちは!!\n", " こんにちは?こんにちは?'''\n", "\u001b[43mWARNING: model_nameに日付を指定してください\u001b[0m\n", "model_name: gpt-3.5-turbo -> gpt-3.5-turbo-0613\n", " \u001b[32massistant\u001b[0m\n", " \n", " Hello!!\n", " Hello? Hello?\n", "\u001b[34m[outputs]\u001b[0m\n", "{\n", " \"text_translated\": \"Hello!!\\nHello? Hello?\"\n", "}\n", "\u001b[34m[client_settings]\u001b[0m -\n", "\u001b[34m[llm_settings]\u001b[0m {'platform': 'azure', 'temperature': 0, 'model': 'gpt-3.5-turbo-0613', 'engine': 'neoai-free-swd-gpt-35-0613'}\n", "\u001b[34m[metadata]\u001b[0m 1.4s; 60(51+9)tokens; $9e-05; ¥0.013\n", "----------------------------------------------------------------------------------------------------\n" ] }, { "data": { "text/plain": [ "{'text_translated': 'Hello!!\\nHello? Hello?'}" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = {\"text\": \"こんにちは!!\\nこんにちは?こんにちは?\"}\n", "translator = Translator(verbose=True)\n", "translator(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Information extraction\n" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "from neollm import MyLLM\n", "from neollm.utils.preprocess import optimize_token, dict2json\n", "from neollm.utils.postprocess import json2dict\n", "\n", "\n", "class Extractor(MyLLM):\n", " def _preprocess(self, inputs):\n", " system_prompt = \"から、にしたがって、情報を抽出しなさい。\"\n", " output_format = {\"date\": \"yy-mm-dd形式 日付\", \"event\": \"起きたことを簡潔に。\"}\n", " user_prompt = (\n", " \"\\n\"\n", " \"```\\n\"\n", " f\"{inputs['info'].strip()}\\n\"\n", " \"```\\n\"\n", " \"\\n\"\n", " \"\\n\"\n", " \"```json\\n\"\n", " f\"{dict2json(output_format)}\\n\"\n", " \"```\"\n", " )\n", "\n", " messages = [\n", " {\"role\": \"system\", \"content\": optimize_token(system_prompt)},\n", " {\"role\": \"user\", \"content\": optimize_token(user_prompt)},\n", " ]\n", " return messages\n", "\n", " def _ruleprocess(self, inputs):\n", " # Handle edge cases without calling the API\n", " if inputs[\"info\"].strip() == \"\":\n", " return {\"date\": \"\", \"event\": \"\"}\n", " # Return None when the API request should be sent\n", " return None\n", "\n", " def _postprocess(self, response):\n", " return json2dict(response.choices[0].message.content)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[41mPARENT\u001b[0m\n", "MyLLM(Extractor) -----------------------------------------------------------------------------------\n", "\u001b[34m[inputs]\u001b[0m\n", "{\n", " \"info\": \"2021年6月13日に、neoAIのサービスが始まりました。\"\n", "}\n", "\u001b[34m[messages]\u001b[0m\n", " \u001b[32msystem\u001b[0m\n", " から、にしたがって、情報を抽出しなさい。\n", " \u001b[32muser\u001b[0m\n", " \n", " ```\n", " 2021年6月13日に、neoAIのサービスが始まりました。\n", " ```\n", " \n", " \n", " ```json\n", " {\n", " \"date\": \"yy-mm-dd形式 日付\",\n", " \"event\": \"起きたことを簡潔に。\"\n", " }\n", " ```\n", " \u001b[32massistant\u001b[0m\n", " ```json\n", " {\n", " \"date\": \"2021-06-13\",\n", " \"event\": \"neoAIのサービスが始まりました。\"\n", " }\n", " ```\n", "\u001b[34m[outputs]\u001b[0m\n", "{\n", " \"date\": \"2021-06-13\",\n", " \"event\": \"neoAIのサービスが始まりました。\"\n", "}\n", "\u001b[34m[client_settings]\u001b[0m -\n", "\u001b[34m[llm_settings]\u001b[0m {'platform': 'azure', 'temperature': 0, 'model': 'gpt-3.5-turbo-0613', 'engine': 'neoai-free-swd-gpt-35-0613'}\n", "\u001b[34m[metadata]\u001b[0m 1.6s; 143(106+37)tokens; $0.00021; ¥0.03\n", "----------------------------------------------------------------------------------------------------\n" ] }, { "data": { "text/plain": [ "{'date': '2021-06-13', 'event': 'neoAIのサービスが始まりました。'}" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "extractor = Extractor(model=\"gpt-3.5-turbo-0613\")\n", "\n", "extractor(inputs={\"info\": \"2021年6月13日に、neoAIのサービスが始まりました。\"})" ] }, {
"cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[41mPARENT\u001b[0m\n", "MyLLM(Extractor) -----------------------------------------------------------------------------------\n", "\u001b[34m[inputs]\u001b[0m\n", "{\n", " \"info\": \"1998年4月1日に、neoAI大学が設立されました。\"\n", "}\n", "\u001b[34m[messages]\u001b[0m\n", " \u001b[32msystem\u001b[0m\n", " から、にしたがって、情報を抽出しなさい。\n", " \u001b[32muser\u001b[0m\n", " \n", " ```\n", " 1998年4月1日に、neoAI大学が設立されました。\n", " ```\n", " \n", " \n", " ```json\n", " {\n", " \"date\": \"yy-mm-dd形式 日付\",\n", " \"event\": \"起きたことを簡潔に。\"\n", " }\n", " ```\n", " \u001b[32massistant\u001b[0m\n", " \n", " ```json\n", " {\n", " \"date\": \"1998-04-01\",\n", " \"event\": \"neoAI大学の設立\"\n", " }\n", " ```\n", "\u001b[34m[outputs]\u001b[0m\n", "{\n", " \"date\": \"1998-04-01\",\n", " \"event\": \"neoAI大学の設立\"\n", "}\n", "\u001b[34m[client_settings]\u001b[0m -\n", "\u001b[34m[llm_settings]\u001b[0m {'platform': 'azure', 'temperature': 0, 'model': 'gpt-3.5-turbo-0613', 'engine': 'neoai-free-swd-gpt-35-0613'}\n", "\u001b[34m[metadata]\u001b[0m 1.6s; 139(104+35)tokens; $0.00021; ¥0.029\n", "----------------------------------------------------------------------------------------------------\n" ] }, { "data": { "text/plain": [ "{'date': '1998-04-01', 'event': 'neoAI大学の設立'}" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "extractor = Extractor(model=\"gpt-3.5-turbo-0613\")\n", "\n", "extractor(inputs={\"info\": \"1998年4月1日に、neoAI大学が設立されました。\"})" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }