{
"cells": [
{
"cell_type": "markdown",
"id": "bf2cde26",
"metadata": {},
"source": [
"# First LLM Classifier\n",
"\n",
"Learn how journalists use large-language models to organize and analyze massive datasets\n",
"\n",
"## What you will learn\n",
"\n",
"This class will give you hands-on experience creating a machine-learning model that can read and categorize the text recorded in newsworthy datasets.\n",
"\n",
"It will teach you how to:\n",
"\n",
"- Submit large-language model prompts with the Python programming language\n",
"- Write structured prompts that can classify text into predefined categories\n",
"- Submit dozens of prompts at once as part of an automated routine\n",
"- Evaluate results using a rigorous, scientific approach\n",
"- Improve results by training the model with rules and examples\n",
"\n",
"By the end, you will understand how LLM classifiers can outperform traditional machine-learning methods with significantly less code. And you will be ready to write a classifier on your own.\n",
"\n",
"## Who can take it\n",
"\n",
"This course is free. Anyone who has dabbled with code and AI is qualified to work through the materials. A curious mind and good attitude are all that’s required, but a familiarity with Python will certainly come in handy.\n",
"\n",
"💬 Need help or want to connect with others? Join the **Journalists on Hugging Face** community by signing up for our Slack group [here](https://forms.gle/JMCULh3jEdgFEsJu5).\n",
"\n",
"## Table of contents\n",
"\n",
"- [1. What we’ll do](ch1-what-we-will-do.ipynb) \n",
"- [2. The LLM advantage](ch2-the-LLM-advantage.ipynb) \n",
"- [3. Getting started with Hugging Face](ch3-getting-started-with-hf.ipynb) \n",
"- [4. Installing JupyterLab (optional)](ch4-installing-jupyterlab.ipynb) \n",
"- [5. Prompting with Python](ch5-prompting-with-python.ipynb) \n",
"- [6. Structured responses](ch6-structured-responses.ipynb) \n",
"- [7. Bulk prompts](ch7-bulk-prompts.ipynb) \n",
"- [8. Evaluating prompts](ch8-evaluating-prompts.ipynb) \n",
"- [9. Improving prompts](ch9-improving-prompts.ipynb) \n",
"- [10. Sharing your app with Gradio](ch10-sharing-with-gradio.ipynb)\n",
"\n",
"## About this class\n",
"[Ben Welsh](https://palewi.re/who-is-ben-welsh/) and [Derek Willis](https://thescoop.org/about/) prepared this guide for [a training session](https://schedules.ire.org/nicar-2025/index.html#2045) at the National Institute for Computer-Assisted Reporting’s 2025 conference in Minneapolis. \n",
"The project was adapted to run on Hugging Face by [Florent Daudens](https://www.linkedin.com/in/fdaudens/). \n",
"\n",
"Some of the copy was written with the assistance of GitHub’s Copilot, an AI-powered text generator. The materials are available as free and open source.\n",
"\n",
"**[1. What we’ll do →](ch1-what-we-will-do.ipynb)**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "02477b14-edff-4380-ad41-9954b6c80863",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}