Commit 5e05897 by stevenbucaille · 1 Parent(s): bf34015

Update README.md to remove outdated introductory content and add a link to the submission video, streamlining the overview of ScouterAI.

Files changed (1)
  1. README.md +1 -29
README.md CHANGED
@@ -14,32 +14,4 @@ short_description: The agent using over 9000 vision models from the HF Hub.
 
  # ScouterAI - The Vision enhanced Agent
 
- Welcome to ScouterAI, my [Agents - MCP Hackathon](https://huggingface.co/Agents-MCP-Hackathon) submission.
- This app falls under the track 3 : Agentic Demo.
- The goal of the app is to demonstrate the capabilities of agentic llm's combined with more "traditional" deep learning computer vision.
- LLM's (and VLM's) are great models when it comes to interacting with the user and understanding its queries but are not (yet) capable of a precise perception of the images presented to them.
- Computer Vision models like object detection or image segmentation models are tailored models to accomplish these tasks but require some engineering to wrap them and be user ready.
- The idea of the agentic demo is to provide powerful LLM with access to expert vision models like object detection or image segmentation models.
- The agent can fulfill precise perception task on any object present in the image : detection, location, classification, masking, counting, etc...
-
- ## Overview
-
- In this preliminary app, the agent is a CodeAgent provided by the smolagents framework.
- Its interface consists of a chat interface with example and a gallery which is used to display the agent's work.
- The agent is provided with a set of tools :
- - Task model retriever : a RAG tool which, given a task (object-detection or image-segmentation) and a query (car e.g.), returns a list of models with their model id and the list of classes it is capable of detecting/segmenting. The list if based on a curated dataset of all the models available on the HuggingFace Hub, returns the mo
- - Computer vision models : Any object detection and image segmentation models available of HuggingFace
- - Image processing functions : Resizing, cropping, ...
- - Image annotation functions : Label, bounding box and mask annotators
-
-
-
- To complete a user request
-
- ## Use-cases
-
- ## Stack
-
- Agent framework : smolagents
- LLM : Anthropic
- Compute : Modal
+ [Submission video](https://youtu.be/FD8sZTjF5_4)