Commit bda8e49 by stevenbucaille · verified · 1 Parent(s): 72a26fd

Update README.md

  1. README.md +29 -2
README.md CHANGED
@@ -1,6 +1,6 @@
1
  ---
2
- title: VisionEnhancedAgent
3
- emoji: 🚀
4
  colorFrom: green
5
  colorTo: gray
6
  sdk: gradio
@@ -11,3 +11,30 @@ license: apache-2.0
11
  tag: agent-demo-track
12
  ---
13
 
1
  ---
2
+ title: ScouterAI
3
+ emoji: 👓
4
  colorFrom: green
5
  colorTo: gray
6
  sdk: gradio
 
11
  tag: agent-demo-track
12
  ---
13
 
14
+ # ScouterAI - The Vision-Enhanced Agent
15
+
16
+ Welcome to ScouterAI, my [Agents - MCP Hackathon](https://huggingface.co/Agents-MCP-Hackathon) submission.
17
+ This app falls under Track 3: Agentic Demo.
18
+ The goal of the app is to demonstrate the capabilities of agentic LLMs combined with more "traditional" deep learning computer vision models.
19
+ LLMs (and VLMs) are great at interacting with users and understanding their queries, but they are not (yet) capable of precise perception of the images presented to them.
20
+ Computer vision models, such as object detection or image segmentation models, are tailored to these tasks but require some engineering to wrap them and make them user-ready.
21
+ The idea of the agentic demo is to give a powerful LLM access to these expert vision models.
22
+ The agent can then fulfill precise perception tasks on any object present in the image: detection, localization, classification, masking, counting, etc.
23
+
24
+ ##
25
+
26
+ In this preliminary app, the agent is a CodeAgent (from the smolagents framework) with access to a set of tools:
27
+ - Any object detection or image segmentation model available on Hugging Face
28
+ - Image processing functions
29
+ - Image annotation functions
30
+
31
+ To complete a user request
32
+
33
+ ## Use-cases
34
+
35
+ ## Stack
36
+
37
+ - Agent framework: smolagents
38
+ - LLM: Anthropic
39
+ - Compute: Modal
40
+
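The README describes an agent that turns raw vision-model output into answers about detection, location, and counting. As a rough illustration only, the sketch below shows the kind of post-processing helper such an agent could call on detection results. It is not taken from the ScouterAI code: the function names (`filter_detections`, `count_objects`, `box_center`) and the `sample` data are hypothetical, and the detection layout (`label`, `score`, `box` with `xmin`/`ymin`/`xmax`/`ymax`) is an assumption modelled on common object-detection pipeline output.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple

# Hypothetical detection record (assumed layout, not ScouterAI's actual format):
# {"label": str, "score": float, "box": {"xmin", "ymin", "xmax", "ymax"}}
Detection = Dict[str, Any]


def filter_detections(detections: List[Detection], min_score: float = 0.5) -> List[Detection]:
    """Keep only detections whose confidence is at least min_score."""
    return [d for d in detections if d["score"] >= min_score]


def count_objects(detections: List[Detection], min_score: float = 0.5) -> Dict[str, int]:
    """Count confident detections per label."""
    kept = filter_detections(detections, min_score)
    return dict(Counter(d["label"] for d in kept))


def box_center(box: Dict[str, float]) -> Tuple[float, float]:
    """Return the (x, y) center of a bounding box in pixel coordinates."""
    return ((box["xmin"] + box["xmax"]) / 2, (box["ymin"] + box["ymax"]) / 2)


# Made-up sample data for illustration; a real detector would produce this list.
sample = [
    {"label": "cat", "score": 0.92, "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 220}},
    {"label": "cat", "score": 0.35, "box": {"xmin": 200, "ymin": 40, "xmax": 260, "ymax": 120}},
    {"label": "dog", "score": 0.81, "box": {"xmin": 300, "ymin": 50, "xmax": 500, "ymax": 350}},
]

counts = count_objects(sample)
first_center = box_center(sample[0]["box"])
```

In an agentic setup, helpers like these would typically be exposed as tools so the LLM can combine them in generated code, for example counting objects first and then locating only the confident ones.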