Space: Agents-MCP-Hackathon/ScouterAI

stevenbucaille committed 9 days ago

Commit 5e05897 (1 parent: bf34015)

Update README.md to remove outdated introductory content and add a link to the submission video, streamlining the overview of ScouterAI.

README.md +1 -29


@@ -14,32 +14,4 @@ short_description: The agent using over 9000 vision models from the HF Hub.
 # ScouterAI - The Vision enhanced Agent
-Welcome to ScouterAI, my [Agents - MCP Hackathon](https://huggingface.co/Agents-MCP-Hackathon) submission.
-This app falls under the track 3 : Agentic Demo.
-The goal of the app is to demonstrate the capabilities of agentic llm's combined with more "traditional" deep learning computer vision.
-LLM's (and VLM's) are great models when it comes to interacting with the user and understanding its queries but are not (yet) capable of a precise perception of the images presented to them.
-Computer Vision models like object detection or image segmentation models are tailored models to accomplish these tasks but require some engineering to wrap them and be user ready.
-The idea of the agentic demo is to provide powerful LLM with access to expert vision models like object detection or image segmentation models.
-The agent can fulfill precise perception task on any object present in the image : detection, location, classification, masking, counting, etc...
-## Overview
-In this preliminary app, the agent is a CodeAgent provided by the smolagents framework.
-Its interface consists of a chat interface with example and a gallery which is used to display the agent's work.
-The agent is provided with a set of tools :
-- Task model retriever : a RAG tool which, given a task (object-detection or image-segmentation) and a query (car e.g.), returns a list of models with their model id and the list of classes it is capable of detecting/segmenting. The list if based on a curated dataset of all the models available on the HuggingFace Hub, returns the mo
-- Computer vision models : Any object detection and image segmentation models available of HuggingFace
-- Image processing functions : Resizing, cropping, ...
-- Image annotation functions : Label, bounding box and mask annotators
-To complete a user request
-## Use-cases
-## Stack
-Agent framework : smolagents
-LLM : Anthropic
-Compute : Modal


+[Submission video](https://youtu.be/FD8sZTjF5_4)