# Scene Graph Generator API
This repository provides an API endpoint for generating scene graphs from images. Upload an image, and the API returns the detected objects, the relationships between them, an annotated copy of the image, and a rendered image of the graph.
## API Usage
### Endpoint
```
POST https://dixisouls-scene-graph-generator.hf.space/generate
```
### Parameters
- `image`: The image file to analyze (multipart/form-data)
- `confidence_threshold`: A value between 0 and 1 (default: 0.5)
- `use_fixed_boxes`: Boolean value (default: false)
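
As a minimal sketch, the two optional parameters could be validated on the client before a request is sent. The helper name `build_form_data` is hypothetical and not part of the API. The boolean is written as lowercase text because that is the spelling used in the cURL example in this README; whether the server accepts other spellings is not documented here.

```python
def build_form_data(confidence_threshold=0.5, use_fixed_boxes=False):
    """Validate the optional parameters and return them as form fields.

    Hypothetical client-side helper; the defaults mirror the documented ones.
    """
    threshold = float(confidence_threshold)
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0 and 1")
    # Serialise the boolean as lowercase text, as in the cURL example.
    return {
        "confidence_threshold": str(threshold),
        "use_fixed_boxes": "true" if use_fixed_boxes else "false",
    }
```

The returned dictionary can be passed as the `data` argument of `requests.post`, next to the `files` argument that carries the image.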
### Response
The API returns a JSON object with the following fields (both images are base64-encoded strings):
```json
{
  "objects": [
    {
      "label": "person",
      "label_id": 1,
      "score": 0.91,
      "bbox": [0.3, 0.4, 0.1, 0.3]
    },
    ...
  ],
  "relationships": [
    {
      "subject": "person",
      "predicate": "riding",
      "object": "bicycle",
      "score": 0.82,
      "subject_id": 0,
      "object_id": 1,
      "predicate_id": 5
    },
    ...
  ],
  "annotated_image": "base64_encoded_image_data",
  "graph_image": "base64_encoded_image_data"
}
```
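
One way to work with the `relationships` list is to turn each entry into a readable triple and keep only the confident ones. The sketch below assumes the field names shown in the response above; the helper `relationship_triples` and its `min_score` argument are hypothetical, and the second relationship in `sample` is a made-up entry that exists only to show the filtering.

```python
def relationship_triples(result, min_score=0.0):
    """Return 'subject predicate object' strings, highest score first."""
    kept = [r for r in result.get("relationships", []) if r["score"] >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return [f"{r['subject']} {r['predicate']} {r['object']}" for r in kept]


# Abbreviated response: the first relationship is the documented example,
# the second is invented for illustration.
sample = {
    "relationships": [
        {"subject": "person", "predicate": "riding", "object": "bicycle", "score": 0.82},
        {"subject": "person", "predicate": "near", "object": "tree", "score": 0.31},
    ]
}

print(relationship_triples(sample, min_score=0.5))
```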
## Example Usage
### Python
```python
import base64
import io

import requests
from PIL import Image

# Prepare the image
image_path = "your_image.jpg"

# Set parameters
data = {
    'confidence_threshold': 0.5,
    'use_fixed_boxes': False
}

# Make the API call; the context manager closes the file afterwards
api_url = "https://dixisouls-scene-graph-generator.hf.space/generate"
with open(image_path, 'rb') as image_file:
    response = requests.post(api_url, files={'image': image_file}, data=data)

# Process the results
if response.status_code == 200:
    result = response.json()

    # Decode and save the images
    annotated_image = Image.open(io.BytesIO(base64.b64decode(result['annotated_image'])))
    annotated_image.save("annotated_image.jpg")
    graph_image = Image.open(io.BytesIO(base64.b64decode(result['graph_image'])))
    graph_image.save("graph_image.jpg")

    # Print information about objects and relationships
    print(f"Found {len(result['objects'])} objects and {len(result['relationships'])} relationships")
else:
    print(f"Error: {response.text}")
```
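
The README does not say which image format the two base64 fields contain. As an alternative sketch, the decoded bytes can be written to disk unchanged, with the file extension chosen from the standard PNG and JPEG file signatures. This avoids re-encoding through Pillow, which cannot write an image with an alpha channel as JPEG. The helper `save_image_field` is hypothetical, and it assumes the fields hold plain base64 without a `data:` URI prefix.

```python
import base64
from pathlib import Path


def save_image_field(encoded, stem, out_dir="."):
    """Decode a base64 image field and write the raw bytes to disk.

    Returns the path written. The extension is guessed from the file
    signature and falls back to '.bin' for anything unrecognised.
    """
    raw = base64.b64decode(encoded)
    if raw.startswith(b"\x89PNG\r\n\x1a\n"):
        suffix = ".png"
    elif raw.startswith(b"\xff\xd8\xff"):
        suffix = ".jpg"
    else:
        suffix = ".bin"
    path = Path(out_dir) / f"{stem}{suffix}"
    path.write_bytes(raw)
    return path
```

With a parsed response in `result`, a call such as `save_image_field(result['annotated_image'], "annotated_image")` would write the file next to the script.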
### cURL
```bash
curl -X POST \
  -F "image=@your_image.jpg" \
  -F "confidence_threshold=0.5" \
  -F "use_fixed_boxes=false" \
  https://dixisouls-scene-graph-generator.hf.space/generate
```
## Model Information
This API uses:
- YOLOv8 for object detection
- A custom neural network for relationship prediction
- PyTorch as the deep learning framework
## License
This project is licensed under the MIT License.
## Author
Created by [dixisouls](https://github.com/dixisouls)