prithivMLmods
/

open-scene-detection

Image Classification

Scene-Detection

Model card Files Files and versions

open-scene-detection / README.md

prithivMLmods's picture

Update README.md

9aa3c3d verified about 2 months ago

|

history blame contribute delete

3.68 kB

	---
	license: apache-2.0
	datasets:
	- prithivMLmods/OpenScene-Classification
	language:
	- en
	base_model:
	- google/siglip-base-patch16-512
	pipeline_tag: image-classification
	library_name: transformers
	tags:
	- SigLIP2
	- Scene-Detection
	- buildings
	- forest
	- glacier
	- mountain
	- sea
	- street
	---

	![scene.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/31-sygJAsY1LaPKIeCylh.png)

	# open-scene-detection

	> open-scene-detection is a vision-language encoder model fine-tuned from [`siglip2-base-patch16-512`](https://huggingface.co/google/siglip-base-patch16-512) for multi-class scene classification. It is trained to recognize and categorize natural and urban scenes using a curated visual dataset. The model uses the `SiglipForImageClassification` architecture.

	```py
	Classification Report:
	precision recall f1-score support

	buildings 0.9755 0.9570 0.9662 2625
	forest 0.9989 0.9955 0.9972 2694
	glacier 0.9564 0.9517 0.9540 2671
	mountain 0.9540 0.9592 0.9566 2723
	sea 0.9934 0.9898 0.9916 2758
	street 0.9595 0.9819 0.9706 2874

	accuracy 0.9728 16345
	macro avg 0.9730 0.9725 0.9727 16345
	weighted avg 0.9729 0.9728 0.9728 16345
	```

	![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/oqlb8a1p6zJuNZSI9PgZO.png)

	---

	## Label Space: 6 Classes

	The model classifies an image into one of the following scenes:

	```
	Class 0: Buildings
	Class 1: Forest
	Class 2: Glacier
	Class 3: Mountain
	Class 4: Sea
	Class 5: Street
	```

	---

	## Install Dependencies

	```bash
	pip install -q transformers torch pillow gradio hf_xet
	```

	---

	## Inference Code

	```python
	import gradio as gr
	from transformers import AutoImageProcessor, SiglipForImageClassification
	from PIL import Image
	import torch

	# Load model and processor
	model_name = "prithivMLmods/open-scene-detection" # Updated model name
	model = SiglipForImageClassification.from_pretrained(model_name)
	processor = AutoImageProcessor.from_pretrained(model_name)

	# Updated label mapping
	id2label = {
	"0": "Buildings",
	"1": "Forest",
	"2": "Glacier",
	"3": "Mountain",
	"4": "Sea",
	"5": "Street"
	}

	def classify_image(image):
	image = Image.fromarray(image).convert("RGB")
	inputs = processor(images=image, return_tensors="pt")

	with torch.no_grad():
	outputs = model(**inputs)
	logits = outputs.logits
	probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

	prediction = {
	id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
	}

	return prediction

	# Gradio Interface
	iface = gr.Interface(
	fn=classify_image,
	inputs=gr.Image(type="numpy"),
	outputs=gr.Label(num_top_classes=6, label="Scene Classification"),
	title="open-scene-detection",
	description="Upload an image to classify the scene into one of six categories: Buildings, Forest, Glacier, Mountain, Sea, or Street."
	)

	if __name__ == "__main__":
	iface.launch()
	```

	---

	## Intended Use

	`open-scene-detection` is designed for:

	* Scene Recognition – Automatically classify natural and urban scenes.
	* Environmental Mapping – Support geographic and ecological analysis from visual data.
	* Dataset Annotation – Efficiently label large-scale image datasets by scene.
	* Visual Search and Organization – Enable smart scene-based filtering or retrieval.
	* Autonomous Systems – Assist navigation and perception modules with scene understanding.