Tonic committed on Sep 15, 2024

Commit 8db717e (unverified)

1 Parent(s): 8c9559c

improve gradio blocks interface

Files changed (1)

app.py +7 -14

@@ -12,12 +12,8 @@ import matplotlib.patches as patches
 from matplotlib.patches import Polygon
 import numpy as np
 import random
-import json
-with open("config.json", "r") as f:
-    config = json.load(f)
 d_model = config['text_config']['d_model']
 num_layers = config['text_config']['encoder_layers']
 attention_heads = config['text_config']['encoder_attention_heads']
@@ -32,10 +28,15 @@ temporal_embeddings = config['vision_config']['visual_temporal_embedding']['max_
 title = """# 🙋🏻‍♂️Welcome to Tonic's PLeIAs/📸📈✍🏻Florence-PDF"""
 description = """
----
 This application showcases the **PLeIAs/📸📈✍🏻Florence-PDF** model, a powerful AI system designed for both **text and image generation tasks**. The model is capable of handling complex tasks such as object detection, image captioning, OCR (Optical Character Recognition), and detailed region-based image analysis.
 ### **How to Use**:
 1. **Upload an Image**: Select an image for processing.
 2. **Choose a Task**: Pick a task from the dropdown menu, such as "Caption", "Object Detection", "OCR", etc.
@@ -50,8 +51,6 @@ You can reset the interface anytime by clicking the **Reset** button.
 - **📸✍🏻OCR**: Extract text from the image.
 - **📸Region Proposal**: Detect key regions in the image for detailed captioning.
----
 ### Join us :
 🌟TeamTonic🌟 is always making cool demos! Join our active builder's 🛠️community 👻 [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP) On 🤗Huggingface:[MultiTransformer](https://huggingface.co/MultiTransformer) On 🌐Github: [Tonic-AI](https://github.com/tonic-ai) & contribute to🌟 [Build Tonic](https://git.tonic-ai.com/contribute)🤗Big thanks to Yuvi Sharma and all the folks at huggingface for the community grant 🤗
 """
@@ -77,12 +76,6 @@ In addition to text tasks, 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF also incor
 - **Patch-based Image Processing**: The vision component operates on image patches with a patch size of **{patch_size}x{patch_size}**.
 - **Temporal Embedding**: Visual tasks benefit from temporal embeddings with up to **{temporal_embeddings} steps**, making Florence-2 well-suited for video analysis.
-### Model Usage and Flexibility
-- **No Repeat N-Grams**: To reduce repetition in text generation, the model is configured with a **no_repeat_ngram_size** of **{no_repeat_ngram_size}**, ensuring more diverse and meaningful outputs.
-- **Sampling Strategies**: 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF offers flexible sampling strategies, including **top-k** and **top-p (nucleus) sampling**, allowing for both creative and constrained generation based on user needs.
-📸📈✍🏻Florence-PDF is a robust model capable of handling various **text and image** tasks with high precision and flexibility, making it a valuable tool for both academic research and practical applications.
 """
 device = "cuda" if torch.cuda.is_available() else "cpu"

 from matplotlib.patches import Polygon
 import numpy as np
 import random
 d_model = config['text_config']['d_model']
 num_layers = config['text_config']['encoder_layers']
 attention_heads = config['text_config']['encoder_attention_heads']
 title = """# 🙋🏻‍♂️Welcome to Tonic's PLeIAs/📸📈✍🏻Florence-PDF"""
 description = """
 This application showcases the **PLeIAs/📸📈✍🏻Florence-PDF** model, a powerful AI system designed for both **text and image generation tasks**. The model is capable of handling complex tasks such as object detection, image captioning, OCR (Optical Character Recognition), and detailed region-based image analysis.
+### Model Usage and Flexibility
+- **No Repeat N-Grams**: To reduce repetition in text generation, the model is configured with a **no_repeat_ngram_size** of **{no_repeat_ngram_size}**, ensuring more diverse and meaningful outputs.
+- **Sampling Strategies**: 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF offers flexible sampling strategies, including **top-k** and **top-p (nucleus) sampling**, allowing for both creative and constrained generation based on user needs.
+📸📈✍🏻Florence-PDF is a robust model capable of handling various **text and image** tasks with high precision and flexibility, making it a valuable tool for both academic research and practical applications.
 ### **How to Use**:
 1. **Upload an Image**: Select an image for processing.
 2. **Choose a Task**: Pick a task from the dropdown menu, such as "Caption", "Object Detection", "OCR", etc.
 - **📸✍🏻OCR**: Extract text from the image.
 - **📸Region Proposal**: Detect key regions in the image for detailed captioning.
 ### Join us :
 🌟TeamTonic🌟 is always making cool demos! Join our active builder's 🛠️community 👻 [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP) On 🤗Huggingface:[MultiTransformer](https://huggingface.co/MultiTransformer) On 🌐Github: [Tonic-AI](https://github.com/tonic-ai) & contribute to🌟 [Build Tonic](https://git.tonic-ai.com/contribute)🤗Big thanks to Yuvi Sharma and all the folks at huggingface for the community grant 🤗
 """
 - **Patch-based Image Processing**: The vision component operates on image patches with a patch size of **{patch_size}x{patch_size}**.
 - **Temporal Embedding**: Visual tasks benefit from temporal embeddings with up to **{temporal_embeddings} steps**, making Florence-2 well-suited for video analysis.
 """
 device = "cuda" if torch.cuda.is_available() else "cpu"
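
Review note: the hunks above drop `import json` and the `with open("config.json", "r")` block, while `config['text_config'][...]` is still read a few lines later. Unless `config` is defined somewhere outside the hunks shown, that lookup would raise a `NameError` at import time. A minimal sketch of one way to keep the loading explicit follows. The helper names `load_config` and `read_text_settings` are hypothetical and not part of `app.py`; only the `config.json` filename and the three `text_config` keys are taken from the diff.

```python
import json
from pathlib import Path


def load_config(path="config.json"):
    """Parse the JSON config file, failing with a clear message if it is missing.

    Hypothetical helper: app.py originally did this inline at module level.
    """
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(f"Expected a model config at {config_path}")
    with config_path.open("r", encoding="utf-8") as f:
        return json.load(f)


def read_text_settings(config):
    """Pull out the text-encoder values that the diff reads from the config."""
    text_config = config["text_config"]
    return {
        "d_model": text_config["d_model"],
        "num_layers": text_config["encoder_layers"],
        "attention_heads": text_config["encoder_attention_heads"],
    }
```

With helpers like these, the module-level assignments could become `settings = read_text_settings(load_config())`, which keeps the dependency on `config.json` visible in one place.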