Tonic committed on
Commit 8db717e · unverified · 1 Parent(s): 8c9559c

improve gradio blocks interface

Files changed (1)
  1. app.py +7 -14
app.py CHANGED
@@ -12,12 +12,8 @@ import matplotlib.patches as patches
 from matplotlib.patches import Polygon
 import numpy as np
 import random
-import json
 
 
-with open("config.json", "r") as f:
-    config = json.load(f)
-
 d_model = config['text_config']['d_model']
 num_layers = config['text_config']['encoder_layers']
 attention_heads = config['text_config']['encoder_attention_heads']
@@ -32,10 +28,15 @@ temporal_embeddings = config['vision_config']['visual_temporal_embedding']['max_
 
 title = """# 🙋🏻‍♂️Welcome to Tonic's PLeIAs/📸📈✍🏻Florence-PDF"""
 description = """
----
-
 This application showcases the **PLeIAs/📸📈✍🏻Florence-PDF** model, a powerful AI system designed for both **text and image generation tasks**. The model is capable of handling complex tasks such as object detection, image captioning, OCR (Optical Character Recognition), and detailed region-based image analysis.
 
+### Model Usage and Flexibility
+
+- **No Repeat N-Grams**: To reduce repetition in text generation, the model is configured with a **no_repeat_ngram_size** of **{no_repeat_ngram_size}**, ensuring more diverse and meaningful outputs.
+- **Sampling Strategies**: 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF offers flexible sampling strategies, including **top-k** and **top-p (nucleus) sampling**, allowing for both creative and constrained generation based on user needs.
+
+📸📈✍🏻Florence-PDF is a robust model capable of handling various **text and image** tasks with high precision and flexibility, making it a valuable tool for both academic research and practical applications.
+
 ### **How to Use**:
 1. **Upload an Image**: Select an image for processing.
 2. **Choose a Task**: Pick a task from the dropdown menu, such as "Caption", "Object Detection", "OCR", etc.
@@ -50,8 +51,6 @@ You can reset the interface anytime by clicking the **Reset** button.
 - **📸✍🏻OCR**: Extract text from the image.
 - **📸Region Proposal**: Detect key regions in the image for detailed captioning.
 
----
-
 ### Join us :
 🌟TeamTonic🌟 is always making cool demos! Join our active builder's 🛠️community 👻 [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP) On 🤗Huggingface:[MultiTransformer](https://huggingface.co/MultiTransformer) On 🌐Github: [Tonic-AI](https://github.com/tonic-ai) & contribute to🌟 [Build Tonic](https://git.tonic-ai.com/contribute)🤗Big thanks to Yuvi Sharma and all the folks at huggingface for the community grant 🤗
 """
@@ -77,12 +76,6 @@ In addition to text tasks, 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF also incor
 - **Patch-based Image Processing**: The vision component operates on image patches with a patch size of **{patch_size}x{patch_size}**.
 - **Temporal Embedding**: Visual tasks benefit from temporal embeddings with up to **{temporal_embeddings} steps**, making Florence-2 well-suited for video analysis.
 
-### Model Usage and Flexibility
-
-- **No Repeat N-Grams**: To reduce repetition in text generation, the model is configured with a **no_repeat_ngram_size** of **{no_repeat_ngram_size}**, ensuring more diverse and meaningful outputs.
-- **Sampling Strategies**: 🙏🏻PLeIAs/📸📈✍🏻Florence-PDF offers flexible sampling strategies, including **top-k** and **top-p (nucleus) sampling**, allowing for both creative and constrained generation based on user needs.
-
-📸📈✍🏻Florence-PDF is a robust model capable of handling various **text and image** tasks with high precision and flexibility, making it a valuable tool for both academic research and practical applications.
 """
 
 device = "cuda" if torch.cuda.is_available() else "cpu"
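The relocated "Model Usage and Flexibility" text carries a `{no_repeat_ngram_size}` placeholder, and the diff reads its neighbouring values from a `config` dictionary. The following is a minimal sketch of how such a placeholder could be filled from a config mapping. It is not taken from `app.py`: the `sample_config` values, the location of `no_repeat_ngram_size` under `text_config`, and the helper name `render_usage_section` are all assumptions made for illustration.

```python
import json

# Hypothetical stand-in for config.json. Only a few keys are shown and the
# values are placeholders, not the model's real settings. Placing
# no_repeat_ngram_size under "text_config" is an assumption.
sample_config = json.loads("""
{
  "text_config": {
    "d_model": 1024,
    "encoder_layers": 12,
    "encoder_attention_heads": 16,
    "no_repeat_ngram_size": 3
  }
}
""")


def render_usage_section(config: dict) -> str:
    """Fill the {no_repeat_ngram_size} placeholder from a config mapping."""
    text_config = config["text_config"]
    # A plain (non-f) string keeps the braces literal until .format() is called.
    template = (
        "### Model Usage and Flexibility\n\n"
        "- **No Repeat N-Grams**: configured with a **no_repeat_ngram_size** "
        "of **{no_repeat_ngram_size}**."
    )
    return template.format(
        no_repeat_ngram_size=text_config["no_repeat_ngram_size"]
    )


usage_section = render_usage_section(sample_config)
print(usage_section)
```

A plain triple-quoted string leaves `{no_repeat_ngram_size}` as literal text unless it is an f-string or is passed through `str.format`, so the same markdown renders differently depending on which kind of string it lives in.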