Riksarkivet/htr_demo

Gabriel committed on Jan 3

Commit

381bbf4

1 Parent(s): 21c87da

Starting adding language support

Files changed (45)

app/{texts_langs/overview → content/ENG}/changelog_roadmap/changelog.md +0 -0
app/{texts_langs/overview → content/ENG}/changelog_roadmap/old_changelog.md +0 -0
app/{texts_langs/overview → content/ENG}/changelog_roadmap/roadmap.md +0 -0
app/{texts_langs/overview → content/ENG}/contributions/contributions.md +0 -0
app/{texts_langs/overview → content/ENG}/contributions/huminfra_image.md +0 -0
app/{texts_langs/overview → content/ENG}/contributions/riksarkivet_image.md +0 -0
app/{texts_langs/overview → content/ENG}/duplicate_api/api1.md +0 -0
app/{texts_langs/overview → content/ENG}/duplicate_api/api2.md +0 -0
app/{texts_langs/overview → content/ENG}/duplicate_api/api_code1.md +0 -0
app/{texts_langs/overview → content/ENG}/duplicate_api/api_code2.md +0 -0
app/{texts_langs/overview → content/ENG}/duplicate_api/duplicate.md +0 -0
app/{texts_langs/overview → content/ENG}/faq_discussion/discussion.md +0 -0
app/{texts_langs/overview → content/ENG}/faq_discussion/faq.md +0 -0
app/{texts_langs/overview → content/ENG}/htrflow/htrflow_col1.md +0 -0
app/{texts_langs/overview → content/ENG}/htrflow/htrflow_col2.md +0 -0
app/{texts_langs/overview → content/ENG}/htrflow/htrflow_row1.md +0 -0
app/{texts_langs/overview → content/ENG}/htrflow/htrflow_tab1.md +0 -0
app/{texts_langs/overview → content/ENG}/htrflow/htrflow_tab2.md +0 -0
app/{texts_langs/overview → content/ENG}/htrflow/htrflow_tab3.md +0 -0
app/{texts_langs/overview → content/ENG}/htrflow/htrflow_tab4.md +0 -0
app/content/SWE/htrflow/htrflow_col1.md +18 -0
app/content/SWE/htrflow/htrflow_col2.md +23 -0
app/content/SWE/htrflow/htrflow_row1.md +3 -0
app/content/SWE/htrflow/htrflow_tab1.md +7 -0
app/content/SWE/htrflow/htrflow_tab2.md +7 -0
app/content/SWE/htrflow/htrflow_tab3.md +7 -0
app/content/SWE/htrflow/htrflow_tab4.md +7 -0
app/content/main_sub_title.md +3 -0
app/content/main_title.md +1 -0
app/gradio_config.py +0 -14
app/main.py +46 -26
app/tabs/adv_htrflow_tab.py +3 -18
app/tabs/htrflow_tab.py +63 -57
app/tabs/overview_tab.py +71 -91
app/templates/steps_template.yaml.j2 +16 -0
app/texts_langs/text_app.py +0 -9
app/texts_langs/text_overview.py +0 -37
app/translation.yaml +13 -0
app/{texts_langs → utils}/__init__.py +0 -0
app/utils/lang_helper.py +7 -0
app/utils/md_helper.py +14 -0
pyproject.toml +2 -1
todo.txt +14 -0
translation.yaml +10 -0
uv.lock +2 -0

app/{texts_langs/overview → content/ENG}/changelog_roadmap/changelog.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/changelog_roadmap/old_changelog.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/changelog_roadmap/roadmap.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/contributions/contributions.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/contributions/huminfra_image.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/contributions/riksarkivet_image.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/duplicate_api/api1.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/duplicate_api/api2.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/duplicate_api/api_code1.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/duplicate_api/api_code2.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/duplicate_api/duplicate.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/faq_discussion/discussion.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/faq_discussion/faq.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/htrflow/htrflow_col1.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/htrflow/htrflow_col2.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/htrflow/htrflow_row1.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/htrflow/htrflow_tab1.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/htrflow/htrflow_tab2.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/htrflow/htrflow_tab3.md RENAMED
File without changes

app/{texts_langs/overview → content/ENG}/htrflow/htrflow_tab4.md RENAMED
File without changes

app/content/SWE/htrflow/htrflow_col1.md ADDED

	@@ -0,0 +1,18 @@

+### Introduktion
+Riksarkivet presenterar en demonstrationspipeline för HTR (Handwritten Text Recognition). Pipelinen består av två instanssegmenteringsmodeller: en tränad för att segmentera textregioner i bilder av löpande-textdokument och en annan tränad för att segmentera textrader inom dessa regioner. Textraderna transkriberas därefter av en textigenkänningsmodell som är tränad på ett stort dataset med svensk handskrift från 1600- till 1800-talet.
+### Användning
+Det är viktigt att betona att denna applikation främst är avsedd för demonstrationsändamål. Målet är att visa upp vår pipeline för att transkribera historiska dokument med löpande text, inte att använda pipelinen i storskalig produktion.
+**Obs**: I framtiden kommer vi att optimera koden för att passa ett produktionsscenario med multi-GPU och batch-inferens, men detta arbete pågår fortfarande. <br>
+För en inblick i de kommande funktionerna vi arbetar med:
+- Navigera till > **Översikt** > **Ändringslogg och roadmap**.
+### Begränsningar
+Demon, som är värd på Huggingface och tilldelad en T4 GPU, kan bara hantera två användarinlämningar åt gången. Om du upplever långa väntetider eller att applikationen inte svarar, är detta anledningen. I framtiden planerar vi att själva vara värdar för denna lösning, med en bättre server för en förbättrad användarupplevelse, optimerad kod och flera modellalternativ. Spännande utveckling är på gång!
+Det är också viktigt att notera att modellerna fungerar på löpande text och inte text i tabellformat.

app/content/SWE/htrflow/htrflow_col2.md ADDED

	@@ -0,0 +1,23 @@

+## Source Code
+Please fork and leave a star on Github if you like it! The code for this project can be found here:
+- [Github](https://github.com/Riksarkivet/HTRFLOW)
+**Note**: In the future we will package all of the code for mass HTR (batch inference on a multi-GPU setup), but the code is still a work in progress.
+## Models
+The models used in this demo are very much a work in progress, and as more data and new architectures become available, they will be retrained and reevaluated. For more information about the models, please refer to their model cards on Hugging Face.
+- [Riksarkivet/rtmdet_regions](https://huggingface.co/Riksarkivet/rtmdet_regions)
+- [Riksarkivet/rtmdet_lines](https://huggingface.co/Riksarkivet/rtmdet_lines)
+- [Riksarkivet/satrn_htr](https://huggingface.co/Riksarkivet/satrn_htr)
+## Datasets
+Training and test sets created by the Swedish National Archives will be released here:
+- [Riksarkivet/placeholder_region_segmentation](https://huggingface.co/datasets/Riksarkivet/placeholder_region_segmentation)
+- [Riksarkivet/placeholder_line_segmentation](https://huggingface.co/datasets/Riksarkivet/placeholder_line_segmentation)
+- [Riksarkivet/placeholder_htr](https://huggingface.co/datasets/Riksarkivet/placeholder_htr)

app/content/SWE/htrflow/htrflow_row1.md ADDED

	@@ -0,0 +1,3 @@

+## The Pipeline in Overview
+
+The steps in the pipeline can be seen below as follows:

app/content/SWE/htrflow/htrflow_tab1.md ADDED

	@@ -0,0 +1,7 @@

+### Binarization
+The reason for binarizing the images before processing them is that we want the models to generalize as well as possible. By training on only binarized images and by binarizing images before running them through the pipeline, we take the target domain closer to the training domain, and reduce negative effects of background variation, background noise etc., on the final results. The pipeline implements a simple adaptive thresholding algorithm for binarization.
+<figure>
+<img src="https://github.com/Borg93/htr_gradio_file_placeholder/blob/main/app_project_bin.png?raw=true" alt="HTR_tool" style="width:70%; display: block; margin-left: auto; margin-right:auto;" >
+</figure>
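
The binarization note above mentions a simple adaptive thresholding algorithm without showing it. The following is a minimal sketch of local-mean adaptive thresholding in plain Python, not the pipeline's actual implementation. The function name `adaptive_threshold`, the `window` size and the offset `c` are assumptions chosen for illustration.

```python
def adaptive_threshold(image, window=3, c=2):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    Hypothetical helper: each pixel is compared with the mean of its
    local neighbourhood minus a small constant `c`.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clipped at the image border.
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - half), min(height, y + half + 1))
                for nx in range(max(0, x - half), min(width, x + half + 1))
            ]
            local_mean = sum(neighbours) / len(neighbours)
            # Pixels clearly darker than their surroundings become ink (0).
            row.append(255 if image[y][x] > local_mean - c else 0)
        result.append(row)
    return result


page = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
binary = adaptive_threshold(page)
```

Because the threshold follows the local mean, a dark stroke is separated from the paper even when the background brightness varies across the page.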

app/content/SWE/htrflow/htrflow_tab2.md ADDED

	@@ -0,0 +1,7 @@

+### Text-region segmentation
+To facilitate the text-line segmentation process, it is advantageous to segment the image into text-regions beforehand. This initial step offers several benefits, including reducing variations in line spacing, eliminating blank areas on the page, establishing a clear reading order, and distinguishing marginalia from the main text. The segmentation model predicts both bounding boxes and masks, but only the masks are used for segmenting lines and regions. An essential post-processing step checks for regions that are contained within other regions: only the containing region is retained, and the contained region is discarded. This keeps the final text-regions free of overlapping or redundant areas, so no duplicate text-regions are sent to the text-recognition model.
+<figure>
+<img src="https://github.com/Borg93/htr_gradio_file_placeholder/blob/main/app_project_region.png?raw=true" alt="HTR_tool" style="width:70%; display: block; margin-left: auto; margin-right:auto;" >
+</figure>
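
The containment check described in the text-region note can be sketched as follows. This is an illustrative simplification, not the pipeline's code: it works on axis-aligned boxes `(x1, y1, x2, y2)`, whereas the text says the pipeline uses masks. The names `contains` and `drop_contained` are hypothetical.

```python
def contains(outer, inner):
    """Return True if box `inner` lies entirely within box `outer`."""
    ox1, oy1, ox2, oy2 = outer
    ix1, iy1, ix2, iy2 = inner
    return ox1 <= ix1 and oy1 <= iy1 and ox2 >= ix2 and oy2 >= iy2


def drop_contained(boxes):
    """Keep only regions that are not contained in another region."""
    kept = []
    for i, box in enumerate(boxes):
        inside_other = any(
            # For exact duplicates, only the later copy is dropped.
            j != i and contains(other, box) and (other != box or j < i)
            for j, other in enumerate(boxes)
        )
        if not inside_other:
            kept.append(box)
    return kept


regions = [(0, 0, 100, 100), (10, 10, 50, 50), (90, 90, 150, 150)]
unique_regions = drop_contained(regions)
```

Note that a region that merely overlaps another one, like the third box here, is kept; only full containment leads to a discard.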

app/content/SWE/htrflow/htrflow_tab3.md ADDED

	@@ -0,0 +1,7 @@

+### Text-line segmentation
+This is also an instance segmentation model, trained to extract text-lines from the cropped text-regions. The text-line segmentation step applies the same post-processing as the text-region segmentation step.
+<figure>
+<img src="https://github.com/Borg93/htr_gradio_file_placeholder/blob/main/app_project_line.png?raw=true" alt="HTR_tool" style="width:70%; display: block; margin-left: auto; margin-right:auto;" >
+</figure>

app/content/SWE/htrflow/htrflow_tab4.md ADDED

	@@ -0,0 +1,7 @@

+### Text Recognition
+The text-recognition model was trained on approximately one million handwritten text-line images ranging from the 17th to the 19th century. See the model card for detailed evaluation results, and results from some fine-tuning experiments.
+<figure>
+<img src="https://github.com/Borg93/htr_gradio_file_placeholder/blob/main/app_project_htr.png?raw=true" alt="HTR_tool" style="width:70%; display: block; margin-left: auto; margin-right:auto;" >
+</figure>

app/content/main_sub_title.md ADDED

	@@ -0,0 +1,3 @@

+<a href="https://riksarkivet.se">
+<img src="https://raw.githubusercontent.com/Borg93/Riksarkivet_docs/main/docs/assets/fav-removebg-preview.png" width="17%" align="right" margin-right="100" />
+</a>

app/content/main_title.md ADDED

	@@ -0,0 +1 @@

+<h1><center> HTRflow 🔍 App </center></h1>

app/gradio_config.py CHANGED

@@ -19,20 +19,6 @@ body > gradio-app > div > div > div.wrap.svelte-1rjryqp > footer > a {
 body > gradio-app > div > div > div.wrap.svelte-1rjryqp > footer > div {
     display: none !important;
 }
-# .top-navbar .tab-container {justify-content: center;}
-# .top-navbar .tab-container button {font-size:large !important;}
 #langdropdown {width: 100px;}
-#column-form .wrap {flex-direction: column; height:100vh;}
-@media screen and (max-width: 1024px) {
-    #column-form .wrap {
-        flex-direction: column;
-        height: auto;
-    }
-}
-#htrflowouttab-button {opacity: 0; cursor:auto;}
 """

 body > gradio-app > div > div > div.wrap.svelte-1rjryqp > footer > div {
     display: none !important;
 }
 #langdropdown {width: 100px;}
 """

app/main.py CHANGED

@@ -3,50 +3,70 @@ import gradio as gr
 from app.gradio_config import css, theme
 from app.tabs.adv_htrflow_tab import adv_htrflow_pipeline
 from app.tabs.htrflow_tab import htrflow_pipeline
-from app.tabs.overview_tab import overview
-from app.texts_langs.text_app import TextApp
 with gr.Blocks(title="HTRflow", theme=theme, css=css) as demo:
     with gr.Row():
         with gr.Column(scale=1):
-            radio = gr.Dropdown(
                 choices=["ENG", "SWE"], value="ENG", container=False, min_width=50, scale=0, elem_id="langdropdown"
             )
         with gr.Column(scale=2):
-            gr.Markdown(TextApp.title_markdown)
         with gr.Column(scale=1):
-            gr.Markdown(TextApp.title_markdown_img)
     with gr.Tabs(elem_classes="top-navbar") as navbar:
-        with gr.Tab("Home"):
             overview.render()
-        with gr.Tab("Simple HTR"):
             htrflow_pipeline.render()
-        with gr.Tab("Custom HTR"):
             adv_htrflow_pipeline.render()
-    # radio.change(
-    #     None,
-    #     inputs=radio,
-    #     js="""
-    #     (data) => {
-    #     window.localStorage.setItem('data', JSON.stringify(data))
-    #     }
-    #     """,
-    # )
-    demo.load(
-        None,
-        inputs=radio,
-        js="""
-        (data) => {
-        window.localStorage.setItem('data', JSON.stringify(data))
-        }
-        """,
     )
 demo.queue()

 from app.gradio_config import css, theme
 from app.tabs.adv_htrflow_tab import adv_htrflow_pipeline
 from app.tabs.htrflow_tab import htrflow_pipeline
+from app.tabs.overview_tab import overview, overview_language
+from app.utils.lang_helper import get_tab_updates
+from app.utils.md_helper import load_markdown
+TAB_LABELS = {
+    "ENG": ["Home", "Simple HTR", "Custom HTR"],
+    "SWE": ["Hem", "Enkel HTR", "Anpassad HTR"],
+}
 with gr.Blocks(title="HTRflow", theme=theme, css=css) as demo:
     with gr.Row():
+        local_language = gr.BrowserState(default_value="ENG", storage_key="selected_language")
+        main_language = gr.State(value="ENG")
         with gr.Column(scale=1):
+            language_selector = gr.Dropdown(
                 choices=["ENG", "SWE"], value="ENG", container=False, min_width=50, scale=0, elem_id="langdropdown"
             )
         with gr.Column(scale=2):
+            gr.Markdown(load_markdown(None, "main_title"))
         with gr.Column(scale=1):
+            gr.Markdown(load_markdown(None, "main_sub_title"))
     with gr.Tabs(elem_classes="top-navbar") as navbar:
+        with gr.Tab(label="Home") as tab_home:
             overview.render()
+        with gr.Tab(label="Simple HTR") as tab_simple_htr:
             htrflow_pipeline.render()
+        with gr.Tab(label="Custom HTR") as tab_custom_htr:
             adv_htrflow_pipeline.render()
+    @demo.load(inputs=[local_language], outputs=[language_selector, main_language, overview_language])
+    def load_language(saved_values):
+        return (saved_values,) * 3
+    @language_selector.change(
+        inputs=[language_selector],
+        outputs=[
+            local_language,
+            main_language,
+            overview_language,
+        ],
     )
+    def save_language_to_browser(selected_language):
+        return (selected_language,) * 3
+    @main_language.change(
+        inputs=[main_language],
+        outputs=[
+            tab_home,
+            tab_simple_htr,
+            tab_custom_htr,
+        ],
+    )
+    def update_main_tabs(selected_language):
+        return (*get_tab_updates(selected_language, TAB_LABELS),)
+    @main_language.change(inputs=[main_language])
+    def on_language_change(selected_language):
+        print(f"Language changed to: {selected_language}")
 demo.queue()
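
The new `main.py` calls `get_tab_updates(selected_language, TAB_LABELS)` and unpacks the result into one output per tab. The body of that helper is not visible in this part of the diff, so the following is only a hypothetical stand-in that shows the expected shape of the data. Plain dicts are used here in place of Gradio updates, and the `fallback` parameter is an assumption.

```python
TAB_LABELS = {
    "ENG": ["Home", "Simple HTR", "Custom HTR"],
    "SWE": ["Hem", "Enkel HTR", "Anpassad HTR"],
}


def get_tab_updates(selected_language, tab_labels, fallback="ENG"):
    """Hypothetical sketch: one update per tab, in declaration order.

    The real helper presumably returns Gradio component updates;
    dicts keep this example free of third-party imports.
    """
    # Unknown language codes fall back to the default labels (assumption).
    labels = tab_labels.get(selected_language, tab_labels[fallback])
    return [{"label": label} for label in labels]


updates = get_tab_updates("SWE", TAB_LABELS)
```

The order of the returned list has to match the order of the `outputs` list in the event handler (`tab_home`, `tab_simple_htr`, `tab_custom_htr`), since the values are assigned positionally.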

app/tabs/adv_htrflow_tab.py CHANGED

@@ -11,24 +11,7 @@ with gr.Blocks() as adv_htrflow_pipeline:
             with gr.Group():
                 with gr.Row(visible=True) as yaml_pipeline:
                     custom_template_yaml = gr.Code(
-                        value="""
-    steps:
-    - step: Segmentation
-        settings:
-        model: yolo
-        model_settings:
-            model: Riksarkivet/yolov9-lines-within-regions-1
-    - step: TextRecognition
-        settings:
-        model: TrOCR
-        model_settings:
-            model: Riksarkivet/trocr-base-handwritten-hist-swe-2
-    - step: OrderLines
-    - step: Export
-        settings:
-        format: txt
-        dest: outputs
-                                    """,
                         language="yaml",
                         label="yaml",
                         interactive=True,
@@ -47,6 +30,8 @@ with gr.Blocks() as adv_htrflow_pipeline:
                     )
                     gr.Image()
                 with gr.Tab("Table"):
                     pass
                 with gr.Tab("Analysis"):

             with gr.Group():
                 with gr.Row(visible=True) as yaml_pipeline:
                     custom_template_yaml = gr.Code(
+                        value="Paste your custom pipeline here",
                         language="yaml",
                         label="yaml",
                         interactive=True,
                     )
                     gr.Image()
+                with gr.Tab("Graph Execution"):
+                    pass
                 with gr.Tab("Table"):
                     pass
                 with gr.Tab("Analysis"):

app/tabs/htrflow_tab.py CHANGED

@@ -1,5 +1,6 @@
 import gradio as gr
 import pandas as pd
 from app.assets.examples import DemoImages
@@ -58,53 +59,61 @@ def get_yaml_button_fn(
     nested_segment_model_2_type=None,
     nested_htr_model_type=None,
 ):
-    if method == "Simple layout":
-        yaml_value = f"""steps:
-  - step: Segmentation
-    settings:
-      model: {simple_htr_model_type}
-      model_settings:
-        model: {simple_segment_model}
-  - step: TextRecognition
-    settings:
-      model: {simple_segment_model_type}
-      model_settings:
-        model: {simple_htr_model}
-  - step: OrderLines
-"""
-    elif method == "Nested segmentation":
-        yaml_value = f"""steps:
-  - step: Segmentation
-    settings:
-      model: {nested_segment_model_1_type}
-      model_settings:
-        model: {nested_segment_model_1}
-  - step: Segmentation
-    settings:
-      model: {nested_segment_model_2_type}
-      model_settings:
-        model: {nested_segment_model_2}
-  - step: TextRecognition
-    settings:
-      model: {nested_htr_model_type}
-      model_settings:
-        model: {nested_htr_model}
-  - step: OrderLines
-"""
-    else:
-        return gr.Error("Invalid method or not yet supported.")
-    export_steps = ""
-    for output_format in output_formats:
-        export_steps += f"""  - step: Export
-    settings:
-      format: {output_format}
-      dest: {output_format}-outputs
-"""
-    yaml_value += export_steps
-    return yaml_value
 output_image_placehholder = gr.Image(label="Output image", height=500, show_share_button=True)
@@ -214,14 +223,14 @@ with gr.Blocks() as htrflow_pipeline:
         with gr.Column():
             # gr.Markdown("<h2>Output Panel</h2>")
             with gr.Tabs():
-                with gr.Tab("Viewer"): #interactive=False, elem_id="htrflowouttab"
                     with gr.Group():
                         with gr.Row():
                             output_image_placehholder.render()
                         with gr.Row():
                             markdown_selected_option.render()
                         with gr.Row():
-                            output_dataframe_pipeline = gr.Textbox(label="Click text",info="click on image bla bla..")
                 with gr.Tab("Table") as htrflow_output_table_tab:
                     with gr.Group():
                         with gr.Row():
@@ -280,11 +289,8 @@ with gr.Blocks() as htrflow_pipeline:
         outputs=[output_yaml_code],
     ).then(dummy_revealer, inputs=output_yaml_code, outputs=output_yaml_code)
-# TODO : hide the tab when selected for yaml code
-# htrflow_output_table_tab.select(dummy_revealer, inputs=output_yaml_code, outputs=output_yaml_code)
-template_method_radio.select(
-    lambda choice: toggle_visibility_default_templates(choice),
-    inputs=template_method_radio,
-    outputs=[simple_pipeline, nested_pipeline, table_pipeline, selected_option],
-)

 import gradio as gr
 import pandas as pd
+from jinja2 import Environment, FileSystemLoader
 from app.assets.examples import DemoImages
     nested_segment_model_2_type=None,
     nested_htr_model_type=None,
 ):
+    env = Environment(loader=FileSystemLoader("app/templates"))
+    template_name = "steps_template.yaml.j2"
+    try:
+        if method == "Simple layout":
+            steps = [
+                {
+                    "step": "Segmentation",
+                    "model": simple_htr_model_type,
+                    "model_settings": {"model": simple_segment_model},
+                },
+                {
+                    "step": "TextRecognition",
+                    "model": simple_segment_model_type,
+                    "model_settings": {"model": simple_htr_model},
+                },
+                {"step": "OrderLines"},
+            ]
+        elif method == "Nested segmentation":
+            steps = [
+                {
+                    "step": "Segmentation",
+                    "model": nested_segment_model_1_type,
+                    "model_settings": {"model": nested_segment_model_1},
+                },
+                {
+                    "step": "Segmentation",
+                    "model": nested_segment_model_2_type,
+                    "model_settings": {"model": nested_segment_model_2},
+                },
+                {
+                    "step": "TextRecognition",
+                    "model": nested_htr_model_type,
+                    "model_settings": {"model": nested_htr_model},
+                },
+                {"step": "OrderLines"},
+            ]
+        else:
+            return "Invalid method or not yet supported."
+        steps.extend(
+            {
+                "step": "Export",
+                "settings": {"format": format, "dest": f"{format}-outputs"},
+            }
+            for format in output_formats
+        )
+        template = env.get_template(template_name)
+        yaml_value = template.render(steps=steps)
+        return yaml_value
+    except Exception as e:
+        return f"Error generating YAML: {str(e)}"
 output_image_placehholder = gr.Image(label="Output image", height=500, show_share_button=True)
         with gr.Column():
             # gr.Markdown("<h2>Output Panel</h2>")
             with gr.Tabs():
+                with gr.Tab("Viewer"):  # interactive=False, elem_id="htrflowouttab"
                     with gr.Group():
                         with gr.Row():
                             output_image_placehholder.render()
                         with gr.Row():
                             markdown_selected_option.render()
                         with gr.Row():
+                            output_dataframe_pipeline = gr.Textbox(label="Click text", info="click on image bla bla..")
                 with gr.Tab("Table") as htrflow_output_table_tab:
                     with gr.Group():
                         with gr.Row():
         outputs=[output_yaml_code],
     ).then(dummy_revealer, inputs=output_yaml_code, outputs=output_yaml_code)
+    template_method_radio.select(
+        lambda choice: toggle_visibility_default_templates(choice),
+        inputs=template_method_radio,
+        outputs=[simple_pipeline, nested_pipeline, table_pipeline, selected_option],
+    )

app/tabs/overview_tab.py CHANGED

@@ -1,60 +1,27 @@
 import gradio as gr
-from app.texts_langs.text_overview import TextOverview
-default_value_radio_overview = "Home"
-overview_choices_eng = [
-    "Home",
-    "About App",
-    "Guide",
-    "Model & Data",
-    "Contributions",
-    "Duplicate App",
-    "FAQ & Contact",
-]
-def toggle_visibility(selected_option):
-    return [
-        gr.update(visible=(selected_option == "Home")),
-        gr.update(visible=(selected_option == "About App")),
-        gr.update(visible=(selected_option == "Guide")),
-        gr.update(visible=(selected_option == "Model & Data")),
-        gr.update(visible=(selected_option == "Contributions")),
-        gr.update(visible=(selected_option == "FAQ & Contact")),
-        gr.update(visible=(selected_option == "Duplicate App")),
-    ]
 with gr.Blocks() as overview:
-    with gr.Row():
-        with gr.Column(visible=True, min_width=170, scale=0, variant="panel") as sidebar:
-            options_overview = gr.Radio(
-                overview_choices_eng,
-                label="Side Navigation",
-                container=False,
-                value=default_value_radio_overview,
-                elem_id="column-form",
-                min_width=100,
-                scale=0,
-            )
-        with gr.Column(variant="panel") as overview_main:
-            with gr.Row(visible=True) as overview_home:
-                with gr.Column():
-                    gr.Markdown("## landing page to explain version")
-                    gr.Markdown("## htrflow app 1.0.0")
-                    gr.Markdown("## links to different stuff")
-                    gr.Markdown("## Whats new..")
-            with gr.Row(visible=False) as overview_about:
                 with gr.Column():
-                    gr.Markdown(TextOverview.htrflow_col1)
-                    gr.Markdown(TextOverview.htrflow_col2)
-            with gr.Row(visible=False) as overview_guide:
                 with gr.Column():
                     with gr.Row():
                         with gr.Column():
@@ -71,64 +38,77 @@ with gr.Blocks() as overview:
                                 format="mp4",
                             )
-            with gr.Row(visible=False) as overview_model_data:
                 with gr.Column():
-                    gr.Markdown(TextOverview.htrflow_row1)
                     with gr.Tabs():
                         with gr.Tab("Binarization"):
-                            gr.Markdown(TextOverview.htrflow_tab1)
                         with gr.Tab("Region segmentation"):
-                            gr.Markdown(TextOverview.htrflow_tab2)
                         with gr.Tab("Line segmentation"):
-                            gr.Markdown(TextOverview.htrflow_tab3)
                         with gr.Tab("Text recognition"):
-                            gr.Markdown(TextOverview.htrflow_tab4)
-            with gr.Row(visible=False) as overview_contribute:
                 with gr.Column():
-                    gr.Markdown(TextOverview.contributions)
-                    gr.Markdown(TextOverview.huminfra_image)
-            with gr.Row(visible=False) as overview_duplicate:
                 with gr.Column():
-                    gr.Markdown(TextOverview.duplicate)
                 with gr.Column():
-                    gr.Markdown(TextOverview.api1)
-                    gr.Code(
-                        value=TextOverview.api_code1,
-                        language="python",
-                        interactive=False,
-                        show_label=False,
-                    )
-                    gr.Markdown(TextOverview.api2)
-                    gr.Code(
-                        value=TextOverview.api_code2,
-                        language=None,
-                        interactive=False,
-                        show_label=False,
-                    )
-            with gr.Row(visible=False) as overview_faq:
                 with gr.Column():
-                    gr.Markdown(TextOverview.text_faq)
                 with gr.Column():
-                    gr.Markdown(TextOverview.text_discussion)
-        with gr.Column(visible=True, min_width=0, scale=0) as empty:
-            pass
-    options_overview.change(
-        lambda choice: toggle_visibility(choice),
-        inputs=options_overview,
         outputs=[
-            overview_home,
-            overview_about,
-            overview_guide,
-            overview_model_data,
-            overview_contribute,
-            overview_duplicate,
-            overview_faq,
         ],
     )

 import gradio as gr
+from app.utils.lang_helper import get_tab_updates
+from app.utils.md_helper import load_markdown
+TAB_LABELS = {
+    "ENG": ["Overview", "About App", "Guide", "Model & Data", "Contributions", "Duplicate App", "FAQ & Contact"],
+    "SWE": ["Översikt", "Om appen", "Guide", "Modell & Data", "Bidrag", "Duplicera App", "FAQ & Kontakt"],
+}
 with gr.Blocks() as overview:
+    overview_language = gr.State(value="ENG")
+    with gr.Column(variant="panel"):
+        with gr.Tabs(elem_classes="top-navbar") as navbar:
+            with gr.Tab("Overview") as tab_overview:
+                with gr.Column(variant="panel"):
+                    md1 = gr.Markdown("some text")
+            with gr.Tab("About App") as tab_about:
                 with gr.Column():
+                    about_md = gr.Markdown(load_markdown(overview_language.value, "htrflow/htrflow_col1"))
+            with gr.Tab("Guide") as tab_guide:
                 with gr.Column():
                     with gr.Row():
                         with gr.Column():
                                 format="mp4",
                             )
+            with gr.Tab("Model & Data") as tab_model_data:
                 with gr.Column():
+                    # gr.Markdown(TextOverview.htrflow_row1)
                     with gr.Tabs():
                         with gr.Tab("Binarization"):
+                            gr.Markdown("")  # gr.Markdown(TextOverview.htrflow_tab1)
                         with gr.Tab("Region segmentation"):
+                            gr.Markdown("")  # gr.Markdown(TextOverview.htrflow_tab2)
                         with gr.Tab("Line segmentation"):
+                            gr.Markdown("")  # gr.Markdown(TextOverview.htrflow_tab3)
                         with gr.Tab("Text recognition"):
+                            gr.Markdown("")  # gr.Markdown(TextOverview.htrflow_tab4)
+            with gr.Tab("Contributions") as tab_contributions:
                 with gr.Column():
+                    gr.Markdown("")  # gr.Markdown(TextOverview.contributions)
+                    gr.Markdown("")  # gr.Markdown(TextOverview.huminfra_image)
+            with gr.Tab("Duplicate App") as tab_duplicate_app:
                 with gr.Column():
+                    gr.Markdown("")  # gr.Markdown(TextOverview.duplicate)
                 with gr.Column():
+                    gr.Markdown("")  # gr.Markdown(TextOverview.api1)
+                    # gr.Code(
+                    #    value=TextOverview.api_code1,
+                    #    language="python",
+                    #    interactive=False,
+                    #    show_label=False,)
+                    gr.Markdown("")  # gr.Markdown(TextOverview.api2)
+                    # gr.Code(
+                    #     value=TextOverview.api_code2,
+                    #     language=None,
+                    #     interactive=False,
+                    #     show_label=False,
+                    # )
+            with gr.Tab("FAQ & Contact") as tab_faq_contact:
                 with gr.Column():
+                    gr.Markdown("")  # gr.Markdown(TextOverview.text_faq)
                 with gr.Column():
+                    gr.Markdown("")  # gr.Markdown(TextOverview.text_discussion)
+    @overview.load(
+        inputs=[overview_language],
+        outputs=[about_md],
+    )
+    def load_md_text(selected_language):
+        return load_markdown(selected_language, "htrflow/htrflow_col1")
+    @overview_language.change(
+        inputs=[overview_language],
+        outputs=[about_md],
+    )
+    def change_md_text(selected_language):
+        return load_markdown(selected_language, "htrflow/htrflow_col1")
+    @overview_language.change(
+        inputs=[overview_language],
         outputs=[
+            tab_overview,
+            tab_about,
+            tab_guide,
+            tab_model_data,
+            tab_contributions,
+            tab_duplicate_app,
+            tab_faq_contact,
         ],
     )
+    def save_language_to_browser(selected_language):
+        return (*get_tab_updates(selected_language, TAB_LABELS),)

app/templates/steps_template.yaml.j2 ADDED

	@@ -0,0 +1,16 @@

+steps:
+{% for step in steps -%}
+  - step: {{ step.step }}
+    {% if step.model -%}
+    settings:
+      model: {{ step.model }}
+      model_settings:
+        model: {{ step.model_settings.model }}
+    {% endif -%}
+    {% if step.settings -%}
+    settings:
+      {% for key, value in step.settings.items() -%}
+      {{ key }}: {{ value }}
+      {% endfor -%}
+    {% endif -%}
+{% endfor %}
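
One thing to watch in this template: a step with both `model` and `settings` set passes both `{% if %}` tests and would emit the `settings:` key twice under the same step, which YAML loaders tend either to reject or to resolve by silently keeping one of the two. The sketch below is a hypothetical pre-render check (the helper name and the sample step values are assumptions, not part of this commit) that mirrors the template's truthiness tests and lists the affected steps.

```python
def steps_with_duplicate_settings(steps):
    """Return the names of steps that would render `settings:` twice.

    Mirrors the template's two conditions, `{% if step.model %}` and
    `{% if step.settings %}`. A missing key counts as falsy, as in Jinja.
    """
    return [
        step["step"]
        for step in steps
        if step.get("model") and step.get("settings")
    ]


# Illustrative render context only; step and model names are placeholders.
steps = [
    {"step": "Segmentation", "model": "yolo", "model_settings": {"model": "path/to/weights"}},
    {"step": "OrderLines"},
    {"step": "Custom", "model": "some-model", "settings": {"dest": "out"}},
]

print(steps_with_duplicate_settings(steps))
```

If such steps are expected in practice, merging the two branches under a single `settings:` key in the template would avoid the collision.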

app/texts_langs/text_app.py DELETED

@@ -1,9 +0,0 @@
-class TextApp:
-    title_markdown = """
-    <h1><center> HTRflow 🔍 App </center></h1>"""  #
-    title_markdown_img = """
-    <a href="https://riksarkivet.se">
-    <img src="https://raw.githubusercontent.com/Borg93/Riksarkivet_docs/main/docs/assets/fav-removebg-preview.png" width="17%" align="right" margin-right="100" />
-    </a>
-    """

app/texts_langs/text_overview.py DELETED

@@ -1,37 +0,0 @@
-def read_markdown(file_path: str) -> str:
-    with open(file_path, "r") as file:
-        content = file.read()
-    return f"""{content}"""
-class TextOverview:
-    # HTRFLOW
-    htrflow_col1 = read_markdown("app/texts_langs/overview/htrflow/htrflow_col1.md")
-    htrflow_col2 = read_markdown("app/texts_langs/overview/htrflow/htrflow_col2.md")
-    htrflow_row1 = read_markdown("app/texts_langs/overview/htrflow/htrflow_row1.md")
-    htrflow_tab1 = read_markdown("app/texts_langs/overview/htrflow/htrflow_tab1.md")
-    htrflow_tab2 = read_markdown("app/texts_langs/overview/htrflow/htrflow_tab2.md")
-    htrflow_tab3 = read_markdown("app/texts_langs/overview/htrflow/htrflow_tab3.md")
-    htrflow_tab4 = read_markdown("app/texts_langs/overview/htrflow/htrflow_tab4.md")
-    # faq & discussion
-    text_faq = read_markdown("app/texts_langs/overview/faq_discussion/faq.md")
-    text_discussion = read_markdown("app/texts_langs/overview/faq_discussion/discussion.md")
-    # Contributions
-    contributions = read_markdown("app/texts_langs/overview/contributions/contributions.md")
-    huminfra_image = read_markdown("app/texts_langs/overview/contributions/huminfra_image.md")
-    # Changelog & Roadmap
-    changelog = read_markdown("app/texts_langs/overview/changelog_roadmap/changelog.md")
-    old_changelog = read_markdown("app/texts_langs/overview/changelog_roadmap/old_changelog.md")
-    roadmap = read_markdown("app/texts_langs/overview/changelog_roadmap/roadmap.md")
-    # duplicate & api
-    duplicate = read_markdown("app/texts_langs/overview/duplicate_api/duplicate.md")
-    api1 = read_markdown("app/texts_langs/overview/duplicate_api/api1.md")
-    api_code1 = read_markdown("app/texts_langs/overview/duplicate_api/api_code1.md")
-    api2 = read_markdown("app/texts_langs/overview/duplicate_api/api2.md")
-    api_code2 = read_markdown("app/texts_langs/overview/duplicate_api/api_code2.md")

app/translation.yaml ADDED

	@@ -0,0 +1,13 @@

+ENG:
+  Language: Language
+  Home: Home
+  Simple HTR: Simple HTR
+  Custom HTR: Custom HTR
+  # Other translations...
+SWE:
+  Language: Språk
+  Home: Hem
+  Simple HTR: Enkel HTR
+  Custom HTR: Anpassad HTR
+  # Other translations...
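
A lookup over this table could fall back to the English string when a key has no Swedish entry yet, which matters while `# Other translations...` is still a placeholder. A minimal sketch, assuming the YAML has already been parsed into a nested dict; `translate` is a hypothetical helper and is not part of this commit.

```python
# Mirrors app/translation.yaml as a plain dict (YAML parsing omitted here).
TRANSLATIONS = {
    "ENG": {
        "Language": "Language",
        "Home": "Home",
        "Simple HTR": "Simple HTR",
        "Custom HTR": "Custom HTR",
    },
    "SWE": {
        "Language": "Språk",
        "Home": "Hem",
        "Simple HTR": "Enkel HTR",
        "Custom HTR": "Anpassad HTR",
    },
}


def translate(key, language, table=TRANSLATIONS, default_language="ENG"):
    """Look up `key` for `language`, then for the default language.

    Returns the key itself when neither language has an entry, so the UI
    shows an untranslated label instead of raising.
    """
    for lang in (language, default_language):
        value = table.get(lang, {}).get(key)
        if value is not None:
            return value
    return key


print(translate("Home", "SWE"))
print(translate("Home", "DEU"))
```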

app/{texts_langs → utils}/__init__.py RENAMED

File without changes

app/utils/lang_helper.py ADDED

	@@ -0,0 +1,7 @@

+import gradio as gr
+def get_tab_updates(selected_language, TAB_LABELS):
+    """Helper to generate tab updates for the selected language."""
+    labels = TAB_LABELS[selected_language]
+    return [gr.update(label=label) for label in labels]
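
`get_tab_updates` returns one update per label, and Gradio assigns returned values to `outputs` by position, so each language's list needs exactly as many entries as there are tabs in `outputs` (seven in the `change` handler earlier in this diff), in the same order. `TAB_LABELS` itself is not shown here; the sketch below assumes it maps a language code to an ordered list of labels. The label strings and the `mismatched_languages` helper are hypothetical.

```python
# Order taken from the `outputs=[...]` list of the change handler.
OUTPUT_TABS = [
    "tab_overview",
    "tab_about",
    "tab_guide",
    "tab_model_data",
    "tab_contributions",
    "tab_duplicate_app",
    "tab_faq_contact",
]

# Assumed shape of TAB_LABELS; the strings are illustrative, not from the commit.
TAB_LABELS = {
    "ENG": ["Overview", "About", "Guide", "Model & Data",
            "Contributions", "Duplicate App", "FAQ & Contact"],
    "SWE": ["Översikt", "Om", "Guide", "Modell & data",
            "Bidrag", "Duplicera appen", "FAQ & kontakt"],
}


def mismatched_languages(tab_labels, output_tabs=OUTPUT_TABS):
    """Return languages whose label count differs from the number of output tabs."""
    return sorted(
        lang for lang, labels in tab_labels.items()
        if len(labels) != len(output_tabs)
    )


print(mismatched_languages(TAB_LABELS))
```

Running a check like this at import time would surface a missing label before a language switch fails at runtime.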

app/utils/md_helper.py ADDED

	@@ -0,0 +1,14 @@

+import os
+def load_markdown(language, section, content_dir="app/content"):
+    """Load markdown content from files."""
+    if language is None:
+        file_path = os.path.join(content_dir, f"{section}.md")
+    else:
+        file_path = os.path.join(content_dir, language, f"{section}.md")
+    if os.path.exists(file_path):
+        with open(file_path, "r", encoding="utf-8") as f:
+            return f.read()
+    return f"## Content missing for {file_path} in {language}"
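
A short usage sketch of `load_markdown`: the section string doubles as a relative path under `<content_dir>/<language>/`, and a missing file yields a Markdown heading rather than an exception, so a missing translation shows up in the UI instead of crashing the app. The function body is repeated from the diff so the example runs on its own; the temporary directory and the file contents are made up for illustration.

```python
import os
import tempfile


def load_markdown(language, section, content_dir="app/content"):
    """Same logic as app/utils/md_helper.py in this commit."""
    if language is None:
        file_path = os.path.join(content_dir, f"{section}.md")
    else:
        file_path = os.path.join(content_dir, language, f"{section}.md")
    if os.path.exists(file_path):
        with open(file_path, "r", encoding="utf-8") as f:
            return f.read()
    return f"## Content missing for {file_path} in {language}"


with tempfile.TemporaryDirectory() as content_dir:
    # Create <content_dir>/SWE/htrflow/htrflow_col1.md with placeholder text.
    section_dir = os.path.join(content_dir, "SWE", "htrflow")
    os.makedirs(section_dir)
    with open(os.path.join(section_dir, "htrflow_col1.md"), "w", encoding="utf-8") as f:
        f.write("## HTRflow\n")

    found = load_markdown("SWE", "htrflow/htrflow_col1", content_dir)
    # No ENG directory exists here, so the fallback heading is returned.
    missing = load_markdown("ENG", "htrflow/htrflow_col1", content_dir)

print(found)
print(missing)
```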

pyproject.toml CHANGED

@@ -22,6 +22,7 @@ dependencies = [
     "gradio>=5.9.1",
     "datasets>=3.2.0",
     "pandas>=2.2.3",
 ]
 [project.urls]
@@ -67,4 +68,4 @@ target-version = "py310"
 [tool.ruff.lint]
 ignore = ["C901", "E741", "W605"]
-select = ["C", "E", "F", "I", "W"]

     "gradio>=5.9.1",
     "datasets>=3.2.0",
     "pandas>=2.2.3",
+    "jinja2>=3.1.4",
 ]
 [project.urls]
 [tool.ruff.lint]
 ignore = ["C901", "E741", "W605"]
+select = ["C", "E", "F", "I", "W"]

todo.txt ADDED

	@@ -0,0 +1,14 @@

+TODO: lang: it is perhaps best to do it with a change event and update the value on each component. We should also find a nice way to keep the lang state if we update, e.g. use browser state.
+TODO: graph viz of pipeline
+TODO: Separate analysis tab, https://www.gradio.app/docs/gradio/highlightedtext, https://huggingface.co/spaces/pngwn/gradio_imageslider
+TODO: Separate "fiftyone tab", use https://www.gradio.app/docs/gradio/gallery
+TODO: add support for batch inference, with images loaded through a file path or S3.
+TODO: support fetching from iiif-lb to run inference on a custom run.
+TODO: accordion on side tab
+TODO: a lot of documentation / tutorials
+TODO: toggle dark and light mode: https://github.com/gradio-app/gradio/issues/7384
+TODO: ssr mode: https://www.gradio.app/docs/gradio/blocks
+TODO: enable monitoring and test api mode: https://www.gradio.app/docs/gradio/blocks
+TODO: test usage of modal: https://www.gradio.app/custom-components/gallery
+TODO: new docker container: https://huggingface.co/spaces/pngwn/gradio-docker/blob/main/Dockerfile, needs CUDA and uv.

translation.yaml ADDED

	@@ -0,0 +1,10 @@

+ENG:
+  Language: Language
+  Home: Home
+  Simple HTR: Simple HTR
+  Custom HTR: Custom HTR
+SWE:
+  Language: Language
+  Home: Home
+  Simple HTR: Simple HTR
+  Custom HTR: Custom HTR

uv.lock CHANGED

@@ -754,6 +754,7 @@ dependencies = [
     { name = "datasets", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
     { name = "gradio", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
     { name = "htrflow", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
     { name = "pandas", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
     { name = "torch", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
 ]
@@ -780,6 +781,7 @@ requires-dist = [
     { name = "datasets", specifier = ">=3.2.0" },
     { name = "gradio", specifier = ">=5.9.1" },
     { name = "htrflow", specifier = "==0.1.3" },
     { name = "mmcv", marker = "extra == 'openmmlab'", specifier = "==2.0.1" },
     { name = "mmdet", marker = "extra == 'openmmlab'", specifier = "==3.0.0" },
     { name = "mmengine", marker = "extra == 'openmmlab'", specifier = "==0.7.4" },

     { name = "datasets", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
     { name = "gradio", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
     { name = "htrflow", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
+    { name = "jinja2", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
     { name = "pandas", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
     { name = "torch", marker = "platform_machine == 'aarch64' or platform_system != 'Linux' or sys_platform != 'win32'" },
 ]
     { name = "datasets", specifier = ">=3.2.0" },
     { name = "gradio", specifier = ">=5.9.1" },
     { name = "htrflow", specifier = "==0.1.3" },
+    { name = "jinja2", specifier = ">=3.1.4" },
     { name = "mmcv", marker = "extra == 'openmmlab'", specifier = "==2.0.1" },
     { name = "mmdet", marker = "extra == 'openmmlab'", specifier = "==3.0.0" },
     { name = "mmengine", marker = "extra == 'openmmlab'", specifier = "==0.7.4" },