# Raymond Weitekamp: Add OAuth authentication to protect the interface and track user contributions (a345416)
import gradio as gr
import random
import os
from datetime import datetime
from huggingface_hub import HfApi
from typing import Optional

# Sentences on the history of handwriting OCR; contributors are shown
# consecutive runs of these as the text to handwrite.
sentences = [
    "Optical character recognition (OCR) is the process of converting images of text into machine-readable data.",
    "When applied to handwriting, OCR faces additional challenges because of the natural variability in individual penmanship.",
    "Over the last century, advances in computer vision and machine learning have transformed handwriting OCR from bulky, specialized hardware into highly accurate, software-driven systems.",
    "The origins of OCR date back to the early 20th century.",
    "Early pioneers explored how machines might read text.",
    "In the 1920s, inventors such as Emanuel Goldberg developed early devices that could capture printed characters by converting them into telegraph codes.",
    "Around the same time, Gustav Tauschek created the Reading Machine using template-matching methods to detect letters in images.",
    "These devices were designed for printed text and depended on fixed, machine-friendly fonts rather than natural handwriting.",
    "In the 1950s, systems like David Shepard's GISMO emerged to begin automating the conversion of paper records into digital form.",
    "Although these early OCR systems were limited in scope and accuracy, they laid the groundwork for later innovations.",
    "The 1960s saw OCR technology being applied to real-world tasks.",
    "In 1965, American inventor Jacob Rabinow developed an OCR machine specifically aimed at sorting mail by reading addresses.",
    "This was a critical step for the U.S. Postal Service.",
    "Soon after, research groups, including those at IBM, began developing machines such as the IBM 1287, which was capable of reading handprinted numbers on envelopes to facilitate automated mail processing.",
    "These systems marked the first attempts to apply computer vision to handwritten data on a large scale.",
    "By the late 1980s and early 1990s, researchers such as Yann LeCun and his colleagues developed neural network architectures to recognize handwritten digits.",
    "Their work, initially applied to reading ZIP codes on mail, demonstrated that carefully designed, constrained neural networks could achieve error rates as low as about 1% on USPS data.",
    "Sargur Srihari and his team at the Center of Excellence for Document Analysis and Recognition extended these ideas to develop complete handwritten address interpretation systems.",
    "These systems, deployed by the USPS and postal agencies worldwide, helped automate the routing of mail and revolutionized the sorting process.",
    "The development and evaluation of handwriting OCR have been driven in part by standard benchmark datasets.",
    "The MNIST dataset, introduced in the 1990s, consists of 70,000 images of handwritten digits and became the de facto benchmark for handwritten digit recognition.",
    "Complementing MNIST is the USPS dataset, which provides images of hand‐written digits derived from actual envelopes and captures real-world variability.",
    "Handwriting OCR entered a new era with the introduction of neural network models.",
    "In 1989, LeCun et al. applied backpropagation to a convolutional neural network tailored for handwritten digit recognition, an innovation that evolved into the LeNet series.",
    "By automatically learning features rather than relying on hand-designed templates, these networks drastically improved recognition performance.",
    "As computational power increased and large labeled datasets became available, deep learning models, particularly convolutional neural networks and recurrent neural networks, pushed the accuracy of handwriting OCR to near-human levels.",
    "Modern systems can handle both printed and cursive text, automatically segmenting and recognizing characters in complex handwritten documents.",
    "Cursive handwriting presents a classic challenge known as Sayre's paradox, where word recognition requires letter segmentation and letter segmentation requires word recognition.",
    "Contemporary approaches use implicit segmentation methods, often combined with hidden Markov models or end-to-end neural networks, to circumvent this paradox.",
    "Today's handwriting OCR systems are highly accurate and widely deployed.",
    "Modern systems combine OCR with artificial intelligence to not only recognize text but also extract meaning, verify data, and integrate into larger enterprise workflows.",
    "Projects such as In Codice Ratio use deep convolutional networks to transcribe historical handwritten documents, further expanding OCR applications.",
    "Despite impressive advances, handwriting OCR continues to face challenges with highly variable or degraded handwriting.",
    "Ongoing research aims to improve recognition accuracy, particularly for cursive and unconstrained handwriting, and to extend support across languages and historical scripts.",
    "With improvements in deep learning architectures, increased computing power, and large annotated datasets, future OCR systems are expected to become even more robust, handling real-world handwriting in diverse applications from postal services to archival digitization.",
    "Today's research in handwriting OCR benefits from a wide array of well-established datasets and ongoing evaluation challenges.",
    "These resources help drive the development of increasingly robust systems for both digit and full-text recognition.",
    "For handwritten digit recognition, the MNIST dataset remains the most widely used benchmark thanks to its simplicity and broad adoption.",
    "Complementing MNIST is the USPS dataset, which is derived from actual mail envelopes and provides additional challenges with real-world variability.",
    "The IAM Handwriting Database is one of the most popular datasets for unconstrained offline handwriting recognition and includes scanned pages of handwritten English text with corresponding transcriptions.",
    "It is frequently used to train and evaluate models that work on full-line or full-page recognition tasks.",
    "For systems designed to capture the dynamic aspects of handwriting, such as pen stroke trajectories, the IAM On-Line Handwriting Database offers valuable data.",
    "The CVL dataset provides multi-writer handwritten texts with a range of writing styles, making it useful for assessing the generalization capabilities of OCR systems across diverse handwriting samples.",
    "The RIMES dataset, developed for French handwriting recognition, contains scanned documents and is a key resource for evaluating systems in multilingual settings.",
    "Various ICDAR competitions, such as ICDAR 2013 and ICDAR 2017, have released datasets that reflect the complexities of real-world handwriting, including historical documents and unconstrained writing.",
    "For Arabic handwriting recognition, the KHATT dataset offers a collection of handwritten texts that capture the unique challenges of cursive and context-dependent scripts.",
    "These datasets, along with continual evaluation efforts through competitions hosted at ICDAR and ICFHR, ensure that the field keeps pushing toward higher accuracy, better robustness, and broader language coverage.",
    "Emerging benchmarks, often tailored to specific scripts, historical documents, or noisy real-world data, will further refine the state-of-the-art in handwriting OCR.",
    "This array of resources continues to shape the development of handwriting OCR systems today.",
    "This additional section outlines today's most influential datasets and benchmarks, highlighting how they continue to shape the development of handwriting OCR systems."
]


class OCRDataCollector:
    """Hands out text blocks and records (text, image, username) submissions."""

    def __init__(self):
        # Submissions are held in memory only; nothing in this file uploads them.
        self.collected_pairs = []
        self.current_text_block = self.get_random_text_block()
        self.hf_api = HfApi()

    def get_random_text_block(self):
        # Pick 1 to 5 consecutive sentences. random.randint is inclusive on
        # both ends, so the last valid start is len(sentences) - block_length.
        block_length = random.randint(1, 5)
        start_index = random.randint(0, len(sentences) - block_length)
        block = " ".join(sentences[start_index:start_index + block_length])
        return block

    def submit_image(self, image, text_block, username: Optional[str] = None):
        # Record the pair only when an image was uploaded and the user is known.
        if image is not None and username:
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            self.collected_pairs.append({
                "text": text_block,
                "image": image,
                "timestamp": timestamp,
                "username": username
            })
        return self.get_random_text_block()

    def skip_text(self, text_block, username: Optional[str] = None):
        # Skipping records nothing; it only returns a fresh text block.
        return self.get_random_text_block()


def create_gradio_interface():
    collector = OCRDataCollector()

    with gr.Blocks() as demo:
        gr.Markdown("## Crowdsourcing Handwriting OCR Dataset")

        with gr.Row():
            user_info = gr.Markdown("")

        def update_user_info(request: gr.Request):
            # Show the main interface only when the request carries a username.
            if request.username:
                return f"Logged in as: {request.username}", gr.update(visible=True)
            return "Please log in with your Hugging Face account to contribute to the dataset.", gr.update(visible=False)

        with gr.Column(visible=False) as main_interface:
            gr.Markdown("You will be shown between 1 and 5 consecutive sentences. Please handwrite them on paper and upload an image of your handwriting. If you wish to skip the current text, click 'Skip'.")
            text_box = gr.Textbox(value=collector.current_text_block, label="Text to Handwrite", interactive=False)
            image_input = gr.Image(type="pil", label="Upload Handwritten Image", sources=["upload"])

            with gr.Row():
                submit_btn = gr.Button("Submit")
                skip_btn = gr.Button("Skip")

        def check_login(request: gr.Request):
            # Reject the event with a user-visible error if nobody is logged in.
            if request.username is None:
                raise gr.Error("Please log in to use this application")
            return request.username

        def protected_submit(image, text_block, request: gr.Request):
            username = check_login(request)
            return collector.submit_image(image, text_block, username)

        def protected_skip(text_block, request: gr.Request):
            username = check_login(request)
            return collector.skip_text(text_block, username)

        # Event listeners must be registered inside the Blocks context.
        demo.load(update_user_info, outputs=[user_info, main_interface])

        submit_btn.click(
            fn=protected_submit,
            inputs=[image_input, text_box],
            outputs=text_box
        )
        skip_btn.click(
            fn=protected_skip,
            inputs=[text_box],
            outputs=text_box
        )

    return demo


if __name__ == "__main__":
    demo = create_gradio_interface()
    demo.launch(auth_message="Please login with your Hugging Face account to contribute to the dataset.")
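
The block sampling done by `get_random_text_block` can be exercised without Gradio. The following is a minimal sketch, not part of the app: `pick_block` is a hypothetical standalone helper, and its `max_len` and `rng` parameters are assumptions added so that the choice can be seeded and so that lists shorter than five sentences do not produce an invalid range.

```python
import random
from typing import List, Optional


def pick_block(sentences: List[str], max_len: int = 5,
               rng: Optional[random.Random] = None) -> str:
    """Return between 1 and max_len consecutive sentences joined by spaces.

    Hypothetical helper mirroring get_random_text_block; max_len and rng
    are illustrative parameters that the original method does not have.
    """
    if not sentences:
        raise ValueError("sentences must not be empty")
    rng = rng or random.Random()
    # Cap the block length at the list size so the start range stays valid.
    block_length = rng.randint(1, min(max_len, len(sentences)))
    # randint includes both endpoints, so the last valid start index
    # is len(sentences) - block_length.
    start = rng.randint(0, len(sentences) - block_length)
    return " ".join(sentences[start:start + block_length])
```

Passing a seeded generator, for example `pick_block(sentences, rng=random.Random(0))`, makes the selected block reproducible, which may help when testing the interface logic.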