How to optimize your data labelling project with custom interfaces

Community Article Published October 16, 2024

When seeking human feedback for AI projects, developers typically face two options:

Dedicated annotation tools, such as Argilla, Label Studio, or Prodigy
UI libraries like Gradio or Streamlit

While annotation tools offer streamlined, automated feedback processes, they can be rigid and force users to fit their project into preset templates. A UI library, on the other hand, lets you build almost any interface you like, but it usually lacks built-in support for managing feedback efficiently: distributing tasks between annotators, managing users, or integrating with dataset tools.

In this blog post, we will explore how Argilla and its new CustomField feature offer a third option: a flexible web template that you can customize for your projects without sacrificing ease of use.

The Problem with Traditional Data Labeling Tools

When building AI models, gathering feedback is a critical part of improving performance and refining results. For this, developers use annotation tools so experts can label and analyze datasets. The challenge comes with complex, non-standard data and feedback types, such as combinations of modalities (for example, instructions that relate to documents) or media that must be explored to be reviewed, like a 3D object. Traditional annotation tools often fall short with these kinds of data, offering rigid workflows that force you to mold your project to fit their structure.

Argilla offers a solution with its CustomField feature. CustomField lets you define your own HTML, CSS, and JavaScript templates, so you can build a fully customized annotation interface that meets the exact needs of your project. This flexibility lets you handle specialized datasets, whether you’re working with code, 3D models, videos, or text comparisons.

Examples of custom feedback in action

The examples below show how CustomField can be applied to different types of data and use cases. We won't include all of the code in this post, but check out this guide for complete examples.

Code Annotation and Debugging

Code generation models have become a major part of software development, and there is an increasing need to gather feedback on code—not just for correctness, but also for quality, efficiency, and style. Traditional annotation tools are ill-equipped to handle this, often treating code like plain text, which lacks the interactivity and context required for deep code reviews.

With Argilla’s CustomField, you can create custom annotation interfaces that not only display code but also allow for real-time interaction. For instance, embedding a Python interpreter within the annotation environment enables you to run and debug code directly before leaving comments or feedback. This means that, rather than simply reviewing lines of code statically, reviewers can test it, see how it runs, and offer far more informed feedback.

Check out the full code editor template here

We define our custom template in an HTML file and pass it to the dataset settings in Argilla’s Python SDK.

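import argilla as rg

# Assumes an Argilla client has already been configured,
# e.g. with rg.Argilla(api_url=..., api_key=...).
settings = rg.Settings(
    fields=[
        # Fields display record content; questions collect the annotator's answers.
        rg.TextField("instruction"),
        rg.TextField("input"),
        # The custom field is rendered with the HTML template in codemirror.html.
        rg.CustomField("code", template="codemirror.html"),
    ],
    questions=[rg.RatingQuestion("rating", [0, 1, 2, 3, 4, 5])],
)
dataset = rg.Dataset(
    name="Codemirror-dataset",
    settings=settings,
)
dataset.create()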

3D Model Visualization and Annotation

In fields like robotics, gaming, and virtual reality, 3D models are an important part of the development process. Annotating these models, whether to identify design flaws, review areas for improvement, or gather feedback on the spatial orientation of objects, is a far more complex task than simple text annotation.

Most standard annotation tools aren’t equipped to handle 3D data, forcing developers to look for alternative methods that can slow down the workflow. Argilla’s CustomField enables you to render 3D models directly in the feedback interface, giving reviewers the ability to interact with the model in real time. Whether you need to rotate, zoom, or inspect a 3D object from different angles, CustomField provides the flexibility to design an interface where these interactions are possible.

Image Comparison and Preference Selection

When working on tasks that require image-based feedback, such as comparing two images for quality, aesthetic appeal, or specific attributes, it’s crucial to have an interface that allows for side-by-side visual evaluation. Argilla’s CustomField enables you to create just such an interface, tailored to handle image comparison tasks efficiently.

For example, in projects where users need to select their preferred image, you can present two images side by side in columns. With a flexible layout defined by HTML and CSS, you can create an intuitive interface that lets users easily compare both images at first sight.

This setup can be beneficial in domains like e-commerce (choosing the best product image), creative design (assessing visual layouts), or even machine learning model training, where users may need to provide preference feedback for image-based datasets.

Below is an example of the custom web template that creates this side-by-side layout. This is a minimal working example, but you could add further customisations for your use case. Check out this guide for complete examples.

<style>
#container {
    display: flex;
    flex-direction: column;
}
.image-container {
    display: flex;
    gap: 10px;
}
.column {
    flex: 1;
    position: relative;
}
img {
    max-width: 100%;
    height: auto;
    display: block;
}
</style>
<div id="container">
    <div class="image-container">
        <div class="column">
            <img src="{{record.fields.images.image_1}}" />
        </div>
        <div class="column">
            <img src="{{record.fields.images.image_2}}" />
        </div>
    </div>
</div>
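
On the Python side, each record needs to supply the two values that the template reads from `record.fields.images`. The following is a minimal sketch of how such records might be prepared, assuming the custom field is named `images` and holds a dictionary with the keys `image_1` and `image_2`, as the placeholders in the template suggest. The helper name and the example URLs are hypothetical.

```python
from typing import Dict, List, Tuple


def build_image_pair_records(
    pairs: List[Tuple[str, str]],
) -> List[Dict[str, Dict[str, str]]]:
    """Turn (left_url, right_url) tuples into record dicts for the template.

    The key names mirror the placeholders used in the template:
    record.fields.images.image_1 and record.fields.images.image_2.
    """
    records = []
    for left_url, right_url in pairs:
        records.append({"images": {"image_1": left_url, "image_2": right_url}})
    return records


# Hypothetical example URLs, for illustration only.
example_pairs = [
    ("https://example.com/a_1.png", "https://example.com/a_2.png"),
    ("https://example.com/b_1.png", "https://example.com/b_2.png"),
]
records = build_image_pair_records(example_pairs)
print(records[0]["images"]["image_1"])
```

With Argilla’s Python SDK, dictionaries like these would typically be passed to `dataset.records.log(records)`, assuming the dataset settings define a CustomField named `images` that uses the template above.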

Language Translation and Text Alignment

As fine-tuned language models become increasingly popular for translation tasks, ensuring the accuracy and consistency of translations is more important than ever. One common challenge is how to efficiently review and refine translations, especially when working with large volumes of text across multiple languages.

With a custom field you can present text in a table. This makes it easier to review translations line by line, providing more efficient ways to assess and refine the output. Instead of switching between documents or relying on separate tools, you can manage the entire translation review process within a single, customized interface.
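
One way to prepare data for such a table is to pair up source and translated lines before logging the record. The sketch below is an illustration rather than part of Argilla: the field name `translation`, the `rows` key, and the helper `align_lines` are all assumptions. It pairs lines by position and pads the shorter text so that no line is silently dropped.

```python
from itertools import zip_longest
from typing import Dict, List


def align_lines(source: str, translation: str) -> List[Dict[str, str]]:
    """Pair source and translated lines by position, one dict per table row."""
    rows = []
    # zip_longest keeps unmatched lines visible instead of dropping them.
    for src, tgt in zip_longest(
        source.splitlines(), translation.splitlines(), fillvalue=""
    ):
        rows.append({"source": src, "translation": tgt})
    return rows


# Hypothetical record shape: a custom field named "translation" holding the rows.
record = {"translation": {"rows": align_lines("Hello\nGoodbye", "Bonjour\nAu revoir")}}
print(record["translation"]["rows"])
```

A template could then loop over `rows` and emit one table row per pair, for instance with a Handlebars `each` block, assuming the template engine in use supports it.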

For example, newsrooms using LLMs or tools like DeepL for translation can log their adjustments directly in the interface, creating a dataset of edits that can be used to fine-tune their model over time. As translations are gradually aligned with a specific style guide, the model becomes increasingly accurate, reducing the need to start from scratch with every new project. This approach not only improves translation quality but also builds a valuable resource for future use.
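
As a rough sketch of how logged adjustments could become training data, the snippet below keeps only the translations that a reviewer actually changed. The `ReviewedTranslation` class, its attribute names, and the output keys are hypothetical and not part of Argilla; in practice the values would come from the responses exported from your dataset.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ReviewedTranslation:
    source: str               # original text
    machine_translation: str  # output of the model or translation tool
    edited_translation: str   # text after the reviewer's adjustments


def to_edit_pairs(reviews: List[ReviewedTranslation]) -> List[Dict[str, str]]:
    """Keep only the reviews where the editor changed the machine output."""
    pairs = []
    for review in reviews:
        machine = review.machine_translation.strip()
        edited = review.edited_translation.strip()
        if edited != machine:
            pairs.append(
                {"source": review.source, "machine": machine, "edited": edited}
            )
    return pairs


reviews = [
    ReviewedTranslation("Guten Morgen", "Good morning", "Good morning"),
    ReviewedTranslation(
        "Der Rat tagt heute", "The advice meets today", "The council meets today"
    ),
]
edit_pairs = to_edit_pairs(reviews)
print(edit_pairs)
```

Filtering out unchanged translations is a design choice made for this sketch. Translations that were approved without edits could also be kept as positive examples.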

Document Revision and Text Comparison

When working on document revisions, it’s crucial to easily identify changes between the original and updated versions to ensure that edits align with goals, whether in legal, technical, or editorial fields. Argilla’s CustomField lets you show document comparisons as a diff, allowing you to display the original text side by side with the revised version, highlighting insertions, deletions, and modifications.

Below is an example of this custom template. Here we have used a complete web template with custom JavaScript. You could in fact use any framework or library that you can access via a CDN. Check out this guide for complete examples.

<script src="https://cdn.jsdelivr.net/npm/handlebars@latest/dist/handlebars.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/diff_match_patch/20121119/diff_match_patch.js"></script>

<div id="main-container"></div>

<script id="template" type="text/x-handlebars-template">
  <style>
    #container {
      display: flex;
      gap: 10px;
    }
    .column {
      flex: 1;
      padding: 10px;

    }
    .diff {
      padding: 10px;
      border: 1px solid #ddd;
      background-color: #fff;
      border-radius: 5px;
    }
    .diff-ins {
      background-color: #d4edda;
      text-decoration: none;
    }
    .diff-del {
      background-color: #f8d7da;
      text-decoration: line-through;
    }
  </style>
  <div id="container">
    <div class="column">
      <h3>Original text</h3>
      <div>{{record.fields.text.original}}</div>
    </div>
    <div class="column">
      <h3>Revision</h3>
      <div>{{{visualDiff}}}</div>
    </div>
  </div>
</script>


<script>
  function createDiff(originalText, revisedText) {
    const dmp = new diff_match_patch();
    const diffs = dmp.diff_main(originalText, revisedText);
    dmp.diff_cleanupSemantic(diffs);

    let diffHtml = '';
    diffs.forEach(part => {
      const [type, text] = part;
      // Escape the text so characters like < and & are shown literally,
      // because the diff is injected into the page as raw HTML.
      const safeText = Handlebars.escapeExpression(text);
      if (type === 1) { // Insertion
        diffHtml += `<span class="diff-ins">${safeText}</span>`;
      } else if (type === -1) { // Deletion
        diffHtml += `<span class="diff-del">${safeText}</span>`;
      } else { // Equal
        diffHtml += `<span>${safeText}</span>`;
      }
    });
    return diffHtml;
  }


  // Compile the Handlebars template
  const template = Handlebars.compile(document.getElementById('template').innerHTML);

  // Generate the visual diff and inject it into the template
  const visualDiff = createDiff(record.fields.text.original, record.fields.text.revision);
  
  // Render the template with the record and the visual diff
  const rendered = template({ record: record, visualDiff: visualDiff });
  
  // Inject the rendered HTML into the main container
  document.getElementById('main-container').innerHTML = rendered;
</script>

This approach can streamline feedback for tasks like contract revisions, collaborative writing, or code documentation updates. Reviewers can quickly assess which changes have been made without manually combing through the entire document, speeding up the revision process and improving accuracy.
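
If you would rather compute the diff in Python before logging records, a similar result can be sketched with the standard library. Note that this swaps in a different technique from the template above: it uses `difflib.SequenceMatcher` at word level instead of diff_match_patch in the browser, so whitespace is normalized and the output will not match character for character. The function name `html_diff` is hypothetical; the CSS class names are the ones from the template.

```python
from difflib import SequenceMatcher
from html import escape


def html_diff(original: str, revised: str) -> str:
    """Return a word-level diff of two texts as HTML spans."""
    old_words = original.split()
    new_words = revised.split()
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Escape the text so markup inside the documents is shown literally.
        old_text = escape(" ".join(old_words[i1:i2]))
        new_text = escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            parts.append(f"<span>{old_text}</span>")
            continue
        if tag in ("delete", "replace"):
            parts.append(f'<span class="diff-del">{old_text}</span>')
        if tag in ("insert", "replace"):
            parts.append(f'<span class="diff-ins">{new_text}</span>')
    return " ".join(parts)


diff_html = html_diff("The fee is 100 dollars", "The fee is 120 dollars")
print(diff_html)
```

The resulting string could be stored alongside the original and revised text, so that a simpler template only needs to display it.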

Conclusion

As AI projects evolve, the demand for flexible, customizable annotation tools grows. Argilla’s CustomField feature addresses this need by enabling developers to build tailored annotation interfaces for even the most complex datasets—whether they involve code, 3D models, videos, text, or interactive data.

By leveraging the power of HTML, CSS, and JavaScript, CustomField allows for a level of customization that ensures your feedback process is as specialized as your data. No longer confined to one-size-fits-all solutions, you can now design the exact tools you need for your project, streamlining your workflow and improving the quality of your annotations.

Whether you’re debugging Python code, visualizing 3D models, analyzing video frames, or aligning translations, Argilla’s CustomField gives you the flexibility to build the feedback process that works for you. So before you consider building an in-house tool from scratch, give Argilla’s CustomField a try—it might just be the solution you’ve been looking for.

Try Argilla out now by following Argilla’s quickstart guide, or check out this guide for complete examples of CustomField.
