---
title: Inkling
emoji: π
colorFrom: indigo
colorTo: yellow
python_version: 3.1
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: true
license: agpl-3.0
short_description: Use AI to find obvious research links in unexpected places.
datasets:
- nomadicsynth/arxiv-dataset-abstract-embeddings
models:
- nomadicsynth/research-compass-arxiv-abstracts-embedding-model
---
# Inkling: AI-assisted research discovery

[**Inkling**](https://nomadicsynth-research-compass.hf.space) is an AI-assisted tool that helps you discover meaningful connections between research papers – the kind of links a domain expert might spot, if only they had time to read everything.

Rather than relying on surface similarity or shared keywords, Inkling is trained to recognize **reasoning-based relationships** between papers. It evaluates conceptual, methodological, and application-level connections, even across disciplines, and surfaces links that are easily overlooked at the sheer scale of today's research landscape.

This demo uses the first prototype of the model, trained on a dataset of **10,000+ rated abstract pairs** built from a larger pool of arXiv triplets. The system will continue to improve with feedback and will be released alongside the dataset for public research.
---

## What it does

- Accepts a research abstract, idea, or question
- Searches for papers with **deep, contextual relevance**
- Highlights key conceptual links and application overlaps
- Offers reasoning-based analysis between selected papers
- Gathers user feedback to improve the model over time
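To make the search step above concrete, here is a minimal sketch of semantic search over precomputed abstract embeddings. This is illustrative only, not the app's actual code: the toy NumPy vectors stand in for output from the fine-tuned embedding model, and `top_k_related` is a hypothetical helper name.

```python
import numpy as np

def top_k_related(query_vec, abstract_vecs, k=3):
    """Rank abstracts by cosine similarity to a query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    m = abstract_vecs / np.linalg.norm(abstract_vecs, axis=1, keepdims=True)
    sims = m @ q                      # cosine similarity per abstract
    order = np.argsort(-sims)[:k]     # indices of the k most similar
    return [(int(i), float(sims[i])) for i in order]

# Toy 4-d embeddings standing in for real model output
abstracts = np.array([
    [0.90, 0.10, 0.00, 0.10],  # paper 0
    [0.10, 0.80, 0.20, 0.00],  # paper 1
    [0.85, 0.20, 0.10, 0.00],  # paper 2
])
query = np.array([1.00, 0.10, 0.00, 0.05])
print(top_k_related(query, abstracts, k=2))  # papers 0 and 2 rank highest
```

In the deployed Space, the heavy lifting – turning abstracts into embeddings that encode reasoning-level relationships rather than keyword overlap – is done by the fine-tuned model listed in the frontmatter; the ranking step itself stays this simple.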
---
## Background and Motivation

Scientific progress often depends on connecting ideas across papers, fields, and years of literature. But with the volume of research growing exponentially, it is increasingly difficult for any one person, or even a team, to stay on top of it all. As a result, valuable connections between papers go unnoticed simply because the right expert never read both.

In 2024, Luo et al. published a landmark study in *Nature Human Behaviour* showing that **large language models (LLMs) can outperform human experts** in predicting the results of neuroscience experiments by integrating knowledge across the scientific literature. Their model, **BrainGPT**, demonstrated how tuning a general-purpose LLM (such as Mistral-7B) on domain-specific data could synthesize insights that surpass human forecasting ability. Notably, the authors found that models as small as 7B parameters performed well, an insight that shaped the foundation of this project.

Inspired by this work, and by a YouTube breakdown from physicist and science communicator **Sabine Hossenfelder** titled *["AIs Predict Research Results Without Doing Research"](https://www.youtube.com/watch?v=Qgrl3JSWWDE)*, this project began as an attempt to explore similar methods of knowledge integration at the level of paper-pair relationships. Her clear explanation and commentary sparked the idea of applying this paradigm not just to forecasting outcomes, but to identifying latent connections between published studies.

Originally conceived as a perplexity-ranking experiment using LLMs directly (mirroring Luo et al.'s evaluation method), the project gradually evolved into what it is now: **Inkling**, a reasoning-aware embedding model fine-tuned on LLM-rated abstract pairings, built to help researchers uncover links that would be obvious, *if only someone had the time to read everything*.

---
## Why Inkling?

> Because the right connection is often obvious – once someone points it out.

Researchers today are overwhelmed by volume. Inkling helps restore those missed-but-meaningful links between ideas, methods, and fields: links that could inspire new directions, clarify existing work, or enable cross-pollination across domains.

---
## Citation

> Luo, X., Rechardt, A., Sun, G. et al. Large language models surpass human experts in predicting neuroscience results. *Nat Hum Behav* **9**, 305–315 (2025). [https://www.nature.com/articles/s41562-024-02046-9](https://www.nature.com/articles/s41562-024-02046-9)

---
## Status

Inkling is in **alpha** and under active development. The current model runs as a Gradio app in this Hugging Face Space, available for live interaction and feedback. Contributions, feedback, and collaboration are welcome.