---
title: Inkling
emoji: π
colorFrom: indigo
colorTo: yellow
python_version: 3.10
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: true
license: agpl-3.0
short_description: Use AI to find obvious research links in unexpected places.
datasets:
- nomadicsynth/arxiv-dataset-abstract-embeddings
models:
- nomadicsynth/research-compass-arxiv-abstracts-embedding-model
---
# Inkling: AI-assisted research discovery

[**Inkling**](https://nomadicsynth-research-compass.hf.space) is an AI-assisted tool that helps you discover meaningful connections between research papers: the kind of links a domain expert might spot, if they had time to read everything.

Rather than relying on superficial similarity or shared keywords, Inkling is trained to recognize **reasoning-based relationships** between papers. It evaluates conceptual, methodological, and application-level connections, even across disciplines, and surfaces links that may be overlooked due to the sheer scale of the research landscape.

This demo uses the first prototype of the model, trained on a dataset of **10,000+ rated abstract pairs** built from a larger pool of arXiv triplets. The system will continue to improve with feedback and will be released alongside the dataset for public research.

---
## What it does
- Accepts a research abstract, idea, or question
- Searches for papers with **deep, contextual relevance** (see the retrieval sketch after this list)
- Highlights key conceptual links and application overlaps
- Offers reasoning-based analysis between selected papers
- Gathers user feedback to improve the model over time
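
To make the retrieval step concrete, here is a minimal sketch of how the model and dataset listed in this Space's metadata could be wired together for semantic search. It assumes the embedding model loads through `sentence-transformers` and that the dataset exposes `abstract` and `embedding` columns; those column names, and the brute-force search, are illustrative assumptions rather than this Space's documented internals. A production app would likely swap the dense matrix product for an approximate nearest-neighbour index such as FAISS, but the ranking logic is the same.

```python
import numpy as np
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

# Assumption: the model repo loads as a SentenceTransformer, and the dataset
# stores one precomputed vector per abstract in an "embedding" column.
model = SentenceTransformer("nomadicsynth/research-compass-arxiv-abstracts-embedding-model")
corpus = load_dataset("nomadicsynth/arxiv-dataset-abstract-embeddings", split="train")

query = "Contrastive learning of sentence embeddings for scientific text."
q = model.encode(query, normalize_embeddings=True)

emb = np.asarray(corpus["embedding"], dtype=np.float32)
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit vectors: dot product = cosine

scores = emb @ q               # cosine similarity of the query to every abstract
top = np.argsort(-scores)[:5]  # indices of the five closest papers
for i in top:
    print(f"{scores[i]:.3f}  {corpus[int(i)]['abstract'][:100]}")
```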
---
## Background and Motivation
Scientific progress often depends on connecting ideas across papers, fields, and years of literature. But with the volume of research growing exponentially, it is increasingly difficult for any one person, or even a team, to stay on top of it all. As a result, valuable connections between papers often go unnoticed simply because the right expert never read both.

In 2024, Luo et al. published a landmark study in *Nature Human Behaviour* showing that **large language models (LLMs) can outperform human experts** at predicting the results of neuroscience experiments by integrating knowledge across the scientific literature. Their model, **BrainGPT**, demonstrated that tuning a general-purpose LLM (such as Mistral-7B) on domain-specific data can synthesize insights that surpass human forecasting ability. Notably, the authors found that models as small as 7B parameters performed well, an insight that shaped the foundation of this project.

Inspired by this work, and by a YouTube breakdown from physicist and science communicator **Sabine Hossenfelder** titled *["AIs Predict Research Results Without Doing Research"](https://www.youtube.com/watch?v=Qgrl3JSWWDE)*, this project began as an attempt to explore similar methods of knowledge integration at the level of paper-pair relationships. Her clear explanation and commentary sparked the idea of applying this paradigm not just to forecasting outcomes, but to identifying latent connections between published studies.

Originally conceived as a perplexity-ranking experiment using LLMs directly (mirroring Luo et al.'s evaluation method), the project gradually evolved into what it is now: **Inkling**, a reasoning-aware embedding model fine-tuned on LLM-rated abstract pairings, built to help researchers uncover links that would be obvious *if only someone had the time to read everything*.
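
For context, that evaluation method can be sketched in a few lines: present a language model with two versions of a result sentence and treat the lower-perplexity one as the model's "prediction". The snippet below is a generic illustration of the idea using a stand-in model (GPT-2) and invented sentences; it is not the project's actual code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 is a stand-in; Luo et al. used larger, domain-tuned models.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the model (lower = less surprising)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(input_ids=ids, labels=ids).loss  # mean next-token cross-entropy
    return torch.exp(loss).item()

# Two versions of an invented result sentence; the lower-perplexity one is
# the version the model judges more plausible.
original = "Stimulating the hippocampus improved recall in the treatment group."
altered  = "Stimulating the hippocampus impaired recall in the treatment group."
print(perplexity(original), perplexity(altered))
```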
---
## Why Inkling?
> Because the right connection is often obvious, once someone points it out.

Researchers today are overwhelmed by volume. Inkling helps restore those missed-but-meaningful links between ideas, methods, and fields: links that could inspire new directions, clarify existing work, or enable cross-pollination across domains.

---
## Citation
> Luo, X., Rechardt, A., Sun, G. et al. Large language models surpass human experts in predicting neuroscience results. *Nat Hum Behav* **9**, 305–315 (2025). [https://www.nature.com/articles/s41562-024-02046-9](https://www.nature.com/articles/s41562-024-02046-9)
---
## Status
Inkling is in **alpha** and under active development. The current model is served through this Gradio-based Hugging Face Space for live interaction. Contributions, feedback, and collaboration are welcome.