
Nicolay Rusnachenko
nicolay-r's activity



Provider: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_phi4.py
Tutorial: https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_phi4.py
Findings on adaptation: I was able to reproduce only the pipeline-based model launching (see the sketch below). This version is for textual LLMs only; Microsoft also released a multimodal Phi-4, which is out of scope of this wrapper.
nlp-thirdgate: https://lnkd.in/ef-wBnNn
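For reference, here is a minimal sketch of the pipeline-based launching mentioned above, assuming a recent transformers release with Phi-4 support; the prompt and generation settings are illustrative, not the exact wrapper configuration:

```python
# Minimal sketch: pipeline-based launching of the text-only Phi-4.
# Settings are illustrative; see transformers_phi4.py for the actual provider.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="microsoft/phi-4",
    device_map="auto",
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Explain chain-of-thought prompting in two sentences."}]
result = pipe(messages, max_new_tokens=128)
# The pipeline returns the conversation with the assistant reply appended.
print(result[0]["generated_text"][-1]["content"])
```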

https://github.com/nicolay-r/bulk-chain/releases/tag/0.25.2
🔧 Fixes:
- Fixed issues with batching mode
- Fixed problem with parsing and passing args in shell mode
⚠️ Limitation: batching mode is still available only via the API.
Quick Start with Gemma-3 in batching mode: https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_gemma_3.ipynb

Important note: use the very latest version of bulk-chain from GitHub, which fixes the double-inference bug in batching mode.

https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_gemma_3.ipynb
Limitation: the schema supports text only (for now), while Gemma-3 is a text+image-to-text model.
Model: google/gemma-3-1b-it
Provider: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_gemma3.py
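As a rough illustration of what the text-only provider does under the hood, here is a hedged sketch with plain transformers; generation settings are assumptions, and the actual script may differ:

```python
# Sketch: text-only generation with google/gemma-3-1b-it via transformers.
# Settings are illustrative; see transformers_gemma3.py for the actual provider.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-3-1b-it"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

inputs = tok.apply_chat_template(
    [{"role": "user", "content": "Name three sentiment analysis datasets."}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```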

This makes it particularly mysterious what went into QwQ-32B. Why did it work so well? Was it trained from scratch? Does anyone have insights about this?
onekq-ai/WebApp1K-models-leaderboard
@ritvik77, your plans sound good! Meanwhile, I am looking forward to adapting the 7B version for experiments in the radiology domain. Happy to read more on that, and if it makes it into a paper, I can add it to the survey of related advances.
@ritvik77, excited to run into this! Are the paper and the studies behind it on arXiv or elsewhere?

🩺 Medical Diagnosis AI Model - Powered by Mistral-7B & LoRA
🔹 Model Overview:
Base Model: Mistral-7B (7.7 billion parameters)
Fine-Tuning Method: LoRA (Low-Rank Adaptation)
Quantization: bnb_4bit (reduces memory footprint while retaining performance)
🔹 Parameter Details:
Original Mistral-7B Parameters: 7.7 billion
LoRA Fine-Tuned Parameters: 4.48% of total model parameters (340 million)
Final Merged Model Size (bnb_4bit Quantized): ~4.5 GB
This can help you build an AI agent for healthcare. If you need to fine-tune it for a JSON function/tool-calling format, you can use a medical function-calling dataset to fine-tune it again.
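For readers who want to reproduce the described setup, here is a hedged sketch of 4-bit loading plus LoRA with transformers, peft, and bitsandbytes; the base checkpoint id, target modules, and LoRA ranks are assumptions, not the exact training configuration:

```python
# Sketch: Mistral-7B in bnb_4bit with a LoRA adapter (hyperparameters are illustrative).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",  # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # prints the trainable share, cf. the 4.48% above
```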
How to perform batch inference for llama-3.2-1B-Instruct?
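One common approach, sketched here with plain transformers (the model id comes from the question; prompts and generation settings are illustrative):

```python
# Sketch: batched generation for meta-llama/Llama-3.2-1B-Instruct.
# Left padding keeps each generated continuation adjacent to its prompt's end.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id, padding_side="left")
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

questions = ["What is LoRA?", "Define 4-bit quantization in one sentence."]
prompts = [
    tok.apply_chat_template([{"role": "user", "content": q}],
                            add_generation_prompt=True, tokenize=False)
    for q in questions
]
batch = tok(prompts, return_tensors="pt", padding=True).to(model.device)
with torch.no_grad():
    out = model.generate(**batch, max_new_tokens=64, pad_token_id=tok.pad_token_id)
# Strip the prompt tokens from each row before decoding.
for row in out:
    print(tok.decode(row[batch["input_ids"].shape[1]:], skip_special_tokens=True))
```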


Code: https://github.com/Jaykef/ai-algorithms/blob/main/hybrid_normalization.ipynb
@ychen, I see. I was expecting your findings to be part of a PhD program. Take your time with publications then, since that is common during a PhD. It would be great to have a paper during your master's, and all the best with it!
@ychen Good luck with your studies; I am pleased to have contributed to your progress. Are you on Google Scholar or GitHub with your personal work in this domain?