
Knut Jägersberg

KnutJaegersberg

AI & ML interests

NLP, opinion mining, narrative intelligence

Recent Activity

liked a model about 5 hours ago
ibm-granite/granite-3.2-8b-instruct-preview
posted an update about 14 hours ago
A Brief Survey of Associations Between Meta-Learning and General AI
liked a Space 1 day ago
webml-community/kokoro-webgpu

Organizations

LLMs · Blog-explorers · Qwen · Social Post Explorers · M4-ai · Chinese LLMs on Hugging Face · Smol Community

KnutJaegersberg's activity

posted an update about 14 hours ago
A Brief Survey of Associations Between Meta-Learning and General AI

The paper titled "A Brief Survey of Associations Between Meta-Learning and General AI" explores how meta-learning techniques can contribute to the development of Artificial General Intelligence (AGI). Here are the key points summarized:

1. General AI (AGI) and Meta-Learning:
- AGI aims to develop algorithms that can handle a wide variety of tasks, similar to human intelligence. Current AI systems excel at specific tasks but struggle with generalization to unseen tasks.
- Meta-learning or "learning to learn" improves model adaptation and generalization, allowing AI systems to tackle new tasks efficiently using prior experiences.

2. Neural Network Design in Meta-Learning:
- Techniques like Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks enable self-improvement and adaptability for deep models, supporting generalization across tasks.
- Highway networks and ResNet-style models use shortcuts for efficient backpropagation, allowing deeper models that can be used in meta-learning frameworks (a minimal sketch follows).
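
For intuition, a minimal residual (shortcut) block of the kind ResNet-style models rely on, sketched in PyTorch (layer sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = x + F(x): the identity shortcut lets gradients flow past F."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.f(x)  # shortcut around the learned transformation

x = torch.randn(8, 64)
print(ResidualBlock()(x).shape)  # torch.Size([8, 64])
```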

3. Coevolution:
- Coevolution involves the mutual evolution of multiple components, such as learners or task-solvers, to improve overall performance.
- Coevolution between learners enhances collaboration and competition within AI systems, while coevolution between tasks and solvers (e.g., POWERPLAY and AI-GA frameworks) pushes solvers to adapt to increasingly complex tasks.

4. Curiosity in Meta-Learning:
- Curiosity-based exploration encourages AI systems to discover new, diverse features of the environment, avoiding local optima.
- Curiosity-based objectives can be combined with performance-based objectives to ensure efficient exploration and adaptation in complex tasks (a toy sketch follows).
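
As a toy illustration (my sketch, not the paper's method), a count-based curiosity bonus blended with the task reward; `beta` and the novelty measure are assumptions:

```python
from collections import defaultdict

visit_counts = defaultdict(int)

def combined_reward(state, task_reward: float, beta: float = 0.1) -> float:
    """Blend a performance-based reward with a count-based curiosity bonus."""
    visit_counts[state] += 1
    novelty = 1.0 / visit_counts[state] ** 0.5  # rarely seen states score higher
    return task_reward + beta * novelty

print(combined_reward("s0", task_reward=1.0))  # first visit: larger bonus
print(combined_reward("s0", task_reward=1.0))  # repeat visit: smaller bonus
```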

5. Forgetting Mechanisms:
- Forgetting is crucial to avoid memory overload in AI systems.

https://arxiv.org/abs/2101.04283
posted an update 2 days ago
Artificial general intelligence through recursive data compression and grounded reasoning: a position paper


This paper proposes a system to achieve AGI through general data compression and grounded reasoning.

General Data Compression involves creating a flexible algorithm that adapts to input data to simplify and compress it recursively, identifying simple, orthogonal features to avoid redundancy. The algorithm measures AGI progress by solving problems based on increasing complexity, and it expands its search space according to the data itself. Compression is applied not only to data but also to model parameters, and sequences are segmented based on compressibility.
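
To make the compressibility idea concrete, a small sketch (my illustration, not the paper's algorithm) that scores segments of a sequence by how well zlib compresses them:

```python
import zlib

def compression_ratio(data: bytes) -> float:
    """Smaller ratio = more regular (more compressible) data."""
    return len(zlib.compress(data)) / max(len(data), 1)

def segment_by_compressibility(seq: bytes, window: int = 64):
    """Split a sequence into windows and score each by compressibility."""
    return [
        (i, compression_ratio(seq[i:i + window]))
        for i in range(0, len(seq), window)
    ]

regular = b"abab" * 64          # highly compressible
noisy = bytes(range(256))       # far less compressible
print(compression_ratio(regular), compression_ratio(noisy))
```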

Grounded Reasoning refers to forming representations with various granularities, crucial for commonsense reasoning and AGI. The system simulates the real world as its model, switching between representations and maximizing resourcefulness. Key ideas include the world as its own model for reasoning and actions aimed at maximizing entropy to test hypotheses.

The paper emphasizes simplicity, data-dependent bias, recursion, orthogonality, resourcefulness, and grounding in real-world contexts as fundamental principles in building an AGI system.

https://arxiv.org/abs/1506.04366
reacted to etemiz's post with 👀 7 days ago
reacted to chansung's post with 👍 7 days ago
A brief summary of the o3-mini

The OpenAI o3-mini model is a significant improvement over the o1-mini, reaching o1 performance levels. While generally good, its performance isn't universally better than previous models (o1, o1-preview) or GPT-4o across all benchmarks. This means workflows should be re-evaluated with each model upgrade.

The o3-mini has "low," "medium," and "high" versions, with "low" being the base model used for benchmarking. It's speculated that the higher versions simply involve more processing. A fair comparison with other models like Gemini 2.0 Thinking or DeepSeek-R1 would likely need to use the "low" version and a similar "think more" mechanism.

The system card is recommended reading due to its comprehensive benchmark data.

https://openai.com/index/openai-o3-mini/
posted an update 9 days ago
Anthropomorphic reasoning about neuromorphic AGI safety

Summary of "Anthropomorphic Reasoning About Neuromorphic AGI Safety"
This paper explores safety strategies for neuromorphic artificial general intelligence (AGI), defined as systems designed by reverse-engineering essential computations of the human brain. Key arguments and proposals include:

1. Anthropomorphic Reasoning Validity:
- Neuromorphic AGI’s design and assessment rely on human cognition models, making anthropomorphic reasoning (using human-like traits) critical for safety analysis. Comparisons to human behavior and neural mechanisms provide insights into AGI behavior and risks.

2. Countering Safety Criticisms:
- The authors challenge claims that neuromorphic AGI is inherently more dangerous than other AGI approaches. They argue all AGI systems face intractable verification challenges (e.g., real-world unpredictability, incomputable action validation). Neuromorphic AGI may even offer safety advantages by enabling comparisons to human cognitive processes.

3. Motivational Architecture:
- Basic drives (e.g., curiosity, social interaction) are essential for cognitive development and safety. These pre-conceptual, hardwired drives (analogous to human hunger or affiliation) shape learning and behavior. The orthogonality thesis (intelligence and goals as independent) is contested, as neuromorphic AGI’s drives likely intertwine with its cognitive architecture.

4. Safety Strategies:
- **Social Drives**: Embedding drives like caregiving, affiliation, and cooperation ensures AGI develops prosocial values through human interaction.
- **Bounded Reward Systems**: Human-like satiation mechanisms (e.g., diminishing rewards after fulfillment) prevent extreme behaviors (e.g., paperclip maximization); a minimal sketch follows this list.
- **Developmental Environment**: Exposure to diverse, positive human interactions and moral examples fosters…
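
To make the bounded-reward idea concrete, a minimal sketch (my illustration, not the paper's mechanism) of a reward that saturates as a drive is fulfilled:

```python
import math

def satiating_reward(consumed: float, capacity: float = 10.0) -> float:
    """Reward saturates as cumulative consumption approaches capacity,
    so ever-more consumption stops paying off (no paperclip maximizing)."""
    return 1.0 - math.exp(-consumed / capacity)  # bounded in [0, 1)

for consumed in (1, 5, 20, 100):
    print(consumed, round(satiating_reward(consumed), 3))  # marginal gain shrinks
```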

https://ccnlab.org/papers/JilkHerdReadEtAl17.pdf
posted an update 13 days ago
Evolution and The Knightian Blindspot of Machine Learning


The paper discusses machine learning's limitations in addressing Knightian Uncertainty (KU), highlighting the fragility of models like reinforcement learning (RL) in unpredictable, open-world environments. KU refers to uncertainty that can't be quantified or predicted, a challenge that RL fails to handle due to its reliance on fixed data distributions and limited formalisms.


### Key Approaches:

1. **Artificial Life (ALife):** Simulating diverse, evolving systems to generate adaptability, mimicking biological evolution's robustness to unpredictable environments.

2. **Open-Endedness:** Creating AI systems capable of continuous innovation and adaptation, drawing inspiration from human creativity and scientific discovery.

3. **Revising RL Formalisms:** Modifying reinforcement learning (RL) models to handle dynamic, open-world environments by integrating more flexible assumptions and evolutionary strategies.

These approaches aim to address ML’s limitations in real-world uncertainty and move toward more adaptive, general intelligence.
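
As one concrete handle on the evolutionary ingredient above, a toy evolution-strategies update (illustrative, not the paper's proposal) that improves parameters using the fitness of noisy perturbations:

```python
import numpy as np

def es_step(theta, fitness, pop_size=50, sigma=0.1, lr=0.05):
    """One evolution-strategies update: evaluate Gaussian perturbations
    of theta and move toward the better-scoring ones."""
    noise = np.random.randn(pop_size, theta.size)
    scores = np.array([fitness(theta + sigma * n) for n in noise])
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalize
    return theta + lr / (pop_size * sigma) * noise.T @ scores

# Maximize a simple fitness: closeness to the target vector [1, -2].
target = np.array([1.0, -2.0])
fitness = lambda th: -np.sum((th - target) ** 2)
theta = np.zeros(2)
for _ in range(200):
    theta = es_step(theta, fitness)
print(theta)  # approaches [1, -2]
```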

https://arxiv.org/abs/2501.13075
reacted to clem's post with 🔥 13 days ago
AI is not a zero-sum game. Open-source AI is the tide that lifts all boats!
posted an update 15 days ago
Artificial Kuramoto Oscillatory Neurons

Artificial Kuramoto Oscillatory Neurons (AKOrN) differ from traditional artificial neurons by oscillating, rather than just turning on or off. Each neuron is represented by a rotating vector on a sphere, influenced by its connections to other neurons. This behavior is based on the Kuramoto model, which describes how oscillators (like neurons) tend to synchronize, similar to pendulums swinging in unison.

Key points:

- Oscillating Neurons: Each AKOrN’s rotation is influenced by its connections, and they try to synchronize or oppose each other.
- Synchronization: When neurons synchronize, they "bind," allowing the network to represent complex concepts (e.g., "a blue square toy") by compressing information.
- Updating Mechanism: Neurons update their rotations based on connected neurons, input stimuli, and their natural frequency, using a Kuramoto update formula.
- Network Structure: AKOrNs can be used in various network layers, with iterative blocks combining Kuramoto layers and feature extraction modules.
- Reasoning: This model can perform reasoning tasks, like solving Sudoku puzzles, by adjusting neuron interactions.
- Advantages: AKOrNs offer robust feature binding, reasoning capabilities, resistance to adversarial data, and well-calibrated uncertainty estimation.
In summary, AKOrN's oscillatory neurons and synchronization mechanisms enable the network to learn, reason, and handle complex tasks like image classification and object discovery with enhanced robustness and flexibility.
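
The classic Kuramoto update that AKOrN builds on can be sketched in a few lines (scalar phases here for simplicity; the paper uses rotating vectors on a sphere):

```python
import numpy as np

def kuramoto_step(theta, omega, K=1.0, dt=0.01):
    """dtheta_i/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)."""
    diffs = theta[None, :] - theta[:, None]   # pairwise phase differences
    coupling = np.sin(diffs).mean(axis=1)     # mean-field coupling term
    return theta + dt * (omega + K * coupling)

rng = np.random.default_rng(0)
n = 100
theta = rng.uniform(0, 2 * np.pi, n)   # random initial phases
omega = rng.normal(1.0, 0.1, n)        # natural frequencies
for _ in range(2000):
    theta = kuramoto_step(theta, omega, K=2.0)
# Order parameter r -> 1 as the oscillators synchronize:
print(abs(np.exp(1j * theta).mean()))
```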

YouTube: https://www.youtube.com/watch?v=i3fRf6fb9ZM
Paper: https://arxiv.org/html/2410.13821v1
replied to their post 16 days ago

Meaning-making is always work!
We can discriminate against (partially) AI-generated content, to our own disadvantage. That's freedom of choice.

posted an update 16 days ago
posted an update 20 days ago
reacted to AtAndDev's post with 🚀 20 days ago
R1 is out! And with a lot of other R1-related models...
posted an update 21 days ago
Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI

It's an interesting paper that argues "new approaches are required that can reliably solve a wide variety of problems without existing skills."
"It is therefore hoped that the benchmark outlined in this article contributes to further exploration of this direction of research and incentivises the development of new AGI approaches that focus on intelligence rather than skills."

https://arxiv.org/abs/2501.07458
reacted to prithivMLmods's post with 🔥 27 days ago
Reasoning SmolLM2 🚀

🎯Fine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.

🔥Blog : https://huggingface.co/blog/prithivMLmods/smollm2-ft

🔼 Models :
+ SmolLM2-CoT-360M : prithivMLmods/SmolLM2-CoT-360M
+ Reasoning-SmolLM2-135M : prithivMLmods/Reasoning-SmolLM2-135M
+ SmolLM2-CoT-360M-GGUF : prithivMLmods/SmolLM2-CoT-360M-GGUF

🤠 Other Details :
+ Demo : prithivMLmods/SmolLM2-CoT-360M
+ Fine-tune notebook : prithivMLmods/SmolLM2-CoT-360M
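
A minimal generation sketch with the listed SmolLM2-CoT-360M checkpoint (standard transformers usage; prompt format is an assumption, so check the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prithivMLmods/SmolLM2-CoT-360M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Question: What is 17 * 23? Think step by step."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```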




reacted to davanstrien's post with 🔥 27 days ago
Introducing scandi-fine-web-cleaner davanstrien/scandi-fine-web-cleaner, the first model trained on FineWeb-C community annotations!

FineWeb2 is a massive multilingual dataset for pre-training language models. Like any web-scale dataset, it contains low-quality content. How can we improve it?

Over the past months, an amazing community of 400+ annotators has been labelling content quality (using Argilla) across 23 languages through the FineWeb-C initiative.

Today, I'm happy to share the first classifier trained on this data.

🔍 What we've built:

- A lightweight classifier that efficiently removes low-quality content
- 90%+ precision demonstrated on Danish & Swedish
- Can process the 43M+ documents in Danish FineWeb2 with minimal compute

🌍 Why this matters: The approach can be reproduced for any of the 23 languages in FineWeb-C (data-is-better-together/fineweb-c). We can improve training data quality at scale without massive compute resources by starting with community annotations and training small, efficient classifiers.

Want to build a classifier for your language? Check out the full blog post with code examples and implementation details: https://danielvanstrien.xyz/posts/2025/FineWeb-c/scandinavian-content-filtering-fineweb.html
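
A plausible usage sketch via the transformers pipeline (the label names in the output are assumptions; see the model card and blog post for the real ones):

```python
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="davanstrien/scandi-fine-web-cleaner")

docs = ["En velskrevet dansk artikel om videnskab.",
        "click here buy now!!! billig billig"]
for doc, pred in zip(docs, classifier(docs)):
    # pred is a dict like {"label": ..., "score": ...}
    print(pred["label"], round(pred["score"], 3), "|", doc[:40])
```
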
reacted to merve's post with ❤️ 27 days ago
there's a new multimodal retrieval model in town 🤠
LlamaIndex released vdr-2b-multi-v1
> uses 70% fewer image tokens, yet outperforms other dse-qwen2-based models
> 3x faster inference with less VRAM 💨
> shrinkable with matryoshka 🪆
> can do cross-lingual retrieval!
Collection: llamaindex/visual-document-retrieval-678151d19d2758f78ce910e1 (with models and datasets)
Demo: llamaindex/multimodal_vdr_demo
Learn more from their blog post here https://huggingface.co/blog/vdr-2b-multilingual 📖
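
A hedged retrieval sketch, assuming the checkpoint loads through sentence-transformers as many HF embedding models do (check the model card for the supported interface); the truncation illustrates the matryoshka "shrinkable" property:

```python
from sentence_transformers import SentenceTransformer, util

# Assumption: the checkpoint exposes a sentence-transformers interface.
model = SentenceTransformer("llamaindex/vdr-2b-multi-v1")

query = "annual revenue table"
docs = ["Quarterly earnings report with revenue figures.",
        "A recipe for sourdough bread."]

q_emb = model.encode(query)
d_embs = model.encode(docs)

# Matryoshka-style shrinking: keep only the first 512 dimensions.
print(util.cos_sim(q_emb[:512], d_embs[:, :512]))
```
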
posted an update 27 days ago
reacted to s3nh's post with ❤️ about 1 month ago
Welcome back,

Small Language Model enthusiasts and GPU-poor OSS enjoyers, let's connect.
Just created an organization whose main goal is to have fun with smaller models tunable on consumer-range GPUs. Feel free to join and let's have some fun, much love ;3

https://huggingface.co/SmolTuners
posted an update about 2 months ago
reacted to sayakpaul's post with 🤗 2 months ago
Introducing a high-quality open-preference dataset to further this line of research for image generation.

Despite being such an integral component of modern image generation, open preference datasets are a rarity!

So, we decided to work on one with the community!

Check it out here:
https://huggingface.co/blog/image-preferences
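
A minimal loading sketch with the datasets library (the repo id below is hypothetical, for illustration only; take the real one from the blog post):

```python
from datasets import load_dataset

# Hypothetical repo id; the blog post links the actual dataset.
ds = load_dataset("your-org/open-image-preferences", split="train")
example = ds[0]
print(example.keys())  # e.g., a prompt, candidate images, preference label
```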