Adina Yakefu

AdinaY

AI & ML interests

None yet

Recent Activity

updated a collection about 5 hours ago
✨ MoE model 2025
updated a collection about 5 hours ago
2025 March
liked a model about 5 hours ago
deepseek-ai/DeepSeek-V3-0324
View all activity

Organizations

Hugging Face's profile picture Hugging Face Chinese Localization's profile picture Huggingface Projects's profile picture Blog-explorers's profile picture ICCV2023's profile picture Open LLM Leaderboard's profile picture huggingPartyParis's profile picture Qwen's profile picture Journalists on Hugging Face's profile picture Women on Hugging Face's profile picture Social Post Explorers's profile picture Chinese LLMs on Hugging Face's profile picture Hugging Face for Legal's profile picture

AdinaY's activity

posted an update about 21 hours ago
replied to their post 1 day ago
posted an update 5 days ago
posted an update 6 days ago
posted an update 6 days ago
posted an update 7 days ago
view post
Post
2053
Skywork-R1V🚀 38B open multimodal reasoning model with advanced visual CoT capabilities, released by Skywork.

Skywork/Skywork-R1V-38B

✨ Visual Reasoning: Breaks down complex images step by step.
✨ Math & Science: Solves visual problems with high precision.
✨ Combines text & images for deeper understanding.

posted an update 7 days ago
reacted to Kseniase's post with 🔥 9 days ago
view post
Post
7607
15 types of attention mechanisms

Attention mechanisms allow models to dynamically focus on specific parts of their input when performing tasks. In our recent article, we discussed Multi-Head Latent Attention (MLA) in detail and now it's time to summarize other existing types of attention.

Here is a list of 15 types of attention mechanisms used in AI models:

1. Soft attention (Deterministic attention) -> Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473)
Assigns a continuous weight distribution over all parts of the input. It produces a weighted sum of the input using attention weights that sum to 1.

2. Hard attention (Stochastic attention) -> Effective Approaches to Attention-based Neural Machine Translation (1508.04025)
Makes a discrete selection of some part of the input to focus on at each step, rather than attending to everything.

3. Self-attention -> Attention Is All You Need (1706.03762)
Each element in the sequence "looks" at other elements and "decides" how much to borrow from each of them for its new representation.

4. Cross-Attention (Encoder-Decoder attention) -> Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation (2104.08771)
The queries come from one sequence and the keys/values come from another sequence. It allows a model to combine information from two different sources.

5. Multi-Head Attention (MHA) -> Attention Is All You Need (1706.03762)
Multiple attention “heads” are run in parallel.​ The model computes several attention distributions (heads), each with its own set of learned projections of queries, keys, and values.

6. Multi-Head Latent Attention (MLA) -> DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (2405.04434)
Extends MHA by incorporating a latent space where attention heads can dynamically learn different latent factors or representations.

7. Memory-Based attention -> End-To-End Memory Networks (1503.08895)
Involves an external memory and uses attention to read from and write to this memory.

See other types in the comments 👇
  • 1 reply
·
reacted to aifeifei798's post with 👀 10 days ago
view post
Post
1161
一个加入水印的小程序
from PIL import Image, ImageDraw, ImageFont

def add_watermark(image):
    watermark_text = "AI Generated by DarkIdol FeiFei"

    # Ensure the input is an Image object
    if not isinstance(image, Image.Image):
        raise ValueError("Input must be a PIL Image object")

    width, height = image.size

    # Create a drawing object to draw on the image
    draw = ImageDraw.Draw(image)

    # Set the font size for the watermark text
    font_size = 10  # Set font size to 10
    try:
        # Try to use a common font file
        font = ImageFont.truetype("Iansui-Regular.ttf", font_size)
    except IOError:
        # Use the default font if the specified font file is not found
        font = ImageFont.load_default()

    # Calculate the width and height of the watermark text using textbbox
    bbox = draw.textbbox((0, 0), watermark_text, font=font)
    text_width = bbox[2] - bbox[0]
    text_height = bbox[3] - bbox[1]

    # Calculate the position for the watermark text (bottom-right corner)
    x = width - text_width - 10  # 10 is the right margin
    y = height - text_height - 10  # 10 is the bottom margin

    # Add the watermark text to the image
    draw.text((x, y), watermark_text, font=font, fill=(255, 255, 255, 128))

    # Return the modified image object
    return image

- 字体从https://fonts.google.com去找就可以了,程序都标注清楚了,自行修改
  • 1 reply
·
reacted to clem's post with 🚀🤗 11 days ago
view post
Post
4559
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
·
posted an update 12 days ago
posted an update 12 days ago
reacted to clefourrier's post with 🚀 13 days ago
view post
Post
1860
Gemma3 family is out! Reading the tech report, and this section was really interesting to me from a methods/scientific fairness pov.

Instead of doing over-hyped comparisons, they clearly state that **results are reported in a setup which is advantageous to their models**.
(Which everybody does, but people usually don't say)

For a tech report, it makes a lot of sense to report model performance when used optimally!
On leaderboards on the other hand, comparison will be apples to apples, but in a potentially unoptimal way for a given model family (like some user interact sub-optimally with models)

Also contains a cool section (6) on training data memorization rate too! Important to see if your model will output the training data it has seen as such: always an issue for privacy/copyright/... but also very much for evaluation!

Because if your model knows its evals by heart, you're not testing for generalization.
reacted to thomwolf's post with 🚀🔥 13 days ago
view post
Post
2558
We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder ( open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming –a domain Anthropic has been historically really strong at– and it's getting close to o1-mini/R1 on olympiad level coding with just 7B parameters!

And the best part is that we're open-sourcing all about its training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets are are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions
posted an update 13 days ago
posted an update 13 days ago
posted an update 19 days ago
view post
Post
2295
Babel🗼A multilingual LLM supporting 25 languages, released by the Alibaba DAMO team.

Model: Tower-Babel/babel-67c172157372d4d6c4b4c6d5
Paper: Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers (2503.00865)

✨ 9B/83B chat & base
✨ Supports 25 languages: English, Chinese, Hindi, Spanish, Arabic, French, Bengali, Portuguese, Russian, Urdu, Indonesian, German, Japanese, Swahili, Filipino, Tamil, Vietnamese, Turkish, Italian, Javanese, Korean, Hausa, Persian, Thai, and Burmese
  • 1 reply
·
reacted to clem's post with 🔥 21 days ago
view post
Post
5900
Super happy to welcome Nvidia as our latest enterprise hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!

Nvidia's org: https://huggingface.co/nvidia
Enterprise hub: https://huggingface.co/enterprise