Jonah Ramponi

jonah-ramponi

AI & ML interests

NLP

Organizations

None yet

jonah-ramponi's activity

posted an update 4 months ago
From Article 50 of the EU AI Act:

"2. Providers of AI systems, including general-purpose AI systems, generating synthetic audio, image, video or text content, shall ensure that the outputs of the AI system are marked in a machine-readable format and detectable as artificially generated or manipulated."

How might this be put into practice?

I'm interested in how content would be deemed "detectable" as artificially generated. Would an image still need to be detectable as AI-generated after it has been copied out of the site / application it was created on?

Some sort of watermark? LSB steganography? I wonder if OpenAI are already sneaking something like this into DALL-E images.
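
To make the LSB idea concrete, here is a minimal sketch, not a claim about what any provider actually does. It hides a short marker in the least significant bit of each byte of a raw pixel buffer. The function names `embed_lsb` and `extract_lsb`, the `b"AI"` marker, and the stand-in `carrier` buffer are all hypothetical, chosen for illustration.

```python
def embed_lsb(pixels: bytes, payload: bytes) -> bytes:
    """Write the payload bits into the least significant bit of each pixel byte."""
    # Flatten the payload into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for carrier")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_lsb(pixels: bytes, n_bytes: int) -> bytes:
    """Read n_bytes of payload back out of the least significant bits."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for b in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


# Stand-in for raw 8-bit pixel data (assumption: uncompressed bytes).
carrier = bytes(range(256))
marked = embed_lsb(carrier, b"AI")
recovered = extract_lsb(marked, 2)
```

Each byte changes by at most 1, so the mark is invisible to the eye, but it is also fragile: lossy re-encoding (for example saving as JPEG) or resizing would likely destroy it, which matters for the "copied out of the site" case.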

Some sort of hash, which allows content to be looked up and verified as AI-generated?
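
A rough sketch of the hash-lookup idea, assuming the provider keeps a registry of digests for everything it generates. The class `GeneratedContentRegistry` and its methods are hypothetical; only `hashlib.sha256` is a real library call.

```python
import hashlib


class GeneratedContentRegistry:
    """Hypothetical in-memory registry of SHA-256 digests of generated outputs."""

    def __init__(self) -> None:
        self._digests: set[str] = set()

    def register(self, content: bytes) -> str:
        """Record a generated output and return its digest."""
        digest = hashlib.sha256(content).hexdigest()
        self._digests.add(digest)
        return digest

    def is_registered(self, content: bytes) -> bool:
        """Check whether these exact bytes were registered as generated."""
        return hashlib.sha256(content).hexdigest() in self._digests


registry = GeneratedContentRegistry()
registry.register(b"a generated caption")
exact_match = registry.is_registered(b"a generated caption")
edited_match = registry.is_registered(b"a generated caption.")
```

Here `exact_match` is true and `edited_match` is false: a cryptographic hash only matches byte-identical content, so any edit, crop, or re-encode breaks the lookup. A scheme like this would probably need some similarity-tolerant fingerprint instead, which this sketch does not attempt.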

Would a pop-up saying "this output was generated with AI" suffice? Any ideas? Time is on the system providers' side, at least for now: from what I can see, this doesn't come into effect until August 2026.

src: https://artificialintelligenceact.eu/article/50/
posted an update 4 months ago
Thought this was an interesting graphic from the EAGLE blog post. It made me wonder if certain sampling methods have been shown to work better for certain tasks.

Does anyone know of any work looking at trends in the output token probability distribution by task type? (or similar)

Source: https://sites.google.com/view/eagle-llm
posted an update 4 months ago
🧠Shower Thought:

Chatbots should let users select their preferred reading speed, defined by words per minute.

By dynamically adjusting batch sizes based on user-defined reading speeds, you could more effectively distribute requests, especially in large-scale distributed systems. For users preferring slower token generation, larger batches can be processed concurrently, maximising GPU throughput without compromising user experience (as these users have expressed they are indifferent to, or may even prefer, higher latency).
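
As a back-of-the-envelope sketch of how a reading speed could map to a batch size: convert words per minute into a required tokens-per-second rate, then pick the largest batch whose decode step still meets that rate. Everything numeric here is an assumption for illustration: the 1.3 tokens-per-word ratio, the linear step-time model (`base_step_s + per_seq_step_s * batch_size`), and the cap are not measurements from any real serving system.

```python
def required_tokens_per_second(words_per_minute: float,
                               tokens_per_word: float = 1.3) -> float:
    """Token rate needed to keep up with a reader (tokens_per_word is assumed)."""
    return words_per_minute * tokens_per_word / 60.0


def max_batch_size(words_per_minute: float,
                   base_step_s: float = 0.02,
                   per_seq_step_s: float = 0.001,
                   tokens_per_word: float = 1.3,
                   cap: int = 256) -> int:
    """Largest batch that still delivers one token per sequence fast enough.

    Assumes a toy latency model: one decode step for a batch of size b takes
    base_step_s + per_seq_step_s * b seconds and yields one token per sequence.
    """
    rate = required_tokens_per_second(words_per_minute, tokens_per_word)
    max_step_s = 1.0 / rate  # slowest acceptable time per token
    if max_step_s < base_step_s + per_seq_step_s:
        return 1  # even a batch of one is too slow; serve alone
    return min(cap, int((max_step_s - base_step_s) / per_seq_step_s))


slow_reader_batch = max_batch_size(250)   # relaxed, conversational pace
fast_reader_batch = max_batch_size(1000)  # "as quickly as possible"
```

Under these made-up constants, the slower reader can share a decode step with many more sequences than the faster one, which is the throughput gain the idea relies on.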

The same user may also prefer different reading speeds for different tasks. When generating code, I want responses as quickly as possible. But when I'm bouncing ideas off an LLM, I'd prefer a more readable pace rather than a wall of text.


Thoughts?
replied to victor's post 4 months ago

One of my favourite features of Hugging Face is the daily papers page. It is great, but I think it could be improved in a number of ways:

  • it would be nice for papers to be tagged by topic: [LLM], [Multimodal] and so on. Then the user could filter by the topics relevant to them when looking through daily papers.

  • it would be nice to be able to search over "influential" papers: those with many upvotes, which are key reading.

  • I find that the current search is a bit temperamental - better ability to search and find relevant papers would be beneficial in my opinion.