Meta team just dropped the first Watermarking model that no edit can break! 🛡️
🤔 Ever heard of watermarking? It's a technique that lets you mark an image with its original source. It's our best shield against AI-generated deepfakes, or content stolen from artists! 🎨
Watermarking systems are actually a pair of models: a watermark embedder that applies the watermark to the image, and its corresponding decoder, which should detect the original watermark.
But until now, methods were very limited: they could only apply and detect the watermark on your image as a whole. So if you're an attacker, it's easy to break: just crop it! Add text on top! Or whatever, really: pretty much any edit would break the watermark.
A team of researchers at Meta was not happy with this.
So to withstand real-world attacks, they decided to make a watermarking model that would also work on any sub-part of the image. It's a real paradigm shift: they consider watermarking not as an image classification task, but as an image segmentation task!
🏗️ Architecture
▸ The "Embedder" (a variational autoencoder + embedder, 1.1M parameters in total) encodes an n-bit message into a watermark signal that is added to the original image.
▸ [Only during training] The "Augmenter" randomly distorts the image: masks parts, crops, resizes, compresses. It's basically torture at this point.
▸ The "Extractor" (a vision transformer, or ViT, with 96M parameters) then re-extracts the message from the distorted image by predicting a (1+n)-dimensional vector per pixel: 1 value to flag whether the pixel is watermarked, plus n values to decode the corresponding message bits.
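To make the per-pixel (1+n) output concrete, here is a minimal NumPy sketch of how such a prediction could be turned into a mask and a message. This is not the paper's code: the channel layout (channel 0 for detection, channels 1..n for bits), the 0.5 threshold, the averaging of bit logits over detected pixels, and the helper names `decode_message` and `make_toy_prediction` are all assumptions made for illustration.

```python
import numpy as np


def decode_message(pred, mask_threshold=0.5):
    """Decode a toy per-pixel prediction of shape (1 + n, H, W).

    Assumed layout (hypothetical, for illustration only):
      channel 0      -> logit that the pixel is watermarked
      channels 1..n  -> logits for each of the n message bits
    """
    # Sigmoid turns the detection logit into a probability per pixel.
    mask_prob = 1.0 / (1.0 + np.exp(-pred[0]))
    mask = mask_prob > mask_threshold
    if not mask.any():
        return mask, None  # nothing detected, so no message to decode
    # Average each bit's logit over the detected pixels only, then threshold.
    bit_logits = pred[1:, mask].mean(axis=1)
    bits = (bit_logits > 0).astype(int)
    return mask, bits


def make_toy_prediction(message, height=8, width=8):
    """Build a fake extractor output where only the left half is watermarked."""
    n = len(message)
    pred = np.zeros((1 + n, height, width))
    pred[0] = -5.0                      # "not watermarked" everywhere...
    pred[0, :, : width // 2] = 5.0      # ...except the left half
    signs = np.where(np.array(message) == 1, 3.0, -3.0)
    pred[1:, :, : width // 2] = signs[:, None, None]
    return pred


if __name__ == "__main__":
    pred = make_toy_prediction([1, 0, 1, 1])
    mask, bits = decode_message(pred)
    print("watermarked pixels:", int(mask.sum()))
    print("decoded bits:", bits.tolist())
```

Because the message is read only from pixels flagged as watermarked, cropping or pasting over the rest of the image leaves the decoding untouched in this toy setup, which is the intuition behind treating watermarking as segmentation.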
The performance blows existing models out of the water, and they even created new (segmentation-related) tasks to put it to the test!
Great work @pierrefdz and @tomsander1998 !
Paper here: Watermark Anything with Localized Messages (2411.07231)