
Shadin Pira

shadinpira80

AI & ML interests

None yet

Recent Activity

Organizations

None yet

shadinpira80's activity

New activity in r3gm/Aesthetic_RVC_Inference_HF 3 months ago

Update README.md

#130 opened 3 months ago by
shadinpira80
New activity in leonelhs/FaceFusion 3 months ago

Update README.md

#7 opened 3 months ago by
shadinpira80
reacted to merve's post with 🔥 7 months ago
I love Depth Anything V2 😍
It's Depth Anything, but scaled with both a larger teacher model and a gigantic dataset!

Here's a short TLDR of the paper, with a lot of findings, experiments and more.
I have also created a collection that has the models, the dataset, the demo and a CoreML-converted model 😚 merve/depth-anything-v2-release-6671902e798cd404513ffbf5

The authors compared Marigold, a diffusion-based model, against Depth Anything and found out what's up with using synthetic images vs real images for MDE (monocular depth estimation):

🔖 Real data has a lot of label noise and inaccurate depth maps (caused by depth sensors missing transparent objects, etc.), and many details are overlooked

🔖 Synthetic data has more precise and detailed depth labels that are truly ground truth, but there's a distribution shift between real and synthetic images, and synthetic images have restricted scene coverage

The authors train different image encoders only on synthetic images and find that unless the encoder is very large, the model can't generalize well (but large models generalize inherently anyway) 🧐
But they still fail when encountering real images that have a wide distribution in labels (e.g. diverse instances of objects) 🥲

The Depth Anything V2 framework is to:

🦖 Train a teacher model based on DINOv2-G on 595K synthetic images
🏷️ Label 62M real images using the teacher model
🦕 Train a student model using the real images labelled by the teacher
Result: 10x faster and more accurate than Marigold!
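
The three steps above (teacher on synthetic data, pseudo-labelling real data, student on pseudo-labels) can be sketched with a toy example. This is only an illustration under loud assumptions: the "images" are single numbers, the "depth model" is a 1-D least-squares line, and `fit_linear`, `predict` and `true_depth` are hypothetical helpers, not anything from the Depth Anything V2 code. It also ignores the synthetic-to-real distribution shift the post mentions.

```python
import random

def fit_linear(xs, ys):
    """Least-squares fit of y = a*x + b (a toy stand-in for training a depth model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var
    return a, mean_y - a * mean_x

def predict(model, xs):
    """Apply a fitted (slope, intercept) model to a list of inputs."""
    a, b = model
    return [a * x + b for x in xs]

def true_depth(x):
    """Hypothetical ground-truth mapping, known exactly only for synthetic data."""
    return 2.0 * x + 1.0

rng = random.Random(0)

# Step 1: train the teacher on a small synthetic set with exact labels.
synthetic_x = [rng.uniform(0.0, 1.0) for _ in range(50)]
synthetic_y = [true_depth(x) for x in synthetic_x]
teacher = fit_linear(synthetic_x, synthetic_y)

# Step 2: pseudo-label a much larger unlabelled "real" set with the teacher.
real_x = [rng.uniform(0.0, 3.0) for _ in range(5000)]
pseudo_y = predict(teacher, real_x)

# Step 3: train the student only on real inputs paired with teacher labels.
student = fit_linear(real_x, pseudo_y)

print("teacher:", teacher)
print("student:", student)
```

In this toy setup the student never sees a ground-truth label, yet it recovers roughly the same mapping as the teacher, which is the core idea of the pseudo-labelling step.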

The authors also construct a new benchmark called DA-2K that is less noisy, highly detailed and more diverse!
reacted to LegolasS's post with 🤯 7 months ago
🤯🤯🤯 VERY ROBUST TOOL to control camera motion for videos!!!

It doesn't even need any additional finetuning! It uses the inference process of video diffusion directly!!

Try it on your own video diffusion model and generate CINEMATIC SHOTS! 📸🎥🫢

Check it out at https://lifedecoder.github.io/CamTrol/
liked a Space 11 months ago