Cropping absurdity

#2
by AlekseyCalvin - opened

The mandatory cropping (applied even to very low-resolution/small inputs!) appears to be entirely automated and does not take into account any compositional specifics of the input. In practice, this makes the implementation absurdly unusable. For example, say I input a relatively small (512x768) portrait of a person sitting. Your pre-processing script automatically crops out 75% of the image, including the subject's head and face, and instead renders a tile featuring an extreme zoom-in on the subject's hand. And because the image was already low resolution in the first place (otherwise, why would I even use this Space?), the cropping leaves me with a remaining fragment at an even lower resolution: so low, in fact, that likely no upscaler (and certainly not this implementation) could improve it. I am aware that you introduced the cropping script in the interest of keeping users from running into long queues. And yes, I can guarantee that if the cropping feature stays in place, there won't be any long queues here, but only because the frustration with the cropping will soon drive away everyone interested.

I do not see this as viable. Thera has been in Topaz Video AI for a while and can easily do 4k upscales, while this thing took my 360x360 image at 3x and made it 100% bad. Is this thing even really working?

image (33).png

Here is the image I downloaded.

Photogrammetry and Remote Sensing Lab of ETH Zurich org

Hi, it looks like your input image contained compression artifacts, which is something Thera currently can't handle well. We have also added a note about this to the demo.

It's important to note that the scope of our paper is arbitrary-scale super-resolution (ASR), and for that we assumed inputs without artifacts. At a later stage we might re-train the model with such data, but so far this has not been the focus. Other current ASR methods don't handle this either.

That is the image it gave me back. Oh well, this is simply not a viable solution for real-world work as demonstrated in the YouTube videos.

Photogrammetry and Remote Sensing Lab of ETH Zurich org

Where did you see Thera in Topaz Video AI, do you have a source?
And what YouTube video? Are we talking about the same project here?

It's all over YouTube creators' channels, not just one. My mistake, as I just fired up Topaz Video AI and theirs is called Theia.

Let's be honest with ourselves. When you see an image like this in the real world, there are going to be compression artifacts as well.

image.png

Using artificial means to manipulate an image and then upscale it is not a real-world example, so once you bring in artifact removal as well, then we'll have something. At best, this is only worth it for an AI-generated image in something like ComfyUI that you just want to upscale.

Photogrammetry and Remote Sensing Lab of ETH Zurich org

No, low resolution and compression artifacts are orthogonal concepts. Our paper addresses (A)SR, not compression artifact removal. If your image has strong compression artifacts, this is not the right method to apply to it.
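To make the maintainer's point concrete, here is a toy sketch of my own (not the authors' code, and a deliberately crude model) showing that resolution and compression artifacts are independent axes of degradation. The `downsample` and `quantize` helpers are hypothetical stand-ins: downsampling lowers resolution without adding artifacts, while coarse quantization (a rough proxy for lossy compression) adds artifacts while keeping the resolution unchanged.

```python
# Toy model: an "image" is a 2D list of 0-255 grayscale values.

def downsample(img, factor):
    """Lower resolution: keep every `factor`-th row and column.
    The surviving pixel values are untouched, so no artifacts are added."""
    return [row[::factor] for row in img[::factor]]

def quantize(img, step):
    """Crude stand-in for lossy compression: coarse value quantization
    distorts pixels (banding/blocking-like artifacts) but leaves the
    resolution unchanged."""
    return [[(v // step) * step for v in row] for row in img]

# A small 8x8 gradient image.
img = [[(x + y) % 256 for x in range(8)] for y in range(8)]

small = downsample(img, 2)   # 4x4: fewer pixels, values intact
blocky = quantize(img, 64)   # still 8x8: same size, distorted values

print(len(small), len(small[0]))    # 4 4
print(len(blocky), len(blocky[0]))  # 8 8
```

The two operations commute with neither implying the other, which is why a method trained for super-resolution alone (recovering pixels lost to downsampling) has no reason to also undo quantization-style distortions.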

Just saying: reality versus paper assumptions. This is not a real-world-ready project outside of artificially generated low-resolution images to be upscaled. I'm sure it will lead other devs to fork it and take it to the next level for real-world use (think of camera-phone images from 20 years ago, for instance).
