0x90e committed on
Commit 0d5c9d2 · Parent: 0747f10

Upscaling works now

Files changed (7)
  1. README.md +174 -10
  2. RRDBNet_arch.py +0 -78
  3. architecture.py +38 -0
  4. block.py +261 -0
  5. models/README.md +0 -9
  6. test.py +21 -25
  7. transer_RRDB_models.py +0 -55
README.md CHANGED
@@ -1,10 +1,174 @@
- ---
- title: ESRGAN MANGA
- emoji: 🏃
- colorFrom: red
- colorTo: indigo
- sdk: gradio
- sdk_version: 3.12.0
- app_file: app.py
- pinned: false
- ---
+ # ESRGAN (Enhanced SRGAN) [[Paper]](https://arxiv.org/abs/1809.00219) [[BasicSR]](https://github.com/xinntao/BasicSR)
+ ## :smiley: Training codes are in the [BasicSR](https://github.com/xinntao/BasicSR) repo.
+ ### Enhanced Super-Resolution Generative Adversarial Networks
+ By Xintao Wang, [Ke Yu](https://yuke93.github.io/), Shixiang Wu, [Jinjin Gu](http://www.jasongt.com/), Yihao Liu, [Chao Dong](https://scholar.google.com.hk/citations?user=OSDCB0UAAAAJ&hl=en), [Yu Qiao](http://mmlab.siat.ac.cn/yuqiao/), [Chen Change Loy](http://personal.ie.cuhk.edu.hk/~ccloy/)
+
+ This repo only provides simple testing code, pretrained models, and the network interpolation demo.
+
+ ### **For full training and testing code, please refer to [BasicSR](https://github.com/xinntao/BasicSR).**
+
+ We won first place in the [PIRM2018-SR competition](https://www.pirm2018.org/PIRM-SR.html) (region 3) with the best perceptual index.
+ The paper was accepted to the [ECCV2018 PIRM Workshop](https://pirm2018.org/).
+
+ :triangular_flag_on_post: Added [Frequently Asked Questions](https://github.com/xinntao/ESRGAN/blob/master/QA.md).
+
+ > For instance,
+ > 1. How to reproduce your results in the PIRM18-SR Challenge (with a low perceptual index)?
+ > 2. How do you get the perceptual index in your ESRGAN paper?
+
+ #### BibTeX
+ <!--
+     @article{wang2018esrgan,
+         author={Wang, Xintao and Yu, Ke and Wu, Shixiang and Gu, Jinjin and Liu, Yihao and Dong, Chao and Loy, Chen Change and Qiao, Yu and Tang, Xiaoou},
+         title={ESRGAN: Enhanced super-resolution generative adversarial networks},
+         journal={arXiv preprint arXiv:1809.00219},
+         year={2018}
+     }
+ -->
+     @InProceedings{wang2018esrgan,
+         author = {Wang, Xintao and Yu, Ke and Wu, Shixiang and Gu, Jinjin and Liu, Yihao and Dong, Chao and Qiao, Yu and Loy, Chen Change},
+         title = {ESRGAN: Enhanced super-resolution generative adversarial networks},
+         booktitle = {The European Conference on Computer Vision Workshops (ECCVW)},
+         month = {September},
+         year = {2018}
+     }
+
+ <p align="center">
+   <img src="figures/baboon.jpg">
+ </p>
+
+ The **RRDB_PSNR** PSNR-oriented model trained with the DF2K dataset (a merged dataset of [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) and [Flickr2K](http://cv.snu.ac.kr/research/EDSR/Flickr2K.tar), proposed in [EDSR](https://github.com/LimBee/NTIRE2017)) is also able to achieve high PSNR performance.
+
+ | <sub>Method</sub> | <sub>Training dataset</sub> | <sub>Set5</sub> | <sub>Set14</sub> | <sub>BSD100</sub> | <sub>Urban100</sub> | <sub>Manga109</sub> |
+ |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+ | <sub>[SRCNN](http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html)</sub> | <sub>291</sub> | <sub>30.48/0.8628</sub> | <sub>27.50/0.7513</sub> | <sub>26.90/0.7101</sub> | <sub>24.52/0.7221</sub> | <sub>27.58/0.8555</sub> |
+ | <sub>[EDSR](https://github.com/thstkdgus35/EDSR-PyTorch)</sub> | <sub>DIV2K</sub> | <sub>32.46/0.8968</sub> | <sub>28.80/0.7876</sub> | <sub>27.71/0.7420</sub> | <sub>26.64/0.8033</sub> | <sub>31.02/0.9148</sub> |
+ | <sub>[RCAN](https://github.com/yulunzhang/RCAN)</sub> | <sub>DIV2K</sub> | <sub>32.63/0.9002</sub> | <sub>28.87/0.7889</sub> | <sub>27.77/0.7436</sub> | <sub>26.82/0.8087</sub> | <sub>31.22/0.9173</sub> |
+ | <sub>RRDB (ours)</sub> | <sub>DF2K</sub> | <sub>**32.73/0.9011**</sub> | <sub>**28.99/0.7917**</sub> | <sub>**27.85/0.7455**</sub> | <sub>**27.03/0.8153**</sub> | <sub>**31.66/0.9196**</sub> |
+
49
+ ## Quick Test
50
+ #### Dependencies
51
+ - Python 3
52
+ - [PyTorch >= 0.4](https://pytorch.org/) (CUDA version >= 7.5 if installing with CUDA. [More details](https://pytorch.org/get-started/previous-versions/))
53
+ - Python packages: `pip install numpy opencv-python`
54
+
55
+ ### Test models
56
+ 1. Clone this github repo.
57
+ ```
58
+ git clone https://github.com/xinntao/ESRGAN
59
+ cd ESRGAN
60
+ ```
61
+ 2. Place your own **low-resolution images** in `./LR` folder. (There are two sample images - baboon and comic).
62
+ 3. Download pretrained models from [Google Drive](https://drive.google.com/drive/u/0/folders/17VYV_SoZZesU6mbxz2dMAIccSSlqLecY) or [Baidu Drive](https://pan.baidu.com/s/1-Lh6ma-wXzfH8NqeBtPaFQ). Place the models in `./models`. We provide two models with high perceptual quality and high PSNR performance (see [model list](https://github.com/xinntao/ESRGAN/tree/master/models)).
63
+ 4. Run test. We provide ESRGAN model and RRDB_PSNR model.
64
+ ```
65
+ python test.py models/RRDB_ESRGAN_x4.pth
66
+ python test.py models/RRDB_PSNR_x4.pth
67
+ ```
68
+ 5. The results are in `./results` folder.
69
+ ### Network interpolation demo
70
+ You can interpolate the RRDB_ESRGAN and RRDB_PSNR models with alpha in [0, 1].
71
+
72
+ 1. Run `python net_interp.py 0.8`, where *0.8* is the interpolation parameter and you can change it to any value in [0,1].
73
+ 2. Run `python test.py models/interp_08.pth`, where *models/interp_08.pth* is the model path.
74
+
75
+ <p align="center">
76
+ <img height="400" src="figures/43074.gif">
77
+ </p>
78
+
79
+ ## Perceptual-driven SR Results
80
+
81
+ You can download all the resutls from [Google Drive](https://drive.google.com/drive/folders/1iaM-c6EgT1FNoJAOKmDrK7YhEhtlKcLx?usp=sharing). (:heavy_check_mark: included; :heavy_minus_sign: not included; :o: TODO)
82
+
83
+ HR images can be downloaed from [BasicSR-Datasets](https://github.com/xinntao/BasicSR#datasets).
84
+
85
+ | Datasets |LR | [*ESRGAN*](https://arxiv.org/abs/1809.00219) | [SRGAN](https://arxiv.org/abs/1609.04802) | [EnhanceNet](http://openaccess.thecvf.com/content_ICCV_2017/papers/Sajjadi_EnhanceNet_Single_Image_ICCV_2017_paper.pdf) | [CX](https://arxiv.org/abs/1803.04626) |
86
+ |:---:|:---:|:---:|:---:|:---:|:---:|
87
+ | Set5 |:heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:| :o: |
88
+ | Set14 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:| :o: |
89
+ | BSDS100 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:| :o: |
90
+ | [PIRM](https://pirm.github.io/) <br><sup>(val, test)</sup> | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark:| :heavy_check_mark: |
91
+ | [OST300](https://arxiv.org/pdf/1804.02815.pdf) |:heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark:| :o: |
92
+ | urban100 | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark:| :o: |
93
+ | [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) <br><sup>(val, test)</sup> | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark:| :o: |
94
+
95
+ ## ESRGAN
96
+ We improve the [SRGAN](https://arxiv.org/abs/1609.04802) from three aspects:
97
+ 1. adopt a deeper model using Residual-in-Residual Dense Block (RRDB) without batch normalization layers.
98
+ 2. employ [Relativistic average GAN](https://ajolicoeur.wordpress.com/relativisticgan/) instead of the vanilla GAN.
99
+ 3. improve the perceptual loss by using the features before activation.
100
+
101
+ In contrast to SRGAN, which claimed that **deeper models are increasingly difficult to train**, our deeper ESRGAN model shows its superior performance with easy training.
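On aspect 2: the relativistic average discriminator predicts whether a real image is relatively more realistic than the generated ones on average, rather than judging real vs. fake in isolation. A hedged sketch of the losses (the actual training code lives in BasicSR; `real_logits` and `fake_logits` are assumed here to be raw, pre-sigmoid outputs of the discriminator):

```
import torch
import torch.nn.functional as F

def ragan_d_loss(real_logits, fake_logits):
    # D_Ra(x_r, x_f) = sigmoid(C(x_r) - E[C(x_f)]): real should score above the fake mean
    loss_real = F.binary_cross_entropy_with_logits(
        real_logits - fake_logits.mean(), torch.ones_like(real_logits))
    loss_fake = F.binary_cross_entropy_with_logits(
        fake_logits - real_logits.mean(), torch.zeros_like(fake_logits))
    return (loss_real + loss_fake) / 2

def ragan_g_loss(real_logits, fake_logits):
    # symmetric form: push fakes above the real mean and reals below the fake mean
    loss_real = F.binary_cross_entropy_with_logits(
        real_logits - fake_logits.mean(), torch.zeros_like(real_logits))
    loss_fake = F.binary_cross_entropy_with_logits(
        fake_logits - real_logits.mean(), torch.ones_like(fake_logits))
    return (loss_real + loss_fake) / 2
```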
+
+ <p align="center">
+   <img height="120" src="figures/architecture.jpg">
+ </p>
+ <p align="center">
+   <img height="180" src="figures/RRDB.png">
+ </p>
+
+ ## Network Interpolation
+ We propose the **network interpolation strategy** to balance visual quality and PSNR.
+
+ <p align="center">
+   <img height="500" src="figures/net_interp.jpg">
+ </p>
+
+ We show a smooth animation with the interpolation parameter changing from 0 to 1.
+ Interestingly, the network interpolation strategy provides smooth control between the RRDB_PSNR model and the fine-tuned ESRGAN model.
+
+ <p align="center">
+   <img height="480" src="figures/81.gif">
+   &nbsp;&nbsp;
+   <img height="480" src="figures/102061.gif">
+ </p>
+
+ ## Qualitative Results
+ PSNR (evaluated on the Y channel) and the perceptual index used in the PIRM-SR challenge are also provided for reference.
+
+ <p align="center">
+   <img src="figures/qualitative_cmp_01.jpg">
+ </p>
+ <p align="center">
+   <img src="figures/qualitative_cmp_02.jpg">
+ </p>
+ <p align="center">
+   <img src="figures/qualitative_cmp_03.jpg">
+ </p>
+ <p align="center">
+   <img src="figures/qualitative_cmp_04.jpg">
+ </p>
+
+ ## Ablation Study
+ Overall visual comparisons showing the effects of each component in ESRGAN. Each column represents a model, with its configuration at the top. The red sign indicates the main improvement compared with the previous model.
+ <p align="center">
+   <img src="figures/abalation_study.png">
+ </p>
+
+ ## BN artifacts
+ We empirically observe that BN layers tend to introduce artifacts. These artifacts, namely BN artifacts, occasionally appear across iterations and settings, violating the need for stable performance during training. We find that the network depth, BN position, training dataset and training loss all influence the occurrence of BN artifacts.
+ <p align="center">
+   <img src="figures/BN_artifacts.jpg">
+ </p>
+
+ ## Useful techniques to train a very deep network
+ We find that residual scaling and smaller initialization can help to train a very deep network; a minimal sketch follows the figures below. More details are in the supplementary file of our [paper](https://arxiv.org/abs/1809.00219).
+
+ <p align="center">
+   <img height="250" src="figures/train_deeper_neta.png">
+   <img height="250" src="figures/train_deeper_netb.png">
+ </p>
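A minimal sketch of the two tricks combined (illustrative only: the `ScaledResidualBlock` name and the ×0.1 rescale factor are assumptions for the example; the 0.2 residual scale mirrors the `mul(0.2)` used by the RRDB blocks in this commit's `block.py`):

```
import torch.nn as nn

class ScaledResidualBlock(nn.Module):
    def __init__(self, nf=64, res_scale=0.2):
        super(ScaledResidualBlock, self).__init__()
        self.conv1 = nn.Conv2d(nf, nf, 3, 1, 1)
        self.conv2 = nn.Conv2d(nf, nf, 3, 1, 1)
        self.lrelu = nn.LeakyReLU(0.2, inplace=True)
        self.res_scale = res_scale
        for m in (self.conv1, self.conv2):
            nn.init.kaiming_normal_(m.weight)
            m.weight.data *= 0.1   # smaller initialization
            nn.init.zeros_(m.bias)

    def forward(self, x):
        # scale the residual branch before the skip addition
        return x + self.conv2(self.lrelu(self.conv1(x))).mul(self.res_scale)
```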
+
+ ## The influence of training patch size
+ We observe that training a deeper network benefits from a larger patch size. Moreover, the deeper model achieves more improvement (∼0.12 dB) than the shallower one (∼0.04 dB), since larger model capacity is capable of taking full advantage of a larger training patch size. (Evaluated on the Set5 dataset with RGB channels.)
+ <p align="center">
+   <img height="250" src="figures/patch_a.png">
+   <img height="250" src="figures/patch_b.png">
+ </p>
RRDBNet_arch.py DELETED
@@ -1,78 +0,0 @@
- import functools
- import torch
- import torch.nn as nn
- import torch.nn.functional as F
-
-
- def make_layer(block, n_layers):
-     layers = []
-     for _ in range(n_layers):
-         layers.append(block())
-     return nn.Sequential(*layers)
-
-
- class ResidualDenseBlock_5C(nn.Module):
-     def __init__(self, nf=64, gc=32, bias=True):
-         super(ResidualDenseBlock_5C, self).__init__()
-         # gc: growth channel, i.e. intermediate channels
-         self.conv1 = nn.Conv2d(nf, gc, 3, 1, 1, bias=bias)
-         self.conv2 = nn.Conv2d(nf + gc, gc, 3, 1, 1, bias=bias)
-         self.conv3 = nn.Conv2d(nf + 2 * gc, gc, 3, 1, 1, bias=bias)
-         self.conv4 = nn.Conv2d(nf + 3 * gc, gc, 3, 1, 1, bias=bias)
-         self.conv5 = nn.Conv2d(nf + 4 * gc, nf, 3, 1, 1, bias=bias)
-         self.lrelu = nn.LeakyReLU(negative_slope=0.2, inplace=True)
-
-         # initialization
-         # mutil.initialize_weights([self.conv1, self.conv2, self.conv3, self.conv4, self.conv5], 0.1)
-
-     def forward(self, x):
-         x1 = self.lrelu(self.conv1(x))
-         x2 = self.lrelu(self.conv2(torch.cat((x, x1), 1)))
-         x3 = self.lrelu(self.conv3(torch.cat((x, x1, x2), 1)))
-         x4 = self.lrelu(self.conv4(torch.cat((x, x1, x2, x3), 1)))
-         x5 = self.conv5(torch.cat((x, x1, x2, x3, x4), 1))
-         return x5 * 0.2 + x
-
-
- class RRDB(nn.Module):
-     '''Residual in Residual Dense Block'''
-
-     def __init__(self, nf, gc=32):
-         super(RRDB, self).__init__()
-         self.RDB1 = ResidualDenseBlock_5C(nf, gc)
-         self.RDB2 = ResidualDenseBlock_5C(nf, gc)
-         self.RDB3 = ResidualDenseBlock_5C(nf, gc)
-
-     def forward(self, x):
-         out = self.RDB1(x)
-         out = self.RDB2(out)
-         out = self.RDB3(out)
-         return out * 0.2 + x
-
-
- class RRDBNet(nn.Module):
-     def __init__(self, in_nc, out_nc, nf, nb, gc=32):
-         super(RRDBNet, self).__init__()
-         RRDB_block_f = functools.partial(RRDB, nf=nf, gc=gc)
-
-         self.conv_first = nn.Conv2d(in_nc, nf, 3, 1, 1, bias=True)
-         self.RRDB_trunk = make_layer(RRDB_block_f, nb)
-         self.trunk_conv = nn.Conv2d(nf, nf, 3, 1, 1, bias=True)
-         #### upsampling
-         self.upconv1 = nn.Conv2d(nf, nf, 3, 1, 1, bias=True)
-         self.upconv2 = nn.Conv2d(nf, nf, 3, 1, 1, bias=True)
-         self.HRconv = nn.Conv2d(nf, nf, 3, 1, 1, bias=True)
-         self.conv_last = nn.Conv2d(nf, out_nc, 3, 1, 1, bias=True)
-
-         self.lrelu = nn.LeakyReLU(negative_slope=0.2, inplace=True)
-
-     def forward(self, x):
-         fea = self.conv_first(x)
-         trunk = self.trunk_conv(self.RRDB_trunk(fea))
-         fea = fea + trunk
-
-         fea = self.lrelu(self.upconv1(F.interpolate(fea, scale_factor=2, mode='nearest')))
-         fea = self.lrelu(self.upconv2(F.interpolate(fea, scale_factor=2, mode='nearest')))
-         out = self.conv_last(self.lrelu(self.HRconv(fea)))
-
-         return out
architecture.py ADDED
@@ -0,0 +1,38 @@
+ import math
+ import torch
+ import torch.nn as nn
+ import block as B
+
+
+ class RRDB_Net(nn.Module):
+     def __init__(self, in_nc, out_nc, nf, nb, gc=32, upscale=4, norm_type=None, act_type='leakyrelu', \
+             mode='CNA', res_scale=1, upsample_mode='upconv'):
+         super(RRDB_Net, self).__init__()
+         n_upscale = int(math.log(upscale, 2))
+         if upscale == 3:
+             n_upscale = 1
+
+         fea_conv = B.conv_block(in_nc, nf, kernel_size=3, norm_type=None, act_type=None)
+         rb_blocks = [B.RRDB(nf, kernel_size=3, gc=32, stride=1, bias=True, pad_type='zero', \
+             norm_type=norm_type, act_type=act_type, mode='CNA') for _ in range(nb)]
+         LR_conv = B.conv_block(nf, nf, kernel_size=3, norm_type=norm_type, act_type=None, mode=mode)
+
+         if upsample_mode == 'upconv':
+             upsample_block = B.upconv_blcok
+         elif upsample_mode == 'pixelshuffle':
+             upsample_block = B.pixelshuffle_block
+         else:
+             raise NotImplementedError('upsample mode [%s] is not found' % upsample_mode)
+         if upscale == 3:
+             upsampler = upsample_block(nf, nf, 3, act_type=act_type)
+         else:
+             upsampler = [upsample_block(nf, nf, act_type=act_type) for _ in range(n_upscale)]
+         HR_conv0 = B.conv_block(nf, nf, kernel_size=3, norm_type=None, act_type=act_type)
+         HR_conv1 = B.conv_block(nf, out_nc, kernel_size=3, norm_type=None, act_type=None)
+
+         self.model = B.sequential(fea_conv, B.ShortcutBlock(B.sequential(*rb_blocks, LR_conv)), \
+             *upsampler, HR_conv0, HR_conv1)
+
+     def forward(self, x):
+         x = self.model(x)
+         return x
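To sanity-check the new module, a quick smoke test (a sketch, not part of the commit; the 1-channel ×4 configuration mirrors what the updated `test.py` uses):

```
import torch
import architecture as arch

model = arch.RRDB_Net(1, 1, 64, 23, gc=32, upscale=4, norm_type=None,
                      act_type='leakyrelu', mode='CNA', upsample_mode='upconv')
model.eval()

with torch.no_grad():
    lr = torch.rand(1, 1, 32, 32)   # dummy grayscale low-resolution input
    sr = model(lr)
print(sr.shape)  # expected: torch.Size([1, 1, 128, 128])
```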
block.py ADDED
@@ -0,0 +1,261 @@
+ from collections import OrderedDict
+ import torch
+ import torch.nn as nn
+
+ ####################
+ # Basic blocks
+ ####################
+
+
+ def act(act_type, inplace=True, neg_slope=0.2, n_prelu=1):
+     # helper selecting activation
+     # neg_slope: for leakyrelu and init of prelu
+     # n_prelu: for p_relu num_parameters
+     act_type = act_type.lower()
+     if act_type == 'relu':
+         layer = nn.ReLU(inplace)
+     elif act_type == 'leakyrelu':
+         layer = nn.LeakyReLU(neg_slope, inplace)
+     elif act_type == 'prelu':
+         layer = nn.PReLU(num_parameters=n_prelu, init=neg_slope)
+     else:
+         raise NotImplementedError('activation layer [%s] is not found' % act_type)
+     return layer
+
+
+ def norm(norm_type, nc):
+     # helper selecting normalization layer
+     norm_type = norm_type.lower()
+     if norm_type == 'batch':
+         layer = nn.BatchNorm2d(nc, affine=True)
+     elif norm_type == 'instance':
+         layer = nn.InstanceNorm2d(nc, affine=False)
+     else:
+         raise NotImplementedError('normalization layer [%s] is not found' % norm_type)
+     return layer
+
+
+ def pad(pad_type, padding):
+     # helper selecting padding layer
+     # if padding is 'zero', do by conv layers
+     pad_type = pad_type.lower()
+     if padding == 0:
+         return None
+     if pad_type == 'reflect':
+         layer = nn.ReflectionPad2d(padding)
+     elif pad_type == 'replicate':
+         layer = nn.ReplicationPad2d(padding)
+     else:
+         raise NotImplementedError('padding layer [%s] is not implemented' % pad_type)
+     return layer
+
+
+ def get_valid_padding(kernel_size, dilation):
+     kernel_size = kernel_size + (kernel_size - 1) * (dilation - 1)
+     padding = (kernel_size - 1) // 2
+     return padding
+
+
+ class ConcatBlock(nn.Module):
+     # Concat the output of a submodule to its input
+     def __init__(self, submodule):
+         super(ConcatBlock, self).__init__()
+         self.sub = submodule
+
+     def forward(self, x):
+         output = torch.cat((x, self.sub(x)), dim=1)
+         return output
+
+     def __repr__(self):
+         tmpstr = 'Identity .. \n|'
+         modstr = self.sub.__repr__().replace('\n', '\n|')
+         tmpstr = tmpstr + modstr
+         return tmpstr
+
+
+ class ShortcutBlock(nn.Module):
+     # Elementwise sum the output of a submodule to its input
+     def __init__(self, submodule):
+         super(ShortcutBlock, self).__init__()
+         self.sub = submodule
+
+     def forward(self, x):
+         output = x + self.sub(x)
+         return output
+
+     def __repr__(self):
+         tmpstr = 'Identity + \n|'
+         modstr = self.sub.__repr__().replace('\n', '\n|')
+         tmpstr = tmpstr + modstr
+         return tmpstr
+
+
+ def sequential(*args):
+     # Flatten Sequential. It unwraps nn.Sequential.
+     if len(args) == 1:
+         if isinstance(args[0], OrderedDict):
+             raise NotImplementedError('sequential does not support OrderedDict input.')
+         return args[0]  # No sequential is needed.
+     modules = []
+     for module in args:
+         if isinstance(module, nn.Sequential):
+             for submodule in module.children():
+                 modules.append(submodule)
+         elif isinstance(module, nn.Module):
+             modules.append(module)
+     return nn.Sequential(*modules)
+
+
+ def conv_block(in_nc, out_nc, kernel_size, stride=1, dilation=1, groups=1, bias=True,
+                pad_type='zero', norm_type=None, act_type='relu', mode='CNA'):
+     """
+     Conv layer with padding, normalization, activation
+     mode: CNA --> Conv -> Norm -> Act
+           NAC --> Norm -> Act --> Conv (Identity Mappings in Deep Residual Networks, ECCV16)
+     """
+     assert mode in ['CNA', 'NAC', 'CNAC'], 'Wrong conv mode [%s]' % mode
+     padding = get_valid_padding(kernel_size, dilation)
+     p = pad(pad_type, padding) if pad_type and pad_type != 'zero' else None
+     padding = padding if pad_type == 'zero' else 0
+
+     c = nn.Conv2d(in_nc, out_nc, kernel_size=kernel_size, stride=stride, padding=padding, \
+         dilation=dilation, bias=bias, groups=groups)
+     a = act(act_type) if act_type else None
+     if 'CNA' in mode:
+         n = norm(norm_type, out_nc) if norm_type else None
+         return sequential(p, c, n, a)
+     elif mode == 'NAC':
+         if norm_type is None and act_type is not None:
+             a = act(act_type, inplace=False)
+             # Important!
+             # input----ReLU(inplace)----Conv--+----output
+             #        |________________________|
+             # inplace ReLU will modify the input, therefore wrong output
+         n = norm(norm_type, in_nc) if norm_type else None
+         return sequential(n, a, p, c)
+
+
+ ####################
+ # Useful blocks
+ ####################
+
+
+ class ResNetBlock(nn.Module):
+     """
+     ResNet Block, 3-3 style
+     with extra residual scaling used in EDSR
+     (Enhanced Deep Residual Networks for Single Image Super-Resolution, CVPRW 17)
+     """
+
+     def __init__(self, in_nc, mid_nc, out_nc, kernel_size=3, stride=1, dilation=1, groups=1, \
+             bias=True, pad_type='zero', norm_type=None, act_type='relu', mode='CNA', res_scale=1):
+         super(ResNetBlock, self).__init__()
+         conv0 = conv_block(in_nc, mid_nc, kernel_size, stride, dilation, groups, bias, pad_type, \
+             norm_type, act_type, mode)
+         if mode == 'CNA':
+             act_type = None
+         if mode == 'CNAC':  # Residual path: |-CNAC-|
+             act_type = None
+             norm_type = None
+         conv1 = conv_block(mid_nc, out_nc, kernel_size, stride, dilation, groups, bias, pad_type, \
+             norm_type, act_type, mode)
+         # if in_nc != out_nc:
+         #     self.project = conv_block(in_nc, out_nc, 1, stride, dilation, 1, bias, pad_type, \
+         #         None, None)
+         #     print('Need a projecter in ResNetBlock.')
+         # else:
+         #     self.project = lambda x: x
+         self.res = sequential(conv0, conv1)
+         self.res_scale = res_scale
+
+     def forward(self, x):
+         res = self.res(x).mul(self.res_scale)
+         return x + res
+
+
+ class ResidualDenseBlock_5C(nn.Module):
+     """
+     Residual Dense Block
+     style: 5 convs
+     The core module of paper: (Residual Dense Network for Image Super-Resolution, CVPR 18)
+     """
+
+     def __init__(self, nc, kernel_size=3, gc=32, stride=1, bias=True, pad_type='zero', \
+             norm_type=None, act_type='leakyrelu', mode='CNA'):
+         super(ResidualDenseBlock_5C, self).__init__()
+         # gc: growth channel, i.e. intermediate channels
+         self.conv1 = conv_block(nc, gc, kernel_size, stride, bias=bias, pad_type=pad_type, \
+             norm_type=norm_type, act_type=act_type, mode=mode)
+         self.conv2 = conv_block(nc+gc, gc, kernel_size, stride, bias=bias, pad_type=pad_type, \
+             norm_type=norm_type, act_type=act_type, mode=mode)
+         self.conv3 = conv_block(nc+2*gc, gc, kernel_size, stride, bias=bias, pad_type=pad_type, \
+             norm_type=norm_type, act_type=act_type, mode=mode)
+         self.conv4 = conv_block(nc+3*gc, gc, kernel_size, stride, bias=bias, pad_type=pad_type, \
+             norm_type=norm_type, act_type=act_type, mode=mode)
+         if mode == 'CNA':
+             last_act = None
+         else:
+             last_act = act_type
+         self.conv5 = conv_block(nc+4*gc, nc, 3, stride, bias=bias, pad_type=pad_type, \
+             norm_type=norm_type, act_type=last_act, mode=mode)
+
+     def forward(self, x):
+         x1 = self.conv1(x)
+         x2 = self.conv2(torch.cat((x, x1), 1))
+         x3 = self.conv3(torch.cat((x, x1, x2), 1))
+         x4 = self.conv4(torch.cat((x, x1, x2, x3), 1))
+         x5 = self.conv5(torch.cat((x, x1, x2, x3, x4), 1))
+         return x5.mul(0.2) + x
+
+
+ class RRDB(nn.Module):
+     """
+     Residual in Residual Dense Block
+     """
+
+     def __init__(self, nc, kernel_size=3, gc=32, stride=1, bias=True, pad_type='zero', \
+             norm_type=None, act_type='leakyrelu', mode='CNA'):
+         super(RRDB, self).__init__()
+         self.RDB1 = ResidualDenseBlock_5C(nc, kernel_size, gc, stride, bias, pad_type, \
+             norm_type, act_type, mode)
+         self.RDB2 = ResidualDenseBlock_5C(nc, kernel_size, gc, stride, bias, pad_type, \
+             norm_type, act_type, mode)
+         self.RDB3 = ResidualDenseBlock_5C(nc, kernel_size, gc, stride, bias, pad_type, \
+             norm_type, act_type, mode)
+
+     def forward(self, x):
+         out = self.RDB1(x)
+         out = self.RDB2(out)
+         out = self.RDB3(out)
+         return out.mul(0.2) + x
+
+
+ ####################
+ # Upsampler
+ ####################
+
+
+ def pixelshuffle_block(in_nc, out_nc, upscale_factor=2, kernel_size=3, stride=1, bias=True,
+                        pad_type='zero', norm_type=None, act_type='relu'):
+     """
+     Pixel shuffle layer
+     (Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional
+     Neural Network, CVPR17)
+     """
+     conv = conv_block(in_nc, out_nc * (upscale_factor ** 2), kernel_size, stride, bias=bias,
+                       pad_type=pad_type, norm_type=None, act_type=None)
+     pixel_shuffle = nn.PixelShuffle(upscale_factor)
+
+     n = norm(norm_type, out_nc) if norm_type else None
+     a = act(act_type) if act_type else None
+     return sequential(conv, pixel_shuffle, n, a)
+
+
+ def upconv_blcok(in_nc, out_nc, upscale_factor=2, kernel_size=3, stride=1, bias=True,
+                  pad_type='zero', norm_type=None, act_type='relu', mode='nearest'):
+     # Up conv
+     # described in https://distill.pub/2016/deconv-checkerboard/
+     upsample = nn.Upsample(scale_factor=upscale_factor, mode=mode)
+     conv = conv_block(in_nc, out_nc, kernel_size, stride, bias=bias,
+                       pad_type=pad_type, norm_type=norm_type, act_type=act_type)
+     return sequential(upsample, conv)
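For a sense of how these helpers compose, an illustrative snippet (not part of the commit):

```
import torch
import block as B

# Conv -> LeakyReLU head, followed by a x2 nearest-neighbor upsample + conv
head = B.conv_block(3, 64, kernel_size=3, norm_type=None, act_type='leakyrelu')
up = B.upconv_blcok(64, 64, upscale_factor=2, act_type='leakyrelu')

x = torch.rand(1, 3, 16, 16)
print(up(head(x)).shape)  # torch.Size([1, 64, 32, 32])
```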
models/README.md DELETED
@@ -1,9 +0,0 @@
- ## Place pretrained models here.
-
- We provide two pretrained models:
-
- 1. `RRDB_ESRGAN_x4.pth`: the final ESRGAN model we used in our [paper](https://arxiv.org/abs/1809.00219).
- 2. `RRDB_PSNR_x4.pth`: the PSNR-oriented model with **high PSNR performance**.
-
- *Note that* the pretrained models are trained under the `MATLAB bicubic` kernel.
- If the downsampled kernel is different from that, the results may have artifacts.
test.py CHANGED
@@ -1,37 +1,33 @@
- import os.path as osp
+ import sys
+ import os.path
  import glob
  import cv2
  import numpy as np
  import torch
- import RRDBNet_arch as arch
+ import architecture as arch
 
- model_path = 'models/RRDB_ESRGAN_x4.pth'  # models/RRDB_ESRGAN_x4.pth OR models/RRDB_PSNR_x4.pth
- device = torch.device('cuda')  # if you want to run on CPU, change 'cuda' -> cpu
- # device = torch.device('cpu')
-
- test_img_folder = 'LR/*'
-
- model = arch.RRDBNet(3, 3, 64, 23, gc=32)
+ model_path = '4x_eula_digimanga_bw_v2_nc1_307k.pth'
+ img_path = sys.argv[1]
+ device = torch.device('cpu')
+
+ model = arch.RRDB_Net(1, 1, 64, 23, gc=32, upscale=4, norm_type=None, act_type='leakyrelu', mode='CNA', res_scale=1, upsample_mode='upconv')
  model.load_state_dict(torch.load(model_path), strict=True)
  model.eval()
+
+ for k, v in model.named_parameters():
+     v.requires_grad = False
  model = model.to(device)
 
- print('Model path {:s}. \nTesting...'.format(model_path))
-
- idx = 0
- for path in glob.glob(test_img_folder):
-     idx += 1
-     base = osp.splitext(osp.basename(path))[0]
-     print(idx, base)
-     # read images
-     img = cv2.imread(path, cv2.IMREAD_COLOR)
-     img = img * 1.0 / 255
-     img = torch.from_numpy(np.transpose(img[:, :, [2, 1, 0]], (2, 0, 1))).float()
-     img_LR = img.unsqueeze(0)
-     img_LR = img_LR.to(device)
-
-     with torch.no_grad():
-         output = model(img_LR).data.squeeze().float().cpu().clamp_(0, 1).numpy()
-     output = np.transpose(output[[2, 1, 0], :, :], (1, 2, 0))
-     output = (output * 255.0).round()
-     cv2.imwrite('results/{:s}_rlt.png'.format(base), output)
+ base = os.path.splitext(os.path.basename(img_path))[0]
+
+ # read image
+ img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
+ img = img * 1.0 / 255
+ img = torch.from_numpy(img[np.newaxis, :, :]).float()
+ img_LR = img.unsqueeze(0)
+ img_LR = img_LR.to(device)
+
+ output = model(img_LR).squeeze(dim=0).float().cpu().clamp_(0, 1).numpy()
+ output = np.transpose(output, (1, 2, 0))
+ output = (output * 255.0).round()
+ cv2.imwrite('results/{:s}_rlt.jpg'.format(base), output, [int(cv2.IMWRITE_JPEG_QUALITY), 90])
 
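With this change, `test.py` takes the input image as a command-line argument (e.g. `python test.py LR/comic.png`), runs the hardcoded 1-channel manga model on the CPU, and writes the upscaled result to `results/<name>_rlt.jpg`.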
transer_RRDB_models.py DELETED
@@ -1,55 +0,0 @@
- import os
- import torch
- import RRDBNet_arch as arch
-
- pretrained_net = torch.load('./models/RRDB_ESRGAN_x4.pth')
- save_path = './models/RRDB_ESRGAN_x4.pth'
-
- crt_model = arch.RRDBNet(3, 3, 64, 23, gc=32)
- crt_net = crt_model.state_dict()
-
- load_net_clean = {}
- for k, v in pretrained_net.items():
-     if k.startswith('module.'):
-         load_net_clean[k[7:]] = v
-     else:
-         load_net_clean[k] = v
- pretrained_net = load_net_clean
-
- print('###################################\n')
- tbd = []
- for k, v in crt_net.items():
-     tbd.append(k)
-
- # directly copy
- for k, v in crt_net.items():
-     if k in pretrained_net and pretrained_net[k].size() == v.size():
-         crt_net[k] = pretrained_net[k]
-         tbd.remove(k)
-
- crt_net['conv_first.weight'] = pretrained_net['model.0.weight']
- crt_net['conv_first.bias'] = pretrained_net['model.0.bias']
-
- for k in tbd.copy():
-     if 'RDB' in k:
-         ori_k = k.replace('RRDB_trunk.', 'model.1.sub.')
-         if '.weight' in k:
-             ori_k = ori_k.replace('.weight', '.0.weight')
-         elif '.bias' in k:
-             ori_k = ori_k.replace('.bias', '.0.bias')
-         crt_net[k] = pretrained_net[ori_k]
-         tbd.remove(k)
-
- crt_net['trunk_conv.weight'] = pretrained_net['model.1.sub.23.weight']
- crt_net['trunk_conv.bias'] = pretrained_net['model.1.sub.23.bias']
- crt_net['upconv1.weight'] = pretrained_net['model.3.weight']
- crt_net['upconv1.bias'] = pretrained_net['model.3.bias']
- crt_net['upconv2.weight'] = pretrained_net['model.6.weight']
- crt_net['upconv2.bias'] = pretrained_net['model.6.bias']
- crt_net['HRconv.weight'] = pretrained_net['model.8.weight']
- crt_net['HRconv.bias'] = pretrained_net['model.8.bias']
- crt_net['conv_last.weight'] = pretrained_net['model.10.weight']
- crt_net['conv_last.bias'] = pretrained_net['model.10.bias']
-
- torch.save(crt_net, save_path)
- print('Saving to ', save_path)