However, the original paper, "Image-to-Image Translation with Conditional Adversarial Networks", says:
Let Ck denote a Convolution-BatchNorm-ReLU layer with k filters. CDk denotes a Convolution-BatchNormDropout-ReLU layer with a dropout rate of 50%. All convolutions are 4 × 4 spatial filters applied with stride 2. Convolutions in the encoder, and in the discriminator, downsample by a factor of 2, whereas in the decoder they upsample by a factor of 2. The encoder-decoder architecture consists of:
encoder: C64-C128-C256-C512-C512-C512-C512-C512
U-Net decoder: CD512-CD1024-CD1024-C1024-C1024-C512 -C256-C128
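Since every convolution is a 4 × 4 filter applied with stride 2, each encoder stage halves the spatial resolution. A minimal sketch of the resulting feature-map sizes for a 256×256 input (the helper name is mine, not from the paper or the repo):

```python
# Each 4x4 convolution with stride 2 (and padding 1) halves the
# height and width, so eight encoder stages take 256x256 down to
# a 1x1 bottleneck.
def encoder_sizes(input_size=256, num_stages=8):
    sizes = []
    size = input_size
    for _ in range(num_stages):
        size //= 2  # stride-2 downsampling by a factor of 2
        sizes.append(size)
    return sizes

print(encoder_sizes())  # [128, 64, 32, 16, 8, 4, 2, 1]
```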
Why does this repo instead use:
encoder: C64(no batchnorm)-C128-C256-C512-C512-C512-C512-C512
U-Net decoder: C512(no batchnorm)-C1024-CD1024-CD1024-CD1024-CD512 -C256-C128
The decoder in particular is quite different. Does this repo provide a tweaked version?
According to torchsummary, the default generator used in pix2pix has the following architecture:

```
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 128, 128]           3,072
         LeakyReLU-2         [-1, 64, 128, 128]               0
            Conv2d-3          [-1, 128, 64, 64]         131,072
       BatchNorm2d-4          [-1, 128, 64, 64]             256
         LeakyReLU-5          [-1, 128, 64, 64]               0
            Conv2d-6          [-1, 256, 32, 32]         524,288
       BatchNorm2d-7          [-1, 256, 32, 32]             512
         LeakyReLU-8          [-1, 256, 32, 32]               0
            Conv2d-9          [-1, 512, 16, 16]       2,097,152
      BatchNorm2d-10          [-1, 512, 16, 16]           1,024
        LeakyReLU-11          [-1, 512, 16, 16]               0
           Conv2d-12            [-1, 512, 8, 8]       4,194,304
      BatchNorm2d-13            [-1, 512, 8, 8]           1,024
        LeakyReLU-14            [-1, 512, 8, 8]               0
           Conv2d-15            [-1, 512, 4, 4]       4,194,304
      BatchNorm2d-16            [-1, 512, 4, 4]           1,024
        LeakyReLU-17            [-1, 512, 4, 4]               0
           Conv2d-18            [-1, 512, 2, 2]       4,194,304
      BatchNorm2d-19            [-1, 512, 2, 2]           1,024
        LeakyReLU-20            [-1, 512, 2, 2]               0
           Conv2d-21            [-1, 512, 1, 1]       4,194,304
             ReLU-22            [-1, 512, 1, 1]               0
  ConvTranspose2d-23            [-1, 512, 2, 2]       4,194,304
      BatchNorm2d-24            [-1, 512, 2, 2]           1,024
UnetSkipConnectionBlock-25     [-1, 1024, 2, 2]               0
             ReLU-26           [-1, 1024, 2, 2]               0
  ConvTranspose2d-27            [-1, 512, 4, 4]       8,388,608
      BatchNorm2d-28            [-1, 512, 4, 4]           1,024
          Dropout-29            [-1, 512, 4, 4]               0
UnetSkipConnectionBlock-30     [-1, 1024, 4, 4]               0
             ReLU-31           [-1, 1024, 4, 4]               0
  ConvTranspose2d-32            [-1, 512, 8, 8]       8,388,608
      BatchNorm2d-33            [-1, 512, 8, 8]           1,024
          Dropout-34            [-1, 512, 8, 8]               0
UnetSkipConnectionBlock-35     [-1, 1024, 8, 8]               0
             ReLU-36           [-1, 1024, 8, 8]               0
  ConvTranspose2d-37          [-1, 512, 16, 16]       8,388,608
      BatchNorm2d-38          [-1, 512, 16, 16]           1,024
          Dropout-39          [-1, 512, 16, 16]               0
UnetSkipConnectionBlock-40   [-1, 1024, 16, 16]               0
             ReLU-41         [-1, 1024, 16, 16]               0
  ConvTranspose2d-42          [-1, 256, 32, 32]       4,194,304
      BatchNorm2d-43          [-1, 256, 32, 32]             512
UnetSkipConnectionBlock-44    [-1, 512, 32, 32]               0
             ReLU-45          [-1, 512, 32, 32]               0
  ConvTranspose2d-46          [-1, 128, 64, 64]       1,048,576
      BatchNorm2d-47          [-1, 128, 64, 64]             256
UnetSkipConnectionBlock-48    [-1, 256, 64, 64]               0
             ReLU-49          [-1, 256, 64, 64]               0
  ConvTranspose2d-50         [-1, 64, 128, 128]         262,144
      BatchNorm2d-51         [-1, 64, 128, 128]             128
UnetSkipConnectionBlock-52  [-1, 128, 128, 128]               0
             ReLU-53        [-1, 128, 128, 128]               0
  ConvTranspose2d-54          [-1, 3, 256, 256]           6,147
             Tanh-55          [-1, 3, 256, 256]               0
UnetSkipConnectionBlock-56    [-1, 3, 256, 256]               0
    UnetGenerator-57          [-1, 3, 256, 256]               0
     DataParallel-58          [-1, 3, 256, 256]               0
```
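One way to reconcile the two listings is to note that the 1024-channel UnetSkipConnectionBlock outputs in the summary are the concatenation of each ConvTranspose2d output with the mirrored encoder feature map. A quick sketch of that channel bookkeeping (widths read off the summary above; the function name is mine, not from the repo):

```python
# Encoder channel widths from the paper: C64-C128-C256-C512-...-C512.
ENCODER = [64, 128, 256, 512, 512, 512, 512, 512]

# ConvTranspose2d output widths read off the torchsummary listing,
# innermost decoder stage first (the final 3-channel output layer
# has no skip and is omitted).
DECODER_DECONV = [512, 512, 512, 512, 256, 128, 64]

def skip_concat_channels():
    # Each decoder stage concatenates its deconv output with the
    # mirrored encoder feature map, so the channel counts add up.
    mirrored = list(reversed(ENCODER[:-1]))  # bottleneck has no skip
    return [d + e for d, e in zip(DECODER_DECONV, mirrored)]

print(skip_concat_channels())  # [1024, 1024, 1024, 1024, 512, 256, 128]
```

These sums match the UnetSkipConnectionBlock output shapes in the summary (1024 channels at resolutions 2 through 16, then 512, 256, and 128), so part of the apparent mismatch is whether a stage's width is counted before or after the skip concatenation.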