Show simple item record

dc.contributor.author: Alharbi, Yazeed
dc.contributor.author: Smith, Neil
dc.contributor.author: Wonka, Peter
dc.date.accessioned: 2019-11-28T07:07:53Z
dc.date.available: 2019-11-28T07:05:11Z
dc.date.available: 2019-11-28T07:07:53Z
dc.date.issued: 2019
dc.identifier.citation: Alharbi, Y., Smith, N., & Wonka, P. (2019). Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr.2019.00155
dc.identifier.doi: 10.1109/CVPR.2019.00155
dc.identifier.uri: http://hdl.handle.net/10754/660300
dc.description.abstract: In multimodal unsupervised image-to-image translation tasks, the goal is to translate an image from the source domain to many images in the target domain. We present a simple method that produces higher-quality images than the current state of the art while maintaining the same amount of multimodal diversity. Previous methods follow the unconditional approach of trying to map the latent code directly to a full-size image. This leads to complicated network architectures with several introduced hyperparameters to tune. By treating the latent code as a modifier of the convolutional filters, we produce multimodal output while maintaining the traditional Generative Adversarial Network (GAN) loss and without additional hyperparameters. The only tuning required by our method controls the tradeoff between the variability and quality of generated images. Furthermore, we achieve disentanglement between source-domain content and target-domain style for free as a by-product of our formulation. We perform qualitative and quantitative experiments showing the advantages of our method compared with the state of the art on multiple benchmark image-to-image translation datasets.
dc.description.sponsorship: The project was funded in part by the KAUST Office of Sponsored Research (OSR) under Award No. URF/1/3426-01-01.
dc.publisher: IEEE
dc.relation.url: https://ieeexplore.ieee.org/document/8953741/
dc.relation.url: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8953741
dc.rights: Archived with thanks to IEEE
dc.subject: Image and Video Synthesis
dc.subject: Deep Learning
dc.title: Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation
dc.type: Conference Paper
dc.contributor.department: Computer Science Program
dc.contributor.department: Visual Computing Center (VCC)
dc.contributor.department: Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
dc.conference.date: 15-20 June 2019
dc.conference.name: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
dc.conference.location: Long Beach, CA, USA
dc.eprint.version: Pre-print
dc.identifier.arxivid: arXiv:1812.09877
kaust.person: AlHarbi, Yazeed
kaust.person: Smith, Neil
kaust.person: Wonka, Peter
kaust.grant.number: URF/1/3426-01-01
refterms.dateFOA: 2019-11-28T07:05:57Z
kaust.acknowledged.supportUnit: KAUST Office of Sponsored Research (OSR)
dc.date.posted: 2018-12-24
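The abstract's core idea, treating the latent code as a per-filter scale on shared convolution weights rather than an extra input to the network, can be sketched roughly as follows. This is a minimal NumPy illustration of latent filter scaling, not the authors' implementation; the 1x1-convolution layer shape and the one-scalar-per-output-channel mapping from latent code to filter scales are assumptions made for brevity.

```python
import numpy as np

def latent_filter_scaling(x, weights, z):
    """Apply a 1x1 convolution whose filters are scaled per output
    channel by the latent code z (one scalar per output channel).

    x       : input feature map, shape (C_in, H, W)
    weights : shared conv filters, shape (C_out, C_in)
    z       : latent code, shape (C_out,)
    """
    # Scale each output filter by its latent coefficient; the conv
    # weights themselves stay shared across all latent codes.
    scaled = weights * z[:, None]             # (C_out, C_in)
    # A 1x1 convolution is channel mixing at every spatial location.
    return np.einsum("oc,chw->ohw", scaled, x)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4, 4))   # toy source-domain features
w = rng.standard_normal((8, 3))      # shared 1x1 conv filters

# Two different latent codes yield two different outputs from the
# same shared weights: multimodality without changing the network.
y1 = latent_filter_scaling(x, w, np.ones(8))
y2 = latent_filter_scaling(x, w, rng.standard_normal(8))
```

With `z` set to all ones the layer reduces to the ordinary convolution, which is one way to see why no extra loss terms or hyperparameters are introduced: the latent code only rescales filters the GAN is already training.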


Files in this item

Name: Preprintfile1.pdf
Size: 5.376 MB
Format: PDF
Description: Pre-print

This item appears in the following Collection(s)

