TY - GEN
T1 - Image Translation with Kernel Prediction Networks for Semantic Segmentation
AU - Mata, Cristina
AU - Ryoo, Michael
AU - Turbell, Henrik
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Semantic segmentation relies on many dense pixel-wise annotations to achieve the best performance, but owing to the difficulty of obtaining accurate annotations for real world data, practitioners train on large-scale synthetic datasets. Unpaired image translation is one method used to address the ensuing domain gap by generating more realistic training data in low-data regimes. Current methods for unpaired image translation train generative adversarial networks (GANs) to perform the translation and enforce pixel-level semantic matching through cycle consistency. These methods do not guarantee that the semantic matching holds, posing a problem for semantic segmentation where performance is sensitive to noisy pixel labels. We propose a novel image translation method, Domain Adversarial Kernel Prediction Network (DA-KPN), that guarantees semantic matching between the synthetic label and translation. DA-KPN estimates pixel-wise input transformation parameters of a lightweight and simple translation function. To ensure the pixel-wise transformation is realistic, DA-KPN uses multi-scale discriminators to distinguish between translated and target samples. We show DA-KPN outperforms previous GAN-based methods on syn2real benchmarks for semantic segmentation with limited access to real image labels and achieves comparable performance on face parsing.
AB - Semantic segmentation relies on many dense pixel-wise annotations to achieve the best performance, but owing to the difficulty of obtaining accurate annotations for real world data, practitioners train on large-scale synthetic datasets. Unpaired image translation is one method used to address the ensuing domain gap by generating more realistic training data in low-data regimes. Current methods for unpaired image translation train generative adversarial networks (GANs) to perform the translation and enforce pixel-level semantic matching through cycle consistency. These methods do not guarantee that the semantic matching holds, posing a problem for semantic segmentation where performance is sensitive to noisy pixel labels. We propose a novel image translation method, Domain Adversarial Kernel Prediction Network (DA-KPN), that guarantees semantic matching between the synthetic label and translation. DA-KPN estimates pixel-wise input transformation parameters of a lightweight and simple translation function. To ensure the pixel-wise transformation is realistic, DA-KPN uses multi-scale discriminators to distinguish between translated and target samples. We show DA-KPN outperforms previous GAN-based methods on syn2real benchmarks for semantic segmentation with limited access to real image labels and achieves comparable performance on face parsing.
KW - Image Translation
KW - Kernel Prediction Networks
KW - Semantic Segmentation
UR - https://www.scopus.com/pages/publications/105006833815
U2 - 10.1007/978-3-031-91672-4_20
DO - 10.1007/978-3-031-91672-4_20
M3 - Conference contribution
AN - SCOPUS:105006833815
SN - 9783031916717
T3 - Lecture Notes in Computer Science
SP - 328
EP - 343
BT - Computer Vision – ECCV 2024 Workshops, Proceedings
A2 - Del Bue, Alessio
A2 - Canton, Cristian
A2 - Pont-Tuset, Jordi
A2 - Tommasi, Tatiana
PB - Springer Science and Business Media Deutschland GmbH
T2 - Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024
Y2 - 29 September 2024 through 4 October 2024
ER -