Using Synthetic Images to Train Neural Networks for Tramline Detection
DOI: https://doi.org/10.15150/ae.2026.3358

Keywords: synthetic images, game engines, machine learning, autonomous driving

Abstract
In recent years, the possibilities for generating synthetic image data, and the quality of such data, have improved considerably. These advances facilitate the training of neural networks, which require large amounts of training data. The ability to train neural networks faster and more easily offers advantages for precision-agriculture applications such as automatic steering. In non-GNSS (Global Navigation Satellite System) scenarios in particular, a monocular camera combined with a neural network can serve as an alternative to GNSS. This is especially useful in situations where GNSS signals are unavailable or unreliable. This study examines the process of generating synthetic images using a game engine and a diffusion model. These synthetic images are used to train a neural network for tramline detection. Trained on real images alone, the network achieved a mean Intersection over Union (mIoU) of 81.7 %. Incorporating synthetic images into the training process increased the mIoU to 83.3 %, resulting in improved tramline detection.
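As a side note, the mIoU metric reported in the abstract can be sketched as follows. This is a minimal illustration of the standard per-class Intersection over Union averaged across classes, not the authors' own evaluation code; the function name and NumPy-based approach are assumptions for demonstration purposes.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean Intersection over Union for semantic segmentation masks.

    pred and target are integer class-label arrays of identical shape
    (e.g. H x W maps where each pixel holds a class index).
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent in both masks: skip it
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# Toy example with two classes (e.g. 0 = background, 1 = tramline):
pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
print(mean_iou(pred, target, num_classes=2))
```

For a binary task such as tramline segmentation, the mIoU is simply the average of the background IoU and the tramline IoU over the evaluation set.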
License
Copyright (c) 2026 Silko Schulpius, Jan Schattenberg, Ludger Frerichs

This work is licensed under the Creative Commons Attribution 4.0 International license.