- Editors
- Fraundorfer, Friedrich
- Pock, Thomas
- Possegger, Horst
- Title: Proceedings of the 28th Computer Vision Winter Workshop
- February 12–14, 2025; Graz, Austria
- DOI: 10.3217/978-3-99161-022-9
- Licence: CC BY
- ISBN: 978-3-99161-022-9


- Abstract: Official proceedings of the 28th Computer Vision Winter Workshop, the annual meeting of several computer vision research groups located in Graz, Ljubljana, Prague, and Vienna. The main goal of this workshop is to communicate fresh scientific ideas within these four groups and to provide conference experience to PhD students. However, the workshop is open to everyone.
Chapters
Frontmatter
Fraundorfer, Friedrich; Pock, Thomas; Possegger, Horst; DOI: 10.3217/978-3-99161-022-9-000
A Data-Centric Approach to 3D Semantic Segmentation of Railway Scenes
Muenger, Nicolas; Ronecker, Max; Diaz, Xavier; Karner, Michael; Watzenig, Daniel; Skaloud, Jan; DOI: 10.3217/978-3-99161-022-9-001
LiDAR-based semantic segmentation is critical for autonomous trains, requiring accurate predictions across varying distances. This paper introduces two targeted data augmentation methods designed to improve segmentation performance on the railway-specific OSDaR23 dataset. The person instance pasting method enhances segmentation of pedestrians at distant ranges by injecting realistic variations into the dataset. The track sparsification method redistributes point density in LiDAR scans, improving track segmentation at far distances with minimal impact on close-range accuracy. Both methods are evaluated using a state-of-the-art 3D semantic segmentation network, demonstrating significant improvements in distant-range performance while maintaining robustness in close-range predictions. We establish the first 3D semantic segmentation benchmark for OSDaR23, demonstrating the potential of data-centric approaches to address railway-specific challenges in autonomous train perception.
An Investigation of Beam Density on LiDAR Object Detection Performance
Griesbacher, Christoph; Fruhwirth-Reisinger, Christian; DOI: 10.3217/978-3-99161-022-9-002
Accurate 3D object detection is a critical component of autonomous driving, enabling vehicles to perceive their surroundings with precision and make informed decisions. LiDAR sensors, widely used for their ability to provide detailed 3D measurements, are key to achieving this capability. However, variations between training and inference data can cause significant performance drops when object detection models are employed in different sensor settings. One critical factor is beam density, as inference on sparse, cost-effective LiDAR sensors is often preferred in real-world applications. Despite previous work addressing the beam-density-induced domain gap, substantial knowledge gaps remain, particularly concerning dense 128-beam sensors in cross-domain scenarios. To gain a better understanding of the impact of beam density on domain gaps, we conduct a comprehensive investigation that includes an evaluation of different object detection architectures. Our architecture evaluation reveals that combining voxel- and point-based approaches yields superior cross-domain performance by leveraging the strengths of both representations. Building on these findings, we analyze beam-density-induced domain gaps and argue that these domain gaps must be evaluated in conjunction with other domain shifts. Contrary to conventional beliefs, our experiments reveal that detectors benefit from training on denser data and exhibit robustness to beam density variations during inference.
Human Pose-Constrained UV Map Estimation
Suchanek, Matej; Purkrabek, Miroslav; Matas, Jiri; DOI: 10.3217/978-3-99161-022-9-003
UV map estimation is used in computer vision for detailed analysis of human posture or activity. Previous methods assign pixels to body model vertices by comparing pixel descriptors independently, without enforcing global coherence or plausibility in the UV map. We propose Pose-Constrained Continuous Surface Embeddings (PC-CSE), which integrates estimated 2D human pose into the pixel-to-vertex assignment process. The pose provides global anatomical constraints, ensuring that UV maps remain coherent while preserving local precision. Evaluation on DensePose COCO demonstrates consistent improvement, regardless of the chosen 2D human pose model. Whole-body poses offer better constraints by incorporating additional details about the hands and feet. Conditioning UV maps with human pose reduces invalid mappings and enhances anatomical plausibility. In addition, we highlight inconsistencies in the ground-truth annotations.
Incremental Learning with Repetition via Pseudo-Feature Projection
Tscheschner, Benedikt; Veas, Eduardo; Masana, Marc; DOI: 10.3217/978-3-99161-022-9-004
Incremental Learning scenarios do not always represent real-world inference use-cases, which tend to have less strict task boundaries, and exhibit repetition of common classes and concepts in their continual data stream. To better represent these use-cases, new scenarios with partial repetition and mixing of tasks are proposed, where the repetition patterns are innate to the scenario and unknown to the strategy. We investigate how exemplar-free incremental learning strategies are affected by data repetition, and we adapt a series of state-of-the-art approaches to analyse and fairly compare them under both settings. Further, we also propose a novel method (Horde), able to dynamically adjust an ensemble of self-reliant feature extractors, and align them by exploiting class repetition. Our proposed exemplar-free method achieves competitive results in the classic scenario without repetition, and state-of-the-art performance in the one with repetition.
Leveraging Intermediate Representations for Better Out-of-Distribution Detection
Guglielmo, Gianluca; Masana, Marc; DOI: 10.3217/978-3-99161-022-9-005
In real-world applications, machine learning models must reliably detect Out-of-Distribution (OoD) samples to prevent unsafe decisions. Current OoD detection methods often rely on analyzing the logits or the embeddings of the penultimate layer of a neural network. However, little work has been conducted on the exploitation of the rich information encoded in intermediate layers. To address this, we analyze the discriminative power of intermediate layers and show that they can be used effectively for OoD detection. We therefore propose to regularize intermediate layers with an energy-based contrastive loss and to group multiple layers into a single aggregated response. We demonstrate that intermediate layer activations improve OoD detection performance by running a comprehensive evaluation across multiple datasets.
Real-time object detection in diverse weather conditions through adaptive model selection on embedded devices
Tufan, Mohammad Milad; Fruhwirth-Reisinger, Christian; Mirza, M. Jehanzeb; Stern, Darko; DOI: 10.3217/978-3-99161-022-9-006
The perception system is a critical component of Advanced Driver Assistance Systems (ADAS) and Automated Driving (AD), playing a pivotal role in reducing traffic accidents caused by human error. For ADAS/AD systems to be seamlessly integrated into everyday life, it is essential to ensure the reliable operation of their perception systems, even under challenging conditions such as adverse weather. This paper presents a novel perception pipeline for real-time object detection with YOLOv3 across diverse weather scenarios. The pipeline incorporates adaptive model selection based on current conditions to optimize detection performance dynamically. To address the computational limitations of embedded systems in constrained environments, we propose a three-step approach: (1) reduction of YOLOv3 complexity using L1 regularization for feature selection, followed by (2) weight pruning and (3) knowledge distillation to recover precision lost in earlier steps. This results in lightweight models up to 70% smaller than the base model while maintaining high precision through knowledge distillation. Finally, the optimized models are evaluated on resource-constrained embedded devices, including the NVIDIA Jetson AGX Orin, NVIDIA Jetson Nano, and Raspberry Pi 4, demonstrating robust and efficient performance under real-world conditions.
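
The abstract above names weight pruning and knowledge distillation without giving their details. As a rough, generic illustration only (not the authors' procedure), the two ideas could be sketched in plain Python as below. The function names `magnitude_prune` and `distillation_loss`, the magnitude-based pruning criterion, the 40% sparsity, and the temperature of 2.0 are all assumptions made for this example. The L1 regularization step is omitted.

```python
import math


def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Hypothetical helper: magnitude-based pruning is one common criterion,
    assumed here for illustration.
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices sorted by absolute value; the first k are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is the usual soft-target distillation term; the temperature value
    is an assumption, not taken from the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    pruned = magnitude_prune([0.5, -0.05, 0.8, 0.01, -0.3], sparsity=0.4)
    print(pruned)
    print(distillation_loss([1.0, 0.2, -0.5], [2.0, 0.1, -1.0]))
```

In a sketch like this, the pruned (student) model would be trained to minimize the distillation term against the unpruned (teacher) model's outputs, which is the general mechanism by which distillation can recover accuracy lost to pruning.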