A color image does not always contain enough information to capture the semantic content of a scene. Multi-modal learning techniques jointly exploit color information and other representations (e.g., depth maps capturing the geometry of the scene) in order to improve the semantic understanding of complex scenes. Have a look here for a review of recent work in this field.
Key research topics include:
- We proposed a novel multi-modal semantic segmentation scheme based on vision transformers that jointly exploits multimodal positional embeddings and a cross-input attention scheme (see the sketch after this list)
- We introduced a multimodal dataset (SELMA) for autonomous driving containing data from multiple color and depth cameras acquired in variable daytime and weather conditions
- We jointly exploited color and surface geometry cues to improve clustering-based segmentation methods
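The sketch below is a minimal, illustrative example of the cross-input attention idea: tokens extracted from the color image attend to tokens extracted from the depth map so that geometric cues can refine the color features. It is not the actual DepthFormer implementation; the module name, dimensions, and the use of PyTorch's built-in multi-head attention are assumptions made for clarity.

```python
# Illustrative sketch of cross-input attention between an RGB token stream and
# a depth token stream (hypothetical module, not the published DepthFormer code).
import torch
import torch.nn as nn


class CrossInputAttention(nn.Module):
    """RGB tokens act as queries; depth tokens provide keys and values."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.norm_rgb = nn.LayerNorm(dim)
        self.norm_depth = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, rgb_tokens: torch.Tensor, depth_tokens: torch.Tensor) -> torch.Tensor:
        # rgb_tokens, depth_tokens: (batch, num_patches, dim),
        # each already carrying its own (modality-specific) positional embedding.
        q = self.norm_rgb(rgb_tokens)
        kv = self.norm_depth(depth_tokens)
        fused, _ = self.attn(q, kv, kv)
        # Residual connection keeps the original color features and adds
        # the depth-conditioned correction.
        return rgb_tokens + fused


if __name__ == "__main__":
    rgb = torch.randn(2, 196, 256)    # e.g., 14x14 patches with embedding dim 256
    depth = torch.randn(2, 196, 256)  # depth patch embeddings of matching shape
    out = CrossInputAttention(dim=256)(rgb, depth)
    print(out.shape)  # torch.Size([2, 196, 256])
```

In a full segmentation network such a block would typically be placed inside each transformer stage, with a symmetric block letting depth tokens attend to color tokens as well.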
Selected publications:
Continual Road-Scene Semantic Segmentation via Feature-Aligned Symmetric Multi-Modal Network Proceedings Article
In: IEEE International Conference on Image Processing (ICIP), 2024.
Source-Free Domain Adaptation for RGB-D Semantic Segmentation with Vision Transformers Proceedings Article
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 615–624, 2024.
SELMA: SEmantic Large-Scale Multimodal Acquisitions in Variable Weather, Daytime and Viewpoints Journal Article
In: IEEE Transactions on Intelligent Transportation Systems, pp. 1–13, 2023.
DepthFormer: Multimodal Positional Encodings and Cross-Input Attention for Transformer-Based Segmentation Networks Proceedings Article
In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.
Multimodal Semantic Segmentation in Autonomous Driving: A Review of Current Approaches and Future Perspectives Journal Article
In: Technologies, vol. 10, no. 4, art. 90, 2022.