Journal Article
Unknown
Few-Shot Incremental Multi-modal Learning via Touch Guidance and Imaginary Vision Synthesis
Authors
Author Affiliations
City University, Zhejiang Provincial Public Security Department, Hangzhou City University, Hangzhou Normal University, ...
Year
2025
Abstract
Multimodal perception that integrates vision and touch is becoming increasingly important in domains such as embodied intelligence and human-computer interaction. However, in open-world scenarios, few-shot class-incremental learning (FSCIL) on multimodal data streams suffers from catastrophic forgetting and overfitting, which severely degrade model performance. In this work, we propose a novel approach named Few-Shot Incremental Multi-modal Learning via Touch Guidance and Imaginary Vision Synthesis (TIFS). Our method leverages imaginary vision synthesis to enhance semantic understanding and fuses touch and vision to mitigate modal imbalance. Specifically, we introduce a framework that employs touch-guided vision information for cross-modal contrastive learning to address the challenges of few-shot learning. Additionally, we incorporate multiple…