Asmaul Hosna, Ethel Merry, Jigmey Gyalmo, Zulfikar Alom et al.
Countless real-world applications use Machine Learning (ML) techniques to make the best use of the data available to their users. Transfer learning (TL), one of the categories under ML, has received much attention from the research communities in the past few years. Traditional ML algor...
Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo
Learning subtle yet discriminative features (e.g., beak and eyes for a bird) plays a significant role in fine-grained image recognition. Existing attention-based approaches localize and amplify significant parts to learn fine-grained details, which often suffer from a limited number of parts and hea...
Qi Cai, Yingwei Pan, Chong‐Wah Ngo, Xinmei Tian et al.
Rendering synthetic data (e.g., 3D CAD-rendered images) to generate annotations for learning deep models in vision tasks has attracted increasing attention in recent years. However, simply applying the models learnt on synthetic images may lead to high generalization error on real images due to doma...
Qiao Liu, Hui Xue
Unsupervised domain adaptation (UDA) has received increasing attention since it does not require labels in the target domain. Most existing UDA methods learn domain-invariant features by minimizing a discrepancy distance computed by a certain metric between domains. However, these discrepancy-based m...
Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
The multimodal task of Visual Question Answering (VQA), encompassing elements of Computer Vision (CV) and Natural Language Processing (NLP), aims to generate answers to questions on any visual input. Over time, the scope of VQA has expanded from datasets focusing on an extensive collection of natural...
Jintao Guo, Na Wang, Lei Qi, Yinghuan Shi
Domain generalization (DG) aims to learn a model that generalizes well to unseen target domains utilizing multiple source domains without re-training. Most existing DG works are based on convolutional neural networks (CNNs). However, the local operation of the convolution kernel makes the model focu...
Bing Liu, Dong Wang, Xu Yang, Yong Zhou et al.
The transformer-based encoder-decoder framework has shown remarkable performance in image captioning. However, most transformer-based captioning methods overlook two kinds of elusive confounders: the visual confounder and the linguistic confounder, which generally lead to harmful bias, induce t...
Md. Nahiduzzaman, Md. Omaer Faruq Goni, Md. Shamim Anower, Md. Robiul Islam et al.
In this era of COVID-19, proper diagnosis and treatment for pneumonia are very important. Chest X-Ray (CXR) image analysis plays a vital role in the reliable diagnosis of pneumonia. An experienced radiologist is required for this. However, even for an experienced radiographer, it is quite difficult a...
Ali Cheraghian, Shafin Rahman, Sameera Ramasinghe, Pengfei Fang et al.
Few-shot class incremental learning (FSCIL) aims to incrementally add sets of novel classes to a well-trained base model in multiple training sessions with the restriction that only a few novel instances are available per class. While learning novel classes, FSCIL methods gradually forget base (old)...
Ali Cheraghian, Shafin Rahman, Townim Faisal Chowdhury, Dylan Campbell et al.
Zero-shot learning, the task of learning to recognize new classes not seen during training, has received considerable attention in the case of 2D image classification. However, despite the increasing ubiquity of 3D sensors, the corresponding 3D point cloud classification problem has not been meaning...
Shafin Rahman, Salman H. Khan, Fatih Porikli
Zhonghao Wang, Yunchao Wei, Rogério Feris, Jinjun Xiong et al.
Utilizing synthetic data for semantic segmentation can significantly relieve human efforts in labelling pixel-level masks. A key challenge of this task is how to alleviate the data distribution discrepancy between the source and target domains, i.e. reducing domain shift. The common approach to this...
Jintao Guo, Lei Qi, Yinghuan Shi
Deep Neural Networks have exhibited considerable success in various visual tasks. However, when applied to unseen test datasets, state-of-the-art models often suffer performance degradation due to domain shifts. In this paper, we introduce a novel approach for domain generalization from a novel pers...
Ziqi Zhou, Lei Qi, Yinghuan Shi
Zhijun Mai, Guosheng Hu, Dexiong Chen, Fumin Shen et al.
MixUp is an effective data augmentation method to regularize deep neural networks via random linear interpolations between pairs of samples and their labels. It plays an important role in model regularization, semi-supervised learning (SSL), and domain adaptation. However, despite its empirical success...