Xu Yang, Hanwang Zhang, Jianfei Cai
Dataset bias in vision-language tasks is becoming one of the main problems hindering the progress of our community. Existing solutions lack a principled analysis of why modern image captioners easily collapse into dataset bias. In this paper, we present a novel perspective: Deconfounded Image...
Xu Yang, Chongyang Gao, Hanwang Zhang, Jianfei Cai
We propose an Auto-Parsing Network (APN) to discover and exploit the input data’s hidden tree structures to improve the effectiveness of Transformer-based vision-language systems. Specifically, we impose a Probabilistic Graphical Model (PGM) parameterized by the attention operations on each s...
Yuncheng Hua, Yuan-Fang Li, Gholamreza Haffari, Guilin Qi et al.
Complex question-answering (CQA) involves answering complex natural-language questions on a knowledge base (KB). However, the conventional neural program induction (NPI) approach performs unevenly across question types, which have inherently different characteristics, e.g....
Akm Ashiquzzaman, Abdul Kawsar Tushar, Md Ashiqur Rahman, Farzana Mohsin
Salman Fazle Rabby, Muhammad Abdullah Arafat, Taufiq Hasan
Brain tumors are severe medical conditions that can prove fatal if not detected and treated early. Radiologists often use MRI and CT scan imaging to diagnose brain tumors early. However, a shortage of skilled radiologists to analyze medical images can be problematic in low-resource healthcare settin...
Weipeng Cao, Yuhao Wu, Chengchao Huang, Muhammed J. A. Patwary et al.
Jian Zhang, Lei Qi, Yinghuan Shi, Yang Gao
To deal with the domain shift between training and test samples, current methods have primarily focused on learning generalizable features during training and ignore the specificity of unseen samples, which is also critical at test time. In this paper, we investigate a more challenging task that a...
Yan Yang, Md Zakir Hossain, Eric A. Stone, Shafin Rahman
Spatial transcriptomics (ST) is essential for understanding diseases and developing novel treatments. It measures gene expression of each fine-grained area (i.e., different windows) in the tissue slide with low throughput. This paper proposes an Exemplar Guided Network (EGN) to accurately and effici...
Yixin Zhang, Zilei Wang
Unsupervised domain adaptation in semantic segmentation aims to exploit pixel-level annotated samples in the source domain to aid the segmentation of unlabeled samples in the target domain. For such a task, the key point is to learn domain-invariant representations, and adversarial learning is usua...
Sadia Islam Tonni, Md. Alif Sheakh, Mst. Sazia Tahosin, Md. Zahid Hasan et al.
Brain tumors are among the most severe health challenges, necessitating early and precise diagnosis for effective treatment planning. This study introduces an optimized hybrid transfer learning (TL) framework for brain tumor classification using magnetic resonance imaging (MRI) scans. The proposed system...
Zekun Li, Lei Qi, Yinghuan Shi, Yang Gao
Semi-supervised learning (SSL) aims to leverage massive unlabeled data when labels are expensive to obtain. Unfortunately, in many real-world applications, the collected unlabeled data will inevitably contain unseen-class outliers not belonging to any of the labeled classes. To deal with the challen...
Wenxiao Cai, 克己 阿久津, Jinyan Hou, Cong Guo et al.
Shuaicheng Niu, Jiaxiang Wu, Yifan Zhang, Yong Guo et al.
Neural architecture search (NAS) has gained increasing attention in the architecture design community. One of the key factors behind its success is the training efficiency brought by the weight sharing (WS) technique. However, WS-based NAS methods often suffer from a performance disturbance ...
Aiman Lameesa, Chaklam Silpasuwanchai, Md. Sakib Bin Alam
Yanshuo Wang, Jie Hong, Ali Cheraghian, Shafin Rahman et al.
The objective of Continual Test-time Domain Adaptation (CTDA) is to gradually adapt a pre-trained model to a sequence of target domains without accessing the source data. This paper proposes a Dynamic Sample Selection (DSS) method for CTDA. DSS consists of dynamic thresholding, positive learning, an...