Journal Article, Unknown

Mining duplicate questions in Stack Overflow

Author Affiliations
University of Dhaka, University of Saskatchewan
Year: 2016
Citations: 123

Abstract

Stack Overflow is a popular question-answering site focused on programming problems. Despite efforts to prevent users from asking questions that have already been answered, the site contains duplicate questions. As a result, developers may wait unnecessarily for an answer to a question that has already been asked and answered. The site currently depends on its moderators and high-reputation users to manually mark such questions as duplicates, which not only delays responses but also requires additional effort. In this paper, we first perform a manual investigation to understand why users submit duplicate questions on Stack Overflow. Based on this investigation, we propose a classification technique that uses a number of carefully chosen features to…