Journal Article, Open Access

Imputation of missing data in life‐history trait datasets: which approach performs the best?

Authors

Caterina Penone, Ana D. Davidson, Kevin T. Shoemaker, Moreno Di Marco, …

Author Affiliations

Universidade Federal do Rio Grande do Norte, NatureServe, Stony Brook University, Sapienza University of Rome, ...

Published In

Methods in Ecology and Evolution

Year

2014

Citations

461

DOI

10.1111/2041-210x.12232

Abstract

Despite efforts in data collection, missing values are commonplace in life‐history trait databases. Because these values are typically not missing at random, the common practice of removing missing data not only reduces sample size but also introduces bias that can lead to incorrect conclusions. Imputing missing values is a potential solution to this problem. Here, we evaluate the performance of four approaches for estimating missing values in trait databases: k‐nearest neighbour (kNN), multivariate imputation by chained equations (mice), missForest and Phylopars. We also test whether imputed datasets retain underlying allometric relationships among traits. Starting with a nearly complete trait dataset on the mammalian order Carnivora (using four traits), we artificially removed values so that the percentage of missing values…
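The evaluation idea described in the abstract (start from a nearly complete dataset, remove values artificially, impute them, and compare against the known truth) can be illustrated with a minimal sketch. It is not the authors' code or data: the synthetic "traits", the slopes, the 20% removal rate, the simple kNN rule, the column-mean baseline, and all function names are assumptions made for illustration. Values are removed completely at random here, which is a simplification.

```python
import math
import random


def mask_values(rows, fraction, rng):
    """Return a copy of rows with about `fraction` of the cells set to None."""
    masked = [list(r) for r in rows]
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    for i, j in rng.sample(cells, round(fraction * len(cells))):
        masked[i][j] = None
    return masked


def column_mean(rows, j):
    """Mean of the observed (non-None) values in column j."""
    observed = [r[j] for r in rows if r[j] is not None]
    return sum(observed) / len(observed)


def distance(a, b):
    """Root mean squared difference over columns observed in both rows."""
    shared = [(x - y) ** 2 for x, y in zip(a, b) if x is not None and y is not None]
    return math.sqrt(sum(shared) / len(shared)) if shared else math.inf


def knn_impute(rows, k=3):
    """Fill each None with the mean of that column over the k nearest rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where column j is observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d != math.inf]
            # Fall back to the column mean when no comparable donor exists.
            filled[i][j] = sum(nearest) / len(nearest) if nearest else column_mean(rows, j)
    return filled


def mean_impute(rows):
    """Baseline: fill each None with the mean of its column."""
    return [[column_mean(rows, j) if v is None else v for j, v in enumerate(r)] for r in rows]


def rmse(truth, estimate, masked):
    """Root mean squared error over the cells that were artificially removed."""
    errors = [
        (truth[i][j] - estimate[i][j]) ** 2
        for i, row in enumerate(masked)
        for j, v in enumerate(row)
        if v is None
    ]
    return math.sqrt(sum(errors) / len(errors))


# Hypothetical data: 60 "species" with four correlated traits on a log scale.
rng = random.Random(0)
complete = []
for _ in range(60):
    size = rng.uniform(0.0, 5.0)
    complete.append([
        size,
        0.75 * size + rng.gauss(0, 0.1),
        -0.5 * size + rng.gauss(0, 0.1),
        0.6 * size + rng.gauss(0, 0.1),
    ])

masked = mask_values(complete, 0.2, rng)  # remove 20% of the cells
imputed = knn_impute(masked, k=3)
knn_error = rmse(complete, imputed, masked)
mean_error = rmse(complete, mean_impute(masked), masked)
print(f"kNN RMSE: {knn_error:.3f}  column-mean RMSE: {mean_error:.3f}")
```

Because the true values of the removed cells are known, the error of any imputation method can be measured directly. Repeating the procedure at several removal rates would show how accuracy changes as the proportion of missing values grows.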

Fields & Keywords

Physical Sciences; Environmental Science; Ecology; Wildlife Ecology and Conservation; Evolution and Paleontology Studies; Genetic and phenotypic traits in livestock; Statistics; Programming language