Berlin 2018 – scientific program
SOE 4.2: Talk
Monday, March 12, 2018, 11:45–12:00, MA 001
Quantifying and suppressing ranking bias — •Giacomo Vaccario1, Matus Medo2, Nicolas Wider1, and Manuel S. Mariani2 — 1ETH, Zurich, Switzerland — 2University of Fribourg, Fribourg, Switzerland
With the increasing size of information repositories such as the World Wide Web or scholarly publication databases, we rely more and more on ranking algorithms to filter and rank information. Ranking algorithms that have proven particularly successful are those based on a network perspective, such as Google's PageRank, or on a normalization procedure, such as relative citation count. Even though these algorithms seem objective and are therefore often considered "fair", they have strong biases. For example, the popular PageRank algorithm is known to fail to identify young, valuable nodes in time-evolving networks. For this reason, we propose a new method to define and quantify the biases of rankings. In this method, we define a null model based on a multivariate hypergeometric distribution to generate random but unbiased rankings. We then quantify the bias of a given ranking by computing its average deviation from the unbiased rankings using the Mahalanobis distance. As an example, we apply the proposed method to established indicators of paper importance (citation count, relative citation count, and PageRank) and show that their rankings are biased with respect to both the age and the topic of the papers. Finally, we give a general normalization procedure that partially removes the observed biases.
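The null model and the distance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: papers are partitioned into groups (for example age or topic bins), the quantity of interest is the number of papers from each group among the top z entries of a ranking, and the singular covariance matrix is handled with a pseudo-inverse. All function names (null_moments, mahalanobis, bias_score) are hypothetical.

```python
import numpy as np


def null_moments(group_sizes, z):
    """Mean and covariance of top-z group counts under the null model.

    Assumption: an unbiased ranking fills its top z slots by drawing
    papers without replacement, so the counts per group follow a
    multivariate hypergeometric distribution.
    """
    sizes = np.asarray(group_sizes, dtype=float)
    n_total = sizes.sum()
    p = sizes / n_total
    mean = z * p
    scale = z * (n_total - z) / (n_total - 1)
    cov = scale * (np.diag(p) - np.outer(p, p))
    return mean, cov


def mahalanobis(counts, mean, cov):
    """Mahalanobis distance of observed counts from the null mean.

    The counts always sum to z, so cov is singular; the pseudo-inverse
    is one way (an assumption of this sketch) to deal with that.
    """
    diff = np.asarray(counts, dtype=float) - mean
    d2 = diff @ np.linalg.pinv(cov) @ diff
    return float(np.sqrt(max(d2, 0.0)))


def bias_score(top_counts, group_sizes, n_samples=10_000, seed=0):
    """Compare the observed distance with distances of random unbiased rankings."""
    sizes = np.asarray(group_sizes, dtype=np.int64)
    z = int(np.sum(top_counts))
    mean, cov = null_moments(sizes, z)
    pinv = np.linalg.pinv(cov)

    # Sample unbiased top-z compositions from the null model.
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_hypergeometric(sizes, z, size=n_samples)
    diffs = samples - mean
    d2_null = np.einsum("ij,jk,ik->i", diffs, pinv, diffs)
    d_null = np.sqrt(np.maximum(d2_null, 0.0))

    d_obs = mahalanobis(top_counts, mean, cov)
    return d_obs, float(d_null.mean()), float(d_null.std())


# Illustrative, made-up numbers: three age bins of 1000 papers each,
# and a top-30 list made up entirely of papers from the first bin.
d_obs, d_mean, d_std = bias_score([30, 0, 0], [1000, 1000, 1000])
print(f"observed distance {d_obs:.2f}, null mean {d_mean:.2f} +/- {d_std:.2f}")
```

An observed distance far above the typical distance of the sampled unbiased rankings would indicate, under these assumptions, that the ranking favors some groups over others. The group sizes and counts in the usage lines are invented for illustration and are not results from the talk.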