Efficient Crowdsourcing via Proxy Voting
Crowdsourcing platforms offer a way to label data by aggregating the answers of multiple unqualified workers. We introduce a simple and budget-efficient crowdsourcing method named Proxy Crowdsourcing (PCS). PCS collects answers from two sets of workers: leaders (a.k.a. proxies) and followers. Each leader answers the entire survey, while each follower answers only a small subset of it. We then weight every leader according to the number of followers whose answers are closest to that leader's, and aggregate the leaders' answers using any standard aggregation method (e.g., Plurality for categorical labels or Mean for continuous labels). We empirically compare the performance of PCS to unweighted aggregation, keeping the total number of questions asked (the budget) fixed. We show that PCS improves the accuracy of the aggregated answers across several datasets, with both categorical and continuous labels. Overall, the proposed method improves accuracy while remaining simple and easy to implement.
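The weighting and aggregation steps described above can be sketched for categorical labels as follows. This is a minimal illustration and not the authors' implementation: the function names `proxy_weights` and `weighted_plurality`, the use of Hamming distance on the questions a follower answered, the even splitting of ties between equally close leaders, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def proxy_weights(leaders, followers):
    """Weight each leader by the number of followers closest to it.

    leaders:   list of full answer lists, one per leader.
    followers: list of dicts {question_index: answer}, one per follower,
               covering only the questions that follower answered.
    Assumption: distance is the Hamming distance restricted to the
    follower's questions, and ties are split evenly among leaders.
    """
    weights = [0.0] * len(leaders)
    for partial in followers:
        dists = [
            sum(leader[q] != a for q, a in partial.items())
            for leader in leaders
        ]
        best = min(dists)
        closest = [i for i, d in enumerate(dists) if d == best]
        for i in closest:
            weights[i] += 1.0 / len(closest)
    return weights


def weighted_plurality(leaders, weights):
    """Aggregate the leaders' answers per question by weighted plurality."""
    result = []
    for q in range(len(leaders[0])):
        tally = Counter()
        for leader, w in zip(leaders, weights):
            tally[leader[q]] += w
        result.append(max(tally, key=tally.get))
    return result


# Hypothetical toy data: 3 leaders answer all 4 questions,
# 4 followers answer 2 questions each.
leaders = [
    ["a", "b", "a", "c"],
    ["a", "a", "a", "c"],
    ["b", "b", "b", "b"],
]
followers = [
    {0: "a", 1: "a"},
    {1: "a", 3: "c"},
    {0: "b", 2: "b"},
    {2: "a", 3: "c"},
]

weights = proxy_weights(leaders, followers)
labels = weighted_plurality(leaders, weights)
print(weights)  # [0.5, 2.5, 1.0]
print(labels)   # ['a', 'a', 'a', 'c']
```

In this toy example, unweighted Plurality over the three leaders would pick "b" for the second question, whereas the follower-derived weights shift the result to "a". For continuous labels, the same weights could be used in a weighted Mean instead, with a suitable numeric distance in place of the Hamming distance.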