Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing

Tao HAN, Hailong SUN, Yangqiu SONG, Yili FANG, Xudong LIU

Front. Comput. Sci., 2021, 15(4): 154315. DOI: 10.1007/s11704-020-9364-x
RESEARCH ARTICLE


Abstract

Crowdsourcing is a helpful mechanism for leveraging human intelligence to acquire useful knowledge. However, when crowd answers are aggregated with currently developed voting algorithms, the result is often common knowledge, which may not be what is wanted. In this paper, we consider the problem of collecting specific knowledge via crowdsourcing. With the help of an external knowledge base such as WordNet, we incorporate the semantic relations between alternative answers into a probabilistic model to determine which answer is more specific. We formulate the probabilistic model from basic assumptions, accounting for both worker ability and task difficulty, and solve it with the expectation-maximization (EM) algorithm. To increase algorithm compatibility, we also refine our method into a semi-supervised one. Experimental results show that our approach is robust to hyper-parameter choices and achieves larger improvements than majority voting and other algorithms when more specific answers are expected, especially on sparse data.
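The model itself is developed in the full paper; as a rough illustration of the idea only, the following Python sketch (a simplification under stated assumptions, not the paper's actual model) uses NLTK's WordNet interface to test whether one candidate answer is a specialization of another and folds that preference into an ability-weighted vote. Here worker_weight stands in for EM-estimated worker abilities, SPECIFICITY_BONUS is an assumed stand-in for the semantic prior in the paper's probabilistic model, and the names is_hypernym, more_specific, and aggregate are illustrative.

from collections import defaultdict
from nltk.corpus import wordnet as wn  # requires a one-time nltk.download('wordnet')

def is_hypernym(general, specific):
    """Return True if synset `general` lies on some hypernym path of synset `specific`."""
    return any(general in path for path in specific.hypernym_paths())

def more_specific(word_a, word_b):
    """Return whichever noun is a WordNet specialization of the other, else None."""
    for syn_a in wn.synsets(word_a, pos=wn.NOUN):
        for syn_b in wn.synsets(word_b, pos=wn.NOUN):
            if syn_a == syn_b:
                continue
            if is_hypernym(syn_b, syn_a):
                return word_a  # word_b generalizes word_a, so word_a is more specific
            if is_hypernym(syn_a, syn_b):
                return word_b
    return None  # no subsumption relation found in WordNet

SPECIFICITY_BONUS = 0.3  # assumed hyper-parameter, not a value taken from the paper

def aggregate(labels, worker_weight):
    """labels: iterable of (worker_id, answer) pairs for one task.
    worker_weight: ability estimates (e.g., from an EM pass), defaulting to 1.0."""
    score = defaultdict(float)
    for worker, answer in labels:
        score[answer] += worker_weight.get(worker, 1.0)
    answers = set(score)
    for a in answers:
        for b in answers - {a}:
            if more_specific(a, b) == a:  # reward the more specific alternative
                score[a] += SPECIFICITY_BONUS
    return max(score, key=score.get)

# Two workers say "dog"; one higher-ability worker (weight 2.0) says "poodle".
# The specificity bonus breaks the tie in favour of the more specific answer.
print(aggregate([("w1", "dog"), ("w2", "dog"), ("w3", "poodle")], {"w3": 2.0}))

In this toy call the common answer and the specific answer receive equal ability-weighted support, and the specificity term tips the aggregate toward "poodle", which is the behaviour the paper targets; the paper instead derives this preference within a full probabilistic model rather than as an additive bonus.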

Keywords

crowdsourcing / knowledge acquisition / EM algorithm / label aggregation

Cite this article

Tao HAN, Hailong SUN, Yangqiu SONG, Yili FANG, Xudong LIU. Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing. Front. Comput. Sci., 2021, 15(4): 154315. DOI: 10.1007/s11704-020-9364-x



RIGHTS & PERMISSIONS

Higher Education Press
