Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing

Tao HAN, Hailong SUN, Yangqiu SONG, Yili FANG, Xudong LIU

Front. Comput. Sci. ›› 2021, Vol. 15 ›› Issue (4) : 154315. DOI: 10.1007/s11704-020-9364-x
RESEARCH ARTICLE


Abstract

Crowdsourcing has been a helpful mechanism for leveraging human intelligence to acquire useful knowledge. However, when crowd knowledge is aggregated with currently available voting algorithms, the result is often common knowledge rather than the specific knowledge that is wanted. In this paper, we consider the problem of collecting specific knowledge via crowdsourcing. With the help of an external knowledge base such as WordNet, we incorporate the semantic relations between the alternative answers into a probabilistic model to determine which answer is more specific. Starting from a basic assumption, we formulate the probabilistic model to account for both the worker’s ability and the task’s difficulty, and we solve it with the expectation-maximization (EM) algorithm. To increase algorithm compatibility, we also refine our method into a semi-supervised one. Experimental results show that our approach is robust to hyper-parameters and achieves a greater improvement than majority voting and other algorithms when more specific answers are expected, especially for sparse data.
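The contrast between majority voting and a specificity-aware aggregation can be illustrated with a small sketch. This is not the probabilistic model or the EM procedure of the paper; it is a simple heuristic under stated assumptions. The taxonomy `HYPERNYM`, the function names, and the `weight` parameter are all hypothetical: the hand-written dictionary stands in for hypernym relations that would come from a knowledge base such as WordNet, and `weight` is an illustrative discount for votes on more general answers.

```python
from collections import Counter

# Hypothetical toy taxonomy (child -> parent), standing in for WordNet hypernyms.
HYPERNYM = {"husky": "dog", "poodle": "dog", "dog": "animal", "cat": "animal"}


def ancestors(label):
    """Return all hypernyms of `label` in the toy taxonomy, nearest first."""
    found = []
    while label in HYPERNYM:
        label = HYPERNYM[label]
        found.append(label)
    return found


def majority_vote(answers):
    """Plain majority voting: the most frequent answer wins."""
    return Counter(answers).most_common(1)[0][0]


def specific_vote(answers, weight=0.5):
    """Heuristic (an assumption of this sketch, not the paper's model):
    score each submitted answer by its own votes plus discounted votes
    for its ancestors, which are compatible with it but less specific."""
    counts = Counter(answers)
    scores = {
        label: counts[label] + weight * sum(counts[a] for a in ancestors(label))
        for label in counts
    }
    return max(scores, key=scores.get)


answers = ["dog", "dog", "dog", "husky", "husky", "cat"]
print(majority_vote(answers))  # dog
print(specific_vote(answers))  # husky
```

With three votes for "dog" and two for "husky", majority voting returns the common answer, while the heuristic lets the "dog" votes partially support "husky" (2 + 0.5 × 3 = 3.5 against 3) and returns the more specific one. Setting `weight=0` reduces the heuristic to majority voting over the submitted answers.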

Keywords

crowdsourcing / knowledge acquisition / EM algorithm / label aggregation

Cite this article

Tao HAN, Hailong SUN, Yangqiu SONG, Yili FANG, Xudong LIU. Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing. Front. Comput. Sci., 2021, 15(4): 154315 https://doi.org/10.1007/s11704-020-9364-x

RIGHTS & PERMISSIONS

2020 Higher Education Press