Abstract
Crowdsourcing provides a flexible approach for leveraging human intelligence to solve large-scale problems and has gained widespread acceptance in domains such as intelligent information processing, social decision-making, and crowd ideation. However, the uncertain reliability of participants significantly compromises answer quality, which has sparked substantial research interest. Existing surveys concentrate predominantly on quality control in Boolean tasks, which are generally formulated as simple label classification, ranking, or numerical prediction. Ubiquitous open-ended tasks such as question answering, translation, and semantic segmentation have not been sufficiently discussed. These tasks usually have large to infinite answer spaces and non-unique acceptable answers, posing significant challenges for quality assurance. This survey focuses on quality control methods applicable to open-ended tasks in crowdsourcing. We propose a two-tiered framework to categorize related works. The first tier presents a comprehensive overview of the quality model, covering the essential aspects of tasks, workers, answers, and the system. The second tier refines this classification into more detailed categories: ‘quality dimensions’, ‘evaluation metrics’, and ‘design decisions’, providing deeper insight into the internal structure of the quality control model for each aspect. We thoroughly investigate how these quality control methods are implemented in state-of-the-art works and discuss key challenges and potential future research directions.
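To make the aggregation challenge concrete, the sketch below is illustrative only: the helper names and the token-overlap similarity are assumptions, not methods from the survey. It contrasts majority voting, which suffices when the answer space is a small label set, with a simple representativeness heuristic for free-text answers, where exact agreement among workers is rare.

```python
# Minimal sketch (not a method from the survey) contrasting answer aggregation
# for a closed label task with a heuristic for an open-ended text task.
from collections import Counter


def majority_vote(labels):
    """Boolean/label tasks: the answer space is small, so plain voting works."""
    return Counter(labels).most_common(1)[0][0]


def jaccard(a, b):
    """Token-overlap similarity; a crude stand-in for a real text-similarity model."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def most_representative_answer(answers):
    """Open-ended tasks: exact matches are rare, so instead of voting we pick the
    answer most similar on average to the others (a medoid-style heuristic)."""
    return max(answers, key=lambda x: sum(jaccard(x, y) for y in answers))


if __name__ == "__main__":
    print(majority_vote(["cat", "cat", "dog"]))  # -> "cat"
    print(most_representative_answer([
        "a man riding a horse on the beach",
        "a person rides a horse along the shore",
        "two dogs playing in a park",
    ]))  # -> one of the two horse descriptions
```

Real systems replace the token-overlap measure with learned similarity or distance functions over complex annotations, but the basic contrast in aggregation strategy is the same.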
Keywords: crowdsourcing / open-ended tasks / quality control
Cite this article
Lei CHAI, Hailong SUN, Jing ZHANG. Quality control in open-ended crowdsourcing: a survey. Front. Comput. Sci., 2026, 20(6): 2006330. DOI: 10.1007/s11704-025-41081-1
RIGHTS & PERMISSIONS
© The Author(s) 2025. This article is published with open access at link.springer.com and journal.hep.com.cn