1 Introduction
Since their inception, spreadsheets have been widely used in many scientific and mission-critical applications [1] such as linear programming and regression [2], control systems [3], statistical radiobiological evaluation [4], drug-alcohol interaction [5], patient specimen collection [6], neuroscience [7], and nuclear fuel production [8]. Many decision support systems (DSS) in the commercial and social sectors are built on spreadsheets to generate useful information for strategic decision making [9].
Along with the high popularity of spreadsheet applications [10], it was found that about 94% of the spreadsheets in use contained faults [11,12]. A major reason for the high number of faulty spreadsheets is the accelerating trend in end-user computing (or end-user programming) over the last few decades [13-15]. Spreadsheet development, now a prominent example of end-user computing [16,17], has shifted from a task performed mostly by well-trained IT professionals to one for which millions of non-technical departmental end users, or end-user programmers, are now responsible. As most end-user programmers are not well trained in software development and testing [18], it is not surprising that many of the spreadsheets they develop are poorly coded and inadequately tested [19]. Consequently, these spreadsheets are likely to contain faults that are not properly detected and removed before release for daily operational use.
Faulty spreadsheets can result in business risks including: (a) loss of revenue, profit, cash, assets, and tax; (b) mispricing and poor decision-making; and (c) financial failure [20]. If faulty spreadsheets are used in critical areas such as clinical medicine [21] and nuclear operations [8], catastrophic consequences may result. Hence, spreadsheet quality assurance (QA) is a serious issue that cannot be ignored. Our literature review aims to provide a holistic view of how the various quality issues of spreadsheets can be addressed. This paper offers the following contributions: (a) an extensive literature review on spreadsheet QA covering a 35.5-year period (January 1987 to June 2022) for target journals and a 10.5-year period (January 2012 to June 2022) for target conferences; (b) a holistic and comprehensive review of spreadsheet QA issues across the entire spreadsheet life cycle, so that the quality of spreadsheets and the data generated from them can be improved; and (c) identification of existing research gaps in the spreadsheet QA area, shedding light on future research directions.
2 Existing reviews related to spreadsheet QA
Jannach et al. [22] performed a review mainly on automated spreadsheet QA approaches. Most of their collected papers were technically oriented and information-technology (IT) in nature. Our review differs from their work in several respects: (a) our literature search includes not only IT journals but also business-oriented information-systems (IS) journals, dual-focused IT/IS journals, and management science (MS)/operational research (OR) journals; (b) our study covers not only automated QA approaches but also static and manual approaches, which are at least equally important; and (c) our study covers all stages of the spreadsheet life cycle.
Thorne [23] performed a literature review of spreadsheet mistake reduction techniques. Similarly, Powell et al. [24] conducted a critical review of the literature on spreadsheet errors. Since the publication of these two papers in 2008 and 2009, many new research studies related to spreadsheet QA have appeared. More importantly, our literature review clearly differs from [23,24] in several significant respects: (a) our study covers work published over a long period (35.5 years for target journals and 10.5 years for target conferences); (b) our study is based on a well-defined search strategy; (c) the scope of our study is more comprehensive in terms of both the number and type of published works discussed (e.g., journals: technical-oriented IT, business-oriented IS, dual-focused IT/IS, MS/OR; conferences: technical-oriented IT, business-oriented IS); (d) our discussion of spreadsheet QA techniques is set within a spreadsheet life-cycle framework to facilitate in-depth analysis; and (e) our discussion follows the established IEEE terminology [25-27], e.g., making a clear distinction (i) among mistakes, faults, and failures, and (ii) among different types of static testing.
More recently, Hofer et al. [28] reported a systematic review on product metrics for spreadsheets. Their paper is fairly restricted in scope: it focuses only on spreadsheet metrics in contexts such as fault prediction, refactoring, and risk assessment for audits. Concepts and uses of spreadsheet metrics constitute only a rather small part of the entire spectrum of spreadsheet QA.
3 Methodology and scope
3.1 Target journals and conferences
Our extensive and systematic search of the literature involved a total of 32 target journals plus three reputable professional magazines published by ACM and IEEE, which are known or expected publication outlets for spreadsheet-related studies (see Tab.1). To avoid verbosity, we henceforth refer to the three selected magazines [T06,T07,T08] in Tab.1 as “journals”.
The 35 target journals in Tab.1 can be broadly classified as follows: (a) technical-oriented IT journals, which publish papers mainly in computer science, software engineering, and other technical IT areas (16 journals [T01–T16]); (b) business-oriented IS journals, which publish papers mainly in information systems with more emphasis on business applications (14 journals [B17–B30]); (c) dual-focused IT/IS journals, which publish both technical-oriented IT and business-oriented IS papers (3 journals [D31–D33]); and (d) management science (MS)/operational research (OR) journals, which publish papers in MS and OR areas (2 journals [M34–M35]).
Besides journals, our literature search also included the following eight conferences:
● ACM International Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA),
● ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE),
● ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA),
● IEEE/ACM International Conference on Automated Software Engineering (ASE),
● IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC),
● International Conference on Software Engineering (ICSE),
● Hawaii International Conference on System Sciences (HICSS), and
● International Conference on Information Systems (ICIS).
In the above, the first six are technical-oriented IT conferences and the last two are business-oriented IS conferences. Note that: (a) since 2017, papers accepted by OOPSLA have been published annually as a special issue of the Proceedings of the ACM on Programming Languages (PACMPL), and (b) HICSS and ICIS are commonly considered top-tier conferences on information systems.
3.2 Search strategy and results
We started our literature search for journal papers from January 1987. This start date is based on the development history of spreadsheets: it was reported that Microsoft launched the Windows operating system in 1987 and that Excel was one of the first application products bundled with it [29]. Our search period thus ran from January 1987 to June 2022 (35.5 years).
Tab.2 lists the search terms used for searching the 35 target journals in Tab.1. If the title, abstract, or keywords of a paper contained any of the search terms in Tab.2, the paper was included in our initial pool. The search was performed twice, a manual scan followed by an automated search (e.g., via the journals’ search engines), to reduce the chance of accidentally omitting relevant papers. In total, 544 potentially relevant papers were included in the initial pool. Not all of these 544 papers are related to spreadsheet QA: for example, some concern spreadsheets in a general context, and some concern end-user computing outside the spreadsheet domain. We therefore manually evaluated the 544 papers and selected only those related to spreadsheet QA. To reduce possible omissions, this manual evaluation was first done by one of the authors, and then independently verified by another author. We found 86 such papers: 41 from technical-oriented IT journals, 30 from business-oriented IS journals, 7 from dual-focused IT/IS journals, and 8 from MS/OR journals (see Tab.3).
We took a holistic view in determining whether a paper in the initial pool is related to spreadsheet QA. Instead of focusing only on specific activities such as testing and debugging, we take the view that a wide range of activities throughout the entire spreadsheet life cycle (e.g., problem identification, specification, modeling, design, implementation, testing, debugging, and maintenance) play a part in spreadsheet QA. For example, if problem identification is not properly performed, the implemented spreadsheet will not satisfy the users’ expectations; the spreadsheet then clearly suffers from a quality problem. In other words, we judge the quality of a spreadsheet by its ability to satisfy users’ expectations. Accordingly, a paper is considered related to spreadsheet QA if it concerns any activity in the spreadsheet life cycle that affects the ability of a spreadsheet to fulfil its users’ expectations.
We repeated the above search strategy for the eight target conferences listed in Section 3.1 over a 10.5-year period (from January 2012 to June 2022). This search found an additional 36 papers related to spreadsheet QA (see Tab.4 for the breakdown). All in all, our search found 122 (= 86 + 36) journal and conference papers related to spreadsheet QA.
The 86 relevant journal papers (see Tab.3) and the 36 relevant conference papers (see Tab.4) formed our “primary” source of literature. We also followed the relevant references (including other journal papers, conference papers, and edited book chapters) cited by the papers in the primary source. The relevant references found in this way formed our “secondary” source of literature.
3.3 Paper analysis
To frame our analysis of the papers, we first conducted a rigorous review of several software-engineering-based approaches that have been advocated in the literature for achieving quality development of spreadsheet software. Based on this review, we decided to adapt and extend an existing spreadsheet life-cycle framework for the purposes of this study. The framework consists of five stages, which are further refined into activities or substages, as detailed in Section 5.
4 Key terminologies and concepts
Inconsistent use of terminologies and concepts is observed in some of the literature on spreadsheets, and even within the broader software community. Such use differs from the standard meanings attached to these terminologies/concepts in the context of software engineering/development. This inconsistency may confuse software practitioners/researchers in the field, and would make our subsequent discussion difficult to comprehend. To alleviate this problem, we first discuss the standard meanings of some key terminologies/concepts used in our literature review.
4.1 Failure, fault, mistake, and error
According to the IEEE terminologies [25-27], a failure is an “externally visible” deviation from the system’s specification, which is caused by a fault in the program, which in turn is created by a human mistake (or simply mistake). Suppose, for example, we test a spreadsheet by entering some input data and then examining the output, and a discrepancy between the expected and actual results is found. This “observed” discrepancy represents a failure, indicating the existence of one or more faults (e.g., incorrect formulae) in the spreadsheet. These faults were introduced into the spreadsheet by its developer because of human mistakes such as typos or wrong logical deductions. Note that, unlike observing failures and identifying faults, mistakes cannot be determined solely by examining the software.
Consider, for example, a spreadsheet with a formula that calculates a salesperson’s sales commission given his/her total sales s and the applicable commission rate r. According to the company policy, if s ≥ $1,500, r = 5%; otherwise, r = 3%. Suppose the value of s is stored in cell F6. The correct formula should be “= F6 * (IF(F6 >= 1500, 5%, 3%))”. When s = $1,500, the correct output should be $75 (= $1,500 × 5%). Suppose the actual formula coded in the spreadsheet is “= F6 * (IF(F6 > 1500, 5%, 3%))” and, hence, the actual output is $45 (= $1,500 × 3%). In this case, the miscoding of the relational operator (“>” instead of “>=”) in the formula represents a fault, and the discrepancy observed between the expected result ($75) and the actual result ($45) represents a failure. Unless we check with the spreadsheet developer, it is impossible to determine the underlying cause of this mistake (e.g., a typing slip or a misinterpretation of the company policy in the developer’s mind).
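To make the fault/failure distinction concrete, the following minimal Python sketch (our own illustration, not taken from the surveyed papers) simulates the correct and faulty commission formulae; the fault manifests as a failure only at the boundary input s = 1500, which is why boundary test cases matter.

```python
def commission_correct(s):
    # Company policy: rate is 5% when total sales s >= 1500, else 3%.
    return s * (0.05 if s >= 1500 else 0.03)

def commission_faulty(s):
    # Fault: ">" miscoded instead of ">=" (mirrors the faulty formula).
    return s * (0.05 if s > 1500 else 0.03)

# Dynamic testing: compare expected and actual outputs for some test data.
for s in (1000, 1500, 2000):
    expected, actual = commission_correct(s), commission_faulty(s)
    status = "FAILURE" if expected != actual else "ok"
    print(f"s={s}: expected={expected:.2f}, actual={actual:.2f} [{status}]")
# Only s=1500 reveals a failure (75.00 vs 45.00); the other inputs mask the fault.
```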
Senders and Moray [30] define an error as “an action that is not intended by the actor [a mistake]; not desired by a set of rules [a fault caused by a mistake] or an external observer [a failure]; or that led the task or system outside its accepted limits [a fault leading to failure(s)]”. Here, an error is defined in a “coarse” manner that collectively refers to a failure, a fault, and a mistake. Sheridan [31] defines (human) error as “an action that fails to meet some implicit or explicit criterion”. If the term “action” is interpreted as “human action”, then Sheridan’s definition of “error” is similar to the meaning of “mistake” in the IEEE terminology. In this way, the definition of Sheridan [31] is relatively narrow in scope compared with that of Senders and Moray [30]. To further add to the inconsistency, Panko and Sprague [32] state in their paper that “programmers refer to what we call errors as faults”.
In this paper, as far as possible, we use the terms “failure”, “fault”, and “mistake” in accordance with the IEEE terminology. When there is no ambiguity, we also use the term “error” (following the definition in [30]) to refer to failure, fault, and mistake collectively.
4.2 Different types of testing
Testing is a QA technique in software development, and is often classified as either dynamic or static [33,34]. Dynamic testing involves executing the software system with test data, and then verifying the output and operational behavior of the software [34]. Static testing, by contrast, refers to the examination and analysis of a work product (e.g., a software system or document). If the work product is a spreadsheet, static testing does not involve directly executing the formulae or macros of the spreadsheet to compute outputs from given test data. Technical reviews, inspections, audits, and desk checking are examples of static testing [35]. Static testing is sometimes called human testing when done manually [34], but automated static analysis tools are now also available for spreadsheets. For example, MS Excel provides a built-in static analysis tool that enables a spreadsheet developer to trace the precedents and dependents of each cell value. Neither dynamic testing nor static testing is sufficient on its own; the two have been found to be largely complementary in terms of the types of faults they detect [36].
Tab.5 outlines some common static testing techniques based on the IEEE terminology. In this paper, when discussing the relevant spreadsheet QA studies, we follow the definitions in Tab.5 as far as possible. For example, inspection refers to a QA exercise that takes a formal, team-based approach, while what some researchers have coined a “one-person inspection” will be referred to as desk checking, which is often, but not necessarily, done by the author of the work product under review. An author is the person who generates/develops the work product.
Obviously, the target of dynamic testing is the software program (a spreadsheet in the context of this paper). For static testing, however, the scope is broader: static testing can be applied to a software program or to a non-software product (e.g., a software specification document). Since the focus of this paper is on spreadsheets, when we discuss static testing, we assume that it is applied to spreadsheets only. This implies that static testing (like dynamic testing) is applied after a spreadsheet has been implemented. For this reason, we discuss static testing and dynamic testing in Section 6.4 (after discussing the “implementation” stage in Section 6.3).
Recall from Section 4.1 the difference in meaning among “failure”, “fault”, and “mistake”. In general, a static testing technique requires visual examination of a spreadsheet, including its embedded formulae, for correctness; thus, static testing typically focuses on fault detection. In dynamic testing, the tester executes a spreadsheet and then compares the expected and actual outputs to look for externally observable discrepancies between the two; hence, dynamic testing typically targets failure detection. Upon observing a failure, the tester performs fault localization (a debugging task) to identify and locate the faults in the spreadsheet that cause the failure, followed by fault removal or repair (another debugging task).
Testing can also be broadly classified as functional or performance. Functional testing aims at evaluating the system’s compliance with its specified “functional” requirements, whereas performance testing covers “non-functional” aspects of a system such as efficiency, scalability, reliability, and usability. This paper mainly focuses on functional testing of spreadsheets; in the rest of the paper, we simply refer to functional testing as “testing”. For a similar reason, the term “quality assurance (QA)” used in this paper mainly concerns the “functional” aspect of spreadsheets.
5 Software engineering-based spreadsheet life cycle
Following the growing importance of spreadsheet applications, some researchers (e.g., [18]) proposed using a software engineering-based approach to spreadsheet development. Some even coined the term “spreadsheet engineering” to emphasize the importance of this development approach [39,40]. Along this direction, the research community generally argues for a life-cycle approach to spreadsheet QA, because mistakes and faults can be introduced at any stage of the development life cycle [41].
We found five studies [41-45] related to the spreadsheet life cycle. With respect to their main focus, these studies fall into two categories. The first category includes three studies [41-43] that used a spreadsheet life-cycle framework to facilitate discussion of the existing practices and problems associated with each life-cycle stage. The second category includes the other two studies [44,45], which used a spreadsheet life cycle as a vehicle for proposing (new) “best practices” for individual life-cycle stages.
We now look at the first category of studies. Leon et al. [41] proposed a spreadsheet life-cycle framework with five stages: (a) planning, (b) design and development, (c) usage, (d) modification, and (e) disposition. The last three stages (c)-(e) are collectively known as “operation”. Leon et al. [41] then used their framework to investigate the specific controls that firms implemented to mitigate potential spreadsheet risks throughout a spreadsheet life cycle. They found that most firms had implemented neither formal rules nor informal guidelines for QA [41]. Panko and Halverson [42] proposed a life-cycle framework with five stages: (a) requirements and design, (b) cell entry, (c) drafting, (d) debugging, and (e) operation. It was observed that: (i) error rates varied over the life cycle, (ii) when entering numbers and formulae in cells, spreadsheet developers made many mistakes that they corrected immediately, (iii) developers often spent little effort on initial analysis and design, and (iv) many did not perform systematic error detection even after the drafting stage, apart from simple checks such as the reasonableness of numbers [46]. Lawson et al. [43] developed a seven-stage framework (designing, testing, documenting, using, modifying, sharing, and archiving) to study spreadsheet “best practices” by comparing “experienced” spreadsheet users with “inexperienced” ones. They found that some practices were performed more often by the most experienced developers and users, such as the use of model evaluation techniques (e.g., extreme-case testing and the built-in static analysis toolbar for checking formulae and cell references) and commercial static analysis software (which some refer to as “auditing software”).
Next, we turn to the second category of studies [44,45]. Ronen et al. [44] are among the pioneers who proposed a spreadsheet life-cycle framework. Their framework consists of nine stages: (a) problem identification, (b) definition of model outcome/decision variables, (c) model construction, (d) testing, (e) documentation, (f) QA (called “auditing” in [44]) of spreadsheet models and structure, (g) user manual preparation, (h) training, and (i) installation. Stages (g) and (h) are optional if the spreadsheet is designed to be used only by its developer, whereas the other seven stages are mandatory. Their framework was mainly introduced for the situation where spreadsheet development is the sole responsibility of one developer, without much involvement from other stakeholders. In practice, however, some spreadsheet development tasks involve people other than the developer. In this paper, we use the term “other stakeholders” to refer to all people, except the developer, engaged in spreadsheet development. Read and Batson [45] introduced a six-stage framework (scope, specify, design, build, test, and use), which is applicable both with and without the involvement of other stakeholders. This framework classifies spreadsheet models into four types: simple, complex, time-critical, and ill-defined. Each type is associated with a different set of “best practices” [45].
From Section 6 onwards, we discuss and analyze in detail the major studies associated with each stage of a spreadsheet life cycle. To facilitate our discussion, a “specific” life-cycle framework needs to be adopted. When deciding what framework to adopt, we observed the following.
● Except for the framework in [45], the other frameworks [41-44] do not explicitly address the situation where other stakeholders are involved in spreadsheet development.
● Despite this merit, the framework in [45] does not address an early task of spreadsheet development: problem identification. This task is important because if the problem is ill defined, the resulting spreadsheet will be unable to fulfil all the user requirements and expectations.
● After a spreadsheet has been released for use, it will often be maintained as errors are later detected or new user requirements arise. The task of maintenance is either explicitly or implicitly addressed in the relevant stage of four of the above frameworks [41-43,45]; in the framework of [45], for instance, maintenance is implicit in the “use” stage.
In view of the above observations, we revise and extend the original six-stage framework in [45] by: (a) incorporating “problem identification” into the first stage and renaming it “problem and scope identification”; (b) renaming the “use” stage “usage and maintenance” to explicitly include the maintenance task; (c) combining the “specify” and “design” stages into a single stage for ease of discussion, and incorporating a “modeling” substage; and (d) revising the original “test” stage to “testing and debugging”. The resulting framework is hereafter referred to as the “extended spreadsheet life-cycle framework”, or simply the “extended framework”. It consists of five stages: (i) problem and scope identification; (ii) specification, modeling, and design; (iii) implementation (or building); (iv) testing and debugging; and (v) usage and maintenance. Readers are cautioned, however, that in practice many spreadsheet development projects do not follow every stage of the extended framework in a highly structured manner. For example, many spreadsheet developers jump straight to implementation (stage (iii)) without first spending sufficient effort on problem and scope identification (stage (i)) or on specification, modeling, and design (stage (ii)) [45].
6 QA issues in each stage of the spreadsheet life cycle
6.1 Problem and scope identification
6.1.1 Problem identification
In this substage, the developer defines the nature of the problem to be solved by the spreadsheet, and investigates other issues such as: (a) the way the problem is currently solved (if at all), (b) the performance bottlenecks, and (c) the sources of information [44]. This substage is similar to the “analyzing the existing system” task of the waterfall model, and also includes a “make or buy” analysis in which the developer determines whether an existing template can be purchased for the application to be developed. Numerous such templates are available for purchase, particularly in the areas of income tax calculation, rental analysis, and real estate investment.
6.1.2 Scope identification
The developer defines the spreadsheet’s objectives and boundaries, and also considers: (a) the level of complexity appropriate to meet the spreadsheet’s objectives, (b) the input assumptions (i.e., the estimates or forecasts used in the model) and logical assumptions (for determining the model structure), (c) the data requirements in broad terms (which data are required and how to obtain them), and (d) the time and other resources required for model development. In addition, workshops should be organized for: (i) collecting opinions from the stakeholders involved in model development, (ii) soliciting consent on the model objectives, (iii) discussing the trade-off between the scope of the model and its effectiveness, (iv) resolving differences in the model requirements, and (v) agreeing on the data required for the model and the responsibilities for producing them [45].
6.2 Specification, modeling, and design
6.2.1 Specification
For large software development, a specification is often prepared that describes in detail the function of a software system prior to programming. Most spreadsheets, however, do not have “formal” specifications, and it is unlikely that spreadsheet developers will be avid specification preparers [40]. Thus, the specification substage is often omitted. It is a challenge for spreadsheet developers to determine under what circumstances a specification is essential, cost-effective, or otherwise appropriate. If the specification task is considered necessary, bubble diagrams, calculation tables, and prototypes can be used for specifying spreadsheet models [45].
6.2.2 Modeling
Mental models Developers have mental models of the spreadsheets which they develop or with which they interact [47-50]. Understanding these mental models leads to better knowledge of why spreadsheet development is error-prone, and enables the development of new tools and techniques that better correspond to spreadsheet developers’ cognitive abilities. Three mental models in the developer’s mind have been proposed: the real-world model, the domain model, and the spreadsheet model [47-50]. When explaining a spreadsheet, the real-world and domain models are prominent, while the spreadsheet model is suppressed. However, when locating and fixing a fault, one must constantly switch back and forth between the domain model and the spreadsheet model. This suggests that we should strive to improve the mapping between these mental models by improving the correspondence between spreadsheet-specific concepts and application/problem-domain concepts.
Model building and training In spreadsheet development, model building is seldom performed, possibly due to developers’ lack of time [51]. One study [52] found that spreadsheet models were often built in an informal, iterative manner by employees at all ranks. Another study [53] sought to identify procedures that could reduce mistakes in spreadsheet development. It found that six weeks of training significantly improved the logical reasoning of trainees, which in turn enhanced their ability to develop competent spreadsheet models. We emphasize that effective spreadsheet training is more than just “learning to hit the keys” without conceptual learning [53].
Model-driven or model-related techniques Several model-driven techniques [39,54-62] have been proposed to define a spreadsheet business model that is kept consistent with its corresponding customized spreadsheet application. Refactoring techniques have also been proposed to improve the overall quality characteristics of ClassSheet models [63,64], which can then be embedded in spreadsheets [57,65,66]. This embedding practice aims to close the gap between creating a domain-specific language for spreadsheet models and using a totally different framework for actually editing spreadsheet data. By combining and adapting the above model-driven techniques and ClassSheet models, together with situational method engineering, an integrated approach was proposed in which the structure and computational behavior of a spreadsheet can be specified by a model with a process-like notation, based on pre-defined functional spreadsheet services with typed interfaces [67]. An Example-Driven Modeling technique [68] was also proposed for developing spreadsheet models as an alternative to spreadsheet programming. The main idea is to feed an example data set to a machine learning algorithm in order to infer the relationships between input and output, from which a generalized decision support model is generated. Furthermore, the SpreadSheet-based Modeling (SSM) approach proposed in [69] provides a simplified representation for editing spreadsheet models while leaving the underlying metamodels unchanged.
6.2.3 Design
The design substage involves producing the most effective structure for the spreadsheet model. A good design makes the model easy to use and understand, reduces the likelihood that a developer makes mistakes when using the model, and facilitates the detection of faults arising from such mistakes [70].
Design rules and formal design practices Six “golden” rules of spreadsheet design were introduced in [45] to make models easier to comprehend and modify and to reduce the risk of errors. These rules are: (a) separate inputs, calculations, and results; (b) use one formula per row or column; (c) refer only to the left and above; (d) use multiple worksheets; (e) use each column for the same purpose throughout the model; and (f) include a documentation sheet. In addition to rules, a block structure for spreadsheet models was proposed in [44]. This structure has five blocks: (a) identification, (b) macros/menus, (c) map of the model, (d) parameters/assumptions, and (e) the model itself. The two purposes of this structure are to separate the parts of a spreadsheet into blocks so as to reduce the potential for making mistakes, and to clarify the assumptions of the model to its users.
Although the above rules [45] and block structure [44] are useful, they are not enough for designing large, complex spreadsheets of good quality. In view of this, a top-down design approach is highly desirable. Based on the notion of data flow diagrams (DFDs) commonly used in traditional systems development, Ronen et al. [44] introduced spreadsheet flow diagrams (SFDs) to reduce complexity and to encourage structured, top-down design. They argued that SFDs are preferable to DFDs because SFDs focus more on modeling relationships (instead of data flow) and provide an effective means of showing the algorithm or the underlying formulae of the model [44]. Besides SFDs, state transition diagrams were proposed for defining macros and menus, such that each keystroke selection from a menu moves the user to a state represented by a circle in a state transition diagram.
Similar to the work in [44], a structured design approach for reducing spreadsheet development risks was proposed in [71]. We noted a major difference between [44] and [71]: in [44], SFDs were proposed to replace DFDs in spreadsheet development because SFDs are relatively more effective; in [71], however, DFDs were proposed to explicitly model and document the links among data cells in different modules during the design substage. It was also reported in [71] that this DFD-based design approach significantly reduced the number of linking errors (mistakes in coding the links from cells in one part of a spreadsheet to another). In addition, a framework was proposed in [72] which divides the variables among different parts of the workbook, with no duplication, linked through formulae. This framework makes spreadsheet models easy to use, debug, and modify.
Because spreadsheets are inherently free-form, developers accustomed to solving problems using very structured, dedicated MS/OR software packages are often challenged when using spreadsheets to solve their optimization problems. Some respond by devising rules for implementing models that impose artificial structure on spreadsheets [44,71,72]. In MS/OR, however, this artificial-structure approach can occasionally result in spreadsheet models that are difficult to construct and comprehend, and less reliable [73]. To address this problem, guidelines and suggestions for creating more effective spreadsheet models for optimization problems were provided in [73]. These include, for instance, avoiding embedding numeric constraints in formulae, and designing formulae that can be copied. To some extent, adopting these guidelines is similar to applying the “golden” rules of spreadsheet design proposed in [45].
A large class of errors can be attributed to the inability to clearly visualize the underlying computational structure, and to poor support for abstraction. To alleviate this problem, a multiple-representation spreadsheet was developed [74]. This spreadsheet contains additional representations that allow abstract operations without changing the conventional grid representation or its formula syntax.
Test-driven development (TDD) is reported to be useful for developing better-quality spreadsheets [11,75,76]. TDD is a development methodology that “forces” the developer to consider a system element’s design before coding it, and to take small steps when writing software. Built upon TDD, test-driven spreadsheet development (TDSD) was proposed [11,75,76], and its effectiveness has been verified by several case studies. TDSD was found to be easily understood and used, even by users without prior knowledge of TDD.
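As a minimal illustration of the test-first cycle that TDSD adapts from TDD (our own Python sketch; the cited studies [11,75,76] define TDSD at the spreadsheet level), a check on the expected cell value is written before the “formula” it exercises:

```python
def total_cell(prices, quantities):
    # Step 2: implement the cell's formula only after the test below exists.
    # Spreadsheet equivalent: =SUMPRODUCT(B2:B4, C2:C4)
    return sum(p * q for p, q in zip(prices, quantities))

def test_total_cell():
    # Step 1: specify the expected cell values first (initially a failing test).
    assert total_cell([10.0, 2.5], [3, 4]) == 40.0
    assert total_cell([], []) == 0.0   # edge case: empty input range

test_total_cell()
print("all TDSD-style checks passed")
```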
Based on the logical/physical theory, there are four principal components that characterize any spreadsheet model: schema, data, editorial, and binding [77]. A factoring algorithm was developed to identify and extract these components from conventional spreadsheets with minimal user intervention, followed by a synthesis algorithm that assists users in constructing executable spreadsheets from reusable model components [77].
Support for capturing and sharing the problem-solving knowledge associated with end-user developed software (including spreadsheets) is often lacking [78]. To improve “literate” programming in the spreadsheet domain, the following design techniques were proposed: (a) chunking (breaking the code into logical pieces); (b) a dependency-based computational structure (eliminating the need to order declarations for execution); and (c) automatic generation of a table of contents providing active hyperlinks to each chunk [78].
Design tools Today, tools are widely used in software development to improve design productivity and effectiveness, so tools supporting spreadsheet design are highly desirable. One such tool was proposed in [79]; it ensures that developed spreadsheets have a compatible structure and follow concrete standards. This tool requires the developer first to identify and state unambiguously the spreadsheet’s overall objective, and then to refine the design into three parts: (a) input data and assumptions; (b) the applicable model, formulae, or technique for data analysis/processing; and (c) outcomes and results. Each part is then decomposed into a lower level of detail. The refinement strategy associated with the tool facilitates structuring a spreadsheet into different sections: introduction, data and assumptions, model, analysis, macros, and possibly other sections such as a data-entry screen and database lookup. Another tool incorporates a block-based formula editor (known as XLBlocks) to replace the default text-based formula editor [80].
6.3 Implementation
6.3.1 Implementation planning and practices
Implementation involves the actual coding of the model. Many spreadsheet development projects start their life cycles at this stage, with little effort spent on the earlier stages (i.e., the problem and scope identification stage and the specification, modeling, and design stage) [45]. Also, many spreadsheet developers rarely do much planning before they start filling in the cells of a spreadsheet [37,46,52]. Such a practice is likely to generate several problems: (a) an unnecessarily complex model that takes longer to build, (b) assumptions that were not part of the original intention, and (c) a lack of common understanding of what the model is doing.
A visual description language (CogMap) was developed for spreadsheets to address problem (a) [81]. In essence, CogMap is a simple yet cognitively effective tool for the visual expression of simple assertions about the structure, function, or any other propositions associated with a spreadsheet. A technique for the bidirectionalization of spreadsheet formulae was also developed, allowing users to trace backwards through a chain of calculations, e.g., in “what-if” scenarios where the user changes the output and inspects how it possibly affects the input cells [82]. Yet another technique was developed for understanding and inferring units (e.g., dollars, metres, and kilograms) in spreadsheets, in view of the fact that many spreadsheet cells do not carry unit information [83]. Missing unit information can cause a class of spreadsheet faults in which calculations (without proper unit transformation) are performed on cells with different units.
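As a rough illustration of this kind of unit analysis (a minimal sketch under our own assumptions, not the algorithm of [83]), the following Python fragment attaches a unit, such as one inferred from a column header, to each cell value and flags additions across incompatible units:

```python
from dataclasses import dataclass

@dataclass
class Cell:
    value: float
    unit: str   # e.g., a unit inferred from a header such as "Price ($)"

def add_cells(a: Cell, b: Cell) -> Cell:
    # Addition is only meaningful for cells carrying the same unit;
    # a mismatch signals a potential unit fault in the spreadsheet.
    if a.unit != b.unit:
        raise ValueError(f"unit fault: cannot add '{a.unit}' to '{b.unit}'")
    return Cell(a.value + b.value, a.unit)

print(add_cells(Cell(3.0, "$"), Cell(4.5, "$")).value)   # fine: 7.5
try:
    add_cells(Cell(3.0, "$"), Cell(2.0, "kg"))           # mismatched units
except ValueError as e:
    print(e)                                             # unit fault reported
```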
Implementation can be done by individuals or by teams. In this regard, the performance of individual development versus team development (in teams of three) was investigated. Compared with individual development, team-developed spreadsheets were empirically of higher quality: the percentage of faulty spreadsheets fell from 86% to 27%, and the percentage of cells containing faults fell from 4.6% to 1.0% [84].
6.3.2 Coding mistakes, knowledge, and techniques
It was found empirically that most mistakes end users made in coding spreadsheets were due to poor strategies and attention problems (e.g., overload of working memory, or paying attention to the wrong part of the spreadsheet) [85]. Also, many lookup functions used in spreadsheets were problematic [86]. Furthermore, end users who are more knowledgeable about spreadsheets developed better-quality spreadsheet applications [17]. Thus, there is a need to develop metrics of a developer’s spreadsheet expertise [87].
When coding spreadsheets, decreasing the complexity of formulae improves the accuracy of a spreadsheet [70]. Specific data and logic validation, as well as coding controls through the use of @NA, @IF, and @ERR functions, crossfooting totals, and mechanisms for interactive error feedback, should be used where applicable [88]. Also recommended are structured programming techniques; consistent and meaningful names for data ranges (instead of hardcoded cell addresses) and files; macro-controlled templates to avoid user tampering with the process logic; and menu-based macros for printing reports. In [89], an approach was proposed to generate elastic sheet-defined functions (SDFs) that work over inputs of arbitrary size. The purpose of SDFs is to facilitate reuse by allowing spreadsheet developers to write code once, rather than repeatedly.
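The following Python sketch mimics the “elastic” property (our own illustration; [89] synthesizes such functions within the spreadsheet itself): a single definition handles input ranges of any length, so the developer writes the logic once instead of re-entering it for each sheet size.

```python
from typing import Sequence

def weighted_total(values: Sequence[float], weights: Sequence[float]) -> float:
    # One definition serves input ranges of any size: the "elastic" property.
    # A non-elastic equivalent would hardcode a fixed number of terms, e.g.,
    #   =B2*C2 + B3*C3 + B4*C4
    if len(values) != len(weights):
        raise ValueError("ranges must have equal length")
    return sum(v * w for v, w in zip(values, weights))

print(weighted_total([10, 20], [0.5, 0.25]))           # 2 rows -> 10.0
print(weighted_total([10, 20, 30, 40], [1, 1, 1, 1]))  # 4 rows -> 100.0
```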
6.4 Testing and debugging
6.4.1 Role of testing and debugging
In the past, few corporations maintained effective controls to deal with spreadsheet faults. After the financial reporting scandals at Enron, the U.S. Congress passed the Sarbanes-Oxley Act in 2002, and many countries/regions have followed with similar legislation. Consequently, some large corporations have started to examine how they use spreadsheets in critical business functions and to pay more attention to spreadsheet testing [14,36]. Leon et al. [90] even argued that planning and implementing spreadsheet controls in corporations should go beyond regulatory compliance.
Kruck [70] contended that increasing the extent of testing and debugging improves spreadsheet accuracy. The importance of spreadsheet testing was also emphasized in [91,92]. Although most spreadsheet practitioners and researchers generally agree on the importance of testing (e.g., desk checking, inspection, static analysis tools, and comprehensive test cases), the findings of Caulkins et al. [9] seem to contradict this mainstream opinion. Their interviews found that a significant minority of respondents believed informal quality control procedures were generally sufficient because the “human in the loop” could detect any gross errors. Also, an online survey [41] found that only 24% of the respondent companies had independent QA groups, and only 7% of respondents had used third-party static analysis software. The first part of this finding indicates that QA exercises (particularly desk checking), if performed at all, were most likely done by the spreadsheet developers themselves.
6.4.2 Spreadsheet reuse
One way to lower spreadsheet development costs is reuse. A challenge in reusing spreadsheets is the inadequate support for comparing spreadsheets that descend from a common ancestor or are “siblings” (generated from templates and later repeatedly instantiated); it is therefore often difficult to choose which spreadsheets to reuse. To deal with this issue, an algorithm was proposed in [93] to identify the differences between two spreadsheets. Similarly, a heuristics-based method and an associated tool were proposed in [94] to analyze the differences between related spreadsheets.
6.4.3 Contingency factors and test planning
Some researchers (e.g., [95,96]) argued that the traditional validation methodology was primarily developed for use by expert system builders in the context of large, complex software, which is not the case in spreadsheet development. Thus, this traditional methodology needs to be augmented so that appropriate validation can be performed by non-technical end users. To this end, a validity framework for DSS (including spreadsheets) was developed, involving five types of validity: general, logical, interface, data, and system builder [96]. Several contingency factors were also found to influence the process of validating DSS (including spreadsheets), including the importance/complexity of the decision, the life span of the DSS (or spreadsheet), the budget, the abilities of the owner/system builder, and organizational factors [96]. Another framework was proposed in [97,98] for initially estimating the validation effort required to ensure that a developed spreadsheet is valid. This framework considers several contingency factors such as risk, complexity, significance, system builder competence, contentiousness, and deadline.
6.4.4 Static testing
Testing effectiveness Compared with inspection, desk checking was reported to have relatively low fault-detection effectiveness (only about 10%) [32,99]. People were generally overconfident about the fault-detection effectiveness of desk checking, despite the finding that some types of faults are especially difficult to detect by desk checking compared with inspection [12]. To improve the effectiveness of desk checking, a general-purpose protocol (using two commercial static analysis tools, XL Analyst and Spreadsheet Professional) was developed [10,100].
In [38], a conceptual model of the factors affecting fault-detection performance was developed. These factors fall into four categories: individual (e.g., the reviewer’s domain experience and skill), presentation (e.g., whether the spreadsheet is reviewed on paper or on screen), nature of the fault (e.g., its type or complexity), and external (e.g., time pressure and desired accuracy). An experiment was also performed to investigate whether the presentation factor was indeed significant with respect to the number of faults detected. It was found that: (a) overall, the reviewers detected only about half of the errors; (b) the experimental subjects who checked on paper detected more faults, but took longer to finish, than those who checked on screen; and (c) checking spreadsheets with the formulae displayed or printed (even when printed alongside the corresponding cells rather than separately as a linear list) did not detect significantly more faults than checking with no formulae available. When no formulae were available for checking, the checking effectively targeted failure detection.
Static testing techniques and analysis tools MS Excel provides a built-in static analysis toolbar called “formula auditing” (note here the deviation of the term “audit” from the IEEE terminology) for the developer to trace the precedents and dependents of each cell value. Using such a tool, however, is tedious and time-consuming for spreadsheets with many worksheets and many linked cells [101]. To alleviate this problem, many automatic approaches for detecting spreadsheet faults have been proposed [18,102-112]. Often supported by associated tools, these approaches target specific types of faults using various heuristic decision algorithms. For example, UCheck [113-115] and Dimension [116,117] target the “data types” assigned to input cells and formulae; these data types are derived from the text values of header cells positioned in the same row or column as the related input cells. Some other tools focus on cell arrays (e.g., AmCheck [118], CACheck [119], and EmptyCheck [120]) or use adaptive learning to detect anomalies in formula cells (e.g., CUSTODES [121], ExceLint [122], and WARDER [123]). Rather than focusing on spreadsheet faults, an algorithm was proposed in [124] to detect a particular type of mistake, data cloning: formulae whose values are copied as plain text to a different location in a spreadsheet. Furthermore, an Excel add-in tool, CheckCell [125], was developed to automatically find potential data errors; it locates data that have a disproportionate impact on the spreadsheet computation.
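To give a flavour of what array-oriented checkers such as AmCheck look for, the following deliberately simplified Python sketch (our own, not the published algorithm) abstracts cell references out of the formulae in a cell array and flags the cell whose structural pattern deviates from the majority:

```python
import re
from collections import Counter

def formula_pattern(formula: str) -> str:
    # Abstract away concrete cell addresses so structurally identical
    # formulae (e.g., "=B2*C2" and "=B3*C3") map to the same pattern.
    return re.sub(r"[A-Z]+[0-9]+", "REF", formula)

def flag_outliers(cell_array: dict[str, str]) -> list[str]:
    # In a cell array, most cells are expected to share one pattern;
    # a cell with a minority pattern is a candidate "smelly" formula.
    patterns = {addr: formula_pattern(f) for addr, f in cell_array.items()}
    majority, _ = Counter(patterns.values()).most_common(1)[0]
    return [addr for addr, p in patterns.items() if p != majority]

row = {"D2": "=B2*C2", "D3": "=B3*C3", "D4": "=B4*1500", "D5": "=B5*C5"}
print(flag_outliers(row))   # ['D4'] -- hardcoded constant breaks the pattern
```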
However, some researchers (e.g., [126,127]) argued that the above automatic detection approaches are not effective because they focus only on specific fault/mistake types. To address this problem, supervised machine learning algorithms were developed to obtain fault predictors that utilize all the data provided by the multiple spreadsheet metrics in their “extensive” catalog [126]. Experiments showed that these fault predictors were effective for a wide range of faults [126]. Complementing the work in [126,127], we noted six more studies [128-133] that use the concept of smelly formulae (i.e., spreadsheet formulae with potential faults) and/or a list of metrics, together with detection/refactoring techniques, to resolve these smelly formulae. In addition to smelly formulae, other studies [101,134,135] proposed visualization approaches to detecting spreadsheet faults. For example, a desk-checking approach (supported by its associated tool LinkMaster) was proposed which provides useful visual cues on individual worksheets for finding linking faults (incorrect references to spreadsheet cell values in separate work areas) [101].
Two studies [136,137] were conducted to: (a) investigate the different kinds of spreadsheet mistakes; (b) determine the effectiveness of two static analysis tools (Excel’s built-in “Error Check” function and the add-in tool “Spreadsheet Professional”) for detecting mistakes; and (c) compare these tools with manual desk checking. One finding in [136] was that the performance of both static analysis tools was very poor for every category of spreadsheet mistake. This finding contradicted another study [100], which reported that the performance of “Spreadsheet Professional” was superior to manual desk checking. Another finding in [136] showed that static analysis tools for spreadsheets differ in their fault-detection effectiveness. To allow a systematic and objective comparison among these tools, the Formulae, Formats, Relations (FFR) model was developed in [138], serving as a basis for describing and comparing different static analysis tools.
6.4.5 Dynamic testing
Dynamic testing techniques Besides static testing, dynamic testing is also a common spreadsheet QA approach, and numerous techniques have been developed to improve its effectiveness for spreadsheets. Below we outline some of these techniques.
In [139], metamorphic testing (MT) was proposed for detecting spreadsheet failures, and experiments showed that MT is highly effective for such detection [140]. As follow-up work, two failure-detection techniques, error trapping (ET) [141] and MT, were compared [20]. Experiments found that neither technique is sufficient on its own for spreadsheet testing [20]; rather, ET and MT are complementary and should be used together whenever possible.
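To illustrate the idea behind MT (a minimal sketch under our own assumptions; the cited studies [20,139,140] define spreadsheet-specific metamorphic relations), the following Python fragment checks an AVERAGE-like formula against two metamorphic relations, permutation invariance and scaling, without needing to know the exact expected output for any single input:

```python
import random

def avg(cells):
    # The "formula" under test; a fault here (e.g., dividing by a
    # hardcoded count) would violate the relations below.
    return sum(cells) / len(cells)

def mt_check(cells, tol=1e-9):
    base = avg(cells)
    # MR1: permuting the input range must not change the average.
    shuffled = cells[:]
    random.shuffle(shuffled)
    assert abs(avg(shuffled) - base) <= tol, "MR1 violated: failure detected"
    # MR2: scaling every input by k must scale the average by k.
    k = 3.0
    assert abs(avg([k * c for c in cells]) - k * base) <= tol, \
        "MR2 violated: failure detected"

mt_check([12.0, 7.5, 3.25, 9.0])
print("no failures revealed by the metamorphic relations")
```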
The “What You See Is What You Test” (WYSIWYT) methodology [13,142-144] is a well-researched dynamic testing technique. Based on a definition-use test adequacy criterion [145], this methodology helps users locate faults and highlights potential discrepancies [104]. Another work [146] integrated automated test generation into the WYSIWYT methodology to support incremental testing and provide immediate visual feedback. Furthermore, a test-case generation technique, together with an associated tool, was developed based on backward propagation and the solution of constraints on cell values [147].
While most dynamic testing techniques focus on checking numerical data, another study [148] took a different approach: it developed a user-extensible model for string-like data (e.g., names) to distinguish at runtime between invalid data, valid data, and questionable data that could be either valid or invalid.
Testing effectiveness The WYSIWYT methodology described above uses data-flow concepts. Several empirical studies (e.g., [149]) compared the performance of data-flow and mutation testing, and found that mutation-adequate tests detected more faults. Inspired by this finding, a suite of mutation operators was developed to support mutation testing for spreadsheets [150]. These mutation operators can also be used for evaluating the effectiveness of dynamic testing tools and debugging tools for spreadsheets.
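As a toy example of a spreadsheet-level mutation operator (our own sketch; the operator suite in [150] is considerably richer), a relational-operator-replacement mutant of the commission formula from Section 4.1 is killed only if the test set includes the boundary value:

```python
def mutate_relational(formula: str) -> str:
    # Relational Operator Replacement: ">=" -> ">" (one operator per mutant).
    return formula.replace(">=", ">", 1)

def evaluate(formula: str, f6: float) -> float:
    # Tiny evaluator covering only the two formula variants used here.
    uses_ge = ">=" in formula
    condition = (f6 >= 1500) if uses_ge else (f6 > 1500)
    return f6 * (0.05 if condition else 0.03)

original = "=F6*(IF(F6>=1500,5%,3%))"
mutant = mutate_relational(original)

tests = [1000.0, 1500.0, 2000.0]   # adequate: includes the boundary 1500
killed = any(evaluate(original, t) != evaluate(mutant, t) for t in tests)
print("mutant killed" if killed else "mutant survived")   # -> mutant killed
```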
Test case construction To perform dynamic testing, spreadsheets are executed with test data so that the output results can be checked against the expected results. The comprehensiveness of the test data has a profound effect on the effectiveness of spreadsheet testing. Thus, a “good” set of test data should be constructed to validate the logic of spreadsheets before release to users [88]. In this regard, the MT technique [20,139] discussed above provides an effective and systematic means of generating a “good” set of test cases for dynamically testing a spreadsheet. In addition, an algorithm was developed for splitting spreadsheets into smaller, logically connected parts (called fragments), which can then be tested individually [151]. Studies [151] found that this approach significantly reduces the effort required to generate test cases for testing a spreadsheet.
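One plausible reading of fragment splitting (a sketch under our own assumptions; [151] gives the precise algorithm) is to treat cells as nodes of a dependency graph and use its connected components as independently testable fragments:

```python
from collections import defaultdict

def fragments(deps: dict[str, set[str]]) -> list[set[str]]:
    # Build an undirected cell-dependency graph and return its connected
    # components; cells in different components never influence each other,
    # so each component can be tested with its own (smaller) test cases.
    graph = defaultdict(set)
    for cell, refs in deps.items():
        graph[cell]  # ensure cells without references still appear
        for r in refs:
            graph[cell].add(r)
            graph[r].add(cell)
    seen, parts = set(), []
    for start in graph:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            c = stack.pop()
            if c not in comp:
                comp.add(c)
                stack.extend(graph[c] - comp)
        seen |= comp
        parts.append(comp)
    return parts

# Two independent calculation chains -> two fragments.
deps = {"C1": {"A1", "B1"}, "D1": {"C1"}, "C2": {"A2"}, "D2": {"C2"}}
print(fragments(deps))   # e.g., [{'A1','B1','C1','D1'}, {'A2','C2','D2'}]
```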
6.4.6 Spreadsheet error classifications and taxonomies
Many studies have developed classifications or taxonomies of spreadsheet errors. The study in [44] is one of the pioneering works. Although the classification scheme in [44] includes eight error types, it is not fine-grained enough because: (a) some of these error types are mistakes (e.g., incorrectly copied formulae) and some are faults (e.g., incorrect ranges in formulae and incorrect cell references), and (b) one “error” type, called confused range names, is in fact an undesirable practice potentially contributing to confusion rather than a fault that directly causes failures. Another work [152] distinguished between two classes of errors: domain errors (e.g., a “mistake” in logic due to misunderstanding the depreciation concept in accounting) and device errors (e.g., a “fault” involving a wrong reference in the depreciation function). Like the work in [44], the error classification in [152] does not differentiate between mistakes and faults.
With respect to spreadsheet faults, the work in [153] distinguished between location faults and formula faults. A location fault is a fault in a formula that is conceptually correct but in which one or more cell references are wrong, while a formula fault is a misuse of operators or a wrong number of operators. On the other hand, two studies [42,137] developed a taxonomy specifically focused on spreadsheet mistakes (Fig.1). They defined quantitative mistakes as those which cause “immediate” incorrect output from spreadsheets, whereas qualitative mistakes are those which do not result in “immediate” wrong output but, rather, often represent poor design and coding practices (e.g., putting a constant instead of a cell reference into a formula). This definition of qualitative “mistakes” is debatable, because such practices do not necessarily produce failures; they are poor spreadsheet modeling practices and, strictly speaking, not mistakes. Note that, in [154], latent errors are defined as those that “do not directly cause the error and occur upstream of the event”. Thus, the latent errors of [154] are essentially the same in nature as the qualitative “mistakes” of [42,137].
Extending this taxonomy, the study in [155] added two more “mistakes”: jamming (values of more than one variable are placed in a single cell) and duplication (information of a variable is duplicated in the spreadsheet, possibly resulting in data inconsistency). As with the qualitative “mistakes” of [42,137], classifying jamming and duplication (which are unrecommended practices) as mistakes is controversial. Similarly, the study in [156] developed a taxonomy that classifies the qualitative “mistakes” in Fig.1 into four subtypes: formula integrity, semantics, extendibility, and logic transparency (see Fig.2). With respect to the qualitative “mistakes” and quantitative mistakes in Fig.1, an experiment involving desk checking was performed [157]. This experiment found that: (a) quantitative mistakes were more easily detected than qualitative “mistakes”; and (b) the detection rate depended on the type and prominence of the mistakes (e.g., whether they were conspicuous) as well as on prior incremental practice with spreadsheet error detection [157].
6.4.7 Impact of spreadsheet calculation paradigms
It was argued in [
158] that a cause of spreadsheet errors is the low conceptual level of spreadsheets, e.g., lack of abstraction and modularity mechanisms. This study [
158] compared two spreadsheet calculation paradigms:
traditional (e.g., Excel) versus
structural. The structural paradigm operates at a higher conceptual level, utilizing goals, plans, and spreadsheet data structures in computation [
159]. The study [
158] found that the two paradigms produced different error behaviors.
6.4.8 Impact of spreadsheet faults
In [
160], the financial impacts of faults in 25 operational spreadsheets were investigated. The study identified 117 genuine faults, of which 70 (60%) had financial impact and the remaining 47 (40%) did not. The largest financial impact of a single fault was over $100 million, followed by several faults whose impact exceeded $10 million each. A similar study [
161] was performed to investigate the impact of spreadsheet errors in an Irish healthcare setting, and found that more than 90% of the spreadsheets studied were faulty and the cell-error rate was 13%.
6.4.9 Debugging
Debugging effort and performance Some researchers (e.g., [
162]) argued that the human-computer interaction (HCI) literature tended to emphasize the strengths of spreadsheets (e.g., quick gratification of immediate users’ needs) but often neglected their weaknesses, one of which is the difficulty of debugging. Some other researchers (e.g., [
46]) reported that users did not spend much time in a separate debugging stage, thereby reducing the effectiveness of this task.
With respect to the performance and behavior of expert and novice end users in spreadsheet debugging, an experiment [
163] yielded the following important results. First, on average, the debugging performance of experts (72%) was 14 percentage points higher than that of novices (58%), and this difference was statistically significant, especially for formula faults. Second, regarding the debugging behavior of both types of users in terms of coverage per cell, formula cells took precedence over data cells, with a greater percentage of expert users examining each cell. Also, summation formula cells and bottom-line value cells attracted more attention than other formula cells. Furthermore, formula cells generating
text values were not checked as much as formula cells generating
numeric values. Third, using a debugging tool that gives feedback on cell coverage resulted in slightly higher debugging performance (62% for the experimental group versus 59% for the control group). While this difference was not statistically significant, the cell-coverage feedback provided by the debugging tool greatly increased cell-checking rates.
Debugging strategies, models, and techniques For effective debugging, the task should be well supported by associated strategies, models, and techniques. In view of this, a sensemaking model was derived to analyze how end users went about debugging their spreadsheets [
164,
165]. The study in [
164] found the dominance of information foraging, which occupied half to two-thirds of participants’ time. This study also identified
selective (striving for efficiency by following contextually salient cues) and
comprehensive (seeking a more complete understanding of the problem) styles as two successful strategies for traversing the sensemaking model. The above findings thus revealed important implications for the design of spreadsheet tools to support end-user developers’ sensemaking during debugging.
Model-based software debugging (MBSD) is a well-known technique for fault localization in software written in common programming languages such as C++ and Java. One study [
166] empirically analyzed and compared the use of three MBSD models in locating spreadsheet faults: the value-based model, the dependency-based model, and the novel dependency-based model [
167]. Since this study [
166] found that each of the three models has its merits and drawbacks, combining them to produce a more effective bug-diagnosis partitioning technique should be considered. Another study [
168] introduced a model-based diagnosis for spreadsheet debugging by translating the relevant parts of a spreadsheet to a constraint satisfaction problem for performing further reasoning (e.g., about input propagations).
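As a minimal sketch of this translation (our own construction using the z3 solver, not the system from [168]), the following encodes a two-formula spreadsheet as constraints and asks which single formula, if assumed abnormal, reconciles the model with the user's expected output.

```python
# A minimal sketch of model-based spreadsheet diagnosis with the z3 solver;
# the spreadsheet, values, and single-fault assumption are ours, not [168]'s.
from z3 import Ints, Bool, Solver, Implies, Not, sat

a1, a2, b1, b2 = Ints('a1 a2 b1 b2')
ab_b1, ab_b2 = Bool('ab_b1'), Bool('ab_b2')   # "abnormal" flag per formula cell

base = [a1 == 3, a2 == 4,                     # input cells
        Implies(Not(ab_b1), b1 == a1 + a2),   # B1 = A1 + A2, unless B1 faulty
        Implies(Not(ab_b2), b2 == b1 * 2),    # B2 = B1 * 2, unless B2 faulty
        b2 == 20]                             # the user's expected output

# Single-fault diagnosis: which one formula, assumed faulty, makes the
# constraint system satisfiable again?
for faulty, healthy in [(ab_b1, ab_b2), (ab_b2, ab_b1)]:
    s = Solver()
    s.add(*base)
    s.add(faulty, Not(healthy))
    if s.check() == sat:
        print(f'{faulty} is a candidate single-fault diagnosis')
# Here both B1 and B2 are reported as single-fault candidates.
```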
Fault localization is an important task for debugging. We noted several studies on fault localization. The first study [
169] investigated three fault localization techniques: blocking, nearest consumers, and test count. This study found empirically that two individual factors, information base and mapping, both had a significant impact on the effectiveness of these three techniques. The second study [
170] proposed several ways to combine reasoning from UCheck [
113-
115] and WYSIWYT [
13,
142-
144] for locating faults and mapping the fault information visually to the users. This combined technique was reported to be more effective than either technique alone [
170]. The third study [
171] investigated the impact of different similarity coefficients on the accuracy of spectrum-based fault localization (SFL, which is a software debugging approach originally developed for traditional procedural or object-oriented programming languages) applied to spreadsheets. This study [
171] identified three of the 42 studied similarity coefficients as the best ones to diagnose spreadsheets with SFL. The fourth study [
172] proposed a fragment-based spreadsheet debugging approach, based on the rationale that, after partitioning a spreadsheet into fragments, users find it easier to assess the correctness or faultiness of each smaller fragment than of the entire spreadsheet.
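To illustrate the mechanics of SFL on spreadsheets, the sketch below ranks formula cells by the Ochiai coefficient, a well-known similarity coefficient from the SFL literature (the cells and test outcomes are hypothetical).

```python
# A minimal sketch of spectrum-based fault localization over spreadsheet
# cells, using the Ochiai coefficient; cells and test outcomes are made up.
from math import sqrt

# Per test case: the formula cells involved in computing the checked output,
# and whether the test passed.
spectra = [({'B1', 'B2', 'C1'}, False),
           ({'B1', 'C1'}, True),
           ({'B2', 'C1'}, False)]

total_failed = sum(1 for _, passed in spectra if not passed)
for cell in sorted(set().union(*(inv for inv, _ in spectra))):
    n_ef = sum(1 for inv, ok in spectra if cell in inv and not ok)
    n_ep = sum(1 for inv, ok in spectra if cell in inv and ok)
    denom = sqrt(total_failed * (n_ef + n_ep))  # Ochiai: n_ef / sqrt(F*(n_ef+n_ep))
    print(cell, round(n_ef / denom, 3) if denom else 0.0)
# B2 ranks highest: it appears in every failing run and in no passing one.
```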
Debugging tools A debugging tool, called GoalDebug, was developed to propagate the change from an old, incorrect value to a new, expected value backward over the formula to its arguments and referenced cells, deriving formula changes and generating change suggestions for the developer [
173]. Similarly, an interactive debugging tool, called EXQUISITE, based on artificial intelligence was developed [
174]. In addition, the role of visualization tools in spreadsheet debugging and fault detection was studied in [
175]. It was reported that developers using a visualization tool that facilitates chaining to trace formula cell references were significantly faster in identifying and correcting linking faults than developers not using the tool.
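The backward-propagation idea behind GoalDebug can be sketched as follows (a simplified illustration of ours, not the tool's actual algorithm; the formula and values are hypothetical): an expected value for an output cell is pushed back over its formula, yielding one candidate repair per argument plus a formula-change option.

```python
# A simplified sketch of backward propagation in the style of GoalDebug;
# the formula, cell names, and values are hypothetical.
def repair_suggestions(a1, b1, expected_c1):
    """For C1 = A1 + B1, derive candidate changes making C1 == expected."""
    return [f'set A1 to {expected_c1 - b1} (currently {a1})',
            f'set B1 to {expected_c1 - a1} (currently {b1})',
            'change the formula in C1 (currently =A1+B1)']

for suggestion in repair_suggestions(a1=3, b1=4, expected_c1=10):
    print(suggestion)
```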
Furthermore, two interactive tools for detecting data dependencies in a spreadsheet were developed and evaluated [
176]. One of them is an online tool which displays flowchart-like diagrams, while the other tool interactively depicts cell dependencies directly on the spreadsheet. It was found that users generally preferred the second tool though both tools were considered helpful [
176]. Recently, a study [
177] reported that, although debugging tools are useful, over-reliance on them can reduce fault identification performance: users of debugging tools might focus solely on the potential problems pinpointed by the tools, paying less attention to other parts of the spreadsheet.
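Returning to the dependency tools of [176]: both rest on first recovering the cell-dependency graph from the formulas, a step that can be sketched as follows (our own illustration, using a regular expression over A1-style references; the formulas are hypothetical).

```python
# A minimal sketch of recovering the cell-dependency graph that such
# visualizations are built on; not the implementation from [176].
import re

formulas = {'B1': '=A1+A2', 'B2': '=B1*2', 'C1': '=SUM(B1:B2)'}
ref = re.compile(r'\b([A-Z]+[0-9]+)\b')   # matches A1-style cell references

depends_on = {cell: set(ref.findall(f)) for cell, f in formulas.items()}
print(depends_on)
# {'B1': {'A1', 'A2'}, 'B2': {'B1'}, 'C1': {'B1', 'B2'}}
```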
6.5 Usage and maintenance
Generally, users of enterprise (non-spreadsheet) applications do not need to know the tools or programming languages used for developing the applications. By contrast, effective use of a spreadsheet by an end user (including the management) who is not the developer appears to require substantial spreadsheet knowledge [
17]. Thus, problems with the quality of spreadsheet applications can be partially offset by the end user’s spreadsheet knowledge [
17]. In this regard, a training program should be developed to raise the effectiveness of using and accepting end-user applications, including spreadsheets [
178]. It was also reported that companies generally offered more user training for high-risk spreadsheet applications [
19].
Some studies [
45,
88] recommended employing specific controls in the use of spreadsheets, such as: (a) documenting processing logic in detail to cater for the future need of reconstructing formulae, (b) preparing user manuals that describe the general context, input, processing, output, and backup and recovery procedures, (c) timestamping outputs to safeguard data integrity, (d) formulating explicit error detection and handling procedures, and (e) implementing security measures to prevent data loss.
Like most conventional software, spreadsheets are subject to software evolution or maintenance. However, this task is often not supported by version management tools, which can lead to new mistakes or faults being introduced during evolution. In view of this problem, a spreadsheet corpus (known as VEnron) was developed to provide useful version information on spreadsheet evolution [
179]. Also, during the maintenance phase, end users often need to modify a large (legacy) spreadsheet that was developed by others and whose functionality they do not understand. To address this problem, several techniques were proposed in [
180], including: (a) reverse engineering techniques for deriving relational models from existing spreadsheets, and (b) data mining techniques for reasoning about spreadsheet data and inferring functional dependencies among columns (these dependencies being the building blocks to infer the spreadsheet business model as a database schema).
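As an illustration of the second kind of technique, the sketch below (hypothetical data, not the algorithm of [180]) checks the simplest form of functional dependency: column X determines column Y if no X value maps to two different Y values.

```python
# A minimal sketch of a functional-dependency check over spreadsheet rows;
# the column names and data are hypothetical.
def determines(rows, x, y):
    """Return True if column x functionally determines column y."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[x], row[y]) != row[y]:
            return False   # same x value mapped to two different y values
    return True

rows = [{'dept': 'HR', 'site': 'NY'},
        {'dept': 'IT', 'site': 'SF'},
        {'dept': 'HR', 'site': 'NY'}]
print(determines(rows, 'dept', 'site'))   # True: dept -> site holds here
```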
7 Analysis, discussions, implications, and future research
7.1 Analysis and discussions
We call the 122 selected journal and conference papers related to spreadsheet QA the “primary” papers, and the other relevant ones listed in the References Section the “secondary” papers. We counted the number, P, of the primary papers as well as the number, T, of both primary and secondary papers related to the overall spreadsheet life cycle. We also performed similar counts for papers on individual spreadsheet life-cycle stages or substages. A paper may be counted more than once if it addresses multiple issues or areas. Tab.6 shows all these P and T values.
Overall, at the level of life-cycle stages, the values for
P range from 1 to 81, and the values for
T range from 2 to 107. We found that the life-cycle stages “Problem and Scope Identification” and “Usage and Maintenance” have the smallest
P and
T values. Consider the “Problem and Scope Identification” stage. There are two plausible reasons contributing to its small
P and
T values. First, Ronen et al. [
44] argued that problem identification in spreadsheet development is similar to the task “Studying Existing Systems” in the traditional waterfall model used for implementing non-spreadsheet applications. As problem identification is largely independent of the development platform, existing studies related to problem identification for non-spreadsheet applications are likely to be applicable to spreadsheet development. Consequently, the demand is not high for additional studies that specifically address problem identification for spreadsheets. Second, unlike professional programming, where users and developers are different groups of people, users often undertake the spreadsheet development work themselves. Obviously, for such end users, problems and requirements are both more easily understood (because the problems and requirements are their own) and more likely to change (because end users may need to negotiate such changes only with themselves) [
16,
181]. Thus, problem and requirement identification for spreadsheet development are relatively more implicit. The above two reasons may result in fewer studies related to the “Problem and Scope Identification” stage.
In Section 6.5 above, we discussed seven studies [
17,
19,
45,
88,
178-
180] related to the “Usage and Maintenance” stage. From them, we identified several spreadsheet QA techniques and controls, such as input and output controls, documentation, backup and recovery, user training, and change/version management. We noted not only that very few of our collected papers have addressed the relevant techniques and controls of this stage, but also that four of these few papers [
17,
19,
45,
88] merely touched on these techniques and controls without in-depth discussion.
The “Testing and Debugging” stage has the largest values of P (= 81) and T (= 107), much larger than the P (= 26) and T (= 37) values of the “Specification, Modeling, and Design” stage. Our further investigation of the “Testing and Debugging” stage found that, overall, the three most popular research areas are: static testing (P = 30, T = 44), dynamic testing (P = 10, T = 13), and debugging (P = 17, T = 20). For each of these three areas, we also counted the numbers and percentages of relevant studies in two categories: technical-oriented IT research and business-oriented IS research. The results are shown in Tab.7. We observed that, across all three areas, the numbers and percentages of relevant studies in the technical-oriented IT category are much larger than those in the business-oriented IS category. A major reason is that the software engineering community has produced many technical-oriented studies on the development of methodologies, techniques, and tools for these three areas. In contrast, most of the relevant studies in the business-oriented IS category primarily focused on the performance evaluation (via experiments and case studies) and application aspects of testing and debugging.
We also made an interesting observation regarding the area “Spreadsheet Error Classifications and Taxonomies” (see Section 6.4.6). Among the eight relevant papers (
P = 7,
T = 8), two of them [
44,
153] came from technical-oriented IT journals, four of them [
42,
152,
155,
156] from business-oriented IS journals or conferences, one of them [
137] from a dual-focused IT/IS journal (
Decision Support Systems (
DSS)), and the remaining one [
157] from an MS/OR journal (
Omega). Furthermore, the two papers, one from
DSS and the other from
Omega, were written by IS researchers. Thus, overall, the majority of papers related to spreadsheet error classifications and taxonomies are in the business-oriented IS category. We assert that most IS researchers pay relatively less attention to the IEEE terminology than software-engineering researchers do. This could be a reason why many IS researchers have not differentiated among mistakes, faults, and failures in their error classifications and taxonomies. Although their practices may be appropriate in their own study contexts, it would be desirable for future studies in this area to follow the well-defined IEEE terminology, in order to avoid potential confusion and to generate a greater impact in the spreadsheet community.
7.2 Implications and suggestions for future research
In Section 7.1, it is mentioned that the values of
P and
T are relatively small for the life-cycle stage “Problem and Scope Identification”. One plausible reason mentioned there is that problem identification is not much different between spreadsheet and non-spreadsheet development [
44], implying that new studies on this life-cycle stage are not in high demand. This reasoning is debatable. For example, problem identification in spreadsheet development may involve a “make or buy (existing template)” analysis [
44], and such an analysis may not exist in non-spreadsheet development projects. Hence, more studies on problem identification specifically related to spreadsheets should be performed.
Also, our analysis in Section 7.1 found that very few studies relate to the “Usage and Maintenance” stage. Thus, we recommend that more future studies be conducted on areas related to this stage (e.g., input and output controls, documentation, backup and recovery, user training, and change management). For example, with respect to user training on spreadsheets, several questions are worth exploring: What modes (e.g., in-class, online, on-the-job, and help-desk) should be used for cost-effective spreadsheet training? Under time constraints (e.g., a half-day training session), what training approaches and focuses (e.g., technical versus application-domain knowledge, model building, static and dynamic testing, testing analysis tools, debugging, and which kinds of cells, blocks, or structures in the spreadsheet should be spotlighted) would achieve the best outcomes for effective usage and maintenance?
Our observation that the P and T values of the “Specification, Modeling, and Design” stage are much smaller than the corresponding values of the “Testing and Debugging” stage also has an important implication. The former stage involves developing spreadsheets with higher quality and fewer faults (i.e., fault prevention), whereas the latter stage mainly focuses on fault detection and removal. As prevention is better than cure, more future research should be performed to develop new and promising strategies, approaches, and techniques for spreadsheet specification, modeling, and design, as well as to determine how to apply them effectively in industry.
Section 6.2.3 has discussed the application of TDD to spreadsheet development [
11,
75,
76]. TDD has been advocated by many software researchers and practitioners in
non-spreadsheet domains for developing quality software. TDD is an iterative development approach with a “test early and continuously” attitude, so that developers need to design applications with testing in mind [
182-
184]. TDD offers many benefits. For example, it can help developers catch errors early, well before a separate testing phase commences, and also helps clarify the design. Despite these merits, we have not noted any studies other than [
11,
75,
76] (which were published by the same group of researchers) that apply TDD to spreadsheet development. Thus, a strong need exists to investigate and explore this issue in future research. A key concept in TDD is that design and testing should not be two distinct tasks performed separately; rather, there is a close relationship between the two. Following this logic, it is also worthwhile to strengthen and extend the few existing studies (e.g., [
71]) investigating which spreadsheet errors are likely to be reduced by each design practice or technique.
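To give a flavor of what this could look like, the sketch below shows a hypothetical TDD-style workflow for a spreadsheet model (our own illustration, not the approach prescribed in [11, 75, 76]): the intended cell logic is mirrored as a Python function so that a failing unit test can be written before the logic itself.

```python
# A hypothetical TDD-style sketch for a spreadsheet calculation: the test is
# written first and fails ("red") until the mirrored logic is implemented.
def straight_line_depreciation(cost, salvage, years):
    raise NotImplementedError  # "red" step: body written only after the test
    # "green" step would replace the line above with:
    # return (cost - salvage) / years

def test_straight_line_depreciation():
    assert straight_line_depreciation(10_000, 1_000, 9) == 1_000
```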
We also note an important finding related to dynamic testing of spreadsheets. Although some techniques (e.g., the WYSIWYT methodology [
13,
142-
144], an automated test case generation based on backward propagation and solution of constraints on cell values [
147], and metamorphic testing (MT) [
20,
139]) have been proposed to systematically generate test cases for more comprehensive spreadsheet testing, studies [
91,
92] found that many spreadsheet developers often work with an ad-hoc set of example inputs during development. A plausible reason is that spreadsheet users are unwilling to invest time in the specification of test cases [
168]. Thus, more work needs to be done to educate spreadsheet developers and to promote the use of more systematic test case generation techniques for spreadsheets. Also, current spreadsheet environments do not support the “explicit” management of test cases [
168]. To deal with this issue, the EXQUISITE framework was developed with a test case management feature [
168]. Through this framework, test cases can be defined directly in the spreadsheet. We suggest that more tools be developed for test case management in spreadsheet environments.
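To illustrate how such a technique generates checkable test cases without a full oracle, below is a minimal sketch of one metamorphic relation (the stand-in function and data are hypothetical): for a SUM-like computation, permuting the input cells must not change the total, even when the correct total is unknown.

```python
# A minimal sketch of a metamorphic relation for a SUM-like spreadsheet
# computation; the stand-in function and data are hypothetical.
import random

def spreadsheet_total(values):       # stand-in for the formula under test
    return sum(values)

inputs = [120, 37, 81, 4]
shuffled = random.sample(inputs, k=len(inputs))
assert spreadsheet_total(inputs) == spreadsheet_total(shuffled), \
    'metamorphic relation violated: permuting inputs changed the total'
```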
Nowadays, with the advent and popularity of the Microsoft Office Suite, a growing number of more sophisticated spreadsheets are integrated with Visual Basic for Applications (VBA) [
185]. This integration introduces two new sources of quality problems for spreadsheet outputs: the VBA module and its interface with the spreadsheet. Similarly, Excel spreadsheets can now be integrated with the Power BI platform to provide enterprise business-intelligence (BI) capabilities [
186]. As a result, the issue of spreadsheet quality is no longer restricted to the spreadsheet itself, but is also extended to its interface with other modules or platforms. Thus, a more holistic view of spreadsheet quality is necessary.
Although spreadsheets provide data analysis and computation capability, we noted that in some situations, spreadsheets are primarily used as a data repository [
187-
190]. For example, in [
190], spreadsheets were used as a universal data repository to unify the heterogeneous data sources in a distributed enterprise environment. In [
191], Excel was compared with Access (another Microsoft software product) in terms of their relative strengths for storing large amounts of data. It was argued in [
191] that if the goal is to maintain data integrity in a format that can be accessed by multiple users, Access is a better choice. On the other hand, although spreadsheets have some data analysis functions, Broman and Woo [
192] recommended restricting spreadsheets to data entry and storage only, with analysis and visualization handled by a separate program. This practice helps reduce the chance of contaminating or destroying the raw data in the spreadsheet. In view of the use of spreadsheets for data storage, several data-related issues need to be considered, such as data quality, data integrity, data integration, data interoperability, and a well-defined data schema. More research along this direction should be conducted.
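In that spirit, the following minimal sketch (the schema, column meanings, and file name are hypothetical) enforces a simple per-column type schema on a spreadsheet used purely as a data repository, using the openpyxl Python library.

```python
# A minimal sketch of schema validation for a data-repository spreadsheet;
# the schema, column meanings, and file name are hypothetical.
from openpyxl import load_workbook

schema = {'A': int, 'B': str, 'C': float}      # expected type per column

ws = load_workbook('repository.xlsx').active
for row in ws.iter_rows(min_row=2):            # row 1 holds the headers
    for cell in row:
        expected = schema.get(cell.column_letter)
        if expected and not isinstance(cell.value, expected):
            print(f'{cell.coordinate}: expected {expected.__name__}, '
                  f'got {type(cell.value).__name__}')
```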
The above major discussions in Section 7 are summarized in Tab.8.
8 Conclusion and limitations
Spreadsheets will continue to be a popular implementation platform for many decision support systems, intelligent systems, and expert systems in various application domains. Although spreadsheets are handy to develop and use, their inherently free-form development style may result in many mistakes and, hence, faults being introduced at various stages of the development life cycle. Thus, to effectively address spreadsheet QA issues, a holistic life-cycle approach is recommended. To the best of our knowledge, this paper is one of the first of its kind to integrate the major research studies on spreadsheet QA from the technical-oriented IT community, the business-oriented IS community, and the MS/OR community.
Almost all literature reviews have their limitations, and ours is no exception. Our literature search did not strictly follow a systematic approach. Moreover, even though our review is fairly extensive in terms of the survey time period, the number of search terms, the target journals and conferences, and other supporting sources, the collection of relevant papers will never be entirely complete in a broader sense. In other words, some relevant papers might have been inadvertently omitted from our review because they do not satisfy our inclusion criteria (e.g., they were published in a journal outside our list). Despite these limitations, this paper should serve as a handy reference for a comprehensive overview and big picture of the research work done in the important area of spreadsheet QA.