Assessing the accuracy and utility of ChatGPT responses to patient questions regarding posterior lumbar decompression
Alec M. Giakas, Rajkishen Narayanan, Teeto Ezeonu, Jonathan Dalton, Yunsoo Lee, Tyler Henry, John Mangan, Gregory Schroeder, Alexander Vaccaro, Christopher Kepler
Artificial Intelligence Surgery, 2024, Vol. 4, Issue 3: 233-46.
Aim: To examine the clinical accuracy and applicability of ChatGPT answers to commonly asked questions from patients considering posterior lumbar decompression (PLD).
Methods: A literature review was conducted to identify 10 questions representing some of the most common concerns patients have regarding lumbar decompression surgery. The selected questions were posed to ChatGPT, and its initial responses were recorded; no follow-up or clarifying questions were permitted. Two fellowship-trained attending spine surgeons then graded each response using a modified Global Quality Scale to evaluate ChatGPT’s accuracy and utility. The surgeons then analyzed each question, providing evidence-based justifications for the scores.
Results: With two surgeons each grading all ten responses, the minimum possible total score was 20 and the maximum was 100. ChatGPT’s responses earned a total score of 59, an average of just under 3 per graded response. A score of 3 denoted a somewhat useful response of moderate quality, with some important information adequately discussed but some poorly discussed.
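The score arithmetic in the Results can be checked with a short sketch. It assumes, as the stated bounds of 20 and 100 imply but the abstract does not spell out, that each of the two surgeons rated each of the ten responses on a 1 to 5 scale; the function names are illustrative only and do not come from the study.

```python
# Assumption (inferred from the stated bounds of 20 and 100): two graders each
# rate ten responses on an integer scale from 1 to 5.
NUM_QUESTIONS = 10
NUM_GRADERS = 2
SCALE_MIN, SCALE_MAX = 1, 5


def score_bounds(num_questions, num_graders, scale_min, scale_max):
    """Return the (minimum, maximum) possible total score."""
    ratings = num_questions * num_graders  # number of individual ratings
    return ratings * scale_min, ratings * scale_max


def mean_rating(total, num_questions, num_graders):
    """Return the average score per individual rating."""
    return total / (num_questions * num_graders)


low, high = score_bounds(NUM_QUESTIONS, NUM_GRADERS, SCALE_MIN, SCALE_MAX)
average = mean_rating(59, NUM_QUESTIONS, NUM_GRADERS)
print(low, high, average)  # 20 100 2.95
```

Under these assumptions, the reported total of 59 corresponds to a mean of 2.95 per rating, consistent with "just under an average score of 3".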
Conclusion: ChatGPT can provide broadly useful responses to common preoperative questions that patients may have when considering PLD. ChatGPT has excellent utility in providing background information to patients and in helping them become more informed about their pathology in general. However, it often lacks the specific patient context necessary to provide patients with personalized, accurate insights into their prognosis and medical options.
Keywords: Artificial intelligence / ChatGPT / lumbar decompression / spine surgery