Sequence-based imitation learning for surgical robot operations
Gabriele Furnari , Cristian Secchi , Federica Ferraguti
Artificial Intelligence Surgery, 2025, Vol. 5, Issue 1: 103-15.
Aim: This paper aims to advance autonomous surgical operations through imitation learning from video demonstrations.
Methods: To address this objective, we propose two main contributions: (1) we introduce a new dataset of virtual kidney tumor environments on which to train our model. The dataset consists of video demonstrations of tumor removal from the kidney, executed in a virtual environment, together with kinematic data of the robot tools; (2) we employ an imitation learning architecture that combines a vision transformer (ViT), which encodes the frames extracted from the videos, with a long short-term memory (LSTM) network that processes surgical motion sequences through a sliding-window mechanism. The model takes the current video frame and the prior poses as input and predicts the next pose of both robotic arms. A self-generating sequence approach is used: each predicted pose becomes the latest element of the sequence and is fed back, together with the current video frame, as input to the next prediction. The choice of architecture and methodology was guided by the need to model the sequential nature of surgical operations effectively.
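To make the described architecture and the self-generating sequence mechanism concrete, the following PyTorch sketch shows one plausible realization. It is an illustration under assumptions, not the authors' implementation: the module names, the pose dimension (14, i.e., a 7-DoF pose per arm), the window length, and the use of a timm ViT backbone are all hypothetical.

# Hypothetical sketch of a ViT + LSTM pose-prediction policy (names and dimensions assumed).
import torch
import torch.nn as nn
import timm  # assumed dependency providing the pretrained ViT backbone


class ViTLSTMPolicy(nn.Module):
    """Predicts the next pose of both robot arms from the current video
    frame and a sliding window of the K most recent poses."""

    def __init__(self, pose_dim: int = 14, hidden: int = 256):
        super().__init__()
        # ViT backbone; num_classes=0 strips the classification head so the
        # model returns a single feature vector per frame.
        self.vit = timm.create_model("vit_base_patch16_224",
                                     pretrained=True, num_classes=0)
        feat_dim = self.vit.num_features            # 768 for vit_base
        self.pose_embed = nn.Linear(pose_dim, hidden)
        # The LSTM consumes, per time step, a past pose fused with the frame features.
        self.lstm = nn.LSTM(hidden + feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, pose_dim)     # next pose for both arms

    def forward(self, frame: torch.Tensor, past_poses: torch.Tensor) -> torch.Tensor:
        # frame: (B, 3, 224, 224); past_poses: (B, K, pose_dim)
        feat = self.vit(frame)                                   # (B, feat_dim)
        feat = feat.unsqueeze(1).expand(-1, past_poses.size(1), -1)
        x = torch.cat([self.pose_embed(past_poses), feat], dim=-1)
        out, _ = self.lstm(x)                                    # (B, K, hidden)
        return self.head(out[:, -1])                             # (B, pose_dim)


@torch.no_grad()
def rollout(policy: ViTLSTMPolicy, frames, init_poses: torch.Tensor) -> torch.Tensor:
    """Self-generating sequence: each predicted pose is appended to the
    sliding window and fed back in with the next video frame."""
    poses = init_poses.clone()          # (1, K, pose_dim) seed window
    trajectory = []
    for frame in frames:                # frames: iterable of (1, 3, 224, 224) tensors
        next_pose = policy(frame, poses)                 # (1, pose_dim)
        trajectory.append(next_pose)
        # Slide the window: drop the oldest pose, append the prediction.
        poses = torch.cat([poses[:, 1:], next_pose.unsqueeze(1)], dim=1)
    return torch.stack(trajectory, dim=1)                # (1, T, pose_dim)

In this sketch, training would supervise ViTLSTMPolicy with the recorded pose windows from the demonstrations, while rollout reproduces the self-generating behavior described above at execution time.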
Results: The model achieved promising results, with an average position error of 0.5 cm, and correctly executed 70% of the test tasks. This highlights the efficacy of the sequence-based approach in capturing and predicting surgical trajectories.
Conclusion: Our study supports imitation learning’s viability for acquiring task execution policies in surgical robotics. The sequence-based model, combining ViT and LSTM architectures, successfully handles surgical trajectories.
Keywords: Imitation learning / robot-assisted surgery / artificial intelligence