Multi-scale spatial features and temporal attention mechanisms: advancing the accuracy of ENSO prediction