GaussDB-AISQL: a composable cloud-native SQL system with AI capabilities
Cheng CHEN, Wenlong MA, Congli GAO, Wenliang ZHANG, Kai ZENG, Tao YE, Yueguo CHEN, Xiaoyong DU
Front. Comput. Sci., 2025, Vol. 19, Issue (9): 199608
Cloud-native data warehouses have revolutionized data analysis by enabling elasticity, high availability, and lower costs. The increasing popularity of artificial intelligence (AI) is driving data warehouses to provide predictive analytics in addition to existing descriptive analytics. Consequently, more vendors are starting to support training and inference of AI models inside data warehouses, exploiting the benefits of near-data processing for fast model development and deployment. However, most existing solutions are limited by complex syntax or slow data transportation across engines.
In this paper, we present GaussDB-AISQL, a composable SQL system with AI capabilities. GaussDB-AISQL adopts a composable system design that decouples computing, storage, caching, the DB engine, and the AI engine. The system offers all the functionality needed for end-to-end model training and inference throughout the model lifecycle. It is simple and efficient to use: it provides a SQL-like syntax and removes the burden of manual model management. When training an AI model, GaussDB-AISQL benefits from highly parallel data transportation, pulling data concurrently from distributed shared memory. Its feature selection algorithms make training more data-efficient. When running model inference, GaussDB-AISQL registers the trained model object in the local data warehouse as a user-defined function, which avoids moving inference data out of the data warehouse to an external AI engine. Experiments show that GaussDB-AISQL is up to 19× faster than baseline approaches.
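The idea of registering a trained model as a user-defined function can be illustrated with a minimal sketch. This is not GaussDB-AISQL's syntax or implementation: it assumes SQLite (via Python's standard `sqlite3` module) as a stand-in for the local data warehouse, a toy one-feature least-squares fit as a stand-in for the AI engine, and hypothetical names (`trips`, `fit_line`, `predict_fare`) chosen only for illustration.

```python
import sqlite3


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (toy 'training')."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


# Assumption: an in-memory SQLite database plays the role of the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (distance REAL, fare REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [(1.0, 5.0), (2.0, 7.0), (3.0, 9.0), (4.0, 11.0)],
)

# "Training": read the training columns and fit the model.
rows = conn.execute("SELECT distance, fare FROM trips").fetchall()
slope, intercept = fit_line([r[0] for r in rows], [r[1] for r in rows])


def predict_fare(distance):
    """Inference with the fitted parameters captured from the enclosing scope."""
    return slope * distance + intercept


# Register the trained model as a one-argument SQL function, so inference
# runs inside the database instead of exporting rows to an external engine.
conn.create_function("predict_fare", 1, predict_fare)

predictions = conn.execute(
    "SELECT distance, predict_fare(distance) FROM trips ORDER BY distance"
).fetchall()
```

Once registered, the model is invoked like any other SQL function, for example `SELECT predict_fare(10.0)`. The query stays in the database and only the prediction leaves it, which is the data-movement saving the abstract describes.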
Keywords: database system / data management / OLAP / cloud computing / AI / machine learning
Higher Education Press