VIEWS & COMMENTS

Smart systems engineering contributing to an intelligent carbon-neutral future: opportunities, challenges, and prospects

  • Xiaonan Wang 1,
  • Jie Li 2,
  • Yingzhe Zheng 2,
  • Jiali Li 2
  • 1. Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
  • 2. Department of Chemical & Biomolecular Engineering, Faculty of Engineering, National University of Singapore, Singapore 117585, Singapore

Received date: 01 Sep 2021

Accepted date: 06 Nov 2021

Published date: 15 Jun 2022

Copyright

© Higher Education Press 2022

Abstract

This communication paper provides an overview of multi-scale smart systems engineering (SSE) approaches and their applications in crucial domains including materials discovery, intelligent manufacturing, and environmental management. A major focus of this interdisciplinary field is on the design, operation and management of multi-scale systems with enhanced economic and environmental performance. The emergence of big data analytics, internet of things, machine learning, and general artificial intelligence could revolutionize next-generation research, industry and society. A detailed discussion is provided herein on opportunities, challenges, and future directions of SSE in response to the pressing carbon-neutrality targets.

Cite this article

Xiaonan Wang, Jie Li, Yingzhe Zheng, Jiali Li. Smart systems engineering contributing to an intelligent carbon-neutral future: opportunities, challenges, and prospects[J]. Frontiers of Chemical Science and Engineering, 2022, 16(6): 1023-1029. DOI: 10.1007/s11705-022-2142-6

1 Introduction

In light of the pressing environmental and climate change challenges, smart approaches are indispensable for sustainable development towards a net-zero future. Advances in artificial intelligence (AI), especially machine learning (ML), provide an enormous variety of smart tools for processing complex data and information generated from experimental and computational research, as well as industrial applications [1]. Ultimately, these smart technologies could enable automated materials discovery, process optimization, environmental management, and all aspects of modern industry, thereby facilitating improvements in energy efficiency, emission reduction, and eventually the realization of the carbon neutrality target [2].
“Smart systems engineering” (SSE) is an interdisciplinary field that deals with the design, operation, and management of multi-scale systems, as illustrated in Fig. 1. Accelerated materials discovery and development is the cornerstone of modern manufacturing that impacts all aspects of human life. Intelligent manufacturing plays a key role in economic and eco-friendly process development. Environmental management is highly relevant throughout the life cycle of all multi-scale systems. In these respects, AI can assist in each step through the whole life cycle of material discovery, advanced manufacturing, and general environmental management [3,4].
Fig. 1 Illustration of multi-scale smart systems engineering (SSE). The general SSE approaches are applied across multiple length and time scales, from atoms, molecules, and particles at the nano/microscale, through processes at the mesoscale, to the global environment at the macroscale.


Various AI techniques have been developed, including supervised learning, unsupervised learning, and reinforcement learning, among others. With the rapid increase in computing power, deep learning based on neural networks has shown huge promise. Interested readers are referred to previous reviews for the details of AI and ML in the discussed fields (e.g., Refs. [5–7]). Today’s world is experiencing an unprecedented surge in available data, which makes ML an indispensable tool in its digital transformation. Today’s society and industries, confronting increasingly severe climate crises, urgently require such intelligent approaches aimed at decarbonizing everything possible from both the supply and demand perspectives [8].

2 Smart systems approach in materials discovery

Different ML strategies can be utilized at each stage of material characterization, property prediction, synthesis, and theory paradigm discovery to accelerate the whole development process. Overall, three major categories of ML methods, namely supervised learning, unsupervised learning, and semi-supervised learning, are applied to expedite the discovery of new materials (Fig. 2) [5]. To realize a smart systems approach in material discovery, a process expert must identify the appropriate task framework for the given problem. Generally, ML task frameworks through the whole life cycle of material discovery can be grouped into four major categories: 1) regression, 2) classification, 3) generation, and 4) interpretation (Fig. 2). The first three tasks (all except interpretation) normally share a general workflow: a feature engineering process is first implemented, which relies on either manually created features or machine-learned features (or a hybrid of both), followed by downstream algorithms, which use the processed features to perform specific tasks. The key differences in computer implementation between the tasks are the downstream algorithms and the loss functions related to each task.
Fig. 2 Typical machine learning methods and their application in material discovery.


Regression task frameworks are used in material discovery to predict one or several continuous properties. Examples include the bandgap of crystals [9], the T1 and S1 gap of molecules [10], and the location of each molecule inside a scanning tunneling microscope image [11]. To output continuous values, a linear function with an output range from negative infinity to positive infinity is often used as the output activation function. ReLU can also be applied to the output in order to eliminate negative outputs. However, constraining the predicted value to be zero or positive in this way can affect learning efficiency and accuracy. The loss function for regression tasks is usually the mean squared error or the mean absolute error.
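As a minimal sketch of the output and loss choices described above, the following plain-Python snippet contrasts a linear output with a ReLU output and evaluates both losses. The function names and the toy numbers are illustrative assumptions and are not taken from the cited works.

```python
def linear(z):
    """Identity output: any real value is allowed."""
    return z


def relu(z):
    """ReLU output: negative values are clipped to zero."""
    return max(0.0, z)


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical raw network outputs and target property values.
raw_outputs = [1.5, -0.5, 2.0]
targets = [1.0, 0.0, 2.5]

pred_linear = [linear(z) for z in raw_outputs]  # may contain negative values
pred_relu = [relu(z) for z in raw_outputs]      # never negative

mse_linear = mean_squared_error(targets, pred_linear)
mae_linear = mean_absolute_error(targets, pred_linear)
mse_relu = mean_squared_error(targets, pred_relu)
```

In this toy case the ReLU output happens to lower the error because the target for the second sample is zero; for properties that can legitimately be negative, the clipped output would instead introduce a bias.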
Classification task frameworks are used in material discovery to determine which category of a discrete property the material under study belongs to. For example, one may need to identify which space group a crystal material belongs to [12], which reaction type a specific reaction entry belongs to [13], whether a specific synthesis recipe can lead to atomically precise gold nanoclusters [14], and the chirality of molecules [11]. It is common to apply a sigmoid activation function to the output for a binary classification task and a softmax activation when there are more than two categories. Typically, the cross-entropy loss, which quantifies the difference between two probability distributions, is employed as the loss function.
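The activation and loss functions named above can be written out in a few lines. The sketch below uses only the Python standard library; the logits and one-hot labels are made-up values for illustration.

```python
import math


def sigmoid(z):
    """Map a single logit to a probability for the positive class."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Map a list of logits to a probability distribution over classes."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """Cross-entropy between a label distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(true_dist, pred_dist))


# Binary task: one logit, sigmoid output, label = positive class.
p_pos = sigmoid(0.0)
binary_loss = cross_entropy([1.0, 0.0], [p_pos, 1.0 - p_pos])

# Multi-class task: three logits, softmax output, label = first class.
probs = softmax([2.0, 1.0, 0.1])
multi_loss = cross_entropy([1.0, 0.0, 0.0], probs)
```

A logit of zero corresponds to a probability of 0.5, so the binary loss here equals ln 2; the loss shrinks towards zero as the predicted probability of the correct class approaches one.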
Generation task frameworks in material discovery are much more complex than regression and classification. In material discovery, generation task frameworks are mainly used for inverse design problems. In these applications, it is necessary to first identify the desired properties, and then find the materials that exhibit such properties [15]. The output materials must be expressed in specific structures, e.g., molecular graphs, crystal graphs, or molecular text representations. Also, one desired property may correspond to many materials that fulfil the requirements. In such cases, different generation frameworks such as variational autoencoders, generative adversarial networks, and seq2seq models can be adopted. The specific output data structure is realized by a carefully designed output vector structure or a series of regression and classification tasks. The one-to-many mapping problem is solved by combining a random sample drawn from a prior distribution with the condition information. The loss functions of generation task frameworks are customized for different tasks and may include the reconstruction loss of the target material (i.e., whether the generated material is similar to the labelled correct ones), the prediction loss for properties of interest, and other specific terms for better training.
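Two of the ideas above, conditioning on a target property while sampling random noise and summing several loss terms, can be sketched as follows. This is only an illustration under stated assumptions: the helper names, the weights, and the numbers are hypothetical, and the Kullback-Leibler term is the regularizer used in variational autoencoders, chosen here as one example of the "other specific terms".

```python
import math
import random


def generator_input(condition, latent_dim, rng):
    """Concatenate the condition (target property) with random latent noise.

    Different noise draws for the same condition allow one target property
    to map to many candidate materials.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return list(condition) + noise


def gaussian_kl(mu, log_var):
    """KL divergence between N(mu, exp(log_var)) and N(0, 1), summed over dims."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def generation_loss(reconstruction, property_error, kl, w_prop=1.0, w_kl=1.0):
    """Weighted sum of reconstruction, property-prediction, and KL terms."""
    return reconstruction + w_prop * property_error + w_kl * kl


rng = random.Random(0)
condition = [1.8]  # hypothetical target property value
input_a = generator_input(condition, latent_dim=4, rng=rng)
input_b = generator_input(condition, latent_dim=4, rng=rng)

kl_at_prior = gaussian_kl([0.0, 0.0], [0.0, 0.0])  # zero when posterior equals prior
total = generation_loss(0.30, 0.10, kl_at_prior)
```

The two generator inputs share the same condition but differ in their noise part, which is what lets a trained decoder propose several distinct candidates for one requested property.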
Interpretation tasks in material discovery differ from the aforementioned three tasks. The extraction of material knowledge is realized either by data visualization or by explaining a well-trained model. Data visualization methods such as t-distributed stochastic neighbor embedding and uniform manifold approximation and projection are used to visualize high-dimensional material data and extract insights from them [10,13]. Methods such as LIME and SHAP analysis, which can explain black-box models, are used to extract physicochemical knowledge from well-trained machine learning models [12,14]. Though data-driven approaches to material discovery are still in their infancy, these studies have shown great potential in this direction.
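SHAP analysis builds on Shapley values, which attribute a model's prediction to its input features. The sketch below computes exact Shapley values by enumerating all feature subsets, which is feasible only for a handful of features; it is not the SHAP library itself, and the toy linear model, its weights, and the baseline are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical property predictor: a linear function of three features."""
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[2] + 3.0


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

For a linear model each attribution reduces to the weight times the feature's deviation from the baseline, and the attributions sum to the difference between the prediction and the baseline prediction, which makes this case easy to check by hand.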
Overall, SSE approaches have shown great promise in the materials discovery process. Our previous work [11] is a typical example of applying AI and ML in material science and engineering. In that work, data scarcity was identified as one of the key problems, a method for selecting high-quality data via unsupervised learning was developed, and data augmentation was further implemented to increase the amount of data. The task framework was then designed as a combination of regression and classification tasks, aiming to realize the final target of automated chiral molecule detection and identification. Various SSE approaches facilitate more efficient and effective material discovery: active learning helps scientists to save time and costs by enabling a better sampling of the whole material design space; regression and classification-based machine learning methods can substitute for time-consuming wet experiments or computational simulations; generative models help scientists to discover desired materials within a minute instead of weeks; and ML-based interpretation can sometimes reveal rules that were previously unknown to scientists, thereby fostering better understanding of the system. By systematically using robotics, automation, and machine learning, the development cycle of new materials is expected to be accelerated by a factor of at least 10.

3 Smart systems approach in intelligent manufacturing

Safety, efficiency, and sustainability have always been the cardinal goals of any manufacturing process. Traditional state-of-the-art technologies typically require frequent human involvement, thereby engendering serious limitations in practice. With the advent of Industry 4.0, intelligent manufacturing (IM) has emerged as the pivotal enabler for improving product quality and productivity while optimizing the use of energy and resources with minimal human intervention. At its core, IM is a multipartite endeavor that utilizes the concept of cyber-physical production systems (CPPS), which encompasses myriad contemporary technologies such as AI, cloud computing, and the industrial internet of things (IoT), to drive the development of a more computerized, automated and sophisticated manufacturing environment [16]. In this section, we briefly discuss the application of ML as a computational and data-driven modeling SSE tool for process monitoring, optimization, and control (Fig. 3).
Fig. 3 Schematic overview of intelligent manufacturing.


In general, among the three widely adopted approaches (theory-, experience- and data-driven) in process monitoring, the data-driven approach is perhaps most preferred as it avoids the formulation of cumbersome mechanistic models and minimizes reliance on human judgment. For example, linear ML techniques, such as principal component analysis (PCA), can be utilized to extract vital information from large data sets through dimensionality reduction with minimal accuracy loss. Insights gleaned from monitoring this new reduced set of variables can enhance process understanding, which forms the cornerstone for the convenient and efficient detection and diagnosis of process abnormalities. Gajjar et al. [17] adopted sparse PCA, a variant of conventional PCA that yields principal components with sparse loadings via a variance-sparsity trade-off, for online monitoring and fault diagnosis. Their method enhanced the interpretability of the derived principal components and correctly identified the cause of the fault in a comparative study using the Tennessee Eastman process. However, despite the benefits of data-driven approaches in process monitoring, several key issues that arise in practice have yet to be fully addressed. One major impediment is the non-linearity of most real-world systems, which limits the use of these computationally efficient linear ML methods. More advanced ML techniques can be employed, such as kernel methods that project the original process data onto higher-dimensional spaces where linear methods become applicable, but the handling of heterogeneous and multi-source data and accurate fault prognosis remain challenging.
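To make the PCA-based monitoring idea concrete, the sketch below fits a single principal component to data from normal operation and flags a new sample whose squared prediction error (the part not explained by the component) exceeds the largest value seen in training. It uses conventional PCA via power iteration rather than the sparse PCA of Ref. [17]; the synthetic two-variable data and the simple empirical control limit are assumptions for illustration.

```python
import math


def fit_first_component(data, iterations=200):
    """Return the column means and the leading principal direction."""
    n, m = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(m)]
    centered = [[row[j] - mean[j] for j in range(m)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(m)]
           for a in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):  # power iteration on the covariance matrix
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return mean, v


def squared_prediction_error(sample, mean, v):
    """Squared norm of the residual after projecting onto the component."""
    centered = [s - mu for s, mu in zip(sample, mean)]
    score = sum(c * vi for c, vi in zip(centered, v))
    residual = [c - score * vi for c, vi in zip(centered, v)]
    return sum(r * r for r in residual)


# Synthetic normal operation: the second variable tracks twice the first.
normal_data = []
for i in range(50):
    t = i / 25.0 - 1.0
    normal_data.append([t, 2.0 * t + 0.05 * math.sin(7.0 * i)])

mean, direction = fit_first_component(normal_data)
limit = max(squared_prediction_error(row, mean, direction) for row in normal_data)

spe_normal = squared_prediction_error([0.5, 1.0], mean, direction)
spe_fault = squared_prediction_error([0.5, -1.0], mean, direction)  # correlation broken
fault_detected = spe_fault > limit
```

In practice the control limit would usually be set from a statistical distribution of the residuals rather than the training maximum, and several components would be retained.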
Process optimization is often deemed a challenging task due to its complex, constrained, non-convex, and multi-variate nature. The use of data-driven models in surrogate-based optimization can greatly facilitate this task as it eliminates the necessity of resource-intensive mechanistic or simulation paradigms for optimization. For example, our previous work incorporated data-driven ML models, such as tree-based and neural network algorithms, in developing a novel three-step framework of machine learning for smart energy to predict machine-specific load profiles via energy disaggregation [18]. The framework can be exploited for optimizing the energy consumption in machine-based operations through informed production planning and scheduling. However, although the prescriptive power of ML models can immensely alleviate manufacturing professionals’ effort in maximizing process capability, several limitations exist. Current ML models can rarely extrapolate reliably, and thus a rich training set covering the entire operating region is required. Moreover, as seen in our case [18], it is often necessary to test a wide range of ML models when selecting the most appropriate one for the process under consideration. To overcome these limitations in practice, data-driven modeling can be coupled with design of experiments or active learning to obtain a rich data set, and the burgeoning case studies in the ML community can further aid engineers in applying their domain expertise to match the right models to the right optimization problems.
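The surrogate-based optimization idea can be illustrated with a deliberately small example: a few evaluations of a costly process are used to fit a cheap surrogate, and the surrogate is then optimized instead of the process. Here the surrogate is a quadratic fitted by least squares, not the tree-based or neural network models of Ref. [18]; `expensive_process` and the sample points are hypothetical stand-ins.

```python
def solve_linear(a, b):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    basis = [[1.0, x, x * x] for x in xs]
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
           for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(basis, ys)) for i in range(3)]
    return solve_linear(ata, aty)


def expensive_process(x):
    """Hypothetical costly simulation or experiment (energy use vs. a setting)."""
    return (x - 2.0) ** 2 + 1.0


samples = [0.0, 1.0, 2.5, 3.0, 4.0]             # a small designed set of runs
responses = [expensive_process(x) for x in samples]

c0, c1, c2 = fit_quadratic(samples, responses)
x_opt = -c1 / (2.0 * c2)                        # minimizer of the surrogate
```

Because the surrogate is only trusted inside the sampled range, a suggested optimum outside that range would call for new samples first, which is where design of experiments or active learning comes in.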
In terms of process control, instead of adopting advanced process control techniques such as model predictive control (MPC), traditional manufacturing practices have resorted to statistical process control for measuring and controlling process operations [1]. Although it requires a high initial investment and a dedicated distributed control system architecture, the implementation of ML-based data-driven control systems allows widespread quality control throughout the product life cycle, which in turn serves to offset the aforementioned constraints. For example, our previous work designed recurrent neural network (RNN)-based MPC for applications in continuous pharmaceutical manufacturing [19]. Owing to their mathematical structure, RNNs were shown to be especially suitable for modeling the dynamics of a continuous stirred-tank reactor (CSTR) in pharmaceutical manufacturing, thereby enabling satisfactory closed-loop performance for MPC of a complex reaction in a CSTR. As noted in our previous work [4], satisfactory control requires quality system models that adequately describe the process dynamics across the operating region, and hence a rich training dataset is imperative.
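The receding-horizon logic of MPC with a learned model can be sketched in a few lines. In the snippet below a one-line linear model stands in for the trained RNN of Ref. [19], the plant is assumed to behave exactly like the model, and the controller simply enumerates a coarse grid of input sequences; all coefficients, the setpoint, and the grid are illustrative assumptions.

```python
from itertools import product


def learned_model(x, u, a=0.9, b=0.5):
    """Stand-in for a data-driven one-step predictor x_next = f(x, u)."""
    return a * x + b * u


def mpc_step(x, setpoint, candidates, horizon=3):
    """Pick the first input of the sequence with the lowest predicted tracking cost."""
    best_cost, best_first = float("inf"), candidates[0]
    for seq in product(candidates, repeat=horizon):
        state, cost = x, 0.0
        for u in seq:
            state = learned_model(state, u)
            cost += (state - setpoint) ** 2
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first


candidates = [i / 10 for i in range(11)]  # admissible inputs 0.0 .. 1.0
x, setpoint = 0.0, 1.0
trajectory = [x]
for _ in range(20):
    u = mpc_step(x, setpoint, candidates)  # re-optimize at every time step
    x = learned_model(x, u)                # plant assumed identical to the model
    trajectory.append(x)
```

Only the first input of each optimized sequence is applied before the optimization is repeated from the newly measured state. With a real RNN model the enumeration would be replaced by a numerical optimizer, and model mismatch would make the quality of the training data decisive.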
Overall, as an integral and indispensable part of IM, the incorporation of ML-based SSE approaches greatly simplifies process monitoring, optimization, and control. However, their adoption has been uneven across industries: while Google has reduced the energy used for cooling its data centers by up to 40% with reinforcement learning [20], the integration of these SSE approaches in many manufacturing domains is still in its infancy. Although ML-based SSE approaches still face many limitations in practice, given the pervasive nature of data in modern manufacturing enterprises and the rapid advancements in ML, these SSE approaches are expected to play a critical role in industrial process understanding and decision-making, and ultimately in achieving more efficient, safe, and sustainable operations in the near future.

4 Smart systems approach in environmental management

Our attention is also drawn to the potential contribution of ML combined with big data to environmental management. Environmental research aims to improve the quality of water, soil, and air. With respect to these objectives, three general strategies can be adopted: 1) control the pollution at the original source, 2) prevent the generated pollutants from migrating into the environment, and 3) remove the pollutants from contaminated water, air, and soil, as shown in Fig. 4. Based on these three aspects, possible applications and contributions of ML are summarized and discussed.
Fig. 4 Smart systems approach with machine learning (ML) application in environmental management.


Regarding original source control for protecting water, soil, and air, understanding the pollution status of a region is key to analyzing its causes. Therefore, environmental data from atmospheric monitoring (e.g., the concentrations of PM2.5, O3, SO2, NO2, and CO over time), watersheds (e.g., the concentrations of total phosphorus, nitrogen, chemical oxygen demand, and emerging contaminants such as antibiotics, microplastics, and heavy metals), and soil (e.g., pesticides, antibiotics, and heavy metals) can be employed as targets to develop ML models. A robust predictive model can be created by considering other economic or human activity data as inputs, based on which future concentrations of the above pollutants can be predicted and the possible causes of pollution can be investigated. Such insights from ML models help governments to formulate policies that protect the environment through source control. For example, Jain et al. developed spatiotemporal models for PM2.5, NO2, and CO prediction using ML algorithms, and found that the hybrid modeling approach improved R2 by 0.14 [21]. Thirteen input variables, e.g., population, traffic, wind speed, and temperature, are considered in their model. However, ML can offer much more than that. Further research on ML-based decision and policy making can be conducted to protect the environment.
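Since model improvements such as the one above are reported as a change in R2 (the coefficient of determination), a short sketch of how that metric is computed may be helpful. The observed and predicted concentrations below are invented numbers, not data from Ref. [21].

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily pollutant concentrations and two sets of model predictions.
observed = [10.0, 12.0, 15.0, 11.0, 14.0]
baseline_pred = [11.0, 11.5, 13.0, 12.0, 13.0]
improved_pred = [10.2, 12.1, 14.6, 11.3, 13.8]

r2_baseline = r_squared(observed, baseline_pred)
r2_improved = r_squared(observed, improved_pred)
gain = r2_improved - r2_baseline
```

A value of 1 means the predictions match the observations exactly, while a model that always predicts the observed mean scores 0, so a gain in R2 is the additional share of variance explained by the better model.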
In the prevention of contaminant migration, although many efficient approaches, e.g., exhaust gas purification, wastewater treatment, and waste reduction and utilization, have been well developed and commercialized, they can further benefit from SSE approaches. Many lab- or pilot-scale experiments still need to be conducted to optimize the process conditions and systems that control contaminant migration. To avoid experimental trial and error, ML methods can provide optimal solutions for specific targets by modeling these technologies. For air pollution prevention, a smart design of the combustion system of a coal-fired power plant or waste incineration plant is necessary, and ML can help to ensure high electricity generation efficiency and low air pollutant emissions [22]. For wastewater treatment processes, ML models can predict the quality of effluents, and ML-based optimization models can provide optimal inputs to minimize the material and energy consumption of the wastewater treatment, thus contributing to carbon neutrality [23]. For household waste treatment, ML-based classification modeling in conjunction with online sensor detection can aid automatic waste identification and sorting, which greatly facilitates the selection of downstream conversion strategies [24]. ML-based property prediction of waste-derived products can also help to evaluate their broad applications [25]. Furthermore, ML-based regression can be used to model, optimize, and interpret the conversion processes to achieve desired products. It has been reported that ML algorithms can be employed to model the gasification of organic waste to produce H2-rich syngas by assisting interpretation of the process, optimizing process conditions, and screening outstanding catalysts [26]. That work indicates that temperature and the solid content of the waste are the two most important conditions, and that Fe-based catalysts show great potential to increase H2 production.
In terms of the removal of contaminants from water, air, and soil, adsorption and degradation are the two most widely used approaches. ML methods are helpful for material development, as discussed in Section 2. Therefore, adsorption- and degradation-related materials can be developed with the assistance of ML in a similar way. Moreover, given the serious issues pertaining to climate change, the development and identification of highly efficient porous materials for CO2 adsorption is a promising approach for carbon capture and emission mitigation. Our recent work has shown that ML methods can model CO2 adsorption by porous carbon materials. The developed model and ML-based critical factor identification can effectively guide the synthesis of porous materials for CO2 adsorption [27]. In general, ML models developed from historical environmental data can provide predictions and new insights to aid in decision making and actions for effective environmental protection. Such ML-guided environmental management could spur the development of digital and smart environmental systems to save time, labor, energy, resources, and costs.

5 Conclusions and future prospects

Overall, promising applications of SSE approaches have been identified, especially in the materials, manufacturing, and environment related areas discussed above. We envision more value to be gained in future directions, such as: 1) Applying more advanced AI and ML as well as CPPS and IoT techniques to the current frameworks to further increase efficiency and added value. With the rapid development of information technologies, many promising models and systems that resolve the current limitations could be exploited to achieve better performance. 2) Integrating domain expertise and human knowledge into SSE to form a hybrid data-driven and model-based approach. Using such hybrid frameworks, it would be possible to integrate physical knowledge and human experience with machine learning models in a coherent way, enabling the efficient use of both computing power and logical reasoning. 3) Evolving the norm of the next-generation intelligent laboratory and integrating smart systems thinking into daily work and life. Future development of unified data/ML frameworks built with easily accessible application programming interfaces will greatly increase working efficiency. Workforce training on machine learning and Industry 4.0 is another key enabler for achieving social welfare.
Although promising opportunities are identified, many challenges also exist at this early stage, such as the construction of valuable and open datasets, the lack of standard algorithm workflows and fully autonomous experimental platforms, and insufficient awareness of these technologies in traditional sectors. It should be noted that AI is not an all-powerful tool, and it may not be appropriate when too little data and knowledge have been accumulated or when ethics are a concern. The future of SSE, supported by AI and various cyber technologies integrated with physical systems, holds great promise.

Acknowledgements

This work is supported by Tsinghua University Initiative Scientific Research Program and Tsinghua-Foshan Innovation Special Fund (TFISF). Xiaonan Wang thanks the award of Future Chemical Engineers and the Global Chinese Chemical Engineering Symposium (GCCES).

References

[1] Suvarna M, Yap K S, Yang W, Li J, Ng Y T, Wang X. Cyber-physical production systems for data-driven, decentralized, and secure manufacturing—a perspective. Engineering, 2021, 7(9): 1212–1223
[2] Li L, Wang X. Design and operation of hybrid renewable energy systems: current status and future perspectives. Current Opinion in Chemical Engineering, 2021, 31: 100669
[3] Fang H, Zhou J, Wang Z, Qiu Z, Sun Y, Lin Y, Chen K, Zhou X, Pan M. Hybrid method integrating machine learning and particle swarm optimization for smart chemical process operations. Frontiers of Chemical Science and Engineering, 2022, 16(2): 274–287
[4] Chee E, Wong W C, Wang X. An integrated approach for machine-learning-based system identification of dynamical systems under control: application towards the model predictive control of a highly nonlinear reactor system. Frontiers of Chemical Science and Engineering, 2022, 16(2): 237–250
[5] Li J, Lim K, Yang H, Ren Z, Raghavan S, Chen P, Buonassisi T, Wang X. Applications through the whole life cycle of material discovery. Matter, 2020, 3(2): 393–432
[6] Bertolini M, Mezzogori D, Neroni M, Zammori F. Machine learning for industrial applications: a comprehensive literature review. Expert Systems with Applications, 2021, 175: 114820
[7] Guo H, Wu S, Tian Y, Zhang J, Liu H. Application of machine learning methods for the prediction of organic solid waste treatment and recycling processes: a review. Bioresource Technology, 2021, 319: 124114
[8] Inderwildi O, Zhang C, Wang X, Kraft M. The impact of intelligent cyber-physical systems on the decarbonization of energy. Energy & Environmental Science, 2020, 13(3): 744–771
[9] Lu S, Zhou Q, Ouyang Y, Guo Y, Li Q, Wang J. Accelerated discovery of stable lead-free hybrid organic-inorganic perovskites via machine learning. Nature Communications, 2018, 9(1): 1–8
[10] Xu S, Li J, Cai P, Liu X, Liu B, Wang X. Self-improving photosensitizer discovery system via Bayesian search with first-principle simulations. Journal of the American Chemical Society, 2021, 143(47): 19769–19777
[11] Li J, Telychko M, Yin J, Zhu Y, Li G, Song S, Yang H, Li J, Wu J, Lu J, Wang X. Machine vision automated chiral molecule detection and classification in molecular imaging. Journal of the American Chemical Society, 2021, 143(27): 10177–10188
[12] Oviedo F, Ren Z, Sun S, Settens C, Liu Z, Hartono N T P, Ramasamy S, DeCost B L, Tian S I P, Romano G, et al. Fast and interpretable classification of small X-ray diffraction datasets using data augmentation and deep neural networks. npj Computational Materials, 2019, 5(1): 1–9
[13] Schwaller P, Probst D, Vaucher A C, Nair V H, Kreutter D, Laino T, Reymond J L. Mapping the space of chemical reactions using attention-based neural networks. Nature Machine Intelligence, 2021, 3(2): 144–152
[14] Li J, Chen T, Lim K, Chen L, Khan S A, Xie J, Wang X. Deep learning accelerated gold nanocluster synthesis. Advanced Intelligent Systems, 2019, 1(3): 1900029
[15] Ren Z, Tian S I P, Noh J, Oviedo F, Xing G, Liang Q, Zhu R, Aberle A, Sun S, Wang X. An invertible crystallographic representation for general inverse design of inorganic crystals with targeted properties. SSRN, 2021. doi:10.2021.ssrn.3862821
[16] Suvarna M, Büth L, Hejny J, Mennenga M, Li J, Ng Y T, Herrmann C, Wang X. Smart manufacturing for smart cities—overview, insights, and future directions. Advanced Intelligent Systems, 2020, 2(10): 2000043
[17] Gajjar S, Kulahci M, Palazoglu A. Real-time fault detection and diagnosis using sparse principal component analysis. Journal of Process Control, 2018, 67: 112–128
[18] Tan D, Suvarna M, Shee Tan Y, Li J, Wang X. A three-step machine learning framework for energy profiling, activity state prediction and production estimation in smart process manufacturing. Applied Energy, 2021, 291: 116808
[19] Wong W, Chee E, Li J, Wang X. Recurrent neural network-based model predictive control for continuous pharmaceutical manufacturing. Mathematics, 2018, 6(11): 242
[20] Evans R, Gao J. DeepMind AI reduces Google data centre cooling bill by 40%. DeepMind, 2016
[21] Jain S, Presto A A, Zimmerman N. Spatial modeling of daily PM2.5, NO2, and CO concentrations measured by a low-cost sensor network: comparison of linear, machine learning, and hybrid land use models. Environmental Science & Technology, 2021, 55(13): 8631–8641
[22] Tuttle J F, Blackburn L D, Andersson K, Powell K M. A systematic comparison of machine learning methods for modeling of dynamic processes applied to combustion emission rate modeling. Applied Energy, 2021, 292: 116886
[23] Heo S K, Nam K J, Tariq S, Lim J Y, Park J, Yoo C K. A hybrid machine learning-based multi-objective supervisory control strategy of a full-scale wastewater treatment for cost-effective and sustainable operation under varying influent conditions. Journal of Cleaner Production, 2021, 291: 125853
[24] Yan B, Liang R, Li B, Tao J, Chen G, Cheng Z, Zhu Z, Li X. Fast identification and characterization of residual wastes via laser-induced breakdown spectroscopy and machine learning. Resources, Conservation and Recycling, 2021, 174: 105851
[25] Li J, Zhu X, Li Y, Tong Y W, Ok Y S, Wang X. Multi-task prediction and optimization of hydrochar properties from high-moisture municipal solid waste: application of machine learning on waste-to-resource. Journal of Cleaner Production, 2021, 278: 123928
[26] Li J, Pan L, Suvarna M, Wang X. Machine learning aided supercritical water gasification for H2-rich syngas production with process optimization and catalyst screening. Chemical Engineering Journal, 2021, 426: 131285
[27] Yuan X, Suvarna M, Low S, Dissanayake P D, Lee K B, Li J, Wang X, Ok Y S. Applied machine learning for prediction of CO2 adsorption on biomass waste-derived porous carbons. Environmental Science & Technology, 2021, 55(17): 11925–11936
