1. College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China; National Key Laboratory of Equipment State Sensing and Smart Support, National University of Defense Technology, Changsha 410073, China
2. School of Management, Zhengzhou University, Zhengzhou 450001, China
baiguanghan@nudt.edu.cn
Received: 2025-10-07
Accepted: 2026-01-28
Published Online: 2026-04-15
Abstract
A fully automated production line can halt suddenly, not because of a catastrophic breakdown, but because of the subtle interplay of equipment degradation, fluctuating feedstock quality, and environmental conditions. These factors accumulate and eventually trigger random outages before the designed service life is reached. We term this phenomenon the “coupled degradation cascading effect,” in which micro-disturbances silently propagate and are amplified. In this paper, we analyze this cascade with a novel performance framework that integrates the machine-feedstock-environment coupled degradation cascading effect in Industrial Internet of Things (IIoT)-based smart manufacturing systems (SMSs). By analyzing this multi-factor coupled degradation cascading process, we develop a coupled degradation model to quantify the machine failure probability after the accumulation and amplification of incremental damage. In addition, various operation and maintenance (O&M) activities are incorporated to mitigate the adverse impacts of such coupled degradation. We further propose two performance metrics, product-oriented performance efficiency (PEP) and order-oriented performance efficiency (PEO), to enable a more realistic and comprehensive assessment of system performance. A case study of a flexible job shop manufacturing system for high-voltage electrical apparatus demonstrates that the median life of machines under machine-feedstock-environment coupled degradation falls short of their designed median life. The proposed methodology also reveals the practical implications of the machine-feedstock-environment coupled degradation, highlighting its influence on machine reliability, system O&M status, and the overall performance efficiency of SMSs. These insights can inform the design, management, and evaluation of SMSs in the Industry 4.0 era.
Smart manufacturing systems (SMSs) are a new paradigm in Industry 4.0 driven by the Industrial Internet of Things (IIoT) (Sahoo and Lo 2022), industrial cyber-physical systems (Jbair et al. 2022), artificial intelligence (Fan et al. 2025), and digital twins (Huang et al. 2024), and are evolving toward high interconnectivity, dynamic optimization, and intelligent decision-making (Xiang et al. 2024; Zhang et al. 2024). These advancements have led to substantial changes in global manufacturing technology systems through the extensive use of large-scale intelligent devices in the manufacturing process (Fu et al. 2024). SMSs, which transform design blueprints into physical products, directly determine an enterprise’s product quality, delivery capability, and market competitiveness through their operational performance. However, the globalization of the manufacturing industry, the push for resilient supply chains, and increasing demands for small-batch, short-period, and customized products create complexities and fluctuations in both external and internal manufacturing environments, which pose great challenges to manufacturing enterprises (Guo et al. 2023). As system complexity and environmental volatility surge, the vulnerabilities within SMSs are becoming more apparent. The synergistic interactions among machine degradation, fluctuations in feedstock quality, and uncertain environmental factors form a complex dynamic risk pattern that threatens the continuity and stability of manufacturing processes.
Much research on manufacturing systems treats factors such as machine degradation, quality fluctuations, and environmental disturbances independently, thereby neglecting their coupled degradation cascading effects. Recent studies have begun to explore the coupled relationships by considering pairwise factors, e.g., machine-feedstock interdependence (Ye et al. 2022). However, these studies still fail to capture the dynamic interactions and risk propagation mechanisms present in real-world manufacturing environments. Degradation analysis that ignores the coupling among machine, feedstock, and environment may underestimate system degradation rates, misjudge maintenance timing, and allow quality risks to propagate. For example, substandard feedstocks may accelerate machine wear under specific environmental disturbances, while declining machine performance further increases the probability of quality defects. This creates a vicious cycle of positive feedback that ultimately manifests as rapid performance deterioration, significant operational volatility, and increased maintenance costs. Therefore, developing a comprehensive framework that better reflects the dynamic machine-feedstock-environment coupled degradation, and thereby overcomes the limitations of the existing literature, has become a critical challenge in SMS operations and maintenance management.
To address the challenge above, this paper systematically investigates the performance of SMSs under multi-source coupled failures. We develop a machine-feedstock-environment coupled degradation model and a dual performance evaluation method with product-oriented and order-oriented perspectives, which together provide theoretical foundations and decision-making support for ensuring reliable, efficient, and low-cost operation of SMSs under multiple disturbances.
1.2 Literature review
Smart manufacturing is the application of advanced smart technologies that enable rapid and stable manufacturing of new products, dynamic response to personalized product demands, and real-time optimization of production and supply chain networks (Wang et al. 2021a). The prevailing trends toward the globalization of the manufacturing industry, the escalating market demands for customized products, and the strategic emphasis on building resilient supply chains collectively contribute to a manufacturing landscape characterized by complexities and fluctuations. These dynamics introduce uncertainties into both external and internal operational environments, thereby amplifying the inherent vulnerabilities of SMSs. Within these intricate systems, synergistic interactions among the degradation of machines, fluctuations in feedstock quality, and uncertain environmental factors are highly prone to occur. The complex interdependencies can precipitate unexpected failure modes that compromise production stability. Therefore, to effectively prevent and mitigate these risks, there is a pressing need for a clear theoretical framework capable of quantifying the synergistic interactions and their cumulative impact on system reliability.
SMSs are dynamic systems that achieve product production and value creation through orchestrated activities. Producing high-quality products consistently is crucial in manufacturing, as discarding or reprocessing low-quality products increases waste and reduces overall production efficiency (Banerjee et al. 2025). Quality 4.0 is an emerging concept that deals with aligning quality management practices with the emergent capabilities of Industry 4.0 to improve cost, time, and efficiency and increase product quality (Liu et al., 2023). However, achieving quality 4.0 is challenging due to the dynamic nature of SMSs. Among the various factors that influence product quality, the health condition of production machines stands out as a fundamental and time-varying element. Deterioration in machine precision leads to deviations in product quality, and the imperfect inspection results in unreliable quality judgments, which cause the propagation of unqualified products and aggravate the degradation of downstream machines (Ye et al., 2019). This intrinsic interplay between the machine’s state and the feedstock’s quality forms a critical feedback loop within SMSs. Therefore, a deep investigation into the interaction between machine degradation and product quality is essential, as it provides the foundational insight needed for proactive quality management and precise maintenance interventions. Ye et al. (2020) analyzed the competition between continuous degradation (e.g., wear and aging) and discrete shocks (e.g., substandard inputs), quantifying the impact of imperfect quality inspection on competing failures in serial manufacturing systems. As an extension of simple serial systems, Ye et al. (2021) investigated machine degradation trajectories in series-parallel systems under high/low-quality feedstocks and established a probabilistic model for manufacturing system states, including full production, reduced production, and failure. 
As manufacturing systems shift toward networked and flexible paradigms, manufacturing networks have attracted significant attention from the research community. Lin and Chang (2013) constructed manufacturing systems as manufacturing networks based on stochastic flow network models. Chang et al. (2017) employed activity-on-arc diagrams to represent manufacturing networks, where each arc denotes a workstation consisting of identical machines, and each node represents an inspection point. Liu et al. (2022) developed an adaptive evaluation network for digital twin machining systems, modeling decision errors along process routes as a network. In smart manufacturing networks, the interaction between machine degradation and product quality persists. The network topology formed by process routes among machines in manufacturing networks is more complex than the connectivity relationships in series-parallel systems. Consequently, substandard output caused by precision deterioration in upstream machines becomes the input of downstream machines and accelerates their degradation. In turn, machine degradation exacerbates the generation of defective products in manufacturing networks. Therefore, the propagation of substandard feedstocks along process routes can trigger a cascading effect. Ye et al. (2022) constructed a weighted adjacency matrix to represent the topology of the manufacturing network, thereby modeling the dependency relationships between machines and feedstocks in manufacturing networks. Yao et al. (2024) developed an extended stochastic flow manufacturing network model based on the operational mechanisms of multi-state manufacturing systems that account for heterogeneous feedstocks. Cao et al. (2025) proposed joint optimization of quality-based multi-level maintenance and buffer stock within multi-specification and small-batch production. In parallel, Ye et al. (2023) integrated four-dimensional information (quality, degradation, failure, and idle status) through machine state vectors, and proposed a joint optimization method for manufacturing network maintenance and quality inspection using deep reinforcement learning. However, most of the aforementioned models treat manufacturing networks as static homogeneous systems, failing to account for machine heterogeneity and product diversity. In flexible manufacturing networks, the growing variety of products increases the interdependence complexity between machine reliability and product quality. Product diversity and machine flexibility cause the number of process routes and sequences to explode, thereby widening the gap between current performance evaluation and engineering practice in SMSs. To tackle complex flexible manufacturing networks, a recent study (Wang et al. 2024) proposed a mathematical framework that describes a complete manufacturing process involving feedstock quality, machine degradation, product processing quality, and quality inspection status. The framework measures the interacting behaviors between machine reliability and product quality under the simultaneous influence of multiple process routes and product types, and evaluates connectivity risks across multiple process routes and quality risks for diverse product types. Although the aforementioned studies consider the competition mechanism between internal degradation and discrete internal shocks within complex manufacturing networks, these models fail to incorporate the impact of external environmental factors or random shocks on SMSs. Consequently, developing more comprehensive degradation models that integrate these influencing factors is crucial.
Based on the critical need to incorporate external factors, the competing failure model provides an analytical framework for understanding the aforementioned complex interactions in SMSs. As a mission-oriented engineering system, SMSs inevitably experience degradation and failures over their service life. The deterioration stems from two primary sources (Dui et al. 2024a): continuous internal wear and aging, and external random shocks such as mechanical vibrations or thermal fluctuations. The two factors do not act in isolation; instead, they compete to cause system failure. The occurrence of either failure mode will lead to system breakdown, thereby precluding the opportunity for the other failure mode to manifest. Namely, both failure modes collectively determine system reliability in a competitive manner. Traditional competing failure models often assume a fixed impact of shocks, while recent studies have introduced more advanced features. Wang et al. (2020c) innovatively proposed that the current degradation state of an engineering system modulates the consequences of shocks, and developed a continuous state-dependent model along with a closed-form reliability function. Considering that the shock-induced damage is influenced by the current degradation state of the engineering system, the bidirectional dependency mechanism between natural degradation and shocks has been investigated (Wang et al. 2020b). Due to the limitations of closed-form solutions for continuous models, Wang et al. (2020a) proposed a discrete recursive framework and established a general discrete degradation model, in which the effects of non-fatal shocks depend on the current degradation state. The system’s capacity to withstand shocks varies dynamically across different degradation phases, thus necessitating the differentiation of shock effects at various stages of degradation. 
Thus, a multi-phase degradation-shock interaction model has been further developed (Wang et al. 2021b; Geng et al. 2023). Wu et al. (2023) revealed the coupled effects of dynamic environments and zonal shocks on systems under dependent failure processes. The aforementioned competing failure models that couple natural degradation with external shocks provide a universal framework for performance analysis of complex engineering systems. Nevertheless, these models exhibit limited explanatory power when applied to SMSs for two reasons. (1) They assume that shocks are exogenous independent events and neglect endogenous shock chains propagated through manufacturing networks, such as substandard feedstocks transmitted along production processes, where output from upstream equipment becomes a source of accelerated degradation for downstream equipment. (2) They leave the physical significance of random shocks undefined, whereas in SMSs it is essential to specify the concrete physical implications of shocks, such as the specific impact of inferior feedstocks on machining accuracy or the effect of environmental factors on machine performance. In summary, SMSs face a dual degradation mechanism. (1) Internal causes include natural degradation (e.g., wear and aging) and internal shocks (e.g., substandard feedstocks). (2) External causes encompass progressive environmental disturbances (PED) (e.g., temperature and humidity accelerating equipment aging) and sudden external shocks (e.g., earthquakes and power outages). However, existing research has not yet integrated these two aspects into a unified modeling framework, leading to deviations in the performance evaluation of SMSs. Therefore, this paper models machine-feedstock-environment degradation by incorporating internal shock mechanisms and external factors to endow the competing failure model of SMSs with comprehensive physical significance.
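To make the competing failure idea reviewed above more concrete, the following is a minimal Monte Carlo sketch in which continuous wear competes with random shocks. It is a generic illustration only, not the model of any cited study and not the coupled degradation model developed in this paper. Linear wear, Poisson shock arrivals, exponentially distributed shock loads, and every function name, parameter name, and parameter value are assumptions made for the example.

```python
import math
import random


def simulate_failure_time(rng, horizon, wear_rate, shock_rate,
                          soft_threshold, hard_threshold,
                          mean_shock_load, damage_per_load):
    """Return the first failure time, or math.inf if the unit survives `horizon`.

    Assumed (hypothetical) mechanics:
      - soft failure: linear wear plus accumulated shock damage reaches
        `soft_threshold`;
      - hard failure: a single shock load exceeds `hard_threshold`.
    Whichever occurs first ends the unit's life.
    """
    t = 0.0
    shock_damage = 0.0  # cumulative damage from non-fatal shocks
    while True:
        next_shock = t + rng.expovariate(shock_rate)
        # Time at which wear_rate * time + shock_damage hits the soft threshold.
        soft_time = (soft_threshold - shock_damage) / wear_rate
        if soft_time <= min(next_shock, horizon):
            return soft_time  # soft failure between shocks
        if next_shock > horizon:
            return math.inf  # survived the whole horizon
        t = next_shock
        load = rng.expovariate(1.0 / mean_shock_load)
        if load > hard_threshold:
            return t  # hard failure caused by a fatal shock
        shock_damage += damage_per_load * load
        if wear_rate * t + shock_damage >= soft_threshold:
            return t  # soft failure triggered by the damage jump


def reliability(failure_times, t):
    """Empirical probability that a unit is still working at time t."""
    return sum(ft > t for ft in failure_times) / len(failure_times)


rng = random.Random(42)
failure_times = [
    simulate_failure_time(rng, horizon=100.0, wear_rate=1.0, shock_rate=0.1,
                          soft_threshold=120.0, hard_threshold=8.0,
                          mean_shock_load=2.0, damage_per_load=3.0)
    for _ in range(2000)
]
print(reliability(failure_times, 20.0), reliability(failure_times, 60.0))
```

Because each sample records a single failure time, the empirical reliability is non-increasing in time by construction. Making the shock damage depend on the current degradation state, as in the state-dependent models reviewed above, would require replacing the constant `damage_per_load` with a function of the current wear level.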
When exploring the performance of SMSs, researchers predominantly focus their analytical perspectives on system-wide performance and risk interdependencies. Specifically, attention is placed on considering inherent and external risks (such as product quality, machine degradation, sudden events, and environmental conditions) as well as operational factors including maintenance strategies, rework processes, queuing phenomena, and order demand management, as shown in Table 1. Recent studies have demonstrated the effectiveness of quality control in reducing the long-term operational costs of manufacturing systems (Lv et al. 2024). The low-quality feedstocks trigger an interdependence between product quality and machine reliability, which further adversely affects the performance of SMSs. Within this dependency mechanism, Yao et al. (2024) quantified the interactions among processing machines, inspection machines, buffers, and heterogeneous feedstocks in multistate manufacturing networks. Gao et al. (2025) proposed a preventive maintenance (PM) decision-making method for manufacturing systems considering feedstock heterogeneity. Dui et al. (2025b) developed a multi-objective maintenance optimization approach considering heterogeneous feedstocks and human error. Current research on SMSs predominantly focuses on internal causes of degradation, while relatively less attention is paid to external sudden events and environmental factors. Awad and Hassan (2018) assessed the impact and consequences of climate change and extreme weather on manufacturing processes through scenario simulations. Such internal or external disturbances often cause the system to deviate from its normal operating range. If the system can promptly identify and recover from these disturbances, it may maintain normal operation. Failure to detect and mitigate such disruptions may push the system into a disturbed state. 
Without timely intervention to restore normalcy, an unstable system may further deteriorate into a catastrophic state (Leng et al. 2025). Therefore, modeling system degradation, proposing maintenance strategies, and enhancing system performance remain critical research priorities. Lu and Zhou (2019) evaluated the system failure rate by monitoring quality-related components in serial multistage manufacturing systems, and further compared the failure rate with the threshold to decide whether to trigger PM. For rapid damage recovery, Feng et al. (2022) developed a dual-strategy recovery method including rescheduling and perfect corrective maintenance (CM) for flexible job shop systems. Bahria et al. (2019) proposed integrated production-quality-maintenance policies for batch manufacturing systems where maintenance strategies include perfect PM and CM. During machine failures and maintenance, the processing of feedstocks is interrupted. In queuing theory, such an event can be described as a server suspending service to a customer. Yang and Wu (2017) studied a finite-capacity queueing system with working breakdowns, reneging, and retention of impatient customers, in which the server is subject to breakdowns and repairs while serving a customer. Steinbacher et al. (2023) incorporated PM with varying time intervals into the production queue of a failure-prone semiconductor manufacturing system. During the dynamic operation of SMSs, the steady state of queuing models deserves particular attention. Kim and Park (2019) investigated the steady-state behavior of a flow line model in flexible manufacturing systems with deterministic service times, in which a single production line can manufacture products of different types and batch sizes. Although maintenance strategies can restore system performance, frequent machine failures and large quantities of substandard feedstocks inevitably lead to quality losses and cost increases. Therefore, Ye et al. (2022) evaluated performance indicators, including network path connectivity and quality loss in manufacturing networks under the impact of substandard feedstocks. To conserve production resources, SMSs often send substandard products detected by inspection machines to be reworked rather than scrapping them directly. Jia et al. (2023) focused on the real-time performance of rework production systems for small-batch orders. Zhu et al. (2020) studied the real-time performance of multi-stage serial manufacturing systems with rework loops and stochastic machine failures. Chiu et al. (2023) investigated a manufacturing system featuring quality assurance through rework, defect elimination, and corrective action for probabilistic failures, and argued that reworking defects and implementing corrective measures for unexpected failures can prevent manufacturing delays, thereby helping to meet external customers’ demands for quality and short order delivery times (Chiu et al. 2022). Although existing literature has explored many factors influencing the performance of manufacturing systems (as summarized in Table 1), these studies remain relatively fragmented and have yet to coalesce into a systematic and comprehensive framework. Smart manufacturing aims to establish a collaborative and integrated platform that introduces flexibility into product design and manufacturing processes, thereby better addressing customer needs (Wang et al. 2021c). However, when key influencing factors are not fully considered, performance evaluations of SMSs are inevitably biased. Therefore, this paper seeks to construct a relatively comprehensive framework for SMSs to enable more accurate assessment of the systems’ overall performance.
1.3 Research motivation and main contributions
With the increasing complexity of manufacturing systems and environmental volatility, the inherent vulnerabilities within SMSs have become pronounced. The interactions among machine degradation, feedstock quality fluctuations, and environmental uncertainties form complex dynamic risk patterns, threatening the continuity and stability of manufacturing processes. The machine-feedstock-environment coupled degradation cascading effects have severe practical consequences. Substandard feedstocks can accelerate machine wear under specific environmental disturbances, while declining machine performance, in turn, further increases the probability of product quality defects, creating a vicious cycle. This leads to rapid performance deterioration, increased operational instability, and rising maintenance costs. To ensure the reliable, efficient, and low-cost operation of SMSs under multiple disturbances, there is an urgent need for comprehensive performance analysis frameworks and decision-support tools that can more accurately reflect the machine-feedstock-environment coupled effects. Based on the literature review, although existing research has made significant progress in degradation modeling and performance evaluation of SMSs, the following limitations indicate that the issue requires further in-depth investigation. First, there is a lack of a unified modeling framework for the synergistic mechanisms of internal and external risk factors (such as fluctuations in feedstock quality, environmental disturbances, and sudden shocks). Secondly, most models fail to adequately consider the comprehensive impact of product diversity, internal and external manufacturing environments, system operational factors, and dynamic networked characteristics on system performance. 
Given the highlighted system vulnerabilities, the severe practical impacts, the pressing industrial needs, and the identified shortcomings in existing research, this paper addresses the problem of how to develop a more systematic performance analysis framework for SMSs that considers machine-feedstock-environment coupled degradation. This paper centers on the development of a systematic analytical methodology and model. It does not delve into empirical validation, discussions of cross-industry generalizability, or economic analysis beyond modeling and simulation.
The SMS constructed in this paper is a flexible job shop manufacturing system (FJSMS). Through deep integration of the IIoT, the system establishes a networked production environment. Considering the machine-feedstock-environment coupled degradation and operational decision factors, a dual performance efficiency evaluation model for SMSs is proposed, which provides a new analytical perspective for evaluating system performance. The main contributions of this paper are listed below.
(1) This paper innovatively models machine-feedstock-environment coupled degradation in SMSs by extending Wang et al.’s competing failure model and integrating Ye et al.’s model of internal shocks triggered by substandard feedstocks.
(2) A comprehensive framework is developed that systematically incorporates inherent and external risks—such as feedstock quality, machine degradation, shocks, and environmental factors—while considering operational factors including multi-product and multi-route production, maintenance, rework processes, queuing phenomena, and order demand management.
(3) A dual-perspective performance efficiency model, including product-oriented performance efficiency (PEP) metric and order-oriented performance efficiency (PEO) metric in SMSs, is introduced. PEP is defined as the ratio of the actual total output of an SMS relative to the total demand for all products, reflecting the degree of matching between the system’s output and the total order demand after aggregating all product types. PEO refers to the proportion of successfully completed production missions relative to the total number of missions executed across multiple production operations in an SMS, reflecting the reliability of the flexible order fulfillment.
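The two metrics defined in contribution (3) can be sketched as simple ratios. The snippet below is only an illustration of the verbal definitions given here, not the formal model developed later in the paper. The function names, the dictionary-based data layout, and in particular the rule that a production mission counts as successfully completed only when the output of every product type meets its demand are assumptions introduced for the example.

```python
def pep(outputs, demands):
    """Product-oriented performance efficiency for one production mission.

    Ratio of actual total output to total demand, aggregated over all
    product types listed in `demands` (hypothetical data layout).
    """
    total_demand = sum(demands.values())
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    total_output = sum(outputs.get(product, 0) for product in demands)
    return total_output / total_demand


def mission_completed(outputs, demands):
    """Assumed success rule: every product type meets its demand."""
    return all(outputs.get(product, 0) >= qty for product, qty in demands.items())


def peo(missions):
    """Order-oriented performance efficiency over several missions.

    `missions` is a list of (outputs, demands) pairs; the result is the
    share of missions that are successfully completed.
    """
    if not missions:
        raise ValueError("at least one mission is required")
    completed = sum(1 for out, dem in missions if mission_completed(out, dem))
    return completed / len(missions)


demands = {"A": 100, "B": 50}
missions = [
    ({"A": 90, "B": 50}, demands),   # product A falls short of its demand
    ({"A": 100, "B": 50}, demands),  # all demands are met
]
print(pep(*missions[0]), peo(missions))
```

In this toy example the first mission reaches a high PEP because most of the aggregate demand is produced, yet it does not count toward PEO, which shows why the two perspectives can diverge.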
1.4 Overview
The remainder of this paper is organized as follows. Section 2 develops a framework for IIoT-based SMSs and a networked model. Section 3 models the machine-feedstock-environment coupled degradation. Section 4 proposes maintenance strategies for degraded SMSs. Section 5 investigates the performance of SMSs. Section 6 studies a case and discusses the results. Section 7 concludes the paper.
2 Framework and network modeling of SMSs under IIoT
2.1 Framework for SMSs under IIoT
Smart manufacturing is the application of IIoT, cloud computing, cyber-physical systems, and big data analytics, enabled by sensors and communication technologies that capture data at all levels and stages of manufacturing (Wang et al. 2021a). The robust capabilities in data acquisition, transmission, analysis, and application help the construction of flexible manufacturing networks capable of dynamically scheduling multiple process routes. In particular, recent advances in IIoT, along with the widespread adoption of embedded processors and sensors in factories, have made it possible to collect real-time manufacturing status data and build industrial cyber-physical systems for intelligent, flexible, and resilient SMSs (Guo et al. 2023). Based on the IoT architecture provided by Xing (2020) and Dui et al. (2024a; 2025a; 2026), this paper develops an architecture for SMSs under IIoT, which includes the physical layer, perception layer, communication layer, support layer, and application layer, as shown in Fig. 1.
Through the deep collaboration of the five-layer architecture shown in Fig. 1, the SMS can maintain high productivity and resilience under complex disturbances, progressively achieving the evolutionary goals of self-adaptation, self-healing, and self-optimization. The foundation of the system is the physical layer, which comprises physical entities such as CNC machine tools, industrial robots, visual sensing devices, and online inspection equipment, etc. The physical layer is responsible for executing machining commands and achieving process-level closed-loop control, thereby ensuring real-time precision in the manufacturing process. The perception layer relies on a multi-source sensor network to collect real-time data on equipment vibration, temperature, operational status, and product quality characteristics, and performs preliminary data screening. The communication layer employs standardized industrial communication protocols such as 5G, time-sensitive networking, industrial Ethernet, and OPC UA to establish a low-latency and high-reliability data transmission channel, which enables the secure and efficient aggregation of perceptual data to the industrial cloud platform. The support layer, based on a cloud-edge collaborative architecture, integrates data lakes, distributed computing, and digital twin models, which achieve unified management, visual traceability, and intelligent analysis of product lifecycle data, including machining parameters, quality inspection results, rework records, and scrap status. The application layer integrates data from manufacturing execution systems (MES), enterprise resource planning, and supply chain management systems, and then optimizes the allocation of production resources and enables real-time evaluation of system performance.
This five-layer architecture collectively supports a concrete SMS centered around a multi-process route flexible manufacturing network. The system integrates heterogeneous machines and multi-type products, enabling coordinated multi-process route production through processing, quality inspection, and product diversion. Process routes for different products converge at shared machines, forming queues in buffer areas. Equipped with embedded vision devices and intelligent inspection units, the system evaluates product quality in real time and classifies items into three categories: qualified, reworkable, or scrapped. The SMS then coordinates with the rework system for parameter adjustments or disposal. Reworkable defects (such as dimensional deviations) autonomously trigger rework procedures, with products re-entering the main process after repair. Non-reworkable defects are scrapped immediately. It is important to note that quality inspection is imperfect. Once the defective output is misjudged and flows downstream for processing, it can accelerate machine degradation, which is further exacerbated in harsh environments. The degraded machines, in turn, lead to more defective output due to reduced machining accuracy, creating a vicious cycle. To address equipment wear, process deviations, and environmental disturbances, in equipment health management, the SMS implements PM by proactively replacing high-risk components and implements CM based on the machine maintenance priority index during unexpected failures. Additionally, the SMS can intelligently evaluate comprehensive system performance in real-time by analyzing and integrating the machine degradation and maintenance history. Further, the historical and current performance data of the SMS can be collected, analyzed, and utilized so as to guide scheduling, maintenance, and quality control in manufacturing system operations. 
Ultimately, through a closed loop of “perception–communication–decision–execution,” the system deeply integrates quality traceability, environmental awareness, and maintenance decision-making, which continuously enhances resource utilization and order response capability amid internal and external fluctuations. This paves the way toward a highly resilient smart manufacturing future. Based on this, the overall research framework for the proposed SMS is shown in Fig. 2.
2.2 Dynamic directed network modeling of SMSs
To quantitatively model and analyze the aforementioned system, the system is abstracted as a complex manufacturing network composed of heterogeneous machines and heterogeneous product process routes, whose structure can dynamically adapt to production batches and material flows. In this paper, nodes represent manufacturing units, while edges denote the direction of process routes. The network can be formally defined as a tuple . Across different production batches, products are processed by specialized or flexible machines according to their process plans and real-time scheduling. This results in a system structure that, from a spatial perspective, forms a directed network, and from a temporal perspective, constitutes a dynamic network (Zhang and Zhang 2017). represents a set of manufacturing units, where . is a set of buffer areas. is a set of processing machines. is a set of quality inspection equipment. A manufacturing unit , where , consists of one buffer area, one processing machine, and its corresponding quality inspection equipment, which is abstracted as a single node in the network. refers to a set of product types. represents a set of all process routes. The process route for a product can be represented as a set of directed edges between nodes. For example, a specific route may be denoted as {}. In SMSs, using machine identifiers to represent process routes is also a concise and efficient method, particularly suited to complex manufacturing environments involving multi-machine collaboration and flexible production. Thus, a process route can alternatively be expressed as “.” The machine dependency relationships in process routes are denoted by representing a predecessor set of manufacturing unit and representing a successor set. The connectivity between nodes is described by the matrix , where the element indicates the state of the edge . . indicates that there exists an effective connection from to and both are functioning properly. 
indicates that no effective connection exists between the two manufacturing units, or the connection is abnormal. It is important to note that although product process routes are fixed, the equipment required may vary across production batches. As a result, the corresponding manufacturing network structure dynamically changes accordingly.
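The connectivity matrix and the predecessor and successor sets described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the unit names, the helper functions `build_adjacency`, `successors`, and `predecessors`, and the two example batches are all hypothetical, and the sketch assumes that a matrix entry of 1 marks an effective connection and 0 marks a missing or abnormal one.

```python
def build_adjacency(units, route_edges):
    """Build a 0/1 connectivity matrix for one production batch.

    units       -- ordered list of manufacturing-unit names (hypothetical labels)
    route_edges -- iterable of (source, target) pairs taken from the process routes
    """
    index = {u: k for k, u in enumerate(units)}
    n = len(units)
    a = [[0] * n for _ in range(n)]
    for src, dst in route_edges:
        a[index[src]][index[dst]] = 1  # 1 = effective connection, 0 = none/abnormal
    return a


def successors(units, a, unit):
    """Units that directly receive products from `unit`."""
    i = units.index(unit)
    return {units[j] for j, v in enumerate(a[i]) if v == 1}


def predecessors(units, a, unit):
    """Units that directly feed products into `unit`."""
    j = units.index(unit)
    return {units[i] for i in range(len(units)) if a[i][j] == 1}


units = ["M1", "M2", "M3", "M4"]
# The same product route served by different machines in two batches:
batch_1 = build_adjacency(units, [("M1", "M2"), ("M2", "M4")])
batch_2 = build_adjacency(units, [("M1", "M3"), ("M3", "M4")])
```

Rebuilding the matrix per batch is one simple way to represent the temporal dimension: the node set stays fixed while the edge set changes with the machines actually used.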
Products flow through the network following their process routes. Differences in production speed across machines and resource sharing cause work-in-process (WIP) to form queues in buffer areas. Each machine and its buffer can be modeled as a single-server queuing system with an infinite customer population. WIP items enter the queue independently with interarrival times following a certain distribution. If the machine is occupied, WIP items wait in the queue, following the First-Come-First-Served (FCFS) discipline. Service times are assumed to be deterministic. After processing, the product quality index is a continuous random variable following a normal distribution, . The specification limits are defined symmetrically around the mean . Based on the value of , product quality status is classified according to the following rules:
(1) Qualified. This qualified status is assigned if and only if the value of falls within the specification limits . The probability of this event, , is defined as .
(2) Defective. This defective status occurs if falls outside the limits , either below or above . The probability of this event is calculated as . This status is further classified into two categories based on rework criteria:
(i) Defective and reworkable. The reworkable status is assigned if the product is defective and can be reworked. The probability is .
(ii) Defective and non-reworkable (i.e., scrapped). The status is assigned if the product is defective and cannot be reworked. The probability is , where .
The probabilities are fundamentally derived from the cumulative distribution function of the normal distribution of , the positions of and , and the specific rework policy that partitions the defective region into reworkable and non-reworkable areas.
The inspection equipment has a misjudgment probability . The probability that a product passes inspection, whether truly qualified or misjudged, is defined as the pass probability, denoted by . If no actual inspection is performed after a processing step, it can be abstracted as a hypothetical inspection unit with a pass probability = 1.
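The quality classification above can be illustrated numerically. The sketch below assumes a normally distributed quality index and, as a simplification that is not taken from the paper, a fixed share `rework_share` of defective products that can be reworked; the function name and all parameter values are hypothetical.

```python
from statistics import NormalDist


def quality_probabilities(mu, sigma, lsl, usl, rework_share):
    """Return (qualified, reworkable, scrapped) probabilities.

    lsl, usl     -- lower and upper specification limits
    rework_share -- assumed fraction of defective products that can be reworked
    """
    dist = NormalDist(mu, sigma)
    p_qualified = dist.cdf(usl) - dist.cdf(lsl)  # mass inside the limits
    p_defective = 1.0 - p_qualified              # mass in both tails
    p_rework = rework_share * p_defective
    p_scrap = p_defective - p_rework
    return p_qualified, p_rework, p_scrap


# Specification limits placed three standard deviations either side of the mean.
p_q, p_r, p_s = quality_probabilities(mu=10.0, sigma=0.1, lsl=9.7, usl=10.3,
                                      rework_share=0.6)
```

With limits at three standard deviations, `p_q` is about 0.9973, the familiar three-sigma coverage.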
3 Machine-feedstock-environment coupled degradation model for SMSs
In SMSs, the two typical failure modes are abrupt failures and gradual degradation. Abrupt failures are triggered by sudden events (e.g., earthquakes, power outages, or software crashes) and are characterized by an instantaneous loss of machine function or a cliff-like drop in performance. These failures are unpredictable and highly disruptive, and they require emergency maintenance for rapid response. Gradual degradation stems from long-term cumulative effects (e.g., bearing wear, lubrication failure, or electronic component aging) and manifests as a slow decline in performance over time. This mode can be managed proactively through real-time monitoring and health prediction (e.g., remaining useful life estimation), so that maintenance can be planned in advance.
The above two failure modes originate from the dual influence of internal degradation and external disturbances. Internal degradation refers to the decline in machine performance caused by inherent factors within the system, manifesting as material fatigue, component wear, or functional deterioration. Its driving force stems from the physical interaction between the equipment’s operational processes and the processed objects. Internal degradation is gradual and primarily classified into two categories. The first type of internal degradation includes natural aging, wear, loosening, etc. The second type of internal degradation is additional stress induced by processing substandard feedstocks, which accelerates machine degradation through physical interactions. External disturbances arise from natural environments or uncontrollable external factors independent of the equipment’s own operation, which may trigger sudden failures or gradual degradation. The machine-feedstock-environment coupled degradation in SMSs is illustrated in Fig. 3.
3.1 Machine-feedstock coupled degradation model in SMSs
For the first type of internal degradation, mechanical components (e.g., bearings, gears) develop material fatigue under prolonged operation, which reduces strength and causes tolerance deviations. The baseline failure rate of a machine is the failure rate caused solely by material fatigue, component wear, and similar factors, excluding external disturbances and the additional effects of processing non-conforming products. Because it can characterize a wide range of failure patterns and approximates machine failure well, the Weibull distribution is widely used in the analysis of machine degradation (Ye et al. 2021; Ye et al. 2022). Compared with the constant failure rate implied by the exponential distribution, the Weibull distribution provides better insight for machine reliability decision-making (Das 2008). Thus, in this paper, the baseline failure rate is defined as
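$$\lambda_0(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1} \qquad (1)$$
where $t$ is the cumulative operating time of a machine, $\beta$ is the shape parameter, and $\eta$ is the scale parameter. If $\beta > 1$, $\lambda_0(t)$ increases over time. If $\beta < 1$, $\lambda_0(t)$ decreases over time. If $\beta = 1$, $\lambda_0(t) = 1/\eta$ is constant.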
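The three regimes of the baseline failure rate can be checked with a few lines of code. The sketch below uses the standard two-parameter Weibull hazard; the function name `weibull_hazard` and the parameter values are illustrative assumptions, not values from the case study.

```python
def weibull_hazard(t, shape, scale):
    """Hazard (failure rate) of a two-parameter Weibull distribution at time t > 0."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


# shape > 1: wear-out, the rate grows with operating time
wear_out = [weibull_hazard(t, shape=2.0, scale=100.0) for t in (10.0, 20.0, 40.0)]
# shape < 1: early-life failures, the rate falls with operating time
early_life = [weibull_hazard(t, shape=0.5, scale=100.0) for t in (10.0, 20.0, 40.0)]
# shape == 1: the rate is constant and equals 1 / scale
constant = weibull_hazard(5.0, shape=1.0, scale=100.0)
```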
The second type of internal degradation is related to the number of misjudged products received by the processing machine from upstream and the extent of product quality deviation . Substandard products (e.g., those with dimensional inaccuracies or material inhomogeneity) induce abnormal load distributions during processing, subjecting the machine to stress impacts under off-design conditions. Substandard products with out-of-tolerance dimensions cause abnormal vibrations during processing due to clamping mismatches, leading to relative slippage between the workpiece and the fixture system, which increases the friction coefficient and exacerbates fixture wear. Fixture wear further reduces clamping accuracy, so that more workpieces exhibit dimensional deviations during processing, which in turn intensifies wear. Meanwhile, substandard feedstocks with abnormal hardness or material inhomogeneity may cause localized overload, such as a sudden surge in cutting force on the tool, leading to cutting edge chipping. This paper focuses on the relationship between product quality distribution and the incremental failure rate. The actual quality index of a product is denoted as , and the acceptable quality range is defined as . Generally, when , the SMS operates with a six sigma quality management system, where the probability of producing a substandard product is , effectively eliminating extreme deviations. When , the SMS follows a three sigma quality management system, with a substandard product probability of , resulting in a higher proportion of products with significant quality deviations. Compared to , the quality management system has a higher probability of processing severely substandard products, leading to greater cumulative damage to the system. Therefore, a smaller value corresponds to a higher probability of producing severely substandard products. 
Consequently, the frequency of misjudged substandard products being further processed increases, amplifying the detrimental impact of each substandard product on the machine. The product quality deviation is denoted by , defined as the difference between the product quality index and the acceptable quality range, characterizing the magnitude of out-of-specification deviation. When , . When , . During internal degradation, the incremental machine failure rate , where is a mapping from to . The greater the quality deviation , the larger the failure rate increment . This paper employs the Beta distribution to model the influence of products in out-of-specification status in the second type of internal degradation on the machine failure rate increment (Ye et al. 2022). The parameters and of the Beta distribution reflect the impact of quality deviation on the failure rate increment .
The internal degradation failure in SMSs is fundamentally a result of machine-feedstock coupled degradation. Substandard feedstocks induce localized stress concentration through physical interactions, accelerating wear of critical components, while the tail risk of the quality distribution (i.e., the occurrence of extremely substandard feedstocks) further amplifies the failure rate increment. The failure rate of a machine under machine-feedstock coupled degradation is
where the number of misjudged products received by a machine, denoted as , accumulates over time. This additive model captures the physical mechanism of cumulative damage observed in SMSs. The baseline failure rate represents the natural aging of the machine, while each substandard feedstock processing event acts as an independent internal shock, generating a discrete increment in failure risk. Each event independently contributes a portion of extra risk. Since these risk sources are mutually independent, their contributions are additive. This aligns with the core method in reliability engineering for handling competing failure modes between “natural degradation” and “external random shock” (Wang et al. 2020a; Wang et al. 2020b; Wang et al. 2020c). Compared to a multiplicative model, the additive model is more suitable for simulating such independent additional risk sources rather than a global acceleration effect. In this paper, the impact of substandard feedstocks is a discrete, localized extra damage, which constitutes an extra burden added on top of the baseline degradation process, not an accelerator that speeds up the entire process. Furthermore, a multiplicative model could make the failure rate excessively sensitive to parameters, potentially leading to an unrealistic explosive growth in failure probability when is large, which contradicts the actual gradual degradation behavior of most mechanical systems. The additive model can capture the influence, offer numerical robustness, and provide a conservative yet realistic estimate that aligns better with engineering intuition. Consequently, the failure probability of the machine under machine-feedstock coupled degradation, denoted as , is expressed as follows.
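The additive structure argued for above can be sketched numerically. The code below is an illustration under explicit assumptions that are not taken from the paper: the baseline is a Weibull hazard, each misjudged product adds a constant failure-rate increment from its arrival time onward, the increment is a Beta draw scaled by a hypothetical `max_increment`, and the failure probability is obtained from the cumulative hazard. All names and numbers are placeholders.

```python
import math
import random


def draw_increment(rng, alpha, beta, max_increment):
    """Assumed mapping: scale a Beta(alpha, beta) draw to a failure-rate increment."""
    return max_increment * rng.betavariate(alpha, beta)


def cumulative_hazard(t, shape, scale, events):
    """Integral of the additive failure rate over [0, t].

    events -- list of (arrival_time, increment) pairs for misjudged products
    """
    baseline = (t / scale) ** shape  # closed-form integral of the Weibull hazard
    extra = sum(inc * (t - tau) for tau, inc in events if tau <= t)
    return baseline + extra


def failure_probability(t, shape, scale, events):
    """Failure probability implied by the cumulative hazard."""
    return 1.0 - math.exp(-cumulative_hazard(t, shape, scale, events))


rng = random.Random(42)
events = [(tau, draw_increment(rng, 2.0, 5.0, 1e-3)) for tau in (100.0, 250.0, 400.0)]
clean = failure_probability(500.0, 2.0, 1000.0, [])
coupled = failure_probability(500.0, 2.0, 1000.0, events)
```

Because each increment is added rather than multiplied, a large number of events raises the cumulative hazard linearly in the increments, which is the numerical robustness the text attributes to the additive form.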
3.2 Machine-feedstock-environment coupled degradation model in SMSs
External disturbances, as exogenous drivers of failures in SMSs, collectively form a machine-feedstock-environment coupled complex degradation mode together with internal machine-feedstock coupled degradation. External factors include natural environments or uncontrollable external causes, which are categorized into PED and sudden external shocks.
PED refers to sustained environmental changes that gradually erode machine performance through physical or chemical effects. For example, high temperatures cause machine overheating or reduced lubricant viscosity; humidity triggers metal corrosion; and particulate intrusion into precision guideways increases the friction coefficient. These factors can weaken the machine’s ability to withstand impacts from substandard feedstocks, significantly amplifying the incremental failure rate induced by identical quality defects, which accelerates the internal aging process of machines and leads to greater fluctuations in product quality. Let represent different types of PED, such as high temperature, high humidity, extreme cold, etc. Under a specific disturbance, the product quality index follows , where denotes the mapping of the environmental impact on the product quality index and characterizes the disruption of processing stability caused by the environmental disturbance. Under progressive environmental influences, the incremental failure rate is given by , where a larger quality deviation results in a higher . The impact of products in out-of-specification status on the machine failure rate increment under PED is described using a Beta distribution. The machine failure rate under PED is expressed as
The sudden external shock refers to an instantaneous, high-intensity disturbance event that originates from outside the SMS. Examples include accidents such as power quality anomalies, or disasters such as earthquakes and fires. For instance, voltage sags or momentary interruptions caused by start-up/shutdown of heavy equipment nearby or severe weather can lead to control errors in CNC machine tools, resulting in tool-workpiece collisions and physical damage. The vibrations induced by a collision from an out-of-control heavy-duty vehicle can compromise the alignment of precision equipment. This misalignment can then adversely affect the manufacturing process, causing the product’s quality index to fall far from its specification limits. In addition, the fatal consequences of some disasters are evidenced by real-world cases. The 1995 Kobe earthquake severely damaged industrial infrastructure, causing a 7-d city-wide power outage and over 80 d to restore industrial utilities, with profound impacts at the plant level (Cole et al. 2016; Caputo et al. 2023). Similarly, the earthquake in Taiwan Province in 2016 halted semiconductor production (Qin et al. 2022). In 2000, a minor fire in a semiconductor manufacturing plant of the Philips NV Group contaminated cleanrooms and interrupted the production of cellular phone chips (Caputo et al. 2023). Besides, the 2011 Thailand floods devastated hard disk manufacturing, triggering a sharp decline in shipments (Qin et al. 2022). In this study, sudden external shocks arrive based on a Poisson process at a specific rate (Dui et al. 2025c). Let denote the arrival time of the -th sudden external shock. Let be the shock magnitude of the -th sudden external shock, which is a random variable following a certain distribution. If the shock magnitude is below the safety threshold (i.e., ), the shock causes no damage to the machine. 
Its impact can be mitigated through simple operations such as lubrication, and it does not affect the machine’s degradation process. If the shock magnitude exceeds the fatal threshold (i.e., ), the shock immediately causes a failure. If the shock magnitude falls between these two thresholds, it adds cumulative damage to the machine’s degradation process. Based on the threshold values, the impacts of sudden external shocks are classified into three grades. (1) Grade I (Safety zone): shocks in the range [0, ). (2) Grade II (Damage zone): shocks in the range . (3) Grade III (Fatal zone): shocks in the range (Wang et al. 2020b). Dui et al. (2024b) provide the rate at which multiple sudden external shocks of the same intensity accelerate machine degradation based on three thresholds. In this paper, there are two thresholds, and a single sudden external shock is considered at a time. Thus, adapting the core principles of the model of Dui et al. (2024b), when one sudden external shock falls within the damage zone, the rate at which that shock accelerates machine degradation is given by
The resulting increment in the machine failure rate due to this shock is
where represents the duration of the shock’s impact up to the current time.
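The shock arrival and grading logic can be sketched as a small simulation. This is an illustration only: the threshold values, the uniform magnitude distribution, and the function names are hypothetical, and the handling of magnitudes exactly equal to a threshold (assigned to the higher grade here) is an assumption.

```python
import random


def classify_shock(magnitude, safe_threshold, fatal_threshold):
    """Grade a shock: 'I' safety zone, 'II' damage zone, 'III' fatal zone."""
    if magnitude < safe_threshold:
        return "I"
    if magnitude < fatal_threshold:
        return "II"
    return "III"


def simulate_shocks(rate, horizon, draw_magnitude, safe, fatal, rng):
    """Simulate Poisson shock arrivals up to `horizon`; stop at the first fatal shock."""
    t, history = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential gaps give a Poisson process
        if t > horizon:
            return history
        magnitude = draw_magnitude(rng)
        grade = classify_shock(magnitude, safe, fatal)
        history.append((t, magnitude, grade))
        if grade == "III":          # the machine fails immediately
            return history


rng = random.Random(7)
history = simulate_shocks(rate=0.01, horizon=1000.0,
                          draw_magnitude=lambda r: r.uniform(0.0, 4.0),
                          safe=1.0, fatal=3.5, rng=rng)
grade_two_count = sum(1 for _, _, g in history if g == "II")
```

Only the Grade II count feeds the cumulative damage term; Grade I shocks are recorded here for completeness but would be ignored by the degradation model.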
The comprehensive degradation is the sum of machine degradation and shock-induced degradation (Wang et al. 2020b; Wang et al. 2021b). Therefore, the comprehensive machine failure rate under machine-feedstock-environment coupled degradation is
where represents the total number of Grade II shocks in the damage zone experienced by the SMS during the time interval . Based on the definition of the failure rate, is expressed as
where is a probability density function. Then we derive . Thus, the comprehensive failure probability of the machine under machine-feedstock-environment coupled degradation, denoted as , is expressed as
In Eq. (9), the internal natural degradation, the impact of substandard feedstocks on the machine under PED, and the impact of external shocks are integrated into a single model. Specifically, the first term within the exponential brackets of Eq. (9) relates to the degradation of the machine’s inherent reliability, the second term corresponds to the influence of substandard feedstocks on machines under the specific environment, and the third term accounts for the impact of external shocks. When all shocks fall in the safety zone and the SMS is without PED, the result from Eq. (9) is equal to the result from Eq. (3). Notably, there are differences between the impact of substandard feedstocks on the machine and the impact of external shocks. If substandard feedstocks are processed by downstream machines, their effect on those machines is always present. The machine incurs cumulative damage from repeatedly processing such feedstocks. One minor instance of such damage may not immediately cause a failure of the machine at the moment of processing. However, the accumulation of multiple minor damages accelerates machine degradation, eventually leading to failure at a certain point in time. In contrast, the impact of external shocks exhibits a zoning effect. Shocks within the safety zone, whose effects can be mitigated through measures such as lubrication, are not accounted for in the model. Shocks in the damage zone have a cumulative impact on the machine, similar to that of substandard feedstocks, though their specific calculation methods differ. Shocks in the fatal zone cause the machine to fail instantaneously upon occurrence. In this case, the machine’s reliability drops to zero, meaning the failure probability becomes 1, and no further functional expression needs to be derived.
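The step from the definition of the failure rate to the exponential form of the failure probability relies on a standard identity from reliability theory. It is written here in generic notation, which is an assumption and may differ from the paper's symbols: $\lambda(t)$ is the comprehensive failure rate, $f(t)$ the probability density function, and $F(t)$ the failure probability with $F(0) = 0$.

```latex
\lambda(t) = \frac{f(t)}{1 - F(t)}
           = -\frac{\mathrm{d}}{\mathrm{d}t}\,\ln\bigl(1 - F(t)\bigr)
\quad\Longrightarrow\quad
F(t) = 1 - \exp\!\left(-\int_{0}^{t} \lambda(s)\,\mathrm{d}s\right).
```

Because the comprehensive failure rate is a sum of three contributions, the integral inside the exponential splits into three terms, which is why the baseline, feedstock, and shock effects appear as separate summands within the exponent.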
To visually demonstrate the impact of the coupled degradation cascading effects among machine degradation, substandard feedstocks, and environmental factors on equipment degradation, a comparative diagram of degradation processes under a stepwise cumulative damage mode is presented in Fig. 4. Each subplot represents the evolution of the degradation amount under different combinations of factors in the coupled model. It aims to reveal how the multi-factor coupled degradation influences the machine degradation process and increases the risk of machine failure, thereby providing a basis for subsequent maintenance decisions and performance evaluation.
4 Integrated maintenance mechanism for SMSs
This section proposes an integrated maintenance mechanism for SMSs. The mechanism comprises two strategies: PM based on real-time machine reliability and CM based on a machine maintenance priority index. The SMS studied in this paper is a repairable system. Both strategies are treated as perfect maintenance, meaning that machines are restored to their original state after maintenance. The maintenance process can be modeled as a multi-server, single-queue queuing system in which the key behavioral difference between the two strategies lies in the waiting mechanism: PM tasks do not queue if maintenance resources are unavailable, whereas CM tasks do. In addition, the two strategies are governed by different triggering conditions and maintenance sequencing rules.
4.1 Preventive maintenance based on real-time machine reliability
PM refers to a series of maintenance activities carried out before a machine fails. Through systematic inspection, equipment testing, and component replacement, PM aims to prevent functional failures and keep the machine in a specified state. PM can be regarded as an opportunity to proactively address potential system failures (Dui et al. 2024c). In this paper, PM is automatically triggered when a machine’s reliability falls below a predefined threshold and idle maintenance resources are available. PM tasks are scheduled in ascending order of real-time reliability, so the machines with the lowest reliability are served first. The service time (i.e., maintenance duration) is fixed, and the number of servers (maintenance resources) is limited. If all servers are busy, a machine that triggers PM leaves the queuing system immediately without waiting.
Since reliability and failure probability sum to 1, the reliability of machine under machine-feedstock-environment coupled degradation follows from the degradation model in Eq. (9) and is expressed by Eq. (10). The machine reliability is evaluated in real time, and PM is triggered when , where denotes the minimum acceptable reliability threshold for the machine.
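The PM triggering and sequencing rule can be sketched as a single dispatch step. The function name `dispatch_pm`, the machine labels, and the numbers are hypothetical; the sketch assumes a strict "below threshold" trigger and treats machines that find no idle server as skipped rather than queued, in line with the no-waiting rule for PM.

```python
def dispatch_pm(reliabilities, threshold, idle_servers):
    """Select machines for preventive maintenance at one evaluation instant.

    reliabilities -- dict mapping machine name to its current reliability
    threshold     -- minimum acceptable reliability
    idle_servers  -- number of maintenance resources currently free
    Returns (served, skipped); skipped machines do not wait in a queue.
    """
    # Lowest reliability first; ties are broken by machine name.
    candidates = sorted((r, m) for m, r in reliabilities.items() if r < threshold)
    served = [m for _, m in candidates[:idle_servers]]
    skipped = [m for _, m in candidates[idle_servers:]]
    return served, skipped


served, skipped = dispatch_pm(
    {"M1": 0.90, "M2": 0.60, "M3": 0.70, "M4": 0.50},
    threshold=0.80,
    idle_servers=2,
)
```

A skipped machine keeps operating and is simply re-evaluated at the next instant, when a server may have become free.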
4.2 Corrective maintenance based on machine maintenance priority index
Machines are inevitably subject to aging and deterioration, to unavailability intervals caused by PM activities, and to random breakdowns that call for CM activities (Ghaleb et al. 2021). CM addresses failures after they occur and requires rapid, effective responses to resolve issues and prevent further damage. Therefore, when maintenance resources are limited, it is critical to use a priority index to promptly identify the components most critical to system recovery. In this paper, the maintenance priority index is proposed based on the mean time between failures (MTBF) of machines and the productivity loss caused by machine downtime. The index quantifies the impact of different machines on the overall functionality of the SMS and determines a maintenance sequence. CM is triggered upon machine failure, i.e., when reliability drops to zero. CM tasks are prioritized according to this maintenance sequence, so the machines with the highest maintenance priority index are placed at the front of the queue. Service time follows a probability distribution. If a machine fails when no server is available, it joins the queue and waits until a server becomes available.
MTBF is a key metric for evaluating equipment reliability. A higher MTBF indicates longer failure-free operation, reflecting greater reliability and a longer operational lifespan. Machines with high MTBF typically exhibit lower failure frequencies and are often critical components of the system (Sudadiyo 2025). While these machines are designed for extended service lives and fail infrequently, their failures can have severe system-wide consequences (Shanbhag et al. 2025). From an opportunity cost perspective, repairing a failed high-MTBF machine ensures that the system regains a highly reliable asset for a prolonged period. In contrast, a failed low-MTBF machine may be repaired quickly but is likely to fail again soon, offering only short-term benefits. Due to the limited maintenance budget, performing maintenance activities for all machines of the system is neither possible nor logical (Mirhosseini et al. 2022). Therefore, prioritizing the repair of failed high-MTBF machines represents a strategic trade-off, aimed at maximizing the long-term stable output of the system, which corresponds to the main focus of reliability-centered maintenance (Arno et al. 2015). This approach does not imply that low-MTBF machines are unimportant; rather, it reflects a resource-allocation decision during fault recovery when not all machines can be serviced immediately. Therefore, this strategy is suitable for systems requiring high reliability and long-term stability, such as SMSs.
Productivity loss due to machine downtime reflects the immediate impact of machine failures on system performance. The machine’s productivity loss is a simulation-based metric rather than a model-based theoretical estimate. An SMS operates for a period without any machine failures, and the baseline productivity during this period is recorded. Then, under identical initial conditions, only machine is allowed to fail. The system is run again for the same duration, and its productivity is recorded. The difference between these two productivity values represents the productivity loss attributable to the failure of machine . When a machine with high productivity loss fails, it immediately and significantly affects system performance and production efficiency. Under limited resources, prioritizing the maintenance of machines with high productivity loss maximizes resource utilization efficiency. Although repairing such machines may require more resources, the benefit is larger, enabling rapid recovery of system performance and production efficiency. Thus, this strategy is especially applicable in scenarios that demand quick responses and aim to minimize short-term economic losses.
Therefore, the maintenance priority index for a machine in the SMS is modeled as
where the first term represents a temporal metric, while the second term represents a performance metric. and are the MTBF and the productivity loss of a machine, respectively. The coefficient denotes the weighting factor, where , ensuring . When system managers prioritize long-term system stability, a larger should be set, indicating a greater emphasis on failed high-MTBF machines. Conversely, a smaller indicates a greater emphasis on the immediate economic impact of machine failures, and the maintenance priority is given to machines with high downtime losses.
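The weighted trade-off between the temporal and performance terms can be sketched as follows. The exact normalization is not recoverable from the text here, so this sketch assumes each term is divided by its maximum over all machines, which keeps the index between 0 and 1 when the weight lies in that interval. The function names and data are hypothetical.

```python
def priority_index(mtbf, loss, weight):
    """Assumed form: weight * normalized MTBF + (1 - weight) * normalized loss.

    mtbf, loss -- dicts mapping machine name to MTBF and productivity loss
    weight     -- emphasis on long-term stability, between 0 and 1
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    max_mtbf = max(mtbf.values())
    max_loss = max(loss.values())
    return {
        m: weight * mtbf[m] / max_mtbf + (1.0 - weight) * loss[m] / max_loss
        for m in mtbf
    }


def cm_sequence(failed, index):
    """Order failed machines for corrective maintenance, highest index first."""
    return sorted(failed, key=lambda m: index[m], reverse=True)


mtbf = {"M1": 400.0, "M2": 200.0}
loss = {"M1": 10.0, "M2": 40.0}
stability_first = cm_sequence(["M1", "M2"], priority_index(mtbf, loss, weight=1.0))
loss_first = cm_sequence(["M1", "M2"], priority_index(mtbf, loss, weight=0.0))
```

In this toy data set the repair order flips between the two extreme weights, which illustrates how the weighting factor shifts the emphasis from long-term stability to immediate downtime loss.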
5 Product analysis and dual performance efficiency evaluation in SMSs
5.1 Product quality analysis in manufacturing units
The output of a single processing machine is divided into three categories: qualified products with probability , reworkable defective products with probability , and non-reworkable defective products with probability . Considering that the inspection equipment may misjudge defective products as qualified with a probability , the flow of qualified products in the system includes both truly qualified products and misjudged defective products. The probability of truly qualified products being passed is , while the probability of misjudged defective products being passed is . Therefore, the probability that a product from the processing machine is passed to downstream processes (i.e., the pass rate of the inspection equipment denoted as ) is given by Eq. (12).
For reworkable defective products, if they are not correctly identified by the inspection equipment, the probability of being misjudged as qualified and flowing downstream is , which is already included in Eq. (12). If the inspection equipment correctly identifies reworkable defective products, these products are returned to upstream machines for reprocessing. Therefore, the probability of a product being sent for rework, denoted as , is given by Eq. (13).
For non-reworkable defective products, if they are not correctly identified by the inspection equipment, the probability of being misjudged as qualified and flowing downstream is , which is already included in Eq. (12). If non-reworkable defective products are correctly identified by the inspection equipment, they are directly scrapped. Therefore, the scrap probability, denoted as , is given by Eq. (14).
In summary, for a manufacturing unit , holds.
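The pass, rework, and scrap probabilities described in this subsection can be checked with a short calculation. The sketch assumes, as the text implies, that the only inspection error is a defective product being accepted as qualified; the function and argument names are hypothetical.

```python
def inspection_outcomes(p_qualified, p_rework, p_scrap, misjudge):
    """Return (pass, rework, scrap) probabilities after inspection.

    p_qualified, p_rework, p_scrap -- true quality-state probabilities (sum to 1)
    misjudge -- probability that a defective product is accepted as qualified
    """
    p_pass = p_qualified + misjudge * (p_rework + p_scrap)
    p_to_rework = (1.0 - misjudge) * p_rework  # correctly identified, reworkable
    p_to_scrap = (1.0 - misjudge) * p_scrap    # correctly identified, scrapped
    return p_pass, p_to_rework, p_to_scrap


p_pass, p_rw, p_sc = inspection_outcomes(0.90, 0.06, 0.04, misjudge=0.10)
```

The three outcomes sum to 1 because every product is either passed downstream, returned for rework, or scrapped.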
5.2 Product flow analysis in manufacturing units
The product arrival rate of the manufacturing unit is defined as the number of products arriving at the unit per unit time, which consists of two components, including inflow from the normal process route and return flow from the rework process route (Lin and Chang 2012). It is noteworthy that due to the possibility of misjudgment in quality inspection, the inflow from the normal process route includes both genuinely qualified products and defective products mistakenly classified as qualified, while the rework return flow consists of accurately identified substandard products. To precisely describe these two flows with different quality statuses, the apparent qualified arrival rate and the confirmed substandard arrival rate are defined. The apparent qualified arrival rate represents the average rate at which products enter under the predefined process path without rework, whose value depends on the set of predecessor nodes . If there is only one predecessor unit, has a single element. If is a convergence point with multiple incoming edges, contains multiple predecessor units. , where is the number of qualified products transferred from the upstream unit to . represents the connectivity between manufacturing units. The confirmed substandard arrival rate denotes the rate at which products are returned to due to the detection of reworkable defects. , where is the actual service rate of machine and is the rework probability. Therefore, the total product arrival rate for the manufacturing unit is
Upon arriving at the manufacturing unit, if the processing machine is idle, the product proceeds directly into the machine for processing. Conversely, if the processing machine is busy, the product enters the buffer queue and waits until the previous product is completed before being processed. After quality inspection, the product leaves the manufacturing unit. Therefore, for the normal process route, the total apparent qualified output from per unit time is . Considering that may have multiple directly downstream manufacturing units, the product flow on all outgoing process connections must satisfy the following conservation law.
where represents the flow of apparent qualified products from to the downstream unit . denotes the set of immediate downstream units of . The distribution of the output flow to downstream units depends on the connectivity and the routing rules of the manufacturing network. Equations (16) and (17) ensure that the total apparent qualified output from is fully allocated to downstream units. The connectivity ensures that flow only occurs where active connections exist ().
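The arrival-rate decomposition and the conservation of output flow can be illustrated with a small sketch. It assumes, since the formulas are not recoverable from the text here, that the rework return rate is the service rate times the rework probability, that the apparent qualified output is the service rate times the pass probability, and that output is divided among active downstream units in proportion to hypothetical routing shares.

```python
def total_arrival_rate(inflows, connectivity, service_rate, p_rework):
    """Normal-route inflow over active predecessor links plus rework return flow."""
    normal = sum(connectivity[j] * rate for j, rate in inflows.items())
    rework = service_rate * p_rework
    return normal + rework


def split_output(service_rate, p_pass, routing_shares, connectivity):
    """Allocate the apparent qualified output to active downstream units."""
    output = service_rate * p_pass
    active = {j: s for j, s in routing_shares.items() if connectivity.get(j, 0) == 1}
    total_share = sum(active.values())
    if total_share == 0:
        raise ValueError("no active downstream connection")
    return {j: output * s / total_share for j, s in active.items()}


arrival = total_arrival_rate(
    inflows={"M1": 4.0, "M2": 3.0},
    connectivity={"M1": 1, "M2": 0},  # the link from M2 is currently inactive
    service_rate=10.0,
    p_rework=0.05,
)
flows = split_output(10.0, 0.9, {"M4": 1.0, "M5": 2.0, "M6": 1.0},
                     {"M4": 1, "M5": 1, "M6": 0})
```

Renormalizing the shares over active links is one way to keep the allocation complete when a downstream connection drops out.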
5.3 Product-oriented and order-oriented performance efficiency evaluation of SMSs
The core objective of SMSs is to complete predefined production tasks within order delivery periods—that is, to produce specified types and quantities of products that precisely meet the ordering requirements of sellers. In adhering to this objective, SMSs prioritize a stringent task-oriented principle. Namely, achieving perfect alignment between delivered products and order demands is the ultimate goal; meanwhile, the system strives for high precision in production and optimal order fulfillment within delivery timelines. However, in complex manufacturing environments with potential uncertainties, scenarios may arise where fully achieving predefined tasks becomes challenging. Considering that sellers may intentionally over-order to mitigate potential demand fluctuations or supply risks, SMSs activate a flexible fulfillment mechanism. Under the constraint of the seller’s over-ordering rate , the system allows delivery quantities to be moderately lower than the seller’s order demand. Here, the over-ordering rate is defined as the ratio of the seller’s excess order quantity to the total order quantity.
Assuming each seller corresponds to one order, the SMS receives a total of orders from sellers. The demand vector for the -th order is denoted as . Based on the known order demands, the corresponding production plan can be formulated. From the perspective of network topology, a quantitative mapping relationship exists between the demand for different products in the orders and the output of the sink nodes in the corresponding process routes. Based on the specific quantity requirements for different products determined by the orders, these demands can be precisely allocated to the total output requirements of the corresponding process routes in the manufacturing network. That is, based on all order demands, the SMS needs to produce a total of types of products. Since the same type of product required by different orders can be produced collectively, the quantity demanded for the -th product is . Therefore, the ideal output vector of the manufacturing network is . Here, represents the ideal total output of the SMS, i.e., the total product demand.
However, constrained by process defects or quality fluctuations, some products require rework and re-entry into the production flow, while others are directly scrapped, leading to a decline in the effective yield rate. Due to the presence of multiple sources of uncertainty in the actual production process, the system adopts an over-supply strategy to compensate for non-value-added losses. Consequently, the actual output vector of the smart manufacturing network is denoted as . represents the actual total output of the SMS.
Given the complexity of operating a flexible SMS, with diverse product types and large order demands, the pursuit of overall system efficiency allows for occasional shortages of some products. Therefore, this paper proposes a product-oriented performance efficiency metric, denoted as PEP, which reflects the proportion of the total demand covered by the actual total output of the system.
In this metric, the contribution of an over-fulfilled product is saturated and recorded only as its demanded quantity. A higher PEP indicates a stronger global performance efficiency of the system and a greater capability to complete production tasks. However, this metric only considers the internal production objectives of the SMS and does not measure order or seller fulfillment satisfaction.
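As a hedged illustration of this description, the sketch below computes a PEP-like ratio in Python. It assumes that PEP is the sum of per-product outputs, each capped at its demand, divided by the total demand; the function name `pep` and all numbers are hypothetical.

```python
def pep(demand, output):
    """Share of total demand covered by output, capping each product at its demand.

    Assumption: over-fulfilled products contribute only their demanded quantity.
    """
    total_demand = sum(demand.values())
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    covered = sum(min(output.get(product, 0), qty) for product, qty in demand.items())
    return covered / total_demand


# Hypothetical demand and actual output per product type.
demand = {"P1": 65, "P2": 15, "P3": 45}
output = {"P1": 70, "P2": 12, "P3": 45}  # P1 over-fulfilled, P2 short by 3

pep_value = pep(demand, output)
print(f"PEP = {pep_value:.3f}")
```

In this example the surplus of P1 does not offset the shortage of P2, so the ratio stays below 1.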
Let the actual delivered quantity be recorded for each order. For each product, let the number of orders that can be fully delivered also be counted, where full delivery means that the types and quantities of delivered products strictly match the order requirements.
Under the flexible fulfillment mechanism, when the shortfall of the delivered quantity relative to the ordered quantity stays within the seller’s over-ordering rate, the SMS is considered to have flexibly met the actual demands of the seller. The number of orders whose demands are flexibly satisfied is then counted. The criterion for a successful production task in the SMS is that the proportion of orders flexibly satisfied within the task period exceeds the task baseline.
An indicator records whether each executed production task is successful. When the SMS executes multiple tasks, the number of successful tasks is obtained by summing these indicators over all executed tasks. Therefore, this paper proposes an order-oriented performance efficiency metric, denoted as PEO.
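The order-level logic can be sketched in Python under stated assumptions. The sketch assumes that an order is flexibly satisfied when every product is delivered to at least (1 − over-ordering rate) of its ordered quantity, that a task succeeds when the satisfied share strictly exceeds the baseline, and that PEO is the share of successful tasks. The function names, the baseline of 0.6, and the quantities are hypothetical.

```python
def flexibly_satisfied(ordered, delivered, over_ordering_rate):
    """True if each product is delivered to at least (1 - rate) of the ordered quantity."""
    return all(
        delivered.get(product, 0) >= (1 - over_ordering_rate) * qty
        for product, qty in ordered.items()
    )


def task_successful(orders, deliveries, over_ordering_rate, baseline):
    """A task succeeds when the share of flexibly satisfied orders exceeds the baseline."""
    satisfied = sum(
        flexibly_satisfied(o, d, over_ordering_rate) for o, d in zip(orders, deliveries)
    )
    return satisfied / len(orders) > baseline


def peo(task_results):
    """Share of successful tasks among all executed tasks (assumed form of PEO)."""
    return sum(task_results) / len(task_results)


orders = [{"P1": 40, "P2": 10}, {"P1": 25, "P3": 30}, {"P2": 5, "P3": 15}]
deliveries = [{"P1": 38, "P2": 10}, {"P1": 25, "P3": 24}, {"P2": 5, "P3": 15}]

# Orders 1 and 3 are within the 10% tolerance; order 2 is short on P3.
success = task_successful(orders, deliveries, over_ordering_rate=0.10, baseline=0.6)
peo_value = peo([success, False, True, True])  # hypothetical results of four tasks
print(success, peo_value)
```

Applying the tolerance per product is the stricter of several possible readings; applying it to the order total instead would only change `flexibly_satisfied`.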
In summary, the PEP and PEO metrics characterize the efficiency of SMSs from two distinct dimensions, namely product demand coverage and order task completion. PEP represents the supply-demand matching degree after aggregating all product types, while PEO evaluates the overall fulfillment rate at the order level. The former reflects the system’s ability to cover total demand with its total output, where overproduction is disregarded. The latter focuses on order fulfillment reliability, allowing manufacturers to deliver products with shortfalls within a certain tolerance range. PEP emphasizes “doing the right thing” (producing according to demand), making it suitable for evaluating static production capacity allocation within SMSs. PEO emphasizes “doing things right” (reliably delivering orders), making it applicable for assessing dynamic interaction capabilities both inside and outside the system. For an SMS, high PEP but low PEO indicates sufficient production capacity but poor order coordination, whereas low PEP but high PEO suggests flexible order scheduling but insufficient or uneven production capacity. Together, PEP and PEO provide a dual perspective for quantifying the performance of SMSs. In the future, multi-objective optimization can be pursued by analyzing the Pareto frontier between PEP and PEO to balance product demand coverage and order delivery reliability.
6 Case study
6.1 Scenario of FJSMS for parts used in high-voltage electrical apparatus assembly
In a smart FJSMS for the parts production of high-voltage electrical apparatus, the job shop achieves digitalized, collaborative management across the entire process—from order to manufacturing and then to delivery—through the integration of the IIoT, as shown in Fig. 5 (Huang 2023). The system encompasses multiple advanced manufacturing units, including CNC lathes, vertical and horizontal machining centers, and other CNC equipment, all interconnected via an IIoT network. This infrastructure enables real-time data acquisition, adaptive scheduling, and dynamic resource allocation, ensuring manufacturing capabilities essential for complex component processing.
The case focuses on the production operations of the smart FJSMS over a continuous six-month period (totaling 2880 h). During this time, the FJSMS was primarily dedicated to the manufacturing tasks of 13 types of precision components with different specifications required for high-voltage electrical apparatus assembly. Based on the smart scheduling plan generated by the MES, the total production time was divided into 14 dynamically adjusted production periods. The specific time intervals for each period and their corresponding product production combinations are detailed in Table 2 (Zhang and Zhang 2017).
The execution of the production schedule in Table 2 involves 7 machines. The key performance parameters, real-time health status, and historical reliability data of these machines, monitored via embedded sensors, are aggregated within the equipment performance management platform. Since PM may delay the start of the next product, the number of machines undergoing PM at any given time is limited to 2 to reduce capacity loss. For CM, considering the high cost and the limited maintenance resources, the number of machines undergoing CM at any given time is assumed not to exceed 4. Assuming management prioritizes long-term benefits, the importance weight coefficient for maintenance is set to 1. Because all maintenance in this study is perfect maintenance, the calculation of failure probability only considers the impact of substandard products and environmental influences since the last maintenance. Table 3 lists the parameters for machine failures and maintenance, which are partly taken from a reference (Zhang and Zhang 2017). The shape and scale parameters in Table 3 define each machine’s baseline failure rate function. The failure rate parameter is in units of occurrences/hour. The MTBF of each machine is given under baseline failure conditions, and the median life is calculated from the shape and scale parameters. CM times follow either a Weibull or a log-normal distribution. The units for MTBF, median life, and maintenance times are hours.
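How a median life and an MTBF follow from a shape and a scale parameter can be sketched as below. The sketch assumes a two-parameter Weibull baseline, which this section does not state explicitly, and the parameter values are hypothetical rather than taken from Table 3.

```python
import math


def weibull_median_life(shape, scale):
    """Time at which the Weibull reliability exp(-(t/scale)**shape) equals 0.5."""
    return scale * math.log(2) ** (1.0 / shape)


def weibull_mtbf(shape, scale):
    """Mean of the two-parameter Weibull distribution: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


# Hypothetical parameters: shape = 2 (wear-out behaviour), scale = 1000 h.
median_life = weibull_median_life(2.0, 1000.0)
mtbf = weibull_mtbf(2.0, 1000.0)
print(f"median life = {median_life:.1f} h, MTBF = {mtbf:.1f} h")
```

For a shape parameter above 1, the median life lies below the MTBF, so the two columns of a parameter table are not interchangeable.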
The process route and standard processing time for each product are defined within the FJSMS’s process database and advanced planning and scheduling (APS) system, as detailed in Table 4 (Zhang and Zhang, 2017). Benefiting from the extensively deployed machine and process flexibility, and combined with real-time production environment data, the APS system can dynamically recommend and select feasible processing paths for products to optimize overall production efficiency and resource utilization.
The smart FJSMS has established an end-to-end, embedded real-time quality monitoring and predictive analytics system. Upon completing the machining of each WIP item, every processing machine uses its process monitoring sensors and online quality inspection modules to collect key quality characteristic data in real time. These data are transmitted instantly to the system’s quality analysis platform. The platform dynamically estimates the probability of each output being a qualified product, a reworkable defective product, or a non-reworkable defective product. These output probabilities for various products under different quality management levels reflect the system’s quantitative perception of quality risks, as detailed in Table 5. Furthermore, the product’s critical quality index (CQI) is continuously monitored. Under stable operating conditions, the CQI tends to follow a normal distribution. The quality inspection equipment deployed at workstations can strictly control the misjudgment probability for substandard products to 0.046 (Ye et al., 2022).
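A minimal Python sketch of how such output probabilities and the misjudgment probability could drive a simulation is given below. The three probabilities are hypothetical placeholders, not values from Table 5, and the function names are illustrative only.

```python
import random

OUTCOMES = ("qualified", "reworkable", "non_reworkable")


def sample_output(probabilities, rng):
    """Draw one output class; `probabilities` follows the order of OUTCOMES."""
    if len(probabilities) != len(OUTCOMES) or abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("need three probabilities that sum to 1")
    return rng.choices(OUTCOMES, weights=probabilities, k=1)[0]


def escapes_inspection(rng, misjudgment_probability=0.046):
    """True if a substandard item is misjudged and flows to downstream processing."""
    return rng.random() < misjudgment_probability


rng = random.Random(0)            # fixed seed for a reproducible run
probs = (0.90, 0.07, 0.03)        # hypothetical output probabilities
counts = {name: 0 for name in OUTCOMES}
for _ in range(20000):
    counts[sample_output(probs, rng)] += 1
print(counts)
```

Only the items for which `escapes_inspection` returns True would act as substandard feedstocks for downstream machines in such a simulation.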
In the FJSMS, real-time sensor data reveal that the increase in failure rates triggered by feedstock quality fluctuations exhibits heterogeneous characteristics. The quality management level directly influences the probability of substandard products. When the sigma level is lowered (reflecting looser control), the system detects a significant rise in the frequency of substandard products flowing into downstream processes, leading to an aggravated machine-feedstock coupled degradation cascading effect. This, in turn, increases the damage inflicted on the machine each time a substandard product is processed. To quantify the impact of the machine-feedstock coupled degradation, this paper gives three typical impact patterns, which are mathematically characterized by the differentiated Beta distributions shown in rows 2-4 of Table 6 (Ye et al., 2022).
Furthermore, the environmental monitoring network captures PED (e.g., moisture, overheating, corrosive agents) in real time. The adverse effects of these disturbances on the machines are quantitatively modeled. This paper employs differentiated Beta distributions to simulate their intensity in accelerating machine degradation, with the parameters detailed in rows 5-7 of Table 6. The environmental monitoring network is also capable of capturing sudden external shocks. The shock intensity (unit: GPa) can be modeled using a normal distribution (Wang et al., 2020a; Wang et al., 2020b; Wang et al., 2020c). Notably, this paper uses a Poisson process to generate shock events, with a shock arrival rate of 0.0009 events per hour. If a machine that is processing a job encounters a sudden external shock, the current job is interrupted and marked as scrapped. If a machine undergoing maintenance encounters the shock, the current maintenance activity remains unaffected. Considering the hierarchical impact of external shocks, the safety threshold is set to 0.2 and the fatal threshold to 1.5. The resulting ranges are the safe zone [0, 0.2), the damage zone [0.2, 1.5), and the fatal zone [1.5, +∞) (Wang et al., 2020b). It is important to emphasize that the thresholds and zoning ranges established in this study are illustrative examples within a specific research context, demonstrating possible scenarios of shock effects. These values are not universal constants. In practical applications, the specific thresholds vary with the characteristics of a job shop, its equipment conditions, and the actual operational environment.
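The shock generation and zoning described above can be sketched in Python as follows. The arrival rate, horizon, and thresholds come from the text, whereas the mean and standard deviation of the intensity distribution are hypothetical, and clipping negative draws to zero is an added assumption.

```python
import random


def simulate_shocks(rate_per_hour, horizon_hours, mean, std, rng):
    """Homogeneous Poisson arrivals (exponential gaps) with normal intensities.

    Negative intensity draws are clipped to 0 (an assumption of this sketch).
    """
    t, shocks = 0.0, []
    while True:
        t += rng.expovariate(rate_per_hour)  # gap to the next shock, in hours
        if t > horizon_hours:
            return shocks
        shocks.append((t, max(0.0, rng.gauss(mean, std))))


def classify_shock(intensity, safe_threshold=0.2, fatal_threshold=1.5):
    """Map an intensity to the safe, damage, or fatal zone."""
    if intensity < safe_threshold:
        return "safe"
    if intensity < fatal_threshold:
        return "damage"
    return "fatal"


rng = random.Random(7)
# mean=0.5 and std=0.3 are hypothetical intensity parameters.
shocks = simulate_shocks(0.0009, 2880.0, mean=0.5, std=0.3, rng=rng)
expected_count = 0.0009 * 2880.0  # expected number of shocks over the horizon
for time_h, intensity in shocks:
    print(f"{time_h:8.2f} h  {intensity:.2f}  {classify_shock(intensity)}")
```

With this rate, only a few shocks are expected over the 2880 h horizon, so individual runs can differ markedly in the number and severity of shocks.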
This study adopts the minute as the time unit for simulating the job shop manufacturing system. All data that were originally in units of hours or seconds are uniformly converted into minutes in the simulation process. The final results are then converted into larger time scales such as hours, days, weeks, periods, and months to support visual analysis.
6.2 Machine failure probability under machine-feedstock-environment degradation under different quality management levels
This study simulates the machine degradation process triggered by multiple factors (machine, feedstocks, and environment) without considering maintenance scenarios. Figure 6 displays the machine failure probability under different quality levels (1σ, 2σ, 3σ, and 6σ). The three subplots in the first row present the coupled degradation mode of machine-feedstock with sudden external shock, while the three subplots in the second row illustrate the coupled degradation mode of machine-feedstock-progressive environment with sudden external shock. Notably, the sharp increase in failure probability observed in Fig. 6 stems from sudden external shocks; however, the probability value does not jump instantaneously to 1. Simulation data indicate that the FJSMS first encountered a non-fatal Grade II shock with an intensity of 0.65 at hour 1973.62. The damage induced by this shock began accumulating from this point onward, and by hour 1978, the failure probability of all machines approached 1. Because the horizontal time axis spans 0–2880 h, the approximately 5-h damage accumulation process is not distinctly visible in the figure. The simulation of the machines’ degradation process adheres to the following principles. (1) Multiple machines operate independently and stop only upon failure. (2) Product queues are not considered. (3) Only substandard feedstocks and shock events are monitored. (4) Substandard feedstocks are assumed to be processed immediately upon generation, thereby accelerating the machine degradation.
In Fig. 6, under the same quality level, the machine failure probability without PED is lower than that with such disturbances. In the absence of PED, degradation mode 1 (weak degradation) corresponds to the lowest machine failure probability, followed by degradation mode 2 (normal degradation), while degradation mode 3 (strong degradation) corresponds to the highest failure probability. Similarly, for degradation modes 4, 5, and 6 (i.e., weak, normal, and strong degradation under PED), the machine failure probability increases sequentially. Under the same degradation mode, stricter quality management (i.e., a higher sigma level) reduces the frequency of dimensionally substandard feedstocks, resulting in fewer opportunities for substandard feedstocks to be misjudged and flow downstream for processing. Consequently, the machines experience milder wear, slower degradation, and a slower increase in the failure probability. Taking Machine 1, which has the highest inherent reliability, as an example, the median lives obtained through normal machine-feedstock-environment degradation simulation at the 1σ, 2σ, 3σ, and 6σ quality levels are 227, 660, 1416, and 1437 h, which are 0.16, 0.46, 0.9851, and 0.9997 times the designed median life, respectively. Taking the strictest 6σ lean management as an example, the median lives of Machines 1–7 obtained through a strong machine-feedstock-environment degradation simulation are 1437, 820, 1301, 1013, 875, 1368, and 1159 h, which are 0.03%, 0.09%, 0.05%, 0.02%, 0.10%, 0.05%, and 0.04% lower than the designed median life, respectively.
6.3 Manufacturing process analysis and performance evaluation under 3σ quality management
This study adopts a period-by-period sequential simulation strategy. By comprehensively considering all influencing factors proposed in this paper and through multiple rounds of simulation debugging, a rough estimate of the achievable input range and output range for each production period is determined. Based on these estimates, overproduction is calculated via the input-output ratio. The order reception mechanism dynamically matches production capacity, with total order quantities fluctuating randomly around the possible output range. Due to production process variability, the actual output corresponding to overproduction carries significant uncertainty. Unlike the model in Section 6.2, this simulation introduces the following key mechanisms.
(1) Machines experience states such as idle, fault, or busy; upon failure, they enter a maintenance queue to await repair.
(2) Machine reliability is dynamically calculated based on the actual cumulative processing time.
(3) Processing strictly follows the process route, forming processing queues in front of busy machines.
(4) A perfect maintenance strategy is implemented; upon repair, machine reliability is reset to 1.
(5) The time of maintenance completion serves as the reset point for damage accumulation. Impacts from all shocks and substandard feedstocks occurring before this point are cleared. Only new events occurring after this point accumulate and affect reliability.
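The state handling and reset rules in mechanisms (1) to (5) can be sketched as a small Python class. This is an illustrative sketch, not the simulation used in this study: queues, the busy state, and the reliability calculation are omitted, and resetting the cumulative processing time at maintenance completion is an assumption that follows from perfect maintenance. All names and numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    name: str
    state: str = "idle"                 # "idle" or "fault" in this sketch
    processing_time: float = 0.0        # cumulative time since the last maintenance
    damage_increments: list = field(default_factory=list)  # events since last maintenance

    def process(self, duration, substandard_increment=None):
        """Accumulate processing time and, if given, damage from a substandard feedstock."""
        if self.state == "fault":
            raise RuntimeError(f"{self.name} is down and cannot process jobs")
        self.processing_time += duration
        if substandard_increment is not None:
            self.damage_increments.append(substandard_increment)

    def record_shock(self, increment):
        """Store the damage contribution of an external shock."""
        self.damage_increments.append(increment)

    def fail(self):
        """Mark the machine as failed; it then waits for repair."""
        self.state = "fault"

    def complete_maintenance(self):
        """Perfect maintenance: clear accumulated damage and restart the age clock."""
        self.state = "idle"
        self.processing_time = 0.0
        self.damage_increments.clear()


m1 = Machine("M1")
m1.process(120.0)
m1.process(90.0, substandard_increment=2e-5)  # hypothetical failure rate increment
m1.record_shock(1e-4)                         # hypothetical shock contribution
m1.fail()
m1.complete_maintenance()
print(m1)
```

After `complete_maintenance`, only events recorded later affect the machine, which corresponds to the reset point in mechanism (5).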
Figure 7 shows the simulation results of sudden external shocks. In all subsequent simulations, the manufacturing process is influenced by the external shocks depicted in Fig. 7.
Figure 8 shows the input of each product type per production period in the FJSMS under 3σ quality management.
Figure 9 illustrates the impact of each substandard feedstock processing event on the machine failure rate increment under the 3σ quality level. Each scatter point represents one substandard feedstock processing event. The horizontal axis indicates the time of occurrence of the event, while the vertical axis shows the resulting increment in machine failure rate. In Fig. 9, the mean failure rate increment caused by substandard feedstock processing events increases from degradation mode 1 to mode 6, which demonstrates that the impact on machine degradation strengthens progressively across the modes. Furthermore, the adverse effects of processing substandard feedstocks under PED (degradation modes 4, 5, and 6) are more pronounced than those without such disturbances (degradation modes 1, 2, and 3).
Based on the maintenance priority index, the prioritization order of machines in CM is M1 (0.183), M6 (0.165), M3 (0.164), M7 (0.150), M4 (0.126), M5 (0.111), and M2 (0.103). It is worth emphasizing that this maintenance priority index is unaffected by both the quality level and the machine degradation mode. In other words, the machine prioritization order for CM remains the same across all simulations in this study. Figure 10 summarizes the weekly frequency of different maintenance events in the FJSMS. The total number of PM events is 31, 34, 38, 50, 52, and 75, and the total number of CM events is 100, 98, 105, 96, 94, and 97 in degradation modes 1 to 6, respectively. In Fig. 10, under the 3σ quality level, the frequency of CM events across all six degradation modes is generally higher than that of PM events. Since the 3σ quality management standard is relatively stringent, the number of substandard feedstocks is comparatively low, and machine degradation progresses more slowly. Consequently, the conditions triggering PM are met less frequently, leading to a relatively higher occurrence of CM after failures. Under PED in degradation modes 4, 5, and 6, machine degradation is more severe than in degradation modes 1, 2, and 3. As a result, the probability of meeting PM conditions increases, necessitating PM for more machines.
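The CM prioritization can be illustrated with a short Python sketch that ranks failed machines by the reported priority index and respects the limit of 4 machines under CM at any time. The function name `select_for_cm` and the example set of failed machines are hypothetical, and repairing machines in descending order of the index is inferred from the ordering listed above.

```python
# Maintenance priority index of each machine, as reported in the text.
PRIORITY = {
    "M1": 0.183, "M2": 0.103, "M3": 0.164, "M4": 0.126,
    "M5": 0.111, "M6": 0.165, "M7": 0.150,
}
MAX_CM = 4  # at most 4 machines undergo CM at any given time


def select_for_cm(failed, in_repair_count=0):
    """Split failed machines into those repaired now and those left waiting."""
    free_slots = max(0, MAX_CM - in_repair_count)
    ranked = sorted(failed, key=lambda m: PRIORITY[m], reverse=True)
    return ranked[:free_slots], ranked[free_slots:]


full_order = sorted(PRIORITY, key=PRIORITY.get, reverse=True)
start_now, waiting = select_for_cm(["M2", "M5", "M3", "M6", "M7"], in_repair_count=1)
print(full_order)
print(start_now, waiting)
```

With one machine already under repair, three slots remain, so the two failed machines with the lowest index wait in the maintenance queue.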
Figure 11 shows the product output across different production periods under the 3σ quality level. In Fig. 11, the differences in product output among all degradation modes are relatively small. Because 3σ quality management is stringent, the number of substandard feedstocks is relatively low, resulting in more stable and effective output.
With the sellers’ over-ordering rate set at 10%, the acceptable order scale can be roughly estimated. Given the stochastic nature of order quantities, this study employs repeated Monte Carlo simulation experiments to obtain statistical means. The product order quantity received by the FJSMS is constrained both by the system’s production capacity and by external market demand. The product order quantity refers to the sum of required quantities for the same product type across all orders. Figure 12 presents the order quantity received by the FJSMS for each product type within various production periods.
The PEP and PEO of the FJSMS for each production period are shown in Fig. 13. Except for isolated cases, PEP is generally greater than 0.8, and PEO is generally greater than 0.6. In most cases, PEO is lower than PEP. This is because PEP is only affected by the total demand and total output of each product type, whereas PEO is influenced not only by product demand and output but also by the number of orders and the order allocation strategy. For example, in the 10th period, PEO is slightly higher than PEP, indicating that the system adopted an effective order allocation scheme and leveraged the benefits of the flexible fulfillment mechanism during this period. In the second period in degradation mode 2 and the 12th period in degradation mode 6, extreme values of performance efficiency were observed: unlike in the other periods under this scenario, PEP plummeted and PEO approached zero. In degradation mode 2, the PEP of the second period is lower than that of all other periods, indicating weaker production capacity during this period and poor coverage of total demand by the total output. Additionally, the PEO of the second period in degradation mode 2 is very low, suggesting that the dynamic interaction capability between the internal and external aspects of the FJSMS was severely constrained during this period. The same applies to the 12th period in degradation mode 6.
6.4 Manufacturing process analysis and performance evaluation under 6σ lean management
Figure 14 shows the product order quantity of each production period received by the FJSMS under lean management.
Based on the production plan, Fig. 15 shows the input quantity of each product type for each production period under lean management.
Under lean management, simulation results show that a substandard feedstock processing event occurred only once in degradation mode 2 and once in degradation mode 6, owing to the extremely high product quality requirements. The failure rate increment triggered by this event was smaller in degradation mode 2 than in degradation mode 6. This is because degradation mode 6 includes PED while mode 2 does not, leading to faster machine degradation when substandard feedstocks are processed in mode 6.
Figure 16 shows the frequency of different maintenance events under lean management. The total number of PM events is 35, 37, 32, 34, 36, and 33, and the total number of CM events is 102, 96, 98, 101, 98, and 100 in degradation modes 1 to 6, respectively. The results indicate that CM occurs more frequently than PM. Compared with 3σ quality management shown in Fig. 10, the frequency of CM under lean management differs only slightly (595 versus 590 events summed over the six degradation modes). In contrast, under lean management, the frequency of PM in degradation modes 4, 5, and 6 is significantly lower than that under 3σ quality management.
Figure 17 shows the production output quantity of each production period in the FJSMS under lean management.
Figure 18 shows the PEP and PEO of the FJSMS for each production period under lean management. With the exception of a few cases, PEP is generally greater than 0.9, while PEO is generally greater than 0.7. In most instances, PEO is lower than PEP. This is because PEP is only influenced by the total demand and total output of each product type, whereas PEO is additionally affected by the number of orders and the system’s order allocation strategy. Notably, in the 10th period, PEO is slightly higher than PEP, indicating flexible order scheduling during this period. In contrast, during the 7th period, the FJSMS exhibits high PEP but low PEO, suggesting sufficient production capacity yet poor order coordination capability. Compared with the 3σ management results shown in Fig. 13, the stringent lean management has eliminated extreme values of PEP and PEO in the FJSMS.
Since 3σ quality management and 6σ lean management are more commonly adopted in smart manufacturing scenarios, the main body of this paper presents only the manufacturing process analysis and performance efficiency results under these two quality levels. Following the same simulation modeling and analysis methods used for the FJSMS under 3σ and 6σ management, Appendix A and Appendix B provide the manufacturing process and performance efficiency results under 1σ and 2σ quality management, respectively.
7 Conclusions
7.1 Key findings
This paper developed an IIoT-based performance framework integrating a machine-feedstock-environment coupled degradation model for the operation and maintenance of smart manufacturing systems (SMSs). First, an architecture of the IIoT-based SMS was developed, and the dynamic network of the SMS was modeled. Second, a machine-feedstock-environment coupled degradation model for the SMS was proposed. Third, a preventive maintenance (PM) strategy based on real-time reliability and a corrective maintenance (CM) strategy based on the maintenance priority index were presented. Last, product-oriented and order-oriented performance efficiency (PEP and PEO) models for the SMS were established. The key findings are as follows.
(1) The sigma quality level influences machine degradation
Higher sigma levels (e.g., 6σ) imply stricter quality control, lower defect rates, and fewer misjudged items being processed, which slows machine wear and degradation. Environmental disturbances amplify the negative impact of defective products. Under progressive disturbances, the median machine life is lower than the design value.
(2) The sigma quality level drives maintenance frequency
CM frequency shows no significant difference across sigma levels. PM frequency decreases as the sigma level increases: machine degradation slows and reliability remains at a higher level, so PM is triggered less often and its frequency drops notably. PM is less frequent than CM at 3σ or 6σ, but at 1σ or 2σ with environmental disturbances, accelerated degradation triggers more PM, which then exceeds the CM frequency.
(3) The sigma quality level ensures system performance stability
Lower sigma levels (e.g., 1σ) lead to looser quality control, increased defects, reduced output capacity, and lower order fulfillment, causing severe fluctuations and extreme lows in PEP and PEO. At 6σ, PEP and PEO remain high and stable. Occasional sharp drops occur at 2σ or 3σ.
7.2 Managerial implications
This study provides insights not only for the strategic selection of the quality management level but also for system operation optimization. On the one hand, the implementation of lean management should be prioritized. Through stringent quality control, 6σ management slows machine degradation, reduces maintenance frequency, and ensures high and stable PEP and PEO. While 3σ management maintains a relatively low maintenance frequency, it carries the risk of occasional sharp performance drops and should be used cautiously in order-oriented systems. 2σ management is not recommended due to higher maintenance costs, and 1σ management must be strictly avoided owing to high defect rates, severe machine degradation, and extreme performance instability. On the other hand, SMSs should go beyond static production capacity configuration. By leveraging flexibility, resilience, and smart technologies to respond dynamically to real-time orders and resource status, they can achieve coordinated optimization of output and order fulfillment. Introducing lean methods such as 6σ is crucial for achieving reliable, efficient, and low-cost system operation.
7.3 Limitations and future research directions
Although the study has achieved its expected objectives, it has several limitations that warrant further research. This paper does not account for opportunistic maintenance or for a trade-off model integrating PEP and PEO, both of which are critical directions for future investigation. Future work can therefore focus on enriching maintenance strategies to restore and enhance system performance. Establishing a multi-objective optimization model to balance feedstock quality, machine output capacity, and system performance is crucial for advancing system operation optimization. Additionally, simulating SMSs via multi-agent modeling represents another promising research direction.
8 Appendix A Manufacturing process analysis and performance evaluation under 1σ quality management
Figure A1 illustrates the impact of each substandard feedstock processing event on the machine failure rate increment in the FJSMS under the 1σ quality level. The mean machine failure rate increment caused by substandard feedstock processing events increases from degradation mode 1 to mode 6, demonstrating progressively increasing impacts on machine degradation. The adverse effects of processing substandard feedstocks under PED (degradation modes 4, 5, and 6) are more pronounced than those without such disturbances (degradation modes 1, 2, and 3). Since 1σ quality management is more lenient than 3σ and 6σ management, the total number of substandard feedstock processing events is higher.
Figure A2 summarizes the weekly frequency of different types of maintenance events under the 1σ quality level. The results show that the total maintenance frequency under PED (degradation modes 4, 5, and 6) is significantly higher than that under internal natural degradation only (degradation modes 1, 2, and 3). The number of PM events increases progressively from degradation mode 1 to 6. Under PED (degradation modes 4, 5, and 6), the processing of numerous substandard feedstocks leads to more severe machine degradation. The deteriorated machine conditions are more easily detected by the intelligent condition monitoring mechanisms, resulting in a higher probability of triggering PM conditions and consequently more frequent PM activities. In degradation mode 6, PM occurs very frequently, which effectively prevents sudden machine failures. Therefore, the frequency of CM in this mode is lower than in other degradation modes.
Figure A3 shows the product output across different production periods in the FJSMS under the 1σ quality level. The product output in degradation modes 1, 2, and 3 without PED is significantly higher than that in degradation modes 4, 5, and 6 with PED. The results suggest that PED has a significant adverse effect on product output.
The PEP and PEO of the FJSMS for each production period are shown in Fig. A4. Overall, the performance efficiency values deteriorate gradually from degradation mode 1 to mode 6. Generally, the stronger the impact of the degradation mode, the worse the system’s coverage of total demand by product output and its order fulfillment capability become, resulting in lower performance efficiency. Unlike under 3σ and 6σ quality management, at the 1σ quality management level, PEO approaches zero even in degradation mode 1, which has the weakest adverse impact. Moreover, periods with extremely low performance efficiency occur under each of the other degradation modes. Compared with 3σ and 6σ quality management, the number of periods exhibiting extreme performance efficiency under 1σ quality management is significantly higher.
9 Appendix B Manufacturing process analysis and performance evaluation under 2σ quality management
Figure B1 shows the processing events of substandard feedstocks in the FJSMS under 2σ quality management. The number of substandard feedstock processing events under 2σ quality management is lower than that under 1σ quality management, but greater than that under both 3σ and 6σ quality management.
Figure B2 presents the frequency of different types of maintenance events in the FJSMS under the 2σ quality level. The total number of maintenance events under the 2σ quality level is lower than that under the 1σ quality level, but greater than that under both the 3σ and 6σ quality levels.
Figure B3 shows the output of each production period in the FJSMS under the 2σ quality level.
Figure B4 shows the PEP and PEO of the FJSMS for each production period under the 2σ quality level. The number of periods exhibiting extreme performance efficiency under 2σ quality management is lower than that under 1σ quality management but greater than that under 6σ lean management.
References
[1] Arno R, Dowling N, Fairfax S, Schuerger R J, Weber J (2015). What is RCM and how could it be applied to the critical loads. IEEE Transactions on Industry Applications, 51(3): 2045–2053
[2] Awad M I, Hassan N M (2018). Joint decisions of machining process parameters setting and lot-size determination with environmental and quality cost consideration. Journal of Manufacturing Systems, 46: 79–92
[3] Bahria N, Chelbi A, Bouchriha H, Dridi I H (2019). Integrated production, statistical process control, and maintenance policy for unreliable manufacturing systems. International Journal of Production Research, 57(8): 2548–2570
[4] Banerjee A, Fizza K, Georgakopoulos D, Forkan A R M, Jayaraman P P, Milovac J K (2025). Improving the high-quality product consistency in a digital manufacturing environment. IEEE Transactions on Industrial Informatics, 21(1): 623–632
[5] Cao Y, Wang P, Xv W, Dong W (2025). Joint optimization of quality-based multi-level maintenance and buffer stock within multi-specification and small-batch production. Frontiers of Engineering Management, 12(4): 754–773
[6] Caputo A C, Donati L, Salini P (2023). Estimating resilience of manufacturing plants to physical disruptions: Model and application. International Journal of Production Economics, 266: 109037
[7] Chang P C, Lin Y K, Chen J (2017). System reliability for a multi-state manufacturing network with joint buffer stations. Journal of Manufacturing Systems, 42: 170–178
[8] Chiu Y S P, Chiu S W, Pai F Y, Chiu V (2023). Minimizing operating expenditures for a manufacturing system featuring quality reassurances, probabilistic failures, overtime, and outsourcing. International Journal of Industrial Engineering Computations, 14(1): 83–98
[9] Chiu Y S P, Ke C Y, Chiu T, Yeh T M (2022). Optimizing an FPR-based supplier-retailer integrated problem with an outsourcer, rework, expedited rate, and probabilistic breakdown. International Journal of Industrial Engineering Computations, 13(4): 601–616
[10] Cole M A, Elliott R J R, Okubo T, Strobl E (2016). How do manufacturing plants respond to large physical shocks? The Kobe earthquake as a natural experiment. Available at the website of uni-heidelberg.de
[11] Das K (2008). A comparative study of exponential distribution vs Weibull distribution in machine reliability analysis in a CMS design. Computers & Industrial Engineering, 54(1): 12–33
[12] Dui H, Li H, Dong X, Wu S (2025a). An energy IoT-driven multi-dimension resilience methodology of smart microgrids. Reliability Engineering & System Safety, 253: 110533
[13] Dui H, Li H, Wu S (2024a). Performance analysis of IoT-enabled hydro-photovoltaic power systems considering electrical power mission chains. Energy Conversion and Management, 319: 118962
[14] Dui H, Lu Y, Wu S (2024b). Competing risks-based resilience approach for multi-state systems under multiple shocks. Reliability Engineering & System Safety, 242: 109773
[15] Dui H, Wang H, Yang Y, Xing L (2025b). IoT-based mission reliability evaluation and maintenance optimization of intelligent manufacturing systems integrating human errors and heterogeneous feedstocks. Reliability Engineering & System Safety, 264: 111354
[16]
Dui H,Wu X,Wu S,Xie M, (2024c). Importance measure-based maintenance strategy optimization: Fundamentals, applications and future directions in AI and IoT. Frontiers of Engineering Management, 11( 3): 542–567
[17]
Dui H,Zeng Q,Xie M, (2025c). Generative AI-based spatiotemporal resilience, green and low-carbon transformation strategy of smart renewable energy systems. Frontiers of Engineering Management, 12( 4): 1220–1235
[18]
Dui H,Zhang H,Dong X,Tang C,Chen Z, (2026). A new spatiotemporal resilience optimization strategy for UAV swarm in data-physical-enabled low-altitude IoT networks. Reliability Engineering & System Safety, 266: 111762
[19]
Fan J,Yin Y,Wang T,Dong W,Zheng P,Wang L, (2025). Vision-language model-based human-robot collaboration for smart manufacturing: A state-of-the-art survey. Frontiers of Engineering Management, 12( 1): 177–200
[20]
Feng Q,Hai X,Liu M,Yang D,Wang Z,Ren Y,Sun B,Cai B, (2022). Time-based resilience metric for smart manufacturing systems and optimization method with dual-strategy recovery. Journal of Manufacturing Systems, 65: 486–497
[21]
Fu T,Liu S,Li P, (2024). Intelligent smelting process, management system: Efficient and intelligent management strategy by incorporating large language model. Frontiers of Engineering Management, 11( 3): 396–412
[22]
Gao Z,Yao J,He Y,Dui H,Liang Z,Li J, (2025). Remaining useful life prediction and preventive maintenance approach for multistate manufacturing systems with feedstock heterogeneity. Reliability Engineering & System Safety, 264: 111352
[23]
Geng Y,Wang S,Shi J,Zhang Y,Wang W, (2023). Reliability modeling of phased degradation under external shocks. Reliability Engineering & System Safety, 239: 109524
[24]
Ghaleb M,Taghipour S,Zolfagharinia H, (2021). Real-time integrated production-scheduling and maintenance-planning in a flexible job shop with machine deterioration and condition-based maintenance. Journal of Manufacturing Systems, 61: 423–449
[25]
Guo Z,Zhang Y,Liu S,Wang X,Wang L, (2023). Exploring self-organization and self-adaption for smart manufacturing complex networks. Frontiers of Engineering Management, 10( 2): 206–222
[26]
HuangF (2023). Shenyang: Mass production achieved in overcoming “neck-and-neck” technologies, TBEA’s high voltage bushing R&D and manufacturing base officially put into operation. Available at the website of ln.cri.cn
[27]
Huang J,Huang S,Moghaddam S K,Lu Y,Wang G,Yan Y,Shi X, (2024). Deep reinforcement learning-based dynamic reconfiguration planning for digital twin-driven smart manufacturing systems with reconfigurable machine tools. IEEE Transactions on Industrial Informatics, 20( 11): 13135–13146
[28]
Jbair M,Ahmad B,Maple C,Harrison R, (2022). Threat modelling for industrial cyber physical systems in the era of smart manufacturing. Computers in Industry, 137: 103611
[29]
Jia Z,Ni Z,Ma C, (2023). Dynamic process modeling and real-time performance evaluation of rework production system with small-lot order. IEEE Robotics and Automation Letters, 8( 5): 2874–2881
[30]
Kim W,Park K, (2019). Performance evaluation of deterministic flow lines considering multiclass customer and random setups. Journal of Manufacturing Systems, 53: 109–123
[31]
Leng J,Xie J,Li R,Zhou X,Gu X,Liu Q,Chen X,Zhang W,Kusiak A, (2025). Resilient manufacturing: A review of disruptions, assessment, and pathways. Journal of Manufacturing Systems, 79: 563–583
[32]
Lin Y K,Chang P C, (2012). Evaluate the system reliability for a manufacturing network with reworking actions. Reliability Engineering & System Safety, 106: 127–137
[33]
Lin Y K,Chang P C, (2013). Reliability-based performance indicator for a manufacturing network with multiple production lines in parallel. Journal of Manufacturing Systems, 32( 1): 147–153
[34]
Liu H C,Liu R,Gu X,Yang M, (2023). From total quality management to Quality 4.0: A systematic literature review and future research agenda. Frontiers of Engineering Management, 10( 2): 191–205
[35]
Liu S,Sun Y,Zheng P,Lu Y,Bao J, (2022). Establishing a reliable mechanism model of the digital twin machining system: An adaptive evaluation network approach. Journal of Manufacturing Systems, 62: 390–401
[36]
Lu B,Zhou X, (2019). Quality and reliability oriented maintenance for multistage manufacturing systems subject to condition monitoring. Journal of Manufacturing Systems, 52: 76–85
[37]
Lv X,Shi L,He Y,He Z,Lin D K J, (2024). Joint optimization of production, maintenance, and quality control considering the product quality variance of a degraded system. Frontiers of Engineering Management, 11( 3): 413–429
[38]
Mirhosseini M,Heydari A,Astiaso Garcia D,Mancini F,Keynia F, (2022). Reliability based maintenance programming by a new index for electrical distribution system components ranking. Optimization and Engineering, 23( 4): 2315–2333
[39]
Qin T,Du R,Kusiak A,Tao H,Zhong Y, (2022). Designing a resilient production system with reconfigurable machines and movable buffers. International Journal of Production Research, 60( 17): 5277–5292
[40]
Sahoo S,Lo C Y, (2022). Smart manufacturing powered by recent technological advancements: A review. Journal of Manufacturing Systems, 64: 236–250
[41]
Shanbhag V V,Kandukuri S T,Olimstad G,Schlanbusch R, (2025). Predictive maintenance of critical components in hydroelectric turbines: A review. IEEE Sensors Journal, 25( 17): 31959–31979
[42]
Steinbacher L M,Rippel D,Schulze P,Rohde A K,Freitag M, (2023). Quality-based scheduling for a flexible job shop. Journal of Manufacturing Systems, 70: 202–216
[43]
SudadiyoS (2025). Monte carlo-informed reliability and availability framework for innovative inspection timelines: A case study of the PA01–BC01 heat exchanger. Journal of Nuclear Science and Technology: 1–4
[44]
Wang B,Tao F,Fang X,Liu C,Liu Y,Freiheit T, (2021a). Smart manufacturing and intelligent manufacturing: A comparative review. Engineering, 7( 6): 738–757
[45]
Wang J,Bai G,Li Z,Zuo M, (2020a). A general discrete degradation model with fatal shocks and age- and state-dependent nonfatal shocks. Reliability Engineering & System Safety, 193: 106648
[46]
Wang J,Bai G,Zhang L, (2020b). Modeling the interdependency between natural degradation process and random shocks. Computers & Industrial Engineering, 145: 106551
[47]
Wang J,Han X,Zhang Y,Bai G, (2021b). Modeling the varying effects of shocks for a multi-stage degradation process. Reliability Engineering & System Safety, 215: 107925
[48]
Wang J,Li Z,Bai G,Zuo M, (2020c). An improved model for dependent competing risks considering continuous degradation and random shocks. Reliability Engineering & System Safety, 193: 106641
[49]
Wang X,Ke Y,Cai Z,Ye Z, (2024). Operation risk assessment of flexible manufacturing networks subject to quality-reliability coupling. Reliability Engineering & System Safety, 250: 110282
[50]
Wang Y,Li X,Mo D, (2021c). Knowledge-empowered multitask learning to address the semantic gap between customer needs and design specifications. IEEE Transactions on Industrial Informatics, 17( 12): 8397–8405
[51]
Wu B,Zhang Y,Zhao S, (2023). Modeling coupled effects of dynamic environments and zoned shocks on systems under dependent failure processes. Reliability Engineering & System Safety, 231: 108911
[52]
Xiang W,Yu K,Han F,Fang L,He D,Han Q L, (2024). Advanced manufacturing in industry 5.0: A survey of key enabling technologies and future trends. IEEE Transactions on Industrial Informatics, 20( 2): 1055–1068
[53]
Xing L, (2020). Reliability in internet of things: current status and future perspectives. IEEE Internet of Things Journal, 7( 8): 6704–6721
[54]
Yang D Y,Wu Y Y, (2017). Analysis of a finite-capacity system with working breakdowns and retention of impatient customers. Journal of Manufacturing Systems, 44: 207–216
[55]
Yao J,Gao Z,He Y,Peng C, (2024). Integrated mission reliability modeling for multistate manufacturing systems considering heterogeneous feedstocks based on extended stochastic flow manufacturing network. Reliability Engineering & System Safety, 243: 109840
[56]
Ye Z,Cai Z,Si S,Zhang S,Yang H, (2020). Competing failure modeling for performance analysis of automated manufacturing systems with serial structures and imperfect quality inspection. IEEE Transactions on Industrial Informatics, 16( 10): 6476–6486
[57]
Ye Z,Cai Z,Yang H,Si S,Zhou F, (2023). Joint optimization of maintenance and quality inspection for manufacturing networks based on deep reinforcement learning. Reliability Engineering & System Safety, 236: 109290
[58]
Ye Z,Cai Z,Zhou F,Zhao J,Zhang P, (2019). Reliability analysis for series manufacturing system with imperfect inspection considering the interaction between quality and degradation. Reliability Engineering & System Safety, 189: 345–356
[59]
Ye Z,Si S,Yang H,Cai Z,Zhou F, (2022). Machine and feedstock interdependence modeling for manufacturing networks performance analysis. IEEE Transactions on Industrial Informatics, 18( 8): 5067–5076
[60]
Ye Z,Yang H,Cai Z,Si S,Zhou F, (2021). Performance evaluation of serial-parallel manufacturing systems based on the impact of heterogeneous feedstocks on machine degradation. Reliability Engineering & System Safety, 207: 107319
[61]
Zhang C,Juraschek M,Herrmann C, (2024). Deep reinforcement learning-based dynamic scheduling for resilient and sustainable manufacturing: A systematic review. Journal of Manufacturing Systems, 77: 962–989
[62]
Zhang D,Zhang Y, (2017). Dynamic decision-making for reliability and maintenance analysis of manufacturing systems based on failure effects. Enterprise Information Systems, 11( 8): 1228–1242
[63]
Zhu C,Chang Q,Arinez J, (2020). Data-enabled modeling and analysis of multistage manufacturing systems with quality rework loops. Journal of Manufacturing Systems, 56: 573–584