Closed-Loop Digital Twin for the Product Lifecycle


 Digital Twin advances have provided the conceptual ground for integrating a physical product with its digital representation. However, Digital Twin implementation has focused on the beginning of life and manufacturing optimization, leaving space for a Digital Twin model that encompasses and connects different stages of the entire product lifecycle. In this scenario, the integration of company-internal data with real-time customer data is still a challenge. Besides, implementing such a model in a multiplatform environment is also an open issue in the literature. This paper proposes the definition of a Closed-loop Digital Twin implemented as middleware software that connects PLM, ERP, and MES data with customers' usage data. The proposed concept was implemented and tested in a learning factory. Results demonstrated the concept's potential to consolidate product data, support data analyses, and provide insights for different stages of the product lifecycle.


Introduction
Digital transformation is changing the interaction between industry systems and customers' data, requiring the development of a closed and controlled data environment. A connected environment can provide lifecycle information of the product and contribute to the creation of its digital model [1,2].
The Digital Twin (DT) concept addresses this physical/digital pairing [3][4][5][6]. "A digital twin is a digital representation of an active unique product … that comprises its selected characteristics, properties, conditions, and behaviors by means of models, information, and data within a single or even across multiple life cycle phases" [7].
DT is not just the merging of all the data that the product generates. It is the collection of the individualized data that is relevant for holistic information of the product [8][9][10]. It maps components on product lifecycle, physical and virtual data, and the interaction between them [11,12]. To create this environment, it is necessary to develop a solid digital architecture combining different data sources that, when individualized, compose the interface of a Digital Twin [13].
Current studies on DT are mainly focused on the beginning of life (BOL) or manufacturing optimization.
The literature lacks a DT model and application that encompass different lifecycle phases and systems in a closed-loop information cycle [9,11]. Besides, the connection and storage of lifecycle data [10,12,14] and the implementation of a multiplatform solution over the entire lifecycle of a product [2,11,15] have not been explored in detail in the literature.
These literature and application gaps are mainly related to the difficulty of integrating and contextualizing data from systems across the product lifecycle [16], undermining knowledge management based on the entire lifecycle [13,16-18].
Attempts to organize product lifecycle data can significantly increase industry expenses, since commercial off-the-shelf systems mainly support homogeneous solution suites. Therefore, there is a gap for a scalable data infrastructure that integrates different types of systems and data - from different domains - while being able to be individualized in the creation of Digital Twin models [16].
Hence, this paper proposes the definition of a Closed-loop Digital Twin, supported by a digital thread architecture and implemented as middleware software that connects the PLM, ERP, and MES systems with customers' usage data. The Closed-loop Digital Twin is middleware software capable of collecting characteristics, properties, conditions, and behaviors of a specific product by requesting this information from the dedicated software in which the data is held. It also collects and integrates customers' usage information. Combining these data, the Closed-loop Digital Twin can perform different analyses and generate new insights for various stages and stakeholders of the product lifecycle. This paper is structured into six sections. Section 1 presents the introduction. Section 2 presents the main literature aspects of the Digital Twin and Digital Thread topics. Section 3 presents the research methodology and describes the steps taken for the DT development. Section 4 presents the research results. Section 5 discusses the results. Finally, Section 6 presents the conclusions and suggestions for further research in the area.

Digital Twin
The concept of the Digital Twin (DT) is not new [19]. As stated before, the Digital Twin addresses the physical/digital pairing [4][5][6][20]. It consists of three components: the physical entity in the physical world, the virtual model in the virtual world, and the data connection that ties the two worlds together [4].
The current focus of research on Digital Twin relates to Product Lifecycle Management [21]. Digital Twin is more than a representation of the real world. It is a digital replica of a physical entity. It enables a seamless transfer of data by connecting the physical and virtual worlds [22].
In essence, a DT relies on a middleware architecture to support real-time decisions [23]. Physical world data are transmitted to the virtual models to support simulations and validations [24][25][26][27][28][29]. In the interaction between data and physical entities, data integration is inevitable [4].
This integration between virtual and physical is described as the process of collecting data from manufacturing sites in the physical world and transmitting those data into information systems. In recent years, many new ITs have been applied to collect different data types concerning the full production lifecycle [30][31][32][33][34][35][36][37][38][39]. Examples are related to the integration between the PDM (Product Data Management), PLM (Product Lifecycle Management), and ERP (Enterprise Resource Planning) systems [15,23].
In fact, "a digital twin is a digital representation of a unique active product (real device, object, machine, service, or intangible asset) or unique product-service system (a system consisting of a product and a related service) that comprises its selected characteristics, properties, conditions, and behaviors by means of models, information, and data within a single or even across multiple life cycle phases" [7].
The Digital Twin contributes to creating a multi-modal data acquisition process that uses data analytics to predict future results of the object being studied [40]. This kind of data is deduced from physical data and is defined as virtual data [10]. Finally, using the volume, velocity, and variety of data captured with the DT's help, it would be possible to leverage a new way of performing manufacturing [41].
The creation of a data-driven application, as the DT, relies on the existence of an infrastructure that links lifecycle contextualized information from various manufacturing systems. This infrastructure is described as a Digital Thread [13].

Digital Thread
The Digital Thread links different systems over the product lifecycle [17,42]. Lubell et al. [43], reinforced by West and Blackburn [44], define the Digital Thread as an "unbroken data link". According to these authors, the Digital Thread connects design and production by structuring, in a unique data infrastructure, information ranging from the conceptual design to the retrofit and retirement of the product. The Digital Thread is a data-driven architecture of shared resources that introduces the idea of connecting information generated from all stages of the product lifecycle [45]. It supports data farming and information refinement based on the context in which it is required [13].
It is important, however, to distinguish the Digital Thread from the Digital Twin concept. As discussed in Section 2.1, the Digital Twin is a digital representation of a unique active product and serial number, a digital model of the as-built configuration. The Digital Thread, in turn, is a "digital model of the as-designed system that grows in depth and detail in parallel with the corresponding system development effort" [44].
The capacity to combine data from cross-domain sources during the product lifecycle can be identified as an enabler of the Digital Thread. The data must be captured, stored, and transferred on request of the surrounding system [46,47]. Hence, Digital Thread control is not tied to a unique model or platform but to the ability to link different lifecycle data supporting multiple viewpoints [13].
The selected architecture must consider the meaning of the data, information, and data models in order to handle the requested viewpoint required for each application [13]. Furthermore, Digital Thread is considered a data architecture capable of connecting different product lifecycle systems and prepared to generate unique views for each serial number. The Digital Thread can contain the necessary information to update the Digital Twin [45].

Research Methods
A Closed-loop Digital Twin pilot application has been developed and implemented in a learning factory that uses a configurable skateboard as its sample product. Following a design science approach, the experiment focused on describing the data connection between different systems and platforms of an industrial site. Beyond establishing a connection environment, however, the purpose was to identify gaps and challenges in creating a unique data interface for product lifecycle management.
The research involved three main phases: 1) the construction of the manufacturer and customer real-time data environments; 2) the connection between these two data environments and the DT; and 3) the development of processing algorithms - to predict behaviors - and feedback routines.
In the first phase, two data environments were configured. In the manufacturing environment, three IT systems were implemented at the learning factory: PLM, ERP, and MES. The PLM system was implemented on a high-performance (8vCPU, 16GB RAM) Virtual Machine (VM) running SQL Server with a relational database. The ERP system was implemented on a high-performance, extra-memory (8vCPU, 32GB RAM) VM running SQL Server. A Java-based customization environment was developed to receive individualized customer orders; its back end runs a relational database on a Progress environment. Finally, the MES system was implemented on a regular (2vCPU, 8GB RAM) VM with a relational database on SQL Server. Besides, the MES software has an integration script that automatically captures the XML file from the ERP customization environment and converts the data into its own database format.
In the real-time customer data environment, an IT platform was developed to collect users' data, captured through the Internet of Things (IoT). The IoT connection device was developed as Arduino Uno firmware with a Wi-Fi module and different sensors to capture real-time data from product use. These data are sent to an individual NoSQL cloud storage and can be accessed through a REST API when authorized by the customer and requested by other applications. Furthermore, two applications were developed in this environment to manage the customers' activity on the product. The first application shows the product's aggregate usage information from product activation until the present moment. This application collects the NoSQL data through a JavaScript request and presents the information on an HTML page. It also works the other way round, sending low-level commands from the HTML page that are manifested in the real product. The second application collects accelerometer data to reconstruct the movements of the product. With this PHP application, it is possible to visualize the digital 3D model of the physical product. Fig. 1 presents the user interfaces of these two applications from the real-time data environment.
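The customer-side data request can be sketched in Python as follows. This is a minimal illustration: the endpoint path, token scheme, and field names are assumptions, since the paper does not publish the API, and the HTTP transport is injected as a callable so the sketch runs without network access.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical endpoint layout; the paper does not specify the API paths.
API_BASE = "https://iot-cloud.example/api/v1"

def fetch_usage(serial_number: str, token: str,
                http_get: Callable[[str, Dict[str, str]], str]) -> Dict[str, Any]:
    """Request one product's usage record from its private NoSQL partition.

    The HTTP transport is injected so the routine can be exercised offline.
    """
    url = f"{API_BASE}/usage/{serial_number}"
    headers = {"Authorization": f"Bearer {token}"}  # customer-granted access
    return json.loads(http_get(url, headers))

def _stub_get(url: str, headers: Dict[str, str]) -> str:
    """Stand-in for the real HTTP layer, returning a canned JSON record."""
    assert "Authorization" in headers  # access only with customer consent
    return json.dumps({"serial": url.rsplit("/", 1)[-1],
                       "distance_km": 182.4, "avg_speed_kmh": 11.2})

record = fetch_usage("SKB-0042", "user-granted-token", _stub_get)
```

Injecting the transport mirrors the paper's consent model: the same routine works whether the token is valid or has been revoked by the customer.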
The application of the IT systems over the product lifecycle is described in Fig. 2, and each system's main characteristics are presented in Table 1. In the second phase, the connection between the real-time data environment and the manufacturing environment was designed considering security issues, data protection regulations, and the processing capacity of each system. This is the stage in which the Digital Thread is conceived, preparing the digital architecture to filter individual data to create a Digital Twin. Fig. 3 presents the digital architecture developed to connect the different data sources. To establish this infrastructure, it was necessary to understand each system's export file formats and the best way to access the information continuously. For the manufacturer data environment, the connection was made to a SQL database through server connections. For the real-time data environment, to deal with customers' privacy, the connection was made through a REST API on a NoSQL database. With this architecture, it was possible to configure a basis for the creation of the Digital Twin.
After establishing the integration environment, in phase three, different processing algorithms were developed to capture individual data and use this data for simulation purposes and decision making. These algorithms were developed in Python and SQL to provide new inputs for the system, connecting the different lifecycle phases. The design space was used as an artifact to investigate new ways to use data over the lifecycle.
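The individual-data capture performed by these Python/SQL algorithms can be illustrated with a parameterized query. In this sketch an in-memory SQLite database stands in for the MES SQL Server, and the table and column names are illustrative, not the actual MES schema.

```python
import sqlite3

# In-memory SQLite as a stand-in for the MES SQL Server database;
# schema and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE production (serial TEXT, batch TEXT, line TEXT)")
conn.execute("INSERT INTO production VALUES ('SKB-0042', 'B-117', 'L1')")

def batch_for(conn: sqlite3.Connection, serial: str):
    """Parameterized query: capture only the record for one serial number,
    the individualization step that turns lifecycle data into a twin view."""
    row = conn.execute(
        "SELECT batch FROM production WHERE serial = ?", (serial,)).fetchone()
    return row[0] if row else None

batch = batch_for(conn, "SKB-0042")
```

The parameterized form (`?` placeholder) is the idiomatic way to query by serial number without string concatenation, which also avoids SQL injection.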

Closed-loop Digital Twin Application
Considering the proposed digital architecture, it was possible to prepare the environment for presenting the products' individual information - a Digital Twin of the physical products. A DT should assure real-time data collection, data integration between different platforms and systems, and product data fidelity [48].
Real-time data collection is allowed by a well-established communication process based on IoT technology [11,32]. A DT should explore different synchronization options with the event points of the product, especially in the case of connectivity instability on the customer side. Moreover, it should be able to import background data, component information, and con guration information from the physical components [37][38][39].
Data integration is established by the creation of a hybrid platform that merges various data formats.
The DT integration may access two distinct data environments. The first is company-internal manufacturer data (e.g., PLM, ERP, and MES). In this environment, design and production data are stored in structured databases that can be accessed when required by traditional communication protocols. The second environment is real-time customer usage data, captured through IoT technologies. The connectivity reliability to the database and the variety of data are essential for generating the big data to be processed [4]. Therefore, the DT may contribute to creating a multi-modal data acquisition process that uses data analytics to predict future outcomes of the studied object [40].
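The merging of the two environments into one twin view can be sketched as below. The source records are in-memory stand-ins for the PLM/MES databases and the IoT store, keyed by serial number; the field names are assumptions for illustration.

```python
from typing import Any, Dict

def build_twin_view(serial: str,
                    plm: Dict[str, Dict[str, Any]],
                    mes: Dict[str, Dict[str, Any]],
                    usage: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Combine design (PLM), production (MES), and IoT usage records for one
    serial number into a single twin view; a missing source stays empty
    rather than breaking the view."""
    return {
        "serial": serial,
        "design": plm.get(serial, {}),
        "production": mes.get(serial, {}),
        "usage": usage.get(serial, {}),
    }

# Minimal in-memory stand-ins for the three data sources.
plm = {"SKB-0042": {"wheel_type": "78A-54mm"}}
mes = {"SKB-0042": {"batch": "B-117"}}
usage = {"SKB-0042": {"distance_km": 182.4}}

view = build_twin_view("SKB-0042", plm, mes, usage)
```

Keeping each source under its own key preserves the original data structure of every environment, consistent with the heterogeneous operation described later in the paper.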
Data fidelity is achieved by assuring that the information processed by the DT is the same information used over its lifecycle. The DT may access different existing IT systems to create a unique data interface [23,30,49]. The DT key features of real-time connectivity, data integration, and data fidelity are considered in the Closed-loop Digital Twin conceptual model presented in Fig. 4.
The Closed-loop Digital Twin model may connect information from both the manufacturer environment and the customer environment, process this information, and provide cross-validated insights to lifecycle phases and stakeholders (Fig. 4).
The Closed-loop Digital Twin implementation involved systems integration and the development of data processing algorithms. Fig. 5 presents the Closed-loop Digital Twin deployment at the learning factory.
The PLM, ERP, and MES software were installed on the cloud server of the learning factory. A remote connection to the databases was established through a TCP/IP connection on port 1443. The ERP and MES systems were already connected through XML in an integration developed by the systems' providers, so access to both systems' data was established through a TCP/IP connection at the MES.
On the customers' side, an application was developed to collect customers' usage data and store it on a private partition of the users' cloud space, accessed through an API in compliance with information protection requirements. Users retain control of their information and can revoke the API access at any time. Besides, while the manufacturer environment database is structured, the customer environment uses a non-structured database to improve connection times and decrease storage space.
The DT was built on a Python/SQL platform that accesses the systems' databases through ODBC and API connections. It was deployed on a 1vCPU, 1.7GB RAM virtual machine. The DT captures only the relevant information needed for the analysis being performed and then runs a statistical analysis to predict incoming events for the manufacturer and customer - e.g., predictive maintenance, demand for spare parts to be replaced, or potential product failure. The analysis results are stored in an application log and communicated to the relevant product stakeholders.
This application was developed with an object-oriented view. Every system was modeled as a different object in the integration module. The design permits the addition of new modules without changing the DT's main structure: every new function is added as an independent element linked only by the serial number. Thus, the model has a small memory and storage footprint and can be scaled without much use of computational resources.
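A minimal sketch of this object-oriented, module-per-system structure is shown below, under the assumption that each connector exposes a common query interface keyed by serial number; class names, schemas, and values are illustrative.

```python
class SystemConnector:
    """Common interface: each lifecycle system is wrapped as one object."""
    def query(self, serial: str) -> dict:
        raise NotImplementedError

class PLMConnector(SystemConnector):
    def __init__(self, db: dict):
        self._db = db  # stand-in for the real PLM database connection
    def query(self, serial: str) -> dict:
        return self._db.get(serial, {})

class MESConnector(SystemConnector):
    def __init__(self, db: dict):
        self._db = db  # stand-in for the real MES database connection
    def query(self, serial: str) -> dict:
        return self._db.get(serial, {})

class ClosedLoopTwin:
    """New system modules register by name; the core class never changes."""
    def __init__(self):
        self._modules: dict[str, SystemConnector] = {}
    def register(self, name: str, connector: SystemConnector) -> None:
        self._modules[name] = connector
    def snapshot(self, serial: str) -> dict:
        # Independent modules, linked only by the product's serial number.
        return {name: c.query(serial) for name, c in self._modules.items()}

twin = ClosedLoopTwin()
twin.register("plm", PLMConnector({"SKB-0042": {"wheel_type": "78A-54mm"}}))
twin.register("mes", MESConnector({"SKB-0042": {"batch": "B-117"}}))
snap = twin.snapshot("SKB-0042")
```

Adding an ERP or IoT module means writing one more connector class and calling `register`, which is the extensibility property the paragraph describes.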
An algorithm to recommend wheel replacement was implemented as an example of the closed-loop use of information for decision making. This algorithm captures, from the customers' database, the product initiation date, the total traveled distance, and the average speed. Then, this information is analyzed considering the actual wheels' maintenance parameters. The wheels' characteristics are obtained from the product structures of the PLM and ERP systems. The exact manufacturing batch parameters are captured from the MES system. The algorithm combines this data and gives the customers and manufacturer feedback about the need to order new wheels.
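The wheel-replacement logic can be sketched as follows. The wear model and every threshold value here are illustrative assumptions, not the actual maintenance or batch parameters held in the PLM/MES systems.

```python
from datetime import date

def recommend_wheel_replacement(activation: date, today: date,
                                distance_km: float, avg_speed_kmh: float,
                                rated_life_km: float,
                                max_service_days: int) -> bool:
    """Recommend new wheels when the speed-adjusted travel distance reaches
    the batch's rated life, or the maximum service time is exceeded.

    The 2%-per-km/h-above-15 wear penalty is a hypothetical model standing
    in for the real maintenance parameters.
    """
    effective_km = distance_km * (1.0 + 0.02 * max(0.0, avg_speed_kmh - 15.0))
    days_in_service = (today - activation).days
    return effective_km >= rated_life_km or days_in_service >= max_service_days

# Illustrative inputs: activation date and usage from the customer store,
# rated life and service limit from hypothetical PLM/MES batch parameters.
needs_new = recommend_wheel_replacement(
    activation=date(2021, 1, 10), today=date(2021, 6, 1),
    distance_km=182.4, avg_speed_kmh=11.2,
    rated_life_km=150.0, max_service_days=365)
```

The same boolean result would feed both feedback paths described above: a reorder suggestion for the customer and a spare-part demand signal for the manufacturer.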
In the Closed-loop Digital Twin, the use of the existing systems to store the data ensures data fidelity and data governance. The fast connection times (0.0075s to the ERP, 0.0089s to the MES, 0.051s to the PLM) allow real-time data connection. The DT structure based on distinct data environments is necessary to prevent data redundancy. The integration between environments is possible because the proposed DT model is conceived for heterogeneous operation and preserves the original data structure. Hence, the application connects data from different stages of the product lifecycle, processes the data, and provides new information, closing the product lifecycle information loop.

Discussion
The proposed digital architecture and Digital Twin model are connected with the current literature on the topic. The reviewed literature states that the capacity to combine data from different stages of the product lifecycle is an enabler of the Digital Thread [46,47]. Indeed, the digital architecture was developed considering each system's capacity to store and transfer data under request.
The proposed architecture is structured regarding the meaning of each piece of information [13]. The manufacturer data environment combines the information that is typically already registered at the factory level - design, manufacturing, and customer customization - and makes it available in an integration environment. The real-time data environment captures customers' information and prepares the data to be connected with the factory data when requested. It contains the required information to be converted into a Digital Twin of a specific product (serial number) [45].
The proposed Digital Twin model transforms the digital architecture into a digital representation of a unique active product [7]. It captures the individual information of the customers' use together with the product's specific manufacturing, customization, and design information to create a particular viewpoint of the product. The data is deduced from the physical data and is defined as virtual data [10].
The proposed model captures past customer usage data as well as the product's current online state.
Therefore, it is more than a representation of the real world. It becomes a digital replica of the physical counterpart, enabling simulations to forecast different states of the physical model and act in a predictive manner [21,40].
Finally, the proposed model uses data volume, velocity, and variety to perform a new way of manufacturing. It uses the captured customer data to provide new inputs for the manufacturing environment, closing the information loop. The Closed-loop Digital Twin is responsible for more than just data integration and virtualization: it generates a complete representation of the product lifecycle. It captures the initial as-designed and as-ordered information from the manufacturing systems. By creating a digital counterpart of the physical product, it merges the as-manufactured and as-used data into a unique data model capable of processing this information and generating useful insights for every member of the manufacturing chain.

Conclusions
The digital architecture developed in this paper considers the main Digital Thread attributes. It connects different systems and information of the product lifecycle exploiting the advantages and various uses of the systems. Therefore, it was necessary to analyze each system database, searching for secured access points and protocols. Based on the architecture, the next step was to individualize the information to create the Digital Twin model.
The proposed Closed-loop Digital Twin model is based on the critical features of real-time data collection, data integration between different platforms and systems, and data fidelity. The proposed concept was implemented and assessed. The tests conducted indicated that the Closed-loop Digital Twin can connect information from various stages and systems of a product's lifecycle. It can also combine lifecycle data to perform relevant analyses based on predefined algorithms, transforming data into contextually rich information for relevant stakeholders over the product lifecycle.
The proposed Closed-loop Digital Twin model considers a lifecycle perspective beyond the beginning of life, and a heterogeneous IT platform, addressing some of the research and practice challenges in the field. Therefore, the Closed-loop Digital Twin model provides a conceptual reference for further DT developments.
This research has limitations that open the opportunity for further studies in the field. The proposed model was tested in a learning factory. Although the considered learning factory IT environment features industry-standard enterprise software on a cloud server, new technical challenges may arise in an industrial implementation. Besides, cybersecurity tests - to study the system's vulnerability - were not performed at this stage and are the object of future studies.