A Digital Twin-driven Human-robot Collaborative Assembly-commissioning Method for Complex Products

Abstract: In the assembly-commissioning of complex products, manual operation is the main reason for low efficiency. Human-robot cooperation (HRC) technology combines the advantages of humans and robots and enables them to complete tasks in a shared space. Introducing HRC technology into the assembly-commissioning of complex products is an effective way to solve this problem. However, current HRC technology has insufficient perception and cognition of tasks. Therefore, this paper presents a digital twin-driven HRC assembly-commissioning framework in which a virtual-real mapping environment for HRC is constructed. To improve the robot unit's cognition of tasks, this paper proposes an intention recognition method that integrates part features into human joint sequences. To improve the robot unit's adaptability to tasks, an assembly-commissioning task knowledge graph is constructed to quickly extract the implement sequence of the robot unit. In addition, the deep deterministic policy gradient (DDPG) algorithm is used to adaptively adjust the robot unit's implement action during assembly-commissioning. Finally, the effectiveness of the proposed method is verified with a case study of a particular type of automobile generator.


Introduction
As the final guarantee of precision control in product manufacturing, assembly usually accounts for 45% of the average workload in actual production [1]. The emergence of robot technology has dramatically improved the efficiency and quality of product assembly. Traditional industrial robots have been widely used in machining, assembly and spraying because of their high efficiency, precision and reliability [2][3]. However, industrial robots cannot wholly replace humans in industrial production. Even in the automobile manufacturing industry, which has the highest automation rate, about 20% of assembly tasks have to be done manually. Traditional industrial robot technology has the following limitations: 1) the robot mechanically repeats one or several actions according to programmed instructions and lacks cognition of the environment; 2) because of its single structure, the robot struggles to complete assembly tasks alone in fine operations or complex environments; 3) in flexible assembly, the assembly process of the product changes greatly, and every time the task changes, the robot program needs to be rewritten to meet the needs of the current task.
Although humans are limited by fatigue and low repetition accuracy, they have a decision-making ability that robots cannot match. In a complex environment or an unexpected situation, people can make decisions and adjust according to the actual situation. With the development of Industry 4.0, the product assembly workshop is gradually becoming digital, and HRC technology in assembly workshops has gradually become a reality. An HRC assembly system is composed of humans and intelligent robots. In a given task and space, humans and robots work together to complete complex tasks. HRC technology changes the relationship between the human and the robot from the traditional "controlled" to "partners". In theoretical research on HRC technology, the joint perception and cognition between humans and robots give the assembly process high flexibility and high precision. However, in existing HRC assembly systems, the robot's cognition of the environment is weak, so it cannot quickly update the assembly strategy when the assembly environment changes.
In recent years, digital twin technology has been thriving in various fields. The digital twin concept was proposed by Grieves and consists of three essential parts: physical products in real space, digital products in virtual space, and the real-virtual connection [4][5]. Introducing digital twin technology into HRC systems is an effective way to improve the robot's cognition of the environment. By using sensor technology and Internet connection, the digital twin model of HRC is applied to the actual operation stage, and intelligent real-time control of the system is realized. Therefore, this paper proposes a digital twin-driven HRC assembly-commissioning method for the operation process of complex products. Based on the digital twin model of the HRC system, synchronous mapping between the virtual and the actual assembly-commissioning process is realized by collecting environment data in real time. Human operation intention is recognized by fusing part features into the human joint sequence.
To improve the adaptability of the robot to tasks and the environment, this paper uses the DDPG algorithm to adaptively adjust robot actions based on an understanding of the human operation intention. The rest of this paper is organized as follows. Section 2 introduces the related work. Section 3 constructs a digital twin-driven HRC assembly-commissioning framework and introduces the digital representation method, the recognition method of human operation intention and the adaptive adjustment method of robot strategy. Section 4 verifies the application with the assembly-commissioning of a particular type of automobile generator. Finally, conclusions and future work are discussed.

Related work

Digital twin assembly technology
With the rapid development of computer technology, graphic display technology and artificial intelligence technology, digital twin technology has been widely used in various fields. In the industrial field, it is applied throughout the product life cycle, including design [6], machining [7], assembly [8] and residual life monitoring [9]. In the product assembly stage, Sierla et al. [10] obtained a digital twin from the digital product description and then realized assembly planning and resource allocation automatically. Sun et al. [11] studied the assembly and commissioning of high-precision products with the digital twin; based on the total factor model of digital twin assembly, assemblability prediction and assembly process optimization were realized.
Guo et al. [12] proposed a digital twin-enabled intelligent manufacturing system (DT-GIMS) to solve the assembly work of fixed islands.
Although there is no unified standard definition of the digital twin, its essence is that a digital twin is a digital model of a physical object that evolves in real time by receiving data from the physical object, so that it remains consistent with the object throughout the whole life cycle. Based on the digital twin, analysis, prediction, diagnosis, training and so on can be carried out, and the simulation results are fed back to the physical object to help optimize it and support decisions about it.

HRC assembly technology
At present, robots lack the ability to handle highly complex and precise assembly tasks independently. Combining human flexibility with robot precision is an effective way to improve assembly ability. In recent years, research on HRC has made significant progress. Takata et al. [13] proposed a planning method for HRC in a hybrid assembly system, which allows operators to choose the initial human and robot configuration and minimizes the expected total production cost, including robot investment and labor cost. Michalos et al. [14] discussed the design of an HRC assembly workstation. According to the assembly process specification, different control, safety and human support strategies were implemented to ensure personal safety and the productivity of the whole system. Liu et al. [15] modeled the product assembly task as a human motion sequence and then controlled the robot to assist human work by predicting human motion. To make the robot aware of the tasks being performed by humans, Berg et al. [16] proposed an action recognition method for HRC assembly tasks based on the hidden Markov model. Nemec et al. [17] introduced a dynamic learning-from-demonstration interface for compliant cooperative tasks in HRC assembly, with compliant adaptation along the motion trajectory.
In HRC manufacturing, industrial robots work seamlessly together with humans who perform assigned tasks. Compared with traditional manufacturing systems, an HRC manufacturing system offers more customization and flexibility. In the field of assembly, a practical HRC assembly system should predict human intention and assist humans in the assembly process.

Digital twin-based robot assembly technology
Introducing digital twin or CPS technology into the robot assembly system is an effective way to improve the intelligence level of the robot. Yao et al. [18] introduced HRC assembly in CPPs, which improved the planning, monitoring and control in HRC assembly. Darvish et al. [19] proposed a flexible HRC assembly architecture that integrates perception, representation, planning and control. After human behavior is recognized, online reasoning is performed to complete the HRC assembly. Digital twin technology provides a visual control function for human-computer interaction through integrated analysis and real-time data collection in virtual space. Droder et al. [20] studied the role of machine learning in controlling robot behavior in digital twins, where the robot can automatically avoid obstacles through the machine learning method. Oyekan et al. [21] established a digital twin workshop to analyze human responses to predictable and unpredictable robot motion. Driven by digital twins, the interaction between the human and the robot becomes smooth and frequent. Bilberg et al. [22] established a corresponding DT for flexible assembly units, in which the use of the simulation model is extended to real-time control, task allocation, task sequencing and program development.
Traditional HRC has a long cycle in scheme demonstration, layout planning, motion control, test and verification for a given set of operation requirements. When digital twin technology is introduced into HRC, a virtual HRC environment corresponding to the physical HRC environment is established to realize two-way virtual-real interaction. Digital twin HRC shortens the design cycle, realizes closed-loop control of action and feedback, and improves production efficiency.

Digital twin-driven HRC assembly-commissioning system
Digital twin technology is introduced into HRC to integrate the recognition, control and optimization of the HRC assembly-commissioning process. As shown in Fig. 1, this paper constructs a digital twin-driven HRC assembly-commissioning framework. According to the framework structure, this paper mainly studies three aspects: the digital expression of the HRC assembly-commissioning environment, human intention recognition and the adaptive adjustment of robot unit strategy.
(1) Digital expression of HRC assembly-commissioning environment. In HRC assembly-commissioning operation, the human, robot unit, parts and auxiliary tools are integrated into the physical environment. Various sensors (such as cameras and 3D laser scanners) and robot controllers are used to acquire relevant data, such as robot operation data, part location information and assembly size information. The virtual environment is composed of digital twin models corresponding to the physical elements of the HRC system. In the virtual environment, the digital twin model can be used to simulate the assembly-commissioning action, visualize the data and map the virtual and real actions synchronously. The OPC-UA communication protocol can be used to realize two-way data interaction between the physical environment and the virtual environment.
(2) Human intention recognition. In HRC assembly-commissioning, the assembly-commissioning task must be identified quickly. Capturing human behavior is an effective way to judge the assembly-commissioning intention. The assembly-commissioning task is modeled as a sequence of human actions, and the assembly-commissioning intention can be predicted by identifying those actions. The robot can then assist humans according to their intention, support them in time and adapt to the rhythm of human work. During assembly-commissioning, human intention recognition not only ensures the smooth operation of HRC but also improves production efficiency.
(3) Adaptive adjustment of robot unit strategy. Based on the recognized human intention, the robot unit extracts a feasible action sequence strategy according to the task situation. The robot unit is divided into the robot and the end-effector. To complete the task, the assembly-commissioning process must control both the robot motion path and the end-effector pose. To perform the task quickly and accurately, the robot unit needs to adjust adaptively to changes in the task and the environment.

Digital representation of the HRC environment
In the actual assembly process of complex products, the assembly environment is diverse, dynamic and complex. Through the digital representation of physical elements, monitoring, prediction and optimization based on a data-driven model can be realized. As shown in Fig. 2, digital twin technology is essentially a digital representation of physical entities. The digital twin model can faithfully reflect the various attributes of physical entities. As suggested by Tao [23], the DT is built in four layers, i.e., geometry (creation of 3D CAD objects), physics (kinematics of robots and humans), behavior (placement of CAD objects in the scene), and rule (assembly process sequence). The geometric models are built with 3D modeling software (UG, CATIA, SolidWorks, etc.). For parts, the geometric models must be accurate enough to reflect the assembly dimension error. To reduce the computational burden, the digital model of complex equipment should be as lightweight as possible while retaining its essential functions. The static and dynamic information is saved in an XML file. The static information mainly includes the product assembly topology, the assembly process, physical attribute information and reference parameters related to the robot unit. The dynamic data mainly include the spatial pose data of parts, the operation data of the robot unit and the behavior data of humans. During assembly-commissioning, the real-time operation data of the robot (operation path, speed, acceleration, etc.), the state data of the end-effector, human behavior data and the spatial position data of parts are collected by the binocular camera, robot controller and 3D laser scanner. The collected data drive the virtual space model to keep it consistent with the physical space elements, which realizes virtual-real synchronous mapping.
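The split between static and dynamic information in the XML file can be sketched as follows. This is a minimal illustration only: the element names (`twin`, `static`, `dynamic`, `mate`, `robot_joints`), the helper functions and all numeric values are hypothetical, since the source does not specify the XML schema.

```python
import xml.etree.ElementTree as ET


def format_values(values):
    """Serialize a numeric sequence as space-separated text."""
    return " ".join(f"{v:.3f}" for v in values)


def build_twin_record(part_id, mass_kg, mates_with, pose, joint_angles_deg):
    """Build an XML record with static and dynamic data (hypothetical schema)."""
    root = ET.Element("twin", {"part": part_id})

    # Static information: physical attributes and assembly topology.
    static = ET.SubElement(root, "static")
    ET.SubElement(static, "mass", {"unit": "kg"}).text = str(mass_kg)
    topology = ET.SubElement(static, "topology")
    for other in mates_with:
        ET.SubElement(topology, "mate", {"with": other})

    # Dynamic information: part pose and robot joint angles, refreshed at runtime.
    dynamic = ET.SubElement(root, "dynamic")
    ET.SubElement(dynamic, "pose").text = format_values(pose)
    ET.SubElement(dynamic, "robot_joints", {"unit": "deg"}).text = format_values(joint_angles_deg)
    return root


def update_pose(root, pose):
    """Overwrite the dynamic pose with a newly measured value."""
    root.find("dynamic/pose").text = format_values(pose)


# Illustrative values only (not taken from the case study).
record = build_twin_record("rotor", 1.2, ["stator", "bearing"], [0.0] * 6, [0.0] * 6)
update_pose(record, [0.10, 0.25, 0.05, 0.0, 0.0, 1.571])
xml_text = ET.tostring(record, encoding="unicode")
```

Only the `dynamic` subtree changes while the system runs, so a sensor update touches a small part of the record and leaves the static topology untouched.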

Human intention recognition
Enabling the robot to recognize human intention is one of the key technologies of HRC. Through human intention recognition, the robot unit can perform the corresponding feedback action without additional input operations (such as keyboard and mouse input), while ensuring the real-time performance and accuracy of the system. In this paper, an intention recognition method based on the human skeleton is adopted. Human operation behaviors in an assembly task are highly similar to each other. To solve the problem of low recognition accuracy for similar assembly actions, we analyzed the characteristics of human assembly behavior. Generally, human behavior is fixed and process-oriented, and there is a sequential relationship between the various behaviors. Moreover, most human behavior consists of operating on the workpiece, so the characteristics of parts play an important role in distinguishing human behaviors. Therefore, this paper proposes an intention recognition method that integrates part features into the human joint sequence, as shown in Fig. 3.
Fig. 3 The framework of assembly-commissioning task recognition

Human behavior recognition
This paper uses a lightweight human joint estimation model to extract human joint information and uses the 1€ Filter [24] to smooth it. The human skeleton feature sequence is obtained by applying OpenPose [25][26] to the input video image sequence, as shown in Formula (1):

$$P = \{p_1, p_2, \dots, p_t, \dots\} \tag{1}$$

where $p_t$ is the set of human key points at input time $t$, which can be expressed as Formula (2):

$$p_t = \{(x_i, y_i) \mid i = 1, 2, \dots, N\} \tag{2}$$

where $(x_i, y_i)$ is the position of the $i$-th key point in image coordinates and $N$ is the number of key points. In this paper, 14 upper-body nodes are used in the body skeleton, and one root node and 20 finger nodes are used in the hand skeleton.
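The smoothing step can be illustrated with a compact version of the 1€ filter [24] applied to a single key point coordinate. This is a sketch rather than the authors' implementation: the parameter values (`min_cutoff`, `beta`, `d_cutoff`) and the 30 Hz frame rate are assumptions, and in practice one filter instance would be kept per coordinate of every key point.

```python
import math


def smoothing_factor(dt, cutoff_hz):
    """Smoothing factor alpha of a first-order low-pass filter."""
    r = 2.0 * math.pi * cutoff_hz * dt
    return r / (r + 1.0)


class OneEuroFilter:
    """1-euro filter for one scalar signal, e.g. one key point coordinate."""

    def __init__(self, t0, x0, min_cutoff=1.0, beta=0.0, d_cutoff=1.0):
        self.min_cutoff = min_cutoff  # lower bound of the cutoff frequency (Hz)
        self.beta = beta              # how strongly the cutoff grows with speed
        self.d_cutoff = d_cutoff      # cutoff used to smooth the derivative (Hz)
        self.t_prev, self.x_prev, self.dx_prev = t0, x0, 0.0

    def __call__(self, t, x):
        dt = t - self.t_prev
        # Estimate and smooth the signal speed.
        a_d = smoothing_factor(dt, self.d_cutoff)
        dx = (x - self.x_prev) / dt
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        # Fast motion raises the cutoff (less lag); slow motion lowers it (less jitter).
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = smoothing_factor(dt, cutoff)
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.t_prev, self.x_prev, self.dx_prev = t, x_hat, dx_hat
        return x_hat


# Jittery x coordinate of one key point around 5.0, sampled at an assumed 30 Hz.
raw = [5.0 + (1.0 if k % 2 else -1.0) for k in range(1, 31)]
smoother = OneEuroFilter(t0=0.0, x0=5.0)
smoothed = [smoother(k / 30.0, x) for k, x in enumerate(raw, start=1)]
```

With `beta = 0` the filter reduces to a fixed low-pass filter; a positive `beta` trades some jitter suppression for lower lag during fast hand movements.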
At the same time, a pre-trained convolutional neural network is used to extract features from the original image. Human assembly behavior has an obvious temporal order. The features of a new frame in the video stream are extracted through the self-attention model. By querying the recorded frames in memory, the order of the historical skeleton information is obtained directly, and the skeleton information of each frame is used as the input of one time step. The feature extraction layer applies a fully connected layer to the input skeleton information and adds position coding information, and the output features of each behavior are then obtained through two self-attention layers. The output features are spliced and combined with an output matrix to obtain the final temporal features:

$$F_T = \mathrm{Concat}(h_1, h_2, \dots, h_n)\, W^O$$

where $W^O$ is the output matrix and $h_i$ is the output of the $i$-th input after passing through the self-attention layer. From the data flow and the principle of self-attention, it can be seen that during temporal feature extraction the skeleton features of each frame are associated with the skeleton features of the other frames and the degree of influence is calculated; that is, the recognition of the behavior in each frame takes into account the characteristics of all the frames before and after it.
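The temporal feature extraction can be sketched in plain Python as follows. Several simplifications are assumptions made for brevity and are not taken from the source: sinusoidal position codes are used because the position coding scheme is not named, the query, key and value projections and the output matrix are replaced by the identity, and a single attention head is used.

```python
import math


def positional_encoding(num_frames, dim):
    """Sinusoidal position codes (an assumed scheme)."""
    codes = []
    for pos in range(num_frames):
        row = []
        for k in range(dim):
            angle = pos / (10000 ** (2 * (k // 2) / dim))
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        codes.append(row)
    return codes


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(frames[0])
    outputs, weights = [], []
    for query in frames:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim) for key in frames]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * frames[j][d] for j in range(len(frames))) for d in range(dim)])
    return outputs, weights


def temporal_features(skeleton_frames):
    """Add position codes, apply two attention layers and splice the outputs."""
    dim = len(skeleton_frames[0])
    codes = positional_encoding(len(skeleton_frames), dim)
    x = [[v + c for v, c in zip(frame, code)] for frame, code in zip(skeleton_frames, codes)]
    h1, _ = self_attention(x)
    h2, weights = self_attention(h1)
    spliced = [v for row in h2 for v in row]  # concatenation of per-frame outputs
    return spliced, weights


# Three toy frames with four skeleton features each (illustrative values).
frames = [[0.1, 0.2, 0.3, 0.4], [0.2, 0.1, 0.4, 0.3], [0.5, 0.5, 0.1, 0.0]]
features, weights = temporal_features(frames)
```

Each row of `weights` is strictly positive and sums to one, which is the sense in which every frame is influenced by all other frames in the sequence.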

Part feature recognition
A fully convolutional network is used to recognize the assembly features. Image segmentation performs pixel-by-pixel classification. The output feature map has a fixed length, width and number of channels, and a feature vector of fixed length can be generated by convolution and a fully connected layer. The feature length is determined by setting the number of neurons. During feature extraction, the downsampling parameters of the pre-trained model are frozen to preserve its original feature extraction ability. The convolutional neural network replaces hand-designed features such as HOG and SIFT, which avoids the tedious manual feature design process and exploits the black-box nature of the neural network to obtain more valuable features. After the downsampling result (the feature map of the image) is obtained, a new fully connected layer is added for further feature extraction and for generating a one-dimensional vector. This one-dimensional vector is integrated into the behavior recognition model as the part feature.

Multi-feature fusion recognition
For each new frame in the video stream, three basic features are extracted: 1) the pose feature of the current human skeleton key points, which describes the pose of the human limbs; 2) the workpiece feature in the frame, which describes the position and category of parts, tools and other objects in the assembly scene; 3) the body time sequence feature, which describes the changes of the human limbs over a period of time.
A simple way to fuse the three features is to splice them directly into a high-dimensional vector and then extract and classify the features through several fully connected layers. However, this method does not consider the interdependencies among the features. For example, human body behavior consists of operating on parts, and the historical limb movement carries guidance information for the current limb posture.
Therefore, this paper also uses a self-attention mechanism to capture the dependence among the features. Because the order of the features does not affect the results, no position coding information is needed to distinguish the order. The self-attention model consists of two attention layers and three input steps, one for each of the three features.
The outputs of the attention layers are spliced and passed through two fully connected layers, which reduce the dimension to the number of assembly behavior categories. Finally, the probability of each behavior class is calculated through a softmax.
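The fusion step can be sketched as follows, with the three features treated as three input tokens of a self-attention model without position codes. The sketch is illustrative only: projections are replaced by the identity, a single linear layer stands in for the two fully connected layers, and the feature values and classifier weights are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(tokens):
    """Scaled dot-product self-attention without position codes."""
    dim = len(tokens[0])
    out = []
    for query in tokens:
        w = softmax([sum(a * b for a, b in zip(query, key)) / math.sqrt(dim) for key in tokens])
        out.append([sum(w[j] * tokens[j][d] for j in range(len(tokens))) for d in range(dim)])
    return out


def classify(tokens, weight_rows):
    """Two attention layers, splice, one linear layer, softmax over categories."""
    fused = [v for row in attend(attend(tokens)) for v in row]
    logits = [sum(w * v for w, v in zip(row, fused)) for row in weight_rows]
    return softmax(logits)


# Hypothetical 3-dimensional pose, part and temporal features.
pose, part, motion = [0.2, 0.1, 0.4], [0.9, 0.0, 0.3], [0.1, 0.7, 0.2]
# Hypothetical weights for two behavior categories (9 = 3 tokens x 3 dimensions).
weight_rows = [[0.1] * 9, [-0.1] * 9]
probabilities = classify([pose, part, motion], weight_rows)

# Without position codes, permuting the inputs only permutes the outputs.
original = attend([pose, part, motion])
swapped = attend([part, pose, motion])
```

In this sketch `swapped[0]` equals `original[1]` up to rounding, which illustrates why no position coding is needed to distinguish the three features.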

Adaptive adjustment of robot unit strategy
According to the human's operation intention, the robot unit is required to assist in the assembly-commissioning task. This section describes the implement sequence and the action method of the robot unit.

Robot and End-effector implement sequence
In the assembly-commissioning of complex products, the assembly-commissioning sequence is dynamic and varies with the actual situation. This paper constructs a knowledge graph of assembly-commissioning tasks [27][28], which is used for fast retrieval of the implement sequence of the robot unit, as shown in Fig. 4. The task knowledge graph includes two parts: the pattern layer and the data layer. The data layer consists of node-attribute-value and node-connect-node triples composed of entity objects (parts) and relationships. A large number of such triples forms a semantic network graph. The pattern layer is the core of knowledge graph modeling for assembly-commissioning tasks. To describe the semantic information of complex assembly-commissioning tasks clearly, the pattern layer organizes the information according to the assembly process mode. Following the information organization of the knowledge graph, the task document is defined by class-relationship-class and class-attribute-value triples. To reduce the complexity of the task, the part is taken as the first-level node, and the robot unit and the human are taken as second-level nodes. During assembly-commissioning, the human operation intention (human behavior and parts) is used as the search label. The implement sequence of the robot unit can be extracted quickly by searching for the corresponding task.
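The retrieval of an implement sequence from the data layer can be sketched with a small in-memory triple store. The relation names (`has_task`, `human_behavior`, `first_step`, `next`) and the task and step names are hypothetical and serve only to illustrate the lookup by the search label (human behavior and part); the source does not list the actual schema.

```python
from collections import defaultdict


class TaskKnowledgeGraph:
    """Data layer stored as (head, relation, tail) triples."""

    def __init__(self):
        self._tails = defaultdict(list)

    def add(self, head, relation, tail):
        """Store one triple, keeping insertion order per (head, relation)."""
        self._tails[(head, relation)].append(tail)

    def tails(self, head, relation):
        """Return all tails linked to head by relation."""
        return list(self._tails.get((head, relation), []))


def retrieve_sequence(graph, part, behavior):
    """Return the robot unit steps for the task matching (part, human behavior)."""
    for task in graph.tails(part, "has_task"):
        if behavior in graph.tails(task, "human_behavior"):
            steps, seen = [], set()
            current = graph.tails(task, "first_step")
            # Follow the chain of "next" relations; the seen set guards against cycles.
            while current and current[0] not in seen:
                steps.append(current[0])
                seen.add(current[0])
                current = graph.tails(current[0], "next")
            return steps
    return []


kg = TaskKnowledgeGraph()
kg.add("rotor", "has_task", "rotor_alignment")            # part is the first-level node
kg.add("rotor_alignment", "human_behavior", "pick_up")    # part of the search label
kg.add("rotor_alignment", "first_step", "move_above_stator")
kg.add("move_above_stator", "next", "align_end_effector")
kg.add("align_end_effector", "next", "hold_part")

sequence = retrieve_sequence(kg, part="rotor", behavior="pick_up")
```

Indexing the triples by `(head, relation)` keeps each lookup close to constant time, which matches the goal of fast retrieval once the human intention has been recognized.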

Robot and End-effector implement action
After the implement sequence of the robot unit is acquired from the assembly-commissioning task knowledge graph, the robot unit must execute actions according to the actual environment. The movement of the robot unit involves two key requirements: obstacle avoidance and the shortest path. To give the robot unit a rapid response to obstacles, a good mapping between the sensor input data and the control output must be established. However, this mapping is complex and nonlinear. Reinforcement learning is an effective way to realize the adaptive adjustment of robot unit actions.

Action function
The DDPG algorithm uses a deep neural network for function approximation and can handle continuous action spaces [29][30]. As shown in Fig. 5, we have established two DDPG models, one for the robot path and one for the end-effector pose: the robot DDPG and the end-effector DDPG. In the first stage, the robot DDPG is used to plan the path from the initial position to the target position. After the robot reaches the target position, the second stage is executed, in which the pose of the end-effector is adjusted through the end-effector DDPG. The robot DDPG and the end-effector DDPG are each composed of two networks. The value function network is also called the critic network. The input of the critic network is the action and the observation, and the output is the value of the state-action pair. The strategy function network is also called the actor network. The input of the actor network is the observation, and the output is the action. We use $\theta^{Q}$ and $\theta^{\mu}$ to parameterize the function approximators. The critic network is updated as in Formula (5), and the actor network is updated as in Formula (6).
According to the data flow, the actor network selects the action $a_t$ according to the behavior strategy and sends it to the robot controller for execution. The HRC environment returns the action reward $r_t$ and the new state $s_{t+1}$. The actor stores the transition $(s_t, a_t, r_t, s_{t+1})$ in the replay buffer as the data set for the prediction networks. $R$ transitions are randomly sampled from the replay buffer as a minibatch of training data for the actor and critic prediction networks. A single transition in the minibatch is denoted by $(s_i, a_i, r_i, s_{i+1})$. In the DDPG algorithm, the optimizer updates the critic prediction network by minimizing the loss, as shown in Formula (7):

$$L = \frac{1}{R} \sum_{i} \left( y_i - Q(s_i, a_i \mid \theta^{Q}) \right)^2 \tag{7}$$

where $y_i$ can be regarded as the label, $y_i = r_i + \gamma Q'(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$, and $\gamma$ is the discount factor.
The transitions stored in the replay buffer are generated by the behavior strategy of the agent.
We optimize the actor network by maximizing the policy objective function $J$, as shown in Formula (8):

$$\nabla_{\theta^{\mu}} J \approx \frac{1}{R} \sum_{i} \nabla_{a} Q(s, a \mid \theta^{Q}) \big|_{s = s_i,\, a = \mu(s_i)} \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \big|_{s = s_i} \tag{8}$$

At each time step $t$, the actor-critic network receives the state of the robot and the end-effector, and the actor outputs the optimal action strategy. Finally, the action planned in the virtual HRC environment is converted into a robot control program and sent to the robot controller. This cycle continues until the task is completed.
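The critic side of this training loop can be sketched in plain Python as follows. The replay buffer, the label $y = r + \gamma Q'(s', \mu'(s'))$ and the mean squared loss follow the description above; the soft update of the target parameters with rate `tau` is the standard DDPG rule and is an assumption here, as are the toy scalar networks and all numeric values. Terminal states are not handled.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.data.append((s, a, r, s_next))

    def sample(self, batch_size, rng=random):
        """Draw a minibatch uniformly at random without replacement."""
        return rng.sample(list(self.data), batch_size)


def td_targets(batch, target_actor, target_critic, gamma):
    """Labels y = r + gamma * Q'(s', mu'(s')) computed with the target networks."""
    return [r + gamma * target_critic(s_next, target_actor(s_next)) for (_, _, r, s_next) in batch]


def critic_loss(batch, targets, critic):
    """Mean squared error between the labels and the critic's predictions."""
    return sum((y - critic(s, a)) ** 2 for (s, a, _, _), y in zip(batch, targets)) / len(batch)


def soft_update(target_params, online_params, tau):
    """Move target parameters a fraction tau toward the online parameters."""
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(online_params, target_params)]


# Toy scalar "networks" standing in for the deep networks (illustrative only).
def target_actor(s):
    return 0.5 * s


def target_critic(s, a):
    return s + a


def critic(s, a):
    return 0.0


buffer = ReplayBuffer(capacity=100)
for step in range(5):
    buffer.store(float(step), 0.0, 1.0, float(step + 1))
minibatch = buffer.sample(3, rng=random.Random(0))
labels = td_targets(minibatch, target_actor, target_critic, gamma=0.9)
loss = critic_loss(minibatch, labels, critic)
```

In a full implementation the loss would be minimized by a gradient-based optimizer and the actor would be updated along the policy gradient; those steps need an automatic differentiation library and are omitted here.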

Reward function
To make reinforcement learning training more stable and to let the robot unit obtain the optimal action strategy, we establish a reward mechanism for DDPG.
The observation, action and reward functions used in training are presented in Tab. 1. The reward value is related to the elements of the observation space. The observation space of the robot body includes the running time (RT), path length (RPL), collision (RC) and singularity (RS). The observation space of the end-effector includes the running time (ET), path length (EPL) and non-target collision (EC). [α1, α2, α3, α4, α5, α6, α7] is the weight vector, which can be obtained by experiment, where α3, α4 and α7 must be -1. The reward obtained by the agents of the robot and the end-effector is the weighted sum of all rewards. The action of the robot is [θ1, θ2, θ3, θ4, θ5, θ6], which represents the rotation angle of each joint axis. The action of the end-effector is [X, Y, Z, Rx, Ry, Rz], which represents the pose of the end-effector.
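The weighted-sum reward can be sketched as follows. Only the fixed values α3 = α4 = α7 = -1 come from the text; the remaining weights, the encoding of collision and singularity as 0/1 indicators, and the use of raw time and path length as reward terms are assumptions, since Tab. 1 defines the actual terms.

```python
from dataclasses import dataclass


@dataclass
class RobotObservation:
    running_time: float   # RT
    path_length: float    # RPL
    collision: bool       # RC
    singularity: bool     # RS


@dataclass
class EndEffectorObservation:
    running_time: float          # ET
    path_length: float           # EPL
    non_target_collision: bool   # EC


def robot_reward(obs, a1, a2, a3=-1.0, a4=-1.0):
    """Weighted sum for the robot agent; a3 and a4 are fixed to -1."""
    return (a1 * obs.running_time + a2 * obs.path_length
            + a3 * float(obs.collision) + a4 * float(obs.singularity))


def end_effector_reward(obs, a5, a6, a7=-1.0):
    """Weighted sum for the end-effector agent; a7 is fixed to -1."""
    return (a5 * obs.running_time + a6 * obs.path_length
            + a7 * float(obs.non_target_collision))


# Hypothetical weights and observations (illustrative values only).
r_robot = robot_reward(RobotObservation(2.0, 1.5, True, False), a1=-0.1, a2=-0.2)
r_effector = end_effector_reward(EndEffectorObservation(1.0, 0.5, False), a5=-0.1, a6=-0.2)
```

With negative weights on time and path length, shorter and faster motions earn a higher reward, while any collision or singularity costs a full unit regardless of the other terms.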

Case study
The automobile generator is characterized by high complexity, high precision and multiple constraints, and its structure is shown in Fig. 6. In the assembly-commissioning of automobile generators, a large amount of redundant manual operation results in low assembly-commissioning efficiency. Therefore, the automobile generator is used as the case study product to verify the effectiveness of the proposed digital twin-driven HRC assembly-commissioning method.

Construction of digital twin HRC system
To implement the method proposed in this paper, we set up a physical experiment environment and a virtual experiment environment, as shown in Fig. 7. The physical experiment environment mainly includes humans, UR5 robots, a test bench, auxiliary assembly tools, automobile generator parts and related measurement tools. The data of the HRC assembly-commissioning process are uploaded to the virtual experiment environment through various measuring devices and robot controllers.
In the virtual experiment environment, we build a digital twin HRC system composed of a data acquisition module, an assembly-commissioning task knowledge graph module, a digital twin model simulation module, a core intelligent algorithm module and a data statistical analysis module. The functions of each module are as follows:
1) Data acquisition module: this module acquires data from the physical experiment environment, mainly including robot state data, human behavior data and part assembly state data. On the one hand, these data drive the corresponding digital twin model to achieve real-time synchronous mapping. On the other hand, they are used for analysis by the core intelligent algorithms.
2) Assembly-commissioning task knowledge graph module: the assembly-commissioning task knowledge graph quickly retrieves the implement sequence of the current robot unit.
3) Digital twin model simulation module: one function of this module is HRC task simulation, and the other is visualization of the virtual process synchronized with the physical assembly-commissioning process.
4) Core intelligent algorithm module: this module calculates and analyzes the physical data, mainly for human intention recognition and dynamic planning of the robot unit strategy.
5) Data statistical analysis module: this module is mainly used for statistics on the data of the automobile generator assembly-commissioning process.

Method application
For the assembly-commissioning task of automobile generators, this paper carries out experiments on two aspects: human intention recognition and robot unit implementation strategy. The experiments were repeated in 20 groups, and the results are reported as mean values.
(1) Human intention recognition. In this experiment, the proposed method is compared with methods based on skeleton recognition and image recognition. The recognition efficiency and accuracy are analyzed separately, as shown in Tab. 2. The method based on image recognition takes the complete image or the complete image sequence as input. Because the model also attends to the background and lighting of the image, the input information is excessively redundant. As a result, the recognition efficiency (2.5 s) is seriously hindered, and the recognition accuracy (62%) is greatly affected by the redundant part of the image. The method based on skeleton recognition processes only skeleton data, so it has the advantage in recognition efficiency (1.5 s) and is slightly faster than the method proposed in this paper. However, because some human movements are highly similar, its recognition accuracy (83%) is relatively low. The results show that the recognition efficiency (1.7 s) of the proposed method is much higher than that of the method based on image recognition and slightly lower than that of the method based on skeleton recognition. Its recognition accuracy (98%) is much higher than that of the other two methods.
(2) Robot unit implementation strategy. For the DDPG model, the number of steps is 15, and the detection noise is initialized to 0.1. According to different environments and tasks, the robot unit uses the DDPG model to obtain the implement action that reaches the target position within its range of capability.

Methods
In this paper, we take rotor assembly-commissioning as an example. In terms of assembly-commissioning deviation, the deviation of collaborative assembly-commissioning with the proposed method is 0.35 mm, the adjustment deviation of the DDPG model is 0.89 mm, and the setup deviation of pre-programming is 1.92 mm. The method proposed in this paper adaptively adjusts the pose of the end-effector, which is the main reason for the reduced alignment deviation. In terms of assembly-commissioning time, collaborative assembly-commissioning with the proposed method takes 272 s, completing the same task with DDPG takes 498 s, and completing it by pre-programming takes 728 s. The proposed method can meet the task accuracy requirements because of its small adjustment error. In addition, this method does not require repeated manual adjustment of the attitude of the robot unit, so it takes less time than the other two approaches.
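The relative improvements implied by these figures can be checked with a short calculation. The deviation and time values are taken from the paragraph above; the dictionary keys and the helper function are illustrative names.

```python
def relative_reduction(baseline, value):
    """Percentage by which value is lower than baseline."""
    return 100.0 * (baseline - value) / baseline


# Rotor assembly-commissioning results reported in the text.
deviation_mm = {"proposed": 0.35, "ddpg": 0.89, "preprogrammed": 1.92}
time_s = {"proposed": 272, "ddpg": 498, "preprogrammed": 728}

for name in ("ddpg", "preprogrammed"):
    dev = relative_reduction(deviation_mm[name], deviation_mm["proposed"])
    dur = relative_reduction(time_s[name], time_s["proposed"])
    print(f"vs {name}: deviation -{dev:.1f}%, time -{dur:.1f}%")
# vs ddpg: deviation -60.7%, time -45.4%
# vs preprogrammed: deviation -81.8%, time -62.6%
```

Relative to pre-programming, the proposed method therefore reduces the deviation by about 82% and the time by about 63% in this example.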

Results
The assembly-commissioning times of the traditional manual (T-M) method and the digital twin-driven human-robot cooperation (DT-HRC) method are measured separately, as shown in Fig. 8. The experimenters had already mastered the operation method through enterprise training and extensive practice. In this experiment, the mean value of 20 groups of data was calculated. The results show that the total assembly-commissioning time of the automobile generator with the DT-HRC method is 46.3% of that with the T-M method.
Fig. 8 Comparison of results of total assembly-commissioning time
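Read as a ratio of mean total times, the reported figure can be written out as follows. The symbols $T_{\text{DT-HRC}}$ and $T_{\text{T-M}}$ are notation introduced here for the mean total assembly-commissioning times of the two methods; they do not appear in the paper.

```latex
\frac{T_{\text{DT-HRC}}}{T_{\text{T-M}}} = 0.463
\quad\Longrightarrow\quad
\frac{T_{\text{T-M}} - T_{\text{DT-HRC}}}{T_{\text{T-M}}} = 1 - 0.463 = 0.537
```

Under this reading, the DT-HRC method shortens the total assembly-commissioning time by about 53.7% relative to the T-M method.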

Conclusions and future work
In the assembly-commissioning of complex products, manual operation accounts for a large proportion of the work. Excessive redundant manual operation not only consumes labor but also seriously reduces assembly efficiency. In recent years, with the continuous improvement of robot safety technology, cooperative robots have been introduced into production systems as a new generation of industrial robots. Unlike traditional industrial robots, cooperative robots can share work tasks and spaces with people. Humans and cooperative robots perform tasks together, combining the advantages of manual and automatic production and significantly improving production efficiency. The continuous application of cooperative robots has made HRC technology increasingly mature. However, existing HRC assembly systems adapt poorly to the environment and cannot update the assembly strategy quickly when the assembly task changes. To meet the assembly requirements of complex products, this paper presents a digital twin-driven HRC assembly-commissioning method. The main contributions of this paper are as follows: 1) A digital twin-driven HRC assembly-commissioning framework is established to realize synchronous virtual-real mapping of the HRC process; 2) To improve the efficiency and accuracy of task recognition, an intention recognition method is proposed that integrates part features into the human joint sequence; 3) To give the robot unit the ability to adapt to the environment, the implement sequence of the robot unit is extracted quickly by constructing an assembly-commissioning task knowledge graph, and the DDPG algorithm is used to adaptively adjust the robot unit's actions.
Finally, this paper takes an automobile generator as an example to verify the application. The results show that, compared with the traditional assembly method, the proposed method can effectively improve assembly efficiency. In the future, we will introduce AR technology into the system to enhance the virtual-real fusion ability of the digital twin.

Declarations

Ethical Approval
The Author confirms: that the manuscript is not submitted to more than one journal for simultaneous consideration; that the submitted work is original and is not published elsewhere in any form or language; that the study is not split up into several parts to increase the quantity of submissions and submitted to various journals or to one journal over time; that results are presented clearly, honestly, and without fabrication, falsification, or inappropriate data manipulation. Authors have adhered to discipline-specific rules for acquiring, selecting, and processing data.

Consent to Participate
Participants in the test clearly understood the risks and signed the relevant informed consent to participate in the experiment.

Consent to Publish
The Author confirms: that the work described has not been published before (except in the form of an abstract or as part of a published lecture, review, or thesis); that it is not under consideration for publication elsewhere; that its publication has been approved by all co-authors, if any; that its publication has been approved (tacitly or explicitly) by the responsible authorities at the institution where the work is carried out. The Author agrees to publication in the Journal indicated below and also to publication of the article in English by Springer in Springer's corresponding English-language journal.
The copyright to the English-language article is transferred to Springer effective if and when the article is accepted for publication. The author warrants that his/her contribution is original and that he/she has full power to make this grant. The author signs for and accepts responsibility for releasing this material on behalf of any and all co-authors. The copyright transfer covers the exclusive right to reproduce and distribute the article, including reprints, translations, photographic reproductions, microform, electronic form (offline, online) or any other reproductions of similar nature.

Competing Interests
The authors of this paper have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Availability of data and materials
The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.

Figure captions:
Digital twin-driven HRC assembly-commissioning framework
The digital representation for HRC environment
The framework of assembly-commissioning task recognition
Robot unit implement sequence retrieval
The overall structure of robot and end-effector DDPG
Structure of automobile generator
Digital twin HRC system
Comparison of results of total assembly-commissioning time

Fig. 2 The digital representation for HRC environment

Tab. 1 Observation space, action space and reward function

Fig. 6 Structure of automobile generator

(2) Robot unit implement strategy
This experiment mainly observes the response time of the robot action and the accuracy rate of reaching the target position, as shown in Tab. 3. Before training the DDPG model, the number of training episodes is set to 1500 and the memory pool capacity to 56000.
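The training settings reported for the DDPG model (a count of 1500, a memory pool capacity of 56000, 15 steps, and an initial noise value of 0.1) can be collected into a small configuration together with a fixed-size memory pool. This is only a sketch under stated assumptions: reading the count of 1500 as training episodes, treating the noise value as the scale of Gaussian exploration noise, and the action bounds of [-1, 1] are all assumptions, and every class and function name is hypothetical rather than taken from the authors' implementation.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    episodes: int = 1500          # reported count; reading it as episodes is an assumption
    steps_per_episode: int = 15   # reported number of steps
    memory_capacity: int = 56000  # reported memory pool capacity
    noise_scale: float = 0.1      # reported initial noise; Gaussian form is an assumption


class ReplayMemory:
    """Fixed-capacity memory pool; the oldest transitions are discarded first."""

    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def noisy_action(action, scale, low=-1.0, high=1.0):
    """Add zero-mean Gaussian noise to each component, then clip to [low, high]."""
    return [min(high, max(low, a + random.gauss(0.0, scale))) for a in action]


cfg = TrainingConfig()
memory = ReplayMemory(cfg.memory_capacity)

# Upper bound on stored transitions if every episode runs exactly 15 steps.
max_transitions = cfg.episodes * cfg.steps_per_episode
```

Under these assumptions, at most 1500 × 15 = 22500 transitions are generated, so a pool of 56000 would never evict a transition during training.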

Tab. 3 Comparison of robot unit performance by different methods
Xuemin Sun: Conceptualization, Methodology, Software, Writing - original draft, Writing - review & editing, Resources, Data curation, Visualization. Rong Zhang: Conceptualization, Methodology, Writing - review & editing, Supervision. Shimin Liu: Conceptualization, Methodology, Supervision. Qibing Lv: Conceptualization, Methodology, Supervision. Jinsong Bao: Supervision, Project administration, Funding acquisition. Jie Li: Software, Visualization, Validation.

Funding
This work is financially supported by the National Key Research and Development Plan of China (Grant 2019YFB1706300), in part by the Fundamental Research Funds for the Central Universities and Graduate Student Innovation Fund of Donghua University (Grant No. CUSF-DH-D-2020051), in part by the Fundamental Research Funds for the Central Universities (No. 2232019D3-32), and in part by the Shanghai Sailing Program (19YF1401600).