An MDP-Based Lifter Assignment Algorithm for Inter-Line Transportation in Semiconductor Fabrication

As semiconductor device geometries continue to shrink, the semiconductor manufacturing process becomes increasingly complex. This usually results in unbalanced utilization of machines and decreases overall productivity. One way to resolve such a problem is to share the resource capacity between different lines divided by floors. To this end, designing an efficient lifter assignment method to more efficiently manage transfer requests (TRs) of wafer lots to different floors is required. Motivated by this, our study addresses the assignment of lifters for delivering wafer lots to different floors. Unlike previous studies, which consider the current state of the system, our study considers both the current and possible future states of the system. We formulate an optimization model based on the Markov decision process. Then, we design an efficient method as a solution using both clustering and tournament selection methods. Experiments based on historical data confirm the effectiveness of the proposed algorithm in reducing travel times and delivery delays compared to the benchmark rules in practice. Sensitivity analysis demonstrates the robustness of the proposed model as the number of TRs increased. The proposed approach is expected to yield significant economic savings in both operating costs and labor.


Introduction
Wafer fabrication in semiconductor manufacturing is an extremely complex process. To produce an electronic or photonic circuit on a minute semiconductor wafer, a wafer lot in a semiconductor fabrication facility (FAB) is frequently transported between and within different processing areas, such as those dedicated to etching or imprinting.
Many re-entrant processes, in which a wafer lot enters the same group of machines repeatedly, are also necessary. To manage these complex material flows, automated material handling systems (AMHSs), which are sophisticated material control systems that move materials from one machine to another, have been successfully applied in semiconductor manufacturing. In particular, an AMHS employing an overhead hoist transfer (OHT) reduces the wafer transport time dramatically and contributes to reduced cycle times, increased equipment utilization, and improved on-time delivery.
As semiconductor manufacturing processes become increasingly complex and the number of production steps increases, a single-floor line often has inadequate capacity to cover production needs. Therefore, FABs with multiple floors have been introduced recently [1]. In the initial FAB operation phase, inter-floor transfers are kept to a minimum because the capacity of the process machines on each floor is kept balanced. However, as the product mix changes, the equipment utilization on each floor becomes unbalanced and more inter-floor transfers are required to share equipment resources on other floors. In general, the number of lifters for inter-floor transfers is kept to a minimum because a lifter is expensive to set up and consumes cleanroom space. Thus, the lifters become a major bottleneck of the AMHS when inter-floor transfers increase [2].
The typical process of inter-floor transport is shown in Figure 1a. Upon the arrival of a transfer request (TR) to transport a wafer lot from a machine on floor A to another on floor B, the AMHS allocates an OHT and a lifter connected to floor B. Then, the assigned OHT unloads the wafer lot from the source machine and moves it to the dedicated lifter. After the OHT arrives at the lifter, it loads the wafer lot into the in-buffer port of the lifter for transportation to floor B. A buffer port is where a lot waits either for transport to another floor by the lifter or for onward transport to the destination by an OHT.
The wafer lot in the in-buffer port is transported to the out-buffer port on another floor by the rack-master (Figure 1b). Finally, an OHT on floor B transports the wafer lot to the destination machine. Transport times between floors are much longer than on a single floor because an inter-floor transfer consists of two transfers by OHTs, a transport by a lifter, and waiting at the buffer port of the lifter. Thus, when the lifter is not selected properly, the travel time to the destination machine could be excessively long, and the shared machines would not be fully utilized on account of the transportation delays [2]. In practice, when the AMHS selects the lifter for each inter-floor transfer, dispatching rules, such as the round-robin rule or the shortest-travel-distance rule, are widely used instead of analytical models to identify the acceptable solutions in the dynamic environment of the actual scenario. The round-robin rule chooses lifters in a pre-set circular order; TRs are evenly distributed to all lifters. This approach has the advantage of keeping lifter workloads even and minimizing the variance of the waiting time in the buffer port of the lifter. It also decreases the vehicle congestion around lifters by preventing excessive traffic on specific lifters. However, because the travel time of an OHT is not considered, a lifter with a long distance to travel is frequently assigned, which causes a long travel time. The shortest-travel-distance rule assigns wafer lots to the nearest lifters to reduce the travel time. However, TRs could then be concentrated on specific lifters, which increases the waiting time in the buffers and causes congestion. Thus, with such simple rules, the operational efficiency of the AMHS is limited. For this reason, system operators manually adjust the rules according to changes in the manufacturing environment.
Nevertheless, this sort of adjustment is labor intensive and can miss the correct timing.
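The contrast between the two dispatching rules described above can be made concrete with a minimal sketch. The machine names, lifter names, and distances are hypothetical and are not taken from the FAB data; the sketch only illustrates why round-robin ignores distance while the shortest-travel-distance rule concentrates TRs on one lifter.

```python
from itertools import cycle


def make_round_robin(lifter_ids):
    """Return an assignment function that hands out lifters in a fixed circular order."""
    order = cycle(lifter_ids)

    def assign(_source_machine=None):
        # The source machine is ignored on purpose: round-robin does not use distance.
        return next(order)

    return assign


def nearest_lifter(distance, source_machine):
    """Pick the lifter with the smallest travel distance from the source machine."""
    candidates = distance[source_machine]
    return min(candidates, key=candidates.get)


# Hypothetical travel distances (metres) from two machines to three lifters.
DISTANCE = {
    "M1": {"L1": 30, "L2": 80, "L3": 120},
    "M2": {"L1": 40, "L2": 90, "L3": 60},
}

round_robin = make_round_robin(["L1", "L2", "L3"])
rr_choices = [round_robin("M1") for _ in range(4)]
nearest_choices = [nearest_lifter(DISTANCE, m) for m in ("M1", "M2", "M1")]

print(rr_choices)       # TRs spread over all lifters, including the distant L3
print(nearest_choices)  # every TR lands on L1, which would build up a queue there
```

In this toy setting, round-robin sends a TR from M1 to the distant lifter L3, whereas the nearest-lifter rule sends all three TRs to L1.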
We therefore propose a method based on the Markov decision process (MDP) to optimally assign lifters to minimize the inter-floor transportation time in dynamic manufacturing conditions. One of the biggest limitations of current practice is that the future states of lifters operating in a FAB are not considered. Therefore, undesirable situations, such as delays in transporting a wafer lot to another line, cannot be fully prevented. However, by employing MDP, which derives solutions by considering a system's future states, we can address this issue. We also evaluate the performance of the proposed approach by using an emulator with data gathered from an actual FAB.
The remainder of this paper is structured as follows. Section 2 reviews the existing literature related to this study and discusses its implications and significance.
Our contributions to the existing literature are also clarified. Section 3 details the mathematical model. Section 4 describes the proposed solution algorithm based on MDP. Section 5 describes the experimental design, the results of the experiments, and their implications. Finally, Section 6 concludes our study.

Background
Our study addresses a lifter assignment problem that has been receiving considerable attention recently. Specifically, our objective is to increase the efficiency of AMHS and the productivity of FAB in the manufacturing domain. With this consideration, we first briefly review the research stream of AMHS, and then focus on relevant studies. For a general review on AMHS, refer to [3][4][5][6].
Modification of an AMHS in an operating FAB, such as relocating machines or installing additional rails, storage, or lifters, typically results in a tremendous production loss because it requires a partial or complete shutdown of the system during that period. Therefore, many studies have been conducted on vehicle management, including vehicle allocation, dispatching, and routing, which are viable options for improving productivity without shutting down the FAB.
The vehicle allocation problem in the semiconductor AMHS generally involves determining the optimal fleet size to minimize delivery time. Many researchers have proposed determination of the optimal fleet size, which would fulfil a specific transfer requirement [7][8][9][10]. To solve a certain vehicle allocation problem, some studies focused on dynamically repositioning idle vehicles to appropriate locations to minimize the transport time. Kim et al. [11] and Vahdani [12] proposed an idle vehicle circulation policy to balance the number of idle vehicles among the bays. More recently, Lee et al. [13] and Schemaler et al. [14] utilized the information of future transports to respond to highly dynamic manufacturing environments.
Another important issue in managing AMHS is to design efficient vehicle dispatching rules. In general, the vehicle dispatching problem aims to find the most appropriate dispatching rules to achieve operational goals, such as minimizing vehicle waiting time. For this, many studies compared several dispatching rules in various conditions [15][16][17][18]. Several studies showed that the dispatching rules have a significant impact on the performance of the system, such as the average transfer time, waiting time, vehicle utilization, and even throughput. Some investigations introduced new dispatching rules by reassigning vehicles during the unloading travel time [19][20][21]. They strived to dynamically exchange the vehicle-request assignment according to the system scenario and showed that it has a significant positive effect on reducing the vehicle assign time as well as load travel time.
The third and last problem to improve AMHS operation is to design efficient vehicle routing rules. The vehicle routing problem determines the optimal vehicle routes to visit, thereby aiming to minimize the transport times of TRs. Typically, studies in vehicle routing problems have focused on finding conflict-free routes [22][23][24][25]. Another objective is to design vehicle routing decisions that consider traffic congestion, which should be avoided as much as possible [26,27].
More recently, studies on improving AMHS have expanded their scope to include storage or lifter allocation. This expanded scope is inevitable owing to the fact that many FABs are starting to use multiple lines (floors) to increase their production capacity. Kim et al. [28] and Siebert et al. [29] focused on lot targeting to improve the material flows through the storage locations, while Lee et al. [30] proposed a machine learning approach to select the best dispatching rule for storage allocation. They showed that the storage allocation significantly affects the performance of the AMHS.
In terms of prior research, to the best of our knowledge, so far only three studies address the lifter assignment problem for inter-line transfers in semiconductor manufacturing facilities [1,31,32]. The first study to introduce the lifter assignment problem for inter-line transportation is by Jimenez et al. [31]. They suggested four rules for the selection of the rail for TRs to be sent to a stocker on the same floor (intra-line transportation), and four rules for the selection of the lifter to be sent to another floor (inter-line transportation). Based on simulation experiments in which the ratio of the intra-line to inter-line transportation volume was changed, they determined a rule that considers the number of waiting lots on lifters. This rule showed good performance when the inter-line transportation was increased, whereas when the inter-line transportation was decreased, considering both the travel time and the number of waiting lots was recommended. The other two studies were conducted relatively recently. Na et al. [1] presented the lowest utilization, the integer programming model, and the shortest-expected-arrival-time (SEAT) rule as methods of selecting a lifter.
They showed that the SEAT method decreases the average delivery time by 8.9% compared to the round-robin method used in practice. Lee et al. [32] proposed operation policies to improve the efficiency of lifter operations in material handling of semiconductor lines. Their policies involve specific and practical operational decisions, such as the number of virtual ports, activating the alternative storage, and using a shelf extraction procedure by rack-masters. However, they did not suggest an algorithm or mathematical model to generate optimal policies.
Our study differs from the above research in two respects. First, the prior studies that provided heuristic approaches were empirically shown to yield a reasonable performance; however, they did not provide an optimal solution algorithm that accounts for the stochastic nature of the problem. Also, they only considered the current state, whereas ours can consider both the current and the future states, thus making it possible to derive better solutions for AMHS management.
Based on the literature review above, this field has been studied quite extensively. However, we believe that our study contributes to it in three ways. First, our optimization model is the first to consider stochastic dynamics in the lifter-assignment problem. To the best of our knowledge, all the studies on the lifter-assignment problem [1,31,32], as well as the rules currently in actual use on the line, consider only the current state. Our proposed model considers not only the current state but also the states that will occur in the future. Second, we propose an algorithm that can be applied in a real-world setting. In general, MDP models have difficulty finding an optimal solution when the target system is large or complex. We address this limitation by using a clustering technique to simplify the representation of source machines with similar travel-time distributions and by grouping multiple lifters to partition the problem. Lastly, we propose a framework with autonomous control that can serve as the basis for establishing a smart factory. Since operators adjust the rules empirically according to changing operating conditions, operational efficiency varies widely with the operator's expertise, and the adjustment is very labour-intensive. In contrast, the proposed framework enables autonomous control by automatically updating the model.

Mathematical Model
We consider a lifter-assignment problem in which the delivery time is to be minimized.
When a TR is generated and wafer lots should be transported to another line, the sequence of events occurs, as shown in Figure 2, from the perspective of the departure floor. We define assign time as the time from the moment of the TR's arrival until the OHT arrives at the source machine and loads wafer lots; travel time as the time required between unloading wafer lots from the source machine and loading them into the buffer port of the lifter; waiting time as the time it takes for wafer lots to be picked up by the rack-master after being loaded into the lifter's buffer port; and cycle time as the time until the rack-master transports the lot to another floor and returns to the departure floor.
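Finally, the delivery time is the sum of the travel time and the waiting time. Using these quantities, the decision that minimizes the value function of the state is derived. The value function is formulated using Bellman's equation, which in its standard cost-minimizing form reads

$$V(s) = \min_{a} \Big[ c(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s') \Big],$$

where $s$ is the system state, $a$ is the action (the lifter to which the TR is assigned), $c(s, a)$ is the immediate cost, $P(s' \mid s, a)$ is the state transition probability, and $\gamma$ is a discount factor satisfying $0 \le \gamma < 1$. Finally, a policy that maps each state to the optimal action is derived.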
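To illustrate how a policy follows from a Bellman equation of this kind, the following is a minimal value-iteration sketch on a toy problem. It is not the paper's model: the two states, two lifters, costs, transition probabilities, discount factor, and tolerance are all hypothetical and chosen only to show the mechanics.

```python
def q_value(s, a, V, cost, transition, gamma):
    """Immediate cost plus discounted expected value of the next state."""
    return cost[s][a] + gamma * sum(p * V[s2] for s2, p in transition[s][a].items())


def value_iteration(states, actions, cost, transition, gamma, tol):
    """Minimise expected discounted cost; return the value function and a greedy policy."""
    V = {s: 0.0 for s in states}
    while True:
        new_V = {
            s: min(q_value(s, a, V, cost, transition, gamma) for a in actions)
            for s in states
        }
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break
    policy = {
        s: min(actions, key=lambda a: q_value(s, a, V, cost, transition, gamma))
        for s in states
    }
    return V, policy


# Hypothetical toy problem: lifter A is close but may be busy, lifter B is farther away.
STATES = ["A_free", "A_busy"]
ACTIONS = ["A", "B"]
COST = {  # travel time plus waiting time for each (state, action) pair
    "A_free": {"A": 10, "B": 25},
    "A_busy": {"A": 40, "B": 25},
}
NEXT = {  # the chosen lifter drives the next state in this toy example
    "A": {"A_free": 0.2, "A_busy": 0.8},
    "B": {"A_free": 0.7, "A_busy": 0.3},
}
TRANSITION = {s: NEXT for s in STATES}

V, policy = value_iteration(STATES, ACTIONS, COST, TRANSITION, gamma=0.9, tol=1e-6)
print(policy)
```

In this toy setting the resulting policy sends the TR to lifter A when it is free and to lifter B when A is busy, even though A is always the closer lifter.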
The lifter assignment problem can be viewed as a sequential decision-making problem because it is necessary to determine which lifter to send the TR to at the moment the TR is generated. Therefore, we developed an MDP model that aims to minimize the delivery time. Based on the results of a preliminary analysis that fits historical data to candidate probability distributions, we assume that the inter-arrival times of all events follow an exponential distribution. Under this assumption, the event generation process of this model is a Poisson process. Therefore, the probability of each system state transition is given by the rate of the corresponding event divided by the sum of all event rates [33,34], where $\lambda_i$ denotes the arrival rate of TRs from source machine $i$. We define $e_j$ as a $1 \times |2L|$ unit vector whose $j$th element is 1, where $L$ denotes the set of lifters.
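The rule that each transition probability equals the event rate divided by the sum of all rates can be checked with a short sketch. The event names and rates below are hypothetical; only the normalisation step reflects the text.

```python
def event_probabilities(rates):
    """Probability that each event occurs next when all events have exponential clocks.

    rates maps an event name to its rate (events per unit time).
    """
    total = sum(rates.values())
    return {event: rate / total for event, rate in rates.items()}


# Hypothetical rates (events per hour): TR arrivals from two source machines
# and the completion of one lifter cycle.
RATES = {
    "tr_from_machine_1": 20,
    "tr_from_machine_2": 30,
    "lifter_1_cycle_done": 50,
}

probs = event_probabilities(RATES)
print(probs)
```

With these rates the lifter cycle completion is the next event half of the time, and the two TR arrivals share the other half in proportion to their rates.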

Solution Method for Lifter Assignment in an Actual Fab
Although the MDP model in Section 3 guarantees the provisioning of an optimal policy for decisions on a lifter assignment, there exists a critical computational issue from a practical standpoint. An MDP model is generally solved by using dynamic programming (DP). However, when the problem size approaches the size typically found in real-world applications, the MDP model requires tremendous computing resources and time to handle the large number of variables. Because of this "curse of dimensionality," solutions of realistic problems usually cannot be obtained in a reasonable amount of time. Considering the size of an actual FAB and its complex operating environment, the size of the state space must be very large. To tackle this difficulty, we propose a solution method to efficiently solve the given problem.

Solution approach
The main idea of the proposed solution approach is to reduce the dimension of the problem by clustering source machines with similar travel-time distributions and by dividing the lifters into groups, each of which defines a sub-problem. The next step is to obtain the optimal policies for each sub-problem. To obtain the final best lifter, it is also necessary to solve new sub-problems consisting of the optimal lifters selected from the sub-problems. That is, optimal policies for all possible sub-problems have to be prepared a priori. Figure 6 shows the nine scenarios that are possible when two sub-problems with three possible results each are combined. It is worth noting that if the number of groups is greater than N (the maximum number of lifters per group), the selected lifters are regrouped in the manner described above and the tournament selection method is applied again. The final step is to select the best lifter for each TR in the execution phase within a short time, in near real time. When a new TR is issued, the best lifter is derived by tournament selection, which only looks up the optimal policies that have already been solved in step 3 (Figure 7).

Algorithms for lifter assignment
The detailed procedure is presented as pseudo-code in Tables 1a and 1b. Considering the real operation of FABs, a lifter assignment algorithm using MDP is not necessarily updated in a real-time manner owing to the fact that the optimal policies can be prepared by considering all possible situations a priori. However, a lifter assignment step should be activated in real time whenever a TR for inter-line transportation arrives.
By considering this circumstance, we devised two algorithms for the actual FAB operations. One is the Lifter Assignment Policy Preparation Module, which periodically constructs MDP models and derives lifter assignment policies using historical data; the other is the Lifter Assignment Module, which assigns lifters in real time using policies derived in the Lifter Assignment Policy Preparation Module.

Table 1a: Lifter Assignment Policy Preparation Module
Step 2 Divide into sub-problems
Step 2-1 Make initial lifter groups IG using the coordinates (x, y) of every lifter, such that each group contains between 2 and N lifters
Step 2-2 Make additional groups for all combinations that can be made when one lifter is picked from each initial group
Step 3 Prepare the optimal policies for all possible scenarios a priori
Step 3-1 Derive the optimal policy that determines one among the lifters in each initial group
Step 3-2 Derive the optimal policy that determines one among the lifters in each additional group
Output: the optimal policy for each initial lifter group; the optimal policy for each additional lifter group; a classifier that assigns a cluster ID to a machine using the x, y coordinates of the machine

Table 1a presents the pseudo-code for the Lifter Assignment Policy Preparation Module. In step 1, using the historical data of the machines that sent TRs to all lifters, the source machines with similar travel time distributions for each lifter are grouped together. Next, in order to assign cluster IDs to machines that did not send TRs to all lifters, we create a classifier by learning the x and y coordinates and cluster IDs of the clustered machines. By using this classifier, all machines can be grouped into clusters. After allocating a cluster to every machine, the average travel time from each cluster to each lifter and the average inter-arrival time of TRs in each cluster are obtained using data of the machines in each cluster.
In a similar way, in step 2, we group the lifters using their physical location information to form the initial lifter groups IG. That is, by using the x and y coordinates of all lifters, we cluster up to N lifters as one group. N can be set according to the computing environment in which the algorithm is used, in order to manage the computational load. Then, we form additional groups for all combinations that can be created when one lifter is picked from each initial group. For example, for seven lifters with N set to three, the x and y coordinates of all lifters yield the initial lifter groups IG={(1,2,3), (4,5), (6,7)}.
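The construction of additional groups from the example IG={(1,2,3), (4,5), (6,7)} can be sketched as a Cartesian product over the initial groups. The function names are hypothetical, and the spatial clustering that produces the initial groups is taken as given.

```python
from itertools import product


def check_group_sizes(initial_groups, max_size):
    """Verify that every initial group holds between 2 and max_size lifters."""
    return all(2 <= len(group) <= max_size for group in initial_groups)


def make_additional_groups(initial_groups):
    """Enumerate every group obtained by picking one lifter from each initial group."""
    return list(product(*initial_groups))


# Initial groups from the example in the text (seven lifters, N = 3).
IG = [(1, 2, 3), (4, 5), (6, 7)]

sizes_ok = check_group_sizes(IG, max_size=3)
additional_groups = make_additional_groups(IG)

print(sizes_ok)
print(len(additional_groups))  # 3 * 2 * 2 combinations
print(additional_groups[:3])
```

For this example there are 3 × 2 × 2 = 12 additional groups, so 3 + 12 = 15 policies would have to be prepared in advance.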
In step 3, we derive the optimal policy for all lifter groups formed in step 2, that is, for every initial group and every additional group. In other words, given the current state and source machine, a policy that determines the best lifter among the lifters in the group is obtained. The optimal policies of the additional lifter groups are used when selecting the final optimal lifter through the tournament method in the Lifter Assignment Module. The best lifter in each initial group is determined by the current system state and the cluster ID of the source machine. Since we do not derive the optimal policy from a model constructed in real time, we must consider all combinations of lifters that can be selected from the initial groups. Therefore, by selecting one lifter from each group, we create additional groups for all cases that can be made and obtain the optimal policy of these groups. For example, with IG={(1,2,3), (4,5), (6,7)}, if lifters 1, 4, and 7 are the best lifters of their respective initial groups, we use the optimal policy of the additional group (1, 4, 7).
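A two-round tournament over precomputed policies can be sketched as follows. This is an illustration under stated assumptions, not the paper's Table 1b: each policy is represented as a callable keyed by its lifter group, and a stand-in policy that picks the lifter with the smallest hypothetical load replaces the MDP-derived policies. The sketch also assumes that the number of initial groups does not exceed N, so a single final round suffices.

```python
from itertools import product


def make_stub_policy(group):
    """Stand-in for an MDP-derived policy: choose the least-loaded lifter in the group.

    The cluster ID is accepted but unused here; a real policy would depend on it.
    """
    def policy(state, cluster_id):
        return min(group, key=lambda lifter: state[lifter])
    return policy


def tournament_select(initial_groups, policies, state, cluster_id):
    """Round 1: one winner per initial group. Final round: policy of the winners' group."""
    winners = tuple(policies[group](state, cluster_id) for group in initial_groups)
    best = policies[winners](state, cluster_id)
    return best, winners


IG = [(1, 2, 3), (4, 5), (6, 7)]

# Policies are prepared in advance for every initial and every additional group.
POLICIES = {group: make_stub_policy(group) for group in IG}
POLICIES.update({group: make_stub_policy(group) for group in product(*IG)})

# Hypothetical system state: a load value per lifter.
STATE = {1: 50, 2: 70, 3: 90, 4: 40, 5: 60, 6: 80, 7: 30}

best_lifter, round_one_winners = tournament_select(IG, POLICIES, STATE, cluster_id=0)
print(round_one_winners, best_lifter)
```

With the stand-in policy the tournament simply reproduces a global minimum; the point of the structure is that each lookup involves at most N lifters, which keeps every precomputed sub-problem small.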

A framework for system implementation
The proposed solution approach and algorithms theoretically enable effective assignment of lifters in actual FABs. The remaining question is how to apply these algorithms to real FAB operation. Figure 8 shows the system framework for lifter assignment in real FABs. The two modules described above are aligned to the central operating system of AMHS; however, they work independently.

Figure 8: System implementation framework for lifter assignment in real FABs
The Lifter Assignment Policy Preparation Module produces optimal policies for lifter groups and cluster information for source machines by periodically using historical data. All these outputs are stored in the central system of AMHS in table format. The table of the optimal policy contains the system state, cluster ID, lifter group ID, and the optimal lifter ID in that group as columns. Using this table, given the current state and cluster ID, the best lifter in each group can be selected easily. The cluster information table stores the results from mapping the machine ID to the cluster ID.
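The two tables described above can be pictured as simple key-value lookups. The sketch below is hypothetical: the machine IDs, group IDs, and the encoding of the system state as a tuple of per-lifter TR counts are assumptions made for illustration.

```python
# Cluster information table: machine ID -> cluster ID (hypothetical entries).
CLUSTER_TABLE = {
    "EQP_A01": 3,
    "EQP_B07": 5,
}

# Optimal policy table: (system state, cluster ID, lifter group ID) -> optimal lifter ID.
# The state is assumed to be (TRs moving to L1, TRs moving to L2,
#                             TRs waiting at L1, TRs waiting at L2).
POLICY_TABLE = {
    ((0, 0, 1, 0), 3, "G1"): "L1",
    ((2, 1, 0, 0), 3, "G1"): "L2",
    ((2, 1, 0, 0), 5, "G1"): "L1",
}


def lookup_best_lifter(machine_id, state, group_id):
    """Map the machine to its cluster, then read the optimal lifter for the group."""
    cluster_id = CLUSTER_TABLE[machine_id]
    return POLICY_TABLE[(tuple(state), cluster_id, group_id)]


choice_a = lookup_best_lifter("EQP_A01", [2, 1, 0, 0], "G1")
choice_b = lookup_best_lifter("EQP_B07", [2, 1, 0, 0], "G1")
print(choice_a, choice_b)
```

Because the policies are precomputed, the run-time cost of an assignment is two table lookups per tournament round.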
In the Lifter Assignment Module, when a TR for inter-line transportation is generated, the state manager checks the current system state (the number of TRs moving to each lifter and the number of TRs waiting to be transported at each lifter).
Then, the decision manager takes as input the name of the source machine that sent the TR and the current state. By automatically processing this information, it obtains the x and y coordinates of the source machine and reads its cluster ID from the cluster information table.

Experimental Results
To verify the effectiveness of the proposed approach, we conduct both a simple analysis in a virtual environment and a simulation study that mimics a real-world FAB using an AMHS-oriented simulation model (emulator) provided by the company that operates the FAB. Since the emulator describes the actual FAB operating environment with high fidelity, it is an appropriate testbed to examine various approaches to designing effective operational strategies for FABs.
In Section 5.1, we present the details of the experiments. Section 5.2 describes the results of the experiments in a simple environment. In this section, the merits of using our approach are presented. In Section 5.3, the experimental results obtained through the realistic FAB simulation are presented and discussed.

Experimental settings
We first conduct virtual experiments by assuming a small-sized environment. There are three lifters and eight machines. The travel times for all combinations of lifters and machines are shown in Table 2. Next, we construct a simulation study based on historical data gathered from Samsung Semiconductor. The FAB we used for our simulation study consists of more than 800 source machines, more than 150 OHTs, and 7 lifters. We use data from March 2020; more detailed information about the data cannot be disclosed because of security issues. Since the goal of this study is to examine the effectiveness of the proposed approach for lifter assignment, we use two benchmark assignment rules (described earlier) that are used in the actual FAB: the round-robin rule and the shortest-transportation-time rule. The round-robin rule distributes TRs evenly to all lifters. It primarily aims to balance the utilization of all lifters. Since the distance from the source machine to the lifter is not considered, it is not expected to reduce delivery times. The shortest-transportation-time rule aims to transfer the TR to another floor as quickly as possible by selecting the lifter for which the weighted sum of the travel time between the two machines and the number of TRs being transferred is the smallest. In practice, the weight may be adjusted by line managers depending on changes in the operating environment. In our experiment, the weight is set from the historical data.
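The weighted-sum scoring of the shortest-transportation-time rule can be sketched as below. The travel times, TR counts, and weight values are hypothetical; the weight used in the experiments is derived from historical data and is not given in the text.

```python
def shortest_transportation_time(travel_time, trs_in_transfer, weight):
    """Score each lifter as travel time + weight * number of TRs being transferred."""
    scores = {
        lifter: travel_time[lifter] + weight * trs_in_transfer[lifter]
        for lifter in travel_time
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical inputs for one TR.
TRAVEL_TIME = {"L1": 30, "L2": 45, "L3": 70}   # seconds from the source machine
TRS_IN_TRANSFER = {"L1": 4, "L2": 1, "L3": 0}  # TRs currently heading to each lifter

best_weighted, weighted_scores = shortest_transportation_time(
    TRAVEL_TIME, TRS_IN_TRANSFER, weight=10
)
best_unweighted, _ = shortest_transportation_time(
    TRAVEL_TIME, TRS_IN_TRANSFER, weight=0
)
print(best_weighted, weighted_scores)
print(best_unweighted)
```

With a weight of zero the rule collapses to picking the nearest lifter (L1 here); a positive weight shifts the choice to the less loaded L2, which shows why the result is sensitive to how the weight is tuned.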
As explained in Section 4, the proposed algorithm uses clustering and classification techniques. We cluster source machines by using the well-known DBSCAN clustering algorithm, which can provide robust clustering outcomes regardless of the distribution of data points and does not require the number of clusters a priori. DBSCAN assigns a cluster ID if there are at least minPts data points within the distance ε of a data point. For more detail on DBSCAN, refer to Ester et al. [35]. The algorithm parameters ε and minPts are set to 0.2 and 3, respectively, and the Euclidean distance is used as the distance measure. To train faster and reduce the likelihood of falling into local optima, we standardize the average travel times from the machine to each lifter, which are used as the data point. In addition, we individually assign cluster IDs to source machines that are judged to be noise points because they do not belong to any cluster. To create the classifier, the k-nearest neighbour (k-NN) algorithm is used. We set the parameter k of the k-NN algorithm to 1 so that the cluster of the machine closest to the input machine is assigned. The Euclidean distance is also used as the distance measure in the k-NN algorithm. Finally, we derive the optimal solution of the MDP model by using the value iteration algorithm. The discount factor of the MDP model is set to 0.99 and the stopping criterion of the value iteration algorithm is set to 0.001.
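The clustering and classification steps can be sketched in plain Python as follows. This is a compact illustration, not the Java implementation used in the experiments: the feature vectors (assumed to be already standardized travel times to two lifters) and the machine coordinates are hypothetical, while ε = 0.2, minPts = 3, and k = 1 follow the settings stated above.

```python
import math


def dbscan(points, eps, min_pts):
    """Plain DBSCAN with Euclidean distance; returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # The point itself counts as one of its own neighbours.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels


def give_noise_own_ids(labels):
    """Assign each noise point its own cluster ID, as described in the text."""
    next_id = max(labels, default=-1) + 1
    result = []
    for label in labels:
        if label == -1:
            result.append(next_id)
            next_id += 1
        else:
            result.append(label)
    return result


def nearest_cluster(xy, known_xy, known_labels):
    """1-NN classifier: return the cluster ID of the closest known machine."""
    idx = min(range(len(known_xy)), key=lambda i: math.dist(xy, known_xy[i]))
    return known_labels[idx]


# Hypothetical, already standardized travel-time features for seven machines.
FEATURES = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
            (3.0, 3.0)]
# Hypothetical (x, y) coordinates of the same seven machines.
COORDS = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (30, 30)]

raw_labels = dbscan(FEATURES, eps=0.2, min_pts=3)
cluster_ids = give_noise_own_ids(raw_labels)
new_machine_cluster = nearest_cluster((9, 9), COORDS, cluster_ids)
print(raw_labels, cluster_ids, new_machine_cluster)
```

The isolated seventh machine is flagged as noise and then receives its own cluster ID, and a new machine at (9, 9) inherits the cluster of its nearest known neighbour.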
The proposed algorithm is implemented in the Java programming language. The computing environment used in the experiment is as follows: Intel Xeon 6146 (3.20GHz), 16 GB RAM, Windows 10 Enterprise. We limit the maximum allowable number of lifters per group to three to make the computational loads manageable.

Results of the simple experiment
Before testing the proposed approach on an emulator, we analyse its performance with a small-sized testbed. Under the conditions shown in Table 2, an optimal policy for assigning lifters is derived by utilizing the MDP model proposed in Section 3. Table 3 shows two notable cases from the optimal policy. First, since the objective of the MDP model is to minimize the total of the travel and waiting times, the lifter with the smallest value of (travel time) + (weight) × h is selected. For instance, in the case of source machine 4, lifter 2 is selected by comparing this value for each lifter.
(Unlike in the shortest-transportation-time rule, the weight is obtained optimally from the model by considering the given system environment variables and future states.) Second, in the other case, lifter 1 is not chosen; another lifter is selected instead. This is because lifter 1 is frequently called by other source machines (1, 2, 3, and 8). Sending yet another TR to lifter 1 may generate severe congestion, which would make the lifter-assignment policy sub-optimal. To avoid such inefficiency, the proposed approach selects another lifter, thereby distributing the traffic volume and balancing the load. Thus, the lifter-assignment policy derived from the proposed approach can consider not only the immediate travel time but also the subsequent scenarios.

Clustering results
It commonly becomes more difficult to obtain an optimal policy as the problem size increases when using MDP. Therefore, we propose an efficient algorithm to place machines into a manageable number of clusters. Figure 9 shows the clustering results for the actual FAB. The symbol X coloured in black represents the location of the lifter; a circle indicates each source machine's location. Circles of the same colour refer to the same cluster. Forty clusters are formed in the FAB used in the experiment. The clustering outcomes shown in Figure 9 appear generally reasonable: machines located near a lifter are grouped together because of their relatively short travel times, whereas machines that are located far from the lifters and have long travel times are similarly clustered.

Performance evaluation with emulator
To evaluate the performance of the three lifter assignment rules (including the proposed algorithm), we run them on the emulator. Table 4 and Figure 10 show the results from the three assignment rules. The measured times are shortest when the proposed algorithm is applied. It is worth pointing out that the proposed algorithm aims to find an optimal lifter-assignment policy that minimizes the sum of the average travel time and the waiting time, which is why it is superior to the benchmark rules. As it considers both the immediate cost and the cost of the future system states, it provides solid policies that more effectively disperse traffic congestion.
Another notable finding, which may be observed in the graphs in Figure 10, is that the proposed algorithm also yields the shortest travel time while covering the most TRs in the time window we analysed. We thereby infer that the proposed algorithm performs stably compared to the benchmark rules. This may be because it considers future system states and thus avoids undesirable situations in advance.
For all TRs, the proposed algorithm is superior to the benchmark rules.

Sensitivity Analysis
To check the robustness of the proposed algorithm, we conduct a sensitivity analysis to help predict its performance under unusual situations in which the load factors of OHTs and lifters increase as the number of TRs increases. To examine such situations, we increase the arrival rate of TRs to 1.05, 1.1, 1.15, or 1.2 times the rate in the historical data.
The results of the sensitivity analysis are shown in Table 5. Of the six measures, we illustrate as representative the average travel time, which is regarded as the most important measure in practice. Table 5 shows that, if the arrival rate of the next TR increases, the travel time to lifters also increases regardless of the type of assignment rule. Nonetheless, the proposed algorithm has the shortest travel time, even for the highest arrival rates we consider. To confirm this statistically, we conduct a t-test to examine the difference among the results. Based on the t-tests with a p-value <0.05, we conclude that the improvement in performance owing to the proposed algorithm is statistically significant.
Next, we test the performance of the proposed method by using out-of-sample data (specifically, a different dataset). As shown in Table 6, the travel times increase for all test instances when we use a different dataset. This is expected because the dataset used for deriving the optimal policy differs from that used for testing the performance. Nevertheless, there is no statistically significant difference in three of the five test instances. This suggests that, even when the operating environment changes by a small amount, the operating policy derived from the proposed approach continues to perform adequately. Consequently, real-time updating of the lifter assignment policy, which requires tremendous computational resources, is not necessary, and our approach of periodically updating the lifter assignment policy is valid in practice.
This series of experiments, both idealized and realistic, verifies the effectiveness of the proposed approach. In particular, there is a significant improvement over the round-robin rule, which has no mechanism to consider travel time when TRs need transportation to a lifter. Moreover, our approach outperforms the fastest transfer option, which does not take a long-term perspective, showing the importance of accounting for a system's future state when assigning TRs to lifters.

Conclusion
This study addresses the problem of assigning TRs to lifters in semiconductor manufacturing. As the production process becomes increasingly complex, more capacity is required to cope with it. The use of multiple floor lines provides a means to increase capacity without constructing another FAB line; however, it requires meticulous control of OHTs to manage the traffic. In this study, we solve a lifter-assignment problem that allocates TRs to an appropriate lifter via an MDP. Unlike the rules used in actual FAB lines, our model can provide an optimal lifter-assignment policy by considering the system's future states. We also propose an algorithm that uses a well-known clustering method to efficiently reduce the problem size. To the best of our knowledge, neither such a future-oriented formulation nor such an efficient solution approach has been introduced in the previous literature.
The effectiveness of our approach relative to two benchmarking rules is demonstrated with a simulation incorporating data from the operation of an actual FAB. The proposed algorithm reduces the travel time significantly compared with the benchmarking rules. Sensitivity analysis also confirms the robustness of the proposed algorithm. In the context of semiconductor manufacturing, our approach is expected to provide major economic advantages over the status quo if implemented in an actual FAB. For example, our approach should reduce the travel time of all TRs by 2.3%. As a result, approximately 2.5 million USD (2.76 billion KRW) would be saved.
Moreover, our study demonstrates the possibility of autonomous control in assigning lifters. Since our model is based on an optimization scheme that uses the system information as an input variable, it can easily account for changes in FAB lines and automatically update the lifter assignment policies. Considering that the current rules require manual adjustments (varying weights on the basis of the system's status), our approach may reduce the workload of operators in semiconductor manufacturing.

Conflicts of interest/Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Availability of data and material
Not applicable. (The authors cannot disclose data because of security issues.)

Code availability
Not applicable.

Ethical Approval
Not applicable.

Consent to participate
Not applicable.

Example of state-transition of the system with three lifters
Figure 4 Example of step 1 of the solution approach: cluster source machines
Example of step 2 of the solution approach: division into sub-problems
Example of step 4: Obtaining the best lifter by the tournament selection
Aggregation result of source machines
Figure 10 Results from the three assignment rules by time window