Complex Internet of Things (IoT) applications provide services in the form of microservices at resource-constrained edge nodes. As system complexity increases, the interaction and collaboration relationships implied in microservice compositions, such as invocations and dependencies, become increasingly complex. Resource and latency issues arising from the concurrent execution of multiple microservices are therefore a significant challenge when offloading computing tasks. To improve the efficiency of microservice scheduling, a microservice scheduling model based on a queuing network is constructed for the different arrival modes, processing types, workloads, and response times found in large-scale IoT scenarios. The mean response time of a task is then calculated by analyzing the workload ratios of the edge and cloud servers, the arrival rates of the different tasks, and the average number of nodes visited; thus, for a given load, the optimal steady-state solution of the queuing network can be obtained. As the workload increases, more and more microservices must be migrated to cloud servers to achieve the minimum mean response time, and the respective workload ratios of the edge and cloud servers follow directly. Finally, the scheduling policy is analyzed using real workflows with different arrival rates and different numbers of edge and cloud servers in cloud-edge environments. The experiments verify that the proposed optimal solution reduces the mean response time of tasks and effectively meets the latency-minimization requirement of microservice scheduling.
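The offloading trade-off described above can be illustrated numerically. The sketch below is an assumption-laden simplification, not the paper's exact model: each edge and cloud server is treated as an M/M/1 node of an open queuing network, a fraction `p` of the task stream is offloaded to the cloud at a fixed extra network delay per task, and the optimal split is found by a simple grid search (the function names, rates, and delay are all illustrative).

```python
# Illustrative sketch (not the paper's exact model): each edge and
# cloud server is an M/M/1 node of an open queuing network, and a
# fraction p of the task stream is offloaded to the cloud, paying a
# fixed network delay per offloaded task. All rates are assumptions.

def mean_response_time(lam, edge_mu, cloud_mu, p, cloud_delay):
    """Mean response time when a fraction p of tasks is offloaded.

    lam         -- total task arrival rate (tasks/s)
    edge_mu     -- list of service rates of the edge servers
    cloud_mu    -- list of service rates of the cloud servers
    p           -- fraction of load sent to the cloud (0..1)
    cloud_delay -- extra network latency per offloaded task (s)
    """
    # Split the stream between tiers, sharing load evenly per tier.
    edge_lam = lam * (1 - p) / len(edge_mu)
    cloud_lam = lam * p / len(cloud_mu)

    nodes = [(edge_lam, mu) for mu in edge_mu] + \
            [(cloud_lam, mu) for mu in cloud_mu]
    # Stability: every node needs arrival rate < service rate.
    if any(a >= mu for a, mu in nodes):
        return float("inf")  # unstable: queues grow without bound

    # M/M/1 mean queue length L = rho / (1 - rho); by Little's law
    # the network mean response time is (sum of L_i) / lam.
    total_jobs = sum((a / mu) / (1 - a / mu) for a, mu in nodes)
    return total_jobs / lam + p * cloud_delay

def best_offload_fraction(lam, edge_mu, cloud_mu, cloud_delay, steps=1000):
    """Grid search over p for the minimum mean response time."""
    return min((i / steps for i in range(steps + 1)),
               key=lambda p: mean_response_time(lam, edge_mu, cloud_mu,
                                                p, cloud_delay))
```

Under these assumptions, raising the arrival rate `lam` pushes the optimal offload fraction upward, mirroring the observation above that more microservices must migrate to the cloud to keep the mean response time minimal as the workload grows.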