In the context of global warming, the frequency and intensity of extreme weather and climate events are increasing. However, the change that people feel most directly is the day-to-day change in temperature. Extreme temperature changes between neighboring days (ETCNs) carry substantial disease risks and socioeconomic impacts. Evaluations of how well global circulation models (GCMs) simulate ETCN events remain lacking for China. This study quantitatively evaluates the performance of 36 GCMs and the multi-model ensemble (MME) of the Coupled Model Intercomparison Project Phase 6 (CMIP6) in simulating extreme cooling (EC) and extreme warming (EW) events between two consecutive days, as defined by relative thresholds. Moreover, we select the optimal models for different regions of China at the seasonal and annual scales, providing theoretical support for projecting the frequency of ETCN events and for improving their simulation. The results show that from 1981 to 2013, the annual average numbers of EW and EC events in China increased, although the trends were not statistically significant, and EW events were more frequent than EC events. EW events occurred mostly in spring, while EC events occurred mostly in autumn. Additionally, the performance of the CMIP6 models differs considerably between EC and EW events. The simulations of EC events are generally more reliable than those of EW events, and the models also capture the annual cycle of EC events well. Furthermore, the CMIP6 models generally overestimate the frequencies of EW and EC events but underestimate the frequency of EC events in autumn. The CMIP6 models perform poorly in simulating the trend and interannual variability of ETCN events and reproduce only the decreasing trend in autumn. Finally, according to the overall ranking of the CMIP6 models, GFDL-ESM4 and EC-Earth3-Veg-LR achieve the best performance in simulating EW and EC events, respectively.
The CMIP6 MME effectively improved on the individual models only in simulating winter EC events in the WNW region. For the trend and interannual variability of ETCN events, individual models performed better than the CMIP6 ensemble.
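
To make the event definition concrete, the following is a minimal sketch of how EC and EW events could be flagged from a daily temperature series using relative thresholds. The abstract does not state the threshold values, so the 5th and 95th percentiles of the day-to-day temperature change used here are assumptions, as are the function name `etcn_events` and its parameters; this is an illustration, not the study's actual procedure.

```python
import numpy as np

def etcn_events(tmean, lower_pct=5.0, upper_pct=95.0):
    """Flag extreme cooling (EC) and extreme warming (EW) day pairs.

    tmean     : 1-D sequence of daily temperatures at one location.
    lower_pct : ASSUMED percentile of the day-to-day change below which
                a change counts as EC (not taken from the source).
    upper_pct : ASSUMED percentile above which a change counts as EW.
    """
    tmean = np.asarray(tmean, dtype=float)
    # dt[i] is the change from day i to day i + 1.
    dt = np.diff(tmean)
    # Relative thresholds: derived from the series' own distribution.
    ec_threshold = np.percentile(dt, lower_pct)
    ew_threshold = np.percentile(dt, upper_pct)
    ec = dt <= ec_threshold  # strong drops between neighboring days
    ew = dt >= ew_threshold  # strong rises between neighboring days
    return dt, ec, ew

# Synthetic series with one sharp warming and one sharp cooling step.
series = [10, 10, 10, 10, 20, 20, 20, 5, 5, 5, 5]
dt, ec, ew = etcn_events(series)
print("EC events:", int(ec.sum()), "EW events:", int(ew.sum()))
```

In practice the thresholds would be computed over a reference period and per grid cell or station, and the same function could be applied to both observations and each model's output so that event frequencies are compared on a like-for-like basis.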