Accurate sensing and understanding of gestures can improve the quality of human-computer interaction, and has great theoretical significance and application potential in fields such as smart homes, assisted medical care, and virtual reality. Device-free wireless gesture recognition based on WiFi Channel State Information (CSI) requires no wearable sensors and offers several advantages: it works in non-line-of-sight scenarios and in darkness, incurs low cost, and preserves personal privacy. Although most current gesture recognition approaches based on WiFi CSI achieve good performance, they struggle to adapt to new domains. Therefore, this paper proposes ML-WiGR, an approach for device-free gesture recognition in cross-domain applications. ML-WiGR applies convolutional neural networks (CNNs) and long short-term memory (LSTM) networks as the basic gesture recognition model to extract spatial and temporal features. Combined with a meta-learning training mechanism, the approach dynamically adjusts the learning rate and meta learning rate during training, and optimizes the initial parameters of the basic model so that only a few samples and several iterations are needed to adapt to a new domain. In the experiments, we validate the approach under a variety of scenarios. The results show that ML-WiGR achieves performance comparable to existing approaches in cross-domain settings while requiring only a small number of training samples.
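The meta-learning mechanism described above can be sketched as a MAML-style inner/outer loop: per-task adaptation with a learning rate, then a meta-update of the shared initial parameters with a meta learning rate, both decayed over training. The sketch below is a minimal illustration under stated assumptions, substituting a linear model for the paper's CNN-LSTM recognizer; the decay schedule and all function names are hypothetical, not the authors' implementation.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """MSE loss and gradient for a linear stand-in model y = X @ w
    (assumption: used here in place of the CNN-LSTM base model)."""
    err = X @ w - y
    return 0.5 * np.mean(err ** 2), X.T @ err / len(y)

def maml_step(w, tasks, alpha, beta):
    """One meta-update: inner adaptation per task (rate alpha), then a
    first-order outer update of the shared initialization (rate beta)."""
    meta_grad = np.zeros_like(w)
    for X_s, y_s, X_q, y_q in tasks:              # support/query split per task
        _, g = loss_and_grad(w, X_s, y_s)         # inner gradient on support set
        w_adapted = w - alpha * g                 # task-specific fast weights
        _, gq = loss_and_grad(w_adapted, X_q, y_q)  # query gradient at adapted weights
        meta_grad += gq                           # first-order MAML approximation
    return w - beta * meta_grad / len(tasks)

def make_task(rng, w_center, n=20, d=3):
    """A synthetic 'domain': a slightly perturbed regression target."""
    w_task = w_center + 0.1 * rng.normal(size=d)
    X = rng.normal(size=(2 * n, d))
    y = X @ w_task + 0.01 * rng.normal(size=2 * n)
    return X[:n], y[:n], X[n:], y[n:]

rng = np.random.default_rng(0)
w_center = np.array([1.0, -2.0, 0.5])
w = np.zeros(3)
for step in range(200):
    # hypothetical adaptive schedule: decay both rates as training progresses
    decay = 1.0 / (1.0 + 0.01 * step)
    tasks = [make_task(rng, w_center) for _ in range(4)]
    w = maml_step(w, tasks, alpha=0.05 * decay, beta=0.1 * decay)

# After meta-training, a few gradient steps on a new task's support set
# suffice to adapt, mirroring few-shot adaptation to a new domain.
X_s, y_s, X_q, y_q = make_task(rng, w_center)
for _ in range(5):
    _, g = loss_and_grad(w, X_s, y_s)
    w = w - 0.05 * g
final_loss, _ = loss_and_grad(w, X_q, y_q)
print(final_loss)
```

The design point the sketch makes is that the outer loop optimizes only the shared initialization, while per-task weights are discarded; the full method applies the same loop to CSI gesture samples with the CNN-LSTM model.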