A growing number of studies have explored the use of electroencephalogram (EEG) signals for identity recognition, because EEG signals are not easily stolen. Most existing studies on EEG person identification examine brain signals in only a single state and require specific, repetitive sensory stimuli. In reality, however, human states are diverse and change rapidly, which limits the use of these methods in realistic conditions. Attention mechanisms have shown an excellent ability to model temporal signals. In this paper, we propose a transformer-based approach for the EEG person identification task that extracts features in the temporal and spatial domains using a self-attention mechanism. We conduct an extensive study to evaluate the generalization ability of the proposed method across different states. We compare our method with the most advanced EEG biometrics techniques, and the results show that it achieves state-of-the-art performance. Notably, our method does not require any manual feature extraction.
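
To make the idea of attending over the temporal and spatial domains concrete, the following is a minimal sketch of single-head scaled dot-product self-attention applied to one EEG segment along each axis. It is an illustration under stated assumptions, not the architecture proposed in this paper: the array sizes, the random projection weights, the mean pooling, and the names `self_attention` and `init_projections` are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    tokens: array of shape (n_tokens, d_in).
    Returns the attended output (n_tokens, d_model) and the
    attention weights (n_tokens, n_tokens).
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 electrodes, 32 time samples, 16-dimensional embedding.
n_channels, n_samples, d_model = 8, 32, 16
eeg = rng.standard_normal((n_channels, n_samples))  # synthetic stand-in for raw EEG

def init_projections(d_in):
    # Random query/key/value projections; a trained model would learn these.
    return tuple(rng.standard_normal((d_in, d_model)) / np.sqrt(d_in) for _ in range(3))

# Temporal attention: each time sample is a token described by all channels.
temporal_out, temporal_w = self_attention(eeg.T, *init_projections(n_channels))

# Spatial attention: each channel is a token described by its time course.
spatial_out, spatial_w = self_attention(eeg, *init_projections(n_samples))

# Pool over tokens and concatenate into one feature vector for a classifier.
features = np.concatenate([temporal_out.mean(axis=0), spatial_out.mean(axis=0)])
```

In this sketch the raw signal is fed directly to the attention layers, which mirrors the point that no features are extracted manually. A full transformer would add positional encoding, multiple heads, feed-forward layers, and a classification head trained on subject labels.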