The traditional imagery task for brain-computer interfaces (BCIs) is motor imagery (MI), in which subjects are instructed to imagine moving a certain part of their body; this kind of imagery task is difficult for many subjects to perform. In this study, we implemented a BCI using a less studied yet more easily performed type of mental imagery, visual imagery (VI), in which subjects are instructed to visualize a picture in their mind. Eighteen subjects were recruited and, in each trial, instructed to observe one of two visually cued pictures (one static, the other moving) and then imagine the cued picture, while electroencephalography (EEG) signals were recorded. Features were extracted using the Hilbert-Huang transform (HHT), auto-regressive (AR) models, and a combination of empirical mode decomposition (EMD) and AR modeling. A support vector machine (SVM) was used to classify the two kinds of VI tasks. The average, highest, and lowest classification accuracies with HHT features were 68.14 ± 3.06%, 78.33%, and 53.3%, respectively; the corresponding values for the AR model were 56.29 ± 2.73%, 71.67%, and 30%, and those for the combined EMD and AR model were 78.40 ± 2.07%, 87%, and 48.33%. The results indicate that distinct VI tasks are separable from EEG, and that the combination of EMD and an AR model extracted VI features better than either HHT or an AR model alone. Our work may provide ideas for the construction of a new online VI-BCI.
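To make the best-performing pipeline concrete, the following is a minimal sketch of EMD + AR feature extraction followed by SVM classification. It is not the authors' implementation: the abstract does not specify the AR order, the number of IMFs retained, channel handling, or preprocessing, so those choices (single-channel trials, 4 IMFs, order-6 AR via the Yule-Walker method) are illustrative assumptions, and the libraries used (PyEMD's `EMD`, statsmodels' `yule_walker`, scikit-learn's `SVC`) are stand-ins for whatever tooling the study actually used.

```python
import numpy as np
from PyEMD import EMD                                   # pip install EMD-signal
from statsmodels.regression.linear_model import yule_walker
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def emd_ar_features(trial, n_imfs=4, order=6):
    """Decompose one EEG channel into IMFs with EMD, fit an AR model to
    each of the first n_imfs IMFs, and concatenate the AR coefficients
    into a single feature vector (n_imfs and order are assumptions)."""
    imfs = EMD().emd(trial)
    feats = []
    for i in range(n_imfs):
        if i < len(imfs):
            rho, _ = yule_walker(imfs[i], order=order)  # AR coefficients
            feats.append(rho)
        else:
            feats.append(np.zeros(order))  # pad if EMD yields fewer IMFs
    return np.concatenate(feats)

# Placeholder data standing in for real VI-EEG trials:
# trials has shape (n_trials, n_samples); labels are 0 = static-picture
# imagery, 1 = moving-picture imagery.
rng = np.random.default_rng(0)
trials = rng.standard_normal((40, 500))
labels = rng.integers(0, 2, size=40)

X = np.vstack([emd_ar_features(t) for t in trials])
clf = SVC(kernel="rbf")  # kernel choice is also an assumption
print("5-fold CV accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```

On real multi-channel EEG, the per-channel feature vectors would typically be concatenated across channels before classification, and accuracy would be estimated with cross-validation within each subject, as the per-subject accuracy range reported above suggests.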