In today’s digital world, a speaker verification (SV) system is essential for authenticating a client’s credentials. To build an effective SV system, the MGWOVSW-CAES-GMM system was previously proposed. In that system, the Modified Grey Wolf Optimization (MGWO) technique optimizes the variable sliding window size, the FMPM features, and the training variables. The optimized features are watermarked and encrypted using a chaotic-based Advanced Encryption Standard (CAES) and then forwarded to the recipient, who performs the decryption and de-watermarking processes. Finally, the decrypted features are classified using a Gaussian Mixture Model (GMM) classifier. However, MGWO has a poor convergence rate and searches ineffectively. Hence, this article proposes an EEHOVSW-CAES-GMM system in which the Enhanced Elephant Herding Optimization (EEHO) algorithm is applied instead of MGWO. In addition, the computational complexity of the GMM classifier is high, and its efficiency drops as the number of features increases. For this reason, a Deep Neural Network (DNN) classifier is employed instead of the GMM to recognize the decrypted features and authenticate the speaker’s identity. Furthermore, the parameters of the DNN topology are optimized in two systems, MGWOVSW-CAES-DNN and EEHOVSW-CAES-DNN, to reduce the computational complexity and increase the classification accuracy when a larger number of features is used. With these classifiers, the speaker’s identity is verified and attacks during transmission are prevented with a high level of security.
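As a rough illustration of the final classification stage, the sketch below scores decrypted feature frames against a GMM and makes an accept/reject decision. It is not the system described above: the log-likelihood-ratio rule, the background model, the diagonal covariances, the threshold, and all names and toy values (`speaker_gmm`, `background_gmm`, `verify`) are assumptions introduced for the example only.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood; gmm is a list of (weight, mean, var)."""
    total = 0.0
    for x in frames:
        comp = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(comp)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total / len(frames)

def verify(frames, speaker_gmm, background_gmm, threshold=0.0):
    """Accept the claimed identity if the log-likelihood ratio exceeds threshold."""
    score = (gmm_log_likelihood(frames, speaker_gmm)
             - gmm_log_likelihood(frames, background_gmm))
    return score > threshold, score

# Hypothetical toy models and 2-D feature frames (assumed values, not real data).
speaker_gmm = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 2.0], [1.0, 1.0])]
background_gmm = [(1.0, [5.0, 5.0], [4.0, 4.0])]
genuine_frames = [[0.1, -0.2], [1.9, 2.1]]
impostor_frames = [[5.0, 5.1], [4.8, 5.2]]

genuine_accepted, genuine_score = verify(genuine_frames, speaker_gmm, background_gmm)
impostor_accepted, impostor_score = verify(impostor_frames, speaker_gmm, background_gmm)
```

In this toy setup the genuine frames lie near the claimed speaker's mixture components and are accepted, while the impostor frames lie near the background model and are rejected. Each frame is evaluated against every mixture component in every feature dimension, which is one way to see why GMM scoring grows costly as the number of features increases.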