Developing conversational agents that can generate appropriate responses for natural, meaningful dialogue between humans and machines is a difficult task in artificial intelligence. A key factor in producing accurate responses is the ability to fully and effectively exploit the context of the current utterance. However, many previous studies have neglected the relationships between utterances in a conversation, which limits their effectiveness. This paper addresses that limitation by introducing a novel method for modeling the contextual information of the current utterance during response generation. To this end, we propose an approach that combines a Deep Seq2Seq model with reinforcement learning: the Deep Seq2Seq model generates responses conditioned on the left context of the current utterance, while reinforcement learning evaluates the entire generated conversation against the right context. We also use a pre-trained word embedding model both to build the reward functions for the reinforcement-learning component and to represent the words in generated responses. Experimental results show that the proposed model yields significant improvements in BLEU score over a baseline model.
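One way such an embedding-based reward could be realized is as the similarity between the generated response and the right context of the current utterance. The sketch below is purely illustrative and is not the paper's actual reward function: the embedding table, function names, and averaging scheme are all assumptions, and a real system would load embeddings from a pre-trained model such as word2vec or GloVe.

```python
import numpy as np

# Toy embedding table; illustrative only. In practice these vectors would
# come from a pre-trained word embedding model (an assumption, not the
# paper's stated setup).
EMBEDDINGS = {
    "good": np.array([0.9, 0.1, 0.0]),
    "morning": np.array([0.8, 0.2, 0.1]),
    "hello": np.array([0.85, 0.15, 0.05]),
    "there": np.array([0.1, 0.9, 0.2]),
}

def sentence_vector(tokens):
    """Average the embeddings of known tokens; zero vector if none are known."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return np.zeros(3)
    return np.mean(vecs, axis=0)

def coherence_reward(generated_tokens, right_context_tokens):
    """Cosine similarity between the generated response and the right
    context, usable as a scalar reward for a policy-gradient update."""
    g = sentence_vector(generated_tokens)
    r = sentence_vector(right_context_tokens)
    denom = np.linalg.norm(g) * np.linalg.norm(r)
    if denom == 0.0:
        return 0.0
    return float(np.dot(g, r) / denom)

# A response whose embedding is close to the right context earns a
# higher reward, pushing the generator toward contextually coherent turns.
reward = coherence_reward(["good", "morning"], ["hello", "there"])
```

In a full pipeline this scalar would be fed back to the Seq2Seq generator through a policy-gradient method such as REINFORCE, which matches the general Seq2Seq-plus-RL setup the abstract describes.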