The problem of controlling an agent that follows a leader moving along a complicated route unknown to the follower is relevant both from a methodological point of view and a practical one. In the first case, it is a rich testbed for the investigation and further development of reinforcement learning (RL) methods. In the second, a solution to the problem provides a means of route control for a follower agent that possesses neither a detailed map of the area nor the ability to navigate using external systems such as the Global Positioning System (GPS). The agent must remain on the route at a given distance from the leader. We consider the problem in the following statement: the agent follows the leader while avoiding dynamic obstacles within a defined narrow corridor around a given or dynamically generated route. To train the reinforcement-learning model, we developed a two-dimensional (2D) environment simulator in which Light Detection and Ranging (LIDAR) data from the last few steps serve as the input features and the agent's linear and angular velocities serve as the model's outputs. The paper presents the results of a study of various experimental configurations: the number and types of obstacles, the variability of movement routes, and the features describing the current state of the environment. The results demonstrate the applicability of the developed model to following a leader along previously unknown routes in a three-dimensional (3D) environment built with the Gazebo simulator. Code is available at: https://github.com/sag111/continuous-grid-arctic.
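As a minimal sketch of the observation/action interface described above (all names, ray counts, and history lengths here are hypothetical illustrations, not the paper's actual configuration), the stacking of recent LIDAR scans into a feature vector for a policy that outputs linear and angular velocity could look like:

```python
from collections import deque

import numpy as np

N_RAYS = 40    # hypothetical number of LIDAR rays per scan
HISTORY = 4    # hypothetical number of past steps kept as features


class LidarHistoryObservation:
    """Stacks the last few LIDAR scans into a single feature vector."""

    def __init__(self, n_rays: int = N_RAYS, history: int = HISTORY):
        # Start with zero-filled scans so the vector has a fixed size from step one.
        self.scans = deque(
            [np.zeros(n_rays, dtype=np.float32)] * history, maxlen=history
        )

    def update(self, scan: np.ndarray) -> np.ndarray:
        """Append the newest scan, drop the oldest, and return stacked features."""
        self.scans.append(scan.astype(np.float32))
        return np.concatenate(self.scans)


obs = LidarHistoryObservation()
features = obs.update(np.random.rand(N_RAYS))  # shape: (HISTORY * N_RAYS,)

# A trained policy would map `features` to a 2-D continuous action
# (linear velocity, angular velocity); a placeholder stands in for it here.
action = np.zeros(2, dtype=np.float32)
```

Keeping a short history of scans, rather than only the latest one, lets the policy infer the motion of dynamic obstacles and of the leader from consecutive range readings.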