One of the challenges for robotics in modern manufacturing is the assembly task. The manufacturing industry requires a variety of insertion tasks, from peg-in-hole insertion to electronic parts assembly. Current robotic solutions to this problem often rely on conventional methods, in which an industrial robot under hybrid force-position control executes a preprogrammed trajectory, such as a spiral path. However, electronic parts require more sophisticated techniques because of their complex geometry and susceptibility to damage. We propose a vision-driven method based on reinforcement learning (RL) for assembling electronic parts. In our approach, the input images for the RL agent are acquired from two cameras mounted on the robot's end-effector. We also analyze the influence of the observation modalities on the RL agent's performance metrics, such as insertion success rate and average assembly time. Results show that visual information acquired from the double-camera vision system significantly improves the robustness of the RL method to position disturbances in insertion tasks. The proposed method outperforms conventional methods, such as random search, spiral search, and straight-down insertion, in terms of success rate and robustness to disturbances of the robot's initial position. Moreover, our approach is more robust to disturbances than methods that acquire images from an external camera or from a single camera mounted on the robot's end-effector.
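For illustration, the preprogrammed spiral path used by conventional search methods can be sketched as a sequence of planar waypoints along an Archimedean spiral. This is a minimal sketch under stated assumptions, not the baseline implementation evaluated in this work: the function name `spiral_waypoints` and the parameters `pitch`, `step`, and `max_radius` (all assumed to be in meters) are hypothetical.

```python
import math


def spiral_waypoints(pitch, step, max_radius):
    """Return (x, y) offsets along an Archimedean spiral r = pitch * theta / (2*pi).

    pitch      -- radial distance between successive turns (hypothetical parameter)
    step       -- approximate spacing between consecutive waypoints
    max_radius -- search radius at which the spiral stops
    """
    if pitch <= 0 or step <= 0 or max_radius < 0:
        raise ValueError("pitch and step must be positive, max_radius non-negative")

    b = pitch / (2.0 * math.pi)  # radial growth per radian
    points = [(0.0, 0.0)]        # start at the nominal insertion position
    theta = 0.0
    while True:
        r = b * theta
        # Advance the angle so the arc length is roughly `step` (ds ~ r * dtheta).
        # Near the center r is tiny, so the divisor is clamped to `step`.
        theta += step / max(r, step)
        r = b * theta
        if r > max_radius:
            break
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


if __name__ == "__main__":
    # Assumed example values: 1 mm pitch, 0.5 mm spacing, 5 mm search radius.
    waypoints = spiral_waypoints(pitch=0.001, step=0.0005, max_radius=0.005)
    print(len(waypoints), "waypoints; last:", waypoints[-1])
```

Under hybrid force-position control, such offsets would typically serve as position targets in the plane perpendicular to the insertion axis while the contact force along that axis is regulated. The sketch covers only the geometric part.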