This paper presents a vision-based in-hand manipulation method. Picking up an object from a pile is an important task for a robotic hand; in particular, for objects of various sizes, weights, and degrees of hardness, slippage between the object and the hand surface must be detected. At the same time, dexterously changing the position and orientation of the grasped object within the hand workspace is necessary for sequential pick-and-place tasks. To achieve such sequential motion, an algorithm for picking up and translating a desired object is proposed. A two-jaw parallel gripper with a conveyor belt on each gripping surface is used, which is capable of both translating and rotating a grasped object. To observe slippage, the actuation of the belts is combined with detection of the manipulation status from images captured by a stereo camera attached to the hand. With this combination, both picking up and translating various objects of unknown size and hardness are achieved without machine learning models (which require large amounts of training data) or any form of tactile sensing. The validity of the proposed method is verified through experiments in which various objects are picked up and translated to a given position.