Tool wear and faults degrade the quality of machined workpieces and disrupt the continuity of manufacturing. Accurate prediction of a tool's remaining useful life (RUL) is therefore important for guaranteeing processing quality and improving the productivity of automated systems. At present, most methods for tool RUL prediction are trained on historical fault data. However, when studying new types of tools or machining high-value parts, fault datasets are difficult to acquire, which makes RUL prediction under limited fault data a challenge. To overcome this shortcoming of existing prediction methods, this paper presents a deep transfer reinforcement learning (DTRL) network based on a long short-term memory (LSTM) network. Local features are extracted from consecutive sensor data to track the tool state, and the size of the trained network can be adjusted dynamically by controlling the time-sequence length. Within the DTRL network, an LSTM network is employed to construct the value function approximation, so that temporal information is processed smoothly and long-term dependencies are mined. On this basis, novel strategies for Q-function update and transfer are presented to transfer the deep reinforcement learning (DRL) network trained on historical fault data to a new tool for RUL prediction. Finally, tool wear experiments are performed to validate the effectiveness of the DTRL model. The prediction results demonstrate that the proposed method achieves high accuracy and generalizes well to similar tools and cutting conditions.
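
As a rough illustration of the feature-extraction step described above, the following sketch splits a one-dimensional sensor signal into windows, computes local features per window, and groups consecutive feature vectors into sequences whose length could set the input length of a recurrent network. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `local_features` and `make_sequences`, the parameters `window_size` and `seq_len`, and the choice of mean and RMS as features are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Feature = Tuple[float, float]


def local_features(window: Sequence[float]) -> Feature:
    """Return (mean, RMS) of one window. The feature choice is an assumption."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    return mean, rms


def make_sequences(
    signal: Sequence[float], window_size: int, seq_len: int
) -> List[List[Feature]]:
    """Build feature sequences from a raw signal.

    The signal is cut into non-overlapping windows of `window_size` samples
    (a trailing partial window is dropped), each window is reduced to a
    feature vector, and every run of `seq_len` consecutive feature vectors
    becomes one sequence (stride of one window).
    """
    feats = [
        local_features(signal[i:i + window_size])
        for i in range(0, len(signal) - window_size + 1, window_size)
    ]
    # Fewer feature vectors than seq_len yields an empty list.
    return [feats[i:i + seq_len] for i in range(len(feats) - seq_len + 1)]
```

In this sketch, changing `seq_len` changes how many time steps each training sequence carries, which is one plausible reading of adjusting the network size through the time-sequence length.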