Pipeline parallelism is a distributed deep neural network training method suited to tasks that consume large amounts of memory. However, it incurs considerable overhead because of the dependencies between devices when forward and backward steps are executed across multiple devices. A method that removes the forward-step dependency through an all-to-all approach has been proposed for compute-intensive models; however, it incurs large overhead when training with many devices and is inefficient in terms of weight memory consumption. We therefore propose a pipeline parallelism method that reduces network communication through a self-generation concept and simultaneously reduces overhead by minimizing the weight memory used for acceleration. In a Darknet53 training throughput experiment on six devices, the proposed method improved throughput by approximately 63.7% over the baseline owing to the reduced overhead and communication costs, while consuming approximately 17.0% less memory.