Neural machine translation (NMT) systems achieve strong performance in data-rich settings, but their performance declines when training examples are limited. In this paper we benchmark NMT between English and four African Bantu low-resource languages (LRLs): Luganda, Swahili, Shona, and Tsonga (LSST). Our aim was to evaluate the current state of NMT for LRLs, with a particular focus on Bantu languages. Because these languages are among the most morphologically rich and suffer from the out-of-vocabulary (OOV) problem, we proposed an NMT model based on multi-head self-attention. The model works alongside pre-trained BPE and Multi-BPE embeddings to develop a state-of-the-art translation system for low-resource, morphologically rich Bantu languages for which online translations are scarce. We also addressed quality issues in the publicly available corpora by refining the data for further use. We evaluated system performance using the BLEU score. Our experiments yielded strong, first-ever reported LRL translation BLEU scores of 62, 37, 22, and 20 for English-Tsonga, English-Swahili, English-Shona, and English-Luganda, respectively.
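Since BPE segmentation underpins the OOV claim above, the following is a minimal, self-contained sketch of the classic BPE merge-learning loop in the style of Sennrich et al.; the toy Luganda-flavoured word list and the merge count are illustrative assumptions, not the paper's actual pipeline, which relies on pre-trained BPE and Multi-BPE embeddings.

```python
import re
from collections import Counter

def get_pair_stats(vocab):
    """Count frequencies of adjacent symbol pairs across the vocabulary."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Merge every occurrence of `pair` (matched as whole symbols) in the vocabulary."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

# Hypothetical toy corpus: each word is a sequence of space-separated
# characters ending in an end-of-word marker.
vocab = {
    "o m u n t u </w>": 5,   # Luganda "omuntu" (person)
    "a b a n t u </w>": 4,   # Luganda "abantu" (people)
    "o b u n t u </w>": 2,
}

for _ in range(6):  # learn a few merge operations
    stats = get_pair_stats(vocab)
    if not stats:
        break
    best = max(stats, key=stats.get)
    vocab = merge_pair(best, vocab)
    print("merged:", best)

print(vocab)
```

Shared subwords such as "ntu" emerge from the merges, so morphological variants unseen at training time can still be segmented into known units rather than mapped to an unknown token.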
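For concreteness, here is a minimal NumPy sketch of multi-head scaled dot-product self-attention, the building block the proposed model relies on. All sizes (6 tokens, model width 32, 4 heads) and the random projection matrices are hypothetical; in a real system these projections are learned inside a full Transformer encoder-decoder.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """Scaled dot-product self-attention split across `num_heads` heads.

    X: (seq_len, d_model); Wq/Wk/Wv/Wo: (d_model, d_model).
    """
    seq_len, d_model = X.shape
    d_head = d_model // num_heads

    def split(M):  # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return M.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(X @ Wq), split(X @ Wk), split(X @ Wv)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    heads = softmax(scores) @ V                          # (heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

# Hypothetical sizes: 6 subword tokens, model width 32, 4 heads.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 32))
Wq, Wk, Wv, Wo = (rng.normal(size=(32, 32)) for _ in range(4))
print(multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads=4).shape)  # (6, 32)
```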
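BLEU evaluation of the kind reported above can be reproduced with a standard scorer such as sacrebleu; the hypothesis and reference strings below are placeholders, not outputs from the systems described in this paper.

```python
import sacrebleu  # pip install sacrebleu

hyps = ["the people arrived in the village"]            # system output (placeholder)
refs = [["the people arrived in the village at night"]]  # one reference per hypothesis
print(sacrebleu.corpus_bleu(hyps, refs).score)
```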