Weakly supervised semantic segmentation (WSSS) using only image-level labels has become an active research topic in computer vision in recent years because of its low annotation cost. Existing methods rely on Class Activation Maps (CAMs) from classification models to locate target regions. However, classifiers tend to focus on the most discriminative regions of the input image and assign higher weights to them, which leaves the target regions in the CAMs incomplete. To address this issue, we design a Siamese feature aggregation network, named SFA-Net, which introduces contextual information to activate more complete target regions while suppressing similar adjacent background regions. Specifically, the context-aware module in SFA-Net consists of a multi-scale adaptive aggregation sub-module and a contextual linkage sub-module, which uncover potential target features and identify global target areas. A background activation suppression loss is designed to minimize false activations in background regions by measuring the similarity between the target object and the background regions at the boundary. Extensive experiments on the challenging PASCAL VOC 2012 and COCO 2014 datasets show that SFA-Net outperforms other state-of-the-art methods.
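
The incompleteness problem described above can be illustrated with a minimal NumPy sketch of the standard CAM computation, in which the map for a class is a weighted sum of the last convolutional feature maps, using that class's classifier weights. This is not SFA-Net itself. The function name `class_activation_map`, the toy feature values, the ReLU-and-normalize post-processing, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Hypothetical helper: standard CAM for one class.

    features:   (K, H, W) feature maps from the last convolutional layer
    fc_weights: (C, K) weights of the classification layer
    """
    # CAM_c(x, y) = sum_k w[c, k] * f[k, x, y]
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    # Common post-processing (assumed here): keep positive evidence, scale to [0, 1]
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy input: one object covering two pixels, one of them far more discriminative.
features = np.zeros((2, 4, 4))
features[0, 1, 1] = 4.0  # highly discriminative part of the object
features[0, 1, 2] = 1.0  # less discriminative part of the same object
features[1, 3, 3] = 2.0  # channel that responds to a different class
fc_weights = np.array([[1.0, 0.0],
                       [0.0, 1.0]])

cam = class_activation_map(features, fc_weights, class_idx=0)
mask = cam >= 0.5  # assumed threshold for turning the CAM into a region
```

In this toy case the thresholded mask keeps only the most discriminative pixel and drops the weaker part of the same object, which is the kind of incomplete activation that contextual information is meant to recover.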