Underwater object detection (UOD) is crucial for marine ecosystem protection, ocean resource exploration, and maritime security. While generic object detection methods have succeeded in atmospheric conditions, they fall short in underwater environments. The complex underwater environment and the physical properties of light make it highly challenging to distinguish ambiguous objects from the background. In addition, small aquatic creatures often congregate in large groups in the wild, which further complicates the UOD task. In this study, we tackle these challenges and propose Aqua-DETR, an end-to-end framework tailored for underwater object detection. Specifically, we design an Align-Split Network to reinforce multi-scale feature interaction and fusion for small object identification. We also present a distinction enhancement module based on different attention mechanisms to improve the identification of ambiguous objects. Experimental results on four challenging datasets demonstrate the superiority of our method over most existing state-of-the-art methods in the UOD task. The code will be available at https://github.com/Calendula597/Aqua-DETR.