This study presents a data processing pipeline that converts heterogeneous labeled audio recordings into a uniform labeled dataset suitable as input to machine learning algorithms. The focus is on marine audio classification, and an illustrative example from this domain serves as experimental validation of the proposed approach.