Sound classification assisted by neural networks has led to highly accurate results. Many different applications use this kind of sound classification, such as controlling and monitoring the number of vehicles on roads or identifying different types of animals in natural environments. While traditional acoustic processing applications have been developed for high-throughput computing platforms equipped with expensive multichannel audio interfaces, the IoT paradigm demands more flexible and energy-efficient systems. Although software-based platforms devoted to implementing general neural networks exist, they are not optimized for sound classification, which wastes energy and computational resources. Here, we have used FPGAs to develop an ad hoc system in which only the hardware required by our application is synthesized, obtaining circuits that are faster and cheaper in terms of energy. The results show that our development accelerates the classification by a factor of 35 with respect to a software-based implementation on a Raspberry Pi.