Around-The-Pipe Audio Based Water Tracker for Home Residents

Water is one of the most necessary resources in the average household, yet residents cannot quantitatively track their use of it on a regular basis. No existing method does this without interfering with the pipeline and the flow of water itself. This research presents a novel approach that uses audio and supervised machine learning models to measure the amount of water going through a household pipe at any given time. First, different amounts of water were run through residential pipes with a microphone mounted around the pipe. The output of the microphone is converted into a series of sound levels measured in decibels, and a micro-computing device connected to the microphone outputs this string of decibel values. The microphone was tested on seven different pipes in seven different homes to decrease data bias. The amount of water and the received audio were documented in a CSV file with over 100 data points. The data was imported into a Jupyter Notebook, the features (water amount, range of decibels, size of pipe) were scaled, and the dataset was split into training and testing subsets at a 70/30 ratio in order to assess each model's accuracy. Five models were used: Neural Networks, K-Nearest Neighbors, Random Forest, Linear Regression, and Logistic Regression. After all the models were trained and their hyperparameters tuned, they were compared on the testing data. Neural Networks ended up with the highest classification accuracy at 94.6%, as reported by ROC methods.


Introduction
Household water tracking has made great advances in recent years, but no conventional method currently exists that avoids high expenses, insurance, and breaking into a water pipe. These factors make it less likely that homeowners will track their regular water usage, which causes water to be wasted and puts houses at risk. To illustrate, over 900 billion gallons of water are wasted through leaks per year and 6,300 gallons of water per household are wasted per day, which causes 1 trillion dollars in water loss per year. The overarching question is: how does the sound created by flowing water correspond to the amount of water flowing? This work presents a validated artificial intelligence-based device that can track water using audio from around the water pipe. A device strapped around a water pipe, with a microcomputing chip connected to a microphone, will measure the decibel range of the sound created when water flows through and will use supervised machine learning models to estimate the quantity of water flowing. I hypothesize that if devices exist that use machine learning models to measure water flow through household pipes, then homeowners will be able to save money on their water bills and will become more concerned with their environmental impact.
To do this, a prototype of the physical tool must be built and programmed to measure the volume of water flowing through the pipe. The data will be recorded in a CSV file. From there, four machine learning models will be run on the training data, and the best model will be chosen based on its performance on the testing data and implemented to predict the quantity of water flowing based on the sound.

Related Work
Household owners cannot currently track their water usage because every existing device either requires a complex network of sensors or cannot be installed without interfering with the internals of a water pipe. A correlation between PSI (pound-force per square inch) readings and water pressure exists, but there is no established correlation to the actual amount of water. The problem this research addresses is how to track water usage in houses cheaply and easily using a single device. This research is important because water is one of the few household necessities whose usage is not easily traceable by homeowners, so a device built for this purpose will allow homeowners to decrease the money they spend on water and monitor the resources they consume.

Materials
The materials include both the hardware tools and the software tools used to complete the research.
The first and most important tool is a microcomputing chip (MCC), which records all data and trains and tests the ML models while connected to a microphone. For prototype purposes, this is a Raspberry Pi. A microphone must be connected to the MCC and is used to record the decibel range of water flowing through the pipe. A 7" LED screen relays the decibel range readings during the data collection phase. All of this is soldered together and mounted on an adjustable Velcro strap. The microphone rests safely on the strap and hangs 15 cm from the top of the pipe for optimal sound.
The software tools are used to code and design the project. All work was done in Visual Studio Code (VSCode), and all machine learning models were tested there. The main library used was Scikit-Learn, which was used to train, test, and validate all the machine learning models. A CSV document contained the actual data points.
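The recording step, in which a string of decibel readings is reduced to a decibel range (max volume minus min volume) and stored as a row of the CSV document, could look roughly like the following. This is a minimal sketch, not the project's actual script: the function names, the column names, the sample readings, and the pipe size in inches are all hypothetical, and an in-memory buffer stands in for the real CSV file.

```python
import csv
import io


def decibel_range(readings):
    """Return the decibel range: loudest reading minus quietest reading."""
    if not readings:
        raise ValueError("need at least one decibel reading")
    return max(readings) - min(readings)


def append_sample(writer, valve_setting, readings, pipe_size_in):
    """Write one data point: valve setting, decibel range, pipe size."""
    writer.writerow(
        [valve_setting, round(decibel_range(readings), 2), pipe_size_in]
    )


# An in-memory buffer stands in for a file opened in append mode.
buffer = io.StringIO()
writer = csv.writer(buffer)

# Hypothetical column names; the original CSV layout is not given.
writer.writerow(["valve_setting", "decibel_range", "pipe_size_in"])

# Hypothetical decibel readings for a half-open valve on a 0.75 inch pipe.
append_sample(writer, "half", [41.2, 44.8, 43.1, 47.5], 0.75)

print(buffer.getvalue())
```

On the device itself, the buffer would be replaced by a file on the Raspberry Pi so that each measurement run appends one row.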

Data Collection
The procedure for calculating water flow from the outside of a pipe is as follows. A microphone is placed 15 cm from the outside of a standard household water pipe using a size-adjustable strap. Water goes through the pipe in different quantities, as denoted by how far the valve is open (full, half, quarter, eighth). The amount of water flowing and the decibel range (max volume minus min volume) are recorded in a CSV file and used as data. The recorded valve amount was confirmed by an industry PSI water tracker installed inside the pipe, half a yard from the audio device. Using this data, supervised machine learning models can be trained to predict water flow from the decibel range in the testing phase. Once the error rate is low enough, the model will be implemented on the final device, effectively making it possible to track all of a household's water usage from one exterior device. The following table shows the data that was recorded.

Models
To analyze the data, a correlation must be established between the recorded water flow and the decibel range. With the data stored in a CSV file, a customized Python program can be used to visualize the relationship, and various machine learning models (Logistic Regression, Linear Regression, Neural Network, KNN) can be run on the data to determine the best mathematical correlation between water amount and decibel range, allowing the device to measure household water flow without ever having to interfere with the pipes.
All models were built and tested, with the following accuracies. Multiple models were used in order to find the best possible one. All models had an accuracy between 93.22% and 98.3%; neural networks had the best accuracy at 98.3%, and the average of all models is 97.79%.
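The comparison described above can be sketched with Scikit-Learn as follows. This is an illustration under stated assumptions rather than the code used in the study: the data is synthetic (generated in the script, not the recorded CSV), the feature columns and class encoding are hypothetical, and only three of the named model types are included. The 70/30 split and the feature scaling follow the description in this paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the study's measurements).
# Classes: 0 = eighth, 1 = quarter, 2 = half, 3 = full valve opening.
labels = np.repeat(np.arange(4), 30)
decibel_range = 5.0 + 4.0 * labels + rng.normal(0.0, 1.0, labels.size)
pipe_size = rng.choice([0.5, 0.75, 1.0], labels.size)
X = np.column_stack([decibel_range, pipe_size])

# 70/30 train/test split, stratified so every class appears in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, random_state=0, stratify=labels
)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Neural Network": MLPClassifier(
        hidden_layer_sizes=(4, 4), max_iter=2000, random_state=0
    ),
}

scores = {}
for name, model in models.items():
    # Scaling is fitted on the training data only, inside the pipeline.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    scores[name] = pipeline.score(X_test, y_test)  # accuracy on test set

best = max(scores, key=scores.get)
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
print("best model:", best)
```

Placing the scaler inside the pipeline keeps the test subset from influencing the scaling statistics, which is one common way to keep the reported accuracy honest.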
Neural Networks are based on the way that brains process information. A brain contains on the order of 100 billion neurons, connected by synapses that are used to process information.
Similarly, in a machine learning neural network, artificial neurons are used to produce an output.
The role of a synapse is to multiply each input by a weight. The weights are the "strength" of the connections between neurons, and they primarily define the output of a neural network. They are not fixed, however; they are adjusted during training. After the weighted inputs are summed, an activation function is applied to return an output.
Each synapse has a weight, and each neuron has a bias and an activation function, which together determine whether the neuron will be activated.
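The behavior of a single artificial neuron described above (inputs multiplied by weights, a bias added, and an activation function applied) can be sketched in a few lines. The function names and the example numbers are hypothetical and chosen only for illustration; the sigmoid is used as the activation function.

```python
import math


def sigmoid(z):
    """Sigmoid activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Multiply each input by its weight, add the bias, then activate."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# Hypothetical scaled inputs (e.g. decibel range and pipe size) and weights.
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))
```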
Parameters fine-tuned for better accuracy: the cost function, the hidden layer layout, and the type of activation function. After fine-tuning the model:
Best cost function: Cross entropy
Best activation function: Sigmoid function
Best hidden layer layout: Two layers with four hidden nodes
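One way to express the tuned settings in Scikit-Learn is sketched below. Several points are assumptions: "two layers with four hidden nodes" is read here as two hidden layers of four nodes each, the `lbfgs` solver is a choice made for this small example, and the data is synthetic rather than the recorded measurements. `MLPClassifier` minimizes the log-loss, which is the cross-entropy cost, and `activation="logistic"` selects the sigmoid function for the hidden layers.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic stand-in data: four valve classes, one decibel-range feature.
valve_class = np.repeat(np.arange(4), 30)
noise = rng.normal(0.0, 1.0, valve_class.size)
decibel_range = (5.0 + 4.0 * valve_class + noise).reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(
    decibel_range, valve_class, test_size=0.3, random_state=1,
    stratify=valve_class,
)
scaler = StandardScaler().fit(X_train)

network = MLPClassifier(
    hidden_layer_sizes=(4, 4),  # assumed reading: two hidden layers of four
    activation="logistic",      # sigmoid activation in the hidden layers
    solver="lbfgs",             # assumption: works well on small datasets
    max_iter=2000,
    random_state=1,
)
network.fit(scaler.transform(X_train), y_train)

accuracy = network.score(scaler.transform(X_test), y_test)
print(f"test accuracy: {accuracy:.3f}")
```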
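Equation of Neural Network:
$$Z = B + W_1 X_1 + W_2 X_2 + \dots + W_n X_n$$
Z is the representation of the Neural Network, W is the weights, X is the inputs, and B is the bias.
Equation of Gradient Descent (the cross-entropy cost that gradient descent minimizes):
$$C = -\frac{1}{N} \sum_{X} \left[ Y \ln A + (1 - Y) \ln (1 - A) \right]$$
C is the cost value, A is the final output, Y is the desired output, N is the number of independent variables, and the sum is taken over the input attributes X.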
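To make the roles of the weights W, the bias B, the output A, the desired output Y, and the cost C concrete, the following sketch trains a single sigmoid neuron by gradient descent on the cross-entropy cost. It is a simplified illustration, not the network used in this research: there is one input and one neuron, and the data points, learning rate, and number of epochs are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(outputs, targets):
    """Mean cross-entropy cost C between outputs A and desired outputs Y."""
    total = sum(
        y * math.log(a) + (1 - y) * math.log(1 - a)
        for a, y in zip(outputs, targets)
    )
    return -total / len(targets)


def train(xs, ys, learning_rate=0.5, epochs=200):
    """Fit one weight and one bias by plain gradient descent."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(epochs):
        outputs = [sigmoid(w * x + b) for x in xs]
        history.append(cross_entropy(outputs, ys))
        # For a sigmoid neuron with cross-entropy cost, the gradients
        # reduce to the mean of (A - Y) * X and the mean of (A - Y).
        grad_w = sum((a - y) * x for a, y, x in zip(outputs, ys, xs)) / len(xs)
        grad_b = sum(a - y for a, y in zip(outputs, ys)) / len(xs)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history


# Hypothetical scaled decibel ranges; 0 = low flow, 1 = high flow.
xs = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
ys = [0, 0, 0, 1, 1, 1]
w, b, history = train(xs, ys)
print(f"cost before: {history[0]:.4f}, cost after: {history[-1]:.4f}")
```

The cost falls as the weight grows, which is the behavior gradient descent is meant to produce: each step moves the parameters in the direction that lowers C.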

Conclusion and Results
A low-cost device can track water flow through pipes, without interfering with the pipes, at 98.3% accuracy using Neural Networks. This is better than traditional methods because it does not interfere with pipes, it is cheap, it is easy to set up, its accuracy is near 100%, the algorithm can always be improved and changed, it leaves room for future expansion, and the data is easily obtainable and expandable. This will allow homeowners to track their water usage in order to save money and analyze their environmental impact.

Future Research and Work
To expand this research, I would like to add more input features to the neural network so that the water flow predictions are tailored to each individual pipeline. Possible features include pipe size (radius), age of the pipe, material of the pipe, and water pressure. This paper has focused on estimating the amount of water going through the pipe solely from the decibel range it creates; factoring these inputs into the predictions should produce more accurate estimations.
Another way to expand this research is to write more code to identify not only whether the valve is fully open but also whether there is a flood. To do this, one would need code that tracks the start time of a fully open valve; if the valve stays fully open for more than a certain number of seconds, there is likely flooding.
Users of this device may be concerned about their water usage if there is a leak of some sort and too much water is being used. If water is running, the user would want to know which fixture is causing it. User-customized code that tests the different water outlets in the house and saves them as a table on the device would be highly useful. A user would be able to run a script on the device and save the decibel range of running water under a name, for example, VALVE A. In the future, when water is running with a decibel range similar to that of VALVE A, the device can return that name, letting the user know which household fixture is using the water.
Lastly, a GUI needs to be created as an app to notify the homeowner of the predictions and information generated by the script on the device. An app will allow users to monitor all the data created. To do this, the Python code on the device needs to be linked to an account and save all the data to that account. The account can then be accessed via a website or app, and when logged in, the user will be able to see the actual data.
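The timing logic proposed above might be sketched as follows. Everything in it is an assumption made for illustration: the function name, the `(timestamp, valve_state)` sample format, the `"full"` label, and the 600-second threshold are hypothetical, and the valve states would in practice come from the model's predictions.

```python
def detect_flood(samples, full_label="full", max_full_seconds=600):
    """Return the timestamp at which a likely flood is flagged, else None.

    samples: (timestamp_in_seconds, valve_state) pairs in time order.
    A flood is flagged once the valve has been continuously in the
    full state for more than max_full_seconds.
    """
    full_since = None  # start time of the current uninterrupted full run
    for timestamp, state in samples:
        if state == full_label:
            if full_since is None:
                full_since = timestamp
            elif timestamp - full_since > max_full_seconds:
                return timestamp
        else:
            # Any other state ends the run and resets the timer.
            full_since = None
    return None


# Hypothetical predictions: the valve stays fully open for over 10 minutes.
readings = [(0, "full"), (300, "full"), (601, "full")]
print(detect_flood(readings))
```

Resetting the timer whenever the state changes means short bursts of full flow, such as filling a bathtub in stages, do not accumulate toward the threshold.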
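A simple version of this lookup is sketched below: the saved table maps user-chosen names to decibel ranges, and a new reading is matched to the closest saved entry. The function name, the tolerance of 1.0 decibel, and the saved values are hypothetical; a real device would likely need a richer audio signature than a single number to tell fixtures apart.

```python
def identify_fixture(decibel_range, saved_fixtures, tolerance=1.0):
    """Return the saved name whose decibel range is closest to the reading.

    Returns None when nothing is saved or when the closest saved range
    differs from the reading by more than the tolerance.
    """
    if not saved_fixtures:
        return None
    name, reference = min(
        saved_fixtures.items(),
        key=lambda item: abs(item[1] - decibel_range),
    )
    if abs(reference - decibel_range) <= tolerance:
        return name
    return None


# Hypothetical table saved by the user during a calibration run.
saved_fixtures = {"VALVE A": 6.3, "VALVE B": 12.1}

print(identify_fixture(6.0, saved_fixtures))
print(identify_fixture(9.0, saved_fixtures))
```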