Assessment of contaminants in California drinking water: analyses by region and water system size

BACKGROUND While research has shown that small community water systems in the San Joaquin Valley, an agricultural region in California, experience a disproportionate amount of drinking water contamination, little is known about the extent of contamination in other California regions. Additionally, statewide research on drinking water contamination in areas not served by community water systems (mostly private domestic well users) and research comparing contamination across system sizes are limited. METHODS Using a novel method to assign drinking water and groundwater contaminant data to community water system service areas and areas outside of service area boundaries, we conducted a spatial analysis to estimate concentrations of thirteen contaminants and two drinking water standard violations by system size and California region. We developed a cumulative ranking method to evaluate which regions or system size categories are most burdened by multiple drinking water pollutants. A trend test was also used to evaluate the influence of system size on contaminant concentrations. RESULTS The San Joaquin Valley, areas not served by community water systems, and small water systems had the highest cumulative rank for multiple high contaminant concentrations, most notably arsenic and nitrate. Large systems and the South Coast region, which includes Los Angeles, had the highest levels of disinfection byproducts and industrial contaminants like tetrachloroethylene. Violations, arsenic, lead, and cadmium had negative trends as system size increased (p<0.05). Industrial contaminants and disinfection byproducts had positive trends as system size increased (p<0.05).

CONCLUSIONS Few large-scale studies have examined how geographic region or system size impact drinking water quality. Although not indicative of violating drinking water standards, our results show where efforts for specific contaminants can be targeted in specific regions. The results presented here can help understand where contaminant levels might be elevated, both from an individual contaminant perspective as well as where multiple elevated contaminant levels accumulate.

BACKGROUND
California residents receive drinking water from a wide variety of sources and distribution systems due to the state's varied geography, topography, and climate.
The majority (95 percent) of the 38 million Californians receive their water from community water systems (CWSs) with different levels of treatment, while a small fraction of the state's population relies on very small water systems or privately operated groundwater wells with little to no treatment (1,2). Chemical and bacterial contaminants can reach the water supply through different pathways from both naturally occurring and anthropogenic sources and have the potential to result in widespread exposures and health concerns (3).
A number of studies have documented high levels of drinking water contaminants in some regions of California and the potential adverse health effects from exposures to some contaminants (4). For example, arsenic and uranium are naturally present in soil and rock and are among the most common contaminants detected in California, especially in San Joaquin Valley (SJV) groundwater (5,6). In the SJV and in other agricultural areas of the state, the application of fertilizer and leaching of animal waste are a major source of groundwater contamination by nitrate (7)(8)(9). Exposure to drinking water containing arsenic increases the risk of multiple health effects including cancer (10,11), and nitrate in drinking water has been associated with a specific type of anemia (methemoglobinemia) in infants, reproductive toxicity, developmental effects, and various cancers (12,13). Industrial sources such as facilities that manufactured perchlorate or used solid rocket fuel have produced significant groundwater contamination of perchlorate (14). Low-level perchlorate contamination of drinking water has been associated with abnormal thyroid function in neonates and pregnant women (15,16). Methods commonly used to disinfect water with chemicals such as chlorine before distribution can introduce by-products such as trihalomethanes (THMs), which have been linked to an increased risk of bladder cancer (17,18). In addition to concern over the effects of individual contaminants in drinking water, cumulative exposure to multiple compounds may affect health through chronic exposure, even at low levels (19,20).
The State Water Resources Control Board (SWRCB)'s Division of Drinking Water regulates drinking water quality for a majority of Californians (about 94 percent) on public water systems. Of these state-regulated public water systems, approximately 97% met state drinking water quality standards in 2014 (21), meaning that they did not receive violations for water contamination over the Maximum Contaminant Level (MCL), which is the regulatory concentration set by the SWRCB. The MCL is based on the public health goal (PHG), which is the concentration of a contaminant that does not pose a significant risk to health, and on the feasibility of detecting and treating contamination. Even though most CWSs meet MCL standards, many still strive to meet the PHG to the extent feasible. Small water systems with fewer than 200 service connections (usually households), generally in rural areas, however, tend to have the greatest difficulty meeting drinking water standards in California (22).

The characteristics and locations of water systems that are especially vulnerable to contamination and that struggle to meet health-based and regulatory standards are of particular interest to policy-makers. Only a few large-scale studies have examined the role of characteristics such as system size, region, or socioeconomic status in the quality of drinking water. National and local studies have evaluated trends in MCL violations and found that smaller and more rural systems tended to have the largest number of violations (23)(24)(25). Other studies have observed similar trends among small systems for specific contaminants such as nitrate and carcinogenic compounds (26,27). In the SJV, low-income, rural and socioeconomically disadvantaged communities were more vulnerable to violations and generally had higher levels of nitrates and arsenic (5,7). A recent report showed that disadvantaged unincorporated communities in this region are mostly served by small systems, with about a quarter of systems providing water out of compliance with state drinking water quality standards (28). Additionally, smaller CWSs rely more heavily on local groundwater than surface water compared to larger CWSs, making them more vulnerable (29). Mobile home park CWSs in California, which tend to be small systems, are more likely to incur an MCL violation compared to other system types (30). Also highly vulnerable are the two million Californians who receive water not from public water systems but from very small systems (fewer than 15 connections) or private wells drawing from groundwater. Neither of these sources is monitored or regulated by the state, and the people relying on them may be at greatest risk of exposure to multiple contaminants. A national analysis by the US Geological Survey showed that over 23% of domestic wells sampled had at least one contaminant with a concentration above the MCL (31).
As research on areas outside of CWSs (domestic well users and non-regulated small CWSs) is limited, our study is one of the first to examine multiple contaminants in these areas along with assessing how contamination varies by system size.
Additionally, while some research has focused on CWSs in the SJV, less is known about contaminants that pose problems for water systems across other California regions. Our study is one of the first to characterize potential exposure to multiple contaminants by system size and across all California regions, for water systems and areas outside of water systems. Because communities are impacted by different sets of contaminants depending on their geography, identifying contaminant hotspots and trends in all areas will provide a more complete picture of contamination concerns throughout the complex California drinking water network.
Considering the cumulative effects of multiple pollutants in drinking water (from exposure to a combination of multiple sources), rather than individual pollutants, is important in protecting vulnerable communities that may already be facing multiple stressors from combined sources of pollution (20,32). This study builds upon an assessment of cumulative exposure to drinking water contaminants developed for an environmental-justice screening tool called CalEnviroScreen, which identifies communities impacted by multiple sources of pollution in order to target and steer resources towards impacted communities (32, 33). We take a comprehensive look at the concentration of contaminants in drinking water, both individually and cumulatively, to compare potential exposure across system sizes and regions and to determine vulnerable areas and trends. Understanding which contaminants are impacting different regions and populations, as well as which areas face the highest cumulative burdens, may help focus future research and remediation efforts on those communities that potentially face the greatest risks. We hypothesize that smaller water systems and regions of the state that are more rural, such as the SJV, have higher levels of multiple contaminants in drinking water. To our knowledge, this is the first study to examine multiple drinking water contaminant concentrations by water system size, including areas outside of public water systems, as well as by region.

Development of Drinking Water System Service Boundaries
Three methods were used to geographically assign all areas of California a representative water quality. We first used available CWS service boundaries downloaded from the California Tracking's Water Boundary Tool (34,35). More than 80 percent of the 2,949 CWSs that were active at the time of analysis were available from this web-based tool. The remaining CWS boundaries were approximated in a geographic information system (ArcMap 10.2) using information on the location of water monitoring and the population served by the system. To summarize water quality for populated areas outside of CWS boundaries (non-CWS areas), a 6 square mile township grid from the Public Land Survey System was used. The 6 square mile grid was selected instead of larger geographies to assign a sufficient amount of ambient groundwater monitoring data to local residents. The boundary types are summarized in Table 1.

Concentrations of Chemical Contaminants in CWS areas
The study focuses on drinking water intended for consumption without further treatment, here termed delivered water. In order to extract water contaminant data reflective of delivered water for CWS areas, we used four SWRCB datasets.
California's Safe Drinking Water Information System (SDWIS) and its predecessor, the Permits, Inspection, Compliance, Monitoring and Enforcement Database (PICME), which had the most complete data at the time, both contain information on the location and treatment type of chemical monitoring within a system (37). The Water Quality Management (WQM) and Annual Compliance Report datasets provide contaminant sampling and violation data that were linked to the first two datasets (38,39).
Distribution systems for California CWSs vary considerably in complexity and size.
For example, some CWSs provide water directly to consumers from a single source (e.g., water reservoir), others have one treatment plant, and larger CWSs may have multiple treatment plants serving different areas within a water system, with chemical monitoring occurring at multiple locations. The PICME and SDWIS datasets contain data that identify the type of treatment (or its absence) at each monitoring location within the distribution system. Monitoring locations identified as "Active Treated" (post-treatment) or "Active Untreated" (never to be treated) were extracted. "Raw" (pre-treatment) samples were selected only when a water system did not have reported treated or untreated samples for specific contaminants. Next, we extracted water quality data from the SWRCB's WQM databases for 2005 to 2013, reflecting the most recent three compliance periods at the time of this analysis. The nine-year compliance cycle, which contains three three-year compliance periods, was used to maximize the number of water quality samples.
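The location-selection rule described above can be illustrated with a minimal Python sketch. It is not the authors' code (the study's analyses were run in SAS 9.4). The record layout, the field names `contaminant`, `status`, and `value`, and the function name `select_samples` are hypothetical; only the three status labels come from the text.

```python
# Status labels taken from the text; everything else here is illustrative.
PREFERRED_STATUSES = {"Active Treated", "Active Untreated"}
FALLBACK_STATUS = "Raw"


def select_samples(samples):
    """Pick the samples that represent delivered water for one water system.

    `samples` is a list of dicts with hypothetical keys 'contaminant',
    'status', and 'value'. For each contaminant, treated or untreated
    samples are kept when any exist; raw samples are used only as a fallback.
    """
    by_contaminant = {}
    for sample in samples:
        by_contaminant.setdefault(sample["contaminant"], []).append(sample)

    selected = []
    for rows in by_contaminant.values():
        preferred = [r for r in rows if r["status"] in PREFERRED_STATUSES]
        if preferred:
            selected.extend(preferred)
        else:
            selected.extend(r for r in rows if r["status"] == FALLBACK_STATUS)
    return selected


example = [
    {"contaminant": "arsenic", "status": "Active Treated", "value": 3.0},
    {"contaminant": "arsenic", "status": "Raw", "value": 9.0},
    {"contaminant": "nitrate", "status": "Raw", "value": 12.0},
]
chosen = select_samples(example)
```

In this example the raw arsenic sample is dropped because a treated arsenic sample exists, while the raw nitrate sample is kept because it is the only nitrate data for the system.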

Concentrations of Chemical Contaminants in Non-CWS Areas
In this study, Californians receiving drinking water from private wells or very small systems (under 15 connections) without chemical monitoring data were assumed to drink local, or ambient, groundwater. To summarize ambient water quality for these non-CWS areas, we selected raw or untreated groundwater monitoring data from WQM for any public water system, including samples from non-community water systems (schools, gas stations, etc.), within the township. We also incorporated groundwater samples from two sources from 2005 to 2013: 1) the U.S. Geological Survey Priority Basins Project, which assesses groundwater used for public drinking water supplies, and 2) the SWRCB Groundwater Ambient Monitoring and Assessment Domestic Wells Project, which samples a limited number of domestic wells for contaminants of concern on a voluntary basis. Less than one percent of the state's population did not have any available monitoring data within their township grid (Table 1 and Fig. 1).

Selection of Contaminants and Violations
In California, over 100 contaminants have drinking water standards which require routine testing and reporting to the state. For this analysis, we selected thirteen contaminants (names, abbreviations, units and DLRs are listed in Table 2) after receiving input from public workshops and key stakeholders and based on the following criteria: 1) sufficient monitoring across water systems in California, 2) adequate number of samples with a concentration above the detection limit and 3) chronic and acute toxicity concerns. We also included an evaluation of systems in

Calculating Contaminant Concentrations by System
To estimate the average concentration of each contaminant in each CWS or non-CWS boundary area, all contaminant levels were adjusted by the contaminant's detection limit for the purposes of reporting (DLR). For contaminants with less than 25 percent of tests below the detection limit, we used the value of the detection limit divided by the square root of 2. For all other contaminants, concentrations listed at or below the reported detection limit were treated as zero (30). Sampling frequency varied greatly by contaminant and by water system. For CWSs, we calculated a time-weighted average for each contaminant at each drinking water sampling location. Weights were based on the number of days between the first sampling date and the next available sample for each calendar year. Since regulatory monitoring schedules for different contaminants vary, this time-weighting approach assumes that samples reflect water quality for the entire time before a subsequent test occurs for that contaminant. For non-CWS areas, samples were averaged by the calendar year without time weighting. For CWS and non-CWS areas, yearly means were averaged across the nine-year study period for each sampling location. These sampling location averages were then averaged to obtain a mean concentration for each system. Lastly, we incorporated the contribution of water purchased from wholesale water systems. We adjusted concentrations for systems that purchase water by known or default fractions (e.g., fifty-percent purchased water) of the water that the wholesaler supplies to that system. For the violation measures, the total number of violations by system was summed for the study time period. All analyses were conducted using SAS 9.4.

Calculating Contaminant Averages by Region and System Size
We divided California into eight regions (Fig. 2) comprising counties roughly corresponding to regional governmental bodies and adapted from a previous study (40).
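As a rough illustration of the per-location averaging described under Calculating Contaminant Concentrations by System, the Python sketch below applies the detection-limit substitution and the time weighting to a handful of samples. It is not the authors' SAS code. Two points are assumptions: results at or below the DLR are treated as non-detects in both substitution rules, and the last sample of a calendar year is weighted up to the end of that year. The function names are hypothetical.

```python
from collections import defaultdict
from datetime import date
from math import sqrt


def adjust_for_dlr(values, dlr):
    """Substitute non-detects for one contaminant.

    Assumption: a result at or below the DLR counts as a non-detect.
    If fewer than 25% of results are non-detects they become DLR/sqrt(2);
    otherwise they become zero.
    """
    is_nondetect = [v <= dlr for v in values]
    share = sum(is_nondetect) / len(values)
    fill = dlr / sqrt(2) if share < 0.25 else 0.0
    return [fill if nd else v for v, nd in zip(values, is_nondetect)]


def time_weighted_yearly_means(samples):
    """Time-weighted mean per calendar year for one sampling location.

    `samples` is a list of (date, concentration) pairs. Each sample is
    weighted by the days until the next sample in the same year.
    Assumption: the last sample of a year is weighted to the year's end.
    """
    by_year = defaultdict(list)
    for sampled_on, conc in sorted(samples):
        by_year[sampled_on.year].append((sampled_on, conc))

    means = {}
    for year, rows in by_year.items():
        year_end = date(year + 1, 1, 1)
        total_days = 0
        weighted_sum = 0.0
        for i, (sampled_on, conc) in enumerate(rows):
            next_date = rows[i + 1][0] if i + 1 < len(rows) else year_end
            days = (next_date - sampled_on).days
            total_days += days
            weighted_sum += days * conc
        means[year] = weighted_sum / total_days
    return means


def location_mean(samples):
    """Average the yearly means across the study period for one location."""
    yearly = time_weighted_yearly_means(samples)
    return sum(yearly.values()) / len(yearly)


adjusted_few = adjust_for_dlr([1.0, 5.0, 6.0, 7.0, 8.0], dlr=2.0)   # 20% non-detects
adjusted_many = adjust_for_dlr([1.0, 1.5, 6.0, 7.0], dlr=2.0)       # 50% non-detects
yearly = time_weighted_yearly_means(
    [(date(2010, 1, 1), 10.0), (date(2010, 7, 1), 20.0)]
)
```

A system-level mean would then be the plain average of `location_mean` over that system's sampling locations, with the purchased-water fraction applied afterwards.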
Water systems were categorized into either non-CWS or CWS areas, and CWS areas were then further subdivided into sizes (Small, Intermediate, Medium, and Large) based on the number of connections suggested by the Safe Drinking Water Plan for California (Table 3).

Statistical Analysis by Region and System Size
We used average contaminant concentrations to compare the relative level of pollution burden in drinking water across regions and system sizes. First, we tested for significant differences in the estimated average concentrations for each particular contaminant or violation measure by region and by system size using the Kruskal-Wallis test, a non-parametric analysis of variance. Second, to understand the regional burden of individual contaminants, we compared the regional average to the average concentration for the rest of the state by creating ratios:

\[ \text{Ratio} = \frac{R}{S} \]

where R is the population-weighted average of the region and S is the population-weighted average for all areas except the region. We also calculated ratios by system size categories, including the non-CWS areas.
We considered ratios greater than two large enough to show a regional burden, meaning the contaminant concentration for the region was more than two times higher than for the rest of the state. If a region had a high ratio (greater than 2) when stratified across all system sizes within a region, we considered that contaminant to be a substantial regional burden since the high ratios are not limited to a specific system size category or a limited number of systems.
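The ratio and the greater-than-two screening rule can be sketched in a few lines of Python. This is an illustration under assumed inputs, not the authors' implementation: the tuple layout `(region, concentration, population)`, the function names, and the example numbers are all hypothetical.

```python
def population_weighted_mean(rows):
    """Population-weighted mean of (concentration, population) pairs."""
    total_population = sum(pop for _, pop in rows)
    return sum(conc * pop for conc, pop in rows) / total_population


def regional_ratio(systems, region):
    """Ratio R / S for one region.

    `systems` is a list of (region, concentration, population) tuples.
    R is the weighted mean inside the region, S the weighted mean for all
    other areas. A ZeroDivisionError is raised if S is zero.
    """
    inside = [(c, p) for r, c, p in systems if r == region]
    outside = [(c, p) for r, c, p in systems if r != region]
    return population_weighted_mean(inside) / population_weighted_mean(outside)


def is_regional_burden(ratio, threshold=2.0):
    """Screening rule from the text: the ratio must exceed two."""
    return ratio > threshold


# Made-up numbers for illustration only.
systems = [
    ("SJV", 10.0, 100),
    ("SJV", 4.0, 300),
    ("South Coast", 2.0, 1000),
    ("Desert", 3.0, 1000),
]
ratio = regional_ratio(systems, "SJV")  # R = 5.5, S = 2.5
```

The same functions apply to system size categories if the first tuple element holds the size category instead of the region.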
Third, to determine regions with the greatest cumulative drinking water contamination, each region received a relative rank for each contaminant or violation measure. For example, the region with the lowest estimated average concentration of a chemical was given a rank of 1 for that chemical, and the region with the highest estimated average concentration of that chemical was given a rank of 8 (out of 8 regions). Similar to methodology used for the drinking water indicator in CalEnviroScreen 3.0 (36), contaminant rankings were then summed for all chemicals and violation measures for each region and this total was used to represent cumulative burden of contaminants and determine the region that ranked the highest in terms of cumulative contamination.
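The rank-and-sum step can be sketched as follows. This is a minimal Python illustration, not the CalEnviroScreen or study code. The nested-dictionary input and the function name `cumulative_rank` are hypothetical, and the handling of ties is an assumption, since the text does not say how tied averages were ranked (here, ties keep their input order).

```python
def cumulative_rank(averages):
    """Sum per-measure ranks for each region.

    `averages` maps each contaminant or violation measure to a dict of
    {region: average}. Within a measure, the lowest average gets rank 1
    and the highest gets rank N. Assumption: ties are broken by input order.
    """
    totals = {}
    for by_region in averages.values():
        ordered = sorted(by_region, key=by_region.get)
        for rank, region in enumerate(ordered, start=1):
            totals[region] = totals.get(region, 0) + rank
    return totals


# Made-up averages for three regions and two measures.
averages = {
    "arsenic": {"A": 1.0, "B": 3.0, "C": 2.0},
    "nitrate": {"A": 5.0, "B": 9.0, "C": 1.0},
}
scores = cumulative_rank(averages)
most_burdened = max(scores, key=scores.get)
```

Because only ranks are summed, a region's score reflects how often it sits near the top across measures rather than how far above the others its concentrations are.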
Finally, to evaluate contaminant concentration trends across water system sizes, we used the Jonckheere-Terpstra two-sided trend test to evaluate negative and positive trends. To test if there was a decreasing level of contamination as system size increased, the negative (left-sided p-value) trend test was performed. The positive (right-sided p-value) trend test was also performed for all the contaminants to test whether there was an increasing level of contamination as system size increased.
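For readers unfamiliar with the Jonckheere-Terpstra test, the sketch below computes the statistic and one-sided p-values in plain Python. It is an illustration only and will not reproduce SAS output exactly: it uses the large-sample normal approximation without a tie correction in the variance, which is an assumption of this sketch, and the function name is hypothetical.

```python
from math import erfc, sqrt


def jonckheere_terpstra(groups):
    """Jonckheere-Terpstra trend test with a normal approximation.

    `groups` is a list of samples ordered by increasing system size.
    Returns (J, z, p_left, p_right). A small p_left suggests a decreasing
    trend; a small p_right suggests an increasing trend.
    Assumption: no tie correction is applied to the variance.
    """
    j_stat = 0.0
    for a in range(len(groups)):
        for b in range(a + 1, len(groups)):
            for x in groups[a]:
                for y in groups[b]:
                    if x < y:
                        j_stat += 1.0
                    elif x == y:
                        j_stat += 0.5  # ties count as half a concordance

    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    mean = (n_total ** 2 - sum(n ** 2 for n in sizes)) / 4
    variance = (
        n_total ** 2 * (2 * n_total + 3)
        - sum(n ** 2 * (2 * n + 3) for n in sizes)
    ) / 72
    z = (j_stat - mean) / sqrt(variance)

    p_right = 0.5 * erfc(z / sqrt(2))   # P(Z >= z)
    p_left = 0.5 * erfc(-z / sqrt(2))   # P(Z <= z)
    return j_stat, z, p_left, p_right


# Made-up concentrations that fall as system size increases.
j_stat, z, p_left, p_right = jonckheere_terpstra(
    [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
)
```

In this example every value in a smaller-size group exceeds every value in a larger-size group, so J is zero, z is -3, and only the left-sided p-value falls below 0.05.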
Since it was hypothesized that non-CWS areas have generally higher contamination, we ran the negative trend tests including non-CWS areas and we ran the positive trend test both with and without the inclusion of non-CWS areas. Differences were considered statistically significant when p-values were less than 0.05.  Tables 3, 4 and 5.

Regional Results
There was a statistically significant difference between the eight regions for all contaminant concentrations or number of violations (p < 0.0001). Contaminant concentrations by region and contaminant are listed in Table 6. Contaminants with a regional burden, meaning the regional average was greater than twice the average for the rest of the state, are listed by region in the boxes in Fig. 3. The Sierra Nevada region had the greatest number of regional burdens (six contaminants with ratios greater than two), followed by the SJV and Northern California (Fig. 3).
Contaminants with ratios greater than two, even when stratified by system size categories, are indicated with an asterisk in the boxes in Fig. 3. Our analysis of cumulative rankings by region showed that the SJV region had the highest cumulative burden of contaminants among all California regions, followed by the Desert and Central Coast. The cumulative rank score is shown in Fig. 3 as a choropleth map from red (highest score) to blue (lowest score). In addition, the SJV remained the region with the highest cumulative burden when we examined rankings across all sizes of CWSs.

Results by System Size
There was a statistically significant difference between the five system sizes (non-CWS areas plus the four CWS sizes) for all contaminants and violation measures (p < 0.0001), showing that contaminant concentrations vary considerably between system size categories. Among only CWSs, small water systems (fewer than 200 connections) had the most contaminants with concentrations greater than twice the average for the rest of the state: arsenic, cadmium, lead, uranium, MCL violations, and TCR violations (Table 7). For only CWSs, small water systems also had the highest concentration averages of arsenic, nitrate, uranium, cadmium, lead, MCL violations, and TCR violations (Table 8). Cumulatively, for all contaminants evaluated, small water systems had the highest rank, followed by medium, large and intermediate water systems (Table 7). Large water systems were the only size not burdened by MCL and TCR violations as compared to the rest of the state.
However, large water systems had the highest burden (ratios) and contaminant average rank for THMs, TCE, and PCE compared to the smaller water systems.
Non-CWS areas in California had the highest burden of water contamination compared to areas served by regulated CWSs. Similar to small water systems, the non-CWS areas had a higher number of contaminants (six) with ratios greater than two compared to intermediate, medium and large CWS size categories (Table 7).
Californians in non-CWS areas had the highest cumulative burden of contaminants compared to those being served by CWSs, ranking the highest in 7 of the 13 contaminants, including arsenic, nitrate, DBCP and hexavalent chromium (Cr+6) (Table 8). Our regional results for non-CWS areas indicate that the average concentrations for PCE and DBCP are the highest in Northern California and lead and cadmium are the highest in the Central Coast. Additionally, when evaluating regional burdens in just non-CWS areas, Radium 226, 228 was a significant burden in the Sierra Nevada region and 1,2,3-Trichloropropane (1,2,3-TCP) in the SJV (ratios > 10) (data not shown). Overall, the Central Coast also has the highest cumulative burden of multiple contaminants in non-CWS areas only, followed by both SJV and South Coast regions (second highest rank) (data not shown).

Trend Test and Patterns across the State
The trend test results indicate that system size has an effect on the concentration for certain contaminants. We observed a statistically significant (p < 0.05) negative trend (as system size increased, the level of contamination decreased) for arsenic, cadmium, lead, MCL violations, and TCR violations among all systems in California (Fig. 4). In contrast to these contaminants, trend test results indicated that TCE and THM levels increased as system size increased. When only CWSs were included in the positive trend test, perchlorate and PCE also showed a statistically significant trend. While some contaminants did not show a significant trend or burden, they showed notable patterns (Fig. 6). DBCP appears to be primarily an issue for non-CWS areas in the Northern California region. However, when examining only CWSs, the SJV has the highest concentrations (Table 8). Results for nitrate and uranium indicate that disproportionate contamination occurs mainly in small water systems compared to large water systems. These patterns, along with the trend test results by system size, can be observed through a heat map table (Fig. 6). Regional contaminant concentration averages and system size averages are included in Additional File 1.

DISCUSSION
In this study, we developed a novel method that takes a large amount of water quality monitoring data and characterizes potential exposure to drinking water contaminants for all areas of the state, including those on private wells or small, unregulated systems. We overcame limitations associated with linking water quality data to multiple geographic boundaries across all of California. This is the first study to use or develop a full set of drinking water system service boundaries in its analysis. Previous studies have examined contaminated water for a limited number of individual contaminants regionally or only for CWSs, or did not use CWS service boundaries. In this study we examined a novel metric of cumulative exposure to multiple contaminants statewide and by system type, including areas not served by CWSs, and size.
Our findings confirm that water quality varies throughout the state and provide a first screening look at regions and system sizes that are affected differently by specific and cumulative contaminants. Since most water systems in California are in compliance with standards and considering our large-scale use of regional averages, we found that regional contaminant concentration averages are much lower than the MCL standards. Even so, regional concentrations were frequently higher than health-based PHGs for several contaminants such as arsenic, Cr+6, and 1,2,3-TCP.

Smaller systems had higher concentrations of multiple contaminants and violations.
Our results on regions or system size categories that have substantially higher single or multiple contaminant burdens can provide a useful step to address specific contaminants in areas most in need of pollution reduction.
Our findings support results of previous studies of California's SJV concerning high levels of arsenic, uranium, and nitrates (5)(6)(7)(8)(9). In addition to these contaminants, we Regional results can be used to identify research and treatment efforts regionally for specific contaminants.
Our finding that smaller systems tend to have more MCL and TCR violations is supported by previous research (22)(23)(24)(25)(28). We also found that small systems have significantly higher concentrations of arsenic, cadmium and lead compared to larger systems, which indicates that further efforts should focus on small water systems in California. Although we did not find a significant trend for nitrate, we note that small systems had higher concentrations of nitrate compared to most larger water system categories, which aligns with previous research findings that small water systems are more vulnerable to higher nitrate levels in the SJV and nationally (7,26). In contrast, we saw a significant positive trend for THMs as systems get larger, suggesting the potential for THM exposure is greater for people served by large systems. This is not surprising since THMs are a byproduct of water chlorination treatment, which is more heavily used by large systems. PCE and TCE also showed positive trends, which may be due to greater industrial influence in more populated areas of Southern California served by large water systems. Perchlorate also showed a positive trend by system size and warrants further evaluation.
Although this study is the first, to our knowledge, to investigate potential exposure to contaminants in drinking water across multiple regions in California, system sizes, and by multiple contaminants, the methods used here have several limitations for use in screening level exposure assessment. Our methods for assigning water quality to geographic areas were limited by availability and accuracy of data, such as using approximated water system boundaries and townships to assign water quality to certain areas of the state. However, the boundaries for 80 percent of CWSs in California that were extracted from the Water Boundary Tool represent the most complete and accurate set of boundaries available at the time of analysis.
The monitoring databases rely on data reported to the state by water systems, and there are limitations in both the testing and reporting for contaminants. Data gaps and non-standardized reporting requirements may have influenced our results, with inconsistencies in contaminant reporting cycles and some water systems receiving monitoring waivers. Due to the lack of statewide data on blending of water sources within a water system, our analysis assumes that water quality is homogeneous within a CWS or non-CWS area, which especially affects larger water systems with multiple treatment plants. Contaminant concentrations for these systems may be overestimated, especially if sources are mixed unequally to reduce contaminant concentrations. Lastly, data on the level of contaminants in the final blend of water delivered to consumers are not directly available, so we characterized delivered drinking water using monitoring data from treatment plant sampling points. Our reliance on the databases used in this study highlights the importance of accessible and accurate state drinking water monitoring data.
While providing useful information on potential exposures from drinking water contaminants, the results are of most use in screening scenarios to identify and compare relative burdens of exposure and to understand areas for continued research. Further efforts are needed to assess detailed and direct exposure to contaminants, which could involve biomonitoring, a focus on a smaller geographic unit or even individual water systems. Future studies could examine other factors, such as socioeconomic disparities and water system capacity that are associated with water quality throughout the state. Additionally, continued work to address the gap in water quality between small systems and large has been identified as a priority for California and is supported by our findings.

CONCLUSION
As a screening level exposure assessment, this study is the first to examine multiple drinking water contaminants that may be a potential problem in specific regions across California and by different system sizes, including areas not served by public water systems. The summarized regional and system size contaminant averages were all below MCL standards; therefore, the results presented here cannot be used to conclude drinking water safety, but instead may provide a preliminary snapshot of areas with poor water quality that could help target localized investigation and treatment efforts throughout California. Key findings include a potential high burden of MCL and TCR violations, 1,2,3-TCP, arsenic, and nitrate in the SJV, DBCP in Northern California groundwater, and TCE, PCE, and THMs in the South Coast region and among large water systems. Overall, Californians residing in areas outside of regulated public water systems (non-CWS areas), those served by small public water systems, and people living in the SJV have the greatest burden of multiple contaminants. Future analyses could explore socioeconomic factors associated with drinking water contamination in specific regions or system size categories. As access to improved data becomes available and technology improvements yield more sensitive and affordable testing protocols, the ability to characterize drinking water quality by water system, linking boundaries to monitoring data and comparing potential exposures at both a local and statewide level, will be vitally important for government agencies and researchers concerned with public health.