This data set contains information about all Community Water Systems in California. Data are derived from the California Office of Drinking Water (ODW) Water Quality Monitoring Database (WQMD; also known as Water Quality Inventory or WQI), the Safe Drinking Water Information System (SDWIS), and the Permits, Inspection, Compliance, Monitoring, and Enforcement (PICME) database. The data set contains one record for each year and Community Water System (CWS) that was active for all or part of the reporting period. It includes additional detail about how many retail connections each CWS has, how many people each CWS serves, and the approximate location of each CWS.
This data set contributes to the Environmental Public Health Tracking Network. The EPHT cooperative agreement states that all grantees must track and make available core environmental health tracking measures on the State and National EPHT Network, including data/information on key drinking water contaminants for regulated public water supplies, as defined through the Content workgroup process. The Content Workgroup Water Team identified contaminants of concern for the national EPHT program, identified nationally consistent data sources, and developed nationally consistent indicators and measures. This data set can be used to enumerate all Community Water Systems in California. Using the Public Water System ID Number, it can be joined with the sampling results dataset.
The geographic locations provided in this dataset were not verified for accuracy by State Primacy Agency officers or water system staff. The locations are provided and intended for diagrammatic and visualization purposes, such that the approximate location of the CWS service area can be described on a small scale map. These locations are not designed for large scale analyses and should not be used in linking to health data.
850 Marina Bay Parkway, Bldg P, 3rd Floor
The EPHTN Drinking Water Content Workgroup concluded that an inventory of active community water systems for the years 1999 to the most current year must be compiled. WQMD, SDWIS, and PICME are coded logically to reasonably ascertain active CWS for the reporting period; however, a small minority of active CWS only export drinking water to other CWS with a non-zero retail population. For this reason, the legal definition of a CWS (i.e., serving 15 retail connections or 25 people, year-round) was also used in populating the CWS inventory. A total of 3,363 CWS were enumerated by this logic for the reporting period. The SDWIS inventory deactivation date of each system that is currently inactive allows for the inclusion of water systems that were active during earlier years of the reporting period. In previous years' inventory datasets, only the inventory of the most recent year was extracted. Listed below are the annual frequencies of active systems included in this dataset.

Year -- Num Systems
1999 -- 3,059
2000 -- 3,080
2001 -- 3,071
2002 -- 3,063
2003 -- 3,055
2004 -- 3,044
2005 -- 3,039
2006 -- 3,035
2007 -- 3,017
2008 -- 3,030
2009 -- 3,036
2010 -- 3,016
2011 -- 3,001
2012 -- 2,987

CWS-specific population-served estimates are often inaccurate, because individual water systems are given 3 options for performing this calculation:
1. using the US Census and/or CA Department of Finance,
2. multiplying the number of service connections by 3.3, and/or
3. estimating the population served for single connections that provide service to multiple dwelling establishments, such as mobile home parks, apartments, prisons, and other institutional facilities with permanent residents.

As of September 2, 2012, according to PICME and reported through this dataset, the total 2012 active CWS population served was 41,012,626, whereas the US Census reports the CA population in 2012 as 38,041,430.
Moreover, estimates of public drinking water use by USGS and from the 1990 Census found that only 85-90% of the CA population is served by CWS. Statewide, this means that the population served estimate as reported by this dataset may be 20-25% higher than the true value.

Latitude and longitude coordinates to describe a representative location for CWS were found using 5 methods in the following priority:
1. centroid of service area polygon (LocationDerivationCode=SA),
2. centroid of facility coordinate locations expected to be near the service area (LocationDerivationCode=MFL),
3. centroid of principal city served coordinates and geocoded system headquarters address point (LocationDerivationCode=0),
4. principal city served coordinates (LocationDerivationCode=PCS),
5. geocoded system headquarters address point (LocationDerivationCode=GSH).

The logic used for subsetting sampling station locations that are expected to be near retail populations was as follows: any active or combination groundwater sampling location (raw, treated, or untreated) was considered to be near the retail population served. Typically, groundwater systems lie in or very near the consuming population. For active, mixed and surface water sampling locations, only treatment plant sampling stations and distribution system sampling locations were used.
Annual frequencies for distribution of how locations were found are listed below:

1999 -- SA=1,312; MFL=1,556; O=130; PCS=48; GSH=8; -999=5
2000 -- SA=1,319; MFL=1,565; O=136; PCS=48; GSH=8; -999=4
2001 -- SA=1,329; MFL=1,554; O=123; PCS=51; GSH=10; -999=4
2002 -- SA=1,348; MFL=1,544; O=116; PCS=47; GSH=5; -999=3
2003 -- SA=1,350; MFL=1,530; O=116; PCS=50; GSH=6; -999=3
2004 -- SA=1,357; MFL=1,510; O=117; PCS=50; GSH=7; -999=3
2005 -- SA=1,366; MFL=1,493; O=119; PCS=51; GSH=7; -999=3
2006 -- SA=1,371; MFL=1,482; O=119; PCS=53; GSH=7; -999=3
2007 -- SA=1,377; MFL=1,468; O=116; PCS=46; GSH=7; -999=3
2008 -- SA=1,388; MFL=1,458; O=125; PCS=48; GSH=6; -999=5
2009 -- SA=1,398; MFL=1,445; O=134; PCS=49; GSH=6; -999=4
2010 -- SA=1,398; MFL=1,430; O=132; PCS=46; GSH=6; -999=4
2011 -- SA=1,394; MFL=1,416; O=134; PCS=47; GSH=6; -999=4
2012 -- SA=1,387; MFL=1,409; O=134; PCS=47; GSH=6; -999=4
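The priority order for representative locations can be illustrated with a minimal Python sketch. This is not the code used to build the dataset: the function name, argument names, and the (latitude, longitude) tuple layout are assumptions made for illustration. The code for the combined method is passed as a parameter because it is written as both 0 and O in this record, and the two-point centroid is assumed to be a simple coordinate-wise mean.

```python
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude), an assumed layout


def derive_location(
    sa: Optional[Coord] = None,    # service area polygon centroid
    mfl: Optional[Coord] = None,   # centroid of facility locations
    pcs: Optional[Coord] = None,   # principal city served
    gsh: Optional[Coord] = None,   # geocoded system headquarters
    combined_code: str = "O",      # assumed spelling of the combined code
) -> Tuple[Optional[Coord], str]:
    """Return (coordinate, LocationDerivationCode) in the stated priority."""
    if sa is not None:
        return sa, "SA"
    if mfl is not None:
        return mfl, "MFL"
    if pcs is not None and gsh is not None:
        # Assumption: the centroid of two points is their mean.
        mean = ((pcs[0] + gsh[0]) / 2, (pcs[1] + gsh[1]) / 2)
        return mean, combined_code
    if pcs is not None:
        return pcs, "PCS"
    if gsh is not None:
        return gsh, "GSH"
    # No coordinate available: coded as missing.
    return None, "-999"
```

Calling `derive_location(pcs=(38.0, -122.0), gsh=(39.0, -121.0))` falls through the first two methods and returns the averaged point with the combined code.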
By definition, this dataset does not include unregulated drinking water providers or systems not defined as Community Water Systems according to the Federal Safe Drinking Water Act, namely private drinking water wells or very small systems ("State Small") or mutuals in which there are fewer than 15 retail connections and fewer than 25 year-round residents. Latitude and longitude values could not be found for 7 CWS over the entire reporting period; these CWS were coded as missing (-999).
CEHTP received comma-separated (CSV) PICME database from ODW (dated 9/2/2012) and imported SYSNUM1 (N=14398) table into SQL Server 2008. Principal county served was inferred from the first 2 digits of the PWSID.
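As a rough illustration of the county inference, the following Python sketch takes the first 2 digits of a PWSID. It assumes a 7-digit system number, optionally prefixed with "CA"; that format, the function name, and the sample IDs in the usage note are assumptions and are not stated in this record.

```python
def county_code_from_pwsid(pwsid: str) -> str:
    """Return the two-digit county code from the start of a PWSID.

    Assumption: the PWSID is 7 digits, optionally prefixed with 'CA'.
    """
    digits = pwsid.strip().upper()
    if digits.startswith("CA"):
        digits = digits[2:]
    if len(digits) != 7 or not digits.isdigit():
        raise ValueError(f"unexpected PWSID format: {pwsid!r}")
    return digits[:2]
```

With hypothetical IDs, `county_code_from_pwsid("1910067")` yields `"19"` and `county_code_from_pwsid("CA0110001")` yields `"01"`.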
CEHTP received SDWIS XML Inventory dataset from ODW (dated 2/22/2013) and imported WaterSystems (N=11846) table into SQL Server 2008. The PICME SYSNUM1.UPDT (deactivation date) was updated to reflect the deactivation dates specified in the SDWIS WaterSystems table.
CEHTP downloaded 4 WQMD tables (all dated 3/15/2013), CHEMICAL (N=3309604), CHEMARCH (N=7435614), CHEMXARC (N=8637112), and CHEMHIST (N=6970088), in dbf format from the ODW website at http://www.cdph.ca.gov/certlic/drinkingwater/Pages/EDTlibrary.aspx. Imported all 4 tables into SQL Server 2008, appended them into a single table named EPHTN_FINDINGS (N=26352291, i.e. 26352418 appended records minus 127 with null store_num), normalized on unique samples, established unique and search-optimized indexes, and named the new samples table EPHTN_SAMPLES (N=1748969). A unique sample was based on sampling station, sample date/time, analyte, and lab. For each unique water system, determined the earliest sample date and added this as column MIN_SAMPLE_DT to the PICME SYSNUM1 table.
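The normalization step was performed in SQL Server; the Python sketch below only illustrates the idea on in-memory rows. Apart from store_num, every field name here (station, sample_dt, analyte, lab, system) is a hypothetical stand-in for the actual WQMD columns, and keeping the first row seen per key is an assumption about how duplicates were collapsed.

```python
def unique_samples(findings):
    """Drop findings with a null store_num, then keep one row per sample key."""
    seen = set()
    samples = []
    for row in findings:
        if row.get("store_num") is None:
            continue  # excluded from the findings table
        # Unique sample = station + date/time + analyte + lab.
        key = (row["station"], row["sample_dt"], row["analyte"], row["lab"])
        if key not in seen:
            seen.add(key)
            samples.append(row)
    return samples


def min_sample_date_by_system(samples):
    """Return {system: earliest sample_dt} across the given samples."""
    earliest = {}
    for row in samples:
        system, dt = row["system"], row["sample_dt"]
        if system not in earliest or dt < earliest[system]:
            earliest[system] = dt
    return earliest
```

The second function corresponds to the MIN_SAMPLE_DT column: one earliest date per water system.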
Estimated earliest water system activity in PICME SYSNUM1 by taking earliest date of following 4 date fields: SYSNUM1.REVISEDATE, SYSNUM1.INVEN_DATE, SYSNUM1.SYS_UPDATE, SYSNUM1.MIN_SAMPLE_DT
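A minimal Python sketch of taking the earliest of the 4 date fields follows. How null dates were handled is not stated in this record; the sketch assumes they are ignored, and the function name is hypothetical.

```python
from datetime import date
from typing import Optional


def earliest_activity(*dates: Optional[date]) -> Optional[date]:
    """Return the earliest non-null date, or None if all are null.

    Intended inputs: REVISEDATE, INVEN_DATE, SYS_UPDATE, MIN_SAMPLE_DT.
    Assumption: null dates are skipped rather than propagated.
    """
    known = [d for d in dates if d is not None]
    return min(known) if known else None
```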
Developed SQL statement for creating year-specific CWS Inventory table. Primary logic holds PICME_SYSNUM1.PWS_CLASS='C' and (Year >= YEAR(PICME_SYSNUM1.MIN_DT) and (PICME_SYSNUM1.UPDT IS NULL or YEAR(PICME_SYSNUM1.UPDT) >= Year)) and ((PICME_SYSNUM1.CONNECTIONS>=15 and PICME_SYSNUM1.POPULATION>0) or PICME_SYSNUM1.POPULATION>=25) to select only Community Water Systems that are active in a reporting year and have either 15+ connections or 25+ population served. The logic for determining the optional Primary Source Code departs from EPA Surface Water Treatment Rule logic: if all the active sources in a water system are of one source type (i.e., GW, SWP, SW, GWP, GU, or GUP), the system is attributed with that source type; otherwise it is attributed as missing. 3,044 system inventory records were returned by this logic.
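The selection logic can be restated as a small Python predicate for readers who want to check individual records. This is a sketch of the stated conditions, not the SQL that was run; the function and parameter names are assumptions, connections and population are assumed to be non-null integers, and a null MIN_DT is treated as failing the comparison, as it would in SQL.

```python
from datetime import date
from typing import Iterable, Optional


def is_active_cws(pws_class: str, min_dt: Optional[date],
                  updt: Optional[date], connections: int,
                  population: int, year: int) -> bool:
    """True if the record meets the stated criteria for a reporting year."""
    if pws_class != "C":
        return False
    # Earliest known activity must fall in or before the reporting year.
    started = min_dt is not None and year >= min_dt.year
    # No deactivation date, or deactivated in the reporting year or later.
    not_deactivated = updt is None or updt.year >= year
    meets_size = (connections >= 15 and population > 0) or population >= 25
    return started and not_deactivated and meets_size


def primary_source_code(active_source_types: Iterable[str]) -> Optional[str]:
    """Return the single shared source type, or None (missing) if mixed."""
    kinds = set(active_source_types)
    return kinds.pop() if len(kinds) == 1 else None
```

For example, a system deactivated in 2003 passes the check for 2003 but not for 2004, which is how formerly active systems appear in earlier years of the inventory.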
Downloaded Geographic Names Information System (GNIS) feature types and codes from geonames.usgs.gov and extracted the feature ID by joining on the county FIPS code and the city location name field in PICME_SYSNUM1. Merged the feature ID field into the inventory dataset. A code of -999 was reported for CWS that could not be attributed by this method (N=240).
Exported current database state of PWS service areas (N=2,063 PWS) from CEHTP Drinking Water Systems Geographic Reporting Tool (http://ehib.org/water). Extracted centroid of CWS (N=1,404) and merged the corresponding latitude/longitude value into the inventory dataset using a LocationDerivationCode of 'SA'.
For CWS not having service area centroids (N=1,959), developed SQL statement for subsetting by PWSID individual sampling stations near the service area using the PICME_SOURCE table. Primary logic is the union of the following 2 where statements: 1. WATER_TYPE=G and ENTITY_INFO in (AR,AT,AU,IR,IT,IU,CR,CT,CU) and 2. WATER_TYPE in (S,M) and ENTITY_INFO in (AT,IT,CT,DR,DT). 1,648 CWS were found with latitude/longitude coordinates through this method. To ensure security, the reported centroids were rounded to the nearest hundredth of a decimal degree (i.e., accurate to ~1 km).
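The station subsetting and rounding can be sketched in Python as follows. The WATER_TYPE and ENTITY_INFO values are taken from the where statements above; the function names, the tuple layout, and the use of a simple mean of station coordinates as the centroid are assumptions for illustration only.

```python
GROUNDWATER_ENTITY = {"AR", "AT", "AU", "IR", "IT", "IU", "CR", "CT", "CU"}
SURFACE_MIXED_ENTITY = {"AT", "IT", "CT", "DR", "DT"}


def near_service_area(water_type: str, entity_info: str) -> bool:
    """Union of the two where statements on WATER_TYPE and ENTITY_INFO."""
    if water_type == "G":
        return entity_info in GROUNDWATER_ENTITY
    if water_type in ("S", "M"):
        return entity_info in SURFACE_MIXED_ENTITY
    return False


def rounded_centroid(stations):
    """Mean of qualifying stations, rounded to 0.01 decimal degree.

    stations: iterable of (water_type, entity_info, lat, lon) tuples.
    Returns None when no station qualifies.
    """
    pts = [(lat, lon) for wt, ei, lat, lon in stations
           if near_service_area(wt, ei)]
    if not pts:
        return None
    lat = sum(p[0] for p in pts) / len(pts)
    lon = sum(p[1] for p in pts) / len(pts)
    return round(lat, 2), round(lon, 2)
```

Rounding to two decimal places keeps the point within roughly a kilometer of the computed centroid, consistent with the stated ~1 km accuracy.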
For the remaining CWS lacking latitude/longitude values (N=311), the water system headquarters address was geocoded using the CEHTP Centralized Geocoding Service. Where a principal city location was also available, the two coordinates were averaged (N=212). For the remaining CWS (N=87) that did not have both a principal city and a geocoded system headquarters available, the principal city alone was used (N=80); lacking that, the geocoded system headquarters alone was used (N=12).
The data set contains fields describing Community Water Systems (CWS) such as location of service area, number of connections, and approximate number of people served.
Data dictionary is available from the National Tracking Program at http://www.cdc.gov/nceh/tracking.