Contact Us

California Environmental Health Tracking Program

850 Marina Bay Pkwy, P-3
Richmond, CA 94804

(510) 620-3038
Last Edited: 10/1/2014

Asthma: Methods and Limitations

Data Sources

Hospitalization and Emergency Department Data

Since 1986, the Office of Statewide Health Planning and Development (OSHPD) has been responsible for routinely collecting data on hospital discharges from every licensed acute care hospital in California, excluding federal hospitals.  Since 2005, OSHPD has also been collecting data on emergency department (ED) visits from hospitals in California.  Each year, OSHPD compiles the data from all hospitals to create the Patient Discharge Database (PDD) and the Emergency Department (ED) database.

These datasets include information such as age, gender, race/ethnicity, and diagnosis. A case due to asthma is identified by looking at the principal diagnosis, coded according to the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM).  A principal diagnostic ICD-9-CM code of 493 indicates that a patient was admitted to a hospital or visited the ED because of asthma.
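The case definition above can be sketched in a few lines of Python. This is only an illustration: the record layout, the field name `principal_dx`, and the assumption that codes may appear with or without a decimal point are hypothetical and are not taken from the OSHPD file specifications.

```python
def is_asthma_event(principal_dx: str) -> bool:
    """Return True when the principal ICD-9-CM diagnosis falls under category 493."""
    # Assumption: codes may arrive with or without the decimal point
    # (for example "493.90" or "49390"), so the point is stripped first.
    normalized = principal_dx.strip().replace(".", "")
    return normalized.startswith("493")


# Hypothetical records; only the principal diagnosis is examined.
records = [
    {"id": 1, "principal_dx": "493.90"},
    {"id": 2, "principal_dx": "49322"},
    {"id": 3, "principal_dx": "486"},
]

asthma_ids = [r["id"] for r in records if is_asthma_event(r["principal_dx"])]
```

Secondary diagnoses are deliberately ignored here, which mirrors the reliance on the principal diagnosis described above.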

PDD and ED data are available for hospitals to assess their level of performance.  The data are also available to some California State and Federal government programs and some academic institutions to be used for research and other public health purposes.

To find out more about what data is available from OSHPD, go to their website:
To request public use data from OSHPD, go here:




Death Statistical Master File Data

For California, the Center for Health Statistics (CHS) is responsible for coordinating and overseeing the collection, management, and dissemination of public health and vital statistics data in conjunction with other State agencies, local government agencies, and other customers.

Data for asthma deaths are obtained from the Death Statistical Master Files (DSMF), collected by the Office of Health Information and Research at CDPH. The DSMF data files contain data from all death certificates registered in California and from all death certificates for California residents who died out-of-state. Coverage is theoretically 100%. All deaths after 1999 have been assigned ICD-10 codes to reflect the Underlying Cause of Death (including asthma: ICD-10 = J45 or J46). Also included are a variety of demographic variables of interest, such as sex, age, and race/ethnicity.


To find out more about what data are available from CHS, go to their website:


Population Data

To calculate rates within a certain population and timeframe, one uses population data (i.e., denominator data) that represent that population during the specified period of time.  The U.S. Census is the only comprehensive source of population data and is conducted every ten years.  The last census was conducted in 2010 and the next will be in 2020.  To estimate the population during the interim (2011-2020), demographers use various algorithms to project how the population has changed.


Hospitalization, ED, and mortality rates presented by other data providers (e.g. CDC) may differ slightly because they may use different sources or different years of population projections. 

  • County-level population data

For the asthma rates presented on the CEHTP portal, we use population data from the California Department of Finance (DOF). Years 2000-2010: DOF uses the 2000 and 2010 population data collected by the U.S. Census Bureau and takes into account births, deaths, and migration for California to estimate the intercensal California population by county, race/ethnicity, age, and gender (State of California, Department of Finance, Race/Hispanics Population with Age and Gender Detail, 2000–2010. Sacramento, California, September 2012). Years 2011-2012: DOF uses the 2010 population data collected by the U.S. Census Bureau and takes into account survival rates, migration patterns, and fertility rates for California to project the California population by county, race/ethnicity, age, and gender for 2010-2060 (State of California, Department of Finance, Report P-3: State and County Population Projections by Race/Ethnicity, Detailed Age, and Gender, 2010-2060. Sacramento, California, January 2013).

  • Zip code-level population data

For the asthma data at the zip code level, we use population estimates from a private vendor. This vendor takes into consideration U.S. Postal service routes, U.S. census population, and other data sources to estimate the population for each zip code. (See Limitations in geographic resolution below)




Statistical Concepts


The number of hospitalizations, ED visits, or deaths among California residents is calculated by summing the total number of events (e.g., hospital discharges for asthma) for a given time period (e.g., year), geography (e.g., county), and demographic group (e.g., Hispanics).  A hospitalization is assigned to a time period based on its date of discharge.  Events are counted per discharge or per ED visit, not per person, since some people are admitted to the hospital or visit the ED more than once in a given year. ED events include visits to the ED that end in hospitalization.
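As a small illustration of event-based counting, the sketch below tallies hypothetical discharge records by year, county, and demographic group. The record fields and values are invented for the example; note that the same patient contributes two events.

```python
from collections import Counter

# Hypothetical discharge records: one row per discharge, not per person.
discharges = [
    {"patient": "A", "year": 2012, "county": "Alameda", "group": "Hispanic"},
    {"patient": "A", "year": 2012, "county": "Alameda", "group": "Hispanic"},
    {"patient": "B", "year": 2012, "county": "Alameda", "group": "Non-Hispanic White"},
    {"patient": "C", "year": 2011, "county": "Fresno", "group": "Hispanic"},
]

# Count events for each (year, county, group) combination.
event_counts = Counter(
    (d["year"], d["county"], d["group"]) for d in discharges
)
```

Here patient "A" appears twice in the same stratum, so that stratum has a count of two events even though only one person was hospitalized.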





  • Crude rates

Crude (i.e., unadjusted) rates are calculated by taking the total number of events for a given time period (e.g., year), geography (e.g., county), and demographic group (e.g., Hispanics), and dividing by the total underlying population for the same time period, geography, and demographic group.  The result is then multiplied by 10,000 and expressed as X hospitalizations per 10,000 California residents.
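The crude rate calculation can be written as a one-line function. The function name and the example figures below are hypothetical and are only meant to show the arithmetic.

```python
def crude_rate(events: int, population: int, per: int = 10_000) -> float:
    """Events divided by the underlying population, scaled to a rate per `per` residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * per


# Hypothetical example: 150 hospitalizations among 1,200,000 residents,
# which works out to roughly 1.25 hospitalizations per 10,000 residents.
rate = crude_rate(events=150, population=1_200_000)
```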


  • Age-adjusted rates

Age-adjusted rates take into account the age distribution of a population and are calculated to allow for direct comparisons between two or more populations at one point in time, or within a single population at two or more points in time.  Crude rates measure the true risk for a population, while age-adjusted rates are useful as a relative index of risk.

Using the direct method of age-adjustment, crude rates are weighted to be comparable to a standard population.  For the age-adjusted hospitalization rates presented here, we use the U.S. Census 2000 population as the standard population.

Below are the steps taken to calculate the age-adjusted rates:

    1. Group the numerator (i.e. hospitalizations, a) and denominator (i.e. population, b) data into 19 5-year age group strata (0-4 years old, 5-9 years old, …, 75-79 years old, 85 years old and over)
    2. For each stratum, divide the numerator by the denominator (a/b)
    3. For each stratum, multiply a/b by the stratum-specific weight of the standard population (a/b*w).  The weight is calculated by taking the total number of people in each stratum of the standard population and dividing by the total number of people in the entire standard population
    4. Sum a/b*w across all strata and multiply by 10,000. The total is the age-adjusted rate per 10,000 California residents
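The four steps above can be sketched as a short Python function. This is a minimal illustration of direct age-adjustment, not the portal's actual code: the function name is hypothetical, and the example uses three made-up strata rather than the real age groups or the real Census 2000 standard population counts.

```python
def age_adjusted_rate(events, population, standard_population, per=10_000):
    """Directly standardized rate per `per` residents.

    events[i]              -- numerator a for stratum i (step 1)
    population[i]          -- denominator b for stratum i (step 1)
    standard_population[i] -- standard population count for stratum i
    """
    if not (len(events) == len(population) == len(standard_population)):
        raise ValueError("all inputs must have one value per age stratum")

    standard_total = sum(standard_population)
    adjusted = 0.0
    for a, b, s in zip(events, population, standard_population):
        if b <= 0:
            raise ValueError("each stratum needs a positive population")
        w = s / standard_total      # stratum weight from the standard population (step 3)
        adjusted += (a / b) * w     # stratum-specific rate times weight (steps 2 and 3)
    return adjusted * per           # sum across strata, scaled (step 4)


# Made-up example with three strata; the result is approximately 14 per 10,000.
example = age_adjusted_rate(
    events=[10, 20, 30],
    population=[10_000, 20_000, 10_000],
    standard_population=[500, 300, 200],
)
```

In this made-up example the crude rate would be 60 / 40,000 x 10,000 = 15 per 10,000, so the adjusted and crude figures differ because the example population is older than the example standard.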



Confidence Intervals

To describe the range of plausible values for the true rate given the data at hand, we calculate a 95% confidence interval for each rate.  Statisticians have developed a large number of methods for calculating these confidence intervals.  The results are usually very similar no matter which method is used, although they may differ when the number of events is small.

The method developed by Tiwari, Clegg, and Zou (Statistical Methods in Medical Research, 2006; 15:547-569) applies specifically to age-adjusted health outcomes such as hospitalizations due to asthma.  We have chosen to use this method for calculating confidence intervals for the CEHTP Portal, but users should understand that others in the field may not be using the same method.




Common limitations with the data

 Hospitalizations and ED visits do not represent the entire asthma burden

Asthma is a complex disease and thus is difficult to measure.  To understand the entire burden of asthma in a population means looking at all the asthma indicators.  See the “How is asthma measured?” section for more information.

Here on the CEHTP web portal, we present data on asthma hospitalizations and emergency department (ED) visits.  Of all the asthma indicators (with the exception of mortality), hospitalizations and ED visits are the most severe outcomes.  Thus, when interpreting the results presented here, the user must keep in mind that the data do not represent the entire spectrum of asthma burden in the California population, but rather, the data represent the more severe asthma cases.  




Purpose of data collection differs

Per legislation, hospitals are required to report patient data to the California State Office of Statewide Health Planning and Development (OSHPD).  The data are collected for purposes of assessing healthcare quality, services, and insurance coverage, and are not reported specifically for public health surveillance.  Although the data are being used for public health purposes, the limitations associated with how the data are collected must be kept in mind and, when possible, accounted for.

An example of such a limitation has to do with how a patient’s principal diagnosis gets coded.  Since the diagnosis codes are recorded by hospitals for reimbursement purposes, a code might have been different if it had been recorded primarily for public health surveillance.  Since we identify events based solely on the principal diagnosis code, we must keep this limitation in mind.




Limitations in geographic resolution

At this time, hospitals are not mandated to report patient addresses.  Thus the level of geographic resolution of the data is limited to state, county, and zip code.  Most of the data on the CEHTP web portal are county-level counts and rates. More local-level or neighborhood-level patterns of disease would be useful to users who want to locate “hot spots” in their neighborhood.


Zip code-level data are available on the CEHTP web portal only for asthma emergency department visits for 2009. There are many limitations to consider when using and interpreting zip code-level data:

  1. Zip codes are not geographic areas. While people may think of zip codes as defining communities or neighborhoods, zip codes were created to provide an efficient postal distribution and delivery network. While most zip codes are assigned to streets, sometimes a single building with large mail volume could have its own zip code. Thus any effort to assign a geographic area (i.e. polygon) to a zip code is an approximation at best. 
  2. Zip code populations are estimates. Populations assigned to a particular zip code are estimates as well. Because zip codes were created to facilitate mail delivery, there is no specific population associated with zip codes.  Zip code populations are estimated from the U.S. Census data using statistical methods, usually by commercial data vendors.  On the CEHTP web portal, the shape files used to create the map of the zip code-level data along with the population estimates used to calculate the rates are both from a private vendor.
  3. Zip codes can cross city or county boundaries. Zip code assignments are based on factors such as mail volume, geographic location, and topography, but not necessarily on city or other community boundaries.
  4. Zip codes may change from year-to-year and even within a year. Zip code assignments can change depending on mail delivery growth patterns or changing demographics. Thus aggregating zip code population estimates over time is not recommended. Additionally, looking at trends in zip code-level data from year-to-year should be done with caution. Depending on the source and time of zip code data, zip code population and area estimates may be different among data providers.




Potential for race/ethnicity misclassification

The concepts of race and ethnicity are difficult to define.  Although hospitalization, ED, and mortality data include information on individual-level race and ethnicity, we know that these data:

  • Are not necessarily recorded consistently
  • May not reflect people's self-identification of their race/ethnicity
  • May not capture their experiences with respect to discrimination, acculturation, or vulnerability to health problems 

When using race/ethnicity information from these data sources, these limitations should be kept in mind.

  • Counts and rates presented by other data providers may differ slightly based on how they classify and/or group race/ethnicity. 

In hospitalization and ED data:

  • Only one race/ethnicity is recorded for each patient, which does not allow for the identification of multiracial patients
  • Hispanics are classified based on the national origin of their family, even if they were born in the United States
    • If an individual is reported as “Hispanic”, regardless of the race reported, they are classified as “Hispanic” in the data

Thus, these are the broad race/ethnicity categories used with hospitalization and ED data:

  • Hispanic
    • Includes a person having origins in or who identifies with peoples of Mexico, Puerto Rico, Cuba, Central or South America or other Spanish-culture country
  • Non-Hispanic Asian/Pacific Islander
    • Includes a person having origins in or who identifies with peoples of Hawaii, Laos, Vietnam, Cambodia, Hong Kong, Japan, China, India, Taiwan, the Philippines, and Samoa
  • Non-Hispanic Black
    • Includes a person who identifies as an African American and/or who has origins in any of the black racial groups of Africa
  • Non-Hispanic White
    • Includes a person who identifies as a Caucasian and/or who has origins from Europe, North Africa, and the Middle East
  • Other
    • Includes any possible option not covered by the above categories including Native American, Eskimo, Aleut, and Unknown
