Scorecard Methods
The Commonwealth Fund’s 2019 Scorecard on State Health System Performance evaluates states on 47 performance indicators grouped into four dimensions:
- Access and Affordability (7 indicators): includes rates of insurance coverage for children and adults, as well as individuals’ out-of-pocket expenses for health insurance and medical care, cost-related barriers to receiving care, and receipt of dental care.
- Prevention and Treatment (15 indicators): includes measures of receipt of preventive care and needed mental health care, as well as measures of quality in ambulatory, hospital, postacute, and long-term care settings.
- Potentially Avoidable Hospital Use and Cost (13 indicators, including several measures reported separately for distinct age groups): includes indicators of hospital and emergency department use that might be reduced with timely and effective care and follow-up care, as well as estimates of per-person spending among Medicare beneficiaries and working-age adults with employer-sponsored insurance.
- Healthy Lives (12 indicators): includes measures of premature death, health status, health risk behaviors including smoking and obesity, and tooth loss.
Disparities based on income
The Scorecard reports on performance differences within states associated with individuals’ income level for 19 of the 47 indicators. For each indicator, we measure the difference between rates for a state’s low-income population (generally less than 200% of the federal poverty level) and higher-income population (generally 400% or more of the federal poverty level). States are ranked on the relative magnitude of the resulting disparities in performance.
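The disparity calculation described above can be sketched as follows, using hypothetical rates for a single indicator; the state names and values are illustrative only, not Scorecard data.

```python
# Hypothetical rates for one indicator (e.g., percent of adults who went
# without care because of cost; lower is better), split by income group.
state_rates = {
    "State A": {"low_income": 24.0, "higher_income": 8.0},
    "State B": {"low_income": 15.0, "higher_income": 9.0},
    "State C": {"low_income": 30.0, "higher_income": 10.0},
}

# Disparity = rate for the low-income population minus the rate for the
# higher-income population; a larger gap means a larger disparity.
disparities = {
    state: rates["low_income"] - rates["higher_income"]
    for state, rates in state_rates.items()
}

# Rank states on the relative magnitude of the gap (smallest gap first).
ranked = sorted(disparities, key=disparities.get)
```

In practice this comparison is repeated for each of the 19 indicators with income-stratified data, and states are ranked on the resulting set of gaps.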
The following principles guided the development of the Scorecard:
Performance Metrics
The 47 metrics selected for this report span health care system performance, representing important dimensions and measurable aspects of care. Where possible, indicators align with those used in previous state scorecards. Several indicators used in previous versions of the scorecard have been dropped either because all states improved to the point where no meaningful variations existed (e.g., measures that assessed hospitals on processes of care) or the data to construct the measures were no longer available. New indicators have been added to the scorecard series over time in response to evolving priorities. See the box on page 21 for more detail on changes in indicators.
Measuring Change over Time
We were able to track performance over time for 45 of the 47 indicators. Not all indicators could be trended because of changes in the underlying data or measure definitions.
There were generally four to five years between an indicator’s baseline and current-year data observations, though the starting and ending points depended on data availability. We chose this relatively short time horizon to capture the immediate effects of changes in the policy and delivery system environment, such as coverage expansions under the Affordable Care Act and other reforms.
We considered a change in an indicator’s value between the baseline and current-year data points to be meaningful if it was at least one-half (0.5) of a standard deviation of the indicator’s combined distribution over the two time points — a common approach used in social science research.
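The half-standard-deviation rule can be sketched as follows; the state rates are made up for illustration, and the sample standard deviation is assumed here.

```python
import statistics

# Hypothetical rates for one indicator across five states at two time points.
baseline = [12.0, 15.0, 9.0, 20.0, 14.0]   # baseline-year rates
current = [10.0, 15.5, 8.0, 16.0, 13.0]    # current-year rates

# Pool both time points into one combined distribution and take half of
# its standard deviation as the threshold for meaningful change.
threshold = 0.5 * statistics.stdev(baseline + current)

# Flag states whose change exceeds the threshold in either direction.
meaningful = [abs(c - b) >= threshold for b, c in zip(baseline, current)]
```

Under this rule, small year-to-year fluctuations are treated as noise; only shifts large relative to the spread of observed rates count as improvement or worsening.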
To assess change over time in the disparity dimension, we count how often the disparity narrowed within a state, so long as there was also an improvement in the observed rate for the state’s low-income population.
Data Sources
Indicators draw from publicly available data sources, including government-sponsored surveys, registries, publicly reported quality indicators, vital statistics, mortality data, and administrative databases. The most current data available were used in this report whenever possible. Appendix B provides detail on the data sources and time frames.
Scoring and Ranking Methodology
For each indicator, a state’s standardized z-score is calculated by subtracting the 51-state average (including the District of Columbia as if it were a state) from the state’s observed rate, and then dividing by the standard deviation of all observed state rates. States’ standardized z-scores are averaged across all indicators within the performance dimension, and dimension scores are averaged into an overall score. Ranks are assigned based on the overall score. This approach gives each dimension equal weight, and within each dimension it weights all indicators equally.
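A minimal sketch of this scoring method is below, using two dimensions with two indicators each and three hypothetical states (the actual Scorecard uses 51 jurisdictions, 47 indicators, and four dimensions). For simplicity the sketch assumes a higher rate is better on every indicator; indicators where lower values are better would need their z-scores sign-flipped, a step omitted here.

```python
import statistics

# rates[dimension][indicator] -> list of observed rates, one per state
rates = {
    "access": {"insured_adults": [90.0, 85.0, 95.0],
               "dental_visit": [70.0, 60.0, 80.0]},
    "prevention": {"flu_shot": [50.0, 45.0, 55.0],
                   "mental_health": [40.0, 50.0, 45.0]},
}
n_states = 3

def z_scores(values):
    """Standardize: subtract the all-state average, divide by the SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Average z-scores across indicators within each dimension (equal weight
# per indicator), then average dimension scores into an overall score
# (equal weight per dimension).
dimension_scores = []
for indicators in rates.values():
    z_by_indicator = [z_scores(v) for v in indicators.values()]
    dimension_scores.append(
        [statistics.mean(z[i] for z in z_by_indicator) for i in range(n_states)]
    )
overall = [statistics.mean(d[i] for d in dimension_scores) for i in range(n_states)]

# Assign ranks: the highest overall score gets rank 1.
order = sorted(range(n_states), key=lambda i: -overall[i])
ranks = [order.index(i) + 1 for i in range(n_states)]
```

Because every indicator is standardized before averaging, percentages, dollar amounts, and population-based rates all contribute on the same scale.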
The z-score more precisely portrays differences in performance across states (as shown in Overall health system performance) than the simple ranking approach used in our scorecards prior to 2018. It is also better suited to accommodate the different scales used across scorecard indicators (e.g., percentages, dollars, and population-based rates). This method also aligns with methods used in the Commonwealth Fund’s international health system rankings.
As in previous scorecards, if historical data were not available for a particular indicator in the baseline period, the current year data point was used as a substitute, thus ensuring that ranks in each time period were based on the same number of indicators.
Regional Comparisons
The Scorecard groups states into the eight regions used by the Bureau of Economic Analysis to measure and compare economic activity. The regions are: Great Lakes (Illinois, Indiana, Michigan, Ohio, Wisconsin); Mid-Atlantic (Delaware, District of Columbia, Maryland, New Jersey, New York, Pennsylvania); New England (Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont); Plains (Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, South Dakota); Rocky Mountain (Colorado, Idaho, Montana, Utah, Wyoming); Southeast (Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, Mississippi, North Carolina, South Carolina, Tennessee, Virginia, West Virginia); Southwest (Arizona, New Mexico, Oklahoma, Texas); and West (Alaska, California, Hawaii, Nevada, Oregon, Washington).
Data in the 2019 State Scorecard are generally comparable with those in the 2018 State Scorecard. However, because of changes in indicators and methodology, rankings in these two scorecards are not comparable to those reported in previous scorecard editions.
Methodological changes from the 2018 State Scorecard
- The Centers for Medicare and Medicaid Services (CMS) changed the way it reports survey responses from the Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) in its Hospital Compare public use data. This change impacted one of the measures we used to evaluate hospitalized patients’ experiences. Rather than constructing our own composite of hospitalized patients’ experiences, as was done in the 2018 State Scorecard, we substituted a CMS-constructed composite summary of hospitals’ HCAHPS scores. This composite has a 100-point scale, with 100 points representing the highest possible patient experience summary score. We convert this into a measure suitable for use in the 2019 State Scorecard by calculating the share of hospitals in a state with HCAHPS patient experience summary scores lower than the national median.
- The 2018 State Scorecard reported deaths from suicide, alcohol, and drug overdose in a single composite, referred to as “deaths of despair.” In 2019, we report each component separately to better capture variations among states in the underlying causes of death.
- In previous years, the indicator measuring “high out-of-pocket medical spending relative to income” included over-the-counter drug costs. Since these costs were excluded from the most recent data (2016–17), we also removed them from our baseline estimate (2013–14) for comparability.
- Several indicators in the 2018 State Scorecard were grouped and reported for separate age stratifications within the same measurement construct (e.g., potentially avoidable emergency department visits among working-age adults and Medicare beneficiaries). An adjustment was made to down-weight each age group within the construct for scoring. The same data are used in the 2019 State Scorecard, but we no longer make the scoring adjustment. Sensitivity analyses indicate that this change has no impact on state rankings.
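The HCAHPS conversion described in the first bullet above can be sketched as follows; the hospital scores are hypothetical, and the sketch assumes the national median is taken across all hospitals pooled together.

```python
import statistics

# Hypothetical hospital-level HCAHPS patient experience summary scores
# (0-100 scale; 100 is the highest possible score), grouped by state.
scores_by_state = {
    "State A": [82, 75, 90, 68],
    "State B": [70, 65, 88],
}

# National median across all hospitals in all states.
all_scores = [s for scores in scores_by_state.values() for s in scores]
national_median = statistics.median(all_scores)

# State measure: share of the state's hospitals scoring below the median.
share_below = {
    state: sum(s < national_median for s in scores) / len(scores)
    for state, scores in scores_by_state.items()
}
```

A lower share indicates that more of a state’s hospitals meet or exceed the national median patient experience score.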