U.S. energy efficiency programs: Lots of evaluation, little coordination
Suppose you are a regulator in a state or country new to energy efficiency programs and you want to design a set of financial incentives to encourage efficient appliances and equipment.
If you want to design something that generates significant energy and carbon emissions savings in a cost-effective manner, you have a number of decisions to make: What products and efficiency measures should you target? Should you offer incentives for very high-performing, super-efficient devices that may have a smaller market, or for more widely available but somewhat less efficient measures? How large should these incentives be? Should you offer incentives upstream (to manufacturers), midstream (to retailers), or downstream (to consumers)? Should you bundle the incentives with information and advertising, and if so, what is the best allocation of program resources? How do you need to vary your approach in different markets?
The U.S. appears, at first glance, to offer lots of evidence to help you make these choices. A search of the Database of State Incentives for Renewables and Efficiency yields 1,124 separate U.S. programs that offer some form of rebate for energy efficiency measures. Programs have existed since the 1970s, so you have a long history to draw upon. Moreover, utility demand-side management (DSM) programs, which make up the lion’s share of energy efficiency programs, are routinely evaluated. In fact, U.S. efficiency program administrators budgeted at least $181 million for DSM program evaluation in 2011.
Unfortunately, despite the many programs and the many millions of dollars spent evaluating them, there is less evidence on what works and what doesn’t than there could and should be.
Aggregation of U.S. DSM program evaluation findings is thwarted by two interrelated but distinct problems: inconsistency in measurement and inconsistency in reporting.
Inconsistency in Measurement
State utility regulatory commissions are the principal drivers of DSM program evaluation. Each state sets its own procedures for evaluation, and different state commissions set different rules, frustrating cross-state comparisons. At the 2012 ACEEE Summer Study, NRDC and Heschong Mahone Group presented striking differences across states in deemed savings values (pre-specified estimates of per-unit energy savings) for the same products — including those whose energy-saving impact should be nearly the same regardless of location, such as lighting.
Moreover, deemed savings values are — relatively speaking — the easy part. They estimate gross savings — the savings due to products purchased under the program, regardless of whether the program actually induced those purchases. Net-to-gross ratios, which estimate the fraction of purchases caused by the program, are likely an even greater source of variance. Variation in data availability for different programs leads to differences in methods to estimate net-to-gross ratios. However, even for a single program, different methods can produce quite different results, necessitating occasionally contentious judgment calls as to the true ratio.
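The arithmetic linking these pieces is simple even though the inputs are contested. The sketch below illustrates it with entirely hypothetical numbers; the function name and all values are made up for the example, not drawn from any actual evaluation:

```python
# Hypothetical illustration of how a deemed savings value and a
# net-to-gross ratio combine into a program's net savings estimate.
# All numbers are invented for the example.

def net_savings_kwh(units_rebated, deemed_kwh_per_unit, net_to_gross):
    """Gross savings = units rebated x deemed per-unit savings;
    net savings = gross savings x net-to-gross ratio."""
    gross = units_rebated * deemed_kwh_per_unit
    return gross * net_to_gross

# Say a program rebates 10,000 lamps, each with a deemed savings
# value of 30 kWh/year, and evaluators settle on a net-to-gross
# ratio of 0.6 (i.e., 60% of purchases were program-induced):
print(net_savings_kwh(10_000, 30, 0.6))  # 180000.0 kWh/year
```

The point of the illustration is that a dispute over the net-to-gross ratio — say, 0.5 versus 0.8 — scales the final savings estimate proportionally, which is why those judgment calls can be so contentious.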
Much of the measurement inconsistency could, in theory, be reduced by harmonizing methods and increasing data availability. In practice, this is much easier said than done. There are good reasons for some differences between states, while vested institutional interests and behaviors support existing evaluation practices. Also, estimating impacts of these programs is a fundamentally difficult task, as it requires the evaluator to divine what would have happened if the program had not occurred. Even in a best-case scenario, we have to live with some noise in evaluation results.
Still, with good reporting and aggregation, we could learn an awful lot from these results.
Inconsistency in Reporting
Unfortunately, we do not have good reporting and aggregation systems in place.
The Energy Information Administration (EIA) requires utilities to report their aggregate DSM spending and energy savings. These data are somewhat incomplete: They do not include third-party DSM efforts (third parties are not required to report) and the reporting definitions are unclear (for example, it is not clear whether gross or net savings are expected). While the spending data have been put to good use, many analysts have seen more promise in inferring the impact of programs using aggregate energy consumption data than in using the reported savings.
More to the point, to answer the questions posed at the beginning of this post, you would need program-level and incentive-level data. However, spending and savings data by utility (as reported by EIA) don’t generate many actionable insights for program design because a single utility may run dozens of programs and subprograms.
And, while the evaluation community has generated reams of program-level and incentive-level data, very little aggregation of program evaluation findings takes place across the 50 different utility regulatory commissions and hundreds of energy efficiency program administrators. Some states release evaluations on the internet, but there is little consistency across states (and sometimes within states) in the format of these evaluations. Except for a few regional efforts, no one aggregates these data across state lines.
The Missing Middle
Relative to most other government programs, efficiency measures are evaluated thoroughly. We have very good evidence that DSM programs on the whole are cost-effective relative to supply-side energy options. Moreover, program administrators and utility regulators have been reviewing and revising their own programs for years in light of evaluation findings, and have presumably learned a great deal about what works in their specific markets.
But, there’s a hole in the middle: It’s virtually impossible to use program evaluation data across jurisdictions to learn anything meaningful about what works best. This situation will continue to frustrate efforts to learn from these programs until inconsistencies in measurement and reporting are resolved.