More excuses from the Breakthrough Institute on data quality
The following is a joint post from Danny Cullenward and Jonathan G. Koomey.
______________________________________________________________
Dr. Harry Saunders, a Senior Fellow at the Breakthrough Institute, has responded to our criticism of his work on the rebound effect. For completeness we will address some new issues he has raised, but at this point our exchange has thoroughly covered the disputed territory. We stand by our concerns and encourage interested readers to review the published journal articles (Saunders, 2013; Cullenward and Koomey, 2016), our original summary post, our response to initial Breakthrough Institute comments at our website and on social media, and Dr. Saunders’ new essay.
Too little, too late
Considering the significant errors we identified in Dr. Saunders’ published article, his latest reaction is a remarkable exercise in deflection. As described in our original post, we have clearly demonstrated that his data did not match his methods:
Dr. Saunders’ data actually concern national average prices, not the sector- and location-specific marginal prices that energy economists agree are necessary to evaluate the rebound effect. The distinction is most important because actual energy prices vary widely by sector and location; in addition, economic theory asserts that changes in the marginal (not the average) price of energy services cause the rebound effect. As a result, Dr. Saunders’ findings of high rebound and backfire are wholly without support.
Despite acknowledging these significant methodological inconsistencies, Dr. Saunders continues to insist they do not matter and that the burden is on his critics to show that his results are invalid. While he is free to make the case for why the mistakes we identified don’t affect his results, it’s worth reminding readers that this is a discussion we should have been having five years ago—and certainly by the time Dr. Saunders published his 2013 article, which entirely ignored the concerns we had already raised with him.
The problems we identify call for far more than an additional caveat in Dr. Saunders’ work because they show his study lacks a valid empirical basis. This episode also counsels serious skepticism of rebound-related research from the Breakthrough Institute, which breathlessly promoted Dr. Saunders’ work as a “detailed econometric analysis” and “rigorous new methodology”;[1] “an important contribution to the study of rebound effects that fills a key void in analyses of rebound for producing sectors of the economy”;[2] a “fruitful path for further examination of rebound effects”;[3] and the very essence of intellectual modesty, with an “extensive discussion”[4] of cautions and limitations that “are rigorously illuminated by the author.”[5]
Yet when confronted with serious concerns about the empirical basis of the study—both privately within the expert community five years ago and publicly again this year—Dr. Saunders and his colleagues at the Breakthrough Institute doubled down on aggressive and unsubstantiated findings that fit their political narrative on rebound, not the facts.
A cursory look at the wrong data
Instead of acknowledging serious analytical errors, Dr. Saunders defends his results by asserting that, econometrically speaking, there is no difference between changes in national and state-level average energy prices. Because the variation in prices is supposedly the same at both levels, he claims his published results are unaffected by using national average prices in a model that estimates industry-specific rebound effects. He makes his case by analyzing EIA fuel price data for California, Texas, and the United States.
This effort falls well short of excusing his paper’s mistakes.
First of all, analyzing EIA price data doesn’t tell us anything about the validity of Dr. Saunders’ 2013 article because he used a completely different data set in that study. Again, Dr. Saunders’ paper relied on a data set from Professor Jorgenson, which, as we have repeatedly pointed out, is inconsistent with EIA’s more reliable data in the few instances where the categorization of the two data sets is even roughly comparable. On top of that, the Jorgenson data are explicitly constructed from non-EIA data sources. So how do patterns in the EIA data support Dr. Saunders’ approach?
Second, Dr. Saunders once again avoids confronting his model’s complete lack of primary data. Comparing state and national prices does not speak to the difference between national prices and the industry-specific prices Dr. Saunders incorrectly claimed his data provide. While it would be nice to see what difference correct data would make, there are no primary energy price data at the level of Professor Jorgenson’s industrial classifications, which approximate—but only roughly—the 2-digit SIC classification scheme.[6]
The lack of data reflects the fact that five energy-producing sectors in Professor Jorgenson’s data do not correspond well to real-world energy markets. For example, the data have a combined oil and gas extraction sector, which is assigned a single national price; however, the relationship between oil and natural gas prices in North American energy markets is far more complex than a single composite price index could reasonably represent over nearly five decades. Section 9 in our published article’s Supplemental Information reviews this and several related concerns in detail.
Third, Dr. Saunders makes a very limited case that EIA data show little difference in energy price variation at the state and national levels. For one thing, he presents detailed data for only two states, not fifty.[7] Perhaps more importantly, he compares trends across incongruous time periods. His published article runs a model over 45 years of data (1960-2005), but in his blog post, Dr. Saunders compares state and national data across different time periods for natural gas (1967-2014), distillate fuel oil (1983-2010), residual fuel oil (1983-2010), and electricity (1990-2014). This is hardly a firm basis for establishing a fixed relationship between price trends over a much longer period; and it is all the more problematic because the statistical match he reports is actually quite poor for electricity, a key fuel for price-sensitive, energy-intensive industries that have historically been located in areas with distinct electricity fuel mixes (e.g., hydropower in the Pacific Northwest or coal in the Rust Belt).
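To make concrete the kind of check that would actually speak to this question, here is a minimal sketch, using made-up placeholder numbers rather than EIA data: align a state series and the national series on a single common window and compare their year-over-year changes directly. A real version would pull the published EIA state and U.S. price series for a shared set of years and repeat the comparison for every fuel and state in the sample.

```python
# Minimal sketch of a state-versus-national price variation check.
# The two series below are invented placeholders, NOT EIA data; a real
# check would use published EIA prices for a common set of years.
import numpy as np

years = np.arange(1990, 2001)
national = np.array([6.1, 6.2, 6.3, 6.5, 6.4, 6.6, 6.7, 6.6, 6.5, 6.4, 6.8])  # placeholder prices
state    = np.array([4.0, 4.1, 4.0, 4.3, 4.5, 4.4, 4.2, 4.6, 4.8, 4.7, 5.1])  # placeholder prices

# Compare variation in year-over-year log price changes over the SAME window,
# rather than trends drawn from mismatched periods.
d_national = np.diff(np.log(national))
d_state = np.diff(np.log(state))

corr = np.corrcoef(d_national, d_state)[0, 1]
ratio = d_state.std() / d_national.std()
print(f"Correlation of annual log price changes ({years[0]}-{years[-1]}): {corr:.2f}")
print(f"Ratio of state to national price-change volatility: {ratio:.2f}")
```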
Finally, Dr. Saunders glosses over the significant problem of using average prices to study the rebound effect. He acknowledges that “[i]n a microeconomic sense, it is true that producer decisions depend on marginal prices rather than average prices.” But he claims his paper’s reliance on average price data is acceptable because his econometric model takes as input the change in prices, not absolute prices. Implicit in this claim is the rather bold assertion that variation in marginal and average prices is statistically equivalent—a proposition without any support whatsoever in either his blog post or paper. By using average prices, Dr. Saunders rejects the standard approach in microeconomics and thereby fails to distinguish between rebound effects and all other behavioral responses to energy prices.
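To see why that assertion cannot simply be assumed, consider a deliberately simple sketch with entirely hypothetical numbers (the two-block tariff and quantities below are invented for illustration and are not drawn from either paper): when a producer’s consumption sits in the upper block of a tiered tariff, the marginal price it faces can jump sharply while the average price it pays barely moves, so changes in the two series need not track one another.

```python
# Hypothetical two-block tariff, illustrating how average and marginal
# price *changes* can diverge. All numbers are invented for illustration.

def average_and_marginal_price(quantity, block1_qty, block1_price, block2_price):
    """Return (average price, marginal price) paid under a two-block tariff."""
    if quantity <= block1_qty:
        return block1_price, block1_price
    expenditure = block1_qty * block1_price + (quantity - block1_qty) * block2_price
    return expenditure / quantity, block2_price

# Year 1: 1,000 units consumed; first 900 units at $10, remainder at $15.
avg1, marg1 = average_and_marginal_price(1000, 900, 10.0, 15.0)
# Year 2: same consumption, but the upper-block price rises to $20.
avg2, marg2 = average_and_marginal_price(1000, 900, 10.0, 20.0)

print(f"Average price change:  {100 * (avg2 / avg1 - 1):.1f}%")   # about 4.8%
print(f"Marginal price change: {100 * (marg2 / marg1 - 1):.1f}%")  # about 33.3%
```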
As a result, Dr. Saunders’ response fails to address the data quality concerns we raised in our paper.
Garbage in, garbage out
Then there is the question of the theoretical validity of Dr. Saunders’ model, a topic our response article explicitly did not address (see footnote 6 on page 206 of our published paper). Dr. Saunders mistakes our silence as evidence that his model is unassailable:
In many ways, the Cullenward/Koomey critique of the Saunders article is reassuring. They have plainly taken a deep look at the analysis and, finding no methodological issues to criticize, were reduced to challenging the Jorgenson et al. dataset used in the rebound analysis.
Dr. Saunders finds a strange comfort in our criticism. We focused on data quality not for lack of other concerns, but because we are experts in U.S. energy data and knew from unrelated research projects that no primary data sources could support the paper’s analysis. If that isn’t a methodological criticism, we don’t know what is.
We were careful not to cast aspersions in our response article on those aspects of Dr. Saunders’ work we did not analyze in detail, including his model structure. Nevertheless, we aren’t convinced that his model is any more accurate than his data and reject the notion that our silence implies a failure (or even an interest) in finding problems with his model.
If anything, the errors we found in Dr. Saunders’ data suggest that those who examine his model will find problems there, too. But we need not address that issue because the inconsistencies we found in Dr. Saunders’ data are sufficiently grave to invalidate his conclusions. The first question any good analyst asks is whether the data can speak to the research question at hand. If they can’t, the details of model structure are irrelevant.
Hide-and-seek in peer review
Finally, we note that Dr. Saunders places great reliance on the fact that his 2013 article made it through peer review:
Cullenwald [sic] and Koomey simply complain that they raised concerns about problems with my data set at a Carnegie Mellon workshop in 2011. This is indeed the case. I subsequently published my analysis, and it passed peer-reviewed muster, because … there is no evidence that those concerns are particularly material to the conclusions of my analysis.
Unfortunately, the quality control mechanisms of peer review should give readers little comfort in this instance. Dr. Saunders did not disclose any data quality issues to reviewers, who were ill equipped to assess the issue as a result.
We are grateful that the journal Technological Forecasting & Social Change, which published Dr. Saunders’ 2013 paper, was also willing to publish our response article. With respect, however, TF&SC is not primarily an economics journal. For example, one of our anonymous reviewers requested we explicitly define commonly understood economics terms (such as the principal-agent problem) in order to better communicate with the journal’s readers, not all of whom are familiar with standard economic jargon.
It is hard for us to imagine that peer reviewers at an interdisciplinary journal with limited readership among economists would have been able to identify the detailed data concerns we raised with Dr. Saunders in 2011 but which he did not disclose in his submission. As our response article demonstrates, his published paper fundamentally misconstrues the nature of its own data sources—an inconsistency a peer reviewer would only discover if he or she took the exceptional effort to read the references listed in Professor Jorgenson’s data documentation, not merely Dr. Saunders’ factually incorrect description of his own methodology.
Presumably Dr. Saunders had not yet realized these mistakes when he submitted his paper to the journal, in which case he has absolutely no business citing peer review as validation on this point. But this benign interpretation makes sense only if Dr. Saunders completely discounted our warning that no primary data existed at the level of specificity his model required (as one of us (D.C.) presented at a Carnegie Mellon University workshop Dr. Saunders attended in July 2011, and as both of us discussed over lunch with Dr. Saunders and his colleague Jessie Jenkins in March 2011).
Alternatively, if at the time of journal submission Dr. Saunders knew (or reasonably suspected) his data didn’t match his model, it appears he withheld critical information from peer reviewers and misled the research community. Given the importance of the timing of Dr. Saunders’ realization, we would be grateful if he would clarify exactly when he realized that his data actually represent national averages, not industry-specific marginal prices.[8]
In light of the methodological inconsistencies we documented in Dr. Saunders’ work, we think the journal made the right decision to publish our peer-reviewed response article. To the extent Dr. Saunders believes the errors we documented don’t change his results, we would encourage him to make a full and complete rebuttal in the peer-reviewed economics literature.
_____________________________________________________________
References
[1] Jesse Jenkins, Ted Nordhaus, and Michael Shellenberger (2011). Energy Emergence: Rebound and Backfire as Emergent Phenomenon. Breakthrough Institute Report, page 16.
[2] Id. at page 16, footnote 13.
[3] Id. at page 32.
[4] Id. at page 32, footnote 32.
[5] Id. at page 31.
[6] As many economists know, the federal government stopped using SIC accounting in the late 1990s. It turns out that Professor Jorgenson never bridged the SIC and newer NAICS accounting structures, and therefore had to extrapolate the last five years of his KLEMS data because no government entity publishes data in the SIC structure he retained. See Cullenward and Koomey (2016), supplemental information at Section 7.
[7] Dr. Saunders also presents a graph of commercial natural gas price data for eight states and concludes that the visual pattern of variation across these states is comparable to variation in the national average price.
[8] While Dr. Saunders explicitly (and incorrectly) claimed to be using industry-specific energy prices, his paper never specified whether these were average or marginal prices. It is entirely possible that Dr. Saunders intentionally (as opposed to mistakenly) used average prices. Whatever the case, we believe the paper should have been explicit about its departure from the standard approach in microeconomics.