Performance Basis Protocol
Comment: Measure Installations: This currently requires Program Administrators to routinely provide the Energy Division with large amounts of information that is not necessary and will not be used: measure-level installation counts and associated program costs, every month, for every program, throughout the three-year program cycle. This requirement puts the overburdened Energy Division staff in the role of data intermediary between the Program Administrators and ED's contractors, leading to data overload and delays in data transmission. Over the last two years, we have learned that it is more efficient for the Program Administrators to provide data to each contractor directly, at the frequency and in the format that the contractor needs to do its work.
Commenter: SCE
Change made: Data for the evaluation contractors should come directly from the IOU, not via the CPUC; this will be checked and confirmed. ED does, however, want to know when a data request has been met and the data delivered. A sentence was added stating that the IOU will notify ED when the data is sent and that the request has thereby been satisfied.
Reason: This is a Performance Basis Protocol issue, not a how-to protocol issue, but data protection is now addressed in the protocol.

Comment: Net-to-Gross Ratio by Program Strategy: Add "and measure or end use, where appropriate." For example, net-to-gross ratios can differ for some major rebated measures or end uses.
Commenter: SCE
Change made: Added "measure and end use where feasible."
Reason: Agree with the comment, but replaced "appropriate" with "feasible."

Comment: Expected Useful Lives of Measures: While surveys for persistence evaluations cannot begin until (at best) early 2006, Joint Staff should consider quickly starting a study to get earlier and more robust estimates for the measures with the greatest uncertainty. The study should monitor retention among installers drawn from earlier program years, in order to obtain retention data over a longer period.
Commenter: SCE
Change made: Added discussion of timing considerations and of the need for studies to be conducted once enough data on failures is available. Noted that installations from past program years can be used.
Reason: This is a matter for Joint Staff to determine; Joint Staff will consider the comments when planning evaluation efforts. Some discussion of the matter was therefore included, but the protocols are not the place to provide a prescription.

Reporting Protocol
Comment: Final chapter of the Second Draft Evaluation Protocols, section on Information Needed from Administrators: As agreed in the workshop, this section should be moved to another chapter. A good placement would be near the end of the introductory chapter, since the data requirements may be useful for multiple types of evaluation and because the introductory chapter already ends with sections on confidentiality and customer contacts.
Commenter: SCE
Change made: A chapter on the data administrators need to provide was added. It is referenced in the front of the document, and the chapter was inserted after the reporting chapter.
Reason: To improve the flow of the document.

Comment: THE REPORTING REQUIREMENTS FOR PROGRAM AND PORTFOLIO EVALUATION ARE REASONABLE, WITH MINOR MODIFICATIONS

Pursuant to Decision 05-01-055, the Commission's Opinion on the Administrative Structure for Energy Efficiency, the Energy Division has for the past year worked with interested parties to develop Evaluation, Measurement and Verification (EM&V) policies, protocols, and reporting requirements for Commission consideration. Such EM&V activities are intended to measure and verify energy and peak load savings for each of the utility-administered programs and portfolios and to determine whether the portfolio goals are met. The draft EM&V reporting requirements are presented in The 2005 California Energy Efficiency Evaluation Protocols.[3] These protocols serve the following primary purposes: (1) they identify the information that program and portfolio administrators will need to have readily available to support their own evaluation efforts and those of the Joint Staff (CPUC-ED and the CEC) and their evaluation contractors, so that the evaluations can be successfully completed; (2) they identify the information that needs to be incorporated into the different types of evaluation reports; and (3) they specify how that information needs to be reported.[4] This section of the comments addresses these aspects of reporting. Overall, SCE agrees with the draft EM&V reporting requirements but provides minor inputs below for consideration.

The Ruling requests that commenting parties address the reasons for the recommended reporting requirements and why the specified data is or is not required. In the case of the draft Evaluation Reporting Protocols, the question of "why" applies both to the data that the program and portfolio administrators will need to have readily available for evaluators and to the information that the evaluators need to incorporate into the different types of evaluation reports.

The section of the draft Evaluation Reporting Protocols entitled "Information Needed from Administrators" details a long list of information to be requested from the program and portfolio administrators and provided to the evaluation teams. While the list is lengthy, to the extent that the requested data is available for each individual program, the list is reasonable. The list covers inputs that are generally collected for most of the programs and that, with sufficient time, can be developed and presented to the evaluation teams. It will be imperative, however, as stated in the draft Evaluation Reporting Protocols, that the evaluation contractors include in each evaluation plan a detailed description of the data that will be needed from the program administrators for the particular program.[5] This will provide the necessary notice to the program administrators for each individual program and ensure that the appropriate data is collected during the program year to facilitate measurement of program impacts.

On the frequency and method of requesting data from the program and portfolio administrators, the draft Evaluation Reporting Protocols propose that the administrators respond to data requests from the Energy Division to provide the necessary data for the evaluations. This seems appropriate. The requested information would be classified as data that the program administrators need to collect for access by the Commission, as opposed to standardized data to be submitted on a regular basis.

It would not be appropriate to develop any type of standardized report to be completed on a regular (e.g., monthly, quarterly) basis, since the data will not be required that often. It is more appropriate to meet the needs of the evaluation reports with data requests issued only as information is needed for individual evaluation reports.

In response to the question of "why" the requested information needs to be incorporated into the evaluation reports, SCE agrees with the requested data and the sample reporting tables included at the end of the draft Evaluation Reporting Protocols. The requested data and the proposed formats for presenting it will provide end-users with the information necessary to determine the success of the programs and portfolios. While the performance incentive mechanism has yet to be determined, the data requested for the reports should be able to integrate with any expected mechanisms.

SCE offers a few clarifying comments on the draft Evaluation Reporting Protocols. First, as agreed by parties at the December 13-14 Energy Efficiency Evaluation, Measurement and Verification Protocols workshop, the section on Information Needed from Administrators should be moved to another chapter. A good placement would be near the end of the introductory chapter, since the data requirements may be useful for multiple types of evaluation and because the introductory chapter already ends with sections on confidentiality and customer contacts. Second, the third paragraph of that section can be simplified by replacing the current text with the following: "It is expected that the administrators will respond to all evaluation data requests within 30 working days by providing as much of the requested information as possible, either information required by this protocol or supplemental information needed for the evaluation. Information should be provided in formats agreed upon by the administrators and the evaluation team leads. If this timeline cannot be met, the administrator will provide the requesting organization and the CPUC-ED an explanation of why the timeline cannot be met and will work with them to establish a mutually agreed-upon delivery timeline."

[3] The 2005 California Energy Efficiency Evaluation Protocols - Draft Evaluation Protocols, TecMarket Works Team, December 8, 2005.
[4] Id., p. 2.
[5] Id., p. 3.
Commenter: SCE
Change made: Added language specifying that data requests should go to the IOU and Joint Staff at the same time, not be funneled through the CPUC. There is no need for a "why" section in the data request. Also added that the evaluation contractor will work with the administrators to agree on when the data should be provided. Added a data response period and method consistent with the discussion at the workshop. The protocols now include reporting tables.
Reason: The decision was to send data requests to the CPUC-ED and the IOU at the same time. The data request does not need to specify why each data point is requested. Agree that the reporting tables should be placed in the protocols.

Comment: Reporting Timelines and Content: There was a significant amount of discussion of report timing and content at the protocols workshop. It would be very helpful to integrate report timing and content into the protocols, in a timeline format if possible. However, we have concerns about specifying a single timeline to apply to all program evaluations. Because of differences in program types and delivery, evaluation timing should be considered at the project level. The timing of an evaluation for a retail rebate program on CFLs may be very different from that for a residential new construction program, which may have virtually no realized savings until a year or two after program inception. One size does not fit all. Therefore, our recommendation is to include a timing guideline that allows the flexibility to maximize the usefulness of each report.
Commenter: RLW Analytics, Inc.
Change made: Added language specifying that evaluation reporting must consider the timing of the information needs. This consideration needs to be documented in the evaluation plan and included in the reporting schedule to ensure the information can be used in the planning cycle.
Reason: Agree with the comment.

Sampling and Uncertainty Protocol
Comment: Page 102, Evaluation Planning: In paragraph 1, we recommend referencing the Framework, pages 305-313, "Allocation of Resources to Evaluation." In paragraph 2, after the first sentence, we recommend referencing the Framework, pages 298-300, "Integrating the Results from Multiple Evaluation Studies."
Commenter: RLW Analytics, Inc.
Change made: The recommended changes have been made.
Reason: The referenced sections in the Framework will help the reader think through the issues raised in the Protocols.

Comment: The Protocols lack clear guidance as to how sample sizes should be assigned to programs in order to minimize the statistical uncertainty of the aggregated impact. When program impacts are aggregated to determine portfolio impacts, the resulting aggregate impact has a composite statistical uncertainty that depends on the uncertainty of the individual programs, the sample sizes of the individual programs, and the weights used when adding up the programs. The protocols could help reduce this composite statistical uncertainty by giving more guidance about the choice of sample sizes for individual programs. At present the protocols contain some instructions that can be interpreted as preliminary guidance on this matter, in Table 15 of the Sampling and Uncertainty chapter. For enhanced regression methodologies, the evaluators are expected to "conduct, at a minimum, a statistical power analysis as a way of initially estimating the required sample size." This sentence needs to be developed in more detail in order to be useful for reducing the composite statistical uncertainty of portfolio impacts. The following are some thoughts as to where such a development might go. As an example, energy savings program impacts are supposed to be added together to give the total impact of the programs on energy savings. The uncertainty of this total impact is a function of the uncertainty of the individual programs, the sample sizes devoted to the individual programs, and the weights associated with the individual programs when they are added up. If these weights are utility-specific, then programs of smaller utilities will be given less weight. To increase the accuracy of the final total impact estimate, it is statistically optimal to assign smaller sample sizes to programs that are given less weight; however, there has been no discussion of this in the protocols. Determining sample sizes through power analysis may be a way of achieving such efficiency in some instances. To explain: programs in smaller utilities will presumably have proportionally smaller impact goals based on smaller utility potentials, but may have disproportionately larger numbers of program participants than larger utilities. In such cases the required per-customer program effects need not be as large as for larger utilities, because these per-customer effects will be multiplied by disproportionately larger participant counts to reach the required program impact goals. Finally, the fact that the required per-customer program effects need not be as large leads to a reduction in the needed initial sample sizes for such programs, if one determines the sample sizes through power analysis (because the expected program effects will be significantly larger than the required program effects in these situations).
Commenter: DRA (Division of Ratepayer Advocates)
Change made: We have added: "It is also recognized that the targeted precision at the program level must be allowed to vary in ways that produce the greatest precision at the program group level. For example, in some cases accepting a lower level of precision for programs with small savings might allow for the allocation of greater resources to programs with larger savings, thus increasing the achieved precision for the program group."
Reason: This is a very complicated issue, and all the possible cases cannot be anticipated. The general guidance provided in the Protocol raises the issue, which can best be addressed in the evaluation planning process.

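To make the allocation logic concrete: minimizing the variance of a weighted portfolio total, Var = sum_i (w_i * s_i)^2 / n_i, subject to a fixed total sample yields the classical Neyman-type result n_i proportional to w_i * s_i, so lower-weight programs do receive smaller samples. A minimal sketch, with purely hypothetical weights, standard deviations, and sample budget (this is our illustration, not language from the Protocols):

```python
def allocate_samples(weights, std_devs, total_n):
    """Split a fixed evaluation sample budget across programs so that the
    variance of the weighted portfolio total, sum_i (w_i * s_i)**2 / n_i,
    is minimized. The optimum is a Neyman-type allocation with n_i
    proportional to w_i * s_i, so programs with less weight in the
    portfolio total receive smaller samples.
    """
    products = [w * s for w, s in zip(weights, std_devs)]
    scale = total_n / sum(products)
    return [max(1, round(p * scale)) for p in products]

# Hypothetical example: the third program belongs to a smaller utility
# and carries less weight in the portfolio total.
weights = [0.50, 0.35, 0.15]      # aggregation weights
std_devs = [120.0, 80.0, 100.0]   # per-participant kWh standard deviations
print(allocate_samples(weights, std_devs, total_n=900))  # [524, 245, 131]
```
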
Comment: Page 99, Sampling and Uncertainty Protocol: "Finally, the guidelines regarding sampling and uncertainty must be followed for each utility service territory. For example, precision targets, when specified for a particular level of rigor, must be set for each utility service territory." We recommend leaving the stratification decision at the program level, since in most cases statewide precision is optimal, but in some cases utility-level precision may be necessary, depending on the evaluation goal; i.e., we recommend deleting this sentence.
Commenter: RLW Analytics, Inc.
Change made: No change.
Reason: Joint Staff needs the Sampling and Uncertainty Protocol applied at the IOU level.

Comment: Page 101, Sampling and Uncertainty Protocol: We recommend adding a statement that the rigor levels are guidelines and that tradeoffs can be made, subject to the Evaluation Planning section on page 102.
Commenter: RLW Analytics, Inc.
Change made: No change.
Reason: Joint Staff wants these to be required levels, not guidelines. Joint Staff sets the rigor levels; if Joint Staff wants to approve a change, it can.

Comment: Page 102, Sampling and Uncertainty Protocol: The Evaluation Planning section contains a few overarching statements about sampling that are intended to, and should, take precedence in the Sampling and Uncertainty Protocol. Page 102: "It is also recognized that the targeted precision at the program level must be allowed to vary in ways that produce the greatest precision at the program group level." Therefore, we recommend moving the Evaluation Planning section (p. 102) to the beginning of the Sampling and Uncertainty Protocol, because it summarizes the basic relationship between the allocation of evaluation resources and sampling rigor by acknowledging that tradeoffs in precision may be desirable in order to maximize the reliability of the savings estimates. Furthermore, we recommend deleting the references throughout the Protocol (pages 35, 36, and 101) that specify a minimum sample size of 300 units for self-reported net savings, and instead allocating the sample to maximize the value obtained from the resources, to be determined at the program level, as already stated on page 102. Recommended language: "In the final plan, evaluation resources will be allocated in a way that maximizes the reliability of the savings and is consistent with cost-efficient evaluation, i.e., where evaluation resources are set and allocated at levels that maximize the value received from these resources."
Commenter: RLW Analytics, Inc.
Change made: No change.
Reason: It seems more appropriate to have the section "Development of the Evaluation Study Plan" follow the discussion of the impact, verification, and M&V tables. The minimum sample of 300 for estimating net-to-gross ratios using the self-report method was kept to reduce the chances of manipulating the results. However, the sample size can be adjusted when developing the evaluation study work plan for the reasons given by RLW. Any adjusted sample sizes must be based on the careful analysis suggested by RLW.

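For context on what the 300-unit floor buys: under a simple random sample and a proportion-like self-report estimator (our simplifying assumptions, not the Protocols'), the worst-case 90% confidence half-width at n = 300 is about 4.7 percentage points:

```python
import math

def half_width(p, n, z=1.645):
    """Normal-approximation confidence half-width for a proportion
    estimated from a simple random sample of size n (z = 1.645 gives
    90% confidence)."""
    return z * math.sqrt(p * (1.0 - p) / n)

# Worst case is p = 0.5; larger samples shrink the half-width as 1/sqrt(n).
print(round(half_width(0.5, 300), 4))  # 0.0475, i.e. about +/-4.7 points
```
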
Evaluation Identification and Planning
Comment: Page 8, under the heading "Use of the Evaluation Results to Document Energy Savings and Demand Impacts": PG&E recommends Joint Staff describe (to the extent possible) staff's recommended approach for aggregating the individual program evaluations into a portfolio-level estimate. At a minimum, Staff should state that they will use a bottom-up approach for aggregating to the portfolio level. This should include an explanation of how specific results from the Risk Assessment will contribute to defining which areas of the portfolio Staff will focus on to achieve the portfolio-level estimate.
Commenter: PG&E
Change made: Added language that Joint Staff will work on this, with or without contractor help, and will develop how it will be done. It will be a summing process of some sort, possibly a complex one in which reliability data are aggregated, or a simple summing. It does not need to be detailed in the protocol, as it is done outside the protocol process.
Reason: Agree it is an issue; text was added to clarify, but the final approach will be set outside the protocol.

Comment: Page 8, under the heading "The Evaluation Identification and Planning Process": PG&E recommends Joint Staff provide more discussion and detail regarding the risk assessment process: its stated purpose, the proposed methodological approach for assessing risk, and examples of expected output. Staff can use the Process Protocols handout as a start for this section. Going forward (after the Risk Assessment workshop), PG&E recommends Staff include a chapter on the methodology used, including a list of the input parameters used to develop the risk models.
Commenter: PG&E
Change made: This is not a protocol issue; no change was needed except to delete language in the protocols that refers to the risk analysis process in any way that could affect the approach. Note: Joint Staff have issued a report and held a workshop on this.
Reason: Falls outside the scope of these protocols.

Glossary
Comment: In Appendix B, page 166: PG&E recommends Joint Staff alter the definition of Participant as follows: "An individual, household, business, or other utility customer that received the service or financial assistance offered through a particular utility DSM program, set of utility programs, or particular aspect of a utility program as described in the program theory, program logic, and/or program description. Participation is determined in the same way as reported by a utility in its Annual DSM Summary."
Commenter: PG&E
Change made: Cleaned up the definition a bit, but we do not want to over-specify.
Reason: The protocol glossary needs to provide a generic definition that can be modified as needed in the detailed evaluation plan. The detailed evaluation work plan needs to identify what a participant and a non-participant are. This does not need to be further defined in the protocol.

Comment: Page 140, Glossary: We recommend adding definitions of ex-ante reported savings and ex-post reported savings.
Commenter: RLW Analytics, Inc.
Change made: Definitions consistent with the Policy Rules were added for these terms. The Protocol was checked for correct term use.
Reason: Agreed that clarification would be helpful.

Impact Protocol
Comment: The Protocols lack clear guidance on how savings will be attributed without double counting. In the upcoming three-year program cycle, audit programs will for the first time be treated as resource programs instead of information programs.[1] There is currently an unspoken rule about the allocation of energy savings between an audit program and a downstream rebate program: if a customer decides to follow the recommendations of an energy audit and invest in energy efficiency, the resultant savings are counted towards the audit program rather than the program that offers customer rebates. This allocation rule, however, has not yet been documented in the protocols. Leaving such issues on the table may result in either double counting of energy savings or unnecessary disputes among program implementers as to who "owns" the energy savings credit. Also, if it is decided that the energy savings will be counted towards the audit programs, program implementers will need to set up a data tracking system that allows them to identify which rebate claims were influenced by an audit recommendation. The question of how to allocate the energy savings credit without double counting will need to be discussed by all parties involved in the delivery of energy savings: utility administrators, utility partners (including local governments), and third-party program implementers. These parties need to continue to work together in a synergistic way to enable the utility administrators to meet their energy savings goals. ORA recommends that the assigned ALJ direct the utility administrators to take the lead in addressing this question through workshops or meetings as soon as possible. Following the workshop, the utility administrators would submit the meeting notes and any recommendations to the ALJ and Joint Staff for consideration and inclusion in the protocols.
Commenter: DRA
Change made: No edits required.
Reason: The Evaluation Protocols already require evaluators to ensure that no multiple counting occurs in reported evaluation results, as cited below. The comment on workshops and party interaction is a process issue outside the scope of "how to" evaluation protocols. Where the Protocols address potential multiple counting: page 29 says that all program managers must supply information on all programs and measures for each participant, and that all evaluators must use this information in the evaluations in a manner that ensures no double counting of gross savings. Page 40 says that a behavior evaluation that links to energy savings will not have the energy savings counted towards the portfolio unless Joint Staff find a method and determine that no double counting occurs. Page 82 calls for work looking across market effects evaluations and program-specific evaluations to ensure no double counting occurs. Pages 88 and 89 specifically refer to ensuring no double counting when using retailer data for market effects evaluations.

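As an illustration of the kind of cross-program tracking the page 29 requirement implies, a sketch in which each (participant, measure) pair is counted once; the field names and the rule of crediting whichever claim appears first are hypothetical choices for the example, since the Protocols leave the credit-allocation policy open:

```python
def deduplicated_gross_savings(claims):
    """Sum gross kWh savings over program claims, counting each
    (participant, measure) pair only once even when more than one
    program (e.g., an audit program and a downstream rebate program)
    claims it. The first claim encountered keeps the credit here; which
    program should keep it is a policy question outside this sketch.
    """
    seen = set()
    total = 0.0
    for claim in claims:
        key = (claim["participant_id"], claim["measure_id"])
        if key not in seen:
            seen.add(key)
            total += claim["kwh"]
    return total

claims = [
    {"participant_id": "C001", "measure_id": "CFL", "program": "audit", "kwh": 50.0},
    {"participant_id": "C001", "measure_id": "CFL", "program": "rebate", "kwh": 50.0},
]
print(deduplicated_gross_savings(claims))  # 50.0, not 100.0
```
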
Comment: The one issue most likely to be disputed is the study findings on net-to-gross ratios. DRA continues to support refining NTG ratios as part of the EM&V efforts overseen by Joint Staff. Indeed, an NTG ratio of 0.96 for nonresidential prescriptive rebates does appear to be artificially high. However, ex-ante NTG ratios should be used rather than ex-post NTG ratios to calculate the performance basis, while ongoing EM&V results should be used to update the ex-ante values. This will continue to provide feedback to program administrators for program design purposes without marked disruptions to their EE portfolio in the middle of a program year.
Commenter: DRA
Change made: No changes made.
Reason: Outside scope. The decision to conduct a true-up of the NTG ratios has been made by the Commission. The specific use of the evaluation findings regarding NTG ratios is not prescribed in the evaluators' protocols. The decisions made thus far pertaining to the use of findings regarding NTG ratios are covered in the Performance Basis Protocol.

Comment: The Protocols lack clear guidance on how program impacts are to be aggregated to determine portfolio impacts and portfolio cost effectiveness. During the workshop, a question was raised asking how program impacts are to be aggregated to determine portfolio impacts. Joint Staff responded that the program impacts will be summed to obtain the portfolio impacts, although it remains unclear what summation methodology will be used by Joint Staff or its consultant. ORA recommends that this be clarified as part of the Performance Basis Protocol.
Commenter: DRA
Change made: No changes made.
Reason: Outside scope. This was provided as an issue for the Performance Basis Protocol.

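For what a "simple summing" with uncertainty might look like, a sketch under the assumption (ours, for illustration) that program-level errors are independent, so standard errors combine as a root sum of squares; all figures are hypothetical:

```python
import math

def portfolio_impact(program_estimates):
    """Sum program-level savings estimates into a portfolio total and
    propagate uncertainty. Each entry is (savings_kwh, standard_error_kwh);
    independence across programs is assumed, so the portfolio standard
    error is the root sum of squares of the program standard errors.
    """
    total = sum(est for est, _ in program_estimates)
    se = math.sqrt(sum(s ** 2 for _, s in program_estimates))
    return total, se

programs = [(1200000, 60000), (800000, 90000), (300000, 45000)]
total, se = portfolio_impact(programs)
print(total, round(se))  # 2300000 117154
```
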
Comment: The Impact Evaluation Protocol: The Participant Net Impact Protocol (pp. 33-37) should withdraw the arbitrary requirement for sample sizes of 300 at Level I and Level II rigor. Such a requirement was one of the most problematic features of the 1990s Protocols. Just as with sample sizes for the accompanying gross savings estimation, the sample size here should be based on the overall energy savings of the program or measure and the number of, and variance among, participants. The evaluator should provide a recommended sample size with an accompanying precision estimate and a discussion of the potential for bias. The section should also be given a thorough review to revise questionable requirements, such as the requirement for power analysis (mentioned multiple times in the chapter and summary table), which is appropriate for hypothesis testing rather than for establishing the precision of estimates; the unclear use of evaluations of similar programs; and the treatment of partial free ridership. An editorial review could clarify wording and order and correct typographical errors.
Commenter: SCE
Change made:
1. Text was added describing the special challenges of survey-based NTG analyses due to construct validity issues and the frequent mix of quantitative and qualitative data. These challenges do not allow for a requirement that can ensure a consistent rigor level through sample size alone; this resulted in the sample requirement of 300, or a census of decision-makers, whichever is smaller. Text was also added specifying that the evaluator may propose an alternative to the 300 sample size requirement in the evaluation plan, with justification that addresses all the issues presented by the aggregation of variances in the proposed methodology.
2. The role of power analysis was clarified: it is required only in the evaluation planning process, as one input among others (including past related evaluation studies and professional judgment). More explanation of using power analysis to estimate a required sample size was added, including references and an appendix with further detail and additional software and literature references.
3. A professional editor working in the energy efficiency field edited the protocols.
Reason:
1. The sample size requirement for the survey-based NTG methodology remains 300 or a census (whichever is smaller), because issues with the aggregation of variances, construct validity, and the combination of quantitative and qualitative information did not allow for an alternative methodology for sample size requirements that would ensure consistent rigor levels.
2. Power analysis remains a requirement for determining sample sizes in regression-based approaches (including regression, logistic/discrete choice regression, and ANCOVA).

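As a sketch of the power analysis step required for regression-based approaches, using the statsmodels library; the effect size, alpha, and power targets below are hypothetical planning inputs, not values prescribed by the Protocols:

```python
import math
from statsmodels.stats.power import TTestIndPower

# Solve for the per-group sample size needed to detect an assumed effect
# size with 80% power at alpha = 0.05 (two-sided). In planning, the
# effect size would come from prior evaluations or engineering estimates,
# and the result is one input among others to the final sample design.
n_per_group = TTestIndPower().solve_power(effect_size=0.3, alpha=0.05, power=0.8)
print(math.ceil(n_per_group))  # 176 per group for a two-sample comparison
```
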
Comment: Page 17, Impact Evaluation Protocol: We note that page 105 contains an appropriate discussion of acceptable sampling methods; however, there is no corollary discussion of acceptable parameter estimation methods in the Impact Evaluation Protocol. (The Gross Energy/Demand Evaluation Allowable Methods do not address a range of parameter estimation methods.) Therefore, we recommend the simplest addition: a sentence stating that "generally accepted statistical methods can be used for parameter estimation from sample data." This sentence could be added throughout the Impact Evaluation Protocol, but at a minimum it should be included in the introductory paragraph of the Energy and Demand Impact Protocols.
Commenter: RLW Analytics, Inc.
Change made: Text was added to allow generally accepted statistical and engineering methods for parameter estimates. Additional text was also added to define generally accepted methods.
Reason: Additions were made to incorporate the suggested change, with definitions to support rigor in what is acceptable.

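As one example of a generally accepted parameter estimation method for sample data, a sketch of the classical ratio estimator applied to paired evaluated-versus-tracked savings; the data and the framing are our hypothetical illustration, not an excerpt from the Protocols:

```python
import math

def ratio_estimate(evaluated, tracked):
    """Classical ratio estimator: estimate the population ratio of
    evaluated savings to tracking-system savings from paired sample data,
    with an approximate standard error (finite population correction
    omitted for simplicity).
    """
    n = len(evaluated)
    r = sum(evaluated) / sum(tracked)
    x_bar = sum(tracked) / n
    s_e2 = sum((y - r * x) ** 2 for y, x in zip(evaluated, tracked)) / (n - 1)
    se = math.sqrt(s_e2 / n) / x_bar
    return r, se

evaluated = [95.0, 110.0, 88.0, 102.0, 97.0]   # hypothetical verified kWh
tracked = [100.0, 120.0, 90.0, 100.0, 105.0]   # kWh claimed in tracking system
r, se = ratio_estimate(evaluated, tracked)
print(round(r, 3), round(se, 3))  # 0.955 0.019
```
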
Comment: Spillover effects, whether participant spillover, non-participant spillover, or however they are renamed, should not be counted towards net savings in the evaluation of the performance basis. During the workshop, utility representatives proposed introducing new terminology to bypass the Commission decision on discounting "spillover effects" in the calculations of cost effectiveness and performance basis.[2] As defined in D.05-04-051, "spillover" is the effect of a program in inducing other customers to invest in energy efficiency even without a program incentive. There can be two types of spillover: (1) participant spillover, whereby a participant in a rebate program ("Customer A") decides to invest in additional energy efficiency (EE) measures that do not provide customer incentives; and (2) non-participant spillover, whereby an individual or business ("Customer B") decides to invest in energy efficiency without claiming the associated rebates (a decision that may be influenced by word of mouth, local information programs, and/or statewide marketing and outreach programs). Both types of spillover are similar to free ridership,[3] where the participant would have made the EE investment regardless of the program rebate or the program's existence. The Commission has already determined that energy savings associated with free riders will be excluded from the net savings and cost effectiveness calculations (using the net-to-gross ratio). Hence, ORA recommends that the Commission maintain its position of discounting "spillover effects" in the calculations of cost effectiveness and performance basis and disallow the use of any new terminology to replace spillover. ORA further cautions that should the Commission decide to count non-participant spillover effects, the evaluation of such effects would be a very costly undertaking: the study scope would need to cover not only program participants but the entire universe of utility and municipal-utility customers. It would also be difficult to attribute the energy savings to each and every program that might have influenced the customer.
Commenter: DRA
Change made: Text was added to support the Commission decision that no type of spillover, participant or non-participant, will be used for the performance basis. Impact evaluations are required to measure participant spillover, but not non-participant spillover, for completeness in evaluation results. They are also required to report savings net of free ridership but not including any type of spillover, so that the reporting supports the performance basis. Clarifying wording concerning spillover versus free ridership was added to the Impact, M&V, and aggregation sections of the protocol and to the Reporting Protocol.
Reason: The evaluators' protocols do not directly address the performance basis, as this is done in the Performance Basis Protocol. However, wording was changed and tightened to ensure that evaluation results net of free ridership (with no inclusion of participant or non-participant spillover) are derived and reported.

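The reporting rule described here reduces to a one-line calculation; a sketch with hypothetical figures, assuming free ridership is expressed as a rate between 0 and 1:

```python
def reported_net_savings(gross_kwh, free_ridership_rate):
    """Net savings as reported for the performance basis: gross savings
    discounted for free ridership only. Participant spillover is measured
    for completeness and non-participant spillover is not measured, but
    neither type is added back into this figure.
    """
    return gross_kwh * (1.0 - free_ridership_rate)

print(reported_net_savings(1000000, 0.25))  # 750000.0
```
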
Measurement and Verification
Comment: Page 51, Key Metrics, Inputs, and Outputs (first sentence): "M&V studies, since they are directed by the Impact Evaluation Protocol..." We recommend adding "and/or the Process or Market Effects Protocol."
Commenter: RLW Analytics, Inc.
Change made: Change made by adding the recommended words.
Reason: Agree with the comment.

[1] During the 1994-1997 period, audit programs were under the program category of energy management services, with non-energy goals tied to their performance.
[2] In D.05-04-051, the Commission denied PG&E's request to count "spillover effects" in the calculations of cost effectiveness and performance basis.
[3] As defined in the Common Energy Efficiency Terms and Conditions in Appendix B of D.05-04-051, free riders are "customer[s] who would have installed the program measure or equipment even without the financial incentive provided by the program."