Establishing and communicating high standards, monitoring personnel and equipment performance, assessing the effectiveness of the maintenance program, and implementing improvements with an emphasis on individual accountability are essential steps toward strengthening the management of maintenance activities. To that end:
· Company and station managers should establish safety as the highest goal in all company endeavors. Individuals at all levels should include respect for safety in decision-making, and foster an environment that supports questioning attitudes. Policies, standards and procedures should be developed around the principle of safety first.
· Company and/or station managers should establish and reinforce maintenance standards that provide clear direction to maintenance personnel. Standards should clearly define maintenance objectives, expected performance levels, and responsibilities and accountabilities for maintenance activities. Standards for maintenance activities should be integrated into maintenance department policies and procedures. Maintenance standards should be reinforced in training.
· Department goals and objectives should be derived from company goals and objectives, and provide direction, establish standards, and foster continuing improvements.
· Company and maintenance management should effectively monitor and assess maintenance activities. Managers should motivate maintenance managers to observe the activities of workers in the field and initiate coaching or corrective action.
· Managers should continually assess the effectiveness of maintenance programs through a variety of techniques such as collecting and analyzing selected data, observing work practices in the field, and identifying root causes of maintenance-related problems. This assessment should address personnel and equipment performance and the effectiveness of processes. Maintenance department staffs should be trained to perform these types of assessment activities.
· Maintenance personnel should be held accountable for their performance. Effective feedback mechanisms for personnel performance, such as managerial coaching, performance appraisals, recognition and rewards, and disciplinary measures, should be established. Feedback should be actively solicited from all members of the maintenance organization.
· Maintenance managers should effectively manage change within the organization. Maintenance performance should be closely monitored to ensure changes have the intended effect and to make additional modifications, as necessary.
Managers should establish mechanisms to provide direction to personnel conducting maintenance activities. These mechanisms should employ both written and oral means and address the following aspects of management.
Maintenance managers should establish and maintain high standards of performance and ensure implementation of company and department policies that affect the achievement of these standards. Responsibilities for implementing these standards and policies, including the responsibilities of maintenance personnel, should be clearly defined. Although management sets the standards, it is important that workers are given an opportunity to help define them. Maintenance personnel must understand their authority, responsibility, and interfaces with other groups. Industry and station operating experience should be used to develop performance standards. Industry technical standards such as IEEE, ASME or ANSI documents, which normally provide scientifically developed and industry-accepted parameters for fulfilling technical performance criteria, may also provide a basis for some maintenance standards.
Department standards should also provide guidance for the development of other more definitive documents that govern maintenance activities such as policies and procedures. Methods should be specified for the controls necessary to develop, revise and implement department standards.
Clear lines of communication should be developed among departments and external groups that contribute to and support the maintenance function (for example, operations, construction or modifications, materials management, engineering, and training). The maintenance program should clearly depict the relationships among these supporting groups, as related to overall generation/station maintenance, by defining responsibility and authority and addressing organization and process interfaces. Communications protocols should be defined to ensure information across interfaces is transmitted accurately and efficiently. The degree of control should be sufficient to ensure accuracy, but informal enough to prevent stifling teamwork.
Long-range planning of maintenance programs is required to achieve high levels of equipment availability and reliability over the life of the plant. Resources can then be managed to support ongoing maintenance and continuous improvement of equipment performance and reliability. Activities that should be included in maintenance program long-range planning include, but are not limited to:
· company business planning
· coordination with California ISO
· recurring major maintenance items such as turbine overhaul, boiler inspections, transformer testing and major pump rebuilds
· planned maintenance outages
· major projects and modifications requiring maintenance organization involvement
· future organizational and staffing changes
· life cycle management, e.g., replacement of components that are projected to reach the end of their service life or become obsolete
· coordination of common resources used for outages with other plants
· contingency plans for system operations, environmental or industry issues and events that may impact the maintenance program
· contractor and company support
· personnel development needs
· audits and self-assessments to determine effectiveness of maintenance activities
Maintenance goals should be consistent with, and supportive of, company goals and serve to focus management and worker direction. Maintenance goals should be used as a management tool to improve generating asset performance. Examples of general goals related to maintenance include the following:
· personnel safety goals
· reduction in the number of unit trips caused by maintenance activities or maintenance preventable failures
· reduction in the number of start failures caused by maintenance activities or maintenance preventable failures
· reduction in the number of derates or ramp rate restrictions attributable to maintenance program deficiencies
· reduction in equipment deficiencies that adversely impact the operators' ability to effectively operate the plant
· decrease the number and duration of unplanned outages
· optimize timeliness of scheduled preventive and predictive maintenance activities
· optimize corrective and preventive maintenance backlog
· work delays by cause
Goals should be challenging but achievable. Actions to support the goals are determined with input from personnel involved in conducting maintenance activities. Additionally, the status of meeting goals is given frequent and wide dissemination.
Maintenance performance is monitored through observations of work activities, inspection and monitoring of equipment performance, and follow-up of corrective actions.
Managers should routinely monitor work in progress to determine ways to improve maintenance and verify maintenance activities are conducted in accordance with policies and procedures.
Good work practices are recognized and encouraged; improper work practices are corrected on the spot. Self-checking is reinforced. Causes of improper work practices are identified and corrected, and generic corrective actions are initiated as needed. Corrective actions to consider include clarifying expectations, holding workers accountable for their actions, and revising training programs. Examples of practices or conditions to be checked include the following:
· proper use of pre-job and post-job briefings (tailboards)
· industrial safety protection practices
· worker awareness and knowledge of the impact of maintenance on system/plant performance
· quality of workmanship, material, and parts
· use of and adherence to procedures and policies
· practices for foreign material exclusion
· use of correct tools for the job
· maintenance of clean and orderly work sites
· work progress and time required to perform the job, especially if time-critical maintenance is involved for equipment vital to plant operation
· work being performed on the correct component, system, and unit
· adequacy of turnover for work spanning multiple shifts
· adequacy of post-maintenance tests
· techniques for quality verification
· effectiveness and timeliness of communication of problems and delays encountered in critical activities
· worker knowledge and proficiency on maintenance being performed
Selected maintenance data is monitored and trended to identify performance barriers toward achieving maintenance goals and objectives. Periodic reports to management include trends, a brief explanation for trends that appear to be unusual (positively or negatively), and corrective measures where warranted. The number and nature of data to be monitored may be affected by the maintenance information management system. The following are examples of quantitative and qualitative measures that should be considered when developing a performance-monitoring program.
· number of equipment failures
· mean time between failures
· preventive maintenance tasks overdue
· number of overdue preventive maintenance tasks deferred with technical justification
· components and systems requiring corrective maintenance more than a designated number of times within a given interval
· components and systems with high unavailability or low reliability
· analysis reports of component performance that indicate failure rates greater than industry-wide averages
· historical equipment data that indicates high maintenance cost
· items not in stock on demand (percent of stock items not available on request)
· scheduled work requests delayed because of parts
· work requests in progress with material restraints
· quantity of discontinued and infrequently used inventory
· actual length of outage compared to scheduled duration
· amount of scheduled work not performed
· amount of unscheduled work added to outage
· evaluation of plant performance following the outage
· rework monitoring data, collected against a clear definition of rework
· corrective maintenance recurring within a specific period
· additional maintenance required during or following completion of maintenance activities, possibly involving the following:
    · incorrect re-assembly
    · damage to other components during maintenance
    · post-maintenance test failure
· trending of man-hours expended per work item, particularly repetitive tasks
· summaries of items scheduled versus items completed
· direct observation of work and identification of barriers to work productivity
· benchmarking to compare with similar size/age units
· analysis of trends in program strengths or weaknesses, communication skills, procedure adherence, and safe work practices, as indicated by personnel errors and their causes
· performance in reinforcing management expectations, as indicated by overall department performance or by maintenance program monitoring data and self-assessments
· monitoring of managers during the conduct of assigned tasks
· performance of workers assigned to an individual manager, as indicated by injury rate, personnel errors, rework, and productivity
· manager observations of maintenance and training activities and associated reports to management
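Several of the quantitative measures listed above reduce to simple arithmetic once the underlying data is available. The following is a minimal sketch in Python, not a prescribed method: the function names, argument names, and the choice of operating hours as the time base are assumptions made for illustration, and a station's maintenance information management system may define these measures differently.

```python
from typing import Optional


def mean_time_between_failures(operating_hours: float,
                               failure_count: int) -> Optional[float]:
    """Total operating hours divided by the number of failures.

    Returns None when no failures were recorded, because the ratio
    is undefined in that case.
    """
    if failure_count == 0:
        return None
    return operating_hours / failure_count


def stockout_percent(requests: int, not_available: int) -> float:
    """Percent of stock item requests that could not be filled on demand."""
    if requests == 0:
        return 0.0
    return 100.0 * not_available / requests


def outage_overrun_percent(scheduled_days: float, actual_days: float) -> float:
    """Actual outage length compared to scheduled duration, as a percent.

    Positive values mean the outage ran longer than scheduled.
    """
    if scheduled_days <= 0:
        raise ValueError("scheduled_days must be positive")
    return 100.0 * (actual_days - scheduled_days) / scheduled_days
```

For example, under these definitions a pump with 4 failures in 8,000 operating hours has a mean time between failures of 2,000 hours, and a 30-day scheduled outage that takes 36 days has a 20 percent overrun.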
Self-evaluation activities, including inspections, audits, reviews, and investigations, are necessary for an effective maintenance program. Self-evaluation activities should be balanced to provide the management team with a comprehensive view of past performance and identification of improvements needed to meet projected performance goals. The following four approaches should be considered when self-evaluations are conducted:
· reactive - conducted in response to a performance shortfall, such as root cause analyses of a critical component failure or an adverse trend in rework
· continuous - conducted on a routine basis to identify performance strengths and shortfalls; for example, manager in-field observations, post-job critiques, and accident prevention
· periodic - conducted on an event-dependent or periodic basis, such as a post-outage critique and scheduled program assessments
· proactive - conducted to identify improvements needed to move performance to levels that are above current expectations or to prepare for performance of an evolution; for example, benchmarking and infrequently performed tests
The following are proven successful assessment methods.
A comprehensive self-evaluation assesses the overall effectiveness of the maintenance program. Key attributes of successful comprehensive self-evaluations include the following:
· The self-assessment is a performance-based review of maintenance field activities that evaluates program implementation, rather than a programmatic review of maintenance procedures and policies for compliance with governing documentation.
· Sufficient resources, both personnel and time, are allocated for self-assessment activities. Unbiased input can be achieved by involving personnel from external organizations.
· An agenda is developed for the self-assessment with specific areas to examine and a clear definition of standards that are expected to be met in each area.
· Ownership is established for resolving issues developed in the self-assessment, with a specific time frame for resolution.
Specific elements of the maintenance program are evaluated to identify program strengths and to identify and correct deficiencies. Such reviews may be performed by personnel outside the maintenance organization and include input from maintenance managers as well as from groups such as operations, technical staff, and appropriate company departments. The evaluations address the overall effectiveness of program elements and inter- and intradepartmental coordination. Areas needing improvement are assigned for corrective action and follow-up. In addition, other work groups evaluate strengths for possible emulation. Examples of topics to be considered include the following:
· training and qualification of maintenance staff
· maintenance facilities and equipment
· planning of maintenance work
· scheduling of maintenance work
· post-maintenance testing
· conduct of on-line maintenance
· procurement of parts, materials, and services
· maintenance history
· trends in maintenance-related industry events
· results from inspections of maintenance activities at other facilities
· maintenance best practices as identified by industry support organizations such as the Institute of Nuclear Power Operations and the Electric Power Research Institute
Systematic analysis methods are used to determine causes of equipment and personnel performance problems or maintenance-related incidents. A threshold for selecting incidents that warrant root cause analysis is established. The initiation of root cause analysis may result from a management request, an adverse trend, or a desire for assistance in solving a specific problem. Analysis of human performance errors to address the organizational and environmental factors influencing individual behavior could help identify contributing factors to human performance errors. Incident reports, post-outage reviews, and other similar operating experience review methods supplement the maintenance history program and provide data, including human error data, to be reviewed by the analysis program.
Maintenance Problem Analysis includes the following elements:
Incidents that require root cause analysis are identified based on incident type and performance trends. Maintenance department management establishes the required threshold for conducting root cause analyses of maintenance incidents. Considerations in making this selection include the following:
· actual or potential consequence of the incident in relation to plant or equipment reliability, and personnel safety
· sequence of occurrences or multiple failures during the incident
· recurring maintenance and human performance problems or equipment failures
· unexpected conditions encountered during the incident
· previous corrective action taken for similar incidents
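One of the considerations above, recurring equipment failures, can be screened for automatically when corrective work orders are recorded with a component identifier and a completion date. The sketch below is one possible approach under stated assumptions: the function name, the (component, date) record layout, and the example threshold of more than two work orders in 90 days are all hypothetical, and the actual threshold is for maintenance department management to set.

```python
from collections import defaultdict
from datetime import date, timedelta


def recurring_components(work_orders, max_count, window_days):
    """Return the set of component IDs that had more than `max_count`
    corrective work orders within any span of `window_days` days.

    `work_orders` is an iterable of (component_id, completion_date) pairs.
    """
    by_component = defaultdict(list)
    for component_id, completed_on in work_orders:
        by_component[component_id].append(completed_on)

    window = timedelta(days=window_days)
    flagged = set()
    for component_id, dates in by_component.items():
        dates.sort()
        start = 0
        for end in range(len(dates)):
            # Advance the left edge until the span fits inside the window.
            while dates[end] - dates[start] > window:
                start += 1
            if end - start + 1 > max_count:
                flagged.add(component_id)
                break
    return flagged


# Hypothetical work order history used only to illustrate the call.
orders = [
    ("P-101", date(2024, 1, 5)),
    ("P-101", date(2024, 2, 1)),
    ("P-101", date(2024, 3, 10)),
    ("V-7", date(2024, 1, 1)),
    ("V-7", date(2024, 12, 1)),
]
candidates = recurring_components(orders, max_count=2, window_days=90)
```

A component flagged this way is only a candidate for root cause analysis; the other considerations, such as consequence and previous corrective action, still require judgment.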
Some factors that contribute to the success of a root cause analysis include the following:
· providing adequate time to investigate
· quarantining the area after an incident to prevent inadvertent loss of as-found information
· interviewing involved personnel as soon as possible after the incident while circumstances are still clear and perceptions have not formed that may rationalize away clues to the root cause
All relevant maintenance performance information is collected and analyzed, and actual or probable causes of a problem are evaluated as appropriate. Events or conditions not identified as warranting specific investigation for cause are trended to identify adverse performance trends. Adverse trends can then be investigated to identify apparent or root causes. Some proven techniques are available for analyzing information to determine causes of problems. Examples of these include the following:
· event and causal factor charting
· barrier analysis
· walk-through task analysis
· interviewing
· change analysis and fault-tree analysis
Regardless of the technique used, direct involvement by maintenance line managers and workers in this process is essential to achieve desired continuous improvements and buy-in by maintenance personnel.
To be validated, potential root and contributing causes must meet the following criteria in relationship to the problem:
· The problem would not have occurred had the causes not been present.
· The problem will not recur because of the same causal factors if the causes are corrected or eliminated.
Care should be taken not to limit analysis to merely addressing the symptoms of a problem. Symptoms may be causes in themselves, but more often they are only indications that need to be pursued to find the underlying causes.
Once causes have been identified, additional action is taken to verify that correction of these causes will prevent recurrence. Viable corrective actions should be identified for each cause. The following criteria can be used to determine viability:
· Will these corrective actions prevent recurrence of the condition?
· Is the corrective action within the capability of the organization to implement?
· Have assumed risks been stated clearly and evaluated appropriately?
Planned corrective actions should be assessed on the impact they will have on the causes and whether they meet the above criteria. They must also be assessed in terms of the impact they will have on other plant systems/components or organizations. Root causes of incidents frequently involve management issues. Therefore, management should be involved and willing to take responsibility for corrective actions related to management issues. Once corrective actions have been defined and have received management concurrence, they should be prioritized, scheduled, and tracked to timely completion. Interim compensatory actions may be required in cases where corrective actions will take a long time to implement.
The results of the problem analysis should be presented to appropriate management in sufficient detail to allow an understanding of the incident, its significance, the causes, and the recommended corrective actions. The same information should also be conveyed to appropriate personnel in a timely manner to help prevent recurrence, e.g., at tailboard sessions, shift relief or in training sessions. Lessons learned that might be of interest to other station departments should be identified, and an effective method of communicating them employed.
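Prioritizing, scheduling, and tracking corrective actions can be supported by a very simple record structure. The following sketch is illustrative only: the `CorrectiveAction` fields, the convention that priority 1 is the most urgent, and the `overdue_actions` helper are assumptions, not part of any required system.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CorrectiveAction:
    description: str
    owner: str                      # person accountable for completion
    priority: int                   # assumed scale: 1 is most urgent
    due: date                       # scheduled completion date
    completed_on: Optional[date] = None


def overdue_actions(actions: List[CorrectiveAction],
                    today: date) -> List[CorrectiveAction]:
    """Return open actions past their due date, most urgent first,
    and oldest due date first within the same priority."""
    late = [a for a in actions if a.completed_on is None and a.due < today]
    return sorted(late, key=lambda a: (a.priority, a.due))


# Hypothetical entries used only to illustrate the call.
actions = [
    CorrectiveAction("Revise pump rebuild procedure", "planner", 2, date(2024, 3, 1)),
    CorrectiveAction("Retrain crew on torque values", "supervisor", 1, date(2024, 4, 1)),
    CorrectiveAction("Replace gasket stock", "warehouse", 1, date(2024, 2, 1),
                     completed_on=date(2024, 1, 20)),
    CorrectiveAction("Update drawing", "engineer", 3, date(2024, 12, 1)),
]
late = overdue_actions(actions, today=date(2024, 6, 1))
```

A periodic report built from such a list gives management a direct view of which actions are not on track for timely completion.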
If a maintenance-related event recurs, the original condition or event, in addition to the new condition or event, should be reevaluated. Methods are developed for tracking and trending corrective action and cause information. The program should address the extent of condition or common causes among different organizations, processes or systems/components. The self-evaluation process should be used to determine causes that contribute to recurrence of the performance weakness. In the case of an equipment problem, post-maintenance testing and performance monitoring may be required to determine if additional maintenance work or diagnostics should be performed. Extended monitoring of equipment during various modes of operation may be necessary to provide assurance that the cause(s) have been properly corrected. Long-term follow-up as a part of Self-Evaluation is appropriate to determine if the desired results are obtained from corrective actions such as retraining, procedure changes, and preventive maintenance changes.
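Trending of the kind described above can start with something as simple as a least-squares slope over equally spaced periods. The sketch below is a minimal illustration, assuming monthly counts of a monitored item (for example, rework items per month); the function name and the idea of using a straight-line fit are assumptions, and a station may prefer control charts or other methods.

```python
from typing import Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Ordinary least-squares slope of `values` against period index 0, 1, 2, ...

    A positive slope means the monitored count is rising per period.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two periods are needed to compute a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator
```

For counts of 2, 4, 6, and 8 over four consecutive months, the slope is 2 per month. Whether a given slope counts as an adverse trend is a judgment that depends on the measure and on the goals set by management.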
Accountability for the effectiveness of the maintenance program should be clearly established. This includes ensuring clear understanding of performance expectations. This should include the expectations of managers, engineers, planners, craftsmen, warehouse personnel, and other personnel who support maintenance. A key element of personnel accountability is an environment in which feedback and communication are continuously encouraged. This environment supports the recognition of strengths and weaknesses and encourages participation in improvements.
Use feedback through such tools as performance appraisals to improve maintenance personnel performance. Accountability must include recognition of superior performance. Counseling, remedial training or disciplinary measures should be used to encourage personnel not meeting expectations, as appropriate.