Is Your Power Plant Headed for a HILP?
How to Avoid, Detect and Mitigate High Impact – Low Probability (HILP) Events
HILP events are events that do not happen often but, when they do occur, can cause extended unplanned outages. HILPs include catastrophic events such as turbine water induction, boiler explosions and major fires, generator winding failures and many other types of events. I have heard HILPs referred to as “first time events,” but while a specific type of HILP event might not have occurred at your plant previously, it is very likely to have occurred at another similar plant. Some companies have established successful HILP reduction programs using data from the North American Electric Reliability Corporation’s (NERC) Generating Availability Data System (GADS), which contains event data from 1982 for over 7500 units of all technologies. Recently I, along with Mike Curley, formerly with NERC’s GADS Services, and Scott Stallard, Vice President of Black and Veatch, wrote a technical paper describing in detail how to establish a process to benchmark your plant’s unreliability due to HILPs and then create a HILP reduction program to identify ways to avoid, detect and mitigate HILP events. I will be happy to provide a copy upon request.
A generating unit’s Forced Outage Rate (FOR) and Equivalent Forced Outage Rate (EFOR) can be thought of as consisting of two types of events:
- Events that are expected (they have previously happened with some degree of regularity) and cause the unit to have short or medium outages or deratings.
- Events that are unexpected and cause extended duration outages or deratings.
By separating these event types and calculating and benchmarking their impacts on your plant’s FOR and EFOR, you can gain a new perspective for prioritizing your efforts to identify solutions.
As an example, consider the historic annual FORs of two units shown in the table above (for simplicity this example considers only FOR, but EFOR could also be used). While both units have averaged a 10% FOR, the types of events making up their FORs are very different.
Reliability data for Unit A shows many short to medium duration events, so the focus for improvement should be on reducing the frequency or duration of these types of events.
Reliability data for Unit B, however, tells a much different story. Here the unit had far fewer short to medium duration events (~60% fewer), so that most of the time its FOR averaged only 4%. However, it experienced one major forced outage event (a HILP) that caused the unit to be out of service for 3 weeks. When both types of events were included in the FOR calculation, the annual average FOR was 10%. It should be clear from this reliability data that the failure modes for the two units are very different and therefore your investigation focus should probably be different for the two. For Unit B it might be best to implement a formal HILP reduction program using the steps we described in our technical paper (below):
It is very important that you initially select a peer group that balances two needs: the units in the peer group should closely match your unit in the design and operating characteristics that most strongly influence its reliability, yet the peer group must remain large enough for statistical validity (we normally require a population of at least 30 units). This topic will be covered in future case studies that I will publish on this website.
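The trade-off between a close match and an adequate population can be sketched as a filter that relaxes matching criteria until at least 30 units remain. This is only an illustration under stated assumptions: the record fields (`unit_type`, `duty`, `fuel`), the `select_peer_group` helper, the sample population and the order in which criteria are dropped are all hypothetical and are not part of pc-GAR.

```python
MIN_PEERS = 30  # minimum population size cited in the text


def select_peer_group(units, criteria, min_peers=MIN_PEERS):
    """Return (peers, criteria_used).

    `criteria` is an ordered list of (field, value) pairs, most important first.
    The least important remaining criterion is dropped until enough units match.
    """
    active = list(criteria)
    while True:
        peers = [u for u in units if all(u.get(f) == v for f, v in active)]
        if len(peers) >= min_peers or not active:
            return peers, active
        active.pop()  # relax the least important remaining criterion


# Hypothetical population, for illustration only.
units = (
    [{"unit_type": "subcritical", "duty": "base", "fuel": "coal"}] * 12
    + [{"unit_type": "subcritical", "duty": "base", "fuel": "gas"}] * 25
    + [{"unit_type": "supercritical", "duty": "base", "fuel": "coal"}] * 10
)
criteria = [("unit_type", "subcritical"), ("duty", "base"), ("fuel", "coal")]

peers, used = select_peer_group(units, criteria)
print(len(peers), used)
```

In this made-up population only 12 units match all three criteria, so the fuel criterion is dropped and 37 units remain. In practice you would decide which characteristic to relax based on how strongly it influences reliability, not on list order alone.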
After calculating the FORs for your unit and the units in the peer group due to all event types, you will need to decide on the minimum duration for a HILP. Since there is no standard industry definition, you are free to select any duration you wish. Keep in mind that the longer the duration you select, the fewer events will meet the criterion. In fact, you might want to consider doing the activity in stages, starting with a long minimum duration (say, 3 months) and reducing it to 1 month and perhaps finally to 1 or 2 weeks.
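The staged approach can be sketched by counting how many events pass each candidate minimum duration. The outage durations below are hypothetical, and treating 3 months as 90 days and 1 month as 30 days is my own simplification.

```python
HOURS_PER_DAY = 24

# Hypothetical full forced outage durations in hours, for illustration only.
outage_hours = [6, 30, 75, 200, 400, 900, 2500]

# Candidate minimum HILP durations in hours, longest first.
stages = {
    "3 months": 90 * HOURS_PER_DAY,
    "1 month": 30 * HOURS_PER_DAY,
    "2 weeks": 14 * HOURS_PER_DAY,
    "1 week": 7 * HOURS_PER_DAY,
}

# Count the events that exceed each candidate minimum duration.
counts = {
    label: sum(1 for h in outage_hours if h > limit)
    for label, limit in stages.items()
}

for label, n in counts.items():
    print(f"{label:>8}: {n} event(s) qualify as HILPs")
```

With this sample, the count grows from 1 event at the 3 month stage to 4 events at the 1 week stage, which shows how quickly the HILP population changes with the definition.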
With your HILP duration criterion set, you can then use NERC’s pc-GAR-MT computer program to determine the peer group’s number of full forced outage hours for events with durations greater than your HILP duration criterion and calculate the FOR due to HILPs. You can then benchmark your unit’s FOR(HILP) against the peer group’s FOR(HILP) distribution to determine how large your HILP problem is.
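As a rough sketch of this benchmarking step, the code below computes one unit's FOR(HILP) and locates it within a peer distribution. The helper names, the sample numbers and the formula used for FOR(HILP) (HILP outage hours divided by HILP outage hours plus service hours) are my assumptions for illustration; pc-GAR-MT itself is not called here.

```python
def for_hilp(outage_hours, service_hours, min_duration):
    """Return FOR due to HILPs in percent.

    Only full forced outages longer than min_duration (hours) are counted.
    Assumed formula: HILP hours / (HILP hours + service hours).
    """
    hilp_hours = sum(h for h in outage_hours if h > min_duration)
    return 100.0 * hilp_hours / (hilp_hours + service_hours)


def percentile_rank(value, peer_values):
    """Return the percent of peer values less than or equal to value."""
    return 100.0 * sum(1 for p in peer_values if p <= value) / len(peer_values)


# Hypothetical unit: four full forced outages in one year, 7200 service hours.
unit_for_hilp = for_hilp([12, 40, 300, 500], service_hours=7200, min_duration=168)

# Hypothetical peer FOR(HILP) values in percent; a real study needs at least 30 units.
peer_for_hilp = [0.0, 0.5, 1.2, 1.6, 2.4, 3.0, 4.1, 6.5, 9.0, 12.0]
rank = percentile_rank(unit_for_hilp, peer_for_hilp)

print(f"FOR(HILP) = {unit_for_hilp:.1f}%, at or above {rank:.0f}% of peers")
```

A unit that ranks near the top of the peer distribution, as this hypothetical one does, is a strong candidate for a formal HILP reduction program.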
As a simple example, I selected a peer group of subcritical fossil steam units that are base-loaded and coal fired. When these criteria were input into pc-GAR for a five-year period, 529 units were found (if I had not selected coal as the primary fuel burned, there would have been 592 units in the population, and if I had not also selected base-loaded as a criterion, there would have been 1044 units). Running pc-GAR for the 529 units for a recent 5-year period, I got 2640.08 unit years of data, giving a mean Forced Outage Rate of 4.61%. The mean service hours were 7442 hours per unit year and the Full Forced Outage Hours averaged 360 hours per unit year.
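The mean FOR quoted above can be checked from the two hour totals using the usual definition of FOR as forced outage hours divided by the sum of forced outage hours and service hours. The variable names are mine.

```python
service_hours = 7442.0             # mean service hours per unit year
full_forced_outage_hours = 360.0   # mean full forced outage hours per unit year

# FOR = forced outage hours / (forced outage hours + service hours)
forced_outage_rate = 100.0 * full_forced_outage_hours / (
    full_forced_outage_hours + service_hours
)

print(f"FOR = {forced_outage_rate:.2f}%")  # FOR = 4.61%
```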
Next I ran NERC’s pc-GAR-MT software for the peer units previously identified and found that there were 22,644 full forced outage events, of which 21,849 had an outage duration of less than 1 week (168 hrs). The remaining 795 events had an average Time to Repair (TTR) of 398.18 hours per HILP event (with HILP defined as longer than 1 week). This resulted in a 1.59% FOR due to HILPs, or about 1/3 of the total mean FOR of the group, indicating that HILPs are a very significant part of this peer group’s average unreliability. If this peer group were similar to your unit being studied, you could then calculate your unit’s FOR(HILP) and benchmark it against this distribution of its peers.
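The arithmetic behind the FOR(HILP) figure can be reproduced from the numbers in this example. The formula below (HILP hours divided by HILP hours plus service hours) is my assumption; I use it because it reproduces the reported value, and the variable names are mine.

```python
unit_years = 2640.08
service_hours_per_unit_year = 7442.0   # from the pc-GAR run above
total_events = 22_644                  # all full forced outage events
short_events = 21_849                  # events shorter than 1 week (168 hours)
mean_hilp_ttr = 398.18                 # mean hours per HILP event
total_for_pct = 4.61                   # mean FOR of the peer group, percent

hilp_events = total_events - short_events
hilp_hours_per_unit_year = hilp_events * mean_hilp_ttr / unit_years

# Assumed formula: HILP hours / (HILP hours + service hours).
for_hilp_pct = 100.0 * hilp_hours_per_unit_year / (
    hilp_hours_per_unit_year + service_hours_per_unit_year
)
hilp_share = for_hilp_pct / total_for_pct

print(hilp_events, round(hilp_hours_per_unit_year, 1),
      round(for_hilp_pct, 2), round(hilp_share, 2))
# 795 119.9 1.59 0.34
```

So the 795 HILP events account for roughly 120 forced outage hours per unit year, or about a third of the group's mean FOR.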
The table below gives the make-up of the FOR(HILP) by system. While I chose to use NERC’s pre-defined system groupings of cause codes, you have the option to group cause codes in any way you choose. For example, you might want to consider only boiler tube leaks or even just superheater tube leaks.
As we can see for this example, the Boiler and Turbine systems are the leading contributors to FOR(HILP), followed by Generator, Balance of Plant and Other (External, Performance and Personnel Errors). Drilling down to individual cause codes or groups of codes can further define the problem areas of most significance for HILPs.
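A breakdown of this kind can be sketched by summing HILP outage hours per system and sorting the result. The event records and the `hilp_hours_by_system` helper are hypothetical; real GADS events carry cause codes that you would first map to whatever grouping you choose.

```python
from collections import defaultdict


def hilp_hours_by_system(events, min_duration):
    """Sum outage hours per system for events longer than min_duration.

    `events` is an iterable of (system, outage_hours) pairs.
    Returns a dict ordered from the largest contributor to the smallest.
    """
    totals = defaultdict(float)
    for system, hours in events:
        if hours > min_duration:
            totals[system] += hours
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical events, for illustration only.
events = [
    ("Boiler", 400), ("Turbine", 350), ("Boiler", 90),
    ("Generator", 200), ("Boiler", 300), ("Balance of Plant", 180),
]

by_system = hilp_hours_by_system(events, min_duration=168)
for system, hours in by_system.items():
    print(f"{system:<18}{hours:>8.0f} h")
```

The same function works for a finer grouping: pass individual cause codes or component names in place of system names.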
The following is a summary of ways to assess your unit’s susceptibility to various HILP events. Details can be found in the full technical paper, provided upon request.
You should try to identify a wide variety of options to reduce HILPs by either:
- Preventing the HILP event
- Detecting the HILP event or
- Mitigating the impact of the HILP
After identifying improvement options you should gather sufficient information to be able to forecast the impact of each option. Then an economic analysis should be performed to:
- Justify each option (is it cost effective? Yes or no)
- Time each justified option (is now the best time to implement?)
- Prioritize each option (given all your fleet’s needs, will this project be the best use of your company’s resources?)
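One simple way to sketch the justify and prioritize questions is with net present value (NPV): an option is justified if its NPV is positive, and justified options are ranked by NPV per dollar of cost. This is my own simplified illustration, not the method from our paper; the option names, costs, benefits, discount rate and horizon are all hypothetical, and the timing question is not modeled.

```python
def npv(cost, annual_benefit, years, rate):
    """Net present value of an up-front cost followed by equal year-end benefits."""
    return -cost + sum(annual_benefit / (1 + rate) ** t for t in range(1, years + 1))


# Hypothetical options: (up-front cost, expected annual avoided loss), in dollars.
options = {
    "Option A": (500_000, 200_000),
    "Option B": (300_000, 50_000),
    "Option C": (100_000, 60_000),
}
RATE, YEARS = 0.08, 5  # assumed discount rate and evaluation horizon

results = {name: npv(c, b, YEARS, RATE) for name, (c, b) in options.items()}

# Justify: keep only options with a positive NPV.
justified = [name for name, value in results.items() if value > 0]

# Prioritize: rank justified options by NPV per dollar of up-front cost.
ranked = sorted(justified, key=lambda n: results[n] / options[n][0], reverse=True)

print(justified, ranked)
```

Here Option B fails the justification test, and Option C ranks ahead of Option A because it returns more value per dollar spent even though its total NPV is smaller.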
Details of advanced ways to justify, time and prioritize HILP reduction options will be described in a future case study on the Evaluation Phase of a Performance Improvement Program.
The final step is to monitor the results of each implemented HILP improvement option and compare them to the expected results. You can also track your fleet’s FOR(HILP) trend over time. Then use the results from both successful and unsuccessful project implementations to improve the process.
Remember: HILPs Happen!
Keep in mind that no power plant is immune to HILPs. Your plant may be just recovering from a HILP or about to experience one. Certainly, the plant staff must respond to the “problems of the day,” but top-performing companies will find ways to devote some resources to seeking cost-effective ways to avoid, detect or mitigate HILPs. If HILP benchmarking shows that you currently have a HILP problem, you might consider starting a formal HILP reduction program soon. If you don’t have a HILP problem right now, you might make plans to ensure your fleet stays ahead of the game.
Addressing HILP causes and seeking solution options “before a HILP happens” is a proven way to move from a fire-fighting to a pro-active style of management, one of the key characteristics of top-performing generating companies.