Preventive maintenance’s primary objective is to maintain the function of plant equipment. However, it is common knowledge that some maintenance tasks introduce inherent risk to the asset during their execution. This has resulted in asset-intensive industries relying more heavily on predictive maintenance and non-intrusive preventive maintenance tasks to mitigate equipment risk and extend asset life.
One successful example of this strategy is a centrifugal compressor system, where reliable operation of the asset heavily depends on the precise execution of maintenance tasks. High-risk and intrusive overhauls can be deferred with the successful execution and analysis of predictive maintenance and non-intrusive preventive maintenance tasks, such as vibration analysis, thermography and lubrication analysis. These tasks enable us to defer more intrusive tasks, with a higher degree of confidence, based on collected and analyzed data.
But what do facilities do when both the execution and the deferral of a preventive maintenance task result in inherent and unacceptable safety risk to personnel? This may seem like a fabricated scenario, but it occurs every day in modern industrial facilities. As our safety standards become more and more stringent, the level of risk defined as unacceptable continues to drop. This results in further dependency on our condition monitoring programs and speculation by the business that the risk can be mitigated offline.
However, the need for online maintenance is still present, and can catch facilities off guard if proper risk mitigation plans are not in place. API 581 provides guidance on how to define risk and likelihood of a failure and also establishes the appropriate inspection plan to mitigate it, but it does not provide guidance on how to mitigate the risks associated with preventive or corrective actions. We will examine a case study of an operationally and safety-critical vessel with documented corrosion under insulation (CUI) damage that emphasizes how these long-term risk mitigating techniques can be used to bridge the gap between condition monitoring and high-risk intrusive preventive maintenance tasks.
Step 1: Risk identification
Corrosion under insulation is a common failure mode for insulated steel vessels and is heavily documented in API inspection codes. It has become a recent focal point for many aging industrial facilities in the Gulf Coast region as these vessels are approaching the expected age of failure for this failure mode. The most common tasks to mitigate CUI have traditionally been low-cost condition monitoring. These condition monitoring practices focus on:
- Visual inspections to identify damaged or poorly installed insulation. Lack of insulation integrity allows for increased moisture ingress and likelihood of corrosion, but CUI can occur even where insulation appears to be in good condition.
- Ultrasonic thickness (UT) testing to measure remaining wall thickness of the vessel, typically utilizing pre-installed windows in the insulation. UT results are used to determine the remaining life of the vessel, based on a limited number of inspection points, which are not necessarily in areas with the highest likelihood of CUI.
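To make the remaining-life idea concrete, here is a minimal sketch of the calculation commonly used in pressure vessel inspection practice: remaining life equals the measured thickness minus the minimum required thickness, divided by the corrosion rate. The window names, thickness values, service interval and function names are hypothetical and are not taken from the case study.

```python
def long_term_corrosion_rate(t_initial_in, t_actual_in, years_between):
    """Average metal loss per year (inches/year) between two thickness readings."""
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (t_initial_in - t_actual_in) / years_between


def remaining_life_years(t_actual_in, t_required_in, corrosion_rate_in_per_yr):
    """Years until the measured wall reaches the minimum required thickness."""
    if corrosion_rate_in_per_yr <= 0:
        raise ValueError("corrosion rate must be positive")
    margin = max(t_actual_in - t_required_in, 0.0)
    return margin / corrosion_rate_in_per_yr


# Hypothetical UT readings (inches) taken through three insulation windows.
readings = {"window_A": 0.450, "window_B": 0.410, "window_C": 0.480}
t_initial = 0.500      # assumed original wall thickness, inches
t_required = 0.350     # assumed minimum required thickness, inches
years_in_service = 20  # assumed interval between the two readings

# The thinnest reading governs the estimate.
governing = min(readings, key=readings.get)
rate = long_term_corrosion_rate(t_initial, readings[governing], years_in_service)
life = remaining_life_years(readings[governing], t_required, rate)

print(f"Governing location: {governing}")
print(f"Corrosion rate: {rate:.4f} in/yr, remaining life: {life:.1f} years")
```

The estimate is only as good as the sample: if the worst CUI damage sits away from the inspection windows, the thinnest recorded reading will overstate the remaining life.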
CUI inspections, specifically designed to identify this failure mode, have historically involved strategic stripping of insulation and UT testing. While more advanced technologies, including radiography and pulsed eddy current techniques, can reduce cost, many sites are still reliant on more traditional “strip and inspect” practices.
Our specific case study will focus on a propylene concentrating tower operating in near ambient conditions along the Texas Gulf Coast. This vessel was identified as both operationally and safety-critical to the business, with the basic design and operational context data below:
- High concentration propylene gas and liquid with NFPA 704 Level 4 flammability risk.
- Operated at ~95% mechanical availability with no planned extended outages.
- Insulated stainless steel shell with carbon steel structural supports, platforms, etc.
- Roughly 225 feet tall by 15 feet in diameter with more than 10 platform levels.
During routine visual inspections, CUI damage was found on the carbon steel structural supports connected to the stainless-steel vessel shell. These supports, and other carbon steel components, were inspected using UT tests to ensure structural integrity was still intact for all platforms and ladders. These routine inspections also found multiple areas of damaged insulation and evidence of water ingress, leading to a high likelihood of additional areas of corrosion.
Given this critical asset’s specific operating context and current condition, additional inspections and corrective tasks posed several challenges:
- The vessel dimensions made traditional means of access by scaffold cost-prohibitive, representing nearly 15% of the unit’s maintenance budget. Site standards allowed for rope access inspections and minor insulation removal, but not for corrective task execution.
- Required availability targets of this unit meant that all inspections and repairs would need to occur during operation. Logistically, this would require insulators, inspectors and welders to be mobilized in alternation, as the entire tower was required to be at least 90% insulated.
- Advanced inspection technologies, such as radiography, were limited in usefulness due to the size and design of the vessel. Due to the high safety risk of a wall failure, these technologies could not provide an adequate level of confidence.
This specific case resulted in a highly challenging engineering and maintenance dilemma, which required more than 18 months of planning and preparation to successfully solve.
Step 2: Risk mitigation planning
Because the initial tasks were the result of a failure modes, effects and criticality analysis (FMECA), a comprehensive risk mitigation plan was created prior to any further actions being taken. This plan was multifaceted, developed to mitigate the financial and safety risks posed by this failure mode, and can be used as a benchmark for routine maintenance risk mitigation planning within the process industries.
The initial actions prompting this planning were the result of a scheduled FMECA, more than three years in advance of the inspections and repairs. This long-term approach to risk mitigation enabled the organization to appropriately prioritize this risk against other, and oftentimes more immediate, risks on site. Following best practices, the FMECA follow-up actions were budgeted for the appropriate year, and engineering began to put together proposals.
Whereas Capital Expenditure (Cap/Ex) projects have a comprehensive stage-gate review process, typical Operating Expense (Op/Ex) projects have a much simpler approval and planning process. Despite these actions being classified as Op/Ex costs, these proposals included the key business and risk information required to make a data-driven decision, consistent with Cap/Ex planning:
- Feasibility reviews of multiple technologies and execution methods to determine a likelihood of success. For example, radiography inspections on this vessel had a much lower probability of detecting a defect than a visual and UT inspection. These success rates were taken into account for each proposed job package.
- Total cost of each proposed method, including all preparation, planning, material, man-hours and production impacts. Since these vessels did not have planned shutdowns of sufficient duration, an analysis of the total cost of a shutdown and inspection was required. This allowed the organization to weigh the costs and benefits of a more efficient execution during a shutdown against continued operation of the unit.
- Safety risks associated with each proposed mitigating action. Since new technologies and access methods were evaluated, including welding from rope access on an operating unit, these safety evaluations and mitigation plans were essential to successful execution.
Due to the cost and scope of these mitigating actions, these engineering analyses were translated into business proposals through a 10-year Life Cycle Cost Analysis (LCCA). The LCCA enabled the organization to evaluate technical packages in terms of business risk. The two primary proposals, outlined in the chart below, included:
Ultimately, after extensive review and revision, the organization decided to invest in new equipment and engineering required to safely execute the rope access proposal. At this point, the technical and business proposal was converted into a detailed execution job plan. More consistent with standard turnaround planning practices rather than what is typical for routine maintenance, this execution package was planned more than 12 months in advance to allow for appropriate funding and fabrication of custom safety systems.
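As an illustration of how a 10-year LCCA can put competing job packages on a common business footing, the sketch below discounts each option's upfront cost, recurring maintenance cost and expected annual cost of failure to a single present value. Every figure, the 8% discount rate, and the names `option_a` and `option_b` are hypothetical assumptions for illustration only. They do not represent the two proposals or the actual numbers from this case study.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    """One candidate job package. All field values used below are assumptions."""
    name: str
    upfront_cost: float                # engineering, equipment, access, repairs
    annual_maintenance: float          # recurring inspection and upkeep
    annual_failure_probability: float  # residual likelihood of failure per year
    failure_consequence: float         # cost of a failure, incl. lost production

    def annual_costs(self, years: int) -> list:
        # Expected yearly risk cost = probability of failure x consequence.
        expected_risk = self.annual_failure_probability * self.failure_consequence
        return [self.annual_maintenance + expected_risk] * years


def life_cycle_cost(upfront, annual_costs, discount_rate):
    """Present value of an upfront cost plus year-end annual costs."""
    total = upfront
    for year, cost in enumerate(annual_costs, start=1):
        total += cost / (1.0 + discount_rate) ** year
    return total


HORIZON_YEARS = 10
DISCOUNT_RATE = 0.08  # assumed cost of capital

proposals = [
    Proposal("option_a", 1_000_000, 50_000, 0.01, 20_000_000),
    Proposal("option_b", 400_000, 100_000, 0.03, 20_000_000),
]

costs = {
    p.name: life_cycle_cost(p.upfront_cost, p.annual_costs(HORIZON_YEARS), DISCOUNT_RATE)
    for p in proposals
}
best = min(proposals, key=lambda p: costs[p.name])

for name, cost in costs.items():
    print(f"{name}: 10-year life cycle cost = ${cost:,.0f}")
print(f"Lowest life cycle cost: {best.name}")
```

In this made-up comparison the option with the higher upfront cost wins because it removes more of the expected cost of failure, which is the kind of trade-off an annual Op/Ex budget view tends to hide.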
The result of this 18-month effort was a fully stepped-out execution plan, supported by detailed engineering, with bundled materials and equipment. Due to the pre-existing CUI damage, ongoing inspections were performed during this planning process to ensure the vessel and structure continued to be safe to operate. Product inventory was also built up during this phase of planning to mitigate the risk of an unplanned shutdown of the unit as a result of this work. Finally, safety systems were staged at the vessel to minimize the mobilization time of the rope access crews.
More than three years after the FMECA prompted inspections of the vessel, the risk mitigation plan was executed on the live operating unit. As a result of the detailed engineering, analysis and planning, the job was executed safely and on budget. This resulted in a total savings of nearly $1 million for the business, representing nearly 15% of its total maintenance costs, with no lost production. The execution of this plan was so successful that the unit was able to share these practices across the organization for use in other units and sites. The execution of rope access CUI inspection and corrective tasks on live operating units was estimated to save the business in excess of $10 million over the next five years.
Step 3: Application in industry
This organization was successful where other organizations have failed for several reasons.
Where many organizations plan and budget for routine maintenance on an annual basis, this organization was able to assess five-year risk profiles. Furthermore, this risk not only included standard EHS risks, but also the financial risk associated with those failures. This comprehensive risk assessment enabled the organization to evaluate mitigation proposals from a business perspective and make the most informed decision possible. In this instance, this long-term outlook enabled the organization to explore alternate technologies and execution methods and develop the safety measures necessary to successfully execute these plans.
However, time wasn’t the only factor in this organization’s success. Due to the critical nature of this unit, the business had extensive engineering and maintenance expertise fully dedicated onsite. Smaller sites with fewer resources may not have the expertise or availability of resources to complete such a detailed risk mitigation plan. In these instances, external support will likely be required to develop such comprehensive plans for critical assets. In addition to engineering and planning expertise, high craft skill is a requirement in the successful execution of such plans. In this instance, highly qualified specialty inspectors and welders were required for execution. These types of craft resources are frequently not available onsite and must be identified and qualified in advance.
Finally, this organization already had an advanced reliability culture and processes in place, which enabled it to evaluate risk and proposals according to standard practices. With such complex and critical tasks, standard practices provide the organization a robust structure with which to make decisions based on data and analysis, rather than experience or emotion. This is common practice in capital investment but is often overlooked in routine maintenance. When viewed as an investment in the reliability of the organization, rather than a simple operating cost, routine maintenance can be optimized and prioritized in a fashion similar to that of typical capital spending.
Other organizations can learn from this long-term approach to risk management and routine maintenance, especially as the industry’s risk tolerance continues to decrease. The safe execution of mitigation plans is the top priority for any site, but safe execution does not need to come at the expense of the organization’s bottom line.
In fact, the ability to forecast risk and routine maintenance costs will not only reduce the cost of execution, but also the inherent risk involved in these tasks. Long-term outlooks allow organizations to build the associated risk mitigation plans — including job execution plans with staged materials and equipment — that mitigate risk. This is as relevant for a compressor overhaul as it is for CUI mitigation, where spare parts and materials may have extended lead times and limited availability. This is also especially true for equipment that is facing later stages of obsolescence, where failure without such a plan may result in months of downtime and excessive expediting costs.
Cap/Ex planning for Op/Ex projects
Applying a Cap/Ex project methodology to specific and complex Op/Ex jobs and tasks has been shown to be an efficient way to mitigate the inherent risks in performing these tasks. These long-term risk identification and mitigation practices are already commonplace in the industry, but they are reserved only for capital projects. By expanding the number of projects that are planned and approved in this manner, including the creation of multiple project proposals and risk mitigation plans, companies can reduce the financial and safety risk that is common in intrusive routine maintenance tasks.
The risk mitigation plans developed during this process are engineering and maintenance tools used to increase business profitability by reducing risk. While risk mitigation planning is not a new concept in industry, its application is more important now than ever before. As the industry continues to become more complex and competitive, these tools can make a substantial difference in the profitability of the business. Best-in-class risk mitigation plans are designed to reduce risk by addressing the:
- Probability of an error occurring through measures such as high craft skill, training and detailed procedures.
- Severity of the consequence through measures such as proper bill of materials, materials management and safety systems.
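The two levers above map onto the usual definition of risk as probability multiplied by consequence. The sketch below is a simplified illustration of that idea. The baseline values, the mitigation names and their fractional reductions are hypothetical, and treating each measure as an independent multiplicative reduction is a simplifying assumption rather than a method described in the case study.

```python
from typing import NamedTuple


class Mitigation(NamedTuple):
    name: str
    target: str       # "probability" or "severity"
    reduction: float  # assumed fractional reduction, 0.0 to 1.0


def mitigated_risk(probability, severity, mitigations):
    """Risk (probability x severity) after applying each mitigation in turn."""
    for m in mitigations:
        if not 0.0 <= m.reduction <= 1.0:
            raise ValueError(f"reduction out of range for {m.name}")
        if m.target == "probability":
            probability *= 1.0 - m.reduction
        elif m.target == "severity":
            severity *= 1.0 - m.reduction
        else:
            raise ValueError(f"unknown target for {m.name}: {m.target}")
    return probability * severity


# Hypothetical baseline: 10% chance of an execution error costing $1 million.
BASE_PROBABILITY = 0.10
BASE_SEVERITY = 1_000_000

plan = [
    Mitigation("qualified specialty welders", "probability", 0.50),
    Mitigation("detailed step-by-step procedure", "probability", 0.20),
    Mitigation("staged safety systems and materials", "severity", 0.25),
]

baseline = mitigated_risk(BASE_PROBABILITY, BASE_SEVERITY, [])
residual = mitigated_risk(BASE_PROBABILITY, BASE_SEVERITY, plan)

print(f"Baseline risk: ${baseline:,.0f}")
print(f"Residual risk: ${residual:,.0f}")
```

Keeping probability and severity as separate inputs makes it visible which lever each measure pulls, so a plan that only addresses one of the two stands out.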
Additional preparation is often needed to mitigate inherent safety concerns while continuing to meet business objectives. These critical tasks often require more than the standard 12-month forecast that is common for routine maintenance and are contingent on the skills and expertise that the organization has available. These practices, when implemented properly and systematically, can allow companies to safely execute critical tasks at the right cost, consistent with the business’s plan, while meeting ever-increasing safety standards and practices.
Colemann O’Malley is an engineering maintenance specialist with T.A. Cook Consultants. He is a reliability professional with extensive experience in asset management, risk assessment, and implementation of reliability projects in the chemicals, oil and gas industries. He is piloting the development and implementation of digital solutions in asset management and reliability, with the goal of increasing sites’ long-term planning and decision-making capabilities.