Improve condition monitoring with scalability

Traditionally, reaching a beneficial level of predictive maintenance requires a lot of man-hours and specialist knowledge. Most existing tools are targeted at diagnostic systems engineers, who monitor the alerts and outputs from a condition monitoring system and direct the work of a maintenance team.

The increasing number of connected devices now becoming available within Industry 4.0 and the Industrial Internet of Things (IIoT) offers organizations a significant opportunity to improve this situation, increasing the efficiency and cost-effectiveness of existing teams.

With a higher-level approach, information can be supplied directly to the maintenance organization, allowing the planner and technicians to react within a short time cycle.

Scalability means expanding the coverage of an organization’s assets while reducing the risk across the monitored equipment. The premise is that, using the staff already in place, it is possible to monitor much more than just the top-priority and critical assets.

Currently, the number of assets monitored per individual is likely to vary between 50 and 100. With increasing connectivity, that number can grow into the thousands.

By automating around 90 percent of the analytic tasks that the diagnostic systems engineer performs, companies can respond to growth in the number of monitored assets without a tenfold increase in labor costs.
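The arithmetic behind that tenfold figure can be sketched in a few lines. This is an illustration only: the function name is hypothetical, and it assumes that analysis effort is the same for every asset, so that an engineer's capacity scales with the inverse of the work that remains manual.

```python
def assets_per_engineer(manual_assets: int, automated_percent: int) -> int:
    """Assets one engineer can cover once part of the analysis is automated.

    Assumption: effort per asset is uniform, so capacity scales with the
    inverse of the share of work that is still done by hand.
    """
    if not 0 <= automated_percent < 100:
        raise ValueError("automated_percent must be between 0 and 99")
    manual_percent = 100 - automated_percent
    return manual_assets * 100 // manual_percent


# Baselines of 50 and 100 assets per person, with 90 percent automation.
for baseline in (50, 100):
    print(baseline, "->", assets_per_engineer(baseline, 90))
```

Under these assumptions, automating 90 percent of the work leaves one tenth of it to the engineer, which is where the tenfold gain in coverage comes from.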

Extending coverage

Scalability allows organizations to service a larger number of machines, reducing the problems caused by unmonitored assets failing and avoiding costly unplanned downtime.

A fan system used on a kiln furnace, for example, might be critical to plant operation; it would be vital to assess the system’s serviceability, whether by monitoring the condition of the blades, the drive motor or the gear system.

Expensive pieces of equipment like this are traditionally the assets that a diagnostic systems engineer would monitor constantly. However, there might well be secondary-level equipment, such as motors, blowers and conveyors, with a shorter lead time for replacement parts.

These may carry a lower risk of downtime and a low level of criticality, and so are not typically covered by monitoring systems. What scalability offers is the chance to extend coverage to these lower-cost, lower-criticality assets, allowing maintenance to be scheduled to minimize downtime.

For example, if a motor is approaching the point where it needs replacement, it can be replaced during a scheduled break in production without creating additional downtime. This is a less obvious advantage than monitoring a more critical asset, but nevertheless a significant one.

When a large asset fails, it can take weeks to replace — even months if specialist components are required. Condition monitoring in this instance is vital to reduce downtime through predictive maintenance. Even with advance warning, a plant operator experiences a significant amount of downtime while the asset is repaired or replaced.

However, an unmonitored, less critical asset may fail 10 times in the same period, even if each failure takes only an hour to repair. Across a plant with many such assets, the cumulative downtime, and the associated cost of lost production, can be greater even though the criticality of each asset is lower.
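The comparison above can be made concrete with a small calculation. The figures here are assumptions chosen purely for illustration, not plant data: one critical asset that is out of service for two weeks, set against a hypothetical fleet of 50 secondary assets that each fail 10 times at one hour per repair.

```python
from dataclasses import dataclass


@dataclass
class AssetGroup:
    name: str
    asset_count: int          # number of assets in the group
    failures_per_asset: int   # failures per asset over the period
    hours_per_failure: float  # downtime caused by each failure

    def downtime_hours(self) -> float:
        """Total downtime for the group over the period."""
        return self.asset_count * self.failures_per_asset * self.hours_per_failure


# Illustrative assumptions only: two weeks is 14 * 24 = 336 hours.
critical = AssetGroup("kiln fan", asset_count=1,
                      failures_per_asset=1, hours_per_failure=336.0)
secondary = AssetGroup("motors, blowers, conveyors", asset_count=50,
                       failures_per_asset=10, hours_per_failure=1.0)

for group in (critical, secondary):
    print(f"{group.name}: {group.downtime_hours():.0f} hours")
```

With these assumed numbers, the secondary assets account for more lost hours in total than the single critical failure, which is the case for extending coverage to them.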

A generalized solution

A key factor in scalability is to ensure that any solution is generic enough that it can be scaled across different types of machines as well as across sectors.

Rather than create bespoke solutions for each machine type, technology firms working in this area aim to abstract the problem, offering an intuitive, all-round solution.

Developing this more generalized solution may require a reasonable quantity of data. However, it is essential to resist the natural temptation to embed high levels of expert knowledge into the solution; otherwise, it may end up becoming highly bespoke and therefore not scalable to other machines.

There is a trade-off between the level of analysis produced across all the machines and making the whole system work. The challenges lie not only in understanding everything that an expert might be looking for in the data, but also in developing a series of algorithms that expose key features in the data.

For example, in the rail industry, a monitoring solution may be needed for trackside equipment such as escalators. In this instance, current and vibration signals may be monitored.

The experts involved might understand all there is to know about escalator use in the London Underground, but that information may not scale well to other machinery. By looking at the escalators more generally as a conveying system, the solutions developed can be applied to any escalator system — it becomes a product suited to cross-sector scaling.

Continuous development

To enable a truly scalable monitoring solution, it is important to remove the need for the user to get involved. This allows the solution to produce the relevant critical data for effective condition monitoring and predictive maintenance, without the loss in scalability that customization and extensive user-tweaking would cause.

For example, a new solution that offers scalable predictive maintenance runs as a single iteration that sits in the cloud and is continually developed using an agile software development process. This means that new capability finds its way into the product every two weeks, with elements such as the core detection capability for abnormal behavior continually being supplemented with new techniques.

End-user engagement

Many end users want to understand how they can embed some of their own knowledge and experience into automated systems. It is important to strike a balance between leveraging expert knowledge and maintaining scalability.

The challenge for condition monitoring system providers is to change the perspective of organizations from a traditional engineering viewpoint to one that looks more at the impact on the business.

It is common for teams to ask about levels of no faults found, false positives or "positive negatives" in the system. While these metrics are tracked, they are used internally to evaluate and optimize the system.

They can also be subjective, as some false positives might have come from the early stages of a fault that users feed back as negatives, so they are not really a useful overall measurement of system performance.

For a true assessment of a product, it is better to look at the benefits for the business, such as the reported downtime and the impact on key performance indicators (KPIs).

It is also important to look at the amount of user interaction with the product. Typically, a user should not have to spend more than a few minutes per asset per week in the system.

If too much output is provided, or it is the wrong type of output, then the maintenance team will spend more time than expected looking at notifications. This can negatively affect both the way the product is used, and the way it is perceived; it is becoming a key measurement for condition monitoring providers.

Making increased information manageable

When coverage scales to a certain level, the system notifications inevitably start to reach the maximum that a user can monitor. For example, if monitoring increases from 10 machines to 1,000, it is important to consider how many notifications those 10 machines generate and what that will look like with 1,000 machines.

This is likely to be unmanageable by existing personnel, but if extra staff have to be appointed to assess the output of the system, then that obviously defeats the purpose of the system.
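A rough sketch shows how quickly the review workload grows. The notification rate, the minutes spent per notification and the 40-hour working week are all assumed values for illustration, and the function names are hypothetical.

```python
def review_minutes(machines: int, notifications_per_machine: int,
                   minutes_per_notification: int) -> int:
    """Weekly minutes needed to review every notification."""
    return machines * notifications_per_machine * minutes_per_notification


def reviewers_needed(total_minutes: int, minutes_per_reviewer: int) -> int:
    """Full-time reviewers required, rounded up to a whole person."""
    return -(-total_minutes // minutes_per_reviewer)


WEEK_MINUTES = 40 * 60  # assumed 40-hour working week

# Assumed: 2 notifications per machine per week, 5 minutes to review each.
for machines in (10, 1000):
    minutes = review_minutes(machines, 2, 5)
    print(machines, "machines:", minutes, "minutes,",
          reviewers_needed(minutes, WEEK_MINUTES), "reviewer(s)")
```

Under these assumptions, the same notification rate that occupies a fraction of one person's week at 10 machines would need several full-time reviewers at 1,000, unless the output is filtered and prioritized automatically.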

A classic example comes from the aerospace industry, where the amount of output from monitoring systems on avionics components became overwhelming. The result was that the system was turned off, putting the onus on the pilot instead to report when something was broken.

So, an increase in scale is likely to drive a corresponding increase in automated reporting.

This understanding has led to the development of a proprietary method to determine and prioritize asset health automatically by using condition monitoring, prognostics and asset criticality information.

Conclusion

Through continual development of algorithms and automated diagnostics, software can be used to solve the problem of condition monitoring at scale, providing an effective and manageable solution.

Scalability offers an efficient, cost-effective solution to achieve reliable condition monitoring of a rapidly increasing number of assets without increasing staff resources to match.

While expert knowledge and customer input play a strong part in optimizing software, maintaining a generalized solution that can be applied across a variety of machines and applications is key to true industrial scalability, keeping predictive maintenance in step with smart industry and Industry 4.0.

Robert Russell is chief technology officer at Senseye. He has a degree in mechanical engineering and has spent more than 20 years designing and deploying complex condition monitoring and prognostics solutions across the aerospace, defense and transport sectors.

Senseye is a leading cloud-based software for predictive maintenance. It helps manufacturers avoid downtime and save money by automatically forecasting machine failure without the need for expert manual analysis. Its intelligent machine-learning algorithms allow it to be used on any machine from any manufacturer, taking information from existing IIoT sensors and platforms to automatically diagnose failures and provide the remaining useful life of machinery.