In today’s competitive global environment, plant and operations managers are under substantial pressure to improve efficiency while cutting costs. One of the challenges they face is making use of the enormous quantities of data, both structured and unstructured, generated each day. This data can reveal areas for improvement, but accessing it quickly and affordably has always been a major challenge. Historians serve as a repository for data from many systems, making them a good source for advanced analytics. However, process historian tools are not ideal for analyzing the data or running search queries. They are "write" optimized, not "read/analytics" optimized. Finding the relevant historical event and building the process context are usually time consuming and laborious.
To improve process performance and overall efficiency, a level of operational intelligence and data context is required. Process engineers and other subject matter experts must be able to search time-series data over a specific timeline and visualize all related plant events quickly and efficiently. This includes the time-series data generated by the process control systems, lab systems and other plant systems as well as the usual annotations and observations made by operators and engineers.
Limitations of existing analytics technologies
Process engineers and operators need to predict process performance, or the evolution of a batch process, accurately while eliminating false positives. Predicting the process events likely to happen in a plant or facility requires accurate process historian or time-series analytics, search capabilities and the ability to diagnose the patterns identified within the process data and determine what they mean.
Process analytics solutions, in one form or another, have existed in the industrial software market for some time. Unfortunately, these largely historian-based software tools often require a great deal of interpretation and manipulation. They display backward-looking trends or export raw data to Microsoft Excel spreadsheets. The tools used to visualize and interpret process data are typically trending applications, reports and dashboards. These can be helpful but are not particularly useful for predicting outcomes.
Predictive analytics, a relatively new dimension of analytics tools, can provide valuable insights about what will happen in the future based on structured and unstructured historical data. Many predictive analytics tools start from an enterprise-level approach and require more sophisticated distributed computing platforms, such as Hadoop or SAP HANA. These are powerful and useful for many analytics applications but represent a more complex approach to managing both plant and enterprise data. Companies that use this enterprise data management approach must often employ specialized data scientists to help organize and cleanse the data. However, in addition to the time and money these projects require, data scientists are not always as intimately familiar with the process as engineers and operators are, which limits their ability to achieve the best results.
Furthermore, many of these advanced tools are perceived as engineering-intensive "black boxes" in which the user knows only the inputs and the expected outcome, without any insight into how the result was determined. Understandably, for many operational and asset-related issues, this approach is too expensive and time consuming, and it also requires skilled data scientists. This is why many vendors, from a return on investment (ROI) point of view, can target only 1 percent of the critical assets, ignoring other opportunities for process improvement. If the subject matter experts, who are typically not data scientists, are to use the solution, the requirements for the analytics tools change completely. They need plug-and-play, easy-to-use self-service analytics solutions.
New approach to big data management
ARC Advisory Group identified a few solution suppliers taking a different approach to providing industrial process data analytics and leveraging unique, multidimensional search capabilities for stakeholders. This approach combines the ability to visualize process historian time-series data, overlay similar, matched historical patterns and provide context from data captured by engineers and operators. These discovery analytics tools give answers to day-to-day operational questions. A pattern recognition process analytics solution with these capabilities can be deployed in as little as two hours. The solution is an on-premises, packaged virtual server deployment that integrates easily with a local copy of the plant historian database archives and can evolve over time toward a scalable architecture that communicates with available enterprise distributed computing platforms.
The newer technology uses pattern search-based discovery and predictive-style process analytics targeting the average user. It can typically be deployed in less than two hours, delivering immediate value with no data-modeling solution or data scientist required. Often called "self-service analytics," this software puts the power of extensive search and analytics into the hands of the process experts, engineers and operators, who can best identify areas for improvement.
Another problem typically presented by historian time-series data is the lack of a robust search mechanism and of an effective way to annotate. By combining search capabilities on structured, time-series process data with data captured by operators and other subject matter experts, users can predict more precisely what is occurring, or likely will occur, within their continuous and batch industrial processes.
According to Peter Reynolds, senior consultant for ARC Advisory Group, "The new platform is built to make operator shift logs searchable in the context of historian data and process information. In a time when the process industries may face as much as a 30 percent decline in the skilled workforce [because of] retiring workers, knowledge capture is a key imperative for many industrial organizations."
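As a rough illustration of what searching shift logs "in the context of historian data" could look like, the sketch below attaches operator annotations to time windows found in the process data. It is a minimal example under stated assumptions, not any vendor's implementation: the `Annotation` record, the function names and the numeric timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical shift-log entry; the field names are assumptions."""
    timestamp: float  # e.g. seconds since the start of the archive
    author: str
    text: str


def annotations_in_window(annotations, start, end, keyword=None):
    """Return annotations with start <= timestamp <= end, optionally
    filtered by a case-insensitive keyword, ordered by time."""
    hits = []
    for note in annotations:
        if not (start <= note.timestamp <= end):
            continue
        if keyword is not None and keyword.lower() not in note.text.lower():
            continue
        hits.append(note)
    return sorted(hits, key=lambda n: n.timestamp)


def contextualize(windows, annotations, keyword=None):
    """Map each (start, end) window to the annotations recorded during it."""
    return {
        (start, end): annotations_in_window(annotations, start, end, keyword)
        for start, end in windows
    }


# Usage: two windows of interest (for example, periods where a process
# pattern was found) and a small, made-up shift log.
log = [
    Annotation(100, "operator A", "Pump cavitation noise observed"),
    Annotation(250, "engineer B", "Changed flow setpoint"),
    Annotation(900, "operator C", "Pump seal leak reported"),
]
for window, notes in contextualize([(50, 300), (800, 1000)], log).items():
    print(window, [note.text for note in notes])
```

The point of the sketch is the join itself: once annotations carry timestamps, any time window found in the process data can be enriched with what operators and engineers wrote during it.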
Self-service analytics deliver:
- A deep knowledge of both process operations and data analytics techniques to avoid the need for specialized data scientists
- A model-free, predictive process analytics (discovery, diagnostic and predictive) tool that complements and augments, rather than replaces, existing historian information architectures
- Cost-efficient virtualized deployment (plug and play) within the available infrastructure
- Easy scalability for corporate big data initiatives and environments
Multidimensional search solutions
Unlike traditional historian desktop tools, tools built on pattern-recognition and machine-learning algorithms let users search process trends for specific events or detect process anomalies. A simple analogy is the music app Shazam, which works by identifying significant patterns, or "high-energy content," in the audio and matching them to similar patterns in its database instead of trying to match each note of a song. This technique lets Shazam identify songs quickly and accurately, which matters because the user will close the search if an answer takes too long.
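The idea of matching a pattern's shape rather than every individual value can be sketched for a process trend as follows. This is a deliberately simple stand-in, assuming a brute-force sliding-window search with z-normalized Euclidean distance. It is not Shazam's fingerprinting method or any vendor's algorithm, and the function names are hypothetical.

```python
import math


def znorm(window):
    """Scale a window to zero mean and unit variance so that shape,
    not absolute level, is what gets compared."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n  # flat window: no shape to compare
    return [(x - mean) / std for x in window]


def find_similar(series, query, top_k=3):
    """Return up to top_k (distance, start_index) pairs for the
    non-overlapping windows of `series` most similar to `query`."""
    m = len(query)
    q = znorm(query)
    scored = []
    for start in range(len(series) - m + 1):
        w = znorm(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, w)))
        scored.append((dist, start))
    scored.sort()
    matches = []
    for dist, start in scored:
        # Skip windows that overlap a better match already kept.
        if all(abs(start - kept) >= m for _, kept in matches):
            matches.append((dist, start))
        if len(matches) == top_k:
            break
    return matches


# Usage: look for earlier occurrences of a spike-shaped event.
trend = [0, 0, 0, 1, 3, 1, 0, 0, 0, 0, 2, 6, 2, 0, 0, 0]
spike = [1, 3, 1]
for distance, start in find_similar(trend, spike, top_k=2):
    print(f"match at index {start}, distance {distance:.3f}")
```

Because each window is normalized before comparison, a spike of the same shape but a different magnitude still counts as a close match. A production system would need indexing or a faster search to stay responsive on years of historian data.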
These technologies form the critical base layer of the new systems' technology stack: they use existing historian databases and add a data layer that indexes the time-series data in a column store. These next-generation systems also work well with historians from leading suppliers including OSIsoft, AspenTech, Yokogawa and Honeywell, increasing the return on the investment companies have already made in their historians. Typically, they are designed to be simple to install and deploy via a virtual machine without affecting the existing historian infrastructure.
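To make the column-store idea concrete, the toy sketch below keeps each tag's samples in its own pair of parallel arrays and uses binary search for time-range reads. It illustrates only the general column-oriented principle. The class, its methods and the tag name are hypothetical, and real products will differ considerably.

```python
from bisect import bisect_left, bisect_right


class ColumnStore:
    """Toy column-oriented store: one pair of parallel arrays per tag."""

    def __init__(self):
        self._times = {}   # tag -> sorted list of timestamps
        self._values = {}  # tag -> values aligned with those timestamps

    def append(self, tag, timestamp, value):
        """Add one sample; samples for a tag must arrive in time order."""
        times = self._times.setdefault(tag, [])
        if times and timestamp < times[-1]:
            raise ValueError("samples must be appended in time order")
        times.append(timestamp)
        self._values.setdefault(tag, []).append(value)

    def range(self, tag, start, end):
        """Return (timestamp, value) pairs with start <= timestamp <= end.

        Binary search on the sorted timestamp column finds the slice
        without scanning the tag's whole history."""
        times = self._times.get(tag, [])
        values = self._values.get(tag, [])
        lo = bisect_left(times, start)
        hi = bisect_right(times, end)
        return list(zip(times[lo:hi], values[lo:hi]))


# Usage: load a copy of historian samples, then read one time slice.
store = ColumnStore()
for t, v in [(0, 20.1), (60, 20.4), (120, 21.0), (180, 23.5)]:
    store.append("reactor_temperature", t, v)
print(store.range("reactor_temperature", 60, 120))
```

Keeping each tag in its own column means a query for one tag over one time range touches only that tag's arrays, which is the "read/analytics" optimization that a write-optimized historian archive lacks.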
Time for a new paradigm
The technology playing field for manufacturers and other industrial organizations has changed. To remain competitive, companies must use analytics tools to uncover areas for efficiency improvements.
"There is an immediate need to search time-series data and analyze these data in context with the annotations made by both engineers and operators to be able to make faster, higher quality process decisions," Reynolds said. "If users want to predict process degradation or an asset or equipment failure, they need to look beyond time-series and historian data tools and be able to search, learn by experimentation and detect patterns in the vast pool of data that already exists in their plant."
Fortunately, this new process analytics model can support the necessary "retooling" of traditional process historian visualization tools at a low cost in both time and money.
IDEAL CAPABILITIES OF A SELF-SERVICE ANALYTICS SOLUTION
- Column store with in-memory indexing of historian data
- Search technology based on pattern-matching and machine-learning algorithms, empowering users to find historical trends that define process events and conditions
- Diagnostic capabilities to quickly find the cause of detected anomalies and process situations
- Knowledge and event management and process data contextualization
Bert Baeck is cofounder of TrendMiner and became CEO in 2013. His professional experience includes more than 10 years within big data/analytics and the manufacturing industry. Before his work with TrendMiner, he was a process optimization engineer for Bayer MaterialScience (now Covestro). Baeck holds master’s degrees in computer science and micro-electronics from the University of Ghent.