Advanced time series analytics should be a critical component of every process manufacturer’s data strategy, especially in the pharmaceutical industry, where manufacturers manage large volumes of data of varying types. This data and the resulting analytics drive significant decisions that impact the success of entire organizations.
Particularly in heavily regulated environments such as pharma, data must be accessible, easily analyzed and closely monitored to support consequential business decisions that affect both the manufacturer’s bottom line and the safety of the patients served.
In today’s fast-paced manufacturing environments, working with data in the pharmaceutical industry demands an advanced analytics platform capable of joining disparate sources to enable seamless analytics and visualizations, in addition to ad-hoc and highly polished reports. Equally important, this platform must be easy to use, bridging multiple analytical methods into a common environment, uniting all users and data sources throughout the organization. And finally, the solution should be flexible, serving current and future time series analytical needs.
Data access, disparity and processing challenges
Advanced analytics in the pharmaceutical sector is nothing new. It is common for manufacturers to use dated spreadsheet tools to manually wrangle, process, cleanse and export data to an analytics application.
These analytics applications are frequently used to build and deploy solutions like statistical process control (SPC), overall equipment effectiveness (OEE) and principal component analysis (PCA) models. But unfortunately, the typical streams of disparate data, along with varying analytical methods and tools, are complicated to use, creating a high barrier to entry in terms of time, talent and cost.
Because of operational and regulatory criticality, pharmaceutical manufacturers cannot avoid the need to integrate their various data sources, and to preprocess and cleanse data prior to analysis. Historically, the tools for accomplishing these tasks have been separate from those used for actual analytics. And even in cases where the same software toolset is used for preprocessing and analytics, it is rarely dynamic enough to keep pace with the constantly progressing advancements of the pharmaceutical industry.
In addition to the effort required to wrangle, cleanse and analyze, process manufacturers also face data integrity issues, inherent when managing and modifying data. To handle all of these tasks and concerns, manufacturers often maintain a large toolbox of software, with each requiring individual upkeep, training, licensing and support costs.
The combination of architectural and technical challenges results in siloed analytics, limited knowledge sharing, reduced optimization opportunities and potential decreases in analytics quality. But modern, advanced analytics solutions are now available, addressing these and other issues.
Advanced analytics platforms unite users and data
Overcoming the challenges of data access, processing and analysis is nearly impossible with conventional software and methods. But modern, purpose-built advanced analytics platforms for time series data, like Seeq, greatly ease this procedure, providing extensibility for today’s analytical methods, such as SPC and OEE, and for tomorrow’s machine learning and artificial intelligence solutions.
Advanced analytics platforms accomplish these tasks by first establishing a connection to the various data sources across a manufacturer’s operations. This includes large manufacturing databases, SQL-based stores of quality and batch data, individual CSV files on local user workstations and everything in between. Once connections are established, the software handles all data alignment and interpolation, while still providing end users the capability to review and adjust it as required.
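As a rough illustration of what alignment and interpolation involve, the sketch below uses plain Python rather than any platform API. It assumes two hypothetical signals sampled at different rates, stored as (seconds, value) pairs, and linearly interpolates one onto the timestamps of the other. The signal names and values are invented for the example.

```python
from bisect import bisect_left

def interpolate_at(samples, t):
    """Linearly interpolate (time, value) samples at time t.

    Returns None when t falls outside the sampled range, so no value
    is extrapolated.
    """
    times = [s[0] for s in samples]
    if t < times[0] or t > times[-1]:
        return None
    i = bisect_left(times, t)
    if times[i] == t:
        return samples[i][1]
    (t0, v0), (t1, v1) = samples[i - 1], samples[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

def align(reference, other):
    """Pair each reference sample with the interpolated value of `other`."""
    return [(t, v, interpolate_at(other, t)) for t, v in reference]

# Hypothetical data: temperature every 60 s, pH every 90 s.
temperature = [(0, 37.0), (60, 37.2), (120, 37.1), (180, 37.3)]
ph = [(0, 7.00), (90, 7.06), (180, 7.03)]

aligned = align(temperature, ph)
for t, temp, ph_value in aligned:
    print(t, temp, round(ph_value, 3))
```

Returning None outside the sampled range is a deliberate choice here: it leaves gaps visible for review instead of silently extrapolating.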
This empowers users to dive in and add context to their data, visualize and analyze operational performance and simulate potential enhancements with purpose-built tools, enabling subject matter experts (SMEs) to rapidly identify problems or opportunities. Once these are identified, disparate data can be combined to easily execute various quantitative analytics, whether as simple as totalizing a feed rate or as complex as deploying a PCA model on a bioreactor.
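Totalizing a feed rate amounts to integrating a rate signal over time. A minimal sketch, assuming a hypothetical feed rate in L/min sampled at irregular times in minutes and using the trapezoidal rule, might look like this:

```python
def totalize(samples):
    """Integrate (time, rate) samples with the trapezoidal rule.

    Time and rate must use consistent units, e.g. minutes and L/min,
    so that the result is a quantity (here, litres).
    """
    total = 0.0
    for (t0, r0), (t1, r1) in zip(samples, samples[1:]):
        total += 0.5 * (r0 + r1) * (t1 - t0)
    return total

# Hypothetical feed rate in L/min at irregular sample times (minutes).
feed_rate = [(0, 2.0), (10, 2.0), (15, 4.0), (30, 4.0), (40, 0.0)]
total_litres = totalize(feed_rate)
print(total_litres)
```

The trapezoidal rule assumes the rate changes linearly between samples. A signal that holds its last value until the next update would call for a step integration instead.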
In addition to enabling ad-hoc analytics, these platforms are built around operationalizing the analyses. Distinct reports and dashboards can be easily deployed for groups of engineers, operators and supervisors.
Until recently, the ability to utilize analytics techniques like SPC, OEE, PCA or partial least squares (PLS) modeling required standalone applications in addition to an ad-hoc analytics solution. Today, advanced analytics platforms combine all of these tasks, wrapping them up with customizable point-and-click configuration menus, along with shareable reports and dashboards.
For exceptional cases where built-in tools do not fulfill all user needs, a Python environment is available within these platforms, enabling SMEs to leverage the breadth of Python and R packages to deploy specialty models and data streaming. These models, which frequently encompass proprietary algorithms, can be seamlessly operationalized using the platform’s built-in data cleansing and contextualization capabilities. Once created, these customized algorithms can be deployed in the same way built-in models are, complete with graphical user interfaces that alleviate the need for end users to write any code.
Results: Large pharma deploys OEE, SPC and multivariate analytics
OEE
Using built-in tools in Seeq, a large pharmaceutical manufacturing company rapidly deployed an OEE analysis, helping SMEs identify opportunities for over $65M in additional revenue. This was done by connecting disparate data sources, bringing the information into the advanced analytics platform and creating robust data flows with intuitive out-of-the-box tools. In addition to the platform’s core tools, the purpose-built OEE extension helped operationalize the analysis by surfacing insights that were previously unknown (Figure 1).
To make these discoveries, which resulted in the significant revenue increase, SMEs within the organization first leveraged the advanced analytics platform to explore process data and perform analysis with point-and-click tools. Then, the analyses were deployed at scale across thousands of process alarm events and reject reason codes. This entire effort, beginning with data connectivity and concluding with OEE dashboard creation, took the team only a few days, a timeframe that would have been impossible with the previous disparate tools and manual methods.
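For readers unfamiliar with the metric, OEE is conventionally computed as the product of availability, performance and quality. The sketch below applies that conventional definition to one hypothetical production run. The field names and numbers are invented for illustration and are not taken from the manufacturer’s analysis or from the OEE extension.

```python
from dataclasses import dataclass

@dataclass
class ProductionRun:
    planned_minutes: float         # scheduled production time
    downtime_minutes: float        # unplanned stops, e.g. from alarm events
    ideal_units_per_minute: float  # nameplate rate
    total_units: int               # everything produced, good or not
    rejected_units: int            # units carrying a reject reason code

def oee(run):
    """Return availability, performance, quality and their product (OEE)."""
    run_minutes = run.planned_minutes - run.downtime_minutes
    availability = run_minutes / run.planned_minutes
    performance = run.total_units / (run.ideal_units_per_minute * run_minutes)
    quality = (run.total_units - run.rejected_units) / run.total_units
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }

# Hypothetical 8-hour shift.
shift = ProductionRun(
    planned_minutes=480,
    downtime_minutes=60,
    ideal_units_per_minute=10,
    total_units=3780,
    rejected_units=189,
)
result = oee(shift)
print({name: round(value, 4) for name, value in result.items()})
```

Keeping the three factors separate, rather than reporting only the product, shows whether losses come from downtime, slow running or rejects.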
SPC
Another pharmaceutical manufacturer needed to quickly build control charts with near-real-time data to create an unvalidated continued process verification (CPV) report. The company used Seeq to designate and analyze critical quality attributes (CQAs) and critical process parameters with just a few clicks (Figure 2).
Rather than rely on a myriad of tools and multiple SMEs to generate reports, the advanced analytics platform centralized this task, significantly reducing the time required to generate CPV-related insights. In one instance, this empowered the manufacturer to identify and act upon a downward-trending CQA almost 24 hours earlier than it would have prior to installing the platform.
These expedited results were driven by automating the previously manual data entry, SPC calculations and reporting process. For this manufacturer — and likely many others in the commercial drug manufacturing world — the additional 24 hours of notice enabled a batch-saving course correction, which was equated to upwards of $5M in cost avoidance.
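The kind of calculation being automated can be sketched in a few lines. The example below is an assumption-laden illustration, not the manufacturer’s method: it builds an individuals control chart from hypothetical baseline batches, estimating sigma from the average moving range divided by the standard constant d2 = 1.128, and adds a simple run rule (assumed here to be six consecutive decreases) to catch a downward trend before any limit is breached.

```python
from statistics import mean

def individuals_limits(baseline):
    """Center line and 3-sigma limits for an individuals (I) chart.

    Sigma is estimated as the average moving range divided by
    d2 = 1.128, the standard constant for subgroups of size 2.
    """
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    center = mean(baseline)
    sigma = mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_limits(values, lcl, ucl):
    """Indices of points outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

def downward_run(values, length=6):
    """Index where `length` consecutive decreases first complete, else None."""
    run = 0
    for i in range(1, len(values)):
        run = run + 1 if values[i] < values[i - 1] else 0
        if run >= length:
            return i
    return None

# Hypothetical CQA values (e.g. assay, % of label claim) from past batches.
baseline = [100.0, 101.0, 99.0, 100.0, 102.0, 100.0, 98.0, 100.0]
lcl, center, ucl = individuals_limits(baseline)

# Hypothetical new results drifting downward while still inside the limits.
new_batches = [100.5, 100.2, 99.8, 99.1, 98.7, 98.0, 97.4]
violations = out_of_limits(new_batches, lcl, ucl)
trend_at = downward_run(new_batches)
print(round(lcl, 2), center, round(ucl, 2), violations, trend_at)
```

In this invented data set no point leaves the control limits, yet the run rule still fires, which is the sort of early warning a trending CQA calls for.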
PCA and PLS
Traditionally, building and deploying PCA- and PLS-based models on near-real-time process data required the collaboration of multiple personnel and tools. But by implementing Seeq, one of the world’s largest pharmaceutical manufacturers also consolidated this workflow, enhancing its ability to build and operationalize models.
Before deploying the advanced analytics platform, SMEs spent dozens of hours each week collecting, cleansing and conditioning disparate data using manual spreadsheet tools. But the effort now requires only a few hours of oversight, with most of the legwork automated in the platform. The platform opened the previously siloed workflow to a significantly larger user group, which in turn improved the models and likelihood of anomaly detection.
Additionally, the platform alleviated data integrity concerns because information imports and exports, along with the propensity for manual error, are minimized. The tools necessary to provide context for the analyses are also centralized, enabling easy model building and iteration.
An out-of-the-box PCA regression model and prediction tool enabled these users to input data and receive an operationalized model that provides near-real-time monitoring of the dimensionally reduced components. By consulting the model, the team reduced time spent monitoring and manually analyzing individual signals, increasing the speed of anomaly detection and response (Figure 3).
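To make the idea of monitoring reduced components concrete, the following sketch fits PCA with NumPy and scores observations with Hotelling’s T^2, a common multivariate monitoring statistic. It is not the platform’s built-in tool. The four signals are synthetic stand-ins for correlated bioreactor measurements, and the function names are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on the rows of X via SVD of the standardized data."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    Z = (X - mu) / sd
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    variances = s**2 / (len(X) - 1)        # variance along each component
    return {
        "mean": mu,
        "std": sd,
        "loadings": vt[:n_components].T,   # one column per component
        "variances": variances[:n_components],
        "explained_ratio": variances[:n_components] / variances.sum(),
    }

def hotelling_t2(model, X):
    """Project rows of X onto the components and return T^2 per row."""
    scores = ((X - model["mean"]) / model["std"]) @ model["loadings"]
    return (scores**2 / model["variances"]).sum(axis=1)

# Synthetic training data: four signals driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.8, 0.2, 0.0],
                   [0.0, 0.3, 0.9, 1.0]])
X_train = latent @ mixing + 0.05 * rng.normal(size=(200, 4))

model = fit_pca(X_train, n_components=2)
t2_train = hotelling_t2(model, X_train)

# An observation six standard deviations high on every signal.
x_odd = (model["mean"] + 6 * model["std"])[None, :]
t2_odd = hotelling_t2(model, x_odd)[0]
print(model["explained_ratio"].sum(), t2_train.max(), t2_odd)
```

Watching one T^2 value instead of four separate trends is the practical benefit: an unusual combination of signals shows up as a single large number.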
The platform’s extensibility enabled further modeling methods using a custom, open-source Python environment, which a data scientist leveraged to apply additional methods for analysis, including PLS. Once created, these tools were made available to engineers and end users through a user-friendly interface in the advanced analytics platform.
The workflow using this platform to create and execute these advanced multivariate models significantly reduced the number of tools, SMEs and hours required to transform disparate data into insights. And now, these customized models can be retrained and re-executed with only a few mouse clicks. With a user-friendly interface, PCA and PLS modeling is no longer reserved for data scientists; the results can now be interpreted and understood by a wide audience of users.
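The article does not describe the data scientist’s code, so the following is only a generic sketch of what a custom PLS model can look like in Python. It implements single-response PLS (PLS1) with the NIPALS algorithm in NumPy. The data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Fit single-response PLS (PLS1) with the NIPALS algorithm."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean        # residuals, deflated each pass
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)          # weight vector
        t = Xr @ w                         # scores
        tt = t @ t
        p = Xr.T @ t / tt                  # X loadings
        c = yr @ t / tt                    # y loading
        Xr = Xr - np.outer(t, p)           # deflate X
        yr = yr - c * t                    # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q) # coefficients in original X space
    return {"coef": coef, "x_mean": x_mean, "y_mean": y_mean}

def predict(model, X):
    """Predict y for the rows of X."""
    return (X - model["x_mean"]) @ model["coef"] + model["y_mean"]

# Synthetic data: three process parameters and one quality attribute.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

model = fit_pls1(X, y, n_components=2)
residual = y - predict(model, X)
r2 = 1 - residual @ residual / ((y - y.mean()) @ (y - y.mean()))
print(model["coef"], r2)
```

One useful sanity check on such a sketch: when the number of components equals the number of full-rank predictors, the PLS coefficients should match ordinary least squares.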
Enhancing operational efficiency and product quality
As time goes on, analytics platforms continue to evolve. For innovative adopters and process manufacturers looking to streamline operational efficiency, the current generation of advanced analytics platforms has significantly reduced the time and capital investments required to consolidate, cleanse and analyze disparate data. These solutions are particularly valuable for pharmaceutical manufacturers looking to increase production, assure high quality and minimize losses.
With the ability to create advanced analyses like OEE, SPC, PCA and PLS models in the same software leveraged for data management, today’s advanced analytics platforms provide process manufacturers with the digital tools to take their operations to the next level. Once deployed, these platforms provide centralized data, increased context, anomaly detection, operational insights and enhanced reports for companywide sharing.
Synjen Marrocco is an analytics engineer at Seeq, where he helps process manufacturing organizations maximize value from their data. He has a process engineering background, most recently working for Amgen, with a master’s degree in chemical engineering from Northeastern University.