Methods and tools used in drug safety analytics and pharmacovigilance analytics
The following techniques and tools are used in the fields of data mining, text and information mining, geographic information systems (GIS), and visualization analytics.
Among the many statistical techniques covered by data mining, pharmacovigilance (PV) analytics uses the following:
- Descriptive modeling, used to uncover shared similarities or groupings in historical data and to determine the reasons behind success or failure. Descriptive analytics tells us what happened and why, as well as what is happening right now and why. Techniques in this group include:
  - anomaly (outlier) detection
  - association rule learning
  - principal components analysis
  - affinity grouping
- Predictive modeling, used to classify future events or estimate unknown outcomes. Predictive analytics tells us what is likely to happen, and why. Techniques in this group include:
  - artificial neural networks
  - decision trees and random forests
  - support vector machines
- Prescriptive modeling, sometimes called “the last frontier of analytic capabilities”, takes information from descriptive and predictive analytics and combines it with information obtained from unstructured data (text mining, web mining) to improve prediction accuracy. Prescriptive modeling looks at internal and external variables and constraints in order to recommend one or more courses of action. Examples include predictive analytics combined with rules, and marketing optimization.
Prescriptive analytics tells us what our options are and what we should do.
- Disproportionality methods, specific to patient safety, are used to identify statistical associations between products and events. Such methods compare the observed count for a product-event combination with an expected count. The expected counts can be obtained from large public databases such as the FDA Adverse Event Reporting System (FAERS) for drug-event combinations, or the Manufacturer and User Facility Device Experience (MAUDE) database for device-event combinations.
The proportional reporting ratio (PRR) and the reporting odds ratio (ROR) are the foundational concepts for many disproportionality methods. However, when the number of observed or expected reports is small, these ratios become unstable, and more advanced methods are employed, such as the Multi-item Gamma Poisson Shrinker (MGPS), which produces Empirical Bayes Geometric Mean (EBGM) scores.
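The PRR and ROR described above can be illustrated with a minimal Python sketch. It assumes the usual 2x2 contingency table of spontaneous reports; the class name `ReportTable`, the function names, and the counts are hypothetical and chosen only for illustration. The expected count is computed here from the table margins under an independence assumption, which is one common convention and not necessarily the one used by any particular tool.

```python
import math
from dataclasses import dataclass


@dataclass
class ReportTable:
    """Hypothetical 2x2 table of report counts for one product-event pair."""
    a: int  # product of interest, event of interest
    b: int  # product of interest, all other events
    c: int  # all other products, event of interest
    d: int  # all other products, all other events


def prr(t: ReportTable) -> float:
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: ReportTable) -> float:
    """Reporting odds ratio: (a * d) / (b * c)."""
    return (t.a * t.d) / (t.b * t.c)


def ror_ci95(t: ReportTable) -> tuple[float, float]:
    """Approximate 95% confidence interval for the ROR on the log scale."""
    se = math.sqrt(1 / t.a + 1 / t.b + 1 / t.c + 1 / t.d)
    log_ror = math.log(ror(t))
    return math.exp(log_ror - 1.96 * se), math.exp(log_ror + 1.96 * se)


def expected_count(t: ReportTable) -> float:
    """Expected count for the pair if product and event were independent."""
    n = t.a + t.b + t.c + t.d
    return (t.a + t.b) * (t.a + t.c) / n


# Made-up counts, for illustration only.
table = ReportTable(a=20, b=380, c=100, d=9500)
print(f"PRR = {prr(table):.2f}")
print(f"ROR = {ror(table):.2f}")
print(f"Observed = {table.a}, expected = {expected_count(table):.2f}")
low, high = ror_ci95(table)
print(f"ROR 95% CI = ({low:.2f}, {high:.2f})")
```

With small cell counts the standard error term grows quickly, which is the instability that motivates shrinkage approaches such as MGPS. Those are not implemented in this sketch.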
Geographic information systems (GIS)
- GIS can be used to identify geographic trends over time, perform surveillance, visualize the locations of patients, and determine whether there are clusters of specific types of customer or patient experiences. These systems allow different layers of information to be combined.
- GIS also provides a way to systematically identify where certain events are more likely to occur and to implement preventive or corrective measures. Tracking the spread of infectious diseases and performing outbreak investigations are well-known uses of GIS.
- Social media can be incorporated as another layer of information. It can play a significant role, for example when tweets report healthcare-related issues from patients taking the company's medicinal products.
- Tracking potential safety problems with GIS can provide new opportunities for real-time interventions and for identifying patients at risk, patterns, and areas where patient education and assistance may be needed.
- GIS allows us to visualize, question, analyze, interpret, and understand data in order to reveal relationships, patterns, and trends. It supports surveillance and documentation of the geographic components of the diseases targeted by the company and of their risk factors, and it can help with disease prevention programs and policies.
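As a rough illustration of how geographic clusters of reports might be flagged, the sketch below bins report coordinates into a fixed latitude/longitude grid and keeps the cells that reach a minimum number of reports. The function names, the cell size, the threshold, and the coordinates are all hypothetical. A real GIS workflow would typically rely on dedicated spatial tools and would account for the underlying patient population, which this sketch ignores.

```python
import math
from collections import Counter


def grid_cell(lat: float, lon: float, cell_deg: float = 0.5) -> tuple[int, int]:
    """Map a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def cluster_cells(points, cell_deg: float = 0.5, min_reports: int = 3) -> dict:
    """Return grid cells holding at least `min_reports` reports, with counts."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)
    return {cell: n for cell, n in counts.items() if n >= min_reports}


# Made-up report locations (latitude, longitude), for illustration only.
reports = [
    (40.71, -73.97),
    (40.73, -73.99),
    (40.75, -73.98),
    (34.05, -118.24),
    (41.88, -87.63),
]

clusters = cluster_cells(reports)
print(clusters)  # {(81, -148): 3}
```

Raw counts per cell mostly mirror where people live, so in practice the counts would usually be compared against a denominator such as exposed patients per area before a cell is treated as a signal.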
Text and information mining
- A large part of adverse event (AE) reporting is unstructured (narratives, event descriptions), as is the information obtained from the Internet.
- Text and information mining help to detect specific text patterns or combinations, as well as trends.
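A very small sketch of pattern detection in free-text narratives is shown below, using only regular expressions from the Python standard library. The term list, the pattern definitions, and the example narratives are hypothetical. Production text mining would normally map terms to a controlled vocabulary and handle negation ("no rash observed"), which this sketch deliberately does not.

```python
import re
from collections import Counter

# Hypothetical patterns for a few event terms, with simple inflections.
PATTERNS = {
    "rash": re.compile(r"\brash(es)?\b", re.IGNORECASE),
    "nausea": re.compile(r"\bnause(a|ous|ated)\b", re.IGNORECASE),
    "headache": re.compile(r"\bheadaches?\b", re.IGNORECASE),
}


def count_mentions(narratives) -> Counter:
    """Count how many narratives mention each term at least once."""
    counts = Counter()
    for text in narratives:
        for term, pattern in PATTERNS.items():
            if pattern.search(text):
                counts[term] += 1
    return counts


# Made-up narratives, for illustration only.
narratives = [
    "Patient developed a rash and felt nauseous after the second dose.",
    "Severe headache reported; no rash observed.",
    "Nausea and headaches for two days.",
]

mentions = count_mentions(narratives)
print(dict(mentions))  # {'rash': 2, 'nausea': 2, 'headache': 2}
```

The second narrative is counted as a rash mention even though the rash is negated, which shows why keyword matching alone tends to overcount.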
Visualization analytics
- Visualization analytics encompasses the use of pictures and graphics to facilitate the understanding of complex data relationships by displaying data in a visually meaningful way.
- It is a fundamental instrument for visualizing patterns when data from multiple sources are integrated, and it is widely used in healthcare data analytics.