Researchers are increasingly using Twitter to analyze what people are talking about at a given time point and over time. Among the multiple uses analysis of tweets can have, safety surveillance, signal detection and discovery of adverse drug events or adverse drug reactions is something that we are just starting to explore for pharmacovigilance analytics, in the framework of social media analysis.
In this post, we analyze the papers retrieved from PubMed with the search string: twitter AND (safety OR pharmacovigilance). On 02 March 2018, that search returned 79 results. From them, we have hand-picked those related to the use of Twitter for drug safety / pharmacovigilance surveillance purposes. Of course, other keyword combinations will return different results, for example “social media data/mining/monitoring”, “adverse drug reaction”, “adverse event”, and others. These other results will be covered in other posts in this series.
This is our selection in chronological order:
The increasing popularity of social media platforms like Twitter presents a new information source for finding potential adverse events. Given the high frequency of user updates, mining Twitter messages can lead us to real-time pharmacovigilance. In this paper, the authors describe an approach to find drug users and potential adverse events by analyzing the content of Twitter messages using Natural Language Processing (NLP) and building Support Vector Machine (SVM) classifiers. Due to the size of the dataset (i.e., 2 billion tweets), the experiments were conducted on a High Performance Computing (HPC) platform using MapReduce, reflecting the broader trend toward big data analytics. The results suggest that daily-life social networking data could help early detection of important patient safety issues.
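To make the MapReduce idea concrete, here is a minimal single-machine sketch of the map and reduce phases applied to counting drug mentions in tweets. It is not the authors' pipeline: the drug lexicon, the tweets, and the function names are all invented for illustration, and a real job would distribute the two phases across an HPC cluster.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical drug lexicon, invented for illustration only.
DRUGS = {"ibuprofen", "prozac"}

def map_phase(tweet):
    """Emit a (drug, 1) pair for each lexicon drug mentioned in one tweet."""
    tokens = set(tweet.lower().split())
    return [(drug, 1) for drug in sorted(DRUGS & tokens)]

def reduce_phase(pairs):
    """Sum the counts emitted for each drug key."""
    totals = defaultdict(int)
    for drug, count in pairs:
        totals[drug] += count
    return dict(totals)

def count_mentions(tweets):
    """Run the map phase on every tweet, then reduce all emitted pairs."""
    return reduce_phase(chain.from_iterable(map_phase(t) for t in tweets))

# Made-up example tweets.
tweets = [
    "took ibuprofen and now my stomach hurts",
    "prozac makes me so sleepy",
    "ibuprofen again today",
    "nice weather",
]
mention_counts = count_mentions(tweets)
print(mention_counts)
```

Because each tweet is mapped independently and the reduce step only sums per key, the same logic can be split across many workers, which is what makes the pattern attractive at the scale of billions of tweets.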
The authors talk about the changing landscape of drug abuse, and that traditional means of characterizing the change are not sufficient any more, because they can miss changes in usage patterns of emerging new drugs. The objective of this paper is to introduce tools for using data from social networks to characterize drug abuse. The authors outline a structured approach to analyze social media in order to capture emerging trends in drug abuse. An analysis of social media discussions about drug abuse patterns with computational linguistics, graph theory, and agent-based modeling permits the real-time monitoring and characterization of trends of drugs of abuse. These tools provide a powerful complement to existing methods of toxicovigilance.
Recent research has shown that Twitter data analytics can have broad implications on public health research. However, its value for pharmacovigilance has been scantly studied – with health-related forums and community support groups preferred for the task. The authors present a systematic study of tweets collected for 74 drugs to assess their value as sources of potential signals for adverse drug reactions (ADRs). They created an annotated corpus of 10,822 tweets. Each tweet was annotated for the presence or absence of ADR mentions, with the span and Unified Medical Language System (UMLS) concept ID noted for each ADR present. Using Cohen’s kappa, they calculated the inter-annotator agreement (IAA) for the binary annotations to be 0.69. To demonstrate the utility of the corpus, they attempted a lexicon-based approach for concept extraction, with promising success (54.1% precision, 62.1% recall, and 57.8% F-measure). A subset of the corpus is freely available at: http://diego.asu.edu/downloads.
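The two metrics quoted above are easy to reproduce. The sketch below checks that the reported F-measure is the harmonic mean of the reported precision and recall, and shows how Cohen’s kappa is computed for two annotators. The annotator label lists are toy values invented for illustration, not data from the paper.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    # Chance agreement: product of each annotator's marginal proportions.
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Precision and recall as reported in the summary above.
reported_f = f_measure(0.541, 0.621)
print(round(reported_f, 3))

# Toy binary annotations (1 = tweet mentions an ADR), invented for illustration.
annotator_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
annotator_b = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
toy_kappa = cohen_kappa(annotator_a, annotator_b)
print(round(toy_kappa, 3))
```

In the toy example the annotators agree on 8 of 10 tweets, but kappa is lower than 0.8 because part of that agreement would be expected by chance.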
Traditional adverse event (AE) reporting systems have been slow in adapting to online AE reporting from patients. In the meantime, increasing numbers of patients have turned to social media to share their experiences with drugs, medical devices, and vaccines. The aim of this study was to evaluate the level of concordance between Twitter posts mentioning AE-like reactions and spontaneous reports received by a regulatory agency. The authors collected public English-language Twitter posts mentioning 23 medical products from 1 November 2012 through 31 May 2013. Data were filtered using a semi-automated process to identify posts with resemblance to AEs (Proto-AEs). A dictionary was developed to translate Internet vernacular to a standardized regulatory ontology for analysis (MedDRA®). Aggregated frequency of identified product-event pairs was then compared with data from the public FDA Adverse Event Reporting System (FAERS) by System Organ Class (SOC). Of the 6.9 million Twitter posts collected, 4,401 Proto-AEs were identified out of 60,000 examined. Automated, dictionary-based symptom classification had 86% recall and 72% precision. Similar overall distribution profiles were observed, with Spearman rank correlation rho of 0.75 (p < 0.0001) between Proto-AEs reported in Twitter and FAERS by SOC. In conclusion, patients reporting AEs on Twitter showed a range of sophistication when describing their experience. Despite the public availability of these data, their appropriate role in pharmacovigilance has not been established. Additional work is needed to improve data acquisition and automation.
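The concordance measure used here, Spearman's rank correlation, compares how two sources rank the same categories rather than comparing raw counts. The following is a minimal sketch of that calculation; the per-SOC counts are hypothetical values made up for illustration and are not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical event counts per System Organ Class, for illustration only.
twitter_counts = {"Nervous system": 120, "Gastrointestinal": 95,
                  "Psychiatric": 80, "Skin": 30, "Cardiac": 10}
faers_counts = {"Nervous system": 900, "Gastrointestinal": 1100,
                "Psychiatric": 700, "Skin": 150, "Cardiac": 400}

socs = sorted(twitter_counts)
rho = spearman_rho([twitter_counts[s] for s in socs],
                   [faers_counts[s] for s in socs])
print(round(rho, 2))
```

Working on ranks means the very different volumes of the two sources do not matter; only the relative ordering of the organ classes does.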
Twitter has been proposed by several studies as a means to track public health trends such as influenza and Ebola outbreaks by analyzing user messages in order to measure different population features and interests. In this work the authors analyze the number and features of mentions on Twitter of drug brand names in order to explore the potential usefulness of the automated detection of drug side effects and drug-drug interactions on social media platforms such as Twitter. This information can be used for the development of predictive models for drug toxicity, drug-drug interactions or drug resistance. Given the large number of drug brand mentions the authors found on Twitter, the platform is promising as a tool for detecting, understanding, and monitoring the way people manage prescribed drugs.
The authors aimed to evaluate clinical outcomes from applications of contemporary social media in chronic disease; to develop a conceptual taxonomy to categorize, summarize, and then analyze the current evidence base; and to suggest a framework for future studies on this topic. They performed a systematic review of MEDLINE via PubMed (January 2000 to January 2015) of studies reporting clinical outcomes on leading contemporary social media (ie, Facebook, Twitter, Wikipedia, YouTube) use in 10 chronic diseases. Of 378 citations identified, 42 studies examining the use of Facebook (n = 16), blogs (n = 13), Twitter (n = 8), wikis (n = 5), and YouTube (n = 4) on outcomes in cancer (n = 14), depression (n = 13), obesity (n = 9), diabetes (n = 4), heart disease (n = 3), stroke (n = 2), and chronic lower respiratory tract infection (n = 1) were included. Studies were classified as support (n = 16), patient education (n = 10), disease modification (n = 6), disease management (n = 5), and diagnosis (n = 5) within our taxonomy. The overall impact of social media on chronic disease was variable, with 48% of studies indicating benefit, 45% neutral or undefined, and 7% suggesting harm. Among studies that showed benefit, 85% used either Facebook or blogs, and 40% were based within the domain of support. The authors concluded that using social media to provide social, emotional, or experiential support in chronic disease, especially with Facebook and blogs, appears most likely to improve patient care.
There is growing interest in whether social media can capture patient-generated information relevant for medicines safety surveillance that cannot be found in traditional sources. The aim of this study was to evaluate the potential contribution of mining social media networks for medicines safety surveillance using the following associations as case studies: (1) rosiglitazone and cardiovascular events (i.e. stroke and myocardial infarction); and (2) human papilloma virus (HPV) vaccine and infertility. The authors collected publicly accessible, English-language posts on Facebook, Google+, and Twitter until September 2014. Data were queried for co-occurrence of keywords related to the drug/vaccine and event of interest within a post. Messages were analysed with respect to geographical distribution, context, linking to other web content, and author’s assertion regarding the supposed association. A total of 2537 posts related to rosiglitazone/cardiovascular events and 2236 posts related to HPV vaccine/infertility were retrieved, with the majority of posts representing data from Twitter (98 and 85%, respectively) and originating from users in the US. Approximately 21% of rosiglitazone-related posts and 84% of HPV vaccine-related posts referenced other web pages, mostly news items, law firms’ websites, or blogs. Assertion analysis predominantly showed affirmation of the association of rosiglitazone/cardiovascular events (72%; n = 1821) and of HPV vaccine/infertility (79%; n = 1758). Only ten posts described personal accounts of rosiglitazone/cardiovascular adverse event experiences, and nine posts described HPV vaccine problems related to infertility. The authors concluded that publicly available data from the considered social media networks were sparse and largely untraceable for the purpose of providing early clues of safety concerns regarding the prespecified case studies. 
Further research investigating other case studies and exploring other social media platforms is necessary to further characterise the usefulness of social media for safety surveillance.
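The retrieval step in this study, querying posts for co-occurrence of drug and event keywords, can be sketched in a few lines. The keyword lists and example posts below are assumptions made for illustration (the paper's actual query terms are not given in this summary).

```python
import re

# Hypothetical keyword lists; the study's real search terms are not listed here.
DRUG_TERMS = {"rosiglitazone", "avandia"}
EVENT_TERMS = {"stroke", "heart attack", "myocardial infarction"}

def mentions_any(text, terms):
    """True if any term appears in the text as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms
    )

def co_occurs(post):
    """True if a post mentions both a drug term and an event term."""
    return mentions_any(post, DRUG_TERMS) and mentions_any(post, EVENT_TERMS)

# Made-up example posts.
posts = [
    "Avandia linked to heart attack risk, says news report",
    "My dad had a stroke last year",
    "Started rosiglitazone this week, feeling fine",
    "Lawsuit: rosiglitazone and stroke",
]
matches = [p for p in posts if co_occurs(p)]
print(len(matches))
```

Note that both matching posts in this toy example are news or legal content rather than personal accounts, which mirrors the paper's finding that co-occurrence alone retrieves few first-hand reports.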
Self-reported patient data has been shown to be a valuable knowledge source for post-market pharmacovigilance. In this paper the authors propose using Twitter to gather evidence about adverse drug reactions (ADRs) after first identifying micro-blog messages (also known as “tweets”) that report first-hand experience. In order to achieve this goal, they explore machine learning with data crowdsourced from lay annotators. With the help of lay annotators recruited from CrowdFlower, they manually annotated 1548 tweets containing keywords related to two kinds of drugs: SSRIs (e.g., Paroxetine) and cognitive enhancers (e.g., Ritalin). Results show that inter-annotator agreement (Fleiss’ kappa) for crowdsourcing ranks in moderate agreement with a pair of experienced annotators (Spearman’s rho = 0.471). The authors used the gold standard annotations from CrowdFlower to automatically train a range of supervised machine learning models to recognize first-hand experience. F-Score values are reported for 6 of these techniques, with the Bayesian Generalized Linear Model being the best (F-Score = 0.64 and Informedness = 0.43) when combined with a selected set of features obtained by using information gain criteria.
For the task of selecting ADR data from the crowdsourced annotations, the Bayesian Generalized Linear Model (BGLM) provided the highest overall F-Score among the models tested, surpassed only by C50 when using the top 50% and 100% of the features; in terms of Informedness, BGLM obtained the best scores throughout.
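Informedness is less familiar than the F-Score, so a small worked example may help. It is defined as sensitivity plus specificity minus one, and unlike the F-Score it takes true negatives into account. The confusion-matrix counts below are hypothetical and are not the paper's results.

```python
def f_score(tp, fp, fn):
    """F1 score from confusion-matrix counts."""
    return 2 * tp / (2 * tp + fp + fn)

def informedness(tp, fp, fn, tn):
    """Informedness = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1

# Hypothetical counts for a first-hand-experience classifier.
tp, fp, fn, tn = 30, 10, 20, 40

toy_f = f_score(tp, fp, fn)
toy_informedness = informedness(tp, fp, fn, tn)
print(round(toy_f, 3), round(toy_informedness, 3))
```

An informedness of 0 corresponds to chance-level guessing and 1 to perfect classification, which is why a model can rank differently on the two metrics, as BGLM and C50 do above.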
Error-reporting systems are widely regarded as critical components to improving patient safety, yet current systems do not effectively engage patients. The authors sought to assess Twitter as a source to gather patient perspective on errors in this feasibility study. They included publicly accessible tweets in English from any geography. To collect patient safety tweets, the authors consulted a patient safety expert and constructed a set of highly relevant phrases, such as “doctor screwed up.” Then they used Twitter’s search application program interface from January to August 2012 to identify tweets that matched the set of phrases. Two researchers used criteria to independently review tweets and choose those relevant to patient safety; a third reviewer resolved discrepancies. Variables included source and sex of tweeter, source and type of error, emotional response, and mention of litigation. Of 1006 tweets analyzed, 839 (83%) identified the type of error: 26% of which were procedural errors, 23% were medication errors, 23% were diagnostic errors, and 14% were surgical errors. A total of 850 (84%) identified a tweet source, 90% of which were by the patient and 9% by a family member. A total of 519 (52%) identified an emotional response, 47% of which expressed anger or frustration, 21% expressed humor or sarcasm, and 14% expressed sadness or grief. Of the tweets, 6.3% mentioned an intent to pursue malpractice litigation. The authors concluded that Twitter is a relevant data source to obtain the patient perspective on medical errors. Twitter may provide an opportunity for health systems and providers to identify and communicate with patients who have experienced a medical error. Further research is needed to assess the reliability of the data.
Limitations of classical data sources for post-market surveillance include potential under-reporting, lack of geographic diversity, and time lag between event occurrence and discovery. There is growing interest in exploring the use of social media (‘social listening’) to supplement established approaches for pharmacovigilance. Although social listening is commonly used for commercial purposes, there are only anecdotal reports of its use in pharmacovigilance. Health information posted online by patients is often publicly available, representing an untapped source of post-marketing safety data that could supplement data from existing sources. The objective of this paper is to describe one methodology that could help unlock the potential of social media for safety surveillance. A third-party vendor acquired 24 months of publicly available Facebook and Twitter data, then processed the data by standardizing drug names and vernacular symptoms, removing duplicates and noise, masking personally identifiable information, and adding supplemental data to facilitate the review process. The resulting dataset was analyzed for safety and benefit information. In Twitter, a total of 6,441,679 Medical Dictionary for Regulatory Activities (MedDRA®) Preferred Terms (PTs) representing 702 individual PTs were discussed in the same post as a drug, compared with 15,650,108 total PTs representing 946 individual PTs in Facebook. Further analysis revealed that 26% of posts also contained benefit information. The authors concluded that social media listening is an important tool to augment post-marketing safety surveillance. Much work remains to determine best practices for using this rapidly evolving data source.
Adrover C, Bodnar T, Huang Z, Telenti A, Salathe M. Identifying Adverse Effects of HIV Drug Treatment and Associated Sentiments Using Twitter. JMIR Public Health Surveill 2015 Jul 27;1(2):e7. doi: 10.2196/publichealth.4488.
Social media platforms are increasingly seen as a source of data on a wide range of health issues. Twitter is of particular interest for public health surveillance because of its public nature. However, the very public nature of social media platforms such as Twitter may act as a barrier to public health surveillance, as people may be reluctant to publicly disclose information about their health. This is of particular concern in the context of diseases that are associated with a certain degree of stigma, such as HIV/AIDS. The objective of the study was to assess whether adverse effects of HIV drug treatment and associated sentiments can be determined using publicly available data from social media. The authors describe a combined approach of machine learning and crowdsourced human assessment to identify adverse effects of HIV drug treatment based solely on individual reports posted publicly on Twitter. Starting from a large dataset of 40 million tweets collected over three years, they identified a very small subset (1642; 0.004%) of individual reports describing personal experiences with HIV drug treatment. Despite the small size of the extracted final dataset, the summary representation of adverse effects attributed to specific drugs, or drug combinations, accurately captures well-recognized toxicities. In addition, the data allowed them to discriminate across specific drug compounds, to identify preferred drugs over time, and to capture novel events such as the availability of preexposure prophylaxis. The authors conclude that the effect of limited data sharing due to the public nature of the data can be partially offset by the large number of people sharing data in the first place, an observation that may play a key role in digital epidemiology in general.
Korkontzelos I, Nikfarjam A, Shardlow M, Sarker A, Ananiadou S, Gonzalez GH. Analysis of the Effect of Sentiment Analysis on Extracting Adverse Drug Reactions from Tweets and Forum Posts. J Biomed Inform 2016;62:148-68.
Based on the intuition that patients post about Adverse Drug Reactions (ADRs) expressing negative sentiments, the authors investigated the effect of sentiment analysis features in locating ADR mentions. To achieve that, the authors enriched the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, they evaluated the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions. Evaluation results show that sentiment analysis features marginally improve ADR identification in tweets and health related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14% to 73.22% in the Twitter part of an existing corpus using its original train/test split. Using stratified 10×10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57% to 80.14%, and in the Twitter part of the corpus, from 66.91% to 69.16%. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications. In conclusion, this study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. This improvement can be of use to pharmacovigilance practice, due to the rapidly increasing popularity of social media and health forums.
With the development of Web 2.0, social media has become a large data source for information on adverse drug events (ADEs). The objective of this study was to develop a relation extraction system that uses natural language processing techniques to effectively distinguish between ADEs and non-ADEs in informal text on social media. The authors developed a feature-based approach that utilizes various lexical, syntactic, and semantic features. Information-gain-based feature selection is performed to address high-dimensional features. Then, they evaluated the effectiveness of four well-known kernel-based approaches (i.e., subset tree kernel, tree kernel, shortest dependency path kernel, and all-paths graph kernel) and several ensembles that are generated by adopting different combination methods (i.e., majority voting, weighted averaging, and stacked generalization). All of the approaches are tested using three data sets: two health-related discussion forums and one general social networking site (i.e., Twitter). When investigating the contribution of each feature subset, the feature-based approach attains the best area under the receiver operating characteristics curve (AUC) values, which are 78.6%, 72.2%, and 79.2% on the three data sets. When individual methods are used, the best AUC values are 82.1%, 73.2%, and 77.0%, obtained with the subset tree kernel, shortest dependency path kernel, and feature-based approach on the three data sets, respectively. When using classifier ensembles, the best AUC values are 84.5%, 77.3%, and 84.5% on the three data sets, outperforming the baselines. In conclusion, the experimental results indicate that ADE extraction from social media can benefit from feature selection. With respect to the effectiveness of different feature subsets, lexical features and semantic features can enhance the ADE extraction capability. Kernel-based approaches, which avoid the feature sparsity issue, are well suited to the ADE extraction problem.
Combining different individual classifiers using suitable combination methods can further enhance the ADE extraction effectiveness.
Adverse drug events (ADEs) constitute one of the leading causes of post-therapeutic death and their identification constitutes an important challenge of modern precision medicine. Unfortunately, the onset and effects of ADEs are often underreported, complicating timely intervention. At over 500 million posts per day, Twitter is a commonly used social media platform. The ubiquity of day-to-day personal information exchange on Twitter makes it a promising target for data mining for ADE identification and intervention. Three technical challenges are central to this problem: (1) identification of salient medical keywords in (noisy) tweets, (2) mapping drug-effect relationships, and (3) classification of such relationships as adverse or non-adverse. The authors used a bipartite graph-theoretic representation called a drug-effect graph (DEG) for modeling drug and side effect relationships by representing the drugs and side effects as vertices. They construct individual DEGs on two data sources. The first DEG is constructed from the drug-effect relationships found in FDA package inserts as recorded in the SIDER database. The second DEG is constructed by mining the history of Twitter users. Dictionary-based information extraction is used to identify medically relevant concepts in tweets. Drugs, along with co-occurring symptoms, are connected with edges weighted by temporal distance and frequency. Finally, information from the SIDER DEG is integrated with the Twitter DEG and edges are classified as either adverse or non-adverse using supervised machine learning.
The authors examined both graph-theoretic and semantic features for the classification task. The proposed approach can identify adverse drug effects with high accuracy, with precision exceeding 85% and F1 exceeding 81%. When compared with leading state-of-the-art methods, which employ un-enriched graph-theoretic analysis alone, the method leads to improvements ranging between 5 and 8% in terms of the aforementioned measures. Additionally, the authors use the method to discover several ADEs which, though present in medical literature and Twitter streams, are not represented in the SIDER database. In conclusion, the authors present a DEG integration model as a powerful formalism for the analysis of drug-effect relationships that is general enough to accommodate diverse data sources, yet rigorous enough to provide a strong mechanism for ADE identification.
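As a rough illustration of the Twitter-side drug-effect graph described above, the sketch below builds a bipartite graph from one user's timeline using dictionary lookup, with edge weights that grow with co-occurrence frequency and shrink with temporal distance. The lexicons, the 48-hour window, and the 1/(1 + gap) weighting are all assumptions made for this example; the paper's actual weighting scheme is not specified in this summary.

```python
from collections import defaultdict

# Hypothetical lexicons, invented for illustration.
DRUGS = {"ibuprofen", "sertraline"}
EFFECTS = {"nausea", "headache", "insomnia"}

def build_deg(timeline, max_gap_hours=48):
    """Build a drug-effect graph from (hour, text) posts of one user.

    Returns {(drug, effect): weight}. Each effect mention that follows a
    drug mention within the window adds 1 / (1 + gap_in_hours) to the edge,
    so frequent and temporally close co-occurrences weigh more.
    """
    mentions = []
    for hour, text in timeline:
        tokens = set(text.lower().split())
        mentions += [(hour, "drug", d) for d in DRUGS & tokens]
        mentions += [(hour, "effect", e) for e in EFFECTS & tokens]

    edges = defaultdict(float)
    for h1, kind1, drug in mentions:
        if kind1 != "drug":
            continue
        for h2, kind2, effect in mentions:
            gap = h2 - h1
            if kind2 == "effect" and 0 <= gap <= max_gap_hours:
                edges[(drug, effect)] += 1.0 / (1.0 + gap)
    return dict(edges)

# Made-up timeline: (hours since first post, text).
timeline = [
    (0, "started sertraline today"),
    (1, "so much nausea"),
    (3, "nausea again and insomnia"),
    (100, "headache"),
]
deg = build_deg(timeline)
print(deg)
```

In the full approach, edges like these would then be compared with the SIDER-derived graph and passed to a supervised classifier; this sketch covers only the graph construction step.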
Koutkias VG, Lillo-le-Louet A, Jaulent MC. Exploiting Heterogeneous Publicly Available Data Sources for Drug Safety Surveillance: Computational Framework and Case Studies. Expert Opin Drug Saf 2017;16(2):113-24.
In this article, the authors introduce and validate a computational framework exploiting dominant as well as emerging publicly available data sources for drug safety surveillance. Their approach relies on appropriate query formulation for data acquisition and subsequent filtering, transformation and joint visualization of the obtained data. Data from the FDA Adverse Event Reporting System (FAERS), PubMed and Twitter were used. In order to assess the validity and the robustness of the approach, the authors elaborated on two important case studies, namely, clozapine-induced cardiomyopathy/myocarditis versus haloperidol-induced cardiomyopathy/myocarditis, and apixaban-induced cerebral hemorrhage.
The analysis of the obtained data provided interesting insights (identification of potential patient and health-care professional experiences regarding ADRs in Twitter, information/arguments against an ADR existence across all sources), while illustrating the benefits (complementing data from multiple sources to strengthen/confirm evidence) and the underlying challenges (selecting search terms, data presentation) of exploiting heterogeneous information sources, thereby advocating the need for the proposed framework. The authors concluded that this work contributes in establishing a continuous learning system for drug safety surveillance by exploiting heterogeneous publicly available data sources via appropriate support tools.
Pierce CE, Bouri K, Pamer C, Proestel S, Rodriguez HW, Van Le H, et al. Evaluation of Facebook and Twitter Monitoring to Detect Safety Signals for Medical Products: An Analysis of Recent FDA Safety Alerts. Drug Saf 2017;40(4):317-31.
The rapid expansion of the Internet and computing power in recent years has opened up the possibility of using social media for pharmacovigilance. While this general concept has been proposed by many, central questions remain as to whether social media can provide earlier warnings for rare and serious events than traditional signal detection from spontaneous report data. The objective was to examine whether specific product-adverse event pairs were reported via social media before being reported to the US FDA Adverse Event Reporting System (FAERS). A retrospective analysis of public Facebook and Twitter data was conducted for 10 recent FDA postmarketing safety signals at the drug-event pair level with six negative controls. Social media data corresponding to two years prior to signal detection of each product-event pair were compiled. Automated classifiers were used to identify each ‘post with resemblance to an adverse event’ (Proto-AE), among English language posts. A custom dictionary was used to translate Internet vernacular into Medical Dictionary for Regulatory Activities (MedDRA®) Preferred Terms. Drug safety physicians conducted a manual review to determine causality using World Health Organization-Uppsala Monitoring Centre (WHO-UMC) assessment criteria. Cases were also compared with those reported in FAERS.
A total of 935,246 posts were harvested from Facebook and Twitter, from March 2009 through October 2014. The automated classifier identified 98,252 Proto-AEs. Of these, 13 posts were selected for causality assessment of product-event pairs. Clinical assessment revealed that posts had sufficient information to warrant further investigation for two possible product-event associations: dronedarone-vasculitis and Banana Boat Sunscreen–skin burns. No product-event associations were found among the negative controls. In one of the positive cases, the first report occurred in social media prior to signal detection from FAERS, whereas the other case occurred first in FAERS.
In conclusion, an efficient semi-automated approach to social media monitoring may provide earlier insights into certain adverse events. More work is needed to elaborate additional uses for social media data in pharmacovigilance and to determine how they can be applied by regulatory agencies.
Social media is an important pharmacovigilance data source for adverse drug reaction (ADR) identification. As human review is infeasible due to data quantity, natural language processing techniques are necessary. Social media includes informal vocabulary and irregular grammar, which challenge natural language processing methods. The objective of this study was to develop a scalable, deep-learning approach that exceeds state-of-the-art ADR detection performance in social media. The authors developed a recurrent neural network (RNN) model that labels words in an input sequence with ADR membership tags. The only input features are word-embedding vectors, which can be formed through task-independent pretraining or during ADR detection training.
The best-performing RNN model used pretrained word embeddings created from a large, non-domain-specific Twitter dataset. It achieved an approximate match F-measure of 0.755 for ADR identification on the dataset, compared to 0.631 for a baseline lexicon system and 0.65 for the state-of-the-art conditional random field model. Feature analysis indicated that semantic information in pretrained word embeddings boosted sensitivity and, combined with contextual awareness captured in the RNN, precision.
The model required no task-specific feature engineering, suggesting generalizability to additional sequence-labeling tasks. Learning curve analysis showed that the model reached optimal performance with fewer training examples than the other models.
In conclusion, ADR detection performance in social media is significantly improved by using a contextually aware model and word embeddings formed from large, unlabeled datasets. The approach reduces manual data-labeling requirements and is scalable to large social media datasets.
The digital revolution has contributed to very large data sets (ie, big data) relevant for public health. The two major data sources are electronic health records from traditional health systems and patient-generated data. As the two data sources have complementary strengths (high veracity in the data from traditional sources, and high velocity and variety in patient-generated data), they can be combined to build more-robust public health systems. However, they also have unique challenges. Patient-generated data in particular are often completely unstructured and highly context dependent, posing essentially a machine-learning challenge. Some recent examples from infectious disease surveillance and adverse drug event monitoring demonstrate that the technical challenges can be solved. Despite these advances, the problem of verification remains, and unless traditional and digital epidemiologic approaches are combined, these data sources will be constrained by their intrinsic limits.
Comfort S, Perera S, Hudson Z, Dorrell D, Meireis S, Nagarajan M, et al. Sorting Through the Safety Data Haystack: Using Machine Learning to Identify Individual Case Safety Reports in Social-Digital Media. Drug Saf 2018;doi: 10.1007/s40264-018-0641-7.
There is increasing interest in social digital media (SDM) as a data source for pharmacovigilance activities; however, SDM is considered a low information content data source for safety data. Given that pharmacovigilance itself operates in a high-noise, lower-validity environment without objective ‘gold standards’ beyond process definitions, the introduction of large volumes of SDM into the pharmacovigilance workflow has the potential to exacerbate issues with limited manual resources to perform adverse event identification and processing. Recent advances in medical informatics have resulted in methods for developing programs which can assist human experts in the detection of valid individual case safety reports (ICSRs) within SDM. The objective of this study was to develop rule-based and machine learning (ML) models for classifying ICSRs from SDM and to compare their performance with that of human pharmacovigilance experts. The authors used a random sampling from a collection of 311,189 SDM posts that mentioned Roche products and brands in combination with common medical and scientific terms sourced from Twitter, Tumblr, Facebook, and a spectrum of news media blogs to develop and evaluate three iterations of an automated ICSR classifier. The ICSR classifier models consisted of sub-components to annotate the relevant ICSR elements and a component to make the final decision on the validity of the ICSR. Agreement with human pharmacovigilance experts was chosen as the preferred performance metric and was evaluated by calculating the Gwet AC1 statistic (gKappa). The best performing model was tested against the Roche global pharmacovigilance expert using a blind dataset and put through a time test of the full 311,189-post dataset.
During this effort, the initial strict rule-based approach to ICSR classification resulted in a model with an accuracy of 65% and a gKappa of 46%. Adding an ML-based adverse event annotator improved the accuracy to 74% and gKappa to 60%. This was further improved by the addition of an additional ML ICSR detector. On a blind test set of 2500 posts, the final model demonstrated a gKappa of 78% and an accuracy of 83%. In the time test, it took the final model 48 h to complete a task that would have taken an estimated 44,000 h for human experts to perform.
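For readers unfamiliar with the Gwet AC1 statistic used as the agreement metric here, the sketch below computes it for two raters and also works out the speedup implied by the reported time test. The model and expert label lists are toy values invented for illustration; only the 48 h and 44,000 h figures come from the summary above.

```python
def gwet_ac1(labels_a, labels_b):
    """Gwet's AC1 agreement coefficient for two raters."""
    n = len(labels_a)
    categories = sorted(set(labels_a) | set(labels_b))
    k = len(categories)
    if k < 2:
        return 1.0  # only one category used: agreement is trivially perfect
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement uses the average prevalence of each category.
    chance = 0.0
    for c in categories:
        pi = (labels_a.count(c) + labels_b.count(c)) / (2 * n)
        chance += pi * (1 - pi)
    chance /= (k - 1)
    return (observed - chance) / (1 - chance)

# Toy labels (1 = valid ICSR), invented for illustration.
model_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
expert_labels = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
toy_ac1 = gwet_ac1(model_labels, expert_labels)
print(round(toy_ac1, 2))

# Speedup implied by the reported time test: 44,000 h versus 48 h.
speedup = 44_000 / 48
print(round(speedup))
```

AC1 is often preferred over Cohen's kappa when one category is rare, as valid ICSRs are in social media, because kappa can drop sharply under skewed prevalence even when raw agreement is high.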
In conclusion, the results of this study indicate that an effective and scalable solution to the challenge of ICSR detection in SDM includes a workflow using an automated ML classifier to identify likely ICSRs for further human SME review.