Since its inception, clinical data management (CDM) has evolved significantly, not only in its process and efficiency but also in its interaction with stakeholders, functional areas, and the surrounding technology, including the software applications that enable and support it. In its initial phases, CDM activities were performed using specially written computer programs. Over time, commercial off-the-shelf (COTS) products, such as Oracle Clinical, Clintrial, and SAS DataFlux, were adopted. The industry also moved towards electronic data capture (EDC) systems, and most EDC vendors had or built interfaces to integrate EDC with CDM systems.
The introduction of risk-based monitoring (RBM) has added another dimension of variability to CDM. The traditional roles of data management, which focused on internal data checks, third-party data transfers, coding, and electronic submission preparation, have expanded. CDM professionals are now expected to provide near real-time data intelligence to study teams. Advances in CDM software have enabled that transition, increasing the value and importance of CDM professionals. The evolving role of CDM, the need for integration with other data sources, the required skillset of CDM professionals, and the increased reliance on data management software all call for a fresh look at how CDM software is selected and implemented. In this paper, we look at recent trends in CDM software, the selection criteria, and the implementation roadmap, with a focus on newer, more sophisticated software.
Clinical data management (CDM) is a critical process in clinical research that leads to the generation of high-quality, reliable, and statistically sound research results. Today, most CDM tasks are assisted by software. Over the last 15 years, CDM has moved from a low level of software support and automation to a proliferation of such software, which makes the selection of clinical data management software a key decision for CDM professionals. At the same time, integrating clinical trial data from multiple sources, such as labs, imaging, and patient-reported outcomes (PRO), is becoming more complex. Newer software adds capabilities for handling diverse data sources, such as electronic health records (EHRs), registries, genomics, and personalized medicine data.
Definition and Importance of Clinical Data Management Software
The increasing number of clinical and preclinical studies globally drives innovation in discovery, development, and translational medicine, and with it the demand for advanced electronic systems such as CDMS. The earliest CDMS were paper-based or web-based systems, and custom-built systems were the main form of early CDMS development. Many companies have since started to develop and deliver innovative commercial products. Conventional CDMS providers include Oracle, Medidata Solutions, and others. Relatively new vendors now offer CDMS with a flexible delivery model, rapid study deployment, configuration capabilities, and integration options using standards such as those of the Clinical Data Interchange Standards Consortium (CDISC). Prominent vendors offering cloud-hosted CDMS include MedNet Solutions, Medidata, and others.
Clinical Data Management Software (CDMS) is an application used to manage the life cycle of clinical data. The data generated during a clinical trial or preclinical study is stored in the CDMS. The CDMS offers a data repository that has long-term value in supporting retrospective post-market surveillance and data and safety monitoring boards. Additionally, a CDMS includes electronic case report forms that enable users to enter study data and generate queries. These forms are specifically designed for researchers and clinicians to report the data collected in the study.
Key Features of Clinical Data Management Software
Data Coding and Clinical Dictionary: This feature allows users to standardize and encode data through a built-in dictionary that contains standard coding and terminology such as MedDRA and WHODrug. The clinical dictionary feature allows users to add, remove, and modify terms as necessary.
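The dictionary-based coding described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the dictionary entries, the placeholder codes (which are not real MedDRA or WHODrug codes), and the names `DICTIONARY`, `CodingResult`, and `auto_code`. Real coding tools use licensed dictionaries and more sophisticated matching.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mini-dictionary: normalized verbatim text -> (code, preferred term).
# The codes are invented placeholders, not real MedDRA or WHODrug codes.
DICTIONARY = {
    "headache": ("T0001", "Headache"),
    "head ache": ("T0001", "Headache"),
    "nausea": ("T0002", "Nausea"),
    "feeling sick": ("T0002", "Nausea"),
}


@dataclass
class CodingResult:
    verbatim: str
    code: Optional[str]
    preferred_term: Optional[str]
    needs_manual_review: bool


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def auto_code(verbatim: str, dictionary=DICTIONARY) -> CodingResult:
    """Look up a verbatim term; flag it for manual coding if there is no exact match."""
    hit = dictionary.get(normalize(verbatim))
    if hit is None:
        return CodingResult(verbatim, None, None, True)
    code, term = hit
    return CodingResult(verbatim, code, term, False)
```

In this sketch, adding a term is a plain dictionary update, for example `DICTIONARY["queasy"] = ("T0002", "Nausea")`, and anything that does not match exactly is routed to a human coder rather than guessed.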
Reporting and Analytics: This feature allows clinical research teams to generate reports and show key performance metrics relevant to the trial. Some software offers pre-configured reports whereas other reports can be customized. The ability to support ad hoc reporting should also be considered.
Randomization and Trial Supply Management: Some software offers randomization of trial participants into different arms of the study. This feature may also be linked to trial supply management that helps in managing the inventory of trial supplies. Be sure to check the capabilities of your software in order to prevent over-recruitment as well as drug supply shortages.
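As a rough illustration of how randomization and supply tracking can be linked, the sketch below uses permuted-block randomization, one common scheme that the text above does not specifically name. The function names, arm labels, block size, and inventory format are all assumptions; production systems add stratification, allocation concealment, and audit trails.

```python
import random
from collections import Counter


def block_randomization(n_participants, arms=("A", "B"), block_size=4, seed=None):
    """Return an allocation list built from shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    schedule = []
    while len(schedule) < n_participants:
        block = list(arms) * per_arm  # equal number of slots per arm
        rng.shuffle(block)            # random order within the block
        schedule.extend(block)
    return schedule[:n_participants]  # hard cap guards against over-recruitment


def supply_shortfall(schedule, inventory):
    """Compare kits needed per arm with kits on hand; return only the shortfalls."""
    demand = Counter(schedule)
    return {
        arm: needed - inventory.get(arm, 0)
        for arm, needed in demand.items()
        if needed > inventory.get(arm, 0)
    }
```

Calling `supply_shortfall(block_randomization(8, seed=1), {"A": 3, "B": 10})` would report how many additional kits arm A needs before all eight participants can be supplied.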
Electronic Data Capture (EDC): This is arguably the most important feature of any clinical data management software. EDC allows clinical trial data to be collected into the trial’s database through online forms. EDC eliminates the need for paper CRFs (case report forms), which are completed by the clinical site and then mailed to the sponsor (and/or CRO) for data entry into the database. EDC reduces the time required to collect data and allows for real-time data validation.
Data Collection and Entry
Consistency in the way data is collected and managed should lead to greater efficiency, as the same technology can be re-used from study to study, and to improved data quality, as there are fewer opportunities for technology and process errors.
Over the last few years, standards bodies such as the Clinical Data Interchange Standards Consortium (CDISC) have worked to define and promote standards for the electronic interchange of clinical trial data. These standards, together with wider acceptance of technologies such as eXtensible Markup Language (XML) for defining electronic documents, and the use of the internet and intranets as the transport medium, are leading to greater consistency in the way data is collected and managed.
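To make the idea of XML-based interchange concrete, here is a minimal sketch that flattens a small, simplified document loosely modeled on the element hierarchy of the CDISC Operational Data Model (ODM). The sample content, the study and item identifiers, and the `flatten` helper are assumptions for illustration; real ODM files carry XML namespaces, metadata, and audit information that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free sample loosely following the ODM hierarchy.
# All identifiers and values below are invented for illustration.
ODM_STYLE_XML = """
<ODM>
  <ClinicalData StudyOID="STUDY-001">
    <SubjectData SubjectKey="S001">
      <StudyEventData StudyEventOID="VISIT1">
        <FormData FormOID="VITALS">
          <ItemGroupData ItemGroupOID="VS">
            <ItemData ItemOID="SYSBP" Value="120"/>
            <ItemData ItemOID="DIABP" Value="80"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>
"""


def flatten(xml_text):
    """Turn the nested subject/event/form/item structure into flat rows."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for subject in root.iter("SubjectData"):
        for event in subject.iter("StudyEventData"):
            for form in event.iter("FormData"):
                for item in form.iter("ItemData"):
                    rows.append({
                        "subject": subject.get("SubjectKey"),
                        "event": event.get("StudyEventOID"),
                        "form": form.get("FormOID"),
                        "item": item.get("ItemOID"),
                        "value": item.get("Value"),
                    })
    return rows
```

Because every system that follows the same structure can be read by the same few lines, a shared format of this kind is what allows data to move between EDC, laboratory, and data management systems without bespoke converters for each study.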
The available data management software systems vary widely in their approach and functionality. Some are commercially available “off-the-shelf products” which can be implemented with a minimum of configuration. Others are bespoke systems developed for a specific study, often at a considerable cost.
A number of software systems have been developed to enable more efficient data collection by allowing clinical investigators to access the study database via the internet or intranet. Often, these systems use techniques such as electronic pen and paper to convert the data into an electronic format.
Data collection at the clinical site continues to be heavily dependent upon traditional paper-based methods. Often, the same information needs to be collected on multiple case report forms and the same data subsequently entered into different systems: for example, laboratory results into a laboratory information management system, adverse events into a pharmacovigilance system, and clinical data into a data management system. This duplicate data entry to support different systems is time-consuming and has significant cost implications.
Data Cleaning and Quality Control
Upon initiation of data collection for a new study, all checks associated with the study, including edit checks and any data conversions, should be thoroughly reviewed and documented in a manner consistent with the study-specific procedures already in place. The listing and review of newly implemented quality control checks must include a review of any problems that have already occurred. Although a list of problems can be generated at any time during the data collection effort, a “problem list” is often generated only after a trial data entry with “test” or “dummy” subjects has been performed. The review of test subjects is especially important, as it offers the opportunity to test data collection, data handling, and data review procedures as well as the quality control checks themselves.
Manual quality control review prior to the data load can be accomplished in two ways. If the EDC supports a “preview mode”, the reviewer can use it to verify the data on a screen-by-screen basis. Alternatively, all of the data entry forms can be printed and each form reviewed. Manual quality control review after the data load can also be done in two ways: the data for a specific form can be viewed and printed as displayed, which shows the data as it has been loaded, or all of the loaded data can be extracted and then reviewed, corrected, and reloaded. The EDC system thus provides for extensive validation of the clinical data through both automatic procedures and human review.
The data cleaning and QC process is facilitated by listing and addressing any large or unusual values, failed logic checks, and other problems that have edit checks associated with them. Clinical data management software contains two types of quality control checks: automatic and manual. The automatic checks are performed each time data is entered or updated for a participant. The manual checks, along with a listing of the data being converted and any problems that have occurred with that data, can be performed both before and after data has been loaded into the software.
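The automatic checks described above are typically range checks on single values and logic checks across fields. The following is a minimal sketch of both kinds; the field names, the numeric limits, and the function names are assumptions chosen for illustration, not rules from any particular study or product.

```python
from datetime import date


def check_range(record, field, low, high):
    """Range check: flag a missing value or one outside the expected limits."""
    value = record.get(field)
    if value is None:
        return f"{field}: missing value"
    if not (low <= value <= high):
        return f"{field}: {value} outside expected range {low}-{high}"
    return None


def check_visit_after_consent(record):
    """Logic check: a study visit should not precede informed consent."""
    consent = record.get("consent_date")
    visit = record.get("visit_date")
    if consent and visit and visit < consent:
        return "visit_date: earlier than consent_date"
    return None


def run_edit_checks(record):
    """Run every check and return the list of query messages (empty if clean)."""
    checks = [
        lambda r: check_range(r, "systolic_bp", 70, 250),  # illustrative limits
        lambda r: check_range(r, "age_years", 18, 100),    # illustrative limits
        check_visit_after_consent,
    ]
    return [msg for check in checks if (msg := check(record)) is not None]
```

Running `run_edit_checks` each time a record is saved mirrors the "on entry or update" behavior; the returned messages would become data queries for the site to resolve.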
Benefits of Using Clinical Data Management Software
One of the key benefits of using clinical data management software is that it can help you to be more efficient in your work. For example, software can help with data collection by allowing you to create electronic case report forms (eCRFs) that are easy to use and can be completed quickly. Software can also help with data organization and storage by using a central database that can be accessed by multiple users. This type of database can store all of the study data, as well as any supporting documents, such as lab reports or questionnaires. In addition, software can help with data cleaning by using automated checks and edit programs that can identify and correct errors in the data. With these checks and edit programs, the data can be cleaned much faster than if it were done manually, which will save time and reduce the risk of errors.
Many companies are now using clinical data management software to help them with their day-to-day operations. This type of software can help with data collection, organization, and storage, as well as with data cleaning, analysis, and reporting. There are many different clinical data management software programs available, with varying levels of complexity and cost. Some companies have even developed their own proprietary software to meet their specific needs. No matter which software is used, there are many benefits to be gained from using it, such as increased efficiency, improved accuracy, and reduced time and cost.
Increased Efficiency and Accuracy
Real-time validation of data against the study edit check rules is made possible by integrating the eCRF software with the clinical trial management system (CTMS) and the clinical trial database. When the user submits a page of data from the eCRF, the submitted data are validated using the edit check rules, and if no errors are found, the data are transferred to the database. If an error is identified, a data query is generated and displayed to the user, and the error message should appear in both the query and the query listing views in real time. As the volume of data entered on a single page and the complexity of the edit check rules increase, real-time validation may fail to complete within an acceptable response time. In that case, real-time validation should be performed on a portion of the data; if that portion passes, the remaining data should be submitted, and validation of the entire set of data should then be performed in batch.
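The fallback described above, checking part of a page in real time and validating the full page later in batch, can be sketched as follows. The rule set, the choice of which fields are checked in real time, and the in-memory queue are all assumptions for illustration; an actual system would persist the queue and raise formal data queries.

```python
from collections import deque

# Hypothetical edit check rules: field name -> predicate returning True when valid.
RULES = {
    "subject_id": lambda v: isinstance(v, str) and v != "",
    "visit_date": lambda v: v is not None,
    "weight_kg": lambda v: v is not None and 20 <= v <= 300,
    "heart_rate": lambda v: v is not None and 30 <= v <= 220,
}

# Assumption: only these critical fields are validated at submit time.
REALTIME_FIELDS = {"subject_id", "visit_date"}

batch_queue = deque()  # pages accepted in real time, awaiting full validation


def validate(page, fields):
    """Apply the rules for the given fields and return query messages."""
    return [
        f"{name}: failed edit check"
        for name in sorted(fields)
        if name in RULES and not RULES[name](page.get(name))
    ]


def submit_page(page):
    """Real-time step: validate the critical subset, then queue the page."""
    errors = validate(page, REALTIME_FIELDS)
    if errors:
        return {"accepted": False, "queries": errors}
    batch_queue.append(page)
    return {"accepted": True, "queries": []}


def run_batch():
    """Batch step: validate every field of each queued page."""
    results = []
    while batch_queue:
        page = batch_queue.popleft()
        results.append((page.get("subject_id"), validate(page, RULES.keys())))
    return results
```

The trade-off in this design is latency against immediacy: the user gets a fast response on the fields that matter most at entry time, while queries on the remaining fields arrive only after the batch run.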
In this modern era of technology, where many manual tasks can be automated, the entry of data from case report forms into a clinical trial database can also be automated. However, scanning paper case report forms with optical character recognition (OCR) technology for automated data entry introduces errors. Direct entry by users into the eCRF avoids that step, but it has raised new challenges in designing user-friendly eCRFs. This has resulted in the development of several types of eCRFs and associated software to meet the requirements of study designs of varying complexity. The most common types of eCRFs are double data entry, discrete (or interactive), and mixed-mode eCRFs. Of these, the discrete eCRF provides real-time data validation with error and warning messages, reducing the data query resolution time.
Challenges and Limitations in Clinical Data Management Software
The use of CDM software is also accompanied by certain limitations. For instance, electronic case report forms (eCRFs) can disrupt the natural flow of conversation with a patient during a study visit, because the form needs to be completed in a specific order to ensure the data are mapped correctly in the back-end system. Consequently, eCRFs can require more time than paper CRFs, which may lead to dissatisfaction among users. Additionally, the high cost of CDM software can be a major limiting factor for small companies or institutions with a low research budget. Although open-source software can help reduce costs, there is often a lack of support when issues arise.
Clinical Data Management (CDM) software has made the complex process of data management in clinical trials relatively simple and efficient. However, several challenges and limitations are associated with CDM software. One of the main challenges is the integration and implementation of new CDM software with existing systems. This can be time-consuming and requires a significant amount of resources. Additionally, the quality of the software is an important but often underestimated factor. Poorly developed software can lead to errors in data that are difficult to identify and resolve.
Data Security and Privacy Concerns
Data should also be stored in a secure, encrypted fashion, with backups kept at multiple secure locations. Those involved in private clinical research are also becoming more aware of additional country-specific and regional laws that protect patient data (e.g., the European Union Data Protection Directive of 1995, since superseded by the General Data Protection Regulation) as well as guidelines set by professional bodies, such as the Good Clinical Data Management Practices (GCDMP) guidelines of the Society for Clinical Data Management. It is becoming clear that web-based CDMS software will not be broadly used unless these issues are definitively addressed with well-tested, secure, and private solutions.
The primary issue that has hindered the rapid advancement of web-based Clinical Data Management Systems (CDMS) is the secure electronic storage of patient data and the safeguarding of patient privacy. In the U.S., the Health Insurance Portability and Accountability Act (HIPAA) has set into law a number of privacy requirements that must be followed by anyone involved in the handling of patient data. This has caused a great deal of concern within the clinical research community, and software developers have had to navigate considerable confusion over how the HIPAA requirements apply to electronic data exchange and storage. CDMS need to be developed with a clear understanding of these laws and should allow for secure, encrypted transmission of data across the Internet.
Future Trends and Innovations in Clinical Data Management Software
Clinical Data Management Systems (CDMS) help clinical research operations run smoothly by managing the large amounts of data generated by studies. Data for clinical research can come from a variety of sources, including case report forms, medical images, lab results, and other diagnostic tools, and are usually managed by specific software programs. While the functions of CDMS have stayed relatively stable, recent innovations in CDMS software have improved key capabilities such as international data collection, time-sensitive data monitoring, and workflow integration. The future of CDMS is likely to include more advanced interactions with electronic medical records, mobile data collection applications, and cloud-based storage systems.
Several recent advancements in CDMS have increased its usefulness, especially in large, international, or long-term studies. As electronic medical records become more advanced, CDMS software will likely be able to interface directly with patient records, reducing the need for duplicate data entry. Mobile applications also have the potential to reduce the work of data management by allowing study participants to input data themselves. Cloud-based systems will make it easier for geographically dispersed teams to work together and often have lower up-front costs. Using third-party or specialty systems beyond the classic CDMS will allow comprehensive data management across related study data such as images or lab results.
Integration of Artificial Intelligence and Machine Learning
The paradigm shift from a centralized database or repository model towards a blockchain-enabled model for decentralized data management and data sharing is being explored by very few CDMS at present and requires significant research and development effort. The shift towards cloud computing by a majority of existing CDMS vendors has given larger user groups the ability to harness devices ranging from small handhelds to powerful HPC clusters. Given the rapid advancements in AI hardware acceleration, any CDMS that allows AI models to be deployed only on specific hardware targets will limit the capabilities of its user community.
In the recent past, the increasing volume of data from sources such as EHRs, mHealth, wearable devices, and public and private health-related repositories has driven the development of Artificial Intelligence and Machine Learning (AI/ML) techniques and other methods of computational intelligence (CI). The future of CDMS lies in the integration of these ongoing advancements in AI and ML. However, very few development efforts currently address this integration. The near future is expected to bring more CDMS platforms that not only incorporate CI features but also enable less experienced programmers or users to develop, deploy, and manage AI models through an easy-to-use UI.