
Automated, replicable CDISC conversion



Tamr for CDISC Whitepaper

Abstract
About CDISC
The Complexity of CDISC and SAS
How Tamr solves CDISC problems
+ Data Loading
+ Automated Transformation & Matching
+ Expert Questions & Active Learning
+ Validation & Data Export
Benefits
Detailed workflow
+ Catalog study data
+ Domain Assignment
+ Variable Mapping
+ Transformations
+ Validation
+ Data Export
Core features
+ Active Learning
+ Expert Sourcing
+ Logging and Auditing
+ Programmatic APIs
References

Abstract

In the United States, the FDA requires that every clinical trial or application to market a new drug or biologic be accompanied by clinical study data showing the safety and efficacy of the proposed product. Converting clinical data into the FDA-required CDISC standards is a difficult, time-consuming and error-prone process; if it is not done correctly, the FDA might refuse to file the application, or send it back to the sponsoring organization for correction.

These delays in regulatory approval are very costly for pharmaceutical companies. Every day of delay means that the drug is not on the market, and therefore not generating any revenue during the limited time that the patent [...]

[Tamr for CDISC] significantly reduces the time and effort required for CDISC conversion by simultaneously using two powerful methods: advanced machine learning and expert crowdsourcing. This paper describes in detail how Tamr for CDISC works, and its implications for pharmaceutical [...]

About CDISC

In the United States, the FDA requires that every clinical trial or application to market a new drug or biologic be accompanied by clinical study data showing the safety and efficacy of the proposed product. When a company submits clinical study data to the government, the FDA is required by law to review and respond to the application within a fixed time limit (usually 180 days).

However, studies can be quite complex, involving large numbers of participants, many different clinical sites, multiple trial phases or arms, different interventions and treatments, participants who leave the study early or who suffer adverse events related to a drug product, complex study designs, and complicated planned analyses of study data. To expedite the review process, the FDA requires that submitted study data be organized and submitted in specific schemas and file formats. For pharmaceutical companies, there is a never-ending need to convert trial data into formats accepted by regulatory [agencies].

CDISC is the organization which, through collaboration between pharmaceutical industry members and reviewers from the United States and other government regulatory agencies, has built a complete set of standards, ontologies, and vocabularies to organize and encode these data. [Using] the CDISC standards makes reviewers' tasks easier and more consistent.

For example, the type of an adverse event will always be encoded in the AETERM variable, no matter who provided the data. Reviewers are able to build their own data systems, standard procedures, and tools that consistently organize, read, and manage the submitted [data].

[Because] these vocabularies and standards are so useful, the FDA now requires that data submitted to support an application to investigate or market a new drug product, biologic, or medical device be formatted according to the CDISC terminologies. If the data is not provided in the right format, or is inconsistent, the FDA might refuse to file the application -- or provide feedback to the sponsoring organization that essentially puts the review process on hold until the deficiencies in data formatting are [corrected].

These delays in regulatory approval are very costly for pharmaceutical companies.

Every day of delay means that the drug is not on the market, and therefore not generating any revenue during the limited time that the patent [...]

The Complexity of CDISC and SAS

Because clinical studies are so varied, and their results so important, CDISC has developed a large and complicated set of standards. Even when we restrict ourselves to the two standards that are used to represent clinical data and analytic results derived from that data (the Study Data Tabulation Model, or SDTM, and the Analysis Data Model, or ADaM), sponsors must still organize data according to X different variables divided into Y different domains, using Z distinct codelists for standardizing clinical values themselves. There are several thousand pages of core documentation describing the use and extension of these schemas and vocabularies, as well as numerous presentations, guidance documents, wikis, and other sources of [...]

[The] standards are themselves moving targets, which makes CDISC conversions more complicated.

New versions of these standards are released once or twice a year, even as the FDA maintains its own schedule for adopting and supporting these updated standards. Some experimental or clinical data types (or analysis methods) are not yet covered by the CDISC family of standards; these are often formatted into CDISC-like structures and submitted alongside the official standardized datasets. These custom data types are often adapted into official releases of the CDISC standards in later [...]

While CDISC provides a structure and organization for a clinical or analytic dataset, an electronic data submission to the FDA needs to be submitted in a particular format as well. Currently, the format which the FDA accepts for clinical study data is the XPORT (Transport) file format, an archaic format used as an interchange format between different versions of the SAS analytic software over several decades.
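As a small illustration of what the XPORT requirement means in practice, the sketch below checks whether a file begins with the library header record that opens a SAS version 5 transport file. It is a minimal sketch, assuming the version 5 layout (80-byte records, the first of which starts with the marker shown); the function names are hypothetical and are not part of any Tamr or SAS API.

```python
# Leading bytes of the first 80-byte record of a SAS XPORT (version 5) file.
# Assumption: the file follows the version 5 transport layout.
XPORT_MAGIC = b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"


def looks_like_xport(first_bytes: bytes) -> bool:
    """Return True if the given leading bytes carry the XPORT library header marker."""
    return first_bytes.startswith(XPORT_MAGIC)


def is_xport_file(path: str) -> bool:
    """Read the first 80-byte record of a file and test it for the XPORT marker."""
    with open(path, "rb") as handle:
        return looks_like_xport(handle.read(80))
```

A check like this only identifies the container format; it says nothing about whether the datasets inside conform to SDTM or ADaM. For reading the contents, one option is `pandas.read_sas(path, format="xport")`.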

SAS formats aren't just required for FDA submission; they are also the standard for data collection and transmission to the study sponsor from the CROs that have been contracted to perform data collection.

Current processes used for CDISC conversion

What this means is that very often a scientist or an informatician within a study's sponsoring organization is faced with a collection of binary files in a proprietary SAS file format, each containing raw clinical data describing the events and interventions that were observed or carried out for each participant in the study. These files need to be normalized to the SDTM and ADaM standards, converted from one binary SAS file format into another, and assembled into a larger electronic data submission to the FDA -- and all of this work needs to be carried out as soon as possible. The answer to this problem, for most companies and study sponsors today, is to hire SAS programmers with CDISC experience.

These programmers can take weeks or months to write the scripts, curation, and validation code in SAS to convert and check the clinical data in CDISC formats. Cost estimates vary from $35K to $1M+, depending on the size of the study and the complexity and variety of the data or the trial itself. This manual conversion process doesn't scale, and never gets easier. Since all the data curation and integration work occurs exclusively within SAS and is often performed by SAS-specific, CDISC-knowledgeable contractors, it is difficult for sponsors to build the kind of institutional knowledge about CDISC and CDISC conversion that would make future conversions easier. A large pharmaceutical company might run as many as 100 studies in a year, but the 100th study is just as difficult to convert, validate, and submit as the first.

How Tamr solves CDISC problems

Data Loading

To start, Tamr for CDISC is able to quickly parse and load clinical data stored in a variety of formats, including proprietary SAS (.sas7bdat) formats, narrow and wide tabular formats, database connections, and web services. Tamr for CDISC comes pre-loaded with a single, consistent version of SDTM and its associated codelists, so all that it needs to begin is the clinical data itself.

Automated Transformation & Matching

Once the data is loaded, Tamr's machine learning algorithms and automated rules attempt to build matches from the fields in each dataset to the variables in the CDISC standards. Tamr associates a confidence score with each matching decision. Where the confidence score is high, the match can be performed automatically. For lower-confidence matches (or if the user chooses to review all matches), the user is able to review potential matches, perform standardized and custom data transformations to improve the match, and select the appropriate variable from CDISC for mapping.
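The confidence-based triage described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Tamr's algorithm: plain string similarity from the standard library stands in for the machine learning models, the variable list is a small hand-picked subset of the SDTM Adverse Events domain with paraphrased labels, and the 0.8 threshold and all function names are assumptions.

```python
from difflib import SequenceMatcher

# Illustrative subset of SDTM variables with paraphrased labels (assumption).
SDTM_VARIABLES = {
    "USUBJID": "unique subject identifier",
    "AETERM": "reported term for the adverse event",
    "AESTDTC": "start date/time of adverse event",
    "AESEV": "severity/intensity",
}

AUTO_ACCEPT = 0.8  # assumed cut-off above which a match is applied automatically


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(field_name: str) -> tuple[str, float]:
    """Return the best-scoring SDTM variable for a source field and its score."""
    scored = [
        (max(similarity(field_name, var), similarity(field_name, label)), var)
        for var, label in SDTM_VARIABLES.items()
    ]
    score, var = max(scored)
    return var, score


def triage(field_names: list[str]) -> tuple[dict, dict]:
    """Split source fields into auto-accepted matches and matches needing review."""
    auto, review = {}, {}
    for name in field_names:
        var, score = best_match(name)
        target = auto if score >= AUTO_ACCEPT else review
        target[name] = (var, round(score, 2))
    return auto, review


auto, review = triage(["aeterm", "subj"])
```

Here `aeterm` matches `AETERM` exactly and is accepted automatically, while `subj` is only a partial match for `USUBJID` and is queued for a person to confirm or correct.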

Expert Questions & Active Learning

In cases where the machine learning is unable to suggest a match or transformation, and where the user is unable to determine the correct operation, Tamr provides the ability to reach out to an expert and ask a question. This expert may be a member of the sponsor's organization, or may be a colleague, collaborator, or contractor from a different company who can be reached via email. Tamr solicits advice and help from experts, provides an interface for those experts to provide answers, and then incorporates those answers (via active learning) into the automatic matching and machine learning models for future matching and [...]

Validation & Data Export

Tamr for CDISC also includes the capability to validate the matched and transformed data, according to rules that are either pre-loaded into Tamr or provided by the user directly.
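One way to picture rule-based validation with both built-in and user-supplied rules is the sketch below. It is a hypothetical illustration, not Tamr's rule engine: the `Rule` class, the rule names, and the `validate` function are invented for this example, and the severity values are an assumed codelist used only for demonstration.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed severity codelist, used for illustration only.
AESEV_CODELIST = {"MILD", "MODERATE", "SEVERE"}


@dataclass
class Rule:
    """A named check that returns True when a record passes."""
    name: str
    check: Callable[[dict], bool]


# Rules that would ship with the tool in this sketch ("pre-loaded").
PRELOADED_RULES = [
    Rule("USUBJID is required", lambda rec: bool(rec.get("USUBJID"))),
    Rule("AETERM is required", lambda rec: bool(rec.get("AETERM"))),
]

# A rule supplied by the user for a particular study.
user_rules = [
    Rule("AESEV in codelist", lambda rec: rec.get("AESEV") in AESEV_CODELIST),
]


def validate(records: list[dict], rules: list[Rule]) -> list[tuple[int, str]]:
    """Return (record index, rule name) for every failed check."""
    return [
        (index, rule.name)
        for index, record in enumerate(records)
        for rule in rules
        if not rule.check(record)
    ]


records = [
    {"USUBJID": "S1-001", "AETERM": "HEADACHE", "AESEV": "MILD"},
    {"USUBJID": "", "AETERM": "NAUSEA", "AESEV": "BAD"},
]
failures = validate(records, PRELOADED_RULES + user_rules)
```

Keeping rules as data, rather than hard-coding them into the conversion scripts, means the same rule set can be reapplied to the next study without rewriting it.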

