My first experience with medical data came with the Department of Emergency Medicine at the University of North Carolina at Chapel Hill. The medical record information system, known as PreMIS, collected emergency patient care data throughout the state of North Carolina.
Clinical data is data pertaining to actual observation and treatment of patients.
CDISC SDTM : Model
CDISC Define-XML : Metadata
General classes of observations on subjects (I, E, F) : Interventions, Events, Findings
General structures : Identifier variables, Topic variables, Timing variables, Qualifier variables
Model : wraps the observations
Domain : series of observations
- identifier variables (identify the study, the subject, the domain, and the sequence number)
- topic variables (name of the test)
- timing variables (start and end date)
- qualifier variables (numeric units)
- rule variables (algorithm)
SDTM (Study Data Tabulation Model) defines a standard structure for human clinical trial (study) data tabulations that are to be submitted as part of a product application to a regulatory authority such as the United States Food and Drug Administration (FDA). The Submission Data Standards team of Clinical Data Interchange Standards Consortium (CDISC) defines SDTM.
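To make the variable roles concrete, here is a minimal Perl sketch of a single observation laid out by role. It assumes a Vital Signs (VS) style record; the study ID, subject ID, and measurement values are invented for illustration, and the `%roles` grouping is only a reading aid, not part of any CDISC tooling.

```perl
use strict;
use warnings;

# One hypothetical Vital Signs (VS) observation. All values are invented.
my %record = (
    STUDYID  => 'STUDY01',        # identifier: the study
    DOMAIN   => 'VS',             # identifier: the domain
    USUBJID  => 'STUDY01-001',    # identifier: the subject
    VSSEQ    => 1,                # identifier: sequence number
    VSTESTCD => 'SYSBP',          # topic: name of the test
    VSORRES  => '120',            # qualifier: result as collected
    VSORRESU => 'mmHg',           # qualifier: units of the result
    VSDTC    => '2010-03-15',     # timing: date of the observation
);

# Illustrative grouping of the variables by role (an assumption made
# for this sketch, not something defined by a library).
my %roles = (
    identifier => [qw(STUDYID DOMAIN USUBJID VSSEQ)],
    topic      => [qw(VSTESTCD)],
    qualifier  => [qw(VSORRES VSORRESU)],
    timing     => [qw(VSDTC)],
);

for my $role (qw(identifier topic qualifier timing)) {
    my @pairs = map { "$_=$record{$_}" } @{ $roles{$role} };
    print "$role: @pairs\n";
}
```

A domain is then simply a series of such records that share the same DOMAIN value.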
- match any one of a set of characters by putting the options in square brackets, e.g. [abc] matches a single a, b, or c
- . (dot) matches any single character except a newline
- + tells Perl to match one or more of the preceding character, so .+ matches one or more of any character
- match zero or more characters with .*
- ? matches zero or one of the preceding character
- a \ (backslash) indicates that the following character is to be matched literally and not treated as a special control character, e.g. \. matches an actual dot
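The basic building blocks above can be tried out with a small table of patterns and inputs. This is a minimal sketch; the sample strings are made up for illustration, and the third field of each case records whether a match is expected.

```perl
use strict;
use warnings;

# Each case: [pattern, input string, 1 if a match is expected, else 0]
my @cases = (
    [ qr/gr[ae]y/, 'grey',   1 ],   # [ae] matches exactly one of a or e
    [ qr/gr[ae]y/, 'groy',   0 ],
    [ qr/c.t/,     'cat',    1 ],   # . matches any single character
    [ qr/c.t/,     "c\nt",   0 ],   # ... except a newline
    [ qr/a.+z/,    'az',     0 ],   # .+ needs at least one character
    [ qr/a.*z/,    'az',     1 ],   # .* also accepts zero characters
    [ qr/colou?r/, 'color',  1 ],   # ? makes the preceding u optional
    [ qr/colou?r/, 'colour', 1 ],
    [ qr/3\.14/,   '3.14',   1 ],   # \. matches a literal dot
    [ qr/3\.14/,   '3x14',   0 ],
);

my @results;
for my $case (@cases) {
    my ($re, $text, $expected) = @$case;
    my $matched = $text =~ $re ? 1 : 0;
    push @results, $matched;
    (my $shown = $text) =~ s/\n/\\n/g;   # make the newline visible
    printf "%-20s %-8s %s\n", $re, $shown, $matched ? 'match' : 'no match';
}
```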
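- i – case insensitivity
- s – lets . match a newline as well, so /foo.bar/s can match foo at the end of one line and bar at the start of the next
- m – allows ^ and $ to match after and before newlines inside the string, i.e. at the start and end of each line
- g – match globally; in scalar context Perl keeps track of where in the string the last match left off, and \G anchors the next match at that position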
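The effect of each modifier is easiest to see by running the same pattern with and without it. A minimal sketch, using made-up sample strings:

```perl
use strict;
use warnings;

my $text = "Foo\nbar";

# /i: ignore case
my $i_plain = $text =~ /foo/  ? 1 : 0;      # 0, case differs
my $i_flag  = $text =~ /foo/i ? 1 : 0;      # 1

# /s: let . match a newline
my $s_plain = $text =~ /Foo.bar/  ? 1 : 0;  # 0, . stops at the newline
my $s_flag  = $text =~ /Foo.bar/s ? 1 : 0;  # 1

# /m: let ^ and $ match at line boundaries inside the string
my $m_plain = $text =~ /^bar$/  ? 1 : 0;    # 0, ^ only matches at the start
my $m_flag  = $text =~ /^bar$/m ? 1 : 0;    # 1

# /g in list context returns every match at once
my @numbers = "a1b22c333" =~ /\d+/g;        # ('1', '22', '333')

# /g in scalar context resumes where the previous match left off;
# pos() reports the offset just past the last match
my $str = "aXbXc";
my @positions;
while ($str =~ /X/g) {
    push @positions, pos($str);             # 2, then 4
}

print "i: $i_plain/$i_flag  s: $s_plain/$s_flag  m: $m_plain/$m_flag\n";
print "numbers: @numbers  positions: @positions\n";
```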
Extract information from part of a match with parentheses, e.g. /alpha(.+)gamma/ applied to:
- “alpha beta gamma delta”
What do the parentheses achieve? Everything they match is put into the Perl variable $1; here $1 holds “ beta ”, including the surrounding spaces. (If you have a second set of parentheses, its contents go into $2, and so on.)
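The capture example above, written out as a short runnable sketch. The second pattern is an added illustration of how $1 and $2 are filled by two sets of parentheses.

```perl
use strict;
use warnings;

my $text = "alpha beta gamma delta";
my ($middle, $first, $second);

# One set of parentheses: the captured text lands in $1
if ($text =~ /alpha(.+)gamma/) {
    $middle = $1;                   # ' beta ', spaces included
}

# Two sets of parentheses: $1 and $2, numbered left to right
if ($text =~ /(\w+) (\w+)/) {
    ($first, $second) = ($1, $2);   # 'alpha' and 'beta'
}

print "middle=[$middle] first=[$first] second=[$second]\n";
```

Copying $1 and $2 into named variables straight after the match is a cautious habit, since a later successful match overwrites them.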
\n newline (line feed)
\w a word character [a-zA-Z0-9_]
\W NOT a word character, that is [^a-zA-Z0-9_]
\s white space (new line, carriage return, space, tab, form feed)
\S NOT white space
\d a digit [0-9]
\D NOT a digit, i.e. [^0-9]
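A short sketch of the shorthand classes at work on one line of text. The sample line is invented for illustration.

```perl
use strict;
use warnings;

my $line = "Temp 38.5 C\tpulse 72";

my @words  = $line =~ /\w+/g;      # runs of word characters
my @digits = $line =~ /\d+/g;      # runs of digits
my @chunks = $line =~ /\S+/g;      # runs of non-whitespace
my @fields = split /\s+/, $line;   # split on spaces and tabs alike

# Count the characters that are NOT word characters:
# three spaces, one dot, one tab
my $nonword = () = $line =~ /\W/g;

print "words:  @words\n";
print "digits: @digits\n";
print "fields: @fields\n";
print "non-word characters: $nonword\n";
```

Note that \w+ splits 38.5 into 38 and 5 because the dot is not a word character, while \S+ keeps it whole.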
- \b Match a word boundary
- \B Match a non-(word boundary)
- \A Match only at beginning of string
- \Z Match only at end of string, or before newline at the end
- \z Match only at end of string
- \G Match only where previous m//g left off (works only with /g)
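The anchors are zero-width: they match a position, not a character. The sketch below contrasts them on made-up strings, and ends with a tiny tokenizer that uses \G with the /gc modifiers (the /c flag, which keeps the position after a failed match, is an addition not covered in the notes above).

```perl
use strict;
use warnings;

# \b and \B: word boundary and its opposite
my $b_word   = "cat scattered" =~ /\bcat\b/ ? 1 : 0;  # 1, cat as a whole word
my $b_inside = "scattered"     =~ /\bcat\b/ ? 1 : 0;  # 0, cat is inside a word
my $B_inside = "scattered"     =~ /\Bcat\B/ ? 1 : 0;  # 1

# \A, \Z, \z on a string that ends in a newline
my $text  = "one\ntwo\n";
my $caret = $text =~ /^two/m    ? 1 : 0;  # 1, ^ with /m matches at a line start
my $A     = $text =~ /\Atwo/m   ? 1 : 0;  # 0, \A is the string start, even with /m
my $Z     = $text =~ /two\Z/    ? 1 : 0;  # 1, \Z allows a final newline
my $z     = $text =~ /two\z/    ? 1 : 0;  # 0, \z is the very end
my $z_nl  = $text =~ /two\n\z/  ? 1 : 0;  # 1

# \G: continue exactly where the previous /g match stopped
my $input = "12ab34";
my @tokens;
pos($input) = 0;                          # start scanning at offset 0
while (pos($input) < length $input) {
    if    ($input =~ /\G(\d+)/gc)    { push @tokens, "NUM:$1" }
    elsif ($input =~ /\G([a-z]+)/gc) { push @tokens, "WORD:$1" }
    else                             { last }   # unknown character, stop
}

print "tokens: @tokens\n";
```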