learning clinical

Clinical data is data pertaining to actual observation and treatment of patients.

Source: http://www.cancer.gov/cancertopics/cancerlibrary/terminologyresources/cdisc

SDTM Basics :


CDISC Define-XML : metadata describing the submitted datasets

General classes of observations on subjects : Interventions, Events, Findings (I, E, F)

General structure of an observation : identifiers, topic variables, timing variables, qualifiers

Model : the framework that wraps the observations

Domain : a collection of related observations that share a common topic


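  • identifier variables (identify the study, the subject, the domain, and the sequence number of the record)
  • topic variables (the focus of the observation, such as the name of the test)
  • timing variables (such as start and end date)
  • qualifier variables (extra text or numeric values that describe the result, such as units)
  • rule variables (an algorithm)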
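
To make the variable roles concrete, here is a minimal Perl sketch of a single observation stored as a hash. The variable names follow the usual SDTM naming for a Vital Signs (VS) record as far as I know; the study ID, subject ID, and measurement are invented, and the helpers example_observation and role_of are hypothetical names used only for this illustration.

```perl
use strict;
use warnings;

# One made-up Findings observation from a Vital Signs (VS) domain.
# All values are invented for illustration.
sub example_observation {
    my %obs = (
        STUDYID  => 'STUDY01',      # identifier: the study
        DOMAIN   => 'VS',           # identifier: the domain
        USUBJID  => 'STUDY01-001',  # identifier: the subject
        VSSEQ    => 1,              # identifier: sequence number
        VSTESTCD => 'SYSBP',        # topic: name of the test
        VSDTC    => '2020-01-15',   # timing: date of the measurement
        VSORRES  => 120,            # qualifier: the result
        VSORRESU => 'mmHg',         # qualifier: units of the result
    );
    return \%obs;
}

# Assumed mapping from variable name to role (hypothetical helper).
sub role_of {
    my ($name) = @_;
    return 'identifier' if $name =~ /\A(?:STUDYID|DOMAIN|USUBJID|VSSEQ)\z/;
    return 'topic'      if $name eq 'VSTESTCD';
    return 'timing'     if $name eq 'VSDTC';
    return 'qualifier';
}

# Print each variable with its role and value.
my $obs = example_observation();
for my $name (sort keys %$obs) {
    printf "%-9s %-11s %s\n", $name, role_of($name), $obs->{$name};
}
```

Rule variables are left out of the sketch because this record does not need one.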

Clinical Trials

SDTM (Study Data Tabulation Model) defines a standard structure for human clinical trial (study) data tabulations that are submitted as part of a product application to a regulatory authority such as the United States Food and Drug Administration (FDA). SDTM is defined by the Submission Data Standards team of the Clinical Data Interchange Standards Consortium (CDISC).





Posted in Uncategorized | Leave a comment

matching in perl

  • [ ] (square brackets) match any one character from a set: put the options inside the brackets and Perl selects exactly one of them
  • . (dot) matches any single character except a ‘newline’
  • + matches one or more of the preceding character, so .+ matches one or more of any character (useful for matching several characters in the middle of a string)
  • * matches zero or more of the preceding character, so .* matches zero or more of any character
  • ? matches zero or one of the preceding character
  • \ (backslash) says that the next character is to be matched literally, and is not some fancy control character
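
A small sketch of the pieces above, one pattern per rule. The sub names and the test strings are made up for the example; each sub returns 1 when its pattern matches and 0 otherwise.

```perl
use strict;
use warnings;

# Each sub returns 1 on a match and 0 otherwise.
sub matches_class    { return $_[0] =~ /gr[ae]y/ ? 1 : 0 }  # [ae]: one character from the set
sub matches_dot      { return $_[0] =~ /c.t/     ? 1 : 0 }  # . : any one character except newline
sub matches_plus     { return $_[0] =~ /a.+z/    ? 1 : 0 }  # .+ : one or more characters between a and z
sub matches_star     { return $_[0] =~ /a.*z/    ? 1 : 0 }  # .* : zero or more characters between a and z
sub matches_optional { return $_[0] =~ /colou?r/ ? 1 : 0 }  # u? : zero or one "u"
sub matches_literal  { return $_[0] =~ /3\.14/   ? 1 : 0 }  # \. : a literal dot, not "any character"

print matches_class("grey"),     "\n";  # matches: "e" is in the set
print matches_plus("az"),        "\n";  # no match: .+ needs at least one character
print matches_star("az"),        "\n";  # matches: .* accepts zero characters
print matches_literal("3x14"),   "\n";  # no match: the dot is escaped
```

The difference between the second and third calls is the whole difference between + and *.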

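modifiers go after the closing slash: /test/i

  • i – case insensitivity
  • s – treats the string as a single line, so that . (even a bare /./) also matches a “newline” character; this lets a pattern match foo on one line and bar on the next
  • m – treats the string as multiple lines, so that ^ matches just after any newline and $ matches just before any newline, not only at the ends of the string
  • g – global matching: Perl keeps track of where in the string the last match left off and continues from there; the \G assertion refers to that position (the end of the previous match)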
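
The modifiers are easiest to see side by side on one two-line string. This is a rough sketch; the sample string and the sub names modifier_flags and match_end_offsets are invented for the example.

```perl
use strict;
use warnings;

# Returns a hash of 1/0 flags: each pattern with and without its modifier.
sub modifier_flags {
    my ($text) = @_;
    my %flags = (
        case_plain   => ($text =~ /foo/      ? 1 : 0),  # case sensitive
        case_with_i  => ($text =~ /foo/i     ? 1 : 0),  # /i ignores case
        dot_plain    => ($text =~ /Foo.bar/  ? 1 : 0),  # . stops at the newline
        dot_with_s   => ($text =~ /Foo.bar/s ? 1 : 0),  # /s lets . match the newline
        caret_plain  => ($text =~ /^bar/     ? 1 : 0),  # ^ only at start of string
        caret_with_m => ($text =~ /^bar/m    ? 1 : 0),  # /m lets ^ match after a newline
    );
    return %flags;
}

# /g in scalar context resumes after the previous match;
# pos() reports the offset where that match ended.
sub match_end_offsets {
    my ($text) = @_;
    my @offsets;
    while ($text =~ /ba./g) {
        push @offsets, pos($text);
    }
    return @offsets;
}

my %flags = modifier_flags("Foo\nbar baz");
print "$_ = $flags{$_}\n" for sort keys %flags;
print join(",", match_end_offsets("Foo\nbar baz")), "\n";  # prints 7,11
```

In list context /g behaves differently: it returns all matches at once instead of stepping through them.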

extract information from part of a match with parentheses: /alpha(.+)gamma/

  • “xxalphazzzgamma” puts “zzz” into $1
  • “alpha beta gamma delta” puts “ beta ” into $1 (including the spaces on both sides)

What do the (parentheses) achieve? Everything matched inside the first set of parentheses is put into the Perl variable $1. If there is a second set of parentheses, its contents go into $2, and so on.
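
The capture pattern above can be tried directly. A minimal sketch: between_alpha_gamma and split_key_value are hypothetical helper names, and the key=value pattern is an extra example of two capture groups that is not from the notes above.

```perl
use strict;
use warnings;

# Returns the text captured between "alpha" and "gamma", or undef if no match.
sub between_alpha_gamma {
    my ($string) = @_;
    return $string =~ /alpha(.+)gamma/ ? $1 : undef;
}

# Two sets of parentheses fill $1 and $2; returns an empty list if no match.
sub split_key_value {
    my ($string) = @_;
    return $string =~ /(\w+)=(\w+)/ ? ($1, $2) : ();
}

for my $string ("xxalphazzzgamma", "alpha beta gamma delta") {
    my $captured = between_alpha_gamma($string);
    print "[$captured]\n";   # prints [zzz] and then [ beta ]
}

my ($key, $value) = split_key_value("lang=perl");
print "$key -> $value\n";    # prints lang -> perl
```

Always check that the match succeeded before reading $1; after a failed match it still holds the value from the last successful one.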

\n newline (line feed)
\w a word character [a-zA-Z0-9_]
\W NOT a word character, that is [^a-zA-Z0-9_]
\s white space (new line, carriage return, space, tab, form feed)
\S NOT white space
\d a digit [0-9]
\D NOT a digit, i.e. [^0-9]
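
A short sketch of these character classes in use. The helper names and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# \d+ : every run of digits in the text (list context with /g).
sub extract_numbers {
    my ($text) = @_;
    return $text =~ /(\d+)/g;
}

# \w+ : every run of word characters (letters, digits, underscore).
sub extract_words {
    my ($text) = @_;
    return $text =~ /(\w+)/g;
}

# \s+ : squeeze any run of white space down to a single space.
sub collapse_whitespace {
    my ($text) = @_;
    $text =~ s/\s+/ /g;
    return $text;
}

# \D : delete everything that is not a digit.
sub strip_non_digits {
    my ($text) = @_;
    $text =~ s/\D//g;
    return $text;
}

print join("|", extract_numbers("order 66, aisle 7")), "\n";  # prints 66|7
print join("|", extract_words("foo_bar baz-9")), "\n";        # prints foo_bar|baz|9
print collapse_whitespace("a \t\n b"), "\n";                   # prints a b
print strip_non_digits("(555) 010-9999"), "\n";                # prints 5550109999
```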

  • \b Match at a word boundary
  • \B Match anywhere that is not a word boundary
  • \A Match only at beginning of string
  • \Z Match only at end of string, or before a newline at the end
  • \z Match only at the very end of string
  • \G Match only where the previous m//g left off (works only with /g)
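
The assertions above match positions rather than characters, which is easiest to see with a few contrasting strings. This is a sketch with invented sub names and inputs.

```perl
use strict;
use warnings;

sub whole_word_cat  { return $_[0] =~ /\bcat\b/  ? 1 : 0 }  # "cat" standing alone as a word
sub inside_word_cat { return $_[0] =~ /\Bcat\B/  ? 1 : 0 }  # "cat" buried inside a longer word
sub starts_with_A   { return $_[0] =~ /\Astart/m ? 1 : 0 }  # \A ignores /m: start of string only
sub ends_with_Z     { return $_[0] =~ /end\Z/    ? 1 : 0 }  # allows one trailing newline
sub ends_with_z     { return $_[0] =~ /end\z/    ? 1 : 0 }  # allows nothing after "end"

# \G with /g: take digits one at a time, but only while each match
# starts exactly where the previous one ended.
sub leading_digits {
    my ($text) = @_;
    my @digits;
    while ($text =~ /\G(\d)/g) {
        push @digits, $1;
    }
    return @digits;
}

print whole_word_cat("the cat sat"), inside_word_cat("concatenate"), "\n";  # prints 11
print ends_with_Z("the end\n"), ends_with_z("the end\n"), "\n";             # prints 10
print join("", leading_digits("123abc456")), "\n";                          # prints 123
```

Without \G the last loop would skip over "abc" and also collect 4, 5, and 6.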