EMS data

My first experience with medical data came at the Department of Emergency Medicine at the University of North Carolina at Chapel Hill. The medical record information system, known as PreMIS, collected emergency patient care data from throughout the state of North Carolina.

Posted in Uncategorized | Leave a comment

learning clinical

Clinical data is data pertaining to actual observation and treatment of patients.

CDISC terminology resources at the National Cancer Institute: http://www.cancer.gov/cancertopics/cancerlibrary/terminologyresources/cdisc

SDTM Basics:

CDISC Define-XML: metadata

General classes of observations on subjects (I, E, F): Interventions, Events, Findings

General structures: identifiers, topic variables, timing variables, qualifiers, rules

Model: wraps the observations

Domain: a collection of related observations sharing a common topic


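  • identifier variables (identify the study, the subject, the domain, and the sequence number of the record)
  • topic variables (the focus of the observation, such as the name of the test)
  • timing variables (such as start and end dates)
  • qualifier variables (extra detail about the observation, such as the units of a numeric result)
  • rule variables (an algorithm, for example a start or end condition in the trial design)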
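
To make the variable roles concrete, here is a minimal sketch of one vital-signs observation held in a Perl hash. The variable names (STUDYID, USUBJID, VSTESTCD and so on) follow the usual SDTM naming pattern as far as I know, but the study, the subject, and every value are invented for illustration, and rule variables are left out to keep it short.

```perl
use strict;
use warnings;

# One hypothetical observation; all values are made up.
my %observation = (
    STUDYID  => 'STUDY01',       # identifier: the study
    DOMAIN   => 'VS',            # identifier: the domain (vital signs)
    USUBJID  => 'STUDY01-001',   # identifier: the subject
    VSSEQ    => 1,               # identifier: sequence number
    VSTESTCD => 'SYSBP',         # topic: short name of the test
    VSORRES  => '120',           # qualifier: result as collected
    VSORRESU => 'mmHg',          # qualifier: units of the result
    VSDTC    => '2012-03-01',    # timing: date of the measurement
);

# The role each variable plays in the observation.
my %role_of = (
    STUDYID  => 'identifier',
    DOMAIN   => 'identifier',
    USUBJID  => 'identifier',
    VSSEQ    => 'identifier',
    VSTESTCD => 'topic',
    VSORRES  => 'qualifier',
    VSORRESU => 'qualifier',
    VSDTC    => 'timing',
);

# Group the variable names by role.
my %by_role;
push @{ $by_role{ $role_of{$_} } }, $_ for sort keys %observation;

for my $role (qw(identifier topic timing qualifier)) {
    print "$role: @{ $by_role{$role} }\n";
}
```

Read this way, the identifiers say whose record it is, the topic says what was measured, the timing variable says when, and the qualifiers carry the result and its units.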

Clinical Trials

SDTM (Study Data Tabulation Model) defines a standard structure for human clinical trial (study) data tabulations that are to be submitted as part of a product application to a regulatory authority such as the United States Food and Drug Administration (FDA). The Submission Data Standards team of the Clinical Data Interchange Standards Consortium (CDISC) defines SDTM.

matching in perl

  • match any one of a set of characters by putting the options in square brackets: [abc] matches a single a, b, or c
  • . (dot) matches any single character except a newline
  • + matches one or more of the preceding character, so .+ matches one or more of any character (useful for "several characters in the middle")
  • .* matches zero or more of any character
  • ? matches zero or one of the preceding character
  • \ (backslash) before a special character means that character is to be matched literally rather than treated as a control character
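
The bullets above can be tried out with a small table of patterns. This is only a sketch; the patterns and test strings are invented examples, not taken from any particular program.

```perl
use strict;
use warnings;

# Each entry: [ description, pattern, string to test ].
my @cases = (
    [ 'character set',  qr/gr[ae]y/, 'grey'  ],  # [ae] accepts a or e
    [ 'dot',            qr/c.t/,     'cut'   ],  # . accepts the u
    [ 'dot vs newline', qr/c.t/,     "c\nt"  ],  # . refuses the newline
    [ 'one or more',    qr/a.+z/,    'az'    ],  # .+ needs at least one character
    [ 'zero or more',   qr/a.*z/,    'az'    ],  # .* is happy with none
    [ 'optional',       qr/colou?r/, 'color' ],  # u? allows the u to be absent
    [ 'escaped dot',    qr/3\.14/,   '3x14'  ],  # \. matches only a real dot
);

my %matched;
for my $case (@cases) {
    my ($name, $pattern, $string) = @$case;
    $matched{$name} = $string =~ $pattern ? 1 : 0;
    printf "%-15s %s\n", $name, $matched{$name} ? 'match' : 'no match';
}
```

The "dot vs newline", "one or more", and "escaped dot" rows print "no match"; the other four print "match".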

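modifiers go after the closing slash: /test/i

  • i – case insensitivity
  • s – lets . match a newline as well, so /foo.bar/s can match foo at the end of one line and bar at the start of the next; even /./s will match a “newline” character
  • m – lets ^ and $ match at the start and end of every line in the string (just after and just before a newline), not only at the start and end of the whole string
  • g – global matching; Perl keeps track of where in the string the last match left off, and \G in a pattern means the end of the previous match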
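
A short sketch of the four modifiers in action follows. The two-line string is made up for the example.

```perl
use strict;
use warnings;

my $text = "Foo\nbar baz";   # two lines in one string

my $case_blind   = $text =~ /foo/i     ? 1 : 0;  # 1: i ignores case
my $dot_plain    = $text =~ /Foo.bar/  ? 1 : 0;  # 0: . stops at the newline
my $dot_newline  = $text =~ /Foo.bar/s ? 1 : 0;  # 1: s lets . match "\n"
my $anchor_plain = $text =~ /^bar/     ? 1 : 0;  # 0: ^ means start of string
my $anchor_multi = $text =~ /^bar/m    ? 1 : 0;  # 1: m lets ^ match after "\n"

# With g in list context, every match is returned at once.
my @words = $text =~ /\w+/g;   # ('Foo', 'bar', 'baz')

# With g in scalar context, each match resumes where the last one stopped;
# pos() reports that position.
my @positions;
while ($text =~ /ba/g) {
    push @positions, pos($text);   # 6, then 10
}

print "words: @words\n";
print "positions after each 'ba': @positions\n";
```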

extract information from part of a match with parentheses, for example /alpha(.+)gamma/ applied to:

  • “xxalphazzzgamma”
  • “alpha beta gamma delta”

What do the (parentheses) achieve? Whatever the part of the pattern inside the parentheses matched is put into the Perl variable $1. (If you have a second set of parentheses, the contents of that set go into $2, and so on.) For the two strings above, $1 is “zzz” and “ beta ” (including the spaces on either side).
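
Here is the same pattern run against both example strings, as a minimal sketch. The second pattern, with two sets of parentheses, is an extra invented for the example to show $2.

```perl
use strict;
use warnings;

# Capture whatever sits between "alpha" and "gamma".
my %captured;
for my $string ('xxalphazzzgamma', 'alpha beta gamma delta') {
    if ($string =~ /alpha(.+)gamma/) {
        $captured{$string} = $1;
        print "'$string' -> \$1 = '$1'\n";
    }
}

# Two sets of parentheses fill $1 and $2, numbered left to right.
my ($first, $second);
if ('alpha beta gamma delta' =~ /alpha (\w+) gamma (\w+)/) {
    ($first, $second) = ($1, $2);
    print "\$1 = '$first', \$2 = '$second'\n";
}
```

It is worth copying $1 and $2 into named variables straight away, since the next successful match overwrites them.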

\n newline (line feed)
\w a word character [a-zA-Z0-9_]
\W NOT a word character, that is [^a-zA-Z0-9_]
\s white space (new line, carriage return, space, tab, form feed)
\S NOT white space
\d a digit [0-9]
\D NOT a digit, i.e. [^0-9]
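
As a quick check of the shorthand classes above, this sketch pulls different kinds of characters out of one made-up line of text.

```perl
use strict;
use warnings;

my $line = "Room 42:\tbed_7\n";

my @digits   = $line =~ /\d+/g;   # runs of digits: '42', '7'
my @words    = $line =~ /\w+/g;   # runs of word characters: 'Room', '42', 'bed_7'
my @spaces   = $line =~ /\s/g;    # the space, the tab, and the newline
my @non_word = $line =~ /\W/g;    # the space, the colon, the tab, the newline

# Remove all white space from a copy of the line.
(my $no_space = $line) =~ s/\s//g;   # 'Room42:bed_7'

print "digits: @digits\n";
print "words: @words\n";
print "white space characters: ", scalar(@spaces), "\n";
print "non-word characters: ", scalar(@non_word), "\n";
print "without white space: $no_space\n";
```

Note that the underscore counts as a word character, which is why 'bed_7' comes out as a single word.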

  • \b Match a word boundary
  • \B Match a non-(word boundary)
  • \A Match only at beginning of string
  • \Z Match only at end of string, or before newline at the end
  • \z Match only at end of string
  • \G Match only where previous m//g left off (works only with /g)
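
The anchors are easier to tell apart side by side. This is a small sketch with invented strings; the /c modifier on the last pattern is an addition not covered above, and it keeps the match position from being reset when a match fails.

```perl
use strict;
use warnings;

my $text = "cat concat\n";

# \b and \B: "cat" as a whole word versus inside a longer word.
my $whole_word  = () = $text =~ /\bcat\b/g;   # 1: only the first word
my $inside_word = () = $text =~ /\Bcat/g;     # 1: only the tail of "concat"

# \A anchors at the very start of the string.
my $starts_with_cat = $text =~ /\Acat/ ? 1 : 0;   # 1

# \Z tolerates one trailing newline; \z does not.
my $big_z   = $text =~ /concat\Z/ ? 1 : 0;   # 1
my $small_z = $text =~ /concat\z/ ? 1 : 0;   # 0

# \G: read digits only from where the previous /g match stopped.
my $input = "12ab34";
my @tokens;
while ($input =~ /\G(\d+)/gc) {
    push @tokens, $1;          # collects '12', then stops at 'a'
}
my $stopped_at = pos($input);  # 2

print "whole word: $whole_word, inside word: $inside_word\n";
print "\\Z: $big_z, \\z: $small_z\n";
print "tokens: @tokens, stopped at position $stopped_at\n";
```

Without \G the loop would skip over "ab" and also pick up "34"; with it, matching stops at the first character that does not fit.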