What is a porp? How RPS XML is different (and scarier) than eCTD

Recently, I was providing some training to a GlobalSubmit client and one of the participants asked me about an xml document that was present in a folder along with the sample eCTD that we use for training.  The document was called “porp.xml”.  I explained that GlobalSubmit’s VALIDATE product can transform eCTD into RPS and when it does, it produces the XML backbone used for RPS, which is called porp.xml.  This is a single xml document that replaces eCTD’s index.xml, regional xml, and study tagging files.

The next question was “Why in the world is it called porp?”  I couldn’t answer that one.  But recently, I attended several training classes held by the font of all RPS knowledge, GlobalSubmit’s CTO Jason Rock.  I took advantage of that opportunity to gain insight into what is different about RPS xml when compared to eCTD xml.  [For the record, it’s called porp.xml because it had to be different than eCTD – and porp is a combination of po, the business domain within HL7, and rp for regulated product.]

But on to bigger topics.  I, along with most other remotely technical people who have dabbled in eCTD for years, am pretty comfortable with eCTD xml.  I can create the xml for a sequence by hand, and I can look at a sponsor’s xml and figure out what it represents and what is wrong with it.  But when I look at the xml for RPS, all bets are off.  Jason walked us through the xml in our training class and here are some of the observations that I made:

  • With RPS, any pretense of representing the xml as “human readable” is over.  New code systems and levels of indirection make this almost impossible.
  • The table of contents is not readily apparent.  It’s created by combining a content code and keywords to determine placement within the TOC or tree.  For example, the location of a drug substance specification would be determined by a code representing its document type and then several keywords representing substance name and possibly manufacturer – not by nesting the document within a TOC section.
  • Everything is referenced by ID.  For example, a Context of Use (sort of the replacement for a leaf) contains references to the ID of a content file, as well as any keywords necessary, such as a code representing a specific route of administration or species.  Instead of just looking at something like a study ID, you see a code representing the study ID which you must then locate.
  • IDs themselves are more complex.  With RPS, IDs must be either an OID or a GUID.  An OID is formed by taking a unique numeric root (e.g. 1.3.5.7.9.24.68) and appending further numeric components so that each result stays unique (e.g. 1.3.5.7.9.24.68.1, 1.3.5.7.9.24.68.2, 1.3.5.7.9.24.68.1.1, etc.).  A GUID is a string of 32 hexadecimal digits, such as {21EC2020-3AEA-1069-A2DD-08002B30309D}.  Either way, they are much harder to identify visually than the simpler IDs used by most of today’s eCTD publishing tools.
  • Many new concepts are introduced, such as “Sets” (the equivalent of version trees in a document management system), explicit ordering of elements (a concept not present in eCTD, although many people assume it is), and status (active vs. obsolete, which applies to keywords as well as documents).
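
To make the ID and indirection points above a little more concrete, here is a rough Python sketch. It is not the RPS schema: the dictionaries, field names, file name, and code values are all made up for illustration, and the OID and GUID checks only test the general shape of the examples given above, not any rule from the standard. The point is simply that you cannot read a Context of Use on its own; you have to chase each ID to find out what it means.

```python
import re
import uuid

# Shape checks only. These are illustrative assumptions,
# not rules copied from the RPS specification.
OID_PATTERN = re.compile(r"\d+(\.\d+)+")


def is_oid(value: str) -> bool:
    """True if value looks like a dotted numeric OID, e.g. 1.3.5.7.9.24.68."""
    return OID_PATTERN.fullmatch(value) is not None


def is_guid(value: str) -> bool:
    """True if value parses as a GUID (32 hex digits; braces and hyphens optional)."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


# Hypothetical lookup tables standing in for definitions found elsewhere in the file.
DOCUMENTS = {"1.3.5.7.9.24.68.1": "substance-a-specification.pdf"}
KEYWORDS = {
    "1.3.5.7.9.24.68.2": ("substance", "Substance A"),
    "1.3.5.7.9.24.68.3": ("manufacturer", "Acme Labs"),
}

# A hypothetical Context of Use: nothing but a code and references by ID.
CONTEXT_OF_USE = {
    "content_code": "drug-substance-specification",  # made-up code value
    "document_id": "1.3.5.7.9.24.68.1",
    "keyword_ids": ["1.3.5.7.9.24.68.2", "1.3.5.7.9.24.68.3"],
}


def resolve(context_of_use: dict) -> dict:
    """Follow every ID to recover where the document sits and which file it is."""
    keywords = dict(KEYWORDS[k] for k in context_of_use["keyword_ids"])
    return {
        "placement": (
            context_of_use["content_code"],
            keywords.get("substance"),
            keywords.get("manufacturer"),
        ),
        "file": DOCUMENTS[context_of_use["document_id"]],
    }
```

Calling resolve(CONTEXT_OF_USE) yields the content code plus the resolved keywords, which is roughly the information a viewer would need to place the document in the tree. Doing that lookup by eye across a real file is what makes hand-reading so hard.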

The bottom line is that RPS XML is only for the brave.  Most of those intrepid souls creating XML by hand for eCTD will have to give up that practice, and everyone will need to ensure that they have a publishing vendor who is highly knowledgeable concerning the standard and who is able to produce very high quality software.

For those of you who would like to see what RPS xml looks like, check out Example RPS Code:  BLA Multiple Sequence, along with other informational documents (e.g. Plans, Rosters, RPS Technical Walkthrough, Implementation Guide) on the RPS wiki.

Article written by

I am GlobalSubmit’s director of professional services. My area of expertise is electronic document management and regulatory submissions, specifically eCTD. I have 12 years of biopharmaceutical industry experience in the US, Europe, Israel and Japan, concentrating on content management for Regulatory Affairs & Submissions, Clinical Development, and Manufacturing, and in submission assembly and publishing. I have worked with over 50 pharmaceutical clients. Previously, I was the practice lead for the Subject Matter Experts/Business Analysts group at First Consulting Group (FCG), now part of CSC. I supervised a group of 12 SMEs/BAs. I was instrumental in the development of the FirstDoc® suite. I developed the project delivery methodology for FirstDoc® R&D, including training materials, project plans, workshop methodology and materials, and requirements and design specifications. I managed the design and development of FirstDoc® GMP and FirstDoc® Legal. Please feel free to contact me at kathleen.clark@globalsubmit.com if you are interested in talking about eCTD products or consulting services, or if you have ideas for the blog.

2 Responses

  1. joelfinkle
    May 28, 2010 at 6:13 PM

    Kathie,
    Nice basics article on RPS’s guts… but still, what’s a PORP?

    I’m pretty sure the “RP” stands for Regulated Product… but much of HL7’s naming is pretty opaque.

  2. joelfinkle
    May 28, 2010 at 6:15 PM

    Sorry — hit the enter too soon. What you left out is that the PORP is the top-level XML object in an RPS file, or more exactly,
