
Processing hl7 type message using xslt or regex, or combination of two (XSLT 1.0)

I have an HL7-type message that I have to transform using regex, XSLT, or a combination of the two.

The format of this message is DateTime^UnitName^room^bed|, where DateTime is YYYYMMDDHHMMSS. Each location is terminated by a pipe, so each person can have one or multiple locations. The message looks like this when a patient has only one location:

20130602201605^Some Hospital^ABFG^411|

The final XML result should look like this:

<Location>
 <item>
  <when>20130602201605</when>
  <UnitName>Some Hospital</UnitName>
  <room>ABFG</room>
  <bed>411</bed>
 </item>
</Location>

I would probably use a substring-type function if there were only one location. The problem I am running into is when there is more than one. I am relatively new to XSLT and regex in general, so I don't know how to use recursion in these cases.

So if I have a message like this with multiple locations:

20130601003203^GBMC^XXYZ^110|20130602130600^Sanai^ABC^|20130602150003^John Hopkins^J615^A|

The end result should be:

<Location>

 <item>
   <when>20130601003203</when>
   <UnitName>GBMC</UnitName>
   <room>XXYZ</room>
   <bed>110</bed>
 </item>

 <item>
  <when>20130602130600</when>
  <UnitName>Sanai</UnitName>
  <room>ABC</room>
  <bed></bed>
 </item>

 <item>
  <when>20130602150003</when>
  <UnitName>John Hopkins</UnitName>
  <room>J615</room>
  <bed>A</bed>
 </item>

</Location>

So how would I solve this? Thanks in advance.

Your source message is a string, so you need to create a parser that splits the message first on pipes and then on carets. Refer to "Unable to parse ^ character", which has my original code for the parser; the solution there gives a different approach to it.

After you have the individual elements, you need to add them to your XML as nodes.
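The two steps above (split on pipes, then on carets, then add the pieces as nodes) can be sketched as follows. This is only an illustration under stated assumptions, not the parser from the linked question: it uses Python's standard library, plain `str.split` rather than a regex because both delimiters are single characters, and the names `locations_to_xml` and `FIELD_NAMES` are hypothetical. The element names are taken from the target XML in the question.

```python
import xml.etree.ElementTree as ET

# Element names come from the question's target XML.
# FIELD_NAMES and locations_to_xml are hypothetical names for this sketch.
FIELD_NAMES = ("when", "UnitName", "room", "bed")


def locations_to_xml(message: str) -> ET.Element:
    """Build a <Location> element from a 'when^unit^room^bed|...' string."""
    root = ET.Element("Location")
    # Every location ends with a pipe, so splitting leaves an empty
    # trailing piece; empty pieces are skipped.
    for record in message.split("|"):
        if not record:
            continue
        parts = record.split("^")
        # Pad short records so every <item> gets all four children.
        parts += [""] * (len(FIELD_NAMES) - len(parts))
        item = ET.SubElement(root, "item")
        # zip() ignores any components beyond the four expected ones.
        for name, value in zip(FIELD_NAMES, parts):
            ET.SubElement(item, name).text = value
    return root


if __name__ == "__main__":
    sample = "20130601003203^GBMC^XXYZ^110|20130602130600^Sanai^ABC^|"
    print(ET.tostring(locations_to_xml(sample), encoding="unicode"))
```

Because `str.split` keeps empty components, a record such as `...^ABC^` still yields four parts, so an empty bed stays in the fourth position instead of shifting the other fields.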

Given that your HL7 message is "|^~\&" encoded and not in an XML format, it is not clear how you will be using an XSLT 1.0 processor for your task. Can you describe your processing pipeline in greater detail? Your snippets are not complete messages, and it is not clear whether you will be starting with complete messages or attempting to parse isolated fields handed to a larger processing task through parameters or something similar.

If your processing starts with a complete HL7 message, I would suggest looking into the HAPI project, or a similar set of libraries, to have the messages converted from |^~\& to </> format, then invoking your XSLT on that version of the data. (You could also use the HAPI libraries in a full-Java solution. In either case, there are code examples at the HAPI site and at an Apache site on HL7.) If you are not interested in using Java at all, but are open to partial non-XSLT solutions, there are other projects that provide similar serialization options (e.g., Net::HL7 for Perl, nHAPI for VB/C#, etc.).

If you have isolated "|^~\&" encoded data in an otherwise XML-formatted file, then I would suggest looking into the str:tokenize function in the EXSLT extension functions for XSLT 1.0. (XSLT 2.0 has a built-in tokenize function.) You can have str:tokenize split your data on the field or component separators, then create elements using the tokenized substrings.

Here is a stylesheet

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:str="http://exslt.org/strings"
    extension-element-prefixes="str"
    version="1.0">

    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="data">
        <Location>
        <xsl:for-each select="str:tokenize(.,'|')">
            <xsl:call-template name="handle-field">
                <xsl:with-param name="field" select="."/>
            </xsl:call-template>
        </xsl:for-each>
        </Location>
    </xsl:template>

    <xsl:template name="handle-field">
        <xsl:param name="field"/>
        <xsl:variable name="components" select="str:tokenize($field,'^')"/>
        <item>
            <when><xsl:value-of select="$components[1]"/></when>
            <UnitName><xsl:value-of select="$components[2]"/></UnitName>
            <room><xsl:value-of select="$components[3]"/></room>
            <bed><xsl:value-of select="$components[4]"/></bed>
        </item>
    </xsl:template>

</xsl:stylesheet>

that runs over this input

<?xml version="1.0" encoding="UTF-8"?>
<data>20130601003203^GBMC^XXYZ^110|20130602130600^Sanai^ABC^|20130602150003^John Hopkins^J615^A|</data>

to produce this output with xsltproc:

<?xml version="1.0"?>
<Location>
  <item>
    <when>20130601003203</when>
    <UnitName>GBMC</UnitName>
    <room>XXYZ</room>
    <bed>110</bed>
  </item>
  <item>
    <when>20130602130600</when>
    <UnitName>Sanai</UnitName>
    <room>ABC</room>
    <bed/>
  </item>
  <item>
    <when>20130602150003</when>
    <UnitName>John Hopkins</UnitName>
    <room>J615</room>
    <bed>A</bed>
  </item>
</Location>
