Copy text and replace character in XSL

I'm transforming a DITA document to a simplified, formatting-based XML to be used as an import into Adobe InDesign. My transformation is going really well, except for one element which omits the text in the output. The element is codeblock. When I don't have a template specifying it at all, the element and any child elements are passed through to the new XML document, but none of the text is passed through. This element should be passed through with text and child elements like every other element in my document for which a specific template is not defined. There's nothing anywhere else in the XSL stylesheet that specifies codeblock or any of its attributes. I am completely stumped and cannot figure out what's going on here.

It is also worth noting that a number of inline elements (cmdname, parmname, userinput, etc.) are converted to bold on output. The downstream XML is for formatting and does not need to know semantic context.

This is what I'm trying to pass through:

<codeblock>This is the first line of my code block.
This is my second line to prove that line feeds are preserved.
This line proves that <parmname>child elements</parmname> are passed through.</codeblock>

With no template defined for codeblock, this is what I get as a result:

<codeblock><bold/></codeblock>

The actual result I want is:

<codeblock>This is the first line of my code block.&#8232;This is my second line to prove that line feeds are preserved.&#8232;This line proves that <bold>child elements</bold> are passed through.</codeblock>

I need the line feeds replaced with character entities because InDesign sees any new line that does not start with an element as a column break. My goal was to simply replace the line feed character with &#8232; using the following template:

<xsl:template match="codeblock//text()">
  <xsl:analyze-string select="." regex="(&#10;)">
    <xsl:matching-substring>
      <xsl:choose>
        <xsl:when test="regex-group(1)">&#8232;</xsl:when>                
      </xsl:choose>
    </xsl:matching-substring>
  </xsl:analyze-string>
</xsl:template>

But what I get is:

<codeblock>&#8232;<bold/>&#8232;</codeblock>

I was finally able to pass the text through using this template:

<xsl:template match="codeblock//text()">
  <xsl:copy/>
</xsl:template>

Success! Incidentally, I have to match at any level under codeblock so that it includes the text of the child parmname element too. Since I was able to successfully pass it through with <xsl:copy>, I tried this to pass the text through while replacing the line feed at the same time:

<xsl:template match="codeblock//text()">
  <xsl:copy>
    <xsl:analyze-string select="." regex="(&#10;)">
      <xsl:matching-substring>
        <xsl:choose>
          <xsl:when test="regex-group(1)">&#8232;</xsl:when>                
        </xsl:choose>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:copy>
</xsl:template>

But now it won't replace the line feed. Instead, I get this (which is what I would expect to get without any template defined):

<codeblock>This is the first line of my code block.
This is my second line to prove that line feeds are preserved.
This line proves that <bold>child elements</bold> are passed through.</codeblock>

I know this is a long and somewhat convoluted question. I just feel like if I could resolve the issue of why it's not passing the text through in the first place, the rest would be fairly straightforward. And I'm sorry, I can't provide my source XML or XSL as it's under NDA, but if you need more, let me know and I'll try to provide it. (My XSL stylesheets are made up of 12 different files, so there's no way for me to provide all of it, even if genericized.)

Any suggestions for what I might look for in my stylesheet that might explain why the text isn't coming through, or any suggestions for how to force it through as I did with <xsl:copy> while still replacing the line feeds, will be greatly appreciated!

Edited to add: It has occurred to me that the reason it's not doing the replacement is that it looks like it's not actually a line feed character. It's more like a new line in the code than a line feed character (or hard return) in the text. I think I might need to normalize the text while inserting the &#8232; character at the end of each line. Still investigating, but suggestions are welcome!

Edited with update: Thanks to the post "How to detect line breaks in XSLT", I have gotten closer, but still not quite where I need to be. With this code, I'm able to detect line feeds in the XML and insert the line break character for InDesign:

<xsl:template match="codeblock//text()">
  <xsl:for-each select="tokenize(., '\n?')[.]">
    <xsl:sequence select="."/>
    <xsl:text>&#8232;</xsl:text>
  </xsl:for-each>
</xsl:template>

However, it also inserts the line break character at the end of every text node, even if that is not the end of a line. For instance, I now get:

<codeblock>This is the first line of my code block.&#8232;This is my second line to prove that line feeds are preserved.&#8232;This line proves that &#8232;<bold>child elements&#8232;</bold> are passed through.&#8232;</codeblock>

I don't want the line feed character in front of the 'bold' start and end tags or the codeblock end tag. I just want it to appear where there's an actual new line. I tried replacing \r but that just ignored the new lines and just put it in front of the tags. Does anyone know of another escape character that would work here?

A very long question - yet it's still not clear what exactly you are asking (and no reproducible example, either).

If - as it seems - you want to replace newline characters with the line separator character in all text nodes under the codeblock element, you should be able to do it simply with:

<xsl:template match="codeblock//text()">
    <xsl:value-of select="translate(., '&#10;', '&#8232;')" />
</xsl:template>

If this doesn't work, then either you have an overriding template or the text does not contain newline characters. You can test for the first case by changing the template to say:

<xsl:template match="codeblock//text()">BINGO</xsl:template>

and observe the result to see if all targeted text nodes are changed to "BINGO". To test for the second case, you can analyze the text character-by-character using the string-to-codepoints() function.
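As a rough sketch of that second check (assuming an XSLT 2.0 processor, since string-to-codepoints() is not available in XSLT 1.0), a standalone diagnostic stylesheet along these lines would replace each text node under codeblock with a bracketed list of its code points. A line feed would show up as 10, a carriage return as 13, and a line separator as 8232. It is meant for inspection only, not for merging into the real transform as-is.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain text output keeps the diagnostic easy to read -->
  <xsl:output method="text"/>

  <!-- Diagnostic only: emit the code points of every text node
       found at any depth under codeblock, separated by spaces -->
  <xsl:template match="codeblock//text()">
    <xsl:text>[</xsl:text>
    <xsl:value-of select="string-to-codepoints(.)" separator=" "/>
    <xsl:text>]</xsl:text>
  </xsl:template>

  <!-- Text outside codeblock falls through to the built-in
       template rules and is output unchanged -->

</xsl:stylesheet>
```

If no 10 appears in the bracketed lists, the text nodes reaching the template contain no line feeds, which would point to something upstream (for example an earlier pass that normalizes whitespace) rather than to the replacement template itself.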

Your template is missing xsl:non-matching-substring to process the non-matching sections of the text node.

<xsl:template match="codeblock//text()">
  <xsl:analyze-string select="." regex="\n">
    <xsl:matching-substring>
      <xsl:text>&#8232;</xsl:text>                
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="."/>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

However, michael.hor257k's answer is simpler, as you don't need xsl:analyze-string just to replace every occurrence of a single character.
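For context, here is a hedged, self-contained sketch of how the translate() template might sit in a complete stylesheet. Only the codeblock//text() template comes from the answers above. The identity template and the parmname-to-bold rule are assumptions made for illustration, because the original stylesheet was not shared. The encoding="US-ASCII" setting is also an assumption: it asks the serializer to write U+2028 as a numeric character reference instead of a raw character, which may make the result easier to inspect.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: ASCII output, so U+2028 is serialized as a
       character reference instead of a literal character -->
  <xsl:output method="xml" encoding="US-ASCII"/>

  <!-- Assumed identity template: copies anything that has no
       more specific rule (default priority -0.5) -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumed formatting rule: parmname becomes bold -->
  <xsl:template match="parmname">
    <bold>
      <xsl:apply-templates/>
    </bold>
  </xsl:template>

  <!-- From the answer: swap each line feed (U+000A) for a
       line separator (U+2028) in text under codeblock.
       Default priority 0.5, so it wins over the identity rule -->
  <xsl:template match="codeblock//text()">
    <xsl:value-of select="translate(., '&#10;', '&#8232;')"/>
  </xsl:template>

</xsl:stylesheet>
```

Because translate() only swaps characters that are actually present, nothing is appended at the end of a text node, so no separator should appear in front of the bold tags or the codeblock end tag. XML parsers normalize CRLF line endings to a single line feed on input, so matching &#10; alone should normally be enough.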
