
Gulp tool to remove code by either CSS class or XML tag

I'm trying to create a tool that runs in a gulp process and removes specific XML tags. The idea is that the front-end markup contains dummy content. In the next stage, back-end integration, the tool takes the dummy content wrapped in an XML tag and replaces it with back-end code that uses a variable named after the tag. In this case the back-end code is PHP, but the goal is a tool that can insert any back-end code.

I have come across gulp-remove-code, but its names are all hard-coded, and the regex in the module's index.js only matches comments with specific spacing.

In addition, I have looked at gulp-inject-string to place new content before the tags. So the final idea is to take the XML tag name, inject new code above the tags, then remove the tags and everything inside them.

//markup.html

<div class="home">
  // some text
  <div class="home__text">
    <cms_home_text>
      My dummy text
    </cms_home_text>
  </div>

  // an image
  <div class="home__image">
    <cms_home_image>
     <img src="someImage.png" alt="some alt" />
    </cms_home_image>
  </div>

  // a link
  <div class="home__link">
    <cms_home_link1>
     <a href="someLink1.html">here</a>
    </cms_home_link1>
  </div>

  // another link
  <div class="home__link">
    <cms_home_link2>
     <a href="someLink2.html">here</a>
    </cms_home_link2>
  </div>
</div>

becomes

//markup.php

<div class="home">
  // some text
  <div class="home__text">
    <?php $cms_home_text ?>
  </div>

  // an image
  <div class="home__image">
    <img src="<?php $cms_home_image ?>" alt="<?php $cms_home_image_alt ?>" />
  </div>

  // a link
  <div class="home__link">
    <a href="<?php $cms_home_link1 ?>">
      <?php $cms_home_link1_text ?>
    </a>
  </div>

  // another link
  <div class="home__link">
    <a href="<?php $cms_home_link2 ?>">
      <?php $cms_home_link2_text ?>
    </a>
  </div>
</div>

I tried a few things and got this working, I think, the way you want.

const gulp = require("gulp");
const fs = require('fs');

const jsdom = require("jsdom");
const { JSDOM } = jsdom;

// hard-coded here but could be a gulp.src stream if you have more than one file to translate
const html = 'markup.html';

// gulp 3 syntax: dependencies are listed by task name, so 'addPHP' must be a string
gulp.task('default', ['addPHP']);

gulp.task('addPHP', function () {

  // read the front-end markup as a string
  var dirty = fs.readFileSync(html, 'utf8');

  var frag = new JSDOM(dirty);

  console.dir(frag.window.document.body.children[0].children);

  var HLinks = frag.window.document.querySelectorAll("div.home__link");
  var HImages = frag.window.document.querySelectorAll("div.home__image");
  var HTexts = frag.window.document.querySelectorAll("div.home__text");

//   <div class="home__text">
//      <cms_home_text>
//        My dummy text
//      </cms_home_text>
//    </div>

//    <div class="home__text">
//      <?php $cms_home_text ?>
//     </div>

  HTexts.forEach(function (el, index, list) {
    console.log(el.className);

    var cmsTagName = el.childNodes[1].nodeName.toLowerCase();
    console.log(cmsTagName);

    // cmsTagName is already "cms_home_text" here, so no "_text" suffix is added
    var innerLink = frag.window.document.createTextNode("<?php $" + cmsTagName + " ?>");
    el.replaceChild(innerLink, el.childNodes[1]);
  });

//   <cms_home_image>
//      <img src="someImage.png" alt="some alt" />
//   </cms_home_image>

//  <img src="<?php $cms_home_image ?>" alt="<?php $cms_home_image_alt ?>" />

  HImages.forEach(function (el, index, list) {
    console.log(el.className);
    var cmsTagName = el.childNodes[1].nodeName.toLowerCase();
    console.log(cmsTagName);
    var temp = frag.window.document.createElement("img");
    temp.src = "<?php $" + cmsTagName + " ?>";
    temp.alt = "<?php $" + cmsTagName + "_alt ?>";

    el.replaceChild(temp, el.childNodes[1]);
  });

//   <cms_home_link1>
//        <a href="someLink1.html">here</a>
//  </cms_home_link1>

//   <a href="<?php $cms_home_link1 ?>">
//       <?php $cms_home_link1_text ?>
//   </a>

  HLinks.forEach(function (el, index, list) {
    console.log(el.className);

    var cmsTagName = el.childNodes[1].nodeName.toLowerCase();
    console.log(cmsTagName);
    var tempLink = frag.window.document.createElement("a");
    tempLink.href = "<?php $" + cmsTagName + " ?>";

    var innerLink = frag.window.document.createTextNode("<?php $" + cmsTagName + "_text ?>");
    tempLink.appendChild(innerLink);

    el.replaceChild(tempLink, el.childNodes[1]);
  });

  // because createTextNode changes <> to htmlEntities
  var cleaned = frag.window.document.querySelector("div.home").outerHTML.replace(/&lt;/gm, "<").replace(/&gt;/gm, ">");


  fs.writeFileSync("markup.php", cleaned, 'utf8');
  return;
})
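
One caveat with the final replace() in the task above: it unescapes every &lt; and &gt; in the fragment, including any that belong to ordinary content. A narrower variant, as a minimal sketch, only touches entities that delimit a PHP tag. The helper name restorePhpTags is made up for illustration and is not part of jsdom or gulp.

```javascript
// Hypothetical helper: unescape entity-encoded angle brackets only where
// they form a PHP tag, leaving other &lt; / &gt; entities untouched.
function restorePhpTags(html) {
  return html.replace(/&lt;\?php([\s\S]*?)\?&gt;/g, function (match, body) {
    // Also unescape brackets inside the PHP code itself (e.g. "->").
    var code = body.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
    return '<?php' + code + '?>';
  });
}

// Sample serialized output; the first entity is real content and must stay.
var serialized = '<p>1 &lt; 2</p><div>&lt;?php $cms_home_text ?&gt;</div>';
console.log(restorePhpTags(serialized));
// <p>1 &lt; 2</p><div><?php $cms_home_text ?></div>
```

In the task it would replace the two chained replace() calls on outerHTML.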

I considered a pure regExp approach, but as you mentioned, that is probably too brittle. I also considered sanitize-html, which gets you a long way towards your goal and is handy to know about.

There are other HTML/DOM parsers out there, like htmlparser and xmldom, but jsdom seemed the easiest for me to work with.

The main brittle part of this code is the line:

var cmsTagName = el.childNodes[1].nodeName.toLowerCase();

which appears in each of the forEach calls. If your DOM structure varies from your example so that the tags are not at el.childNodes[1], you will have to modify this code. And watch out for empty text nodes. I seem to recall a selector that would skip empty text nodes, but I can't recall it just now.
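
One way to make that lookup less dependent on position, as a sketch only: walk the child nodes and pick the first element whose name starts with "cms_", skipping whitespace text nodes. The helper name findCmsChild is hypothetical, and the demo uses plain objects as stand-ins for DOM nodes so that it runs without jsdom. The standard DOM property el.firstElementChild also skips text nodes and may be enough when the cms tag is always the first element.

```javascript
var ELEMENT_NODE = 1; // value of Node.ELEMENT_NODE in the DOM standard

// Hypothetical helper: return the first child element whose tag name starts
// with "cms_", or null if there is none. Text and comment nodes are skipped.
function findCmsChild(el) {
  var nodes = Array.from(el.childNodes);
  for (var i = 0; i < nodes.length; i++) {
    var node = nodes[i];
    if (node.nodeType === ELEMENT_NODE &&
        node.nodeName.toLowerCase().indexOf('cms_') === 0) {
      return node;
    }
  }
  return null;
}

// Stand-in for <div class="home__text"> with whitespace around the cms tag.
var fakeDiv = {
  childNodes: [
    { nodeType: 3, nodeName: '#text' },
    { nodeType: 1, nodeName: 'CMS_HOME_TEXT' },
    { nodeType: 3, nodeName: '#text' }
  ]
};
console.log(findCmsChild(fakeDiv).nodeName.toLowerCase()); // cms_home_text
```

Inside each forEach, el.childNodes[1] would then become findCmsChild(el), with a null check before replaceChild.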

Let me know if this works for you.

The buffer comes from a gulp stream, which passes each file's file.contents to the function. For non-empty contents, we walk through every regex match in the document and read its capture groups.
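
As a rough sketch of that hand-off (not necessarily how the original gulpfile did it): a gulp pipeline is an object-mode stream of file objects, so a plain Node stream.Transform can apply a Buffer-to-Buffer function to each file's contents. The name bufferTransform and the fake file object in the demo are made up for illustration.

```javascript
const { Transform } = require('stream');

// Hypothetical wrapper: builds an object-mode Transform that runs
// `replaceFn` (a Buffer -> Buffer function) over each file's contents.
function bufferTransform(replaceFn) {
  return new Transform({
    objectMode: true,
    transform(file, encoding, callback) {
      // Files without Buffer contents (directories, streamed files)
      // are passed through unchanged.
      if (Buffer.isBuffer(file.contents)) {
        file.contents = replaceFn(file.contents);
      }
      callback(null, file);
    }
  });
}

// Demo with a stand-in file object and a trivial replacement function.
const upper = bufferTransform(function (buf) {
  return Buffer.from(buf.toString('utf8').toUpperCase());
});
upper.on('data', function (file) {
  console.log(file.contents.toString('utf8'));
});
upper.write({ contents: Buffer.from('dummy') });
upper.end();
```

In a gulpfile this would be used as gulp.src('markup.html').pipe(bufferTransform(applyReplacements)).pipe(gulp.dest('dist')), where 'dist' is an assumed output folder.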

We extract the full match, the ID, and the type, then replace the match with dynamic PHP code, so the front end gets custom variables with specific output for text, images, and links.

function applyReplacements(buffer) {
    const contents = buffer.toString('utf8');
    // Group 1: the cms_ tag name, reused as the PHP variable name.
    // Group 2: the attribute text of the opening tag.
    // Group 3: the dummy content between the tags, which is discarded.
    const regex = /<(cms_.*)(.\b[^>]*)\b[^>]*>((.|\n)*?)<\/\1>/g;
    // A replacer callback handles every match in a single pass, so there is
    // no need to track lastIndex or re-run the regex on the modified string.
    const replaced = contents.replace(regex, (fullMatch, cmsID, attributes) => {
        // The type is the first double-quoted attribute value on the tag.
        const cmsType = attributes.split('"')[1];
        // Provide the final replacement for each type.
        if (cmsType === 'cmsImage') {
            return '<img src="<?php $' + cmsID + ' ?>" alt="<?php $' + cmsID + '_alt ?>" width="100%" height="100%" />';
        }
        if (cmsType === 'cmsLink') {
            return '<a href="<?php $' + cmsID + ' ?>"><?php $' + cmsID + '_text ?></a>';
        }
        return '<?php $' + cmsID + ' ?>';
    });
    // Buffer.from replaces the deprecated new Buffer() constructor.
    return Buffer.from(replaced);
}
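
To see what the pattern captures, here is a small sketch that runs it on one tag. Note that the pattern needs an attribute on the opening tag to read the type from; the attribute name type used below is an assumption, since the tagged markup is not shown here. The attribute-less tags from the question are not matched at all.

```javascript
// Same pattern as in applyReplacements, without the g flag so that
// exec() keeps no state between calls.
const regex = /<(cms_.*)(.\b[^>]*)\b[^>]*>((.|\n)*?)<\/\1>/;

// Assumed markup: the attribute name "type" is hypothetical.
const tagged = '<cms_home_image type="cmsImage"><img src="someImage.png" /></cms_home_image>';
const m = regex.exec(tagged);
console.log(m[1]);               // cms_home_image (the ID)
console.log(m[2].split('"')[1]); // cmsImage (the type)

// The question's markup has no attribute, so nothing is matched.
const plain = '<cms_home_text>\n  My dummy text\n</cms_home_text>';
console.log(regex.exec(plain));  // null
```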
