
Strip HTML tags while exporting table using javascript

I need to export a datatable to an Excel CSV file. But when I export the table, the Excel data comes with the HTML tags (or whatever markup is inside the datatable cells). It's actually an open-source project called 'leantime' from GitHub. I tried a lot, but I can't find anything related to this issue. The example below is what a column contains when I export a file:

<a class="ticketModal" href="http://localhost/pmt/tickets/showTicket/20#subtasks">test task - TEST ASSIGNER | Plan Hrs:2 | Hrs Left: 4</a>

Please check the JavaScript code below and suggest code to strip the HTML tags.

DataTable.ext.buttons.csvHtml5 = {
        bom: !1,
        className: "buttons-csv buttons-html5",
        available: function() {
            return window.FileReader !== undefined && window.Blob
        },
        text: function(dt) {
            return dt.i18n("buttons.csv", "CSV")
        },
        action: function(e, dt, button, config) {
            this.processing(!0);
            var output = _exportData(dt, config).str,
                info = dt.buttons.exportInfo(config),
                charset = config.charset;
            if (config.customize) {
                output = config.customize(output, config, dt);
            }
            if (charset !== false) {
                charset = charset || document.characterSet || document.charset;
                if (charset) {
                    charset = ";charset=" + charset;
                }
            } else {
                charset = "";
            }
            if (config.bom) {
                output = String.fromCharCode(65279) + output;
            }
            _saveAs(new Blob([output], {
                type: "text/csv" + charset
            }), info.filename, true);
            this.processing(false);
        },
        filename: "*",
        extension: ".csv",
        exportOptions: {
            
        },
        fieldSeparator: ",",
        fieldBoundary: '"',
        escapeChar: '"',
        charset: null,
        header: !0,
        footer: !1
    }

I want my exported CSV file to contain only plain data (without HTML tags). Please give me an idea of how to change the code to keep HTML tags out of the file.

I am not familiar with DataTables. You'd need to iterate over each row and cell, and convert each cell's HTML to plain text.

You can use this function to strip the tags from HTML text:

    /**
     * Strip HTML tags and return the remaining text as a plain string
     *
     * @param {String} str HTML string
     */
    const htmlToText = function (str) {
      return str
        .replace(/\s+/g, ' ')
        .replace(/<\/(p|li|ul|ol)>/gi, ' ')
        .replace(/<(ul|ol|br[ \/]*)>/gi, ' ')
        .replace(/<li>/gi, ' • ')
        .replace(/<[a-z]+[^>]*>/gi, '')
        // closing tags, including names that contain digits such as </h2>
        .replace(/<\/[a-z][^>]*>/gi, '')
        .replace(/\s{2,}/g, ' ')
        .replace(/^\s+/, '')
        .replace(/\s+$/, '');
    };

You could add support for HTML entities as well, such as:

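        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&copy;/g, '©')
        .replace(/&reg;/g, '®')
        .replace(/&trade;/g, '™')
        .replace(/&(#(?:x[0-9a-f]+|\d+));/gi, function(m, c1) {
          // c1 is "#x..." (hexadecimal) or "#..." (decimal)
          const code = c1[1].toLowerCase() === "x"
            ? parseInt(c1.slice(2), 16)
            : parseInt(c1.slice(1), 10);
          // fromCodePoint also handles characters above U+FFFF;
          // out-of-range references are left untouched
          return code <= 0x10FFFF ? String.fromCodePoint(code) : m;
        })
        // decode &amp; last so that "&amp;lt;" yields "&lt;" rather than "<"
        .replace(/&amp;/g, '&')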
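As a self-contained variant of the entity handling, the sketch below decodes everything in a single pass, so text produced by one replacement is never decoded a second time. The `decodeEntities` name and the small `NAMED` table are illustrative choices; only the entities listed in the table are handled, and unknown names are left as they are.

```javascript
// Small illustrative table; real HTML defines many more named entities.
const NAMED = { lt: '<', gt: '>', amp: '&', copy: '©', reg: '®', trade: '™' };

function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-f]+|#\d+|[a-z]+);/gi, function (match, body) {
    if (body[0] === '#') {
      // Numeric reference: "#x41" is hexadecimal, "#65" is decimal.
      const isHex = body[1].toLowerCase() === 'x';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
    }
    // Named reference: look it up, or keep the original text if unknown.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}
```

Running this after the tag-stripping step, rather than before it, keeps an escaped `&lt;b&gt;` in the cell text from being mistaken for a real tag.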

There are some unsupported corner cases:

  • <h2 title="Use <this>">Use this</h2> -- no support for unescaped brackets inside attributes; the example should be written as <h2 title="Use &lt;this&gt;">Use this</h2>
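If that corner case matters, a tag pattern that skips over quoted attribute values is one possible workaround. This is a hedged sketch rather than a full HTML parser: `stripTagsQuoteAware` is a hypothetical name, and a tag with an unterminated quote is simply left in the output.

```javascript
// Match a tag while treating "..." and '...' attribute values as opaque,
// so a ">" inside a quoted value does not end the tag early.
const TAG = /<\/?[a-z][^\s>\/]*(?:"[^"]*"|'[^']*'|[^'">])*>/gi;

function stripTagsQuoteAware(html) {
  return html
    .replace(TAG, '')
    .replace(/\s+/g, ' ') // collapse runs of whitespace
    .trim();
}

console.log(stripTagsQuoteAware('<h2 title="Use <this>">Use this</h2>'));
```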
