
How to implement parsing of custom HTML-like tags in text

I have a task to implement my own tags that make text bold, underlined, or struck through, with arbitrary nesting. For example:

*bold text* _underlinetext_ -strikethrough-

I also need my own hyperlink syntax, like this:

[link | http://stackoverflow.com]

My first thought was to apply regular expressions. The code:

View.prototype.parseText = function(text) {
    text = text.replace(/\*([^\*]+)\*/g, '<b>$1</b>');
    text = text.replace(/\_([^\_]+)\_/g, '<u>$1</u>');
    text = text.replace(/\-([^\-]+)\-/g, '<s>$1</s>');
    text = text.replace(/\[([^\|].+)\|(.+)\]/g, '<a href="$2">$1</a>');

    return text;
};

It works, but I need extensibility. Regex does not seem like a good idea, since it's hardcoded. How can I implement this task with a finite state machine (or a jQuery plugin)? I would be grateful for any help.

No matter what you do, to extend your tagging system you will need to: 1. define the tag, and 2. replace it with the equivalent HTML.

Even if you write your own parser in JS, at the end of the day you will still have to do the two steps above, so it is no more extensible than what you have now.

Regex is the right tool for the job unless you have other requirements (e.g. replace only within such and such an element, but do something else in another element, which requires parsing).

You can wrap your regex calls in a function and simply add regex replaces to that function when you need to extend the feature. If it is needed on several pages, put it in an external JS file.

function formatUserContent(text)
{
  text = text.replace(/\*([^\*]+)\*/g, '<b>$1</b>');
  text = text.replace(/\_([^\_]+)\_/g, '<u>$1</u>');
  text = text.replace(/\-([^\-]+)\-/g, '<s>$1</s>');
  text = text.replace(/\[([^\|].+)\|(.+)\]/g, '<a href="$2">$1</a>');
  return text;
}

Once that's done, extending the feature is as simple as adding

text = text.replace(/\+([^\+]+)\+/g, '<em>$1</em>');

in the body of the function. I doubt that rolling your own finite state machine will be any easier to extend; quite the opposite.

Spending hours on a finite state machine in the hope that it might save a few minutes at some unknown time in the future is just not a good investment... unless of course you want an excuse to write a finite state machine, in which case, go ahead.
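As a rough sketch of how the wrapper function described above could be made easier to extend, the replacements can be kept in an array of rules, so that adding a tag means adding one entry. The names formatRules and formatWithRules are hypothetical and not part of the original answer.

```javascript
// Hypothetical rule table: each entry pairs a pattern with its HTML replacement.
var formatRules = [
  { pattern: /\*([^*]+)\*/g, replacement: '<b>$1</b>' },
  { pattern: /_([^_]+)_/g, replacement: '<u>$1</u>' },
  { pattern: /-([^-]+)-/g, replacement: '<s>$1</s>' }
];

// Apply every rule in order; later rules see the output of earlier ones.
function formatWithRules(text, rules) {
  return rules.reduce(function (result, rule) {
    return result.replace(rule.pattern, rule.replacement);
  }, text);
}

// Extending the syntax is now a data change rather than a code change.
formatRules.push({ pattern: /\+([^+]+)\+/g, replacement: '<em>$1</em>' });

console.log(formatWithRules('*bold* and +emphasis+', formatRules));
```

Because the rules still run one after another, this keeps the behavior of the plain function, including its limitations with interleaved markers.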

As a side note, I would recommend making your regex a little more foolproof, for example by requiring the link target to start with http://:

text = text.replace(/\[([^\|].+)\|\s*(http:\/\/.+)\]/g, '<a href="$2">$1</a>');

(Unless you have UI elements that will do the job for the user.)
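One further hardening step, offered only as a sketch: the greedy .+ in the link pattern can swallow several links on the same line into a single match. The variant below assumes that neither the label nor the URL may contain ']' or '|', and that the URL contains no whitespace. The names linkPattern and formatLinks are illustrative.

```javascript
// Assumed constraints: the label contains no ']' or '|', and the URL contains
// no ']' or whitespace, so each [label | url] pair is matched on its own.
var linkPattern = /\[\s*([^\]|]+?)\s*\|\s*(https?:\/\/[^\]\s]+)\s*\]/g;

function formatLinks(text) {
  return text.replace(linkPattern, '<a href="$2">$1</a>');
}

console.log(formatLinks('[link | http://stackoverflow.com]'));
console.log(formatLinks('[a | http://x.com] and [b | http://y.com]'));
```

Text that does not start with http:// or https:// is left untouched, which also keeps schemes such as javascript: out of the href.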

I can suggest the following implementation: http://jsfiddle.net/NwRCm/5/

It uses the State design pattern (slightly modified because of JavaScript and the purpose). Under the surface all states are implemented with regular expressions, but in my opinion that's the most efficient way.

/* View definition */

function View(container) {
    this.container = container;
    this._parsers = [];
    this._currentState = 0;
};

View.prototype.parse = function(text) {

    var self = this;
    this._parsers.forEach(function (e) {
        self._parse(e);
    });

    return this.container.innerHTML;

};

View.prototype._parse = function (parser) {
    var text = parser.parse(this.container.innerHTML);
    this.container.innerHTML = text;
    return text;
};

View.prototype.nextState = function () {
    if (this._currentState < this._parsers.length) {
        return this._parse(this._parsers[this._currentState++]);
    }
    return null;
};

View.prototype.addParser = function (parser) {
    if (parser instanceof Parser) {
        return this._parsers.push(parser);
    } else {
        throw 'The parser you\'re trying to add is not an instance of Parser';
    }
};
/* end of the View definition */

/* Simulation of interface */
function Parser() {};

Parser.prototype.parse = function () {
    throw 'Not implemented!';
};

/* Implementation of bold parser */
function BoldParser() {};

BoldParser.prototype = new Parser();

BoldParser.prototype.parse = function (text) {
    text = text.replace(/\*([^\*]+)\*/g, '<b>$1</b>');
    return text;
};

/* Implementation of underline parser */
function UnderlineParser() {};

UnderlineParser.prototype = new Parser();

UnderlineParser.prototype.parse = function (text) {
    text = text.replace(/\_([^\_]+)\_/g, '<u>$1</u>');
    return text;
};

/* Link parser */
function LinkParser() {};

LinkParser.prototype = new Parser();

LinkParser.prototype.parse = function (text) {
    text = text.replace(/\[([^\|].+)\|(.+)\]/g, '<a href="$2">$1</a>');
    return text;
};


var v = new View(document.getElementById('container'));
v.addParser(new UnderlineParser());
v.addParser(new BoldParser());
v.addParser(new LinkParser());
v.nextState();
v.nextState();
v.nextState();

Let me go a little deeper into the implementation. First we have a base "class" (constructor function), View. Each view has its base container and a list of parsers; it also remembers which parser should be applied next.

After that we have the "abstract class" (a constructor function whose prototype method throws an exception) named Parser. It defines a method, parse, which must be implemented by each parser.

After that we just define different concrete parsers and add them to the view. We can step through the states one by one (View's nextState) or run all states in a single method call (View's parse). We can also add new parsers dynamically.

One thing that could be improved is adding a flyweight factory for managing the parsers.

The approach with the "abstract" constructor function is also very useful when implementing other patterns, such as Template Method.

  • Edit: there may be a bit of overhead because of the definition of all these constructor functions and objects. Everything can be done with callbacks, i.e. each state can be a different function. I used this approach because I was looking for the answer that is easiest to understand and free of language-specific features. I hope I achieved that.
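The callback variant mentioned in the edit note could look roughly like the following sketch, where each state is a plain function from text to text. createPipeline, add, and run are hypothetical names, and the sketch works on strings instead of a DOM container.

```javascript
// Each "state" is a plain function from text to text (names are illustrative).
function bold(text) {
  return text.replace(/\*([^*]+)\*/g, '<b>$1</b>');
}

function underline(text) {
  return text.replace(/_([^_]+)_/g, '<u>$1</u>');
}

// Build a pipeline that runs the callbacks in the order they were added.
function createPipeline() {
  var steps = [];
  return {
    add: function (step) {
      if (typeof step !== 'function') {
        throw new TypeError('A pipeline step must be a function');
      }
      steps.push(step);
      return this; // allow chaining
    },
    run: function (text) {
      return steps.reduce(function (result, step) {
        return step(result);
      }, text);
    }
  };
}

var pipeline = createPipeline().add(underline).add(bold);
console.log(pipeline.run('*bold* _underlined_'));
```

The trade-off is that the type check shrinks to "is it a function", so there is no Parser base to inherit from.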

Perhaps you want to use an existing library, for instance the Markdown library at http://www.showdown.im/

If you prefer to write your own, then I'd recommend looking at the source code to see how it's parsed (and maybe the source code of Markdown processors in other languages). Some recommendations for you:

  • Use jQuery for manipulating the markup.
  • Don't use regular expressions for parsing a language. You'll run into problems when markup elements are mixed together.
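To illustrate the last recommendation, here is a minimal sketch of a character scanner that tracks open markers on a stack, so nested markers close in the right order. It is not how Showdown works, and TAGS and parseMarkup are hypothetical names. It assumes a marker only opens a tag when the same marker occurs again later, that mismatched markers are kept as literal text, and that the input contains no HTML that needs escaping.

```javascript
// Hypothetical marker table: marker character -> HTML tag name.
var TAGS = { '*': 'b', '_': 'u', '-': 's' };

function parseMarkup(text) {
  var stack = []; // markers that are currently open, innermost last
  var out = '';
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    var tag = TAGS[ch];
    if (!tag) {
      out += ch; // ordinary character
    } else if (stack.length > 0 && stack[stack.length - 1] === ch) {
      stack.pop(); // closes the innermost open tag
      out += '</' + tag + '>';
    } else if (stack.indexOf(ch) === -1 && text.indexOf(ch, i + 1) !== -1) {
      stack.push(ch); // opens a tag only if a closing marker exists later
      out += '<' + tag + '>';
    } else {
      out += ch; // mismatched or unpaired marker stays literal
    }
  }
  // Close anything still open so the output stays well formed.
  while (stack.length > 0) {
    out += '</' + TAGS[stack.pop()] + '>';
  }
  return out;
}

console.log(parseMarkup('*bold _both_ text* -gone-'));
```

A single '-' as in 'a-b' is left alone, but two hyphens in ordinary prose would still be read as strikethrough, so a real implementation would need stricter rules for when a marker counts.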
