
Deleting portion of a text string in PHP

I've constructed a form, but some rows of the form can potentially be returned blank, with default values. I'm trying to find a way of searching the form output and then deleting the bit which I know is not needed - which looks like:

<tr bgcolor="#FFFFFF">
<td>2E</td>
<td id="8003">-800</td>
</tr>

I've used str_replace() effectively on a couple of bits, but my major problem is that bgcolor="#FFFFFF" can CHANGE to different hex values and also

I could write a str_replace() for every possible outcome I guess, but is there a preg_replace() solution for anything like this? It would have to be a pretty complicated regular expression.

You can use regular expression replacement with preg_replace(). For example, to remove a bgcolor attribute that may or may not be there, with a variable colour string:

$s = preg_replace('! bgcolor="#[0-9a-fA-F]{6}"!', '', $s);

But, as always, it's not recommended to use regular expressions to parse or process HTML. Lots of things can go wrong with this:

  • 3 letter colour code;
  • single quotes on attribute;
  • no quotes on attribute;
  • variable white space;
  • uppercase attribute;
  • colour names;
  • rgb(N,N,N) and other legal formats;
  • and so on.

And that's just for a limited subset of your problem.
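To make the list above concrete, here is a small sketch that runs the strict pattern from this answer against a few of those variants. The sample tags are made up for illustration; only the first one is in the form the pattern expects.

```php
<?php
// The pattern from the answer: lowercase attribute name, double quotes, six hex digits.
$pattern = '! bgcolor="#[0-9a-fA-F]{6}"!';

// Illustrative inputs (assumptions, not taken from the real form output).
$variants = [
    '<tr bgcolor="#FFFFFF">',   // the expected form: matched and removed
    '<tr bgcolor="#FFF">',      // 3 letter colour code
    "<tr bgcolor='#FFFFFF'>",   // single quotes on attribute
    '<tr bgcolor=#FFFFFF>',     // no quotes on attribute
    '<tr BGCOLOR="#FFFFFF">',   // uppercase attribute
    '<tr bgcolor="white">',     // colour name
];

foreach ($variants as $tag) {
    $result = preg_replace($pattern, '', $tag);
    // Report whether the replacement changed anything for this variant.
    printf("%-26s => %s\n", $tag, $result === $tag ? 'unchanged' : $result);
}
```

Every variant except the first comes back unchanged, which is the failure mode the list describes.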

It's far more robust to use DOM processing methods, of which there are several variants in PHP. See Parse HTML With PHP And DOM.
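As a rough sketch of the DOM approach, the following uses PHP's built-in DOMDocument to drop a row by its cell contents, whatever the bgcolor looks like. The function name removeDefaultRows(), the rule "a row is unwanted when its cell texts equal a given list", and the 1A / 8001 / 250 sample row are all assumptions for illustration; adapt the test to however your form marks a blank row.

```php
<?php
// Hypothetical helper: remove every <tr> whose <td> texts equal $defaults.
function removeDefaultRows(string $html, array $defaults): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    // The flags stop libxml from adding a doctype and <html>/<body> wrappers.
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();

    // Copy the live node list first; removing nodes while iterating it skips rows.
    foreach (iterator_to_array($doc->getElementsByTagName('tr')) as $row) {
        $cells = [];
        foreach ($row->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        if ($cells === $defaults) {
            $row->parentNode->removeChild($row);
        }
    }
    return $doc->saveHTML();
}

// Sample input: the second row is the one from the question, with a
// single-quoted 3 letter colour that the strict regex would miss.
$html = '<table>'
      . '<tr bgcolor="#EEEEEE"><td>1A</td><td id="8001">250</td></tr>'
      . "<tr bgcolor='#fff'><td>2E</td><td id=\"8003\">-800</td></tr>"
      . '</table>';

$clean = removeDefaultRows($html, ['2E', '-800']);
echo $clean;
```

Because the parser normalises quoting and attribute case, none of the bgcolor variants listed above need special handling here.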

Couldn't you generate the correct HTML in PHP without the need to alter it later with string replacement?

Maybe with some IF ELSE statement.

It seems to me a better approach.
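A minimal sketch of that idea, assuming the form data is available as an array before the HTML is built. The $rows structure, its keys, and renderRows() are hypothetical, and "blank" is taken here to mean an empty value; substitute whatever test fits your defaults.

```php
<?php
// Hypothetical form rows; keys and values are illustrative only.
$rows = [
    ['label' => '1A', 'id' => '8001', 'value' => '250'],
    ['label' => '2E', 'id' => '8003', 'value' => ''],   // left blank
];

function renderRows(array $rows): string
{
    $html = '';
    foreach ($rows as $row) {
        // Skip blank rows up front instead of emitting them and deleting them later.
        if (trim($row['value']) === '') {
            continue;
        }
        $html .= sprintf(
            "<tr bgcolor=\"#FFFFFF\">\n<td>%s</td>\n<td id=\"%s\">%s</td>\n</tr>\n",
            htmlspecialchars($row['label']),
            htmlspecialchars($row['id']),
            htmlspecialchars($row['value'])
        );
    }
    return $html;
}

echo renderRows($rows);
```

With this shape, the colour of the row never matters, because the unwanted row is never written in the first place.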

A regex to match hexadecimal strings is actually quite easy:

/[0-9a-fA-F]+/

You'll probably hear that you should use an HTML parser to remove the unwanted nodes - maybe you should, but if you know what the input string is going to be like, then maybe not.

To match that first line in your example, you'd need this regex:

$string = preg_replace('/<tr bgcolor="#[0-9a-fA-F]+">/', '', $string);
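If the input really is as predictable as described, the same idea can be stretched to cover the whole row rather than only its first line. This is a sketch under that assumption: removeKnownRow() is a hypothetical helper, and the pattern only works if the cells are always exactly 2E and -800 with a double-quoted hex bgcolor.

```php
<?php
// Hypothetical helper: delete the known placeholder row, whatever hex colour it has.
function removeKnownRow(string $html): string
{
    // "~" is the delimiter so the "/" in closing tags needs no escaping;
    // \s* absorbs the line breaks between the tags.
    $pattern = '~<tr bgcolor="#[0-9a-fA-F]+">\s*<td>2E</td>\s*<td id="8003">-800</td>\s*</tr>\s*~';
    return preg_replace($pattern, '', $html);
}

// The row from the question, with a different hex colour.
$string = "<tr bgcolor=\"#C0C0C0\">\n<td>2E</td>\n<td id=\"8003\">-800</td>\n</tr>\n";
echo removeKnownRow($string);
```

Any of the variations listed in the other answer (single quotes, colour names, and so on) would still slip past this pattern.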

Can you not just check whether the field is "blank" in the code before you try to display it? Or put in some logic so that if it's blank, it isn't output?
