Get non-numeric characters then number on each line of a block of text

I have some strings which can be in the following format:

sometext moretext 01 text
text sometext moretext 002
text text 1 (somemoretext)
etc

I want to split these strings into the following:

  • the text before the number, and
  • the number

For example:

text text 1 (somemoretext)

When split, this should output:

text = text text
number = 1

Anything after the number can be discarded.

preg_match('/[^\d]+/', $string, $textMatch);
preg_match('/\d+/', $string, $numMatch);

$text = $textMatch[0];
$num = $numMatch[0];

Alternatively, you can use preg_match_all with capture groups to do it all in one shot:

preg_match_all('/^([^\d]+)(\d+)/', $string, $match);

$text = $match[1][0];
$num = $match[2][0];
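
Lines such as "etc" contain no digits, in which case the match arrays above stay empty and the index lookups fail. A minimal sketch of a guard, assuming a hypothetical helper named splitTextAndNumber (not part of the original answer) and a single call to preg_match, which is enough for one line:

```php
<?php
// Hypothetical helper: returns ['text' => ..., 'number' => ...],
// or null when the line does not fit the "text then number" shape.
function splitTextAndNumber(string $line): ?array
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match('/^([^\d]+)(\d+)/', $line, $match) !== 1) {
        return null;
    }

    // rtrim() drops the space that sits between the text and the number.
    return ['text' => rtrim($match[1]), 'number' => $match[2]];
}

var_export(splitTextAndNumber('text text 1 (somemoretext)'));
echo "\n";
var_export(splitTextAndNumber('etc'));
echo "\n";
```

Because the pattern uses [^\d]+ (one or more), a line that starts directly with a digit also yields null in this sketch.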

Use preg_match_all(), and if you wish to match every line, add the m modifier:

$string = 'sometext moretext 01 text
text sometext moretext 002
text text 1 (somemoretext)
etc';
preg_match_all('~^(.*?)(\d+)~m', $string, $matches);

All your results are in the $matches array, which looks like this:

Array
(
    [0] => Array
        (
            [0] => sometext moretext 01
            [1] => text sometext moretext 002
            [2] => text text 1
        )
    [1] => Array
        (
            [0] => sometext moretext 
            [1] => text sometext moretext 
            [2] => text text 
        )
    [2] => Array
        (
            [0] => 01
            [1] => 002
            [2] => 1
        )
)

Output example:

foreach ($matches[1] as $k => $text) {
    $int = $matches[2][$k];
    echo "$text => $int\n";
}
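
As a variation on the loop above, the PREG_SET_ORDER flag groups the results per match rather than per capture group, so each line's text and number arrive together. This is only a sketch; the $rows variable and the rtrim() call are additions for illustration:

```php
<?php
$string = "sometext moretext 01 text\n"
    . "text sometext moretext 002\n"
    . "text text 1 (somemoretext)\n"
    . "etc";

// With PREG_SET_ORDER, $matches[i] holds [full match, group 1, group 2].
preg_match_all('~^(.*?)(\d+)~m', $string, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as [, $text, $int]) {
    // Skip the full match and trim the trailing space from the text part.
    $rows[] = rtrim($text) . ' => ' . $int;
}

echo implode("\n", $rows), "\n";
```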

The other answers do not demonstrate the use of \D to match non-digit characters. \D is the opposite of \d.

* as a quantifier means zero or more, and + means one or more. A quantifier immediately followed by ? is made "lazy": it tries to make the shortest qualifying match. This can have a negative impact on performance and should be avoided when possible.
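
To illustrate the difference between greedy and lazy quantifiers, here is a small sketch using one of the sample lines. The variable names $greedy and $lazy are chosen for this example only:

```php
<?php
$line = 'text sometext moretext 002';

// Greedy: .* takes as much as it can and gives back only what \d+ needs,
// so the number group ends up with just the last digit.
preg_match('/^(.*)(\d+)/', $line, $greedy);

// Lazy: .*? takes as little as it can, so \d+ receives the whole number.
preg_match('/^(.*?)(\d+)/', $line, $lazy);

var_export(['greedy' => [$greedy[1], $greedy[2]], 'lazy' => [$lazy[1], $lazy[2]]]);
echo "\n";
```

With \D* in place of .* the greedy form is safe, because \D cannot consume any digit in the first place.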

The ^ means the start of a line when the pattern has the m flag.
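
The effect of the m flag on ^ can be seen by running the same pattern with and without it. This is a minimal sketch; the sample text and variable names are made up for the illustration:

```php
<?php
$text = "line 1\nline 22\nline 333";

// Without m, ^ anchors only at the very start of the subject string.
$withoutM = preg_match_all('/^\D*\d+/', $text, $a);

// With m, ^ also anchors immediately after each newline.
$withM = preg_match_all('/^\D*\d+/m', $text, $b);

echo "without m: $withoutM match(es), with m: $withM match(es)\n";
```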

Code:

$text = 'sometext moretext 01 text
text sometext moretext 002
text text 1 (somemoretext)
etc';

preg_match_all('/^(\D*)(\d+)/m', $text, $matches);

var_export([
    'non-digit' => $matches[1],
    'digit' => $matches[2]
]);

Output:

array (
  'non-digit' => 
  array (
    0 => 'sometext moretext ',
    1 => 'text sometext moretext ',
    2 => 'text text ',
  ),
  'digit' => 
  array (
    0 => '01',
    1 => '002',
    2 => '1',
  ),
)
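
One caveat worth noting: \D also matches newline characters, so a line without digits that comes before a line with digits gets absorbed into the first group. The sample input avoids this only because "etc" is the last line. A sketch of the problem and one possible workaround, assuming a reordered input invented for this illustration:

```php
<?php
// Assumed input: the digit-free line now comes first.
$text = "etc\ntext text 1 (somemoretext)";

// \D matches "\n" too, so group 1 spans both lines.
preg_match_all('/^(\D*)(\d+)/m', $text, $spanning);

// Excluding "\n" from the class keeps every match on a single line.
preg_match_all('/^([^\d\n]*)(\d+)/m', $text, $perLine);

var_export(['spanning' => $spanning[1], 'perLine' => $perLine[1]]);
echo "\n";
```

If the input may use Windows line endings, the class would also need to exclude \r.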

If you want to discard potential spaces at the end of the non-numeric string, add ? to make the first group lazy, then match zero or more whitespace characters with \s* outside the capture groups.

preg_match_all('/^(\D*?)\s*(\d+)/m', $text, $matches);
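
For reference, a short sketch of what this variant captures for the same sample text. The array keys reuse the labels from the earlier var_export call:

```php
<?php
$text = "sometext moretext 01 text\n"
    . "text sometext moretext 002\n"
    . "text text 1 (somemoretext)\n"
    . "etc";

// The lazy group stops before the whitespace, which \s* then consumes
// without capturing it.
preg_match_all('/^(\D*?)\s*(\d+)/m', $text, $matches);

var_export([
    'non-digit' => $matches[1],
    'digit' => $matches[2],
]);
echo "\n";
```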
