
Splitting string into sections while maintaining all non-word characters

I'm working on an encryption function just for fun (for a non-production environment). Currently I run my encrypt function like this:

encrypt("This is a string.");   

This produces the following string:

GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.

This is perfect, exactly what I wanted and expected. However, now I'm trying to write a decrypt function. Every encrypted character is represented by a single capital letter followed by 3 lowercase letters (as you can see from the example above).

My plan was to run preg_split() to get the different letters of the string.

Here is my current PHP code (pattern ([A-Z][a-z]{3})):

print_r(preg_split("/([A-Z][a-z]{3})/", $string));

There are a couple of problems with this. While testing, I discovered that it does not return what I expected. The return value is:

Array
(
    [0] => 
    [1] => 
    [2] => 
    [3] => 
    [4] =>  
    [5] => 
    [6] =>  
    [7] =>  
    [8] => 
    [9] => 
    [10] => 
    [11] => 
    [12] => 
    [13] => .
)

(Via eval.in)

So this has the proper number of elements, but they are all blank. Why are all the values blank?

Another thing I thought of was that I need to include other characters such as spaces, commas, periods, etc. in the preg_split() return. In the output I got from eval.in, it appears as though the final period has been included. Is this true for spaces and other characters as well, or do I need to do something special for these characters?

It's "splitting" on those matches, so they are removed. You want preg_match_all, or use PREG_SPLIT_DELIM_CAPTURE with PREG_SPLIT_NO_EMPTY.

// -1 means "no limit"; passing null for $limit is deprecated as of PHP 8.1.
print_r(preg_split("/([A-Z][a-z]{3})/",
                   $string,
                   -1,
                   PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY));
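
To make the difference concrete, here is a minimal sketch that runs the same pattern with and without the flags on the encrypted sample from the question. The variable names $plain and $parts are only illustrative. It suggests that spaces and the final period come back as their own elements, so nothing special should be needed for them.

```php
<?php
// Encrypted sample taken from the question.
$string = "GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.";

// Without flags, the matched delimiters are discarded and only the text
// between them (mostly empty strings) is returned.
$plain = preg_split("/([A-Z][a-z]{3})/", $string);

// DELIM_CAPTURE also returns the captured tokens; NO_EMPTY drops the
// empty strings that sit between adjacent tokens.
$parts = preg_split(
    "/([A-Z][a-z]{3})/",
    $string,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

echo count($plain), " pieces without flags, ", count($parts), " with flags\n";
echo implode("|", $parts), "\n";
```

Joining $parts back together with an empty separator should reproduce the original string, which is a quick way to check that no character was lost.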

You should remove the capturing group () and use preg_match_all.

$text = "GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.";
preg_match_all("/[A-Z][a-z]{3}|(?: |,|\.)/", $text, $match);
print_r($match);

Output:

Array
(
    [0] => Array
        (
            [0] => Gnul
            [1] => Hynk
            [2] => Afds
            [3] => Gknp
            [4] =>  
            [5] => Afds
            [6] => Gknp
            [7] =>  
            [8] => Wgbf
            [9] =>  
            [10] => Gknp
            [11] => Lnug
            [12] => Buip
            [13] => Afds
            [14] => Cbhg
            [15] => Byfg
            [16] => .
        )
)
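
From here, a decrypt step could walk the matched tokens and translate each one. The following is only a sketch under stated assumptions: the lookup table is inferred from the single example in the question and covers just those nine characters, decrypt() is a hypothetical helper name, and the pattern is widened from the one above to `/[A-Z][a-z]{3}|./s` so that any non-token character (not just space, comma and period) is kept.

```php
<?php
// Hypothetical lookup table, inferred only from the single example in the
// question ("This is a string."); the real cipher table is not shown.
$table = [
    "Gnul" => "T", "Hynk" => "h", "Afds" => "i", "Gknp" => "s",
    "Wgbf" => "a", "Lnug" => "t", "Buip" => "r", "Cbhg" => "n",
    "Byfg" => "g",
];

// decrypt() is a hypothetical helper name, not the asker's actual function.
function decrypt(string $text, array $table): string
{
    // Try a four-letter token first; otherwise take any single character,
    // so spaces and punctuation are kept as their own matches.
    preg_match_all("/[A-Z][a-z]{3}|./s", $text, $match);

    $out = "";
    foreach ($match[0] as $token) {
        // Known tokens are translated; anything else passes through as-is.
        $out .= $table[$token] ?? $token;
    }
    return $out;
}

echo decrypt("GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.", $table), "\n";
```

Tokens that are missing from the table are left untouched rather than dropped, which makes gaps in the table easy to spot in the output.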

