
Splitting string into sections while maintaining all non-word characters

I'm working on an encryption function just for fun (for a non-production environment). Currently running my encrypt function like this:

encrypt("This is a string.");   

Produces the following string:

GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.

This is perfect, exactly what I wanted and expected. However, now I'm trying to write a decrypt function. Every encrypted character becomes a single capital letter followed by 3 lowercase letters (as you can see from the example above).

My plan was to run preg_split() to get the different letters of the string.

Here is my current PHP code (pattern ([A-Z][a-z]{3})):

print_r(preg_split("/([A-Z][a-z]{3})/", $string));

There are a couple of problems with this. While testing, I discovered that it is not returning what I expected, the return is:

Array
(
    [0] => 
    [1] => 
    [2] => 
    [3] => 
    [4] =>  
    [5] => 
    [6] =>  
    [7] =>  
    [8] => 
    [9] => 
    [10] => 
    [11] => 
    [12] => 
    [13] => .
)

(Via eval.in)

So this has the proper number of elements, but almost all of them are blank. Why are the values blank?

Another thing that I thought of was that I needed to include other characters such as spaces, commas, periods etc in the preg_split() return. In the return I got from eval.in, it appears as though the final period has been included. Is this true for spaces and other characters as well, or do I need to do something special in cases of these characters?

preg_split() is splitting on those matches, so the matched text itself is removed and only what lies between the matches is returned. You want preg_match_all(), or preg_split() with PREG_SPLIT_DELIM_CAPTURE and PREG_SPLIT_NO_EMPTY:

print_r(preg_split("/([A-Z][a-z]{3})/",
                   $string,
                   -1, // no limit; passing null here is deprecated as of PHP 8.1
                   PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY));
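
To make the difference concrete, here is a small sketch that runs the split on the sample string from the question, once without flags and once with them. The variable names $plain and $tokens are just for illustration.

```php
<?php
$string = "GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.";

// No flags: the 4-letter matches are the delimiters, so they are thrown away
// and only the (mostly empty) text between them comes back.
$plain = preg_split('/[A-Z][a-z]{3}/', $string);

// DELIM_CAPTURE keeps whatever the capturing group matched;
// NO_EMPTY drops the empty strings between adjacent matches.
$tokens = preg_split('/([A-Z][a-z]{3})/', $string, -1,
                     PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

echo count($plain), "\n";          // 14, as in the question's output
echo count($tokens), "\n";         // 17
echo implode('|', $tokens), "\n";  // Gnul|Hynk|Afds|Gknp| |Afds|Gknp| |Wgbf| |Gknp|...
```

Note that with this approach a run of several non-matching characters (for example ", ") comes back as one element rather than one element per character.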

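You should remove the capturing group () and use preg_match_all(). Add each non-letter character you want to keep as an alternative in the pattern (here: space, comma, and period); any character the pattern does not match is skipped.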

$text = "GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.";
preg_match_all("/[A-Z][a-z]{3}|(?: |,|\.)/", $text, $match);
print_r($match);

Output:

Array
(
    [0] => Array
        (
            [0] => Gnul
            [1] => Hynk
            [2] => Afds
            [3] => Gknp
            [4] =>  
            [5] => Afds
            [6] => Gknp
            [7] =>  
            [8] => Wgbf
            [9] =>  
            [10] => Gknp
            [11] => Lnug
            [12] => Buip
            [13] => Afds
            [14] => Cbhg
            [15] => Byfg
            [16] => .
        )
)
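
If you want to keep every other character rather than listing them one by one, one option is to let the second alternative match any single character. Below is a rough sketch of how the tokens could then feed a decrypt step. The function name decrypt_sketch and the $reverse table are assumptions for illustration: the table is inferred only from the sample string in the question ("This is a string."), since the real encrypt() mapping is not shown.

```php
<?php
// Reverse lookup inferred from the question's sample only; a real table
// would have to come from the actual encrypt() implementation.
$reverse = [
    'Gnul' => 'T', 'Hynk' => 'h', 'Afds' => 'i', 'Gknp' => 's', 'Wgbf' => 'a',
    'Lnug' => 't', 'Buip' => 'r', 'Cbhg' => 'n', 'Byfg' => 'g',
];

function decrypt_sketch(string $text, array $reverse): string
{
    // Match either a 4-letter token or, failing that, any single character.
    // The "s" modifier lets "." match newlines as well.
    preg_match_all('/[A-Z][a-z]{3}|./s', $text, $match);

    $out = '';
    foreach ($match[0] as $token) {
        // Known tokens are translated; everything else passes through unchanged.
        $out .= $reverse[$token] ?? $token;
    }
    return $out;
}

echo decrypt_sketch("GnulHynkAfdsGknp AfdsGknp Wgbf GknpLnugBuipAfdsCbhgByfg.", $reverse), "\n";
// This is a string.
```

Because the 4-letter alternative is tried first at each position, spaces, punctuation, and anything else only fall through to the single-character branch when no token starts there.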


 