PHP Regex Pattern Need String

I have the following problem: in array 3 I need the language names, not the hex code, and in array 4 I want only the audio codecs, nothing else such as hex values.

I have no solution. I have tried everything I can think of, but every attempt gives the wrong result. Can someone help me?

Here is the input data:

Stream #0:1[0x1100](ger): Audio: dts (DTS) ([130][0][0][0] / 0x0082), 48000 Hz, 5.1(side), s16, 1536 kb/s
Stream #0:2(eng): Audio: dts (DTS-HD MA) ([134][0][0][0] / 0x0086), 48000 Hz, 5.1(side), s16, 1536 kb/s
Stream #0:3: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, 192 kb/s
Stream #1:0: Audio: mp2, 41000 Hz, stereo, 48 kb/s

Here is my regex:

/Stream #([0-9\.]+)?:([0-9\.]+).([A-Za-z][A-Za-z]*)?.+Audio: ([^,]+?), ([0-9]+) Hz, ?([^\n,]*)/

Here is the output array:

Array
(
[0] => Array
    (
        [0] => Stream #0:1[0x1100](ger): Audio: dts (DTS) ([130][0][0][0] / 0x0082), 48000 Hz, 5.1(side)
        [1] => Stream #0:2(eng): Audio: dts (DTS-HD MA) ([134][0][0][0] / 0x0086), 48000 Hz, 5.1(side)
        [2] => Stream #0:3: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo
        [3] => Stream #1:0: Audio: mp2, 41000 Hz, stereo
    )

[1] => Array
    (
        [0] => 0
        [1] => 0
        [2] => 0
        [3] => 1
    )

[2] => Array
    (
        [0] => 1
        [1] => 2
        [2] => 3
        [3] => 0
    )

[3] => Array
    (
        [0] => 
        [1] => eng
        [2] => 
        [3] => 
    )

[4] => Array
    (
        [0] => dts (DTS) ([130][0][0][0] / 0x0082)
        [1] => dts (DTS-HD MA) ([134][0][0][0] / 0x0086)
        [2] => mp2 ([3][0][0][0] / 0x0003)
        [3] => mp2
    )

[5] => Array
    (
        [0] => 48000
        [1] => 48000
        [2] => 48000
        [3] => 41000
    )

[6] => Array
    (
        [0] => 5.1(side)
        [1] => 5.1(side)
        [2] => stereo
        [3] => stereo
    )

)

If you only want to match the immediate codec name after Audio: then remove all the extraneous match groups, and just look for alphanumeric characters:

 /Stream #([0-9\.]+)?:([0-9\.]+).([A-Za-z][A-Za-z]*)?.+Audio: (\w+)/

You could also just have used strtok($value, " ") to split out the first part from the result array entries.
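
Both suggestions can be sketched roughly as follows. This is only an illustration: the variable names are made up, the sample lines are copied from the question, and the patterns are shortened to the part around Audio: so that the difference is easy to see.

```php
<?php
// Sample lines copied from the question.
$lines = [
    'Stream #0:1[0x1100](ger): Audio: dts (DTS) ([130][0][0][0] / 0x0082), 48000 Hz, 5.1(side), s16, 1536 kb/s',
    'Stream #0:3: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, 192 kb/s',
    'Stream #1:0: Audio: mp2, 41000 Hz, stereo, 48 kb/s',
];

// Option 1: capture only the word characters that directly follow "Audio: ".
$codecsByRegex = [];
foreach ($lines as $line) {
    if (preg_match('/Audio: (\w+)/', $line, $m)) {
        $codecsByRegex[] = $m[1];
    }
}

// Option 2: keep the broad capture from the question and cut it at the first space.
$codecsByStrtok = [];
foreach ($lines as $line) {
    if (preg_match('/Audio: ([^,]+?),/', $line, $m)) {
        // strtok() returns everything up to the first delimiter character.
        $codecsByStrtok[] = strtok($m[1], ' ');
    }
}

print_r($codecsByRegex);
print_r($codecsByStrtok);
```

Either way the brackets and hex values after the codec name are dropped. The regex variant avoids a second pass over the results, while strtok lets you keep the existing pattern unchanged.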

When parsing free-form text, you have to rely on whatever cues you can find. A small sample is usually not enough to go on, because you can't see the program that generates the output.

Taking that into account, the pattern below might fix your basic concern. It is written in free-spacing form, so it needs the x modifier. Even so, I would break the line up into a few known simple parts, then parse those separately.

Stream[ ]+\#
([0-9.]+)? : ([0-9.]+)          # 1,2  title : chapter
[^:(]* (?:\(([^)]*)\))?         # 3    language
[^:]* :
[ ]* Audio:
[^(\w,]* (\w*)                  # 4    audio codec
[^,]* ,
[ ]*([0-9]*)[ ]* (?i:[mkhz]+)   # 5    audio frequency
[^,]* ,
[ ]* ([^\n,]*)                  # 6    audio channels
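
As a rough sketch of how a pattern like this could be run from PHP: the x modifier tells PCRE to ignore unescaped whitespace and # comments outside character classes. The ~ delimiter, the nowdoc layout, and the array keys are choices made for this illustration, not part of the answer above, and the input is the sample from the question.

```php
<?php
// Sample input copied from the question.
$input = <<<'TXT'
Stream #0:1[0x1100](ger): Audio: dts (DTS) ([130][0][0][0] / 0x0082), 48000 Hz, 5.1(side), s16, 1536 kb/s
Stream #0:2(eng): Audio: dts (DTS-HD MA) ([134][0][0][0] / 0x0086), 48000 Hz, 5.1(side), s16, 1536 kb/s
Stream #0:3: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, 192 kb/s
Stream #1:0: Audio: mp2, 41000 Hz, stereo, 48 kb/s
TXT;

// A nowdoc keeps backslashes literal, so the pattern can be pasted unchanged.
// "~" is used as the delimiter because it does not occur in the pattern.
$pattern = <<<'REGEX'
~
Stream[ ]+\#
([0-9.]+)? : ([0-9.]+)          # 1,2  numbers before and after the colon
[^:(]* (?:\(([^)]*)\))?         # 3    language, if present
[^:]* :
[ ]* Audio:
[^(\w,]* (\w*)                  # 4    audio codec
[^,]* ,
[ ]*([0-9]*)[ ]* (?i:[mkhz]+)   # 5    audio frequency
[^,]* ,
[ ]* ([^\n,]*)                  # 6    audio channels
~x
REGEX;

// PREG_SET_ORDER yields one array per matched line instead of one per group.
preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

$streams = [];
foreach ($matches as $m) {
    $streams[] = [
        'language' => $m[3],        // empty string when no (xxx) tag is present
        'codec'    => $m[4],
        'rate'     => (int) $m[5],
        'channels' => $m[6],
    ];
}

print_r($streams);
```

With the sample data this gives ger and eng as languages for the first two streams and an empty string for the others, and the codec column holds only dts or mp2.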
