Extract values using regex in Python

How can I use a regex to extract the value of input from the following string?

{"eventid":"cowrie.command.input","input":"echo \"root:twrHxXE7YmIr\"|chpasswd|bash","message":"CMD: echo \"root:twrHxXE7YmIr\"|chpasswd|bash","sensor":"cowrieHoneypot2","timestamp":"2021-05-06T10:35:25.171419Z","src_ip":"121.201.95.106","session":"1ce15808ec97"}

This is the regex pattern I am currently using:

\"input\":\"[a-zA-z0-9\s=+~_\\$-|]*\"

But it returns only part of the value:

"input":"echo \"

So how can I modify this regex to get the full value?

You need to add a comma between each } and { that are separated by a newline, which a simple .replace("}\n{", "},\n{") can do.

Then you can parse the JSON with the json module:

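import json

filepath = r'PATH_TO_FILE'

# Read the whole log file into one string.
with open(filepath, 'r') as f:
    contents = f.read()

# Put commas between adjacent objects and wrap everything in [...]
# so the text becomes a single valid JSON array.
j = json.loads('[{}]'.format(contents.replace("}\n{", "},\n{")))
# Collect the "input" field from every event that has one.
values = [n["input"] for n in j if 'input' in n]
print(values)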

With your data, the output is:

['enable', 'system', 'system', 'shell', 'shell', 'sh', 'cat /proc/mounts; /bin/busybox KUHJY', 'cd /dev/shm; cat .s || cp /bin/echo .s; /bin/busybox KUHJY', 'tftp; wget; /bin/busybox KUHJY', 'dd bs=52 count=1 if=.s || cat .s || while read i; do echo $i; done < .s', 'while read i', '/bin/busybox KUHJY', 'rm .s; exit', 'cat /proc/cpuinfo | grep name | wc -l', 'echo "root:QEqRsCr9yFa5"|chpasswd|bash', "cat /proc/cpuinfo | grep name | head -n 1 | awk '{print $4,$5,$6,$7,$8,$9;}'", "free -m | grep Mem | awk '{print $2 ,$3, $4, $5, $6, $7}'", 'ls -lh $(which ls)', 'which ls', 'crontab -l', 'w', 'uname -m', 'cat /proc/cpuinfo | grep model | grep name | wc -l', 'top', 'uname', 'uname -a', 'lscpu | grep Model', 'cd ~ && rm -rf .ssh && mkdir .ssh && echo "ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEArDp4cun2lhr4KUhBGE7VvAcwdli2a8dbnrTOrbMz1+5O73fcBOx8NVbUT0bUanUV9tJ2/9p7+vD0EpZ3Tz/+0kX34uAx1RV/75GVOmNx+9EuWOnvNoaJe0QXxziIg9eLBHpgLMuakb5+BgTFB+rKJAw9u9FSTDengvS8hX1kNFS4Mjux0hJOK8rvcEmPecjdySYMb66nylAKGwCEE6WEQHmd1mUPgHwGQ0hWCwsQk13yCGPK5w6hYp5zYkFnvlC8hGmd4Ww+u97k6pfTGTUbJk14ujvcD9iUKQTTWYYjIIu5PmUux5bsZ0R4WFwdIe6+i6rBLAsPKgAySVKPRK+oRw== mdrfckr">>.ssh/authorized_keys && chmod -R go= ~/.ssh && cd ~', 'enable', 'system', 'system', 'shell', 'shell', 'sh', 'cat /proc/mounts; /bin/busybox PYIHO', 'cd /dev/shm; cat .s || cp /bin/echo .s; /bin/busybox PYIHO', 'tftp; wget; /bin/busybox PYIHO', 'dd bs=52 count=1 if=.s || cat .s || while read i; do echo $i; done < .s', 'while read i', '/bin/busybox PYIHO', 'rm .s; exit', 'enable', 'system', 'system', 'shell', 'shell', 'sh', 'cat /proc/mounts; /bin/busybox GYYXE', 'cd /dev/shm; cat .s || cp /bin/echo .s; /bin/busybox GYYXE', 'tftp; wget; /bin/busybox GYYXE', 'dd bs=52 count=1 if=.s || cat .s || while read i; do echo $i; done < .s', 'while read i', '/bin/busybox GYYXE', 'rm .s; exit']
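If the log really does hold one JSON object per line, as the replace trick above implies, each line can also be parsed on its own without first building an array. The following is only a sketch under that assumption. The function name extract_inputs and the shortened sample events are made up for illustration and do not come from the original post.

```python
import json

def extract_inputs(lines):
    """Return the "input" value of every JSON event that has one."""
    values = []
    for line in lines:
        line = line.strip()
        if not line:               # skip blank lines
            continue
        event = json.loads(line)   # assumes exactly one JSON object per line
        if "input" in event:
            values.append(event["input"])
    return values

# Shortened, illustrative events in the same shape as the question's data.
sample = [
    '{"eventid":"cowrie.command.input","input":"uname -a","session":"1ce15808ec97"}',
    '{"eventid":"cowrie.session.closed","session":"1ce15808ec97"}',
    r'{"eventid":"cowrie.command.input","input":"echo \"root:twrHxXE7YmIr\"|chpasswd|bash","session":"1ce15808ec97"}',
]
print(extract_inputs(sample))
```

An open file object can be passed directly, since iterating over it yields lines. Because json.loads handles the escaping, the \" sequences come back as plain double quotes.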
 \"input\":\"([^\,\}\"]|\\\")*\"[,\{]
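For a regex-only approach, it helps to see why the pattern in the question stops early: its character class does not contain the double quote, so the match ends at the first " it meets, even when that quote is escaped as \". The pattern directly above allows \" but excludes , and }, so values containing those characters (such as the awk commands in the output) would likely not match. A common idiom is to accept either a character that is neither a quote nor a backslash, or a backslash followed by any character. The sketch below uses that idiom. The names INPUT_RE and extract_input are hypothetical, and it assumes "input" is written exactly as "input":" with no spaces around the colon.

```python
import json
import re

# "input":" followed by any run of (non-quote, non-backslash) characters
# or backslash escapes, up to the closing unescaped quote.
INPUT_RE = re.compile(r'"input":"((?:[^"\\]|\\.)*)"')

def extract_input(line):
    """Return the decoded "input" value from one log line, or None."""
    match = INPUT_RE.search(line)
    if match is None:
        return None
    # The captured text is still JSON-escaped; decode it as a JSON string.
    return json.loads('"' + match.group(1) + '"')

line = r'{"eventid":"cowrie.command.input","input":"echo \"root:twrHxXE7YmIr\"|chpasswd|bash","message":"CMD: echo \"root:twrHxXE7YmIr\"|chpasswd|bash","sensor":"cowrieHoneypot2"}'
print(extract_input(line))  # echo "root:twrHxXE7YmIr"|chpasswd|bash
```

Parsing with the json module remains the more robust choice, since a regex like this depends on the exact key formatting of the log.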


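Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.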
