
Python - How to check that all items of one list are in a string and no items of another list are

I need to check that all the items of the list "present" are in the string "line" and that none of the items of the list "absent" are in the string "line".

So, having the 2 lists

present = ['SYN', 'ACK']
absent = ['RST', 'FIN']

And a file with all the TCP flags from here: https://github.com/robcowart/elastiflow/blob/master/logstash/elastiflow/dictionaries/tcp_flags.yml

"...
"12": RST-PSH
"13": FIN-RST-PSH
"14": SYN-RST-PSH
"15": FIN-SYN-RST-PSH
"16": ACK
"17": FIN-ACK
"18": SYN-ACK
"19": FIN-SYN-ACK
"20": RST-ACK
"21": FIN-RST-ACK
"22": SYN-RST-ACK
"23": FIN-SYN-RST-ACK
..."

I will read the file line by line. If all the elements of "present" exist in the line and none of the elements of "absent" exist in the line, then print the line.

How should I do it? I imagine recursion or a comprehension, but I cannot find a way. Thanks.

for line in csv_reader:
    # parse the line and store the flags into a list
    # flags = line.split...

    # the logic to check for present and absent
    is_present = all(elem in flags for elem in present)
    is_absent = not any(elem in flags for elem in absent)
    if is_present and is_absent:
        print(line)
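
The loop above leaves the parsing step open. As a minimal sketch, assuming each line looks like `"18": SYN-ACK` as in the question's excerpt, the flags can be split into a set and the two conditions expressed as set operations. The names `parse_flags`, `matches` and `sample_lines` are made up for this illustration and do not come from the original answer.

```python
present = {"SYN", "ACK"}
absent = {"RST", "FIN"}

# A few lines copied from the excerpt in the question.
sample_lines = [
    '"16": ACK',
    '"17": FIN-ACK',
    '"18": SYN-ACK',
    '"19": FIN-SYN-ACK',
    '"22": SYN-RST-ACK',
]

def parse_flags(line):
    # Assumption: the flags follow the first colon and are joined by "-".
    _, _, value = line.partition(":")
    return set(value.strip().split("-"))

def matches(flags, present, absent):
    # Every required flag is there, and no forbidden flag is.
    return present <= flags and flags.isdisjoint(absent)

matching = [line for line in sample_lines
            if matches(parse_flags(line), present, absent)]
print(matching)
```

Comparing whole flag names rather than substrings also avoids accidental matches if one flag name ever happens to be contained in another.
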
line = "abcee"
present = ['abc', 'cde', 'fgh']
absent = ['bla', 'ghj']

def AllInLine():
    # True only if every item of present occurs in line
    for i in present:
        if i not in line:
            return False
    return True

def NoneInLine():
    # True only if no item of absent occurs in line
    for i in absent:
        if i in line:
            return False
    return True

If both functions return True, you can print the line.
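
The same two checks can be folded into one helper that takes the line and both lists as arguments, so it can be called for every line of the file. This is only a sketch: `line_matches` is a hypothetical name, and the sample lines are taken from the excerpt in the question. Note that `in` on a string is a substring test.

```python
def line_matches(line, present, absent):
    # All items of present must occur in the line, none of absent may.
    has_all = all(item in line for item in present)
    has_none = not any(item in line for item in absent)
    return has_all and has_none

present = ['SYN', 'ACK']
absent = ['RST', 'FIN']
lines = ['"17": FIN-ACK', '"18": SYN-ACK', '"20": RST-ACK']

printed = []
for line in lines:
    if line_matches(line, present, absent):
        printed.append(line)
        print(line)
```
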

Here is something you can try. It downloads the raw YAML file from GitHub using requests, parses the YAML using PyYAML, then uses all() to check each item from present and absent, and prints the lines that contain every present item and none of the absent items.

The YAML file is also downloaded in chunks, in case it gets big. This is good practice when downloading files over HTTP anyway.

Demo:

from pathlib import Path
from requests import get
from yaml import safe_load, YAMLError

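def download_file(url, chunk_size=1024):
    with get(url, stream=True) as req:
        # Stop early on HTTP errors instead of saving an error page
        req.raise_for_status()
        filename = Path(url).name
        with open(filename, mode="wb") as f:
            for chunk in req.iter_content(chunk_size=chunk_size):
                if chunk:
                    f.write(chunk)
    # Return the local path so the caller can open the saved file
    return filename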

def parse_yaml_file(path):
    with open(path) as f:
        try:
            yaml_file = safe_load(f)
            return yaml_file
        except YAMLError as ex:
            print(ex)

if __name__ == "__main__":
    present = ['SYN', 'ACK']
    absent = ['RST', 'FIN']

    yaml_file = download_file("https://raw.githubusercontent.com/robcowart/elastiflow/master/logstash/elastiflow/dictionaries/tcp_flags.yml")

    data = parse_yaml_file(yaml_file)

    for number, line in data.items():
        if all(p in line for p in present) and all(a not in line for a in absent):
            print(f"{number}: {line}")

Output:

18: SYN-ACK
26: SYN-PSH-ACK
50: SYN-ACK-URG
58: SYN-PSH-ACK-URG
82: SYN-ACK-ECE
90: SYN-PSH-ACK-ECE
114: SYN-ACK-URG-ECE
122: SYN-PSH-ACK-URG-ECE
146: SYN-ACK-CWR
154: SYN-PSH-ACK-CWR
178: SYN-ACK-URG-CWR
186: SYN-PSH-ACK-URG-CWR
210: SYN-ACK-ECE-CWR
218: SYN-PSH-ACK-ECE-CWR
242: SYN-ACK-URG-ECE-CWR
250: SYN-PSH-ACK-URG-ECE-CWR

Note: The above will probably need more error checking for your scenario, but it shows the general idea.
