Multiple Occurrences of same String to List

I have a subprocess command

temp = subprocess.check_output(cmd)

This returns a single string

The string will contain data, for every record which has a license

For instance, if there is currently only one record with a license, the string will look like:

b'Setting license file path to 5053@100.113.111.61\r\n\r\n\t------------------------\r\n\r\n\tredgiant license usage status on 100.113.248.61 (port 55952)\r\n\r\n\tmagicbulletlooks v999.9: administrator@nynle650 1/0 at 02/29 09:51  (handle: 62)\r\n\r\n'

If there are two records with a license, it will look like:

b'Setting license file path to 5053@100.113.111.61\r\n\r\n\t------------------------\r\n\r\n\tredgiant license usage status on 100.113.111.61 (port 55952)\r\n\r\n\tmagicbulletlooks v999.9: administrator@nynle650 1/0 at 02/29 11:42  (handle: 68)\r\n\tmagicbulletlooks v999.9: administrator@nynle647 1/0 at 02/29 11:46  (handle: 8d)\r\n\r\n'

and so on and so forth as the number grows.

I am trying to extract the magicbulletlooks v999.9: administrator@nynle647 1/0 at 02/29 11:46 portion, as many times as it occurs into a list.

For each occurrence, there should be one item in the list.

Currently I am using

def do_work():
    regex= re.compile("magicbulletlooks(.*\))")
    t = subprocess.check_output(my_cmd)
    return re.findall(regex,str(t))

However, this returns a list with only one value, which contains the complete string from beginning to end instead of the individual occurrences.

Basically, the goal is to use a regex to create a list like:

['magicbulletlooks v999.9: administrator@nynle647 1/0 at 02/29 11:46',
  'magicbulletlooks v999.9: administrator@nynle650 1/0 at 02/29 11:42'
]

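As a fix to your pattern, here is what I came up with. The lazy .*? stops at the first HH:MM timestamp after the product name, so the trailing (handle: ...) part is left out (s is the decoded output text):

>>> re.findall(r'magicbulletlooks.*?\d{2}:\d{2}', s)
['magicbulletlooks v999.9: administrator@nynle650 1/0 at 02/29 11:42', 'magicbulletlooks v999.9: administrator@nynle647 1/0 at 02/29 11:46']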

Now in your function:

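import re
import subprocess

def do_work():
    pat = re.compile(r'magicbulletlooks.*?\d{2}:\d{2}')
    t = subprocess.check_output(my_cmd)  # returns bytes in Python 3
    # Decode rather than str(t): str() on bytes gives the repr, not the text
    return pat.findall(t.decode('utf-8'))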

EDIT: Quoting from the docs:

subprocess.check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False, timeout=None)

Run command with arguments and return its output.

...

By default, this function will return the data as encoded bytes. The actual encoding of the output data may depend on the command being invoked, so the decoding to text will often need to be handled at the application level.

This behaviour may be overridden by setting universal_newlines to True as described above in Frequently Used Arguments.
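
To see why this matters for the pattern in the question, here is a small sketch. The raw value is a shortened, made-up sample of the output shown in the question, and the variable names are only for illustration. Calling str() on bytes gives the repr, where the line breaks are literal backslash sequences, so a greedy .* runs across every record; decoding gives real line breaks, which . does not cross.

```python
import re

# Shortened sample of the command output (assumed, for illustration only).
raw = (
    b"\tmagicbulletlooks v999.9: administrator@nynle650 1/0 at 02/29 11:42  (handle: 68)\r\n"
    b"\tmagicbulletlooks v999.9: administrator@nynle647 1/0 at 02/29 11:46  (handle: 8d)\r\n"
)

as_repr = str(raw)             # the repr "b'...'", with no real newline characters
as_text = raw.decode("utf-8")  # actual text with real line breaks

greedy = re.compile(r"magicbulletlooks(.*\))")  # pattern from the question

repr_matches = greedy.findall(as_repr)  # one match spanning both records
text_matches = greedy.findall(as_text)  # one match per line

print(len(repr_matches), len(text_matches))
```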

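So you have a couple of options here:

1 - Set the universal_newlines flag to True, so that check_output returns a str instead of bytes:

def do_work():
    # universal_newlines=True makes check_output return text, not bytes
    t = subprocess.check_output(my_cmd, universal_newlines=True)
    pat = re.compile(r'magicbulletlooks.*?\d{2}:\d{2}')
    return pat.findall(t)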

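2 - Specify the pattern as a bytes object, so it can be applied to the bytes output directly (the matches are then bytes as well):

def do_work():
    # Bytes pattern: findall returns a list of bytes objects
    pat = re.compile(rb'magicbulletlooks.*?\d{2}:\d{2}')
    t = subprocess.check_output(my_cmd)
    return pat.findall(t)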

3 - Decode the bytes object into a string, using utf-8 (preferable when dealing with Python 3) or ascii:

>>> t = subprocess.check_output(my_cmd)
>>>
>>> re.findall(r'magicbulletlooks.*?\d{2}:\d{2}', t.decode('utf-8'))
['magicbulletlooks v999.9: administrator@nynle650 1/0 at 02/29 11:42', 'magicbulletlooks v999.9: administrator@nynle647 1/0 at 02/29 11:46']
>>>
>>> re.findall(r'magicbulletlooks.*?\d{2}:\d{2}', t.decode('ascii'))
['magicbulletlooks v999.9: administrator@nynle650 1/0 at 02/29 11:42', 'magicbulletlooks v999.9: administrator@nynle647 1/0 at 02/29 11:46']
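
Putting the decoding option together, a minimal self-contained sketch might look like the following. The names RECORD, parse_records and sample are hypothetical, and the pattern assumes every record contains an HH:MM timestamp as in the output shown in the question. Splitting the parsing from the subprocess call lets you test it against a captured sample without running the license tool.

```python
import re
import subprocess

# Lazy match from the product name up to the first HH:MM timestamp
# (assumes each record has a timestamp, as in the sample output).
RECORD = re.compile(r"magicbulletlooks.*?\d{2}:\d{2}")


def parse_records(output):
    """Return one string per license record found in the raw bytes output."""
    return RECORD.findall(output.decode("utf-8", errors="replace"))


def do_work(cmd):
    # cmd is whatever command you already pass to check_output
    return parse_records(subprocess.check_output(cmd))


# Two-record output copied from the question, used here as test data.
sample = (
    b"Setting license file path to 5053@100.113.111.61\r\n\r\n"
    b"\t------------------------\r\n\r\n"
    b"\tredgiant license usage status on 100.113.111.61 (port 55952)\r\n\r\n"
    b"\tmagicbulletlooks v999.9: administrator@nynle650 1/0 at 02/29 11:42  (handle: 68)\r\n"
    b"\tmagicbulletlooks v999.9: administrator@nynle647 1/0 at 02/29 11:46  (handle: 8d)\r\n\r\n"
)

records = parse_records(sample)
print(records)
```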
