
Python find string in list

I have a Salt module which returns the following output (I'm storing all of these lines in a list to make iteration easier):

    - 2015-05-21 19:25:08,060 [main] WARN  [::::::] c.p.core.FilteringPropertyPlaceholderConfigurer - Could not load properties from class path resource [proferi-component-test.properties]: class path resource [proferi-component-test.properties] cannot be opened because it does not exist
    - 2015-05-21 19:25:08,064 [main] WARN  [::::::] c.p.core.FilteringPropertyPlaceholderConfigurer - Could not load properties from class path resource [qe-prop-not-specified]: class path resource [qe-prop-not-specified] cannot be opened because it does not exist
    - 2015-05-21 19:25:13,290 [main] INFO  [::::::] c.p.a.m.persistence.modular.ModelSessionManager - Setup SessionManager modelSessionFactory
    - 2015-05-21 19:25:14,327 [main] INFO  [::::::] c.p.a.model.persistence.BlueprintsGraphReadSession - Loading model graph for application M-00000304-0000-0001-0000-000000000000 with version MV-0000000000002714-0000000000002695-true
    - 2015-05-21 19:25:14,658 [main] INFO  [::::::] c.p.a.m.p.hydration.AppModelGraphHydrator - AppModelGraph Hydration stats for app M-00000304-0000-0001-0000-000000000000 - total time:322ms | sql time:20ms | jackson mapping:32ms | vertex adding:6ms | core building:63ms | core population:15ms | proxying:84ms | invocation handler creation:80ms | interface list building:10ms | moving through result set:4ms | items processed:156
    - 2015-05-21 19:25:14,860 [main] INFO  [-:sales02:Session:SetPasswd:-:-:-] c.l.n.cluster.zookeeper.ZooKeeperClusterClient - Starting ClusterClient...
    - 2015-05-21 19:25:14,914 [main] INFO  [-:sales02:Session:SetPasswd:-:-:-] c.l.n.cluster.zookeeper.ZooKeeperClusterClient - Cluster started
    - 2015-05-21 19:25:14,915 [main] INFO  [-:sales02:Session:SetPasswd:-:-:-] com.proferi.core.NorbertProtobufServiceClient - Constructing NettyNetworkClient with close channel time -1 ms, max cnxns per node 10, stale request timeout 20 minutes, stale request purge frequency 2 minutes
    - 2015-05-21 19:25:14,961 [Thread-8] INFO  [-:sales02:Session:SetPasswd:-:-:-] c.l.n.c.zookeeper.ZooKeeperClusterManagerComponent - Connected to ZooKeeper
    - 2015-05-21 19:25:14,987 [Thread-8] INFO  [-:sales02:Session:SetPasswd:-:-:-] c.l.n.c.zookeeper.ZooKeeperClusterManagerComponent - Handling a Connected message
    - 2015-05-21 19:25:15,245 [main] INFO  [-:sales02:Session:SetPasswd:-:-:-] com.company.platform.cli.SetPassword - Password for email address john@tech.com for tenant sales02 was set
    - 2015-05-21 19:25:15,254 [main] INFO  [-:-:-:-:-:-:-] c.l.norbert.network.netty.NettyNetworkClient - Shutting down NetworkClient
    - 2015-05-21 19:25:15,273 [main] INFO  [-:-:-:-:-:-:-] c.l.norbert.network.netty.NettyNetworkClient - NetworkClient shut down
    - 2015-05-21 19:25:15,281 [main] INFO  [-:-:-:-:-:-:-] c.l.n.cluster.zookeeper.ZooKeeperClusterClient - Cluster shut down

From this output I want to check whether the block contains the string

- 2015-05-21 19:05:18,108 [main] INFO  [-:sales02:Session:SetPasswd:-:-:-] com.company.platform.cli.SetPassword - Password for email address john@tech.com for tenant sales02 was set

What is the best way to do this?

I tried using a for loop -

for i in l:
    if s == i: # where s is the string above
        return True
    else:
        return False

but this didn't work (it returned False). I tried using a regex, but the string is too complex for me to come up with a pattern. Any help will be greatly appreciated.

Since this is a list I know I can use indexing to get the desired string, but I don't want the string itself; I want to check whether it's there and then return a custom string to the user.

An easier method is to use something like:

if s in l:
    return True

There's no need for a for loop for a simple check like that: the in operator iterates over the list and evaluates to True as soon as it hits a match, at which point you can return whatever you like. It's also a way to build some robustness into CLI-level menus in Python scripts.
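A minimal sketch of why the loop in the question fails while the membership test works. The shortened sample lines and the function names `first_item_only` and `contains_line` are illustrative assumptions, not part of the original module. The loop returns inside both branches, so it only ever inspects the first element.

```python
def first_item_only(lines, target):
    # Mirrors the loop from the question: both branches return,
    # so the function exits after comparing the first element only.
    for item in lines:
        if target == item:
            return True
        else:
            return False


def contains_line(lines, target):
    # `in` compares target against every element with ==.
    return target in lines


# Hypothetical, shortened log lines for illustration.
lines = [
    "- 19:25:14,987 [Thread-8] INFO Handling a Connected message",
    "- 19:25:15,245 [main] INFO Password for email address john@tech.com for tenant sales02 was set",
]
target = lines[1]

print(first_item_only(lines, target))  # False: only lines[0] was compared
print(contains_line(lines, target))    # True
```

Note that `in` still requires an exact match. The string in the question carries the timestamp 19:05:18,108, while the matching line in the output is stamped 19:25:15,245, so an exact comparison of those two strings will not succeed.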

Tip: if you run many such checks against the same strings, you might want to convert the list into a set. Membership tests on a set are hash lookups rather than linear scans, so they run faster.
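As a rough sketch of the set idea, assuming the same lines are checked many times (the names `lines` and `seen` and the placeholder strings are illustrative): building the set costs one pass over the list, after which each lookup no longer scans every element. For a single check the conversion costs about as much as the scan itself, so there is little to gain.

```python
lines = ["alpha line", "beta line", "gamma line"]  # placeholder log lines

# Build the set once; duplicates collapse and the original order is lost.
seen = set(lines)

print("beta line" in seen)   # True
print("delta line" in seen)  # False
```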

Edit: an even better approach:

return s in l

If you're dead set on using a for loop:

for entry in l:
    if s in entry:  # substring test instead of ==
        return True
return False

This is equivalent to return any(s in entry for entry in l). Although this is probably a less elegant solution, testing s in entry gives you somewhat softer matching than the exact membership check above. It is, however, risky: if the list contains a string that includes your search string plus additional text, it will match as well.
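For instance, matching only on the message text sidesteps the differing timestamps. This is a sketch under assumptions: `has_message` and the shortened sample lines are made up for illustration.

```python
def has_message(lines, fragment):
    # True if fragment occurs as a substring of at least one line.
    return any(fragment in line for line in lines)


# Shortened sample lines, loosely based on the output in the question.
lines = [
    "- 2015-05-21 19:25:14,914 [main] INFO ZooKeeperClusterClient - Cluster started",
    "- 2015-05-21 19:25:15,245 [main] INFO SetPassword - Password for email address john@tech.com for tenant sales02 was set",
    "- 2015-05-21 19:25:15,281 [main] INFO ZooKeeperClusterClient - Cluster shut down",
]

fragment = "Password for email address john@tech.com for tenant sales02 was set"
print(has_message(lines, fragment))   # True, regardless of the timestamp

# The risk described above: a short fragment matches more than intended.
print(has_message(lines, "Cluster"))  # True (two different lines contain it)
```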

If you're looking to truly take an approach with regular expressions, you may try the following. It may not be as efficient as simply iterating through a list, but it may give you an idea of how to implement it.

While literal matching of the regular expression is complicated (and messy), it allows you to grab each component of the log through match groups. In a real environment, you'd want to look at something like Logstash which utilizes Grok filters (learn them, they're quite fun!).

Note: The following regular expressions are not 100% accurate, and may need to be altered given more data, but you get the idea.

Literal matching of the line you want would look like:

-\s(\d{4}-\d{1,2}-\d{1,2})\s(\d{1,2}:\d{1,2}:\d{1,2},\d{1,3})\s(\[\w+\])\s(\w+)\s+(\[.*\])\s([a-zA-Z.]+)\s-\s(.*)

To make only the message a capturing group (every other group is non-capturing):

(?:.*)\s(?:(?:[a-zA-Z_$][a-zA-Z\d_$]*\.)*(?:[a-zA-Z_$][a-zA-Z\d_$]+)\s-\s)(.*)

For this example I'll use a single string rather than a list, just to demonstrate:

import re

logLine = "- 2015-05-21 19:05:18,108 [main] INFO  [-:sales02:Session:SetPasswd:-:-:-] com.company.platform.cli.SetPassword - Password for email address john@tech.com for tenant sales02 was set"
# Raw string so the backslashes reach the regex engine unchanged
rx = r"-\s(\d{4}-\d{1,2}-\d{1,2})\s(\d{1,2}:\d{1,2}:\d{1,2},\d{1,3})\s(\[\w+\])\s(\w+)\s+(\[.*\])\s([a-zA-Z.]+)\s-\s(.*)"

reMatch = re.match(rx, logLine)

if reMatch:
    print(reMatch.group(1))
    # Groups 1-7

You can iterate over the match groups, and you'll see that a log line like that will give you:

  1. 2015-05-21
  2. 19:05:18,108
  3. [main]
  4. INFO
  5. [-:sales02:Session:SetPasswd:-:-:-]
  6. com.company.platform.cli.SetPassword
  7. Password for email address john@tech.com for tenant sales02 was set
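Iterating over the groups might look like the following sketch. The pattern and the sample line are the ones from this answer; the `FIELDS` labels are an illustrative naming of the seven groups, not something the original defines.

```python
import re

# Same pattern as above, split across two raw strings for readability.
RX = re.compile(
    r"-\s(\d{4}-\d{1,2}-\d{1,2})\s(\d{1,2}:\d{1,2}:\d{1,2},\d{1,3})\s"
    r"(\[\w+\])\s(\w+)\s+(\[.*\])\s([a-zA-Z.]+)\s-\s(.*)"
)
# Hypothetical labels for groups 1-7.
FIELDS = ("date", "time", "thread", "level", "context", "logger", "message")

log_line = (
    "- 2015-05-21 19:05:18,108 [main] INFO  "
    "[-:sales02:Session:SetPasswd:-:-:-] "
    "com.company.platform.cli.SetPassword - "
    "Password for email address john@tech.com for tenant sales02 was set"
)

match = RX.match(log_line)
# Pair each label with its captured value; empty dict if the line did not match.
parsed = dict(zip(FIELDS, match.groups())) if match else {}

for name, value in parsed.items():
    print(f"{name}: {value}")
```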

So then you may check match group 7 to see whether the message contains the string you want, or simply switch to the second regular expression so that there is only one group to check:

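# rx2 is the shortened pattern; only the message is captured
rx2 = r"(?:.*)\s(?:(?:[a-zA-Z_$][a-zA-Z\d_$]*\.)*(?:[a-zA-Z_$][a-zA-Z\d_$]+)\s-\s)(.*)"
reMatch2 = re.match(rx2, logLine)

if reMatch2:
    print(reMatch2.group(1))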

You'll get a similar result using the shortened regular expression. Either way, the regex approach may let you do more with your logs in the future than simply determining whether a line (or something in a line) exists, should you ever need to.
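Applied to a list rather than a single string, the group-7 check could be sketched as follows. The function name `message_logged` and the user-facing strings are assumptions for illustration; the pattern is the long one from this answer and the sample lines are taken from the question's output.

```python
import re

RX = re.compile(
    r"-\s(\d{4}-\d{1,2}-\d{1,2})\s(\d{1,2}:\d{1,2}:\d{1,2},\d{1,3})\s"
    r"(\[\w+\])\s(\w+)\s+(\[.*\])\s([a-zA-Z.]+)\s-\s(.*)"
)


def message_logged(lines, wanted):
    # Compare only group 7 (the message), so timestamps are ignored.
    for line in lines:
        match = RX.match(line)
        if match and match.group(7) == wanted:
            return True
    return False


lines = [
    "- 2015-05-21 19:25:14,987 [Thread-8] INFO  [-:sales02:Session:SetPasswd:-:-:-] c.l.n.c.zookeeper.ZooKeeperClusterManagerComponent - Handling a Connected message",
    "- 2015-05-21 19:25:15,245 [main] INFO  [-:sales02:Session:SetPasswd:-:-:-] com.company.platform.cli.SetPassword - Password for email address john@tech.com for tenant sales02 was set",
    "- 2015-05-21 19:25:15,254 [main] INFO  [-:-:-:-:-:-:-] c.l.norbert.network.netty.NettyNetworkClient - Shutting down NetworkClient",
]
wanted = "Password for email address john@tech.com for tenant sales02 was set"

# Return a custom string to the user instead of the log line itself.
if message_logged(lines, wanted):
    print("Password was set")
else:
    print("Password change not found in output")
```

Lines the pattern cannot parse are simply skipped. With this pattern that includes the [Thread-8] lines, because \w does not match the hyphen, which is one of the inaccuracies the note above warns about.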


 