Python str.contains from two or more dictionaries

I want to check if a string contains one or more values from two dictionaries.

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

for c, i in zip(company, stock_index):
    df.loc[df.name.str.contains(c, i), "instrumentclass"] = "Equity"

For some reason, it only writes "Equity" for the first match in the dictionaries, i.e. "AXP": "American Express". For "Baidu" and "Google", nothing happens.

I know that I can combine the dictionaries into one, as seen below, but I would prefer not to.

benchmarks = company.copy()
benchmarks.update(stock_index)

The data is written and retrieved with the help of a pandas DataFrame.

import pandas as pd
df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"], columns=["name"])

The code copies the column name to the column instrumentclass and is then supposed to replace each cell with "Equity" if it contains "AXP", "BIDU" or "GOOG".
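
A likely reason only "AXP" is matched, sketched here rather than confirmed by the original poster: zip pairs the keys of the two dictionaries and stops at the shorter one, so the loop body runs once with c = "AXP" and i = "GOOG". Assuming a pandas version where the second positional parameter of Series.str.contains is case, i is then read as the case flag and not as a second pattern. The variable names pairs and ticker and the use of itertools.chain below are illustrative choices; chaining the keys keeps the two dictionaries separate.

```python
import itertools

import pandas as pd

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}
df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])

# zip stops at the shorter input, so only one pair of keys is produced
pairs = list(zip(company, stock_index))

# iterate over the keys of both dictionaries without merging them
for ticker in itertools.chain(company, stock_index):
    # regex=False treats the ticker as a literal substring
    mask = df["name"].str.contains(ticker, regex=False)
    df.loc[mask, "instrumentclass"] = "Equity"
```

Plain substring matching would also flag a name that merely contains a ticker inside a longer word, which is one reason to split the name first.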

Why don't you start by breaking down this data, like this:

df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])

# split on spaces and get the last part
df["company_name"] = df.name.str.split().str.get(-1)

>>> print(df)
        name company_name
0   LONG AXP          AXP
1  SHORT AXP          AXP
2  LONG BIDU         BIDU
3  LONG GOOG         GOOG

Now, it's much easier to work with these strings. Given that this is a sample of your dictionaries:

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

You can exploit "dictionary views", which behave like sets in Python:

# this is Python 2; in Python 3, use .keys() instead, which already returns a view
all_companies = company.viewkeys() | stock_index.viewkeys()

>>> print(all_companies)
{'AXP', 'BIDU', 'GOOG'}

So now we have a set-like object that we can use to filter the data and set "Equity":

df.loc[df.company_name.isin(all_companies), "instrumentclass"] = "Equity"
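
For readers on Python 3, the steps above might be put together roughly as follows. This is a minimal sketch using the sample data from the question; it assumes that the ticker is always the last whitespace-separated token of name, as in the sample rows.

```python
import pandas as pd

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])

# split on whitespace and keep the last token (assumed to be the ticker)
df["company_name"] = df["name"].str.split().str.get(-1)

# in Python 3, .keys() returns a set-like view, so | gives the union as a set
all_companies = company.keys() | stock_index.keys()

# rows whose ticker appears in either dictionary are marked as "Equity"
df.loc[df["company_name"].isin(all_companies), "instrumentclass"] = "Equity"
```

Neither dictionary is modified or copied here; only the set of keys is built.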

If you would rather not merge the dictionaries like that, you might want to consider using something like a ChainMap: https://docs.python.org/3/library/collections.html#collections.ChainMap That's in the Python 3 standard library, but backports to Python 2 should exist.
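
A ChainMap could be used along these lines. This is only a sketch under the same assumptions as the sample data; the names benchmarks and full_name are illustrative (benchmarks is borrowed from the question, full_name is a hypothetical extra column).

```python
from collections import ChainMap

import pandas as pd

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

# a single mapping-like view over both dictionaries; nothing is copied or merged
benchmarks = ChainMap(company, stock_index)

df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])
df["company_name"] = df["name"].str.split().str.get(-1)

# iterating a ChainMap yields the keys of all underlying dictionaries
df.loc[df["company_name"].isin(list(benchmarks)), "instrumentclass"] = "Equity"

# lookups fall through from company to stock_index (hypothetical extra column)
df["full_name"] = df["company_name"].map(benchmarks.get)
```

Because the ChainMap only references the original dictionaries, later changes to company or stock_index are visible through benchmarks as well.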
