Python str.contains from two or more dictionaries
I want to check if a string contains one or more values from two dictionaries.
company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}
for c, i in zip(company, stock_index):
    df.loc[df.name.str.contains(c, i), "instrumentclass"] = "Equity"
For some reason, it only writes "Equity" for the first match in the dictionaries, i.e. "AXP": "American Express". For "Baidu" and "Google", nothing happens.
I know that I can combine the dictionaries into one, as shown below, but I would prefer not to.
benchmarks = company.copy()
benchmarks.update(stock_index)
The data is written and retrieved with the help of a pandas DataFrame.
import pandas as pd
df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"], columns=["name"])
The code copies the column name to the column instrumentclass and, in doing so, is supposed to set each cell to "Equity" if it contains "AXP", "BIDU" or "GOOG".
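A note on why the loop in the question stops after "AXP": iterating over a dict yields its keys, and zip() stops as soon as its shortest input runs out, so with a one-entry stock_index the loop body runs exactly once. The small sketch below uses only the two dictionaries from the question to show the pairs the loop actually sees. In addition, as far as I can tell from the pandas signature, the second positional argument of str.contains is `case`, not a second pattern, so "GOOG" would never be searched for even in that single iteration.

```python
company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

# Iterating a dict yields its keys; zip() stops at the shortest input,
# so only one (c, i) pair is ever produced.
pairs = list(zip(company, stock_index))
print(pairs)

# Only the first element of each pair is used as the search pattern,
# so "BIDU" and "GOOG" are never looked up.
patterns_searched = [c for c, _ in pairs]
print(patterns_searched)
```

This relies on dictionaries preserving insertion order (Python 3.7 and later); on older interpreters the single pair could contain a different key of company.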
Why don't you start by breaking down this data, like this:
df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])
# split on spaces and get the last part
df["company_name"] = df.name.str.split().str.get(-1)
>>> print df
        name company_name
0   LONG AXP          AXP
1  SHORT AXP          AXP
2  LONG BIDU         BIDU
3  LONG GOOG         GOOG
Now it's much easier to work with these strings. Given this sample of your dictionaries:
company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}
You can exploit "dictionary views", which behave like sets in Python:
# this is Python 2; in Python 3, .keys() already returns a view, so use .keys() instead of .viewkeys()
all_companies = company.viewkeys() | stock_index.viewkeys()
>>> print all_companies
{'AXP', 'BIDU', 'GOOG'}
So now we have a set-like object that we can use to filter the data and set "Equity":
df.loc[df.company_name.isin(all_companies), "instrumentclass"] = "Equity"
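For readers on Python 3, the steps above can be put together roughly as follows. This is only a sketch under the same assumption the answer makes, namely that the ticker is always the last whitespace-separated token of name; it swaps viewkeys() for keys(), which returns a set-like view in Python 3.

```python
import pandas as pd

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

df = pd.DataFrame(["LONG AXP", "SHORT AXP", "LONG BIDU", "LONG GOOG"],
                  columns=["name"])

# Assumption: the ticker is the last whitespace-separated token of "name".
df["company_name"] = df["name"].str.split().str.get(-1)

# In Python 3, dict.keys() is set-like, so "|" gives the union of the keys
# without merging the dictionaries themselves.
all_companies = company.keys() | stock_index.keys()

# Flag every row whose ticker appears in either dictionary.
df.loc[df["company_name"].isin(all_companies), "instrumentclass"] = "Equity"
print(df)
```

Rows whose ticker is in neither dictionary would be left as NaN in instrumentclass, since the assignment only touches the rows selected by the mask.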
If you would rather not merge these dictionaries like that, you might want to consider using something like a ChainMap (https://docs.python.org/3/library/collections.html#collections.ChainMap). It is part of the Python 3 standard library, but backports to Python 2 should exist.
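A minimal sketch of the ChainMap idea, using the two dictionaries from the question. The name benchmarks is borrowed from the question's merged copy. Here it is only a view over both dictionaries, so nothing is copied and neither dictionary is modified.

```python
from collections import ChainMap

company = {"AXP": "American Express", "BIDU": "Baidu"}
stock_index = {"GOOG": "Google"}

# ChainMap looks keys up in each mapping in turn, without merging them.
benchmarks = ChainMap(company, stock_index)

has_goog = "GOOG" in benchmarks   # found via stock_index
bidu_name = benchmarks["BIDU"]    # found via company
tickers = sorted(benchmarks)      # keys from both dictionaries

print(has_goog, bidu_name, tickers)
```

The combined keys could then be passed to the filter shown earlier, for example as df.company_name.isin(list(benchmarks)).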