Pandas: assigning a column by partial string match gives a "size to array dimension" error
I have a dataframe as such:
  Postcode         Country
0  PR2 6AS  United Kingdom
1  PR2 6AS  United Kingdom
2  CF5 3EG  United Kingdom
3  DG2 9FH  United Kingdom
I create a new column to be assigned based on a partial string match:
mytestdf['In_Preston'] = "FALSE"
mytestdf
  Postcode         Country In_Preston
0  PR2 6AS  United Kingdom      FALSE
1  PR2 6AS  United Kingdom      FALSE
2  CF5 3EG  United Kingdom      FALSE
3  DG2 9FH  United Kingdom      FALSE
I wish to assign the column "In_Preston" by a partial string match on "Postcode". I try the following:
mytestdf.loc[(mytestdf[mytestdf['Postcode'].str.contains("PR2")]), 'In_Preston'] = "TRUE"
But this returns the error "cannot copy sequence with size 3 to array axis with dimension 2".
I look at my code again and believe the issue is that I am selecting a slice of the dataframe from a slice of the dataframe. As such I change to
mytestdf.loc[(mytestdf['Postcode'].str.contains("PR2")]), 'In_Preston'] = "TRUE"
but my interpreter tells me this is incorrect syntax, though I do not see why.
What is the error in my code or my approach?
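You need to remove the inner filter, because .loc expects the boolean mask itself as the row selector, not the DataFrame filtered by that mask. The syntax error in your second attempt comes from the stray ] left after contains("PR2"). The corrected line is: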
mytestdf.loc[mytestdf['Postcode'].str.contains("PR2"), 'In_Preston'] = "TRUE"
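To make the difference concrete, here is a minimal, self-contained sketch that rebuilds the frame from the question and applies the mask directly. The name mask is introduced only for illustration; it does not appear in the original post.

```python
import pandas as pd

# Rebuild the example frame from the question.
mytestdf = pd.DataFrame({
    "Postcode": ["PR2 6AS", "PR2 6AS", "CF5 3EG", "DG2 9FH"],
    "Country": ["United Kingdom"] * 4,
})
mytestdf["In_Preston"] = "FALSE"

# str.contains returns a boolean Series aligned with the frame's index.
# "mask" is an illustrative name, not part of the original code.
mask = mytestdf["Postcode"].str.contains("PR2")

# mytestdf[mask] would be a filtered DataFrame, which is not a valid
# row selector for .loc. Pass the boolean mask itself instead.
mytestdf.loc[mask, "In_Preston"] = "TRUE"
print(mytestdf)
```

Only the rows where the mask is True are updated; the other rows keep the "FALSE" default set beforehand.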
Another solution is to use numpy.where:
import numpy as np
mytestdf['In_Preston'] = np.where(mytestdf['Postcode'].str.contains("PR2"), 'TRUE', 'FALSE')
print(mytestdf)
  Postcode         Country In_Preston
0  PR2 6AS  United Kingdom       TRUE
1  PR2 6AS  United Kingdom       TRUE
2  CF5 3EG  United Kingdom      FALSE
3  DG2 9FH  United Kingdom      FALSE
But if you want to assign boolean True and False values:
mytestdf['In_Preston'] = mytestdf['Postcode'].str.contains("PR2")
print(mytestdf)
  Postcode         Country In_Preston
0  PR2 6AS  United Kingdom       True
1  PR2 6AS  United Kingdom       True
2  CF5 3EG  United Kingdom      False
3  DG2 9FH  United Kingdom      False
EDIT, following a comment by Zero:
If you want to check only the start of Postcode:
mytestdf.Postcode.str.startswith('PR2')
Or add the regex ^ for the start of the string:
mytestdf['Postcode'].str.contains("^PR2")
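As a rough illustration of how the three checks differ, the sketch below runs them on a small Series. The value "XPR2 6AS" is made up purely to show a match that is not at the start of the string, and the variable names are illustrative only.

```python
import pandas as pd

# "XPR2 6AS" is a made-up value, used only to show a match away from the start.
postcodes = pd.Series(["PR2 6AS", "XPR2 6AS", "CF5 3EG"])

# Matches "PR2" anywhere in the string.
anywhere = postcodes.str.contains("PR2")

# Matches only when the string begins with "PR2" (plain string comparison).
at_start = postcodes.str.startswith("PR2")

# Same effect via a regex anchored at the start of the string.
anchored = postcodes.str.contains("^PR2")

print(pd.DataFrame({
    "postcode": postcodes,
    "contains": anywhere,
    "startswith": at_start,
    "contains_anchored": anchored,
}))
```

Here the unanchored contains flags the made-up second value, while startswith and the anchored regex do not.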