python, x in list and x == list[28] deliver different results
I'm trying to find out whether a string is in a list. When I use 'if string in list' I get False, but when I try 'if string == list[28]' I get True.
How come? The string is definitely in the list.
import pandas as pd
import numpy as np
import scipy.stats as stats
import re
nba_df=pd.read_csv("assets/nba.csv")
cities=pd.read_html("assets/wikipedia_data.html")[1]
cities=cities.iloc[:-1,[0,3,5,6,7,8]]
nba_df = nba_df[(nba_df['year'] == 2018)]
nba_df['team'] = nba_df['team'].apply(lambda x: x.split('*')[0])
nba_df['team'] = nba_df['team'].apply(lambda x: x.split('(')[0])
nba_df['team'] = nba_df['team'].str.strip()
cityList = cities['Metropolitan area'].str.strip()
actualCities = []
for idx, city in enumerate(nba_df['team']):
    if city == 'New Orleans Pelicans':
        print('string: ', city.split()[0] + ' ' + city.split()[1])
        print('cityList[28]: ', cityList[28])
        print('is string in list: ', (city.split()[0] + ' ' + city.split()[1]) in cityList)
        print('is string == list[28]: ', (city.split()[0] + ' ' + city.split()[1]) == cityList[28])
output:
string: New Orleans
cityList[28]: New Orleans
is string in list: False
is string == list[28]: True
It looks like your issue is related to membership testing with the in operator, particularly as it applies to pandas "containers" such as DataFrames and Series. Keep in mind that when you say:
how come? the string is definitely in the list.
This is not quite accurate. Your cityList is a Series object, not a list. This creates some quirks we have to work around, since we cannot treat a Series the same as a list. In general, a Series behaves a bit more like a dictionary than a list: the in operator checks its index labels (such as 28), not its values. That is why 'New Orleans' in cityList is False even though cityList[28] == 'New Orleans' is True.
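The dictionary-like behavior can be seen in a small sketch. The three-element Series and the dict below are made up for illustration and are not taken from your data; the point is only that in looks at index labels on a Series, just as it looks at keys on a dict.

```python
import pandas as pd

# Hypothetical three-element Series standing in for cityList
s = pd.Series(["Boston", "Brooklyn", "New Orleans"])

# `in` on a Series tests the index labels (0, 1, 2), not the stored values
print(2 in s)               # 2 is an index label
print("New Orleans" in s)   # "New Orleans" is a value, not a label

# The same pattern with a plain dict: `in` tests keys
d = {0: "Boston", 1: "Brooklyn", 2: "New Orleans"}
print(2 in d)
print("New Orleans" in d)
print("New Orleans" in d.values())  # values must be asked for explicitly
```

The first and third checks succeed because 2 is a label/key; the string checks only succeed once the values are requested explicitly.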
I've created a truncated test example for your code, using the setup here:
import pandas as pd
data = {
"Teams": [ "Boston Celtics", "Brooklyn Nets", "New York Knicks", "Philadelphia 76ers", "Toronto Raptors", "Chicago Bulls", "Cleveland Cavaliers", "Detroit Pistons", "Indiana Pacers", "Milwaukee Bucks", "Atlanta Hawks", "Charlotte Hornets", "Miami Heat", "Orlando Magic", "Washington Wizards", "Denver Nuggets", "Minnesota Timberwolves", "Oklahoma City Thunder", "Portland Trail Blazers", "Utah Jazz", "Golden State Warriors", "Los Angeles Clippers", "Los Angeles Lakers", "Phoenix Suns", "Sacramento Kings", "Houston Rockets", "Memphis Grizzlies", "San Antonio Spurs", "New Orleans Pelicans" ],
"Cities": [ "Boston", "Brooklyn", "New York", "Philadelphia", "Toronto", "Chicago", "Cleveland", "Detroit", "Indiana", "Milwaukee", "Atlanta", "Charlotte", "Miami", "Orlando", "Washington", "Denver", "Minnesota", "Oklahoma City", "Portland", "Utah", "Golden", "Los Angeles", "Los Angeles", "Phoenix", "Sacramento", "Houston", "Memphis", "San Antonio", "New Orleans" ]
}
nba_df = pd.DataFrame(data, columns = ['Teams', 'Cities'])
# doing this to mimic your code of storing the Series to cityList
cityList = nba_df['Cities'].str.strip()
print(cityList)
print(type(cityList))
Output:
0 Boston
1 Brooklyn
2 New York
...
28 New Orleans
<class 'pandas.core.series.Series'>
The key is to use cityList.values, rather than just cityList. However, I encourage you to read the Series.values documentation, as pandas no longer recommends using this property (it looks like Series.array was added in 0.24, and they recommend using that instead). Both PandasArray and numpy.ndarray appear to behave a bit more like a list, at least in this example when it comes to membership testing. Again, reading the Series.array documentation is highly encouraged.
Example from the terminal:
>>> cityList[28]
'New Orleans'
>>> 'New Orleans' in cityList
False
>>> 'New Orleans' in cityList.values
True
>>> 'New Orleans' in cityList.array
True
You could also just create a list from your cityList (which, again, is a Series):
>>> list(cityList)
['Boston', 'Brooklyn', ..., 'New Orleans']
>>> 'New Orleans' in list(cityList)
True
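If you would rather keep the Series and not convert it, a value-based check can also be written with vectorized comparisons. This is a sketch of alternatives that are not part of the approach above; city_series is a hypothetical stand-in for your cityList.

```python
import pandas as pd

# Hypothetical stand-in for cityList
city_series = pd.Series(["Boston", "Brooklyn", "New Orleans"])

# Element-wise comparison, then ask whether any element matched
found_eq = bool((city_series == "New Orleans").any())

# Series.isin accepts a list of candidate values
found_isin = bool(city_series.isin(["New Orleans"]).any())

# Series.to_numpy() returns a NumPy array, where `in` tests the values
found_numpy = "New Orleans" in city_series.to_numpy()

print(found_eq, found_isin, found_numpy)
```

All three variants test the values rather than the index, so they should agree with the list(cityList) result.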
Side Note
I would probably rename your cityList to citySeries or something similar, to make it clear in your code that you are not dealing with a list, but with a "special" container from the pandas library.
Alternatively, you could just create your cityList like so (note: I'm using your code now, not my example):
cityList = list(cities['Metropolitan area'].str.strip())
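Since your loop runs the membership test once per team, it may be worth converting the column once, outside the loop. The sketch below assumes a set is acceptable (order and duplicates do not matter for a membership test); metro_areas and teams are small made-up stand-ins for cities['Metropolitan area'] and nba_df['team'].

```python
import pandas as pd

# Hypothetical stand-ins for cities['Metropolitan area'] and nba_df['team']
metro_areas = pd.Series([" Boston", "New Orleans ", "Chicago"])
teams = ["New Orleans Pelicans", "Boston Celtics"]

# Convert once; a set tests membership by value, like a list does
city_set = set(metro_areas.str.strip())

matches = {}
for team in teams:
    words = team.split()
    # Same candidate as in the question: the first two words of the team name
    candidate = words[0] + " " + words[1]
    matches[team] = candidate in city_set

print(matches)
```

Here "New Orleans" is found, while "Boston Celtics" is not, because the two-word candidate only fits cities whose names have two words.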
I did have to do a bit of research for this answer, as I am by no means a pandas expert, so here are the three questions that helped me figure this out: