Pandas - Create new column - if another column value is in list (correct way)
I've been struggling to create a new column that states whether each row falls on a weekend or not, based on the 'Day of Week' column. I am using the following code, based on a previous Stack Overflow question.
weekday_classification = {
'Weekday': ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'],
'Weekend': ['Saturday', 'Sunday']
}
weekday_classification = {day: all_days for all_days, l in weekday_classification.items() for day in l}
df["Weekend"] = df['Day of Week'].map(weekday_classification)
df.head()
Though the above code produces the desired effect, I am getting a warning which states:
ipython-input-21-e273917f31f9:6: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
Is there a way to get around this? I have read the documentation on how to create a new column, but it seems to cover only simpler column creations.
I'm still just dipping my toes into Python and data analysis, so I'm happy to receive general feedback.
Reverse your dictionary so it looks like this:
weekday_classification = {
'Monday': 'Weekday',
'Tuesday': 'Weekday',
'Wednesday': 'Weekday',
'Thursday': 'Weekday',
'Friday': 'Weekday',
'Saturday': 'Weekend',
'Sunday': 'Weekend'
}
Then construct a new DataFrame from that weekday_classification dict and join it with your existing df:
In []: days = pd.DataFrame(data=weekday_classification.values(), index=weekday_classification.keys(), columns=['Weekday/end'])
days
Out[]:
Weekday/end
Monday Weekday
Tuesday Weekday
Wednesday Weekday
Thursday Weekday
Friday Weekday
Saturday Weekend
Sunday Weekend
In []: df.join(days, on='Day of Week')
Out[]:
Day of Week Weekday/end
0 Monday Weekday
1 Tuesday Weekday
2 Wednesday Weekday
3 Thursday Weekday
4 Friday Weekday
5 Saturday Weekend
6 Sunday Weekend
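For a two-way split like this one, a join is not strictly required. As a minimal sketch (the sample df and the name weekend_days below are made up for illustration and are not from the question), Series.isin combined with numpy.where can label each row directly:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df comes from the question.
df = pd.DataFrame({"Day of Week": ["Monday", "Saturday", "Sunday", "Friday"]})

# Assumed name for the list of weekend days.
weekend_days = ["Saturday", "Sunday"]

# isin() returns a boolean Series; np.where picks one label per row.
df["Weekday/end"] = np.where(
    df["Day of Week"].isin(weekend_days), "Weekend", "Weekday"
)
print(df)
```

Note that with this approach any value not in weekend_days, including a misspelled or missing day, ends up labelled 'Weekday', whereas the dictionary-based approach would leave it as NaN.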
This happens because your df is a 'subset' of another DataFrame. You may have filtered another DataFrame on one of its columns to generate this df, like:
df = df_p[df_p['some_col'].isin(some_set)]
Pandas may simply create a reference to parts of df_p to present df, rather than actually creating df as a separate object. In this situation, df behaves like a slice of df_p, and modifying df triggers the warning because the change may affect df_p. This is what the warning message describes. Make sure df has its own data when it is created. Filter the other DataFrame like this:
df = df_p[df_p['some_col'].isin(some_set)].copy()
or use copy.deepcopy() for complicated data.
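To see the whole fix in one place, here is a rough sketch that combines the .copy() filter with a day-to-label mapping like the one in the question. The contents of df_p, some_col and some_set, and the name day_to_class, are invented for illustration:

```python
import pandas as pd

# Hypothetical parent frame standing in for df_p.
df_p = pd.DataFrame({
    "Day of Week": ["Monday", "Saturday", "Sunday", "Friday"],
    "some_col": [1, 2, 3, 4],
})
some_set = {2, 3, 4}

# .copy() gives df its own data, so adding a column cannot affect df_p.
df = df_p[df_p["some_col"].isin(some_set)].copy()

# Day -> label mapping (assumed name), same idea as in the question.
day_to_class = {
    "Monday": "Weekday", "Tuesday": "Weekday", "Wednesday": "Weekday",
    "Thursday": "Weekday", "Friday": "Weekday",
    "Saturday": "Weekend", "Sunday": "Weekend",
}

# Plain column assignment on the independent copy.
df["Weekend"] = df["Day of Week"].map(day_to_class)
print(df)
```

Because df no longer refers back to df_p, the new column exists only on df and df_p keeps its original columns.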