Checking for unique values in a pandas DataFrame column and cross-referencing with a second column
I have a pandas DataFrame that looks something like the one below. I want to check whether each value in User ID is unique. If it is, and the License Type for that row is 'full', I want to put a 1 in a new column 'Full_direct'. Otherwise, I want a 0 in the 'Full_direct' column.
         Date               User ID   Product Name  License Type    Month
0  2017-01-01  10431046623214402832   90295d194237         trial  2017-01
1  2017-07-09    246853380240772174   29125b243095         trial  2017-07
2  2017-07-07  13685844038024265672     47423e1485         trial  2017-07
3  2017-02-12   2475366081966194134   202400c85587          full  2017-02
4  2017-04-08    761179767639020420  168300g168004          full  2017-04
I made this attempt but wasn't able to iterate through the DataFrame in this manner. I was hoping someone could advise. Thanks!
for values in main_df['User ID']:
    if values.is_unique and main_df['License Type'] == 'full':
        main_df['Full_Direct'] = 1
    else:
        main_df['Full_direct'] = 0
We do not need a for loop here; let us try duplicated. With keep=False it marks every row whose User ID appears more than once, so negating it leaves only the unique IDs:
df['Full_direct'] = ((~df['User ID'].duplicated(keep=False)) & (df['License Type'] == 'full')).astype(int)
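
As a rough illustration of the one-liner above, here is a small sketch built from the sample rows in the question. Two things are assumptions rather than part of the original data: User ID is stored as a string (some of the IDs are too large for a signed 64-bit integer), and the last row is a hypothetical repeat added only so that the duplicate case is exercised.

```python
import pandas as pd

# Sample rows from the question; User ID is kept as a string (an assumption)
df = pd.DataFrame({
    "User ID": [
        "10431046623214402832",
        "246853380240772174",
        "13685844038024265672",
        "2475366081966194134",
        "761179767639020420",
        "761179767639020420",  # hypothetical repeat, not in the original data
    ],
    "License Type": ["trial", "trial", "trial", "full", "full", "full"],
})

# True where the User ID occurs exactly once in the column
is_unique = ~df["User ID"].duplicated(keep=False)
# True where the license is a full one
is_full = df["License Type"] == "full"

# Both conditions must hold; cast the boolean result to 0/1
df["Full_direct"] = (is_unique & is_full).astype(int)
print(df["Full_direct"].tolist())  # [0, 0, 0, 1, 0, 0]
```

Only row 3 gets a 1: it is the only row with a 'full' license whose User ID is not repeated.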
To fix your code:
for values in df.index:
    # A User ID is unique when exactly one row in the column holds it
    if df['User ID'].isin([df.loc[values, 'User ID']]).sum() == 1 and df.loc[values, 'License Type'] == 'full':
        df.loc[values, 'Full_direct'] = 1
    else:
        df.loc[values, 'Full_direct'] = 0
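
The loop above rescans the whole column for every row, which may get slow on a large frame. One possible alternative, sketched here and not taken from the original answer, is to count each User ID once with value_counts and look the count up per row with map. The tiny frame and its values are hypothetical and only for illustration.

```python
import pandas as pd

# Small hypothetical frame; the values are illustrative only
df = pd.DataFrame({
    "User ID": ["a", "b", "b", "c"],
    "License Type": ["full", "full", "trial", "trial"],
})

# Count each User ID once, then look that count up for every row
counts = df["User ID"].map(df["User ID"].value_counts())

# 1 only where the ID occurs once and the license is 'full'
df["Full_direct"] = ((counts == 1) & (df["License Type"] == "full")).astype(int)
print(df["Full_direct"].tolist())  # [1, 0, 0, 0]
```

This should give the same 0/1 flags as the loop, and it also keeps the per-ID counts around in case they are useful elsewhere.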