How can I transfer the results from one variable to a column in Excel?
I want to add the values inside duplicates to the column Name, so that print(data["Name"]) returns all the values contained in duplicates. How can I achieve this?
Quick story: I'm importing a CSV file, and I need to split the column Name to get rid of meaningless information. Then I use a list comprehension to find the duplicates.
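import pandas as pd
from glob import iglob

data = pd.read_csv(next(iglob('*.csv')))
# Keep only the text before the first "(" (expand=True would return a DataFrame, not a column)
data["Name"] = data["Name"].str.split("(", n=1).str[0]
duplicates = [x for x in data["Name"]
              if x in data["Name"][data["Name"].duplicated()].values]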
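To illustrate the clean-up step on its own, here is a minimal sketch that uses a made-up in-memory column in place of the CSV file. The sample names are assumptions, not the asker's data. It keeps the text before the first "(" and strips the leftover trailing space.

```python
import pandas as pd

# Hypothetical sample data standing in for the imported CSV
data = pd.DataFrame({"Name": ["NameA (x1)", "NameB (y2)", "NameA (z3)", "NameC"]})

# Split once on "(" and keep the first piece, then strip trailing whitespace
data["Name"] = data["Name"].str.split("(", n=1).str[0].str.strip()

cleaned = data["Name"].tolist()
print(cleaned)
```

Rows without a "(" are left unchanged, because the split then yields a single piece.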
Edit:
df['duplicates'] = df['Name'].where(df['Name'].duplicated(keep=False), '')
    Name duplicates
0  NameC
1  NameA      NameA
2  NameB      NameB
3  NameA      NameA
4  NameA      NameA
5  NameB      NameB
Or, if you only want to label the repeated occurrences and leave the first occurrence of each name blank, remove keep=False:
df['duplicates'] = df['Name'].where(df['Name'].duplicated(), '')
    Name duplicates
0  NameC
1  NameA
2  NameB
3  NameA      NameA
4  NameA      NameA
5  NameB      NameB
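The two variants above can be compared side by side. The following is only a sketch: the column names all_dupes and later_dupes are hypothetical, and the result is written to an in-memory CSV buffer rather than to a real file.

```python
import io
import pandas as pd

df = pd.DataFrame({"Name": ["NameC", "NameA", "NameB", "NameA", "NameA", "NameB"]})

# keep=False flags every member of a duplicated group
df["all_dupes"] = df["Name"].where(df["Name"].duplicated(keep=False), "")
# The default (keep="first") leaves the first occurrence unflagged
df["later_dupes"] = df["Name"].where(df["Name"].duplicated(), "")

# Write to an in-memory buffer here; pass a file path instead to save to disk
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

A CSV file saved this way opens in Excel with one column per DataFrame column. Writing a native workbook with df.to_excel should work similarly, assuming an Excel engine such as openpyxl is installed.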
IIUC, you can try something like this:
df = pd.DataFrame({'Name':['NameC', 'NameA', 'NameB', 'NameA', 'NameA', 'NameB']})
duplicates = df.loc[df['Name'].duplicated(), 'Name'].unique().tolist()
duplicates
Output:
['NameA', 'NameB']
Explanation: Use duplicated to create a boolean Series, filter the DataFrame with loc by that boolean Series and the column 'Name', then use unique to get the distinct values among the duplicates and tolist to turn them into a plain list.
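The steps in the explanation can be written out one at a time. This is a sketch on the same sample data; the intermediate names mask, repeated and only_dupes are introduced here for illustration, and the final isin filter is one possible way (an assumption about the asker's goal) to restrict the Name column to the duplicated values.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["NameC", "NameA", "NameB", "NameA", "NameA", "NameB"]})

# Step 1: boolean Series, True for every repeat after the first occurrence
mask = df["Name"].duplicated()

# Step 2: select the 'Name' values on the flagged rows
repeated = df.loc[mask, "Name"]

# Step 3: collapse to distinct values, in order of first appearance
duplicates = repeated.unique().tolist()

# Optional: keep only the rows whose Name is one of the duplicated values
only_dupes = df[df["Name"].isin(duplicates)]
print(only_dupes["Name"].tolist())
```

A list of duplicates is usually shorter than the DataFrame, so it cannot be assigned directly as a column. Filtering with isin, or labelling with where as in the other answer, keeps the lengths consistent.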