
How to see if a row in column A exists in Column B with python csv reader

I have two columns in a csv file:

ColumnA ColumnB
jon     don
eric    cathrine
don     sony
jay     jon
ron
anne

What I am trying to do is check whether each value in ColumnA exists in ColumnB. In this case only 'jon' and 'don' exist in ColumnB. I am using Python and its csv reader; so far I have used the following code:

with open('samplefile.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',')
    for line in csvreader:
      if line[0] not in line[1]:
        print(line[0]+ " Does not exist")

This does not work because my code compares line by line instead of comparing each value in ColumnA to every value in ColumnB. I also tried putting the values from the csv into lists, but that does not work either because it also appends the empty values from ColumnB to the second list. Any help is appreciated. I am not limited to the csv reader; I can use any other library, such as pandas.

Change your code like this:

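import csv

with open('samplefile.csv', 'r', newline='') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',')
    next(csvreader, None)   # skip the header row
    rows = list(csvreader)  # read once; the reader cannot be iterated twice

# collect the non-empty ColumnB values
second_column = {row[1] for row in rows if len(row) > 1 and row[1]}
for row in rows:
    if row and row[0] not in second_column:
        print(f"{row[0]} Does not exist")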
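
For reference, here is a self-contained sketch of the same idea that runs without a file on disk. It assumes the header row shown in the question (ColumnA, ColumnB). The SAMPLE string and the missing_from_b helper are illustrative names, not part of the original code. It reads the data with csv.DictReader and ignores blank ColumnB cells.

```python
import csv
import io

# Sample data copied from the question; an in-memory stand-in for samplefile.csv
SAMPLE = """ColumnA,ColumnB
jon,don
eric,cathrine
don,sony
jay,jon
ron,
anne,
"""

def missing_from_b(csvfile):
    """Return the ColumnA values that never appear in ColumnB."""
    rows = list(csv.DictReader(csvfile))  # read everything once
    # a set gives fast lookups; blank or missing cells are left out
    column_b = {row["ColumnB"] for row in rows if row["ColumnB"]}
    return [row["ColumnA"] for row in rows if row["ColumnA"] not in column_b]

for name in missing_from_b(io.StringIO(SAMPLE)):
    print(f"{name} Does not exist")
```

To use it on the real file, pass the handle from open('samplefile.csv', newline='') instead of the StringIO object.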

With pandas we can use .isin to return a boolean Series:

df['check'] = df['ColumnA'].isin(df['ColumnB'])

print(df)
  ColumnA   ColumnB  check
0     jon       don   True
1    eric  cathrine  False
2     don      sony   True
3     jay       jon  False
4     ron      None  False
5    anne      None  False
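
If only the unmatched names are wanted, the boolean Series can be inverted and used as a mask. A minimal sketch, assuming the CSV has the header row from the question; SAMPLE and missing are illustrative names, and the data is loaded from an in-memory string rather than samplefile.csv:

```python
import io

import pandas as pd

# Sample data copied from the question; an in-memory stand-in for samplefile.csv
SAMPLE = """ColumnA,ColumnB
jon,don
eric,cathrine
don,sony
jay,jon
ron,
anne,
"""

df = pd.read_csv(io.StringIO(SAMPLE))

# True where the ColumnA value occurs anywhere in ColumnB (empty cells dropped)
df["check"] = df["ColumnA"].isin(df["ColumnB"].dropna())

# keep only the ColumnA values that have no match in ColumnB
missing = df.loc[~df["check"], "ColumnA"].tolist()
print(missing)
```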

You can do it with pandas:

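import pandas as pd

# df from csv
df = pd.read_csv('samplefile.csv', header=0)
# iterate the df
for index, row in df.iterrows():
    # row['ColumnA'] is a plain string with no .isin method,
    # so test membership against the ColumnB values instead
    if row['ColumnA'] not in df['ColumnB'].values:
        print(f"{row['ColumnA']} doesn't exist")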


 