How to see if a row in column A exists in Column B with python csv reader
I have two columns in a CSV file:
ColumnA ColumnB
jon don
eric cathrine
don sony
jay jon
ron
anne
What I am trying to do is check whether each value in ColumnA exists in ColumnB; in this case only 'jon' and 'don' exist in ColumnB. I am using Python and its csv reader. So far I have used the following code:
with open('samplefile.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',')
    for line in csvreader:
        if line[0] not in line[1]:
            print(line[0]+ " Does not exist")
This does not work because my code compares the two values on the same line instead of comparing each value in ColumnA against every value in ColumnB. I also tried putting the values from the CSV into lists, but that does not work either, because it also appends the empty values from ColumnB to the second list. Any help is appreciated. I am not limited to the csv reader; I can use other libraries such as pandas.
Change your code like this:
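import csv

with open('samplefile.csv', 'r', newline='') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',')
    next(csvreader)  # skip the header row
    rows = list(csvreader)  # read once: the reader is exhausted after a single pass
first_column = [l[0] for l in rows if l]
# ignore missing or empty ColumnB cells
second_column = [l[1] for l in rows if len(l) > 1 and l[1]]
for line in first_column:
    if line not in second_column:
        print(f"{line} Does not exist")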
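The same idea can be sketched in a self-contained form with csv.DictReader, so the columns are addressed by name. This is only an illustration under stated assumptions: the file is assumed to be comma-delimited with a `ColumnA,ColumnB` header, the in-memory `SAMPLE` string stands in for samplefile.csv, and `values_missing_from_b` is a hypothetical helper name, not something from the question.

```python
import csv
import io

# In-memory copy of the question's sample data (assumed comma-delimited);
# it stands in for samplefile.csv so the sketch runs on its own.
SAMPLE = """ColumnA,ColumnB
jon,don
eric,cathrine
don,sony
jay,jon
ron,
anne,
"""


def values_missing_from_b(csvfile):
    """Return ColumnA values that never appear in ColumnB (hypothetical helper)."""
    # Read everything once; a csv reader is a one-shot iterator.
    rows = list(csv.DictReader(csvfile))
    # Collect only non-empty ColumnB cells; a set gives fast membership tests.
    column_b = {row["ColumnB"] for row in rows if row["ColumnB"]}
    return [row["ColumnA"] for row in rows if row["ColumnA"] not in column_b]


missing = values_missing_from_b(io.StringIO(SAMPLE))
for name in missing:
    print(f"{name} Does not exist")
```

With a real file, the same function could be called as `values_missing_from_b(open('samplefile.csv', newline=''))`, ideally inside a `with` block.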
With pandas we can use .isin to return a boolean Series:
df['check'] = df['ColumnA'].isin(df['ColumnB'])
print(df)
ColumnA ColumnB check
0 jon don True
1 eric cathrine False
2 don sony True
3 jay jon False
4 ron None False
5 anne None False
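To go from the boolean column to the actual names that are missing, the mask can be inverted with `~`. The following is a minimal sketch, assuming a comma-delimited file with a `ColumnA,ColumnB` header; the in-memory `SAMPLE` string and the variable names `in_b` and `missing` are illustrative assumptions.

```python
import io

import pandas as pd

# In-memory stand-in for samplefile.csv (assumed comma-delimited).
SAMPLE = """ColumnA,ColumnB
jon,don
eric,cathrine
don,sony
jay,jon
ron,
anne,
"""

df = pd.read_csv(io.StringIO(SAMPLE))

# True where the ColumnA value appears anywhere in ColumnB;
# dropna() removes the empty ColumnB cells before the comparison.
in_b = df["ColumnA"].isin(df["ColumnB"].dropna())

# Invert the mask to keep only the ColumnA values that are absent from ColumnB.
missing = df.loc[~in_b, "ColumnA"].tolist()
for name in missing:
    print(f"{name} Does not exist")
```

This avoids an explicit Python loop over rows, which tends to matter only for larger files.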
You can do it with pandas:
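import pandas as pd

# load the DataFrame from the CSV file
df = pd.read_csv('samplefile.csv', header=0)
# iterate over the rows
for index, row in df.iterrows():
    # row['ColumnA'] is a plain value with no .isin, so test membership against ColumnB's values
    if row['ColumnA'] not in df['ColumnB'].values:
        print(f"{row['ColumnA']} doesn't exist")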