In Python using pandas, I want my program to search for strings in column A and column B of a CSV file and write the results in column C
import pandas as pd
path = "C:\\Users\\Desktop\\Python\\"
filename = 'file.csv'
df = pd.read_csv(path + filename)
arr = []
for i in range(len(df['Column_A'])):
    # pd.isna() detects missing cells; NaN never compares equal to itself
    if not pd.isna(df['Column_A'][i]):
        if 'ABC' in df['Column_A'][i]:
            arr.append('X')
        elif 'DEF' in df['Column_A'][i]:
            arr.append('Y')
        elif 'GHI' in df['Column_A'][i]:
            arr.append('Z')
        else:
            arr.append('')
    else:
        arr.append(' ')
df['Column_C'] = arr
filename = 'output.csv'
df.to_csv(path + filename)
In the above code I also want to search Column_B for the strings ("ABC", "DEF", "GHI"), along with Column_A, and write the result to Column_C when a match is found.
It is not clear to me from your question which of two situations you want.
In the first situation, a string counts only when it is found in both Column_A and Column_B, so Column_C will typically be 'X', 'Y', 'Z', or ''. But what would happen if Column_A has 'ABC' and Column_B has 'DEF'? With the code below, no string is present in both columns, so that row gets ''.
import pandas as pd
path = "C:\\Users\\Desktop\\Python\\"
filename = 'file.csv'
df = pd.read_csv(path + filename)
arr = []
lookup = {'ABC': 'X', 'DEF': 'Y', 'GHI': 'Z'}
for col_A, col_B in zip(df['Column_A'], df['Column_B']):
    to_append = ''
    # Only search rows where neither cell is missing (NaN)
    if pd.notna(col_A) and pd.notna(col_B):
        for key in lookup:
            # Count a key only when it appears in both columns
            if key in col_A and key in col_B:
                to_append = to_append + lookup[key]
    # Append once per row so arr stays the same length as df
    arr.append(to_append)
df['Column_C'] = arr
filename = 'output.csv'
df.to_csv(path + filename, index=False)
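The same "found in both columns" rule can also be written without an explicit row loop. The following is only a sketch on made-up sample data: the helper name `match_both` and the sample rows are assumptions for illustration, and `str.contains(..., na=False)` is used so that missing cells simply count as "no match".

```python
import pandas as pd

lookup = {'ABC': 'X', 'DEF': 'Y', 'GHI': 'Z'}

def match_both(df, lookup):
    """Concatenate the letter of every key found in BOTH Column_A and Column_B."""
    result = pd.Series('', index=df.index)
    for key, letter in lookup.items():
        # regex=False does a plain substring search; na=False treats NaN as no match
        in_a = df['Column_A'].str.contains(key, regex=False, na=False)
        in_b = df['Column_B'].str.contains(key, regex=False, na=False)
        # Append the letter only on rows where both columns contain the key
        result = result.mask(in_a & in_b, result + letter)
    return result

# Hypothetical sample data standing in for file.csv
df = pd.DataFrame({
    'Column_A': ['ABC 1', 'xx DEF', None, 'ABC'],
    'Column_B': ['1 ABC', 'GHI', 'ABC', 'DEF'],
})
df['Column_C'] = match_both(df, lookup)
print(df['Column_C'].tolist())  # ['X', '', '', '']
```

Only the first row has the same string in both columns, so it is the only row that gets a letter. The last row is the 'ABC' / 'DEF' case and stays empty.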
In the second situation, a string found in either column contributes its letter, so Column_C can be 'X', 'Y', 'Z', a combination such as 'XY', 'YZ', or 'XYZ', or ''.
import pandas as pd
path = "C:\\Users\\Desktop\\Python\\"
filename = 'file.csv'
df = pd.read_csv(path + filename)
arr = []
lookup = {'ABC': 'X', 'DEF': 'Y', 'GHI': 'Z'}
for col_A, col_B in zip(df['Column_A'], df['Column_B']):
    to_append = ''
    # Only search rows where neither cell is missing (NaN)
    if pd.notna(col_A) and pd.notna(col_B):
        for key in lookup:
            # Count a key when it appears in either column (once per key)
            if key in col_A or key in col_B:
                to_append = to_append + lookup[key]
    # Append once per row so arr stays the same length as df
    arr.append(to_append)
df['Column_C'] = arr
filename = 'output.csv'
df.to_csv(path + filename, index=False)
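One possible shortcut for the "either column" rule is to join the two columns into a single string per row and search that. This is a sketch on made-up sample data, not the code above: the sample rows are assumptions, the newline separator is assumed never to occur inside a search string, and, unlike the loop above, a row with one missing cell can still match on the other cell.

```python
import pandas as pd

lookup = {'ABC': 'X', 'DEF': 'Y', 'GHI': 'Z'}

# Hypothetical sample data standing in for file.csv
df = pd.DataFrame({
    'Column_A': ['ABC 1', 'xx DEF', None, 'nothing'],
    'Column_B': ['1 ABC', 'GHI', 'ABC', None],
})

# Join both columns into one searchable string per row; missing values become ''.
# The '\n' separator keeps a match from spanning the boundary between the two cells.
combined = df['Column_A'].fillna('') + '\n' + df['Column_B'].fillna('')

column_c = pd.Series('', index=df.index)
for key, letter in lookup.items():
    # Plain substring search over the joined text
    found = combined.str.contains(key, regex=False)
    column_c = column_c.mask(found, column_c + letter)
df['Column_C'] = column_c
print(df['Column_C'].tolist())  # ['X', 'YZ', 'X', '']
```

The second row shows the combined case: 'DEF' comes from Column_A and 'GHI' from Column_B, giving 'YZ'.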
In both situations I also added index=False when writing the CSV file, because I assumed you wanted a file exactly like your input file, but with one extra column, Column_C.
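To see what index=False changes, the small sketch below writes the same frame twice into in-memory buffers (used here instead of real files, with made-up data) and compares the header lines.

```python
import io
import pandas as pd

# Hypothetical one-row frame standing in for the real data
df = pd.DataFrame({'Column_A': ['ABC'], 'Column_B': ['DEF'], 'Column_C': ['X']})

with_index = io.StringIO()
df.to_csv(with_index)                   # default: the row index is written as a first column
without_index = io.StringIO()
df.to_csv(without_index, index=False)   # only the data columns are written

header_with = with_index.getvalue().splitlines()[0]
header_without = without_index.getvalue().splitlines()[0]
print(header_with)     # ,Column_A,Column_B,Column_C
print(header_without)  # Column_A,Column_B,Column_C
```

Without index=False, the output gains an unnamed leading column holding 0, 1, 2, ..., which would then be read back as an extra column the next time the file is loaded.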