How to write into csv using the results generated from a DataFrame in python?

I am reading data from a TSV file using DataFrame from the pandas module in Python.

df = pandas.read_csv(filename, sep='\t', index_col=0, parse_dates=True)

The file has around 5000 columns (4999 test parameters and 1 result/output value).

I iterate through the entire TSV file and check whether the result value matches the expected value. If it does, I write that row to another CSV file.

expected_value = 'some_value'
with open(file_to_write, 'w') as csvfile:
  csvfwriter = csv.writer(csvfile, delimiter='\t')
  for row in df.iterrows():
    result = row['RESULT']
    if expected_value.lower() in str(result).lower():
        csvwriter.writerow(row)

But in the output CSV file the result is wrong: the individual column values do not go into their respective columns/cells. They get appended as rows instead. How do I write this data correctly to the CSV file?

The suggested answers work well; however, I need to check multiple conditions. I have a list with some values:

vals = ['hello', 'foo', 'bar']

One of the columns holds, in every row, a value that looks like 'hello,foo,bar'. I need to do two checks: whether any value in the vals list is present in that column, or whether the result value matches the expected value. I have written the following code:

df = pd.read_csv(filename, sep='\t', index_col=0, parse_dates=True)
for index, row in df.iterrows():
  csv_vals = row['COL']
  values = str(csv_vals).split(",")
  if len(set(vals).intersection(set(values))) > 0 or expected_value.lower() in str(row['RESULT_COL']).lower():
    print(row['RESULT_COL'])

You should create a DataFrame that has one column 'RESULT' and one column 'EXPECTED'.

Then you can filter the rows where both match and write only those to CSV using:

df.loc[df['EXPECTED'] == df['RESULT']].to_csv(filename)
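As a minimal sketch of this idea, assuming a small made-up DataFrame (the parameter columns `P1` and `P2` are hypothetical) and an in-memory buffer in place of a real file, the boolean mask selects the matching rows, and passing `sep='\t'` to `to_csv` keeps the tab-separated layout of the input:

```python
import io
import pandas as pd

# Made-up sample data; the column names P1 and P2 are illustrative only.
df = pd.DataFrame({
    'P1': [1, 2, 3],
    'P2': ['a', 'b', 'c'],
    'RESULT': ['pass', 'fail', 'pass'],
    'EXPECTED': ['pass', 'pass', 'pass'],
})

# Boolean mask: True for the rows where the two columns agree.
matches = df.loc[df['EXPECTED'] == df['RESULT']]

# Written to an in-memory buffer here; pass a file path for a real file.
buffer = io.StringIO()
matches.to_csv(buffer, sep='\t', index=False)
tsv_text = buffer.getvalue()
```

Because `to_csv` writes whole rows at once, every value lands in its own column without a manual `csv.writer` loop.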

You can filter the values like this:

df[df['RESULT'].str.lower().str.contains(expected_value.lower())].to_csv(filename)

This filters the rows whose value contains your expected_value, as your code does. If you want an exact match, you can use:

df.loc[df['RESULT'].str.lower() == expected_value.lower()].to_csv(filename)

As you suggested in the comment, for multiple criteria you will need something like this:
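The difference between the substring filter and the exact-match filter can be seen on a few made-up rows. This is only a sketch: the sample values are invented, and the `na=False` and `regex=False` arguments are additions not present in the answer, used here so that missing values count as non-matching and the expected value is compared as literal text.

```python
import pandas as pd

# Made-up sample data, including one missing value.
df = pd.DataFrame({'RESULT': ['Some_Value', 'some_value_extra', 'other', None]})
expected_value = 'some_value'

lowered = df['RESULT'].str.lower()

# Substring match, ignoring case.
contains_rows = df[lowered.str.contains(expected_value.lower(), na=False, regex=False)]

# Exact match, ignoring case.
exact_rows = df[lowered == expected_value.lower()]
```

Here `contains_rows` keeps the first two rows, while `exact_rows` keeps only the first.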

expected_values = [expected_value1, expected_value2, expected_value3]
df[df['RESULT'].isin(expected_values)]

UPDATE:

To filter on multiple criteria and on the desired column:

df.loc[df.isin(vals).any(axis=1) & (df['RESULT'].str.lower() == expected_value.lower())].to_csv(filename)
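Note that `isin` compares whole cell values, so a cell such as 'hello,foo,bar' does not match 'hello'. A sketch closer to the two checks described in the question, assuming the column names `COL` and `RESULT_COL` from the question's code and made-up sample rows, splits the comma-separated cell first and combines the two checks with `|` (or):

```python
import pandas as pd

vals = ['hello', 'foo', 'bar']
expected_value = 'some_value'

# Made-up sample data using the column names from the question.
df = pd.DataFrame({
    'COL': ['hello,x', 'y,z', 'q', 'foo,bar'],
    'RESULT_COL': ['nothing', 'Some_Value here', 'nothing', 'some_value'],
})

# Check 1: at least one item of the comma-separated cell appears in vals.
wanted = set(vals)
has_val = df['COL'].astype(str).apply(
    lambda cell: not wanted.isdisjoint(cell.split(',')))

# Check 2: RESULT_COL contains the expected value, ignoring case.
has_expected = df['RESULT_COL'].astype(str).str.lower().str.contains(
    expected_value.lower(), regex=False)

# Keep the rows that satisfy either check.
selected = df[has_val | has_expected]
```

Calling `selected.to_csv(file_to_write, sep='\t')` would then write the selected rows in one call; replace `|` with `&` if both checks must hold.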
