Writing Python pandas dataframe row slices to a file

I have a CSV file with four columns; the first column is a case ID, which repeats across rows.

========INPUT csv file=============
case_num, serial,binary,review
23,29983, 1, "lorem ipsum ,lorem ipsum"
23,298829, 1, "Hi there"
29, 20020, 0, "hickery dickery dock"
29,298829, 1, "Hello there"
29, 28220, 0, "dickery dock"

I'm trying to keep only one row (the first) for each unique case ID.

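import pandas

input = pandas.read_csv("inp.csv")
case_id = input["case_num"].sort_values()
with open("out.csv", "w") as fl:
    for i in case_id.unique():
        # first row for this case id, written as the string form of a numpy array
        fl.write(str(input[input["case_num"] == i].iloc[0].values))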

Output:

[23 '29983' 1
 'lorem ipsum ,lorem ipsum'] #<type 'numpy.ndarray'>

[29 '20220' 0
 'hickery dickery dock']     #<type 'numpy.ndarray'>

As you can see, each row is written out across several lines, but I want one comma-separated row per line.

=====DESIRED OUTPUT=======

23, '29983', 1,  'lorem ipsum ,lorem ipsum'
29 ,'20220', 0,  'hickery dickery dock'

To put it simply: if I've read some rows from a dataframe (generated from a CSV file), how do I write the selected subset of rows to an output CSV file in exactly the same format as the input CSV file?

If I understand correctly, you can use drop_duplicates:

print(df)
   case id case_num no                        text
0       23  '29983'  1  'lorem ipsum ,lorem ipsum'
1       23  '29983'  1  'lorem ipsum ,lorem ipsum'
2       23  '29983'  1  'lorem ipsum ,lorem ipsum'
3       23  '29983'  1  'lorem ipsum ,lorem ipsum'
4       29  '20220'  0      'hickery dickery dock'

# keep the first row for each distinct value of 'case id'
df = df.drop_duplicates(subset='case id')
print(df)
   case id case_num no                        text
0       23  '29983'  1  'lorem ipsum ,lorem ipsum'
4       29  '20220'  0      'hickery dickery dock'
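
A small sketch of which row survives, in case it helps: by default drop_duplicates keeps the first row per key, and keep="last" keeps the last one instead. The frame below is a hypothetical reconstruction of the example above, and the groupby variant is only an alternative for comparison, not part of the original answer.

```python
import pandas as pd

# Hypothetical frame shaped like the example above.
df = pd.DataFrame({
    "case id": [23, 23, 23, 23, 29],
    "case_num": ["'29983'"] * 4 + ["'20220'"],
    "no": [1, 1, 1, 1, 0],
    "text": ["'lorem ipsum ,lorem ipsum'"] * 4 + ["'hickery dickery dock'"],
})

# Default behaviour: keep the first row seen for each 'case id'.
first_rows = df.drop_duplicates(subset="case id", keep="first")

# Keep the last row seen for each 'case id' instead.
last_rows = df.drop_duplicates(subset="case id", keep="last")

# Alternative for comparison: first row of each group, original order preserved.
via_groupby = df.groupby("case id", sort=False).head(1)

print(first_rows)
print(last_rows)
```

In both cases the original index labels are kept, so the surviving rows can still be traced back to the input.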

Then write the result to a CSV file with to_csv (filename is the output path):

df.to_csv(filename, sep=',', index=False)
case id,case_num,no,text
23,'29983',1,"'lorem ipsum ,lorem ipsum'"
29,'20220',0,'hickery dickery dock'
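
Putting the two steps together on the question's own data, here is a minimal end-to-end sketch. It reads from an in-memory string rather than inp.csv so that it runs on its own; the function name first_row_per_case is made up for illustration, and skipinitialspace=True is an assumption added here because the sample file has spaces after some commas (without it the quoted review field and the " serial" header are not parsed cleanly).

```python
import io
import pandas as pd

# Sample data copied from the question's input file, held in memory here.
RAW = '''case_num, serial,binary,review
23,29983, 1, "lorem ipsum ,lorem ipsum"
23,298829, 1, "Hi there"
29, 20020, 0, "hickery dickery dock"
29,298829, 1, "Hello there"
29, 28220, 0, "dickery dock"
'''

def first_row_per_case(csv_text):
    """Return the first row for each distinct case_num (hypothetical helper)."""
    # skipinitialspace=True ignores the blanks that follow the delimiters,
    # so quoted fields and header names are read without leading spaces.
    df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
    return df.drop_duplicates(subset="case_num", keep="first")

result = first_row_per_case(RAW)

# With no path given, to_csv returns the CSV text; pass a filename to write a file.
csv_out = result.to_csv(index=False)
print(csv_out)
```

Note that to_csv only quotes fields that need it (here, the review containing a comma), so the output is valid CSV but not necessarily byte-for-byte identical to the input's spacing and quoting.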
