
Taking the last line from each row in a large CSV file?

I have a CSV file with 12000 rows, and each row contains multiple lines. I need to read the file and write only the last line of each of the 12000 rows into a new column.

"► Контакт с пациентом | 07.02.2019 |  | 
► Принять в работу | 07.02.2019 |  | 
► Контакт с пациентом | 08.02.2019 |  | 
► Получить КП  | 14.02.2019 |  | 
► ждем КП | 18.02.2019 |  | 
► отправил ему ответ и стоимости лекарств! через дви недели с ним связываться  | 05.03.2019 |  | 
► арихив  | 23.03.2019 |  | ";
"► Контакт с пациентом | 19.06.2019 |  | 
► Принять в работу | 19.06.2019 |  | 
► Контакт с пациентом | 26.08.2019 |  | 
► Архив. | 10.09.2019 |  | ";

I can do that for only one row. How can I do it for all 12000 rows?

import pandas as pd
df = pd.read_csv('/Users/gfidarov/Desktop/crosscheck/crosscheck/sheet1')
r = df.split('|')
r = r[-4:]
r = '|'.join(r)
print(r)

Here I can read the file with the csv library, but I can't take only the last line. And if I try row = row[-4:], as I did with pandas, I get an error. How can I solve my problem?

import csv

with open('/Users/gfidarov/Desktop/sheet_one') as f:
    reader = csv.DictReader(f, delimiter='|')
    for row in reader:
        print(list(row))

For that file, the last line of each row is the line ending with a semicolon ( ; ) following a double quote ( " ).

So this could be enough:

with open('/Users/gfidarov/Desktop/sheet_one') as f:
    for line in f:
        if line.strip().endswith('";'):           # Ok this is the line we want...
            line = line.strip().strip('";')       # clean it a little
            print(line)
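Since the question asks for the last lines to end up in a new column, the same scan can collect them instead of printing. The following is only a sketch: the function names, the sample lines, and the `last_status` header are made up for illustration, and it assumes every row closes with `";` as in the sample.

```python
import csv
import io

def extract_last_lines(lines):
    """Return the cleaned closing line of every quoted block."""
    result = []
    for line in lines:
        stripped = line.strip()
        if stripped.endswith('";'):
            # Drop the surrounding quote/semicolon characters, then spaces.
            result.append(stripped.strip('";').strip())
    return result

def write_column(values, out, header='last_status'):
    """Write the values as a one-column CSV (header name is hypothetical)."""
    writer = csv.writer(out)
    writer.writerow([header])
    writer.writerows([v] for v in values)

# Hypothetical sample shaped like the data in the question.
sample_lines = [
    '"► first | 07.02.2019 |  | \n',
    '► archive | 23.03.2019 |  | ";\n',
    '"► first | 19.06.2019 |  | \n',
    '► Archive. | 10.09.2019 |  | ";\n',
]
last = extract_last_lines(sample_lines)
out = io.StringIO()
write_column(last, out)
```

With a real file, `sample_lines` would be the open file object and `out` a second file opened with `newline=''`.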

BTW, the csv attempt did not work because, by default, the double quote is used to quote fields containing the delimiter or newlines, so here the csv module will see each quoted block as one single field.
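That quoting behaviour can also be used on purpose. As a rough sketch, assuming the semicolon after each closing quote is the real field delimiter (a guess based on the sample, not something the question confirms), the csv module returns each multi-line block as one field, and its last line can be taken with `splitlines()`. The `last_lines` helper and the sample text are illustrative only.

```python
import csv
import io

def last_lines(stream):
    """Yield the last line of the first field of every record."""
    # delimiter=';' is an assumption: each quoted block in the sample
    # is followed by a semicolon.
    for row in csv.reader(stream, delimiter=';'):
        if not row or not row[0].strip():
            continue  # skip blank records
        yield row[0].strip().splitlines()[-1].strip()

# Hypothetical sample shaped like the data in the question.
sample = (
    '"► first | 07.02.2019 |  | \n'
    '► archive | 23.03.2019 |  | ";\n'
    '"► first | 19.06.2019 |  | \n'
    '► Archive. | 10.09.2019 |  | ";\n'
)
result = list(last_lines(io.StringIO(sample)))
```

When reading from disk, the csv documentation recommends opening the file with `newline=''` so that newlines embedded in quoted fields are handled correctly.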

Each row yielded by DictReader is a dict whose keys are taken from the first row of the file.

When you call list(row), you get only those keys.

You want to use csv.reader instead of csv.DictReader, which gives you a list for each row.

with open('/Users/gfidarov/Desktop/sheet_one.csv') as f:
    reader = csv.reader(f, delimiter='|')
    for row in reader:
        print(row)
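The difference between the two readers can be seen on a tiny in-memory example. This is a minimal sketch with made-up data (the `name` and `date` columns are not from the question's file):

```python
import csv
import io

# Hypothetical pipe-delimited data with a header row.
data = "name|date\nfirst|07.02.2019\nsecond|08.02.2019\n"

dict_rows = list(csv.DictReader(io.StringIO(data), delimiter='|'))
list_rows = list(csv.reader(io.StringIO(data), delimiter='|'))

# list() over a DictReader row returns only the keys (the header names).
keys_only = list(dict_rows[0])
# The actual cell values have to be requested explicitly.
values = list(dict_rows[0].values())
```

Here `csv.reader` returns the header as an ordinary first row, while `csv.DictReader` consumes it and uses it as the keys of every following row.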

Also, as @BergeBallesta said, the double quotes cause the error, so you need to use a text editor to find and replace the " and ; characters so that the csv module can read the file properly.
