
How to read every column of a csv file in python after every 10-15 rows which have the same header using pandas or csv?


I have a CSV file containing data like the following.

Column 1| Car Make  |Mazda      |Car Model  |123456|
Customer| fname     |lname      | phone     |      |        
1       |abcdef     |iiiiii     |123456454                  
2       |eddfga     |jjjjjj     |123434325                  
3       |phadfa     |kkkk       |124141414                  
4       |adfkld     |llllll     |567575575                  
5       |dafadf     |mmmm       |898979778                  
                                
Column 2| Car Make| Nissan| Car Model| 789471|
Customer| fname   |lname  |phone                    
1       |gjjgjd   |pwldd  |7123097109                   
2       |rbnhf    |ggggg  |9827394829                   
3       |plgrd    |hhhhh  |9210313813                   
4       |ghurf    |jjjjj  |9876548311                   
5       |mjydd    |kkkkk  |8887654625                   
6       |kiure    |llllll |7775559994                   
7       |wriok    |mmmmm  |9993338881                   
8       |stije    |nnnnnn |1110002223                   
9       |ahlkg    |bbbbb  |7778889995                   
10      |cdegf    |vvvvv    |9993331999     

        

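I need to read the CSV file (I am using pandas) and combine all the columns, so that each customer's name and phone number sits next to the corresponding car make and model. The output should look like the file below.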

Column   |Customer| fname|  lname|  phone|  Car Make|   Car Model|
Column 1|   1|  abcdef| iiiiii| 123456454|  Mazda|  123456|
Column 1|   2|  eddfga| jjjjjj| 123434325|  Mazda|  123456|
Column 1|   3|  phadfa| kkkk|   124141414|  Mazda|  123456|
Column 1|   4|  adfkld| llllll| 567575575|  Mazda|  123456|
Column 1|   5|  dafadf| mmmm|   898979778|  Mazda|  123456|
Column 2|   1|  gjjgjd| pwldd|  7123097109| Nissan| 789471|
Column 2|   2|  rbnhf|  ggggg|  9827394829| Nissan| 789471|
Column 2|   3|  plgrd|  hhhhh|  9210313813| Nissan| 789471|
Column 2|   4|  ghurf|  jjjjj|  9876548311| Nissan| 789471|
Column 2|   5|  mjydd|  kkkkk|  8887654625| Nissan| 789471|
Column 2|   6|  kiure|  llllll| 7775559994| Nissan| 789471|
Column 2|   7|  wriok|  mmmmm|  9993338881| Nissan| 789471|
Column 2|   8|  stije|  nnnnnn| 1110002223| Nissan| 789471|
Column 2|   9|  ahlkg|  bbbbb|  7778889995| Nissan| 789471|
Column 2|   10| cdegf|  vvvvv|  9993331999| Nissan| 789471|

Below is the code I have tried. I can't figure out how to go on to iterate over Column 2 and pick up its details after parsing Column 1.

import pandas as pd

hdf = pd.read_csv("file.csv", nrows=0)
first = list(hdf)
df = pd.read_csv("file.csv", skiprows=1) 
df.columns = ['A','B','C','D']
out_df = pd.DataFrame(columns=['Column', 'Customer', 'fname', 'lname', 'phone', 'make', 'model'])
out_df.Customer = i.A
out_df.fname = i.B
out_df.lname = i.C
out_df.phone = i.D
out_df.Column = first[0]
out_df.make = first[2]
out_df.model = first[4]
out_df.to_csv('new_file.csv')

And this is where I am stuck. Are there any similar questions where I could find ideas on how to get the expected result?

Any help is much appreciated.

Thanks!

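I would use csv.DictReader() to read the rows from the file, check whether the first column is "Customer" or "Column x", and collect the values in a list accordingly. Finally, convert the list into the desired DataFrame.

Input CSV file test.csv: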


Code:

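import csv

import pandas as pd

rows = []
with open('test.csv', 'rt', newline='') as f:
    # The file has no single header row, so name the five fields A-E ourselves.
    for row in csv.DictReader(f, fieldnames=['A', 'B', 'C', 'D', 'E']):
        if row['A'].startswith('Column') and row['B'] == 'Car Make' and row['D'] == 'Car Model':
            # Block header: remember its label, make and model for the rows that follow.
            column = row['A']
            make = row['C']
            model = row['E']
            customer = fname = lname = phone = ''
        elif row['A'] != 'Customer':
            # Data row: anything that is not the repeated "Customer" header.
            customer = row['A']
            fname = row['B']
            lname = row['C']
            phone = row['D']

        # Header rows leave fname empty, so only data rows are appended.
        if fname != '':
            rows.append([column, customer, fname, lname, phone, make, model])

df = pd.DataFrame(rows, columns=['Column', 'Customer', 'fname', 'lname', 'phone', 'Car Make', 'Car Model'])
print(df)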

Result:

      Column Customer   fname   lname       phone Car Make Car Model
0   Column 1        1  abcdef  iiiiii   123456454    Mazda    123456
1   Column 1        2  eddfga  jjjjjj   123434325    Mazda    123456
2   Column 1        3  phadfa    kkkk   124141414    Mazda    123456
3   Column 1        4  adfkld  llllll   567575575    Mazda    123456
4   Column 1        5  dafadf    mmmm   898979778    Mazda    123456
5   Column 2        1  gjjgjd   pwldd  7123097109   Nissan    789471
6   Column 2        2   rbnhf   ggggg  9827394829   Nissan    789471
7   Column 2        3   plgrd   hhhhh  9210313813   Nissan    789471
8   Column 2        4   ghurf   jjjjj  9876548311   Nissan    789471
9   Column 2        5   mjydd   kkkkk  8887654625   Nissan    789471
10  Column 2        6   kiure  llllll  7775559994   Nissan    789471
11  Column 2        7   wriok   mmmmm  9993338881   Nissan    789471
12  Column 2        8   stije  nnnnnn  1110002223   Nissan    789471
13  Column 2        9   ahlkg   bbbbb  7778889995   Nissan    789471
14  Column 2       10   cdegf   vvvvv  9993331999   Nissan    789471
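
Since the question asks about pandas, the same idea can also be sketched with pandas alone: read every line into generic columns, mark the "Column x" block headers, and forward-fill their values onto the customer rows below them. This is only a sketch under assumptions that are not in the original posts: the helper name flatten_blocks is hypothetical, the file is assumed to be pipe-delimited as displayed in the question (pass sep="," for a comma-separated file such as test.csv), no line is assumed to have more than six fields, and the sample is an abbreviated copy of the question's data.

```python
import io

import pandas as pd

# Abbreviated copy of the question's data; pipe-delimited is an assumption.
SAMPLE = """\
Column 1| Car Make  |Mazda      |Car Model  |123456|
Customer| fname     |lname      | phone     |      |
1       |abcdef     |iiiiii     |123456454
2       |eddfga     |jjjjjj     |123434325

Column 2| Car Make| Nissan| Car Model| 789471|
Customer| fname   |lname  |phone
1       |gjjgjd   |pwldd  |7123097109
"""


def flatten_blocks(source, sep="|"):
    """Hypothetical helper: one row per customer, tagged with its block header."""
    # Six generic names cover the trailing delimiter on the header lines;
    # shorter lines are padded with missing values.
    raw = pd.read_csv(source, sep=sep, header=None, names=list("ABCDEF"),
                      dtype=str, keep_default_na=False, index_col=False)
    # Normalise: missing fields become "", surrounding whitespace is removed.
    raw = raw.fillna("").apply(lambda col: col.str.strip())

    is_block = raw["A"].str.startswith("Column") & raw["B"].eq("Car Make")
    is_header = raw["A"].eq("Customer")

    # Take label, make and model from the block headers, then carry them
    # forward onto every following row.
    meta = raw.loc[is_block, ["A", "C", "E"]].rename(
        columns={"A": "Column", "C": "Car Make", "E": "Car Model"})
    meta = meta.reindex(raw.index).ffill()

    # Customer rows are whatever remains and has a first name.
    data = raw.loc[~is_block & ~is_header & raw["B"].ne(""), ["A", "B", "C", "D"]]
    data = data.rename(
        columns={"A": "Customer", "B": "fname", "C": "lname", "D": "phone"})

    out = meta.join(data, how="inner")
    order = ["Column", "Customer", "fname", "lname", "phone", "Car Make", "Car Model"]
    return out[order].reset_index(drop=True)


df = flatten_blocks(io.StringIO(SAMPLE))
print(df)
```

For a real file, pass its path instead of the io.StringIO object, for example flatten_blocks("file.csv"), and write the result with df.to_csv("new_file.csv", index=False). All values are kept as strings, so phone numbers and model codes keep any leading zeros.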
