
How to read every column of a csv file in python after every 10-15 rows which have the same header using pandas or csv?

I have a CSV file with data like the following.

Column 1| Car Make  |Mazda      |Car Model  |123456|
Customer| fname     |lname      | phone     |      |        
1       |abcdef     |iiiiii     |123456454                  
2       |eddfga     |jjjjjj     |123434325                  
3       |phadfa     |kkkk       |124141414                  
4       |adfkld     |llllll     |567575575                  
5       |dafadf     |mmmm       |898979778                  
                                
Column 2| Car Make| Nissan| Car Model| 789471|
Customer| fname   |lname  |phone                    
1       |gjjgjd   |pwldd  |7123097109                   
2       |rbnhf    |ggggg  |9827394829                   
3       |plgrd    |hhhhh  |9210313813                   
4       |ghurf    |jjjjj  |9876548311                   
5       |mjydd    |kkkkk  |8887654625                   
6       |kiure    |llllll |7775559994                   
7       |wriok    |mmmmm  |9993338881                   
8       |stije    |nnnnnn |1110002223                   
9       |ahlkg    |bbbbb  |7778889995                   
10      |cdegf    |vvvvv    |9993331999     

        

I need to read the CSV file (I'm using pandas) and combine all the blocks, so that each customer's name and phone number is listed alongside the corresponding car make and model. The output should look like this:

Column   |Customer| fname|  lname|  phone|  Car Make|   Car Model|
Column 1|   1|  abcdef| iiiiii| 123456454|  Mazda|  123456|
Column 1|   2|  eddfga| jjjjjj| 123434325|  Mazda|  123456|
Column 1|   3|  phadfa| kkkk|   124141414|  Mazda|  123456|
Column 1|   4|  adfkld| llllll| 567575575|  Mazda|  123456|
Column 1|   5|  dafadf| mmmm|   898979778|  Mazda|  123456|
Column 2|   1|  gjjgjd| pwldd|  7123097109| Nissan| 789471|
Column 2|   2|  rbnhf|  ggggg|  9827394829| Nissan| 789471|
Column 2|   3|  plgrd|  hhhhh|  9210313813| Nissan| 789471|
Column 2|   4|  ghurf|  jjjjj|  9876548311| Nissan| 789471|
Column 2|   5|  mjydd|  kkkkk|  8887654625| Nissan| 789471|
Column 2|   6|  kiure|  llllll| 7775559994| Nissan| 789471|
Column 2|   7|  wriok|  mmmmm|  9993338881| Nissan| 789471|
Column 2|   8|  stije|  nnnnnn| 1110002223| Nissan| 789471|
Column 2|   9|  ahlkg|  bbbbb|  7778889995| Nissan| 789471|
Column 2|   10| cdegf|  vvvvv|  9993331999| Nissan| 789471|

Below is the code I have tried. I can't work out how to loop on to Column 2 and fetch its details once Column 1 has been parsed.

import pandas as pd

hdf = pd.read_csv("file.csv", nrows=0)
first = list(hdf)
df = pd.read_csv("file.csv", skiprows=1) 
df.columns = ['A','B','C','D']
out_df = pd.DataFrame(columns=['Column', 'Customer', 'fname', 'lname', 'phone', 'make', 'model'])
out_df.Customer = i.A
out_df.fname = i.B
out_df.lname = i.C
out_df.phone = i.D
out_df.Column = first[0]
out_df.make = first[2]
out_df.model = first[4]
out_df.to_csv('new_file.csv')

This is where I am stuck. Is there a similar question where I can find ideas on how to get the desired result?

Any help is much appreciated.

Thank you!

I would use csv.DictReader() to read the rows from the file, check whether the first column is 'Customer' or 'Column x', and collect the values in a list. Finally, convert the list into the desired DataFrame.

Input CSV file test.csv:

[image not available]

Code:

import csv

import pandas as pd

rows = []
with open('test.csv', 'rt') as f:
    for row in csv.DictReader(f, fieldnames=['A', 'B', 'C', 'D', 'E']):
        if row['A'].startswith('Column') and row['B'] == 'Car Make' and row['D'] == 'Car Model':
            column = row['A']
            make = row['C']
            model = row['E']
            customer = fname = lname = phone = ''
        elif row['A']!='Customer':
            customer = row['A']
            fname = row['B']
            lname = row['C']
            phone = row['D']

        if fname!='':
            rows.append([column, customer, fname, lname, phone, make, model])

df = pd.DataFrame(rows, columns=['Column', 'Customer', 'fname', 'lname', 'phone', 'Car Make', 'Car Model'])
print(df)

Result:

      Column Customer   fname   lname       phone Car Make Car Model
0   Column 1        1  abcdef  iiiiii   123456454    Mazda    123456
1   Column 1        2  eddfga  jjjjjj   123434325    Mazda    123456
2   Column 1        3  phadfa    kkkk   124141414    Mazda    123456
3   Column 1        4  adfkld  llllll   567575575    Mazda    123456
4   Column 1        5  dafadf    mmmm   898979778    Mazda    123456
5   Column 2        1  gjjgjd   pwldd  7123097109   Nissan    789471
6   Column 2        2   rbnhf   ggggg  9827394829   Nissan    789471
7   Column 2        3   plgrd   hhhhh  9210313813   Nissan    789471
8   Column 2        4   ghurf   jjjjj  9876548311   Nissan    789471
9   Column 2        5   mjydd   kkkkk  8887654625   Nissan    789471
10  Column 2        6   kiure  llllll  7775559994   Nissan    789471
11  Column 2        7   wriok   mmmmm  9993338881   Nissan    789471
12  Column 2        8   stije  nnnnnn  1110002223   Nissan    789471
13  Column 2        9   ahlkg   bbbbb  7778889995   Nissan    789471
14  Column 2       10   cdegf   vvvvv  9993331999   Nissan    789471
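
The code above relies on csv.DictReader's default comma delimiter, whereas the sample in the question is pipe-delimited with padded cells. Below is a minimal sketch of the same idea for that layout, assuming the file really uses '|' as the separator. The names parse_blocks, COLUMNS and sample are made up for this illustration and do not come from the original post.

```python
import csv
import io

# Output column order, taken from the desired result in the question.
COLUMNS = ["Column", "Customer", "fname", "lname", "phone", "Car Make", "Car Model"]


def parse_blocks(lines, delimiter="|"):
    """Flatten repeated 'Column N' blocks into one row per customer.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    rows = []
    column = make = model = None
    for raw in csv.reader(lines, delimiter=delimiter):
        cells = [cell.strip() for cell in raw]
        # Skip empty or whitespace-only lines between blocks.
        if not any(cells):
            continue
        # Pad short rows so the positional lookups below cannot fail.
        cells += [""] * (5 - len(cells))
        if cells[0].startswith("Column") and cells[1] == "Car Make":
            # Block header: remember its values for the rows that follow.
            column, make, model = cells[0], cells[2], cells[4]
        elif cells[0] == "Customer":
            continue  # per-block header row, carries no data
        elif column is not None:
            rows.append([column, *cells[:4], make, model])
    return rows


# Shortened copy of the question's data (assumed pipe-delimited).
sample = """\
Column 1| Car Make  |Mazda      |Car Model  |123456|
Customer| fname     |lname      | phone     |      |
1       |abcdef     |iiiiii     |123456454
2       |eddfga     |jjjjjj     |123434325

Column 2| Car Make| Nissan| Car Model| 789471|
Customer| fname   |lname  |phone
1       |gjjgjd   |pwldd  |7123097109
"""

rows = parse_blocks(io.StringIO(sample))
for row in rows:
    print(row)
```

For a real file, pass open('file.csv', newline='') instead of the StringIO object. The returned list can then be turned into a DataFrame with pd.DataFrame(rows, columns=COLUMNS). All values stay strings, so convert the phone or model columns explicitly if numbers are needed.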
