简体   繁体   中英

Python - Iterating through rows of excel files with Pandas

I have problem with iterating through rows in excel file .

import os
import pandas as pd
import json

for file in os.listdir("./python_files"):
    if file.endswith(".xlsx"):
        df = pd.read_excel(os.path.join("./python_files", file)) 
        CRD_Array = df.iloc[:,1].values
        for single_CRD in CRD_Array:           
            with open("{}.json".format(single_CRD), 'w') as json_file:
                row_iterator = 0
                data = {}
                data['header']=[]
                data['header'].append({'Organization CRD#':  '{}'.format(df.iloc[row_iterator,1])})
                json.dump(data, json_file)
                row_iterator = row_iterator + 1
  

          

How u can see my script is

  1. Reading files from python_files folder
  2. Then it's reading second column with CRD number which returns an array of CRDs
  3. Then it's looping CRD array
  4. In that loop It's trying to save .json file with "hedar" field

What I get in output now

File name 172081.json

{"header": [{"Organization CRD#": "172081"}

File name 534123.json

{"header": [{"Organization CRD#": "172081"}

File name 184521.json

{"header": [{"Organization CRD#": "172081"}

I looks like df.iloc [row_iterator, 1] isn't changing row property despite adding +1 for each loop repeat

Can somebody help?

Edit: Excel file example-

在此处输入图像描述

What I want to achieve

File name 172081.json

{"header": [{"Organization CRD#": "172081"}

File name 534123.json

{"header": [{"Organization CRD#": "534123"}

File name 184521.json

{"header": [{"Organization CRD#": "184521"} 

In the for loop you are increasing the row_iterator, but in the first line after open you always set it back to 0. You need to take that line out from the loop. Like this:

row_iterator = 0

for single_CRD in CRD_Array: 
          
    with open("{}.json".format(single_CRD), 'w') as json_file:
       data = {}
       data['header']=[]
       data['header'].append({'Organization CRD#': '{}'.format(df.iloc[row_iterator,1])})
       json.dump(data, json_file)
       row_iterator = row_iterator + 1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM