简体   繁体   中英

Reading multiple excel files from a certain path (with certain sheets names) into a single pandas data frame

How can I read only sheets '2g','3g','4g' from all the excel files exists in a certain path and write them to three different data frames. All 2g sheets should be stored in a pandas data frame called (2g) All 3g sheets should be stored in a pandas data frame called (3g) All 4g sheets should be stored in a pandas data frame called (4g) following the screenshot of the excel files: 在此处输入图像描述

and the sheets I want to read from each excel files as in the screen below:

在此处输入图像描述

You can use read_excel with multiple sheet names:

import pandas as pd

data2g = []
data3g = []
data4g = []

for xlsx in pathlib.Path('mobile').glob('*.xlsx'):
    df2g, df3g, df4g = pd.read_excel(xlsx, sheet_name=['2G', '3G', '4G']).values()
    data2g.append(df2g)
    data3g.append(df3g)
    data4g.append(df4g)

df2g = pd.concat(data2g, ignore_index=True)
df3g = pd.concat(data3g, ignore_index=True)
df4g = pd.concat(data4g, ignore_index=True)

As you can see, you can improve readability using loops:

import pandas as pd
from collections import defaultdict

data = defaultdict(list)
for xlsx in pathlib.Path('mobile').glob('*.xlsx'):
    out = pd.read_excel(xlsx, sheet_name=['2G', '3G', '4G'])
    for k, v in out.items():
        data[k].append(v)
df2g, df3g, df4g = [pd.concat(data[k], ignore_index=True) for k in data]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM