
How can I read multiple gzipped Excel files (.gz) located in different subfolders into DataFrames at once in Python?

I am new to Python and have run into a problem. I have multiple gzipped (.gz) Excel files in different subfolders, and I am trying to read all of them into DataFrames using os.listdir, but it only returns the file names. I need to read the files themselves into DataFrames so that I can perform operations on them. Any help would be appreciated. This is what I used:

import os
from pathlib import Path

path = "/Users/admin/Desktop/Data"

df = os.listdir(path)

for path in Path("path/to/dir").rglob("*.gz"):
    print(path)

The code above only gives me the names of the files, not the actual CSV files that I want to operate on. Help would be really appreciated.

If you have Python 3.4 or later, you can use the pathlib module to apply a recursive glob pattern. Here, rglob("*.gz") iterates over every file, in any subdirectory of the given path, whose suffix is .gz. Each result is a Path object that starts with the root you passed in, so it is absolute only if that root is absolute (call path.resolve() if you need an absolute path). You can also cast a Path to a string with str(path) if you need one.

from pathlib import Path

for path in Path("path/to/dir").rglob("*.gz"):
    print(path)
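
If you would rather have the matches in a list than print them one by one, a small helper along these lines might do. This is only a sketch: find_gz_files is a hypothetical name that does not come from pathlib, and resolve() is used here to turn each match into an absolute path.

```python
from pathlib import Path


def find_gz_files(root):
    """Return a sorted list of absolute paths to every .gz file under root.

    find_gz_files is a hypothetical helper name used for illustration.
    """
    # rglob("*.gz") walks root and all of its subdirectories.
    # is_file() skips any directory that happens to be named "something.gz".
    return sorted(p.resolve() for p in Path(root).rglob("*.gz") if p.is_file())
```

Calling find_gz_files("/Users/admin/Desktop/Data") would then give you a list of Path objects, and [str(p) for p in ...] converts them to plain strings if a library needs those.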

EDIT - A more concrete example:

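import gzip
from pathlib import Path

root_path = "C:/users/admin/desktop/data"

for path in Path(root_path).rglob("*.gz"):
    print("Current path:", path)
    # A .gz file is compressed, so open it with gzip rather than path.open()
    with gzip.open(path, "rt") as gzip_file:
        # ... do things with gzip_file (it yields decompressed text lines)
        pass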
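
Since the goal in the question is a DataFrame, here is a rough sketch of how the same recursive glob could feed pandas. It assumes the .gz files are gzip-compressed CSV files (the question mentions CSV) that share the same columns. load_gz_csvs and the source_file column are hypothetical names added for illustration.

```python
from pathlib import Path

import pandas as pd


def load_gz_csvs(root):
    """Read every gzip-compressed CSV under root into one DataFrame.

    Assumes all files are CSVs with compatible columns (an assumption,
    not something the question confirms).
    """
    frames = []
    for path in sorted(Path(root).rglob("*.gz")):
        # compression="gzip" tells pandas to decompress while reading.
        frame = pd.read_csv(path, compression="gzip")
        # Hypothetical extra column recording which file each row came from.
        frame["source_file"] = str(path)
        frames.append(frame)
    if not frames:
        raise FileNotFoundError(f"no .gz files found under {root}")
    # Stack the per-file frames into a single DataFrame with a fresh index.
    return pd.concat(frames, ignore_index=True)
```

With this, df = load_gz_csvs("C:/users/admin/desktop/data") would give one combined DataFrame. If you would rather keep the files separate, you could store each frame in a dict keyed by path instead of concatenating. If the compressed files are real Excel workbooks rather than CSVs, pandas.read_excel on the decompressed data would be the function to reach for instead of read_csv.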
