How to read an .xpt file from an Azure ADLS blob container and convert it to CSV

I downloaded an .xpt file from a URL to the blob container using a Databricks Python notebook.

In the code below, 'example.xpt' is a local file. How can I read the .xpt file from the blob container instead?

import xport.v56

with open('example.xpt', 'rb') as f:
    library = xport.v56.load(f)

Appreciate any inputs. Thanks!

Assuming you have already installed the xport library on your cluster and mounted your ADLS blob container, follow the steps below:

  • Use the same code, but point the path at the .xpt file in your blob container:
import xport.v56

with open('/dbfs/mnt/repro/ALQY_F.XPT', 'rb') as f:
    # '/dbfs/mnt/repro/' is the mount point, i.e. the ADLS blob container.
    library = xport.v56.load(f)
  • The library object is of type <class 'xport.v56.Library'>. It has a values() method that returns an iterable of the members it contains.

  • Use the following code to write the required data to CSV at the destination you specify:

for data in library.values():
    print(type(data))  # <class 'xport.v56.Member'>
    print(dir(data))   # lists every attribute available on this object
    # Writes the member as CSV to your blob container.
    # Every iteration uses the same path, so a later member overwrites an earlier one.
    data.to_csv("/dbfs/mnt/repro/op.csv")
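If the .xpt file holds more than one member, the loop above writes each of them to the same op.csv, so only the last one survives. Here is a minimal sketch of one way around that. It assumes the library behaves like a mapping of member name to member (so that .items() works) and that each member accepts pandas-style to_csv arguments such as index=False. The helper names csv_path_for, export_members and convert_xpt are hypothetical and not part of xport.

```python
import os


def csv_path_for(member_name, out_dir):
    """Build a destination path like <out_dir>/<member_name>.csv."""
    # Replace characters that are awkward in file names with underscores.
    safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in str(member_name))
    return os.path.join(out_dir, (safe or "member") + ".csv")


def export_members(library, out_dir):
    """Write each member of a mapping-like library to its own CSV file."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    # Assumption: the library maps member names to objects with a to_csv method.
    for name, data in library.items():
        target = csv_path_for(name, out_dir)
        data.to_csv(target, index=False)  # assumption: pandas-style signature
        written.append(target)
    return written


def convert_xpt(xpt_path, out_dir):
    """Load an .xpt file from a mounted path and export every member as CSV."""
    import xport.v56  # imported here so the helpers above work without xport

    with open(xpt_path, "rb") as f:
        library = xport.v56.load(f)
    return export_members(library, out_dir)
```

With the mount from the answer, a call might look like convert_xpt('/dbfs/mnt/repro/ALQY_F.XPT', '/dbfs/mnt/repro/csv'), where the csv output folder is only an example name.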

