How to import .dta via pandas and describe data?

I am new to Python and have a simple problem. First, I want to load some sample data I created in Stata. Second, I would like to describe the data in Python, that is, get a list of the imported variable names. So far I've done this:

from pandas.io.stata import StataReader

reader = StataReader('sample_data.dta')
data = reader.data()

dir()

I get the following warning:

anaconda/lib/python3.5/site-packages/pandas/io/stata.py:1375: UserWarning: 'data' is deprecated, use 'read' instead
  warnings.warn("'data' is deprecated, use 'read' instead")

What does it mean and how can I resolve the issue? Also, is dir() the right way to see which variables I have in the data?

Reading a Stata file with pandas.io.stata.StataReader.data has been deprecated since pandas 0.18.1, which is why you are getting that warning. It is a warning rather than an error, so the script does not stop there.

Instead, use pandas.read_stata to read the file, as shown:

import pandas as pd

df = pd.read_stata('sample_data.dta')
df.dtypes    # lists each imported variable (column) name with its dtype
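To answer the second part of the question: dir() called without arguments only lists the names defined in the current Python scope, not the variables stored in the dataset. A minimal sketch of how the variable names (and Stata variable labels) could be listed is below. It assumes a small stand-in file written with DataFrame.to_stata so the example runs on its own; the column names and labels are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for sample_data.dta so the sketch is self-contained.
sample = pd.DataFrame({"price": [4099.0, 4749.0, 3799.0], "mpg": [22, 17, 22]})
labels_in = {"price": "Price in USD", "mpg": "Mileage (mpg)"}

path = os.path.join(tempfile.mkdtemp(), "sample_data.dta")
sample.to_stata(path, write_index=False, variable_labels=labels_in)

# Load the file and list the imported variable names.
df = pd.read_stata(path)
variable_names = list(df.columns)

# Variable labels are not stored on the DataFrame; read them via the reader.
with pd.read_stata(path, iterator=True) as reader:
    variable_labels = reader.variable_labels()

print(variable_names)
print(variable_labels)
```

With a real file, only the last two steps are needed: pd.read_stata('sample_data.dta') followed by df.columns, plus reader.variable_labels() if the Stata labels matter.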

Sometimes this did not work for me, especially when the dataset is large. In that case I suggest a two-step workaround, first in Stata and then in Python.

In Stata, run the following command:

export excel Cevdet.xlsx, firstrow(variables)

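To export the variable labels as well, run the following (preserve is needed so that restore can bring the original data back after describe, replace overwrites it):

preserve
    describe, replace
    list
    export excel using myfile.xlsx, replace firstrow(variables)
restore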

This generates two files: Cevdet.xlsx (the data) and myfile.xlsx (the variable descriptions and labels).

Now go to your Jupyter notebook:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_excel('Cevdet.xlsx')

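This reads the data file into Jupyter (Python 3); myfile.xlsx, which holds the variable labels, can be read the same way with pd.read_excel.

My advice is to save the DataFrame as a pickle afterwards, especially if it is big, so it does not have to be parsed from Excel again: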

df.to_pickle('Cevdet')

The next time you open Jupyter, you can simply run:

df=pd.read_pickle("Cevdet")
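If the only problem is that the .dta file is large, an alternative to the Excel round trip might be to let pandas read it in pieces via the chunksize argument of pd.read_stata. The sketch below is an illustration under assumptions, not a tested fix for any particular file: the file name, column names, and chunk size are hypothetical, and a tiny file is generated first so the code runs on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for a large .dta file.
big = pd.DataFrame({"id": range(10), "value": [x * 0.5 for x in range(10)]})
path = os.path.join(tempfile.mkdtemp(), "big_sample.dta")
big.to_stata(path, write_index=False)

# Read the file a few rows at a time instead of all at once.
chunks = []
with pd.read_stata(path, chunksize=4) as reader:
    for chunk in reader:
        # Each chunk is an ordinary DataFrame; filter or aggregate here
        # if the full dataset does not need to be kept in memory.
        chunks.append(chunk)

chunk_sizes = [len(c) for c in chunks]
df = pd.concat(chunks, ignore_index=True)
print(chunk_sizes, df.shape)
```

pd.read_stata also accepts a columns argument, so loading only the variables that are actually needed is another way to reduce memory use. The combined DataFrame can then be saved with df.to_pickle as described above.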
