简体   繁体   中英

Issue while reading SQL output into SAS dataset using Python

I am trying to read SQL query output and create a SAS dataset using Python pandas. My SQL query is executing properly but only importing column names into SAS dataset due to data shifting. It is an encoding issue and I wanted some guidance on how to fix this. Below is my code:

with open ('Query.sql', mode = 'r', encoding='utf-8-sig') as sql_input:
    sql_input = sql_input.read()

My_string = str('mssql+pyodbc://USER:Pass' + server + '/' + database + '?driver=SQL+Server+Native+Client+11.0')
engine = My_engine(My_string) 

df = pd.read_sql_query(sql_input,engine)

Viya = saspy.SASsession(omruser='User', omrpw='PWD', cfgfile='cfgfile.py')
Viya.submit('libname lib "path";') 
Viya.df2sd(df=df3, table='Test', libref='lib') 

and then I create SAS Viya Session and submit with the path and dataset name. When I check SAS dataset, I see only columns created.

Error: INPUT statement exceeded record length. INFILE CARDS OPTION STOPOVER specified.

Any suggestions?

First, I'm going to guess that you might have some embedded newline characters in the data.

See this github issue which has the same error message as you have. (Also note the detail in the initial post - this is about what your initial post should have looked like!)

If you have at least SASPy version 3.1.4, you should have an option embedded_newlines=true you can set, the example code:

sas.dataframe2sasdata(df=problematic_data,table='via_sas_test_prob',libref='OUT', embedded_newlines=True)

(that is an alias for df2sd , by the way). Hopefully that works in Viya; SASPy mostly runs base (MVA) SAS, so I don't know what things are different in Viya (and don't have Viya). If you don't have 3.1.4 or newer, then update your SASPy.

Another option would be to remove the newline ( \n ) characters in Python before transferring the data over. That's what the user in the github issue does.

Second, if you're running things in Viya, consider using python-SWAT , which is the SAS Viya scripting library in Python. It submits things as CAS actions, which might be better for you.

Finally, a third option would be to run the database connection piece in SAS. Unless you didn't license ACCESS to ODBC, of course, but if you do license that, then I'd consider moving that piece to SAS and just telling SAS to input the file itself.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM