
Error: Unable to load time variables with missing values in python using pyreadr package from RData file

I want to run some Python functions on data from an '.RData' file, and I am using the 'pyreadr' Python package to load it.

Here is an example of the R code:

library(data.table)

# Creating demo data frame
data <- data.table(x_time = c(Sys.time(),Sys.time()+1,Sys.time()+2))
data_missing <- data.table(x_time = c(Sys.time(),NA,NA))

# checking the classes
sapply(data,class)
sapply(data_missing,class)

# Storing the data in RData file 
save(data, file = "test_data.RData")
save(data_missing, file = "test_missing_data.RData")

I am storing the data in two separate files because 'test_data.RData' loads successfully in Python, whereas 'test_missing_data.RData' gives an error.

Here is the Python code:

##  Working demo
# import pyreadr
# result=pyreadr.read_r('test_data.RData')
# data=result['data']
# data.dtypes
# print(data)

### Error in the code below

import pyreadr
result=pyreadr.read_r('test_missing_data.RData') # Error 
data=result['data']
data.dtypes
print(data)

The error message is as below:

C:\Users\Pawan\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\pandas\core\tools\datetimes.py:530: RuntimeWarning: invalid value encountered in multiply
  arr, tz_parsed = tslib.array_with_unit_to_datetime(arg, unit, errors=errors)

The error occurs when there are NA values in the data frame. Is there another way to load RData files in Python?

Thank you for your time and help.

It is not an error but a warning, which means it is probably not affecting your results. After running your R code, I can read the RData files without issue. Note, however, that your Python code uses the wrong name for the data frame:

import pyreadr
result=pyreadr.read_r('test_missing_data.RData') # No error, just warning
# Your data frame is called data_missing, not data, because that is the name you saved it under in your R code.
# I think this is what you are doing wrong.
# Check result.keys() to see what you have if you are not sure.
data=result['data_missing']
data.dtypes
#x_time    datetime64[ns]
#dtype: object
print(data)
#                       x_time
#0 2022-08-03 09:37:55.963370752
#1                           NaT
#2                           NaT

# Looks correct to me
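
If it helps, the two points above (look the data frame up under the name it was saved with, and treat the RuntimeWarning as a warning rather than a failure) can be sketched with two small helpers. This is only an illustration: pick_frame and call_without_runtime_warnings are hypothetical names made up for this example and are not part of pyreadr. The sketch relies only on the fact that pyreadr.read_r returns a dictionary-like object keyed by R object name.

```python
import warnings


def pick_frame(result, name):
    """Return result[name], or raise a KeyError listing the names actually present.

    `result` is any dict-like mapping of object names to data frames,
    such as the object returned by pyreadr.read_r.
    """
    if name not in result:
        available = ", ".join(sorted(result)) or "<none>"
        raise KeyError(f"{name!r} not found; objects in file: {available}")
    return result[name]


def call_without_runtime_warnings(func, *args, **kwargs):
    """Call func while ignoring RuntimeWarning; real exceptions still propagate."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        return func(*args, **kwargs)


# Possible usage with pyreadr (assumes the test_missing_data.RData file
# from the question is in the working directory):
#
# import pyreadr
# result = call_without_runtime_warnings(pyreadr.read_r, "test_missing_data.RData")
# data = pick_frame(result, "data_missing")
```

Silencing the warning is optional; it only hides the message and does not change the values that are read, so the NA entries should still come back as NaT.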
