Problems interacting with RDS files in Python

Question

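I want to open an RDS file of fish lengths in Python, convert it into a list of tuples sorted by size and location class, run basic statistics, generate plots, and save the results to a .CSV file. I am using pyreadr and can read the file as an OrderedDict (verified with type(result), which gives <class 'collections.OrderedDict'>), but I cannot print a single row or a single column. I can print the whole dataset, but I have no control over it.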

import pyreadr
myfile ='C:\\Users\\Tim\\Downloads\\fishData.RDS'
result = pyreadr.read_r(myfile)
print(result.keys())
df1=result[None]
print(df1)

My output:

odict_keys([None])
OrderedDict([(None,       size  fishLength     location
0      fry   10.420310  mainChannel
1      fry    9.165523  mainChannel
2      fry    7.005817  mainChannel
3      fry    7.199168   floodPlain
4      fry    3.392063  mainChannel
..     ...         ...          ...
173  smolt   31.765081   floodPlain
174  smolt   32.573470   floodPlain
175  smolt   31.204408  mainChannel
176  smolt   30.948726   floodPlain
177  smolt   28.414746  mainChannel

[178 rows x 3 columns])])

I get my data, but when I use

len(result)
1

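Everything sits inside one giant item, and I cannot work out how to get at the actual length data to process it. I need help accessing individual rows so I can export them to .CSV.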

Answer 1

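When you use the read_r function, it returns a dictionary whose keys are the names of the objects. An RDS file is a single serialized R object (unlike an RData file, which can store several R objects), so the dictionary contains only one object, and its key is None. Here is a simple example.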

R code

df <- data.frame(x=11:20,
                 y=sin(1:10),
                 z=rep(c('foo', 'bar'), each=5)
)

saveRDS(df, 'file.rds')

Python code

import pyreadr

result = pyreadr.read_r('file.rds')
result[None]

    x         y    z
0  11  0.841471  foo
1  12  0.909297  foo
2  13  0.141120  foo
3  14 -0.756802  foo
4  15 -0.958924  foo
5  16 -0.279415  bar
6  17  0.656987  bar
7  18  0.989358  bar
8  19  0.412118  bar
9  20 -0.544021  bar
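The difference between the container and the object stored in it can be sketched without pyreadr at all. The snippet below is only an illustration: `table` is a hypothetical stand-in (a plain list of rows rather than a pandas DataFrame), and `result` mimics the single-entry mapping with a None key described above.

```python
from collections import OrderedDict

# Hypothetical stand-in for what read_r gives back for an .rds file:
# a mapping with exactly one entry whose key is None. The value is a plain
# list of rows here (not a DataFrame) so the sketch needs only the stdlib.
table = [(11, "foo"), (12, "foo"), (16, "bar")]
result = OrderedDict([(None, table)])

container_len = len(result)                 # number of entries in the mapping
only_object = next(iter(result.values()))   # the single stored object
table_len = len(only_object)                # number of rows in the table

print(container_len, table_len)
```

Taking the first value with `next(iter(result.values()))` is an alternative to `result[None]` that does not depend on knowing the key.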

You are still applying the len function to the dictionary. Save the actual data frame to a new object and work on that instead. For example:

In [2]: df = pyreadr.read_r("file.rds")[None]

In [3]: len(df)
Out[3]: 10

In [4]: df['x']*42
Out[4]: 
0    462
1    504
2    546
3    588
4    630
5    672
6    714
7    756
8    798
9    840
Name: x, dtype: int32
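Once the data frame is in its own variable, the tasks from the question (single rows and columns, a sorted list of tuples, basic statistics, CSV export) are ordinary pandas operations. The following is a minimal sketch, not a definitive solution: it assumes pandas is installed, and `df` is a small stand-in built from five rows copied out of the question's output instead of the real `pyreadr.read_r(myfile)[None]`. The variable names are illustrative.

```python
import io

import pandas as pd

# Stand-in for pyreadr.read_r(myfile)[None]: five rows copied from the
# output shown in the question, with the same column names.
df = pd.DataFrame(
    {
        "size": ["fry", "fry", "fry", "smolt", "smolt"],
        "fishLength": [10.420310, 9.165523, 7.199168, 31.765081, 28.414746],
        "location": ["mainChannel", "mainChannel", "floodPlain",
                     "floodPlain", "mainChannel"],
    }
)

first_row = df.iloc[0]        # a single row, returned as a Series
lengths = df["fishLength"]    # a single column, returned as a Series

# Sort by size class, then location, then length, and convert to plain tuples.
sorted_df = df.sort_values(["size", "location", "fishLength"])
rows = list(sorted_df.itertuples(index=False, name=None))

# Basic statistics of fishLength for each (size, location) group.
stats = df.groupby(["size", "location"])["fishLength"].agg(
    ["count", "mean", "min", "max"]
)

# Write the statistics as CSV. An in-memory buffer is used here;
# passing a file path to to_csv works the same way.
buffer = io.StringIO()
stats.to_csv(buffer)
csv_text = buffer.getvalue()

print(rows[0])
print(csv_text)
```

One detail specific to this dataset: the column named size has to be accessed as `df["size"]`, because `df.size` is a built-in DataFrame attribute (the number of elements), not the column.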
