Accessing a value from Dask using .loc

For the life of me, I can't figure out how to combine these two dataframes. I am using the latest versions of all software, including Python, Pandas and Dask.

#pandasframe has 10k rows and 3 columns - 
['monkey','banana','furry']

#daskframe has 1.5m rows, 1 column, 135 partitions - 
row.index: 'monkey_banana_furry'
row.mycolumn = 'happy flappy tuna' 

My dask dataframe has a string index, but when I do daskframe.loc[index_str] it returns a dask dataframe, whereas I thought it was supposed to return one single specific row. I don't know how to access the row/value that I need from that dataframe. What I want is to input the index and get back one specific value.

What am I doing wrong?

Even pandas.DataFrame.loc doesn't return a scalar if you don't specify a label for the columns.
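To illustrate the point in plain pandas, here is a minimal sketch. The frame below is made up for illustration: it only assumes a string index and a single column called mycolumn, as described in the question.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: string index, one column.
df = pd.DataFrame(
    {"mycolumn": ["sleepy sloth", "happy flappy tuna"]},
    index=["ape_mango_smooth", "monkey_banana_furry"],
)

# Row label only: pandas returns the whole row as a Series, not a scalar.
row = df.loc["monkey_banana_furry"]

# Row label plus column label: pandas returns the single cell value.
value = df.loc["monkey_banana_furry", "mycolumn"]

print(type(row).__name__)
print(value)
```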

Anyway, to get a scalar in your case, you first need to call dask.dataframe.DataFrame.compute so that you get a pandas dataframe (since dask.dataframe.DataFrame.loc returns a dask dataframe). Only then can you use the pandas .loc.

Assuming dfd is your dask dataframe and "mycolumn" is the label of its single column, try this:

dfd.loc[index_str].compute().loc[index_str, "mycolumn"]

Or this:

dfd.loc[index_str, "mycolumn"].compute().iloc[0]
