
Retrieving information from a pandas dataframe in python

I have a pandas dataframe:

   Time(s)  RARb relative signal  Rescaled_CRABPII  atRA  RARa_tet  RARg_tet
0        0                     0          0.000000   100         0         0
1     7200                    20          0.000000   100         0         0
2    14400                    50         11.764706   100         0         0
3    21600                    90         58.823529   100         0         0
4    43200                   100        100.000000   100         0         0
5    50400                   100        105.882353   100         0         0
6    64800                   100        117.647059   100         0         0

How can I retrieve the value of RARb relative signal at df['Time(s)']==43200?

Assuming df is your dataframe, you can do:

a = df[df['Time(s)']==43200]['RARb relative signal']
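Note that a boolean-mask selection like the one above gives back a Series, not a bare number. A minimal sketch of getting from that Series to the scalar, assuming the frame from the question (only the first two columns are rebuilt here, and the names matches and value are just for illustration):

```python
import pandas as pd

# Rebuild the first two columns of the frame from the question.
df = pd.DataFrame({
    "Time(s)": [0, 7200, 14400, 21600, 43200, 50400, 64800],
    "RARb relative signal": [0, 20, 50, 90, 100, 100, 100],
})

# Boolean-mask selection returns a Series (one row here), not a scalar.
matches = df.loc[df["Time(s)"] == 43200, "RARb relative signal"]

# Take the first match as a plain value; raises IndexError if nothing matched.
value = matches.iloc[0]
print(value)
```

If several rows could share the same time, you would probably want to check len(matches) before taking the first one.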

First, if you keep the df as is, you should use .iat[] or .at[] instead for scalar getting and setting (see the pandas indexing documentation for examples of both). Both accessors take a single row position or label rather than a boolean mask, so look up the matching row first. The .iat version (which is the faster of the two) would be:

row = (df['Time(s)']==43200).to_numpy().nonzero()[0][0]  # position of the first matching row
val = df.iat[row, 1]

The column number is 1 because 'RARb relative signal' is the second column. If you're not sure what the column position will be, you can either get that too with get_loc() or just use .at[] with label-based indexing:

row = df.index[df['Time(s)']==43200][0]  # label of the first matching row
val = df.at[row, 'RARb relative signal']
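For the get_loc() route mentioned above, something along these lines should work. This is only a sketch on a two-column copy of the question's frame; row_pos, col_pos, val_iat and val_at are illustrative names, and the use of numpy's flatnonzero to find the row position is my own choice, not the only way to do it:

```python
import numpy as np
import pandas as pd

# Two-column copy of the frame from the question.
df = pd.DataFrame({
    "Time(s)": [0, 7200, 14400, 21600, 43200, 50400, 64800],
    "RARb relative signal": [0, 20, 50, 90, 100, 100, 100],
})

# Integer position of the column, looked up by name.
col_pos = df.columns.get_loc("RARb relative signal")

# Integer position of the first row whose time equals 43200.
row_pos = int(np.flatnonzero(df["Time(s)"].to_numpy() == 43200)[0])

# Position-based scalar access.
val_iat = df.iat[row_pos, col_pos]

# Label-based scalar access: translate the position into its row label first.
val_at = df.at[df.index[row_pos], "RARb relative signal"]
```

Both lookups land on the same cell; .iat works purely on positions, while .at works on the row and column labels.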

If you actually have a multiindex, pass a tuple with all the multiindex levels instead of the single label.
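To illustrate the multiindex case, here is a small made-up frame. The "replicate" level and the numbers in it are invented purely for the example and are not from the question:

```python
import pandas as pd

# Hypothetical two-level index; the "replicate" level and values are invented.
index = pd.MultiIndex.from_tuples(
    [("A", 21600), ("A", 43200), ("B", 21600), ("B", 43200)],
    names=["replicate", "Time(s)"],
)
mdf = pd.DataFrame({"RARb relative signal": [90, 100, 85, 95]}, index=index)

# The row label is a tuple with one entry per index level.
val = mdf.at[("A", 43200), "RARb relative signal"]
```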

Really though, why don't you make the time in seconds the index? You could still use integer-based indexing to access items, and this approach seems to make much more sense for your data. First, set the index:

df.index = df['Time(s)']

You may then wish to delete the 'Time(s)' column; that's up to you. Anyway, with the new index you can just do:

val = df.at[43200, 'RARb relative signal'] 
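If you would rather set the index and drop the column in one step, set_index() does both. A quick sketch on a two-column copy of the question's frame (by_time is just a name I picked):

```python
import pandas as pd

# Two-column copy of the frame from the question.
df = pd.DataFrame({
    "Time(s)": [0, 7200, 14400, 21600, 43200, 50400, 64800],
    "RARb relative signal": [0, 20, 50, 90, 100, 100, 100],
})

# set_index moves 'Time(s)' out of the columns and into the index.
by_time = df.set_index("Time(s)")

# Label-based scalar lookup using the time value directly.
val = by_time.at[43200, "RARb relative signal"]

# Position-based access still works on the re-indexed frame.
same_val = by_time.iat[4, 0]
```

This relies on the time values being unique; with duplicate times, .at would no longer return a single scalar.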

If you're doing these sorts of value retrievals a lot, re-indexing and using .iat[] or .at[] can make a big performance difference.
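If you want to check the difference on your own machine, a rough timing sketch like the following could be used. The function names and the repeat count of 1000 are arbitrary choices of mine, and the actual numbers will vary with hardware and pandas version:

```python
import timeit

import pandas as pd

# Two-column copy of the frame from the question.
df = pd.DataFrame({
    "Time(s)": [0, 7200, 14400, 21600, 43200, 50400, 64800],
    "RARb relative signal": [0, 20, 50, 90, 100, 100, 100],
})
by_time = df.set_index("Time(s)")


def with_mask():
    # Boolean mask on the original frame, then take the first match.
    return df.loc[df["Time(s)"] == 43200, "RARb relative signal"].iloc[0]


def with_at():
    # Direct scalar lookup on the time-indexed frame.
    return by_time.at[43200, "RARb relative signal"]


mask_seconds = timeit.timeit(with_mask, number=1000)
at_seconds = timeit.timeit(with_at, number=1000)
print(f"mask + iloc: {mask_seconds:.4f} s, .at: {at_seconds:.4f} s")
```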


 