Working with columns and rows in Python pandas
I am trying to collect data from pandas DataFrames. The screenshot shows part of how the database is built.
For rows with the same hhid I want to analyze other columns and compute the away time. I want to select the first "from home" row and read its start value; this value should then not be overwritten again for the same hhid. After that I want the end value of the last "to home" entry, and then compute the difference between the two. I tried to implement this, but most of the time the start value read from the "from home" row gets overwritten, and the resulting differences are not what I expect.
Here is my routine:
    wid = 1
    for i in range(0, len(dataframe)):
        if (i+1 >= len(dataframe)):
            break
        if (
            dataframe['hhid'].values[i] == dataframe['hhid'].values[i+1] or
            dataframe['hhid'].values[i] == dataframe['hhid'].values[i-1]
        ):
            if (
                dataframe['w01'].values[i] == 'from Hause' and
                wid >= dataframe['wid'].values[i]
            ):
                bla = dataframe['wid'].values[i]
                start = dataframe['st_std'].values[i]
                print('start', start)
                wid = dataframe['wid'].values[i]
            if (
                dataframe['w04'].values[i] == 'to Hause'
            ):
                end = dataframe['en_std'].values[i]
                print('end', end)
                dataframe['awaytime'].values[i] = (end-start)
                if end-start < 0:
                    dataframe['awaytime'].values[i] = (start-end)+1
        else:
            continue
        if (dataframe['hhid'].values[i] != dataframe['hhid'].values[i+1]):
            if (i+1 >= len(dataframe)):
                break
            wid = dataframe['wid'].values[i+1]
    return dataframe
Any ideas how to do it correctly?
Here is a sample of the data in Excel format. Unfortunately I am not allowed to upload the full dataset: https://www.dropbox.com/s/af3wb7fcsqhukvz/Export_db_awaytime.xlsx?dl=0
I think I solved the problem. I added a counter to hold the first value of "from home". The values I get are good.
FYI, the code:
    counter = 0
    test_counter = 0
    from_home = 0
    for i in range(0, len(dataframe)):
        if (i+1 >= len(dataframe)):
            break
        # Check for same hhid
        if (
            dataframe['hhid'].values[i] == dataframe['hhid'].values[i+1] or
            dataframe['hhid'].values[i] == dataframe['hhid'].values[i-1]
        ):
            # Check for first departure
            if (
                dataframe['w01'].values[i] == 'from home' and
                counter <= test_counter
            ):
                start = dataframe['st_std'].values[i]
                #print('start',start)
                from_home = 1
                counter += 1
            # Check way home
            if (
                dataframe['w04'].values[i] == 'to home' and
                from_home == 1
            ):
                end = dataframe['en_std'].values[i]
                dataframe['awaytime'].values[i] = (end-start)
                if end-start < 0:
                    dataframe['awaytime'].values[i] = (start-end)+1
        # Check when another hhid is the next entry
        if (dataframe['hhid'].values[i] != dataframe['hhid'].values[i+1]):
            if (i+1 >= len(dataframe)):
                break
            counter = 0
            from_home = 0
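For comparison, the same idea (first "from home" start and last "to home" end per hhid) can probably be expressed without an explicit loop by using groupby. This is only a rough sketch, not a drop-in replacement for the routine above. It assumes the rows are already in chronological order within each hhid and that st_std and en_std are plain numbers. The function name away_time_per_household and the sample values are made up for illustration. It returns one value per hhid instead of writing into every "to home" row, and it does not reproduce the negative-difference correction from the loop.

```python
import pandas as pd


def away_time_per_household(df):
    """Last 'to home' end minus first 'from home' start, one value per hhid.

    Hypothetical helper; assumes rows are chronologically ordered per hhid.
    """
    # Start time of the first departure from home in each household.
    starts = df.loc[df["w01"] == "from home"].groupby("hhid")["st_std"].first()
    # End time of the last trip back home in each household.
    ends = df.loc[df["w04"] == "to home"].groupby("hhid")["en_std"].last()
    # Both Series are indexed by hhid, so the subtraction aligns on hhid.
    # Households missing either a departure or a return come out as NaN.
    return (ends - starts).rename("awaytime")


# Made-up sample data (times as decimal hours, which is an assumption).
sample = pd.DataFrame({
    "hhid":   [1, 1, 1, 2, 2],
    "w01":    ["from home", "other", "from home", "from home", "other"],
    "w04":    ["other", "to home", "to home", "other", "to home"],
    "st_std": [8.0, 12.0, 15.0, 9.0, 13.0],
    "en_std": [8.5, 12.5, 18.0, 9.5, 17.0],
})

away = away_time_per_household(sample)
print(away)
```

If the value is needed on every row, the result could be mapped back with something like sample["hhid"].map(away).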