
How can I identify which cities each person has lived in, and when?

Here is a small sample of the dataset that I am currently working on.

FirstName  LastName   cities       occupation         time
---------------------------------------------------------------
Alice      Oumi       Queens       software engineer  1/1/2019
Alice      Oumi       New York     software engineer  12/3/2018
Sam        Charles    Santa Clara  Engineer           2/5/2017
Sam        Charles    Santa Monica Engineer           8/9/2018
Sam        Charles    Santa Clara  Engineer           12/12/2019
Alice      Oumi       New York     software engineer  1/2/2017

As you can see above, the same person can have lived in the same place at different times. I want to clean this dataset so that it shows which places Alice and Sam have lived in. For example, instead of having two rows for Alice living in New York, I only need one. Something similar to the following table:

FirstName  LastName   cities         FirstTime    SecondTime
---------------------------------------------------------------
Alice      Oumi       Queens         1/1/2019     NA
Alice      Oumi       New York       1/2/2017     12/3/2018
Sam        Charles    Santa Clara    2/5/2017     12/12/2019
Sam        Charles    Santa Monica   8/9/2018     NA 

I am kinda new to Python and trying to learn. I tried using for loops with iterrows(), but that didn't work. What can I use to produce this table?

Thank you so much in advance.

You can do that as follows:

# df is the DataFrame from the question, with the columns
# FirstName, LastName, cities, occupation and time

# number the times a person lived in the same city (with the same occupation)
df['sequence'] = df.groupby(['FirstName', 'LastName', 'cities', 'occupation']).cumcount() + 1
# now create the "pivot" table: one 'time' column per sequence number
result = df.set_index(['FirstName', 'LastName', 'cities', 'occupation', 'sequence'])['time'].unstack()
# rename the columns (this assumes at most two rows per person and city)
result.columns = ['FirstTime', 'SecondTime']

# reset the index (it was only needed for "pivoting")
result = result.reset_index()

The result looks like:

Out[483]: 
  FirstName LastName        cities         occupation  FirstTime  SecondTime
0     Alice     Oumi      New York  software engineer  12/3/2018    1/2/2017
1     Alice     Oumi        Queens  software engineer   1/1/2019         NaN
2       Sam  Charles   Santa Clara           Engineer   2/5/2017  12/12/2019
3       Sam  Charles  Santa Monica           Engineer   8/9/2018         NaN
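
Note that the New York row above lists 12/3/2018 before 1/2/2017, because cumcount() numbers rows in the order they appear in the DataFrame, while the table in the question has the earlier date first. A minimal sketch of one way to get chronological columns is to parse the dates and sort before numbering. It assumes the dates are month/day/year, which the sample does not settle, and the Time1, Time2, ... column names are my own choice so that the code also copes with more than two stays in the same city:

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "FirstName": ["Alice", "Alice", "Sam", "Sam", "Sam", "Alice"],
    "LastName": ["Oumi", "Oumi", "Charles", "Charles", "Charles", "Oumi"],
    "cities": ["Queens", "New York", "Santa Clara", "Santa Monica",
               "Santa Clara", "New York"],
    "occupation": ["software engineer", "software engineer", "Engineer",
                   "Engineer", "Engineer", "software engineer"],
    "time": ["1/1/2019", "12/3/2018", "2/5/2017", "8/9/2018",
             "12/12/2019", "1/2/2017"],
})

# Assumption: the dates are month/day/year.
df["time"] = pd.to_datetime(df["time"], format="%m/%d/%Y")

keys = ["FirstName", "LastName", "cities", "occupation"]

# Sort chronologically so cumcount numbers each stay from earliest to latest.
df = df.sort_values("time")
df["sequence"] = df.groupby(keys).cumcount() + 1

# One column per sequence number; a missing stay becomes NaT.
result = df.set_index(keys + ["sequence"])["time"].unstack()
# Hypothetical column names: Time1, Time2, ... (one per stay).
result.columns = [f"Time{n}" for n in result.columns]
result = result.reset_index()

print(result)
```

If the occupation should not split the groups (for example, when someone changes jobs but stays in the same city), drop it from keys.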
