Separate data from one column into three columns
I have a column in an Excel sheet that contains a mix of first names, last names, and job titles. The only observable pattern is that in each set of 3 rows, the 1st row is a first name, the 2nd row is a last name, and the 3rd row is a job title. I want to create 3 separate columns and split this data across them. Sample data:
John
Bush
Manager
Katrina
Cohn
Secretary
I want John, Bush, Manager to form one row spread across three columns, under First Name, Last Name, and Job Title respectively. Like this:
First Name  Last Name  Job Title
John        Bush       Manager
Katrina     Cohn       Secretary
How can we achieve this?
You can use slice notation with a step of 3 to get every third element, each time from a different starting point.
import pandas as pd

l = ['John', 'Bush', 'Manager', 'Katrina', 'Cohn', 'Secretary']
# l[::3] starts at index 0, l[1::3] at index 1, l[2::3] at index 2
pd.DataFrame({'First Name': l[::3], 'Last Name': l[1::3], 'Job Title': l[2::3]})
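This outputs the following (older pandas versions sort the dictionary keys alphabetically, which is why Job Title precedes Last Name here; recent versions keep the insertion order):
  First Name  Job Title Last Name
0       John    Manager      Bush
1    Katrina  Secretary      Cohn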
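To see what each slice selects on its own, here is a minimal sketch in plain Python using the same sample list. The variable names first_names, last_names, job_titles, and rows are only illustrative and are not part of the answer above.

```python
l = ['John', 'Bush', 'Manager', 'Katrina', 'Cohn', 'Secretary']

# A slice start::3 takes every third element beginning at index `start`.
first_names = l[0::3]  # indices 0, 3
last_names = l[1::3]   # indices 1, 4
job_titles = l[2::3]   # indices 2, 5

print(first_names)  # ['John', 'Katrina']
print(last_names)   # ['Bush', 'Cohn']
print(job_titles)   # ['Manager', 'Secretary']

# zip() pairs the three slices back up into one tuple per person.
rows = list(zip(first_names, last_names, job_titles))
print(rows)  # [('John', 'Bush', 'Manager'), ('Katrina', 'Cohn', 'Secretary')]
```

Each slice becomes one column of the DataFrame, so the three slices must have equal length; that holds whenever the list length is a multiple of 3.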
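import pandas as pd

# Alternatively, reshape the values into rows of 3
s = pd.Series([
    'John',
    'Bush',
    'Manager',
    'Katrina',
    'Cohn',
    'Secretary'])
# reshape(-1, 3) lets NumPy infer the number of rows
df = pd.DataFrame(s.values.reshape(-1, 3),
                  columns=['First Name', 'Last Name', 'Job Title'])
df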
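Since the question starts from an Excel column rather than a hand-written Series, the reshape approach could be wired up roughly as follows. This is only a sketch: the file name people.xlsx and the assumption that the data sits in the first column without a header row are hypothetical, so the read_excel call is left as a comment and an inline Series stands in for the loaded column.

```python
import pandas as pd

# Stand-in for the spreadsheet column. With a real workbook this might be:
#   raw = pd.read_excel("people.xlsx", header=None)[0]
# (hypothetical file name; assumes the data is in the first column, no header).
raw = pd.Series(['John', 'Bush', 'Manager', 'Katrina', 'Cohn', 'Secretary'])

columns = ['First Name', 'Last Name', 'Job Title']

# Reshape the flat column into rows of len(columns) values each;
# -1 tells NumPy to infer the row count from the data length.
df = pd.DataFrame(raw.to_numpy().reshape(-1, len(columns)), columns=columns)
print(df)
```

The reshape raises a ValueError if the number of values is not a multiple of 3, which can be a useful early warning that the column does not follow the expected pattern.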
If the length of your data isn't a multiple of 3, you can truncate it to the largest multiple of 3 like this:
s = pd.Series([
'John',
'Bush',
'Manager',
'Katrina',
'Cohn',
'Secretary',
'Bogus'])
# Keep only the first (len // 3) * 3 values, dropping the incomplete trailing record
s_ = s.iloc[:s.shape[0] // 3 * 3]
df = pd.DataFrame(s_.values.reshape(-1, 3), columns=['First Name', 'Last Name', 'Job Title'])
df
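Truncating silently drops the trailing values. If you would rather keep an incomplete last record, one alternative (not part of the answer above, just a sketch) is to pad the data with None up to the next multiple of 3 before reshaping. The names n_cols, pad, and df_padded are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(['John', 'Bush', 'Manager',
               'Katrina', 'Cohn', 'Secretary',
               'Bogus'])

n_cols = 3
# Number of missing values needed to reach the next multiple of n_cols (0 if already aligned).
pad = (-len(s)) % n_cols

values = s.to_numpy(dtype=object)
# Append `pad` placeholder entries so the length divides evenly by n_cols.
padded = np.concatenate([values, np.full(pad, None, dtype=object)])

df_padded = pd.DataFrame(padded.reshape(-1, n_cols),
                         columns=['First Name', 'Last Name', 'Job Title'])
print(df_padded)
```

Here the last row keeps 'Bogus' as the first name and leaves the other two fields empty, which makes the incomplete record visible instead of discarding it.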