Split a column of serial numbers into multiple different columns using Python Pandas

For example, a table contains product names, serial numbers, and product descriptions.

One of the serial numbers is the following: 12A-BCD3456-7899.

How do you separate this serial number into 6 different columns (12, A, BCD, 34, 56, and 7899) using Python Pandas, given that the number and order of letters and digits are consistent? For example, the first 2 characters of the serial number are always digits (12), then a single letter (A), followed by "-", then 3 letters (BCD), 4 digits (3456, to be split into 34 and 56), "-", and finally 4 digits (7899).

The new columns should be added between the product name and the product description, or at the end of the table, without disturbing the rest of the table.

df['A'] = df['serial_nmber'].str[0:2]
df['B'] = df['serial_nmber'].str[2]
df['C'] = df['serial_nmber'].str[4:7]
df['D'] = df['serial_nmber'].str[7:9]
df['E'] = df['serial_nmber'].str[9:11]
df['F'] = df['serial_nmber'].str[12:16]
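
The slices above append the new columns at the end of the table. To place them before the product description instead, one option is `DataFrame.insert`. The following is a minimal sketch, not a definitive solution: the column names `product_name`, `serial_number`, and `description`, as well as the sample rows, are assumptions made for illustration and do not come from the original table.

```python
import pandas as pd

# Hypothetical demo table; the column names and values are assumptions
df = pd.DataFrame({
    "product_name": ["Widget", "Gadget"],
    "serial_number": ["12A-BCD3456-7899", "34Z-XYZ0012-0001"],
    "description": ["A small widget", "A large gadget"],
})

# (start, stop) offsets for each part, matching the fixed-position slices above
parts = {
    "A": (0, 2),
    "B": (2, 3),
    "C": (4, 7),
    "D": (7, 9),
    "E": (9, 11),
    "F": (12, 16),
}

# Insert each new column just before "description", keeping the order A..F
pos = df.columns.get_loc("description")
for offset, (name, (start, stop)) in enumerate(parts.items()):
    df.insert(pos + offset, name, df["serial_number"].str.slice(start, stop))
```

Because `insert` modifies the DataFrame in place, the existing columns keep their data and relative order; only the six new columns are added in front of `description`.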

You can try this:

import pandas as pd

# Build a demo DataFrame that repeats the same serial number ten times
df = pd.DataFrame({'serial': ['12A-BCD3456-7899'] * 10})

def processing(row):
    '''Add extra '-' separators to the serial so one split on '-' yields all six parts.'''
    s = '-'.join([row.serial[:2], row.serial[2:7], row.serial[7:9], row.serial[9:]])
    return s

# Rebuild each serial with the extra separators, then split it into new columns
ndf = df.apply(processing, axis=1).str.split('-', expand=True)

ndf['serial'] = df['serial']  # keep the original serial next to the split parts
ndf

Output before processing:

    serial
0   12A-BCD3456-7899
1   12A-BCD3456-7899
2   12A-BCD3456-7899
3   12A-BCD3456-7899
4   12A-BCD3456-7899
5   12A-BCD3456-7899
6   12A-BCD3456-7899
7   12A-BCD3456-7899
8   12A-BCD3456-7899
9   12A-BCD3456-7899

Output after processing:

    0   1   2   3   4   5       serial
0   12  A   BCD 34  56  7899    12A-BCD3456-7899
1   12  A   BCD 34  56  7899    12A-BCD3456-7899
2   12  A   BCD 34  56  7899    12A-BCD3456-7899
3   12  A   BCD 34  56  7899    12A-BCD3456-7899
4   12  A   BCD 34  56  7899    12A-BCD3456-7899
5   12  A   BCD 34  56  7899    12A-BCD3456-7899
6   12  A   BCD 34  56  7899    12A-BCD3456-7899
7   12  A   BCD 34  56  7899    12A-BCD3456-7899
8   12  A   BCD 34  56  7899    12A-BCD3456-7899
9   12  A   BCD 34  56  7899    12A-BCD3456-7899
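
As an alternative to re-joining and splitting the string, the same six parts can be captured in one step with `Series.str.extract` and a regular expression. This is only a sketch under stated assumptions: the pattern assumes uppercase ASCII letters (as in the example serial), the output column names `A` to `F` are borrowed from the slicing snippet in the question, and the second demo row is a made-up malformed value.

```python
import pandas as pd

# Demo data: one valid serial and one hypothetical malformed value
df = pd.DataFrame({"serial": ["12A-BCD3456-7899", "bad-serial"]})

# 2 digits, 1 letter, "-", 3 letters, 2 digits, 2 digits, "-", 4 digits
# (uppercase letters are an assumption based on the example serial)
pattern = r"^(\d{2})([A-Z])-([A-Z]{3})(\d{2})(\d{2})-(\d{4})$"

# Each capture group becomes one column; rows that do not match become NaN
parts = df["serial"].str.extract(pattern)
parts.columns = ["A", "B", "C", "D", "E", "F"]

# Append the new columns to the right of the original table
result = pd.concat([df, parts], axis=1)
```

A side effect of this approach is that serials which do not follow the expected layout show up as missing values in the new columns rather than being sliced into misleading fragments.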
