How do I split a column into its letter and numeric values in a Pandas DataFrame?

Question

I have a DataFrame:

    Name    Section
1   James   P3
2   Sam     2.5C
3   Billy   T35
4   Sarah   A85
5   Felix   5I

How can I split the numeric part into a separate column named Section_Number, and the letter part into a column named Section_Letter? Desired result:

    Name    Section Section_Number  Section_Letter
1   James   P3               3          P
2   Sam     2.5C           2.5          C
3   Billy   T35             35          T
4   Sarah   A85             85          A
5   Felix   5I               5          I

Answer 1

Use str.replace together with str.extract, matching [A-Z]+ when the letters are all uppercase:

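# regex=True is needed on pandas 2.0+, where str.replace treats the pattern literally by default
df['Section_Number'] = df['Section'].str.replace('([A-Z]+)', '', regex=True)
df['Section_Letter'] = df['Section'].str.extract('([A-Z]+)', expand=False)
print(df)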
    Name Section Section_Number Section_Letter
1  James      P3              3              P
2    Sam    2.5C            2.5              C
3  Billy     T35             35              T
4  Sarah     A85             85              A
5  Felix      5I              5              I

To also match lowercase letters:

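# regex=True is needed on pandas 2.0+, where str.replace treats the pattern literally by default
df['Section_Number'] = df['Section'].str.replace('([A-Za-z]+)', '', regex=True)
df['Section_Letter'] = df['Section'].str.extract('([A-Za-z]+)', expand=False)
print(df)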
    Name Section Section_Number Section_Letter
1  James      P3              3              P
2    Sam    2.5C            2.5              C
3  Billy     T35             35              T
4  Sarah     A85             85              A
5  Felix      5I              5              I
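Note that Section_Number in the output above still holds strings, while the question asks for numeric values. A minimal self-contained sketch of the same approach, assuming the sample data from the question and adding a pd.to_numeric step that is not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Name": ["James", "Sam", "Billy", "Sarah", "Felix"],
        "Section": ["P3", "2.5C", "T35", "A85", "5I"],
    },
    index=[1, 2, 3, 4, 5],
)

# Strip every run of letters; regex=True makes the pattern a regular expression.
letters_removed = df["Section"].str.replace(r"[A-Za-z]+", "", regex=True)

# Convert the remaining text ("3", "2.5", ...) to numbers.
df["Section_Number"] = pd.to_numeric(letters_removed)

# expand=False returns a Series instead of a one-column DataFrame.
df["Section_Letter"] = df["Section"].str.extract(r"([A-Za-z]+)", expand=False)

print(df)
```

If some rows might contain text that is not a valid number once the letters are removed, pd.to_numeric accepts errors="coerce" to turn those values into NaN instead of raising.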

Answer 2

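This will no doubt be slower, but for completeness here is an alternative: you can use str.extractall with named groups to capture each part of the pattern, collapse the matches into one row per original row, and join the result back onto your DataFrame...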

new = df.join(
    df.Section.str.extractall(r'(?i)(?P<Section_Letter>[A-Z]+)|(?P<Section_Number>[\d.]+)')
    .groupby(level=0).first()
)

Result:

    Name Section Section_Letter Section_Number
1  James      P3              P              3
2    Sam    2.5C              C            2.5
3  Billy     T35              T             35
4  Sarah     A85              A             85
5  Felix      5I              I              5
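It may help to look at what str.extractall returns before the groupby step. The sketch below uses only two of the question's rows and the same pattern; the variable names matches and collapsed are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({"Section": ["P3", "2.5C"]}, index=[1, 2])

pattern = r"(?i)(?P<Section_Letter>[A-Z]+)|(?P<Section_Number>[\d.]+)"

# One row per match; the index gains a second level named "match".
# Each row fills only the group that matched and leaves the other as NaN.
matches = df["Section"].str.extractall(pattern)
print(matches)

# first() takes the first non-null value per column within each original row,
# which merges the partial rows back into one row per input value.
collapsed = matches.groupby(level=0).first()
print(collapsed)
```

Because the two groups are alternatives, each input string produces two matches here, and the groupby is what recombines them.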

Answer 3

If, as in your example, each value contains exactly one letter, you can sort the characters so the letter comes last and then slice:

def get_vals(x):
    return ''.join(sorted(x, key=str.isalpha))

# apply ordering
vals = df['Section'].apply(get_vals)

# split numbers from letter
df['num'] = vals.str[:-1].astype(float)
df['letter'] = vals.str[-1]

print(df)

    Name Section   num letter
1  James      P3   3.0      P
2    Sam    2.5C   2.5      C
3  Billy     T35  35.0      T
4  Sarah     A85  85.0      A
5  Felix      5I   5.0      I
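The sort works because sorted is stable and False orders before True, so non-letters keep their relative order and move ahead of the letter. A small sketch on plain strings, where "AB12" is a made-up value showing why the single-letter assumption matters:

```python
def get_vals(x):
    # Same helper as in the answer: non-letters first, letters last.
    return "".join(sorted(x, key=str.isalpha))

print(get_vals("P3"))    # 3P
print(get_vals("2.5C"))  # 2.5C

# With two letters, slicing off only the last character leaves a letter behind.
two_letters = get_vals("AB12")
print(two_letters)       # 12AB
print(two_letters[:-1])  # 12A, which float() cannot parse
```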

Answer 4

We can use itertools.groupby to group consecutive alphabetic and non-alphabetic characters:

from itertools import groupby

[sorted([''.join(x) for _, x in groupby(s, key=str.isalpha)]) for s in df.Section]

[['3', 'P'], ['2.5', 'C'], ['35', 'T'], ['85', 'A'], ['5', 'I']]

We can then turn this into new columns:

from itertools import groupby

N, L = zip(
    *[sorted([''.join(x) for _, x in groupby(s, key=str.isalpha)]) for s in df.Section]
)
df.assign(Section_Number=N, Section_Letter=L)

    Name Section Section_Number Section_Letter
1  James      P3              3              P
2    Sam    2.5C            2.5              C
3  Billy     T35             35              T
4  Sarah     A85             85              A
5  Felix      5I              5              I
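To see why the list comprehension in this answer always yields the number before the letter, here is a sketch of the groupby step on single strings. The helper name split_runs is hypothetical and only used for illustration:

```python
from itertools import groupby

def split_runs(s):
    # groupby yields (key, iterator) pairs for each run of equal keys,
    # so letters and non-letters end up in separate pieces.
    return [(is_alpha, "".join(chars)) for is_alpha, chars in groupby(s, key=str.isalpha)]

runs = split_runs("2.5C")
print(runs)  # [(False, '2.5'), (True, 'C')]

# Digits and '.' compare lower than ASCII letters, so sorting the pieces
# puts the numeric part first regardless of the original order.
parts = sorted(piece for _, piece in split_runs("P3"))
print(parts)  # ['3', 'P']
```

This relies on each value containing exactly one numeric run and one letter run; with more runs, the zip into N and L would no longer receive two items per row.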

4 solutions

Solution 1
4 Accepted 2018-07-11 14:25:49

Solution 2
1 2018-07-11 14:38:20

Solution 3
1 2018-07-11 14:58:02

Solution 4
0 2018-07-11 15:18:31
