How to remove the text before numeric values in a pandas DataFrame column?

Question

I have a pandas DataFrame column with strings that look like this:

Column A

text moretext 251 St. Louis Apt.54
123 Orange Drive
sometext somemoretext 171 Poplar street
textnew 11th street 
77 yorkshire avenue

I want to remove the text before the numeric values, i.e. I want the output to look like this:

Column A

251 St. Louis Apt.54
123 Orange Drive
171 Poplar street
11th street 
77 yorkshire avenue

Answer 1

Let's use a regex with str.extract:

df['Column A'] = df['Column A'].str.extract(r'(\d+.+$)')

Output:

0    251 St. Louis Apt.54
1        123 Orange Drive
2       171 Poplar street
3             11th street
4     77 yorkshire avenue
Name: Column A, dtype: object

The regex captures a group that starts with one or more digits and continues to the end of the line.
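To see what the pattern captures on its own, here is a minimal sketch using only the standard `re` module on the sample strings from the question. The helper name `first_numeric_tail` and the extra digit-free sample are illustrative assumptions, not part of the original answer.

```python
import re

# Same pattern as in the answer: one or more digits, then at least one
# more character, up to the end of the string.
PATTERN = re.compile(r'(\d+.+$)')

samples = [
    "text moretext 251 St. Louis Apt.54",
    "123 Orange Drive",
    "sometext somemoretext 171 Poplar street",
    "textnew 11th street",
    "77 yorkshire avenue",
    "no digits here",  # extra sample (assumption) to show the no-match case
]

def first_numeric_tail(text):
    """Hypothetical helper: return the captured group, or None if no match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

results = [first_numeric_tail(s) for s in samples]
for original, captured in zip(samples, results):
    print(repr(original), "->", repr(captured))
```

Because the search is leftmost-first, the capture begins at the first digit in the string. A string with no digit produces no match at all; with `str.extract` the corresponding row becomes NaN.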

Answer 2

This function finds the index of the first numeric character in the string and returns the rest of the string from that index. It is then applied to each value of the column using apply:

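def change(string):
    # Return the part of the string that starts at its first digit.
    for i, c in enumerate(string):
        if c.isdigit():
            return string[i:]
    # No digit found: return the string unchanged instead of failing.
    return string

# Series.apply calls change on each value; it takes no axis argument.
data['Column A'] = data['Column A'].apply(change)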
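As a quick check, the sketch below builds the sample column from the question and compares the apply-based approach with the `str.extract` approach from Answer 1. The function name `strip_before_first_digit` and the choice to leave digit-free strings unchanged are assumptions made for illustration.

```python
import pandas as pd

def strip_before_first_digit(text):
    """Return text from its first digit onward (unchanged if it has no digit)."""
    for i, ch in enumerate(text):
        if ch.isdigit():
            return text[i:]
    return text  # assumption: digit-free strings are kept as they are

data = pd.DataFrame({
    "Column A": [
        "text moretext 251 St. Louis Apt.54",
        "123 Orange Drive",
        "sometext somemoretext 171 Poplar street",
        "textnew 11th street",
        "77 yorkshire avenue",
    ]
})

# Approach from Answer 2: a plain Python function applied to each value.
via_apply = data["Column A"].apply(strip_before_first_digit).tolist()

# Approach from Answer 1: expand=False returns a Series for a single group.
via_extract = data["Column A"].str.extract(r"(\d+.+$)", expand=False).tolist()

print(via_apply)
print(via_apply == via_extract)

# The two approaches differ on a row without digits: extract yields a
# missing value, while the function above returns the string unchanged.
no_digit = pd.Series(["no digits here"])
extract_missing = bool(no_digit.str.extract(r"(\d+.+$)", expand=False).isna().iloc[0])
apply_unchanged = no_digit.apply(strip_before_first_digit).iloc[0]
```

On the sample data both approaches give the same strings. Which behaviour is preferable for rows without digits depends on whether such rows should be flagged as missing or passed through.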

Answer 1
4, accepted, 2018-04-10 20:07:45

Answer 2
2, 2018-04-10 20:12:27
