
String splitting and joining on a pandas dataframe

I have a dataframe containing devices and their corresponding firmware versions (e.g. 1.7.1.3). I'm trying to shorten the firmware version to show only three numbers (e.g. 1.7.1).

I know how to do this on a single string, but how would I do it efficiently on a large dataframe?

test = "1.2.3.4"
test = test.split(".")
'.'.join(test[0:-1])
#sample dataframe:
import pandas as pd
df=pd.DataFrame({'data': {0: '1.2.3.4', 1: '1.2.3.9', 2: '1.2.3.8'}})

For this you can use:

df['data']=df['data'].str.split('.').str[0:3].apply('.'.join)

OR

df['data']=df['data'].str[0:5]  # only correct when every component is a single digit

OR

df['data']=df['data'].str[::-1].str.split('.', n=1).str[1].str[::-1]
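
A possible alternative to reversing the string twice is to split once from the right with .str.rsplit. The sketch below is not taken from the answer above; the third row with multi-digit components is an assumption added to show where fixed-width slicing breaks down.

```python
import pandas as pd

# Sample data; the last row has multi-digit components (an illustrative assumption).
df = pd.DataFrame({'data': ['1.2.3.4', '1.2.3.9', '10.12.3.8']})

# Split once from the right on '.', then keep the part before the last dot.
shortened = df['data'].str.rsplit('.', n=1).str[0]
print(shortened.tolist())  # ['1.2.3', '1.2.3', '10.12.3']

# Fixed-width slicing cuts the multi-digit row in the wrong place.
sliced = df['data'].str[0:5]
print(sliced.tolist())  # ['1.2.3', '1.2.3', '10.12']
```

Like the reversed-split version, this drops only the last component, so it gives three numbers only when the input has exactly four.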

Performance: [image]

This can also be done with the extract function of pandas; please try the following.

df['data'] = df['data'].str.extract(r'^(\d+(?:\.\d+){2})', expand=True)

A simple explanation: the extract function of pandas is given a regex that captures only the first three dot-separated numbers, as the OP needs.



Taking the example DataFrame used by Anurag Dabas:

Let's say df is as follows:

    data
0   1.2.3.4
1   1.2.3.9
2   1.2.3.8

After running the extract code above, it becomes:

    data
0   1.2.3
1   1.2.3
2   1.2.3
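
To see what the extract regex does on individual strings, here is a minimal sketch using Python's built-in re module with the same pattern. The helper name first_three and the extra test strings are assumptions made for illustration.

```python
import re

# Same pattern as in the extract call: one number followed by exactly two ".number" groups.
pattern = re.compile(r'^(\d+(?:\.\d+){2})')

def first_three(version):
    """Return the first three dot-separated numbers, or None if there are fewer than three."""
    match = pattern.match(version)
    return match.group(1) if match else None

print(first_three('1.2.3.4'))    # 1.2.3
print(first_three('10.12.3.8'))  # 10.12.3
print(first_three('1.2'))        # None
```

Where this helper returns None, .str.extract produces NaN for that row, so versions with fewer than three components are not passed through unchanged.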

Here is one more way of doing it, using .replace:

import pandas as pd
df = pd.DataFrame({'data': {0: '1.2.3.4', 1: '1.2.3.9', 2: '1.2.3.8'}})
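# regex=True is required on pandas 2.0+, where str.replace treats the pattern as a literal by default
df['data'] = df['data'].str.replace(r'\.[^.]*$', '', regex=True)
print(df['data'])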

Output:

0    1.2.3
1    1.2.3
2    1.2.3
Name: data, dtype: object

The pattern r'\.[^.]*$' matches the last dot and the text after it, which is replaced with an empty string.
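
The same pattern can be tried on plain strings with re.sub, as in the sketch below. The helper name drop_last_component and the extra inputs are assumptions for illustration.

```python
import re

def drop_last_component(version):
    # Remove the final '.' and everything after it; strings without a dot are unchanged.
    return re.sub(r'\.[^.]*$', '', version)

print(drop_last_component('1.2.3.4'))    # 1.2.3
print(drop_last_component('1.2.3.4.5'))  # 1.2.3.4
print(drop_last_component('7'))          # 7
```

Note the difference from the extract approach: this always removes exactly one trailing component, so a five-part version keeps four numbers rather than three.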


 