How do I split a string wherever a number appears?

Question

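I have a set of strings in a pandas DataFrame that I want to split, keeping only the text.

Here is an example string: "Eliminate render-blocking resources 0.46s Serve images in next-gen formats 0.45s Reduce server response times (TTFB) 0.22s Remove unused CSS 0.15s"

This is what I would like to see, in separate columns: ["Eliminate render-blocking resources", "Serve images in next-gen formats", "Reduce server response times (TTFB)", "Remove unused CSS"]

I thought of using .str.split on the '.' and then taking 3 characters to the right and 1 character to the left... but honestly, I don't know where to start.

Thanks for your help.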

Answer 1

Use a regular expression with re.split():

import re

re.split(r'\d\.\d+s', your_string)

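The pattern \d\.\d+s matches substrings such as 0.15s, 0.22s, and so on (one digit, a literal dot, one or more digits, then the letter s). For example: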

s = 'Eliminate render-blocking resources 0.46s Serve images in next-gen formats 0.45s Reduce server response times (TTFB) 0.22s Remove unused CSS 0.15s'
re.split(r'\d\.\d+s', s)
['Eliminate render-blocking resources ', ' Serve images in next-gen formats ', ' Reduce server response times (TTFB) ', ' Remove unused CSS ', '']

After that, you can strip the leading and trailing whitespace from each piece and drop the empty string at the end.
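The cleanup step can be sketched as follows. This is a minimal example rather than the only way to do it; the variable names `pieces` and `cleaned` are just illustrative.

```python
import re

s = ('Eliminate render-blocking resources 0.46s Serve images in next-gen formats 0.45s '
     'Reduce server response times (TTFB) 0.22s Remove unused CSS 0.15s')

# Split on timings such as "0.46s"; the result still has padding and a trailing ''.
pieces = re.split(r'\d\.\d+s', s)

# Strip whitespace from every piece, then keep only the non-empty ones.
cleaned = [p.strip() for p in pieces if p.strip()]

print(cleaned)
```

The trailing empty string appears because the input ends with a match, so re.split() returns whatever follows the last separator, which is nothing.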

Answer 2

We can use Series.str.split with a regex here. We also pass expand=True, so it returns a new column for each split:

df['Col'].str.split(r'[0-9]{1}\.[0-9]{2}s', expand=True)

Output

                                      0                                   1                                      2                    3 4
0  Eliminate render-blocking resources    Serve images in next-gen formats    Reduce server response times (TTFB)    Remove unused CSS
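Note that the output above still carries an empty column 4 and padding around each value. A possible way to tidy this up is sketched below. It assumes a one-row DataFrame built from the question's example with the column name `Col`. The pattern with `\s*` on both sides is a variation on the one in this answer, not the answer's own, and `parts` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'Col': ['Eliminate render-blocking resources 0.46s Serve images in next-gen formats 0.45s '
            'Reduce server response times (TTFB) 0.22s Remove unused CSS 0.15s']
})

# Let the separator swallow surrounding whitespace so the pieces come out trimmed.
parts = df['Col'].str.split(r'\s*\d+\.\d+s\s*', expand=True)

# Drop columns in which every row is empty or missing (here, the trailing one).
parts = parts.loc[:, parts.fillna('').ne('').any(axis=0)]

print(parts)
```

Using `\d+` before the dot also covers timings of 10 seconds or more, which the single-digit pattern would split incorrectly.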
