How do I extract a string with a regular expression pattern in Python?

Question

I am trying to extract file names using a regular expression with "str.extract" in Python. The file name is the text that comes after the time and before the .filetype extension.
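For reference, the sample data can be rebuilt roughly as follows. This is only a sketch: the rows are copied from the output shown in Answer 1, and the column names "index" and "text" are assumed from that same table. The `expected` list is a hypothetical helper that states the goal.

```python
import pandas as pd

# Rows copied from the output table in Answer 1 (column names assumed from it).
df = pd.DataFrame({
    "index": [1, 2, 3, 4, 5, 6],
    "text": [
        "sample 1 root root 349802 Nov 1 2000 introduction.json*",
        "sample 1 root root 1234 Oct 1 10:26 test_housing.csv",
        "sample 1 root root 5983025 Nov 1 10:32 test_train_housing.csv",
        "sample 1 root root 1252 Oct 1 10:32 _test.csv",
        "sample 1 root root 938 Oct 1 10:32 _train_small.csv",
        "sample 1 root root 9909303 Oct 5 2000 README.md*",
    ],
})

# Goal: the text after the time (or year) field and before the extension.
expected = ["introduction", "test_housing", "test_train_housing",
            "_test", "_train_small", "README"]
```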

Answer 1

You can try:

# Skip the first 5 space-separated fields, capture everything up to the
# last dot, and discard the extension.
df['filename'] = df['text'].str.extract(r'(?:\w+ ){5}(.*)\.[^.]*$')

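Output (the captured group still includes the date and time fields, because only the first five fields are skipped):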

   index                                                           text                        filename
0      1        sample 1 root root 349802 Nov 1 2000 introduction.json*         Nov 1 2000 introduction
1      2           sample 1 root root 1234 Oct 1 10:26 test_housing.csv        Oct 1 10:26 test_housing
2      3  sample 1 root root 5983025 Nov 1 10:32 test_train_housing.csv  Nov 1 10:32 test_train_housing
3      4                  sample 1 root root 1252 Oct 1 10:32 _test.csv               Oct 1 10:32 _test
4      5            sample 1 root root 938 Oct 1 10:32 _train_small.csv        Oct 1 10:32 _train_small
5      6               sample 1 root root 9909303 Oct 5 2000 README.md*               Oct 5 2000 README
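If only the file name is wanted, one possible variant is to skip all eight leading fields instead of five. The sketch below is not part of the original answer. It assumes every row has exactly eight whitespace-separated fields before the name, and it uses `\S+` rather than `\w+` because a time such as 10:26 contains a colon, which `\w` does not match.

```python
import pandas as pd

df = pd.DataFrame({"text": [
    "sample 1 root root 349802 Nov 1 2000 introduction.json*",
    "sample 1 root root 1234 Oct 1 10:26 test_housing.csv",
    "sample 1 root root 938 Oct 1 10:32 _train_small.csv",
]})

# Assumption: exactly 8 whitespace-separated fields precede the file name.
# (.*) is greedy, so the capture runs up to the LAST dot in the line.
pattern = r'^(?:\S+\s+){8}(.*)\.[^.]*$'

# expand=False returns a Series when the pattern has a single capture group.
df['filename'] = df['text'].str.extract(pattern, expand=False)
filenames = df['filename'].tolist()
# filenames == ['introduction', 'test_housing', '_train_small']
```

Rows whose name contains no dot would not match this pattern and would come back as NaN.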

Answer 2

df['filename'] = df['text'].str.extract(r'(\w+)[.].*$')

Result:

['introduction', 'test_housing', 'test_train_housing', '_test', '_train_small', 'README']
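A caveat worth checking against your own data: this pattern captures the first run of word characters that is immediately followed by a dot, so a name containing a hyphen or more than one dot is only partly captured. The short sketch below illustrates this. Its last row is hypothetical and does not come from the original question.

```python
import pandas as pd

s = pd.Series([
    "sample 1 root root 1252 Oct 1 10:32 _test.csv",
    "sample 1 root root 9909303 Oct 5 2000 README.md*",
    "sample 1 root root 10 Oct 5 2000 my-data.v2.csv",  # hypothetical row
])

# Same pattern as above, written as a raw string.
names = s.str.extract(r'(\w+)[.].*$', expand=False).tolist()
# names == ['_test', 'README', 'data']   (not 'my-data.v2')
```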

2 answers

Answer 1
0 2022-02-21 09:37:20

Answer 2
0 2022-02-21 09:44:13
