Splitting a path column in Python
Hi, I have a column with paths like this:
path_column = ['C:/Users/Desktop/sample\\1994-QTR1.tsv','C:/Users/Desktop/sample\\1995-QTR1.tsv']
I need to split each path and get just the file name, without the extension:
[1994-QTR1,1995-QTR1]
Thanks
Use str.extract:
df['new'] = df['path'].str.extract(r'\\([^\\]*)\.\w+$', expand=False)
The equivalent with rsplit would be much less efficient:
df['new'] = df['path'].str.rsplit('\\', n=1).str[-1].str.rsplit('.', n=1).str[0]
Output:
                                    path        new
0  C:/Users/Desktop/sample\1994-QTR1.tsv  1994-QTR1
1  C:/Users/Desktop/sample\1995-QTR1.tsv  1995-QTR1
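To see what the pattern captures without needing pandas, the same regex can be tried with the standard re module. This is only a sketch: the loop and the names pattern and names are introduced here for illustration and are not part of the answer above.

```python
import re

# Sample paths copied from the question.
path_column = ['C:/Users/Desktop/sample\\1994-QTR1.tsv',
               'C:/Users/Desktop/sample\\1995-QTR1.tsv']

# \\        a literal backslash (the last one, since the group excludes backslashes)
# ([^\\]*)  capture group: the file name without its final extension
# \.\w+$    a dot plus the extension, anchored at the end of the string
pattern = re.compile(r'\\([^\\]*)\.\w+$')

names = []
for p in path_column:
    m = pattern.search(p)
    # Keep None for a path that does not match (both samples here do match).
    names.append(m.group(1) if m else None)

print(names)  # ['1994-QTR1', '1995-QTR1']
```

In pandas, a row that does not match the pattern comes back from str.extract as NaN rather than None.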
Use this, or use a regex to match and extract the part you want.
path.split("\\")[-1].split(".")[0]
Output:
'1994-QTR1'
Edit
new_col = []
for i in path_column:
    # last backslash-separated component, then the part before the first dot
    new_col.append(i.split("\\")[-1].split(".")[0])
print(new_col)
NOTE: If you need the results in a list, you can append each one to a new list inside the loop.
Output:
['1994-QTR1', '1995-QTR1']
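One thing to watch with split(".")[0]: it cuts at the first dot, so a file name that itself contains a dot would be truncated. A possible variant, sketched here as a list comprehension, uses rsplit(".", 1) to drop only the final extension. The dotted path below is a made-up example, not from the question.

```python
# Sample paths copied from the question.
path_column = ['C:/Users/Desktop/sample\\1994-QTR1.tsv',
               'C:/Users/Desktop/sample\\1995-QTR1.tsv']

# Take the last backslash-separated component, then drop only the final extension.
new_col = [p.split("\\")[-1].rsplit(".", 1)[0] for p in path_column]
print(new_col)  # ['1994-QTR1', '1995-QTR1']

# Hypothetical file name with an extra dot, to show the difference.
dotted = 'C:/Users/Desktop/sample\\1994.QTR1.tsv'
first_dot = dotted.split("\\")[-1].split(".")[0]      # cuts at the first dot
last_dot = dotted.split("\\")[-1].rsplit(".", 1)[0]   # cuts at the last dot
print(first_dot)  # 1994
print(last_dot)   # 1994.QTR1
```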
You might harness pathlib for this task in the following way:
import pathlib
import pandas as pd
def get_stem(path):
    # PureWindowsPath treats both / and \ as separators, on any operating system
    return pathlib.PureWindowsPath(path).stem
df = pd.DataFrame({'paths':['C:/Users/Desktop/sample\\1994-QTR1.tsv','C:/Users/Desktop/sample\\1994-QTR2.tsv','C:/Users/Desktop/sample\\1994-QTR3.tsv']})
df['names'] = df.paths.apply(get_stem)
print(df)
gives output
                                   paths      names
0  C:/Users/Desktop/sample\1994-QTR1.tsv  1994-QTR1
1  C:/Users/Desktop/sample\1994-QTR2.tsv  1994-QTR2
2  C:/Users/Desktop/sample\1994-QTR3.tsv  1994-QTR3
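If the paths are in a plain list, as in the question, the same idea should work without pandas. The sketch below also shows why PureWindowsPath is the relevant class: a POSIX-flavoured path does not treat the backslash as a separator. The names stems and posix_stem are introduced here for illustration.

```python
import pathlib

# Sample paths copied from the question.
path_column = ['C:/Users/Desktop/sample\\1994-QTR1.tsv',
               'C:/Users/Desktop/sample\\1995-QTR1.tsv']

# PureWindowsPath does no file system access and behaves the same on any OS;
# it accepts both "/" and "\" as separators.
stems = [pathlib.PureWindowsPath(p).stem for p in path_column]
print(stems)  # ['1994-QTR1', '1995-QTR1']

# For comparison: under POSIX rules the backslash is an ordinary character,
# so the last component still includes "sample\".
posix_stem = pathlib.PurePosixPath(path_column[0]).stem
print(posix_stem)  # sample\1994-QTR1
```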