How to read a CSV file with multiple delimiters in pandas
I have a CSV file with two delimiters (dot and underscore), and I am using sep='_.' in read_csv, but it is not treating the dot as a separator when reading.
Input: jks_12034.45_89.12
Expected output: jks 12034 45 89 12
As stated in the documentation:
"separators longer than 1 character and different from '\s+' will be interpreted as regular expressions"
If you use sep="_\.", it will only match places where an underscore is immediately followed by a dot.
If you want to split on an underscore OR a dot, use sep="\.|_" or sep="[_\.]" (inside a character class the backslash before the dot is optional).
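The difference between these patterns can be checked with Python's re module, without pandas. This is only a sketch of how the regular expressions match, using the sample line from the question; it is not a description of how pandas parses internally. It also shows why the unescaped sep='_.' from the question misbehaves: there the dot means "any character", so the character after each underscore is swallowed.

```python
import re

line = "jks_12034.45_89.12"  # sample row from the question

# Unescaped "_." matches an underscore plus ANY following character,
# so "_1" and "_8" are consumed and the literal dots are left alone.
unescaped = re.split(r"_.", line)

# "_\." matches only an underscore immediately followed by a literal dot,
# which never occurs in this line, so nothing is split.
both = re.split(r"_\.", line)

# Alternation and a character class both match either single character.
alternation = re.split(r"\.|_", line)
char_class = re.split(r"[_.]", line)

print(unescaped)    # ['jks', '2034.45', '9.12']
print(both)         # ['jks_12034.45_89.12']
print(alternation)  # ['jks', '12034', '45', '89', '12']
print(char_class)   # ['jks', '12034', '45', '89', '12']
```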
Use engine='python' and sep=r'[_.]' as parameters of pd.read_csv:
import pandas as pd

df = pd.read_csv('data.csv', sep=r'[_.]', engine='python', header=None)
print(df)
# Output
     0      1   2   3   4
0  jks  12034  45  89  12
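To try this without a file on disk, the same call can be run against an in-memory buffer. A minimal sketch, assuming the file contains only the single sample row from the question; io.StringIO stands in for data.csv here and is not part of the original answer.

```python
import io

import pandas as pd

# In-memory stand-in for data.csv (assumption: one row, no header line).
data = io.StringIO("jks_12034.45_89.12\n")

# The raw string keeps the regex literal; engine="python" is given explicitly
# because the C engine does not support regex separators and pandas would
# otherwise fall back to the Python engine with a ParserWarning.
df = pd.read_csv(data, sep=r"[_.]", engine="python", header=None)

print(df.shape)  # (1, 5)
print(df)
```

Each numeric piece becomes its own integer column, so 12034.45 is read as the two values 12034 and 45 rather than as one float. If the dots are decimal points, splitting on the underscore alone (sep='_') may be the better fit.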