How to break a column into two in a pandas DataFrame

This is what the cells in one column of the CSV look like:

[u"ABC||||'ABCDadfasf||||'random something', 'another random']
[u"ABCD||||'ABCDadfasf||||'random somethingadf', 'another random adsd']
[u"ABDC||||'ABCDasdadfasf||||'random something random', 'another something random']

I want to split it as shown below, using '||||' as the delimiter:

Col 1      Col 2          Col 3 
[u"ABC    ABCDadfasf    'random something', 'another random']
[u"ABCD   ABCDadfasf    'random somethingadf', 'another random adsd']
[u"ABDC   ABCDasdadfasf 'random something random', 'another something random']

This is what I tried:

Cov = pd.read_csv("path to CSV.csv", sep='||||', names = ["col 1", "col 2", "col 3"], engine = 'python')    

It does not raise any error, but the column is not split.

pandas interprets a separator longer than one character as a regular expression, and | is a special regex character, so escape each | with a backslash:

import io

import pandas as pd

temp = u"""[u"ABC||||'ABCDadfasf||||'random something', 'another random']
[u"ABCD||||'ABCDadfasf||||'random somethingadf', 'another random adsd']
[u"ABDC||||'ABCDasdadfasf||||'random something random', 'another something random']"""
# after testing, replace io.StringIO(temp) with 'filename.csv'
# (pd.compat.StringIO was removed in pandas 0.25, so io.StringIO is used here)
df = pd.read_csv(io.StringIO(temp), sep=r"\|\|\|\|", names=["col 1", "col 2", "col 3"], engine='python')

print(df)
     col 1           col 2                                              col 3
0   [u"ABC     'ABCDadfasf              'random something', 'another random']
1  [u"ABCD     'ABCDadfasf      'random somethingadf', 'another random adsd']
2  [u"ABDC  'ABCDasdadfasf  'random something random', 'another something ...
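The backslashes do not have to be written by hand. As a minimal sketch, assuming the same three sample rows as in the question, the standard library's re.escape can build the separator, and a quick re.search check shows why the unescaped '||||' never matches the literal pipes. The variable names sample and sep are only illustrative.

```python
import io
import re

import pandas as pd

# Unescaped, "||||" is a regex alternation of empty patterns, so it can only
# match the empty string, never the four literal pipe characters.
assert re.search("||||", "ABC||||DEF").group() == ""

# re.escape adds the backslashes so that the pipes are matched literally.
sep = re.escape("||||")
assert re.search(sep, "ABC||||DEF").group() == "||||"

# Same sample rows as in the question (illustrative stand-in for the CSV file).
sample = u"""[u"ABC||||'ABCDadfasf||||'random something', 'another random']
[u"ABCD||||'ABCDadfasf||||'random somethingadf', 'another random adsd']
[u"ABDC||||'ABCDasdadfasf||||'random something random', 'another something random']"""

# A regex separator requires the python engine.
df = pd.read_csv(
    io.StringIO(sample),
    sep=sep,
    names=["col 1", "col 2", "col 3"],
    engine="python",
)
print(df["col 2"].tolist())
```

Building the separator with re.escape keeps the delimiter readable in the source and avoids miscounting backslashes if the delimiter changes later.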
