Separating CSV columns with a delimiter/sep

Question

My goal is to split the data stored in each cell into multiple columns in the same row.

For example, I would like to take data that looks like this:

Row 1: [<1><2>][<3><4>][][]

Row 2: [<1><2>][<3><4>][][]

And turn it into data that looks like this:

Row 1: [1][2][3][4]

Row 2: [1][2][3][4]

I tried using the code below to read the CSV and split each line at the ">":

df = pd.read_csv('file.csv', engine='python', sep="\*>", header=None)

However, the code did not work as expected. Instead, the splits occurred at seemingly random and unpredictable points (I'm sure there's a pattern, but I don't see it), and each break created another row rather than another column. For example:

Row 1: [<1>][<2>]

Row 2: [<3>]

Row 3: [<4>]

I thought the issue might lie with reading the CSV file, so I tried re-scraping the site with the separator included, but it produced the same results, so I'm assuming it's an issue with the separator call. However, I only arrived at that call after trying many others that caused various errors. For example, when I tried sep = '>' I got the following error: ParserError: '>' expected after '"' and when I tried sep = '\>' I got the following error: ParserError: Expected 36 fields in line 1106, saw 120. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.
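As a side note on the separators tried above: pandas interprets a separator longer than one character as a regular expression. The sketch below uses only the standard re module on a made-up cell value (the variable names are illustrative), so it shows what each pattern matches rather than diagnosing the actual file.

```python
import re

# Hypothetical cell value in the same shape as the question's example.
cell = "<1><2>"

# r"\*>" means "a literal asterisk followed by >", which never occurs here,
# so nothing is split.
no_match = re.split(r"\*>", cell)
print(no_match)   # ['<1><2>']

# A bare ">" does split, but the "<" stays attached and an empty
# trailing piece is left over.
split_gt = re.split(r">", cell)
print(split_gt)   # ['<1', '<2', '']
```

Neither pattern yields clean values on its own, which suggests extracting the text between "<" and ">" after the file is read rather than relying on sep.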

These errors sent me looking through multiple resources, including this and this, among others.

However, I have found no resources that successfully demonstrate how to separate each column within a row using a '>' delimiter. If anyone knows how to do this, please let me know. Your help is much appreciated!

Update:

Here is an actual screenshot of the CSV file for a better understanding of what I was trying to demonstrate above. My end goal is for each of the columns from I onward to hold data on one descriptive factor, as opposed to many as they do now.

Answer 1

Would this work:

string = "[<1><2>][<3><4>][][]"
# drop the outer cell brackets
string = string.replace("[", "")
string = string.replace("]", "")
# turn each <value> into its own [value] cell
string = string.replace("<", "[")
string = string.replace(">", "]")
print(string)

Result:

[1][2][3][4]
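The same idea can be applied to a whole file instead of one string. The following is a minimal sketch using only the standard library, assuming the file is an ordinary comma-separated CSV whose cells hold values wrapped in "<" and ">". The helper name unpack_row and the in-memory sample are hypothetical stand-ins for file.csv.

```python
import csv
import io
import re

# Matches the text between one "<" and the next ">".
TOKEN = re.compile(r"<([^<>]*)>")

def unpack_row(cells):
    """Return every <...> value found in a row's cells, in order."""
    values = []
    for cell in cells:
        values.extend(TOKEN.findall(cell))
    return values

# Hypothetical in-memory stand-in for file.csv: two packed cells, two empty.
sample = io.StringIO("<1><2>,<3><4>,,\n<1><2>,<3><4>,,\n")
rows = [unpack_row(cells) for cells in csv.reader(sample)]
print(rows)
```

Each inner list is one output row, so it could be written back out with csv.writer.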

Answer 2

I ended up using Google Sheets. Once you upload the CSV, there is a menu titled "Data" and then an option titled "Split text to columns."

If you want a faster way to do this with code, you can also do the following with pandas:

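import pandas as pd

# Hypothetical sample frame; replace with your own data
data = pd.DataFrame({"Name": ["Ada Lovelace", "Alan Turing"]})

# new data frame with split value columns (split on the first space only)
new = data["Name"].str.split(" ", n=1, expand=True)

# making separate first name column from new data frame
data["First Name"] = new[0]

# making separate last name column from new data frame
data["Last Name"] = new[1]

# dropping the old Name column
data.drop(columns=["Name"], inplace=True)

# df display
print(data)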
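For the layout in the question, a similar pandas approach might look like the sketch below. It assumes the file has already been read into a DataFrame with the default comma separator. The column names and sample values are made up to mirror the question's example, and str.findall is used in place of str.split so that the "<" and ">" characters are dropped.

```python
import pandas as pd

# Hypothetical frame mirroring the question: two packed cells, two empty cells.
df = pd.DataFrame({
    "a": ["<1><2>", "<1><2>"],
    "b": ["<3><4>", "<3><4>"],
    "c": ["", ""],
    "d": ["", ""],
})

# Join each row's cells into one string, then pull out every <...> value.
packed = df.fillna("").astype(str).apply("".join, axis=1)
tokens = packed.str.findall(r"<([^<>]*)>")

# One column per extracted value; shorter rows are padded with None.
result = pd.DataFrame(tokens.tolist(), index=df.index)
print(result)
```

The extracted values are strings, so a conversion such as result.astype(int) would still be needed for numeric work.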
