How can a pandas column of variable-length, comma-separated strings be stacked into separate values?

I'm trying to split the values in a pandas column so that any comma-separated "group" of values becomes separate values, one per row.

The code I'm currently using is as follows:

import csv
import pandas as pd

data = pd.read_csv('ctabuses.csv')
route_column = data['routes']

with open('results.csv', 'wt+') as csv_file:
    writer = csv.writer(csv_file)
    for value in route_column:
        # split(',') returns a list; writerow writes that whole list as one CSV row
        writer.writerow(value.split(','))

However, when I write the contents to a file, it produces this:

126

121,123

1,7,X28,126,129,130,132,151

1,7,X28,126,129,130,151

1,7,X28,126,129,130

1,7,X28,126,129

1,3,4,7,J14,26,X28,126,129,132,143,147,148

7,126,132,143,147

1,7,X28,126,129

3,4,6,J14,26,143

1,7,X28,126,129,151

1,7,X28,126,129,130,134,135,136,151,156

125,126

126

126

126

I've searched and tried everything I can think of, and I keep getting the same result.

Edit (expected result): if I encounter a group of values like this:

1,7,X28,126,129,130,134,135,136,151,156

the output should be:

1
7
X28
126
129
130
134
135
136
151
156

The result would then be imported into a MySQL database.

Imports:

import pandas as pd

Create the DataFrame:

df = pd.read_csv('data.csv', header=None)

df.head()

                              0
0                           126
1                       121,123
2   1,7,X28,126,129,130,132,151
3       1,7,X28,126,129,130,151
4           1,7,X28,126,129,130

String to list:

# df.apply passes each column (a Series) to the lambda, so it can be split directly
df_list = df.apply(lambda col: col.str.split(','))

df_list.head()

                                       0
0                                  [126]
1                             [121, 123]
2   [1, 7, X28, 126, 129, 130, 132, 151]
3        [1, 7, X28, 126, 129, 130, 151]
4             [1, 7, X28, 126, 129, 130]
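
As a side note, the same split can be sketched without `apply`, since `Series.str.split` operates on a whole column at once. The sample values below are hypothetical stand-ins for the file contents, and the frame is built in memory rather than read from `data.csv`.

```python
import pandas as pd

# Hypothetical sample data standing in for the single column read above.
df = pd.DataFrame({0: ["126", "121,123", "1,7,X28"]})

# Series.str.split works on the column directly; no apply/lambda is needed.
split_col = df[0].str.split(",")

print(split_col.tolist())
```

This yields a Series of lists rather than a one-column DataFrame, so later steps would index it as a Series.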

List to long format:

df_long = df_list.apply(lambda x: pd.Series(x[0]), axis=1).stack().reset_index(level=1, drop=True)

df_long

0     126
1     121
1     123
2       1
2       7
2     X28
2     126
2     129
2     130
2     132
2     151
3       1
3       7
3     X28
3     126
3     129
3     130
3     151
...
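
On pandas 0.25 or later, the list-to-long step can probably be written more directly with `Series.explode`, which gives each list element its own row and repeats the original index label. This is an alternative sketch, not the method used above, and the sample values are hypothetical.

```python
import pandas as pd

routes = pd.Series(["126", "121,123", "1,7,X28"])  # hypothetical sample values

# Split each string into a list, then give every list element its own row.
# The original row label is repeated for each element of that row's list.
long_routes = routes.str.split(",").explode()

print(long_routes)
```

The resulting index (0, 1, 1, 2, 2, 2) mirrors the repeated index shown in the stacked output above.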

Save to CSV:

df_long.to_csv('results.csv', index=False)
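
One thing to watch for before a database import: in current pandas releases, `to_csv` writes a header row by default, which for an unnamed Series should be the column label `0`. A minimal sketch of suppressing it with `header=False` follows; the in-memory buffer is an assumption standing in for `results.csv`, and the sample values are hypothetical.

```python
import io
import pandas as pd

long_routes = pd.Series(["126", "121", "123"])  # hypothetical sample values

buffer = io.StringIO()  # stands in for 'results.csv'
# index=False drops the row labels; header=False drops the column label line.
long_routes.to_csv(buffer, index=False, header=False)

lines = buffer.getvalue().splitlines()
print(lines)
```

Whether the header line is wanted depends on how the MySQL import is configured, so treat this as an option rather than a required change.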

Final program (4 lines, plus the import above):

df = pd.read_csv('ctabuses.csv')
df_routes = df.routes.apply(lambda row: pd.Series(row).str.split(','))
df_routes = df_routes.apply(lambda row: pd.Series(row[0]), axis=1).stack().reset_index(level=1, drop=True)
df_routes.to_csv('results.csv', index=False)
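
If pandas is only needed to read the column, the loop from the question can also be kept and adjusted to write one value per row. The sketch below assumes the route strings are already available as a list (hypothetical sample values) and writes to an in-memory buffer instead of `results.csv`.

```python
import csv
import io

# Hypothetical sample values standing in for data['routes'].
route_column = ["126", "121,123", "1,7,X28"]

out = io.StringIO(newline="")  # stands in for open('results.csv', 'w', newline='')
writer = csv.writer(out)
for value in route_column:
    for route in value.split(","):
        # writerow expects a sequence of fields, so wrap the single value in a list.
        writer.writerow([route.strip()])

rows = out.getvalue().splitlines()
print(rows)
```

When writing to a real file, opening it with `newline=''` as the `csv` module documentation recommends avoids extra blank lines between rows on Windows.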
