How to avoid empty sets in pandas concat and to_csv?

Question

I have a dictionary that I want to store in a CSV file through pandas:

df = pd.concat([pd.Series(node_dict[k], name=k) for k in HEADERS], axis=1)
df.to_csv(os.path.join(abspath, outputfile), sep='\t', index=False)

The keys correspond to the columns in the CSV file (or pandas DataFrame), and each value is a list of sets. Each set holds the values for one row. Suppose I have two columns:

   names                     companies                      
{'john', 'smith', 'mary'}   {'ms', 'fb'} 
 set()                      {'ms', 'fb', 'tw', 'g', 'lk'}
 ...                         ...

Some rows' values are empty, as indicated by the set() printed in the file. I hope there is a way to modify this line:

[pd.Series(node_dict[k], name=k) for k in HEADERS]

to write an empty string '' into the file instead of the string 'set()'.

Sample of the dict:

node_dict['names'] = [{'john', 'smith', 'mary'}, {}]
node_dict['companies'] = [{'ms', 'fb'}, {'ms', 'fb', 'tw', 'g', 'lk'} ]

Of course, the actual lists in the dictionary are much longer.
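A minimal reproduction of the behaviour described above may help. It is only a sketch: HEADERS is assumed to be ['names', 'companies'] (the question does not show it), and the output goes to an in-memory buffer rather than a file. An explicit set() is used for the empty row, because the literal {} in the sample is an empty dict rather than an empty set.

```python
import io

import pandas as pd

# Assumed for illustration: the question does not show HEADERS.
HEADERS = ['names', 'companies']

node_dict = {
    'names': [{'john', 'smith', 'mary'}, set()],
    'companies': [{'ms', 'fb'}, {'ms', 'fb', 'tw', 'g', 'lk'}],
}

df = pd.concat([pd.Series(node_dict[k], name=k) for k in HEADERS], axis=1)

# Write to an in-memory buffer so the text can be inspected directly.
buffer = io.StringIO()
df.to_csv(buffer, sep='\t', index=False)
csv_text = buffer.getvalue()

print(csv_text)
```

The second data row of csv_text starts with the literal text set(), which is the output the question wants to avoid.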

Answer 1

I think you can do something like:

node_dict = {k: [x if x else "invisible" for x in v] for k,v in node_dict.items()}

before building [pd.Series(node_dict[k], name=k) for k in HEADERS].
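As a hedged end-to-end sketch of this idea: the version below substitutes '' rather than the placeholder "invisible", since the question asks for an empty cell. HEADERS and the in-memory buffer are assumptions made for illustration; the original writes to a file path.

```python
import io

import pandas as pd

# Assumed for illustration: the question does not show HEADERS.
HEADERS = ['names', 'companies']

node_dict = {
    'names': [{'john', 'smith', 'mary'}, set()],
    'companies': [{'ms', 'fb'}, {'ms', 'fb', 'tw', 'g', 'lk'}],
}

# Replace every falsy value (empty set, empty dict, ...) with an empty string.
cleaned = {k: [x if x else '' for x in v] for k, v in node_dict.items()}

df = pd.concat([pd.Series(cleaned[k], name=k) for k in HEADERS], axis=1)

buffer = io.StringIO()
df.to_csv(buffer, sep='\t', index=False)
csv_text = buffer.getvalue()

print(csv_text)
```

Because the replacement happens element by element, every value stays in its original row and only the empty ones change.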

Answer 2

You can just drop all the {}. Convert each list to a string, remove the {}, and re-evaluate the string as a list. Done.

df = pd.concat([pd.Series(eval(str(node_dict[k]).replace('{}',' ')), name=k) for k in HEADERS], axis=1)

df
                 names            companies
0  {john, mary, smith}             {fb, ms}
1                  NaN  {g, ms, lk, fb, tw}

This works even with the trailing comma left behind in the string. df.to_csv() writes NaN as an empty string automatically, because na_rep defaults to '': DataFrame.to_csv(path, sep: str = ',', na_rep: str = ''...)
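One caveat about the string approach: it matches only the text '{}' (not 'set()'), and it deletes the element instead of blanking it. It therefore parses only when a single empty {} is the last item of a list. An empty entry anywhere else leaves two consecutive commas, and eval raises a SyntaxError. A possible alternative, sketched below under the assumption that HEADERS is ['names', 'companies'] and that an in-memory buffer stands in for the output file, replaces each empty collection with None in place and lets to_csv render the missing values through na_rep.

```python
import io

import pandas as pd

# Assumed for illustration: the question does not show HEADERS.
HEADERS = ['names', 'companies']

# Empty entries appear at the start, middle, and end to show positions are kept.
node_dict = {
    'names': [set(), {'john', 'smith', 'mary'}, {}],
    'companies': [{'ms', 'fb'}, set(), {'ms', 'fb', 'tw', 'g', 'lk'}],
}

# Swap each empty collection for None in place, so row positions are preserved.
cleaned = {k: [x if x else None for x in v] for k, v in node_dict.items()}

df = pd.concat([pd.Series(cleaned[k], name=k) for k in HEADERS], axis=1)

buffer = io.StringIO()
# na_rep='' is already the default; it is spelled out to make the intent explicit.
df.to_csv(buffer, sep='\t', index=False, na_rep='')
csv_text = buffer.getvalue()

print(csv_text)
```

This avoids eval entirely and keeps every non-empty value in the row it started in.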

Answer 1
0 2020-03-26 01:15:33

Answer 2
0 2020-03-26 01:18:35
