
Decoding a single column in a CSV file using Python 3 Base64

I am very new to Python and inexperienced, but I hope someone can help me with this. I didn't find any answers I could understand on Google.

I have a large (10 GB) CSV file that contains multiple columns. All columns are "normal" human-readable text except for one, which is binary. I would like to decode this column and write the decoded data back into the CSV file.

This is what I have so far, but I have a feeling I'm way off. Any help would be appreciated!

import base64
import pandas as pd



df = pd.read_csv('sample.csv', delimiter=';',
                 usecols=[3], dtype=object, header=None,)
decoded_binary_data = base64.b64decode(df)

print(decoded_binary_data)

Sample of the CSV:

"5f8ebfd8-7d12-4659-a416-e5dcbe056d0a";"6";"1";**ez??R?+??a)???
Cs**;0;0;0;74;1720;
  • EDIT: cleaned up the CSV file a bit.
  • EDIT: added a sample dataframe.

Sample of the dataframe:

0                                       ez??R?+??a)???Cs
1                       B?t?a?h?kwd?W-]\???fc?m[m?A}??? 
2                       ?eE????3r??c??T????fc?m[m?A}??? 
3                       ?eE????3r??c??T????fc?m[m?A}??? 
4                       ?eE????3r??c??T????fc?m[m?A}??? 
5                       B?t?a?h?kwd?W-]\???fc?m[m?A}??? 

You can simply apply base64.b64decode to each value in the column:

# 'col_name' is a placeholder for the label of the Base64 column.
decoded_binary_data = df['col_name'].apply(base64.b64decode)
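As a minimal, self-contained sketch of that idea, the snippet below builds a small in-memory frame and decodes one column. The column names `id` and `payload` and the sample values are made up for illustration, and it assumes the column really holds Base64 text rather than raw bytes.

```python
import base64

import pandas as pd

# Hypothetical sample data: 'payload' holds Base64-encoded text.
df = pd.DataFrame({
    "id": ["a", "b"],
    "payload": [
        base64.b64encode(b"hello").decode("ascii"),
        base64.b64encode(b"\x00\x01binary").decode("ascii"),
    ],
})

# Series.apply calls base64.b64decode once per cell; each result is a bytes object.
df["decoded"] = df["payload"].apply(base64.b64decode)

print(df["decoded"].tolist())
```

Passing the whole DataFrame to `base64.b64decode`, as in the question, fails because the function expects a single string or bytes-like value.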

See this page: https://chrisalbon.com/python/pandas_apply_operations_to_dataframes.html
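For a file as large as the 10 GB one in the question, loading everything at once may not fit in memory. One possible way around that, sketched here under several assumptions, is to read the file in chunks with the `chunksize` option of `pd.read_csv` and append each decoded chunk to a new file. The helper name `decode_column`, the file names, the sample rows, and the choice to turn the decoded bytes into UTF-8 text (with undecodable bytes replaced) are all illustrative, not something the original post specifies.

```python
import base64
import os
import tempfile

import pandas as pd


def decode_column(src, dst, col=3, chunksize=100_000):
    """Decode Base64 column `col` of a header-less ';'-separated CSV, chunk by chunk."""
    reader = pd.read_csv(src, sep=";", header=None, dtype=str,
                         keep_default_na=False, chunksize=chunksize)
    for i, chunk in enumerate(reader):
        # Decode each cell; errors="replace" avoids exceptions on non-UTF-8 bytes.
        chunk[col] = chunk[col].apply(
            lambda s: base64.b64decode(s).decode("utf-8", errors="replace")
        )
        # The first chunk creates the output file, later chunks append to it.
        chunk.to_csv(dst, sep=";", header=False, index=False,
                     mode="w" if i == 0 else "a")


# Tiny made-up input so the sketch can be run end to end.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "sample.csv")
dst = os.path.join(tmp, "decoded.csv")
with open(src, "w", newline="") as f:
    f.write("id1;6;1;aGVsbG8=;0\n")
    f.write("id2;7;1;d29ybGQ=;0\n")

decode_column(src, dst, chunksize=1)  # chunksize=1 only to exercise the append path
with open(dst) as f:
    result = f.read()
print(result)
```

Writing to a separate output file avoids overwriting the source while it is still being read. If the decoded data is not text, storing it in a CSV may not be appropriate at all, and a binary-friendly format would be a better fit.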
