Encode CSV lines in a CSV file to UTF-8

I'm looking for a way to encode text stored in the lines of my CSV file as UTF-8 so it can be used in a URL, but so far I couldn't find the right library for it. The idea is a library that reads the CSV lines, encodes them to UTF-8, and writes the result to another file or to a new column.

Does anybody have an idea?

To add an example:

I have a file that contains one column, details, holding some text that I later need to pass in a URL, but encoded as UTF-8.

For example, one line is:

Créez, testez et déployez des applications sur Oracle Cloud — gratuitement. Inscrivez-vous une fois et accédez à deux offres gratuites.

and the expected result for this line is:

Cr%C3%A9ez%2C%20testez%20et%20d%C3%A9ployez%20des%20applications%20sur%20Oracle%20Cloud%20%E2%80%94%20gratuitement.%20Inscrivez-vous%20une%20fois%20et%20acc%C3%A9dez%20%C3%A0%20deux%20offres%20gratuites.
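For reference, the expected string above matches what Python's standard urllib.parse.quote returns for this line when the text is a regular str and is percent-encoded as UTF-8. A minimal sketch; the variable names are only illustrative:

```python
from urllib.parse import quote

# The sample line from the question, as a normal Python str.
line = (
    "Créez, testez et déployez des applications sur Oracle Cloud — gratuitement. "
    "Inscrivez-vous une fois et accédez à deux offres gratuites."
)

# quote() percent-encodes the UTF-8 bytes of the string. encoding="utf-8" is
# already the default and is spelled out here only for clarity.
encoded = quote(line, encoding="utf-8")
print(encoded)
```

If the output differs from the expected string, the text was probably not decoded correctly when the file was read, which points at the file's encoding rather than at quote itself.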

This is just an example of one line in my CSV file; I need to apply this to all the lines.

Well, I found a solution that works, but not quite correctly:

import pandas as pd
from urllib.parse import quote

# on_bad_lines="skip" replaces error_bad_lines=False, which was removed in pandas 2.0
data = pd.read_csv("file_decoded.csv", on_bad_lines="skip")


def title_parse(details):
    # Percent-encode the text so it can be placed in a URL
    return quote(details)


data["details"] = data.details.apply(title_parse)
data.to_csv("file_encoded.csv")

The issue with this function is that the text does get encoded, but it seems to be treated as ASCII rather than UTF-8. I don't know how to explain it better.

import pandas as pd
data = pd.read_csv("filename.csv")
data.to_csv("filename_new.csv", encoding="utf-8")
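The snippet above re-saves the file as UTF-8 but does not percent-encode the text for use in a URL. One way to combine both steps, sketched here with only the standard library, is to read the file with an explicit encoding and write the encoded text to a new column. The function name encode_details and the output column name details_encoded are hypothetical, and the sketch assumes the input file really is UTF-8; if it is stored in another encoding (for example Latin-1), the encoding passed when opening the source file would need to change.

```python
import csv
from urllib.parse import quote


def encode_details(src_path, dst_path, column="details"):
    """Copy a CSV file, adding a percent-encoded copy of one column."""
    new_column = column + "_encoded"  # hypothetical name for the new column
    # An explicit encoding avoids depending on the platform default.
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or []) + [new_column]
        # extrasaction="ignore" drops surplus fields from malformed rows.
        writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            # Empty or missing cells are encoded as empty strings.
            row[new_column] = quote(row.get(column) or "", encoding="utf-8")
            writer.writerow(row)
```

It could be called as encode_details("file_decoded.csv", "file_encoded.csv"), reusing the file names from the question. Keeping the original column next to the encoded one makes it easy to spot rows where the source text was already garbled before encoding.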
