Reconverting a txt file (from Windows to Unix)

My university project, written in Java, takes tweets from Twitter and analyzes them.

In the first phase, I collect the tweets; I have to do that on a Windows machine. Afterwards I put the program online on my Linux server and use it to analyze the tweets with a user feedback system.

When I open the txt file on the Linux machine, it asks me whether I want to convert it to UTF-8, and I click yes. Because of this operation, some special characters are no longer displayed correctly. If I try to convert the file back to its original encoding (maybe CP1252) with iconv, it returns an error caused by the special characters.
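Since the in-editor conversion seems to be where the characters get damaged, one option is to go back to the untouched file from the Windows machine and re-encode it explicitly instead. The following is only a minimal sketch: it assumes the original file really is CP1252 (the question itself only guesses this), and the class name `Cp1252ToUtf8` and the command-line arguments are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Cp1252ToUtf8 {

    // Decode the bytes as Windows-1252 (ASSUMPTION: this is the real
    // source encoding) and encode the resulting text as UTF-8.
    static byte[] recode(byte[] cp1252Bytes) {
        String text = new String(cp1252Bytes, Charset.forName("windows-1252"));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Usage (hypothetical file names): java Cp1252ToUtf8 tweets.txt tweets-utf8.txt
    public static void main(String[] args) throws IOException {
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        Files.write(Paths.get(args[1]), recode(original));
    }
}
```

This has to be run on the original file, not on the copy that was already converted. If the tweets were actually saved in some other encoding, the charset name would need to change accordingly.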

I understand that it is impossible to convert those characters back, because each damaged character could stand for any of several original characters. But could I use some kind of text prediction to rewrite that character?

For example, if I have the word "because" and e is a special character, I see the word as something like "becaus?". If I remove the ? character, how can I put the e back? I have tried to use Word, but the txt file is too big: a large number of words have this problem, and with Word you have to check every word manually.
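The kind of "prediction" described here could be roughed out as a dictionary lookup that treats each ? as a wildcard. This is a hedged sketch, not a tested recovery tool: the class `PlaceholderRepair`, the method `candidates`, and the word list are all hypothetical, and it assumes each lost character was replaced by exactly one ?.

```java
import java.util.ArrayList;
import java.util.List;

public class PlaceholderRepair {

    // Return every dictionary word that has the same length as the damaged
    // word and agrees with it at every position that is not a '?'.
    static List<String> candidates(String damaged, List<String> dictionary) {
        List<String> matches = new ArrayList<>();
        for (String word : dictionary) {
            if (word.length() != damaged.length()) {
                continue;
            }
            boolean fits = true;
            for (int i = 0; i < word.length(); i++) {
                char d = damaged.charAt(i);
                if (d != '?' && d != word.charAt(i)) {
                    fits = false;
                    break;
                }
            }
            if (fits) {
                matches.add(word);
            }
        }
        return matches;
    }
}
```

A word is only safe to repair automatically when exactly one candidate comes back. Genuine question marks in the tweets would also look like damaged characters, so anything ambiguous would still need a manual check.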

You should use dos2unix to convert the file to Linux format.
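For reference, the core of what dos2unix does is turn Windows CRLF line endings into Unix LF. Because the project is already in Java, the same step could be sketched in code as below. The class name `LineEndings` is made up for illustration, and this covers only line endings, not the character encoding.

```java
public class LineEndings {

    // Replace every Windows line ending (CR LF) with a Unix one (LF).
    // A lone CR that is not followed by LF is left untouched.
    static String toUnix(String text) {
        return text.replace("\r\n", "\n");
    }
}
```

Note that this step on its own would not be expected to bring back characters that were already lost in an encoding conversion.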
