
C#: How can I remove newline characters from binary data?

Basically I have binary data. I don't mind if it's unreadable, but I'm writing it to a file which is parsed, so it's important that newline characters are taken out.

I thought I had done the right thing when I converted it to a string:

byte[] b = (byte[])SubKey.GetValue(v[i]);
s = System.Text.ASCIIEncoding.ASCII.GetString(b);

and then removed the newlines:

string t = s.Replace("\n", "");

but it's not working.

The newline may be \r\n, and your binary data may not be ASCII-encoded.

Firstly, a newline (Environment.NewLine) is usually two characters on Windows. Do you mean removing single carriage-return or line-feed characters?

Secondly, applying a text encoding to binary data is likely to lead to unexpected conversions. E.g. what will happen to bytes of the binary data that do not map to ASCII characters?
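To make that concrete, here is a minimal sketch of what a round trip through ASCII does. The class and method names are made up for illustration; the only real APIs used are Encoding.ASCII.GetString and Encoding.ASCII.GetBytes, with their default behaviour.

```csharp
using System.Text;

public static class AsciiRoundTrip
{
    // Decode the bytes as ASCII and encode them again,
    // which is what the approach in the question ends up doing.
    public static byte[] ThroughAscii(byte[] data)
    {
        string s = Encoding.ASCII.GetString(data);
        return Encoding.ASCII.GetBytes(s);
    }
}
```

Passing in `{ 0x41, 0xFF, 0x80, 0x0A }` should give back `{ 0x41, 0x3F, 0x3F, 0x0A }`: every byte above 0x7F is silently replaced by '?' (0x3F), so the original data cannot be recovered.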

The newline character may be \n, \r, or \r\n depending on the operating system type; in that order, these are the markers for Linux, Macintosh (classic Mac OS), and Windows.
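If the data really is text, a small sketch like the following removes all three variants in one go. The helper name is hypothetical; it just chains two string.Replace calls.

```csharp
public static class NewlineStripper
{
    // Removing every '\r' and every '\n' covers "\r\n", "\n" and "\r".
    public static string StripNewlines(string s)
    {
        return s.Replace("\r", "").Replace("\n", "");
    }
}
```

With this, `StripNewlines("a\r\nb\nc\rd")` yields `"abcd"`, whereas replacing only `"\n"` would leave the carriage returns behind.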

But if you say your file is binary, how do you know that it contains ASCII newlines in its content?

If this is a binary file, it may contain some struct. If it does, removing newline characters shifts all the data after each removed newline to the left and corrupts the data in it.

I would imagine removing the bytes in a binary chunk which correspond to line feeds would actually corrupt the binary data, thereby making it useless.

Perhaps you'd be better off using base64 encoding, which will produce ASCII-safe output.
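A minimal sketch of the base64 route, assuming the parser on the other end can decode the field again. The wrapper names are made up; Convert.ToBase64String and Convert.FromBase64String are the standard calls.

```csharp
using System;

public static class Base64Field
{
    // Encode arbitrary bytes as a single line of ASCII text.
    // This overload inserts no line breaks into the output.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // Recover the exact original bytes when reading the file back.
    public static byte[] Decode(string text)
    {
        return Convert.FromBase64String(text);
    }
}
```

The encoded string uses only letters, digits, '+', '/' and '=', so it never contains a newline, and nothing has to be deleted from the original bytes.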

If this is text data, then load it as text data (using the correct encoding), do the replacement on the string, and re-encode it (using the correct encoding). For some encodings you might be able to do a swap at the file level (without decoding/encoding), but I wouldn't bet on it.
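As a rough sketch of the decode, replace, re-encode approach (the helper name is hypothetical, and the caller has to supply the encoding the data was actually written in):

```csharp
using System.Text;

public static class TextNewlineRemover
{
    // Decode with the real encoding, edit as text, then re-encode.
    public static byte[] RemoveNewlines(byte[] data, Encoding encoding)
    {
        string text = encoding.GetString(data);
        string cleaned = text.Replace("\r", "").Replace("\n", "");
        return encoding.GetBytes(cleaned);
    }
}
```

UTF-16 shows why a byte-level swap is risky: with Encoding.Unicode, the character 'Ċ' (U+010A) is stored as the bytes 0x0A 0x01, so stripping raw 0x0A bytes would destroy it, while the string-level replacement leaves it alone.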

If this is any other binary representation, you will have to know the exact details. For example, it is common (but not guaranteed) for strings embedded in part of a binary file to have a length prefix. If you change the data without changing the length prefix, you've just corrupted the file. And to change the length prefix you need to know the format (it might be big-endian/little-endian, any fixed number of bytes, or the prefix itself could be variable length). Or it might be delimited. Or there might be relative offsets scattered through the file that all need fixing.
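Purely as an illustration, suppose the format were a 4-byte little-endian length followed by the payload. That layout is an assumption made up for this sketch, not something known about the asker's data. Removing bytes then means rewriting the prefix as well:

```csharp
using System.Collections.Generic;

public static class LengthPrefixedRecord
{
    // Hypothetical layout: [4-byte little-endian length][payload bytes].
    public static byte[] StripNewlines(byte[] record)
    {
        int length = record[0] | (record[1] << 8) | (record[2] << 16) | (record[3] << 24);

        // Filter only the payload; the prefix bytes are not text.
        var payload = new List<byte>(length);
        for (int i = 0; i < length; i++)
        {
            byte b = record[4 + i];
            if (b != 0x0D && b != 0x0A) payload.Add(b);
        }

        // Write the new length so the record stays consistent.
        int newLength = payload.Count;
        var result = new byte[4 + newLength];
        result[0] = (byte)newLength;
        result[1] = (byte)(newLength >> 8);
        result[2] = (byte)(newLength >> 16);
        result[3] = (byte)(newLength >> 24);
        payload.CopyTo(result, 4);
        return result;
    }
}
```

Note that a payload of exactly 10 bytes would have 0x0A as its first prefix byte, which is one reason a blind filter over the whole file cannot work. A real format with big-endian or variable-length prefixes, delimiters, or offsets would need different handling.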

Just as likely, the same byte sequence could appear by chance in the binary data without representing a newline, so you could be completely trashing the data.
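A small sketch of how that can happen, using a naive filter (hypothetical helper name) that drops every CR and LF byte:

```csharp
using System.Linq;

public static class AccidentalNewline
{
    // Naive filter: drops every 0x0D and 0x0A byte, whatever it meant.
    public static byte[] DropCrLfBytes(byte[] data)
    {
        return data.Where(b => b != 0x0D && b != 0x0A).ToArray();
    }
}
```

The integer 10 is a simple victim: `BitConverter.GetBytes(10)` produces four bytes, one of which is 0x0A, so `DropCrLfBytes` returns only three bytes and the value can no longer be read back as an Int32.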
