Writing text to binary file - what's the difference?

I'm learning to write binary files in C++, and I'm a bit confused by the result. Let's say I have this code:

#include<fstream>
#include<string>
using namespace std;

int main(){
    ofstream file;
    string text = "Some text over here";

    file.open("test.bin",ios::out|ios::binary);
    file.write(text.c_str(), text.length());
    file.close();

    return 0;
}

I'm expecting the output file test.bin to be "in binary", but when I look at it in Notepad, I see normal text:

Some text over here

Is my expectation wrong? What makes things binary, and what should I use to achieve it?

The most fundamental definition of the word "binary" comes from the situation where a value can take on one of exactly two states. What you call those states doesn't strictly matter ("on"/"off", "1"/"0", "yes"/"no"). All that matters is that there are just two of them.

Keep that core definition in mind. But you will find a large number of other idiomatic usages of the word "binary" in the computer world, depending on context.

As an example: some people will refer to a file representing an executable image (such as an .EXE file on Windows) as simply "a binary" (or "the binary", when you are compiling a certain codebase and it is clear which executable is meant).

You've tripped across another confusing distinction: people sometimes talk about a file format as being either "textual" or "binary". Yet today's computers are based on systems that are always binary (technically they don't have to be). So if "textual" files weren't ultimately stored as binary bits somewhere, how else would they be stored? :-/

So really, what it means for a file format to be labeled "textual" is that it is stricter about which binary patterns it uses: it only uses those patterns that make sense in certain text encodings. That's why those files look readable when you load them in a text editor. Your test.bin contains nothing but such patterns, which is why Notepad shows it as normal text.

So the "textual file formats" are a subset of all file formats. And sometimes, when people want to refer to something that is not in that subset, they will call it a "binary file format".

Plenty of room for confusion! But the upshot is that choosing "textual" vs. "binary" mode when you open a file in C++ does not change what kind of data you write. Binary mode only tells the stream that you may not be restricting yourself to the bit patterns likely to look good in a text editor, so all bytes should be sent to the file verbatim. Text mode, as a convenience, lets the stream take care of cross-platform text-file differences in newline handling "under the hood" (on Windows, for instance, a '\n' is written as "\r\n"). Since your string contains no newlines, either mode produces the same readable file.
