
Java split confusing space character

I am splitting a string which contains a filename from a Windows system. The string uses the ASCII FS to separate the filename from other information.

e.g. filename.jpgFSotherInformationFSanotherPartOfInformation

Here is some example code:

String fs = new String(new byte[]{(byte)32}); 
String information ="filename (copy).jpg"+fs+"otherInformation"; 
String[] parts = information.split(fs);

Why does split confuse the space separator with the ASCII FS?

Should I use a different function than split? Pattern.quote(fs) doesn't help either... :-(

Because FS is not ASCII value 32.

http://bestofthisweb.com/blogs/tag/ascii-table/

FS is character 28, but this control character should not be used in file names, only in some rare binary file formats (I don't know of one that still uses it).

The space character is 32, which is why your separator looks the same as a space to split: it is a space.
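One way to confirm this is to look at the character that the byte value actually produces. The following is only a minimal sketch; the class name `SeparatorCheck` and the helper `fromAscii` are hypothetical and not part of the question's code.

```java
import java.nio.charset.StandardCharsets;

class SeparatorCheck {
    // Hypothetical helper: builds a one-character string from a single ASCII byte value.
    static String fromAscii(int value) {
        return new String(new byte[]{(byte) value}, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String fromQuestion = fromAscii(32);  // what the question's code creates
        String fileSeparator = fromAscii(28); // the ASCII FS control character

        System.out.println(fromQuestion.equals(" "));       // true: byte 32 is a space
        System.out.println((int) fileSeparator.charAt(0));  // 28
        System.out.println(fileSeparator.equals("\u001C")); // true
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding, although for values below 128 most common encodings agree.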

For a simple field separator, I suggest you use ',' or '\t', which can easily be read as text or opened with a spreadsheet package.
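As a rough illustration of that suggestion, the sketch below splits a tab-separated record. The class name `TabSeparated` and the method `fields` are made up for this example, and the tab is chosen over the comma on the assumption that a comma can legitimately appear in a file name.

```java
class TabSeparated {
    // Hypothetical helper: splits a tab-separated record.
    // The limit of -1 keeps trailing empty fields instead of dropping them.
    static String[] fields(String record) {
        return record.split("\t", -1);
    }

    public static void main(String[] args) {
        String record = "filename (copy).jpg" + '\t' + "otherInformation";
        for (String field : fields(record)) {
            System.out.println(field);
        }
    }
}
```

The space inside the file name is left alone because only the tab acts as the separator.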

I would suggest stepping through the code in a debugger so you can see what your program is doing.

You've initialized fs with a space (in a rather complicated way). The following is equivalent and shows your problem:

String fs = " "; 
String information ="filename (copy).jpg"+fs+"otherInformation"; 
String[] parts = information.split(fs);

The ASCII character FS has the value 0x1C, so this should work properly:

String fs = "\u001C"; 
String information ="filename (copy).jpg"+fs+"otherInformation"; 
String[] parts = information.split(fs);
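To see the difference between the two variants side by side, here is a small self-contained sketch. The class name `FsSplitDemo` and the method `splitOnFs` are hypothetical; the strings are the ones from the question.

```java
import java.util.Arrays;

class FsSplitDemo {
    static final String FS = "\u001C"; // ASCII file separator, decimal 28

    // Hypothetical helper: splits the record on the FS control character.
    static String[] splitOnFs(String information) {
        return information.split(FS);
    }

    public static void main(String[] args) {
        String information = "filename (copy).jpg" + FS + "otherInformation";

        // Splitting on a space cuts the file name in two at "filename".
        System.out.println(Arrays.toString(information.split(" ")));

        // Splitting on FS keeps "filename (copy).jpg" in one piece.
        System.out.println(Arrays.toString(splitOnFs(information)));
    }
}
```

Since FS is a non-printing control character, the first line of output may look odd in a terminal; inspecting the array lengths or elements in a debugger is more reliable than reading the printed text.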

Background information

The file separator FS is an interesting control code, as it gives us insight into the way that computer technology was organized in the sixties. We are now used to random access media like RAM and magnetic disks, but when the ASCII standard was defined, most data was serial. I am not only talking about serial communications, but also about serial storage like punch cards, paper tape and magnetic tapes. In such a situation it is clearly efficient to have a single control code to signal the separation of two files. The FS was defined for this purpose. (source)

FS was invented to separate actual files, not file names in a hierarchical file directory. Technically, yes, you can use it, but it has a different meaning.

Because FS is ASCII value 28.

ASCII value 32 is the space character.

Split's parameter is actually a regular expression. Have you tried

String[] parts = information.split("\\x20");

Or even

String[] parts = information.split("\\s");
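Because the argument is a regular expression, Pattern.quote only matters when the separator contains regex metacharacters. In the question the separator really was a space, so quoting it could not change the result. The sketch below shows a case where quoting does make a difference; the class name `LiteralSplit`, the method `splitLiteral`, and the `|` separator are assumptions chosen purely for illustration.

```java
import java.util.regex.Pattern;

class LiteralSplit {
    // Hypothetical helper: splits on the separator taken literally,
    // even if it contains regex metacharacters such as '|' or '.'.
    static String[] splitLiteral(String text, String separator) {
        return text.split(Pattern.quote(separator));
    }

    public static void main(String[] args) {
        String text = "a.jpg|info|more";

        // Unquoted, "|" is an alternation of two empty patterns and matches
        // between characters, so the text is not split on the bar itself.
        System.out.println(text.split("|").length);

        // Quoted, the bar is treated as a plain character.
        System.out.println(splitLiteral(text, "|").length); // 3
    }
}
```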


 