Separate data using Java String.split() method

I've been trying to use the String.split() method in Java to retrieve data from a string. The data in my string is in a weird format. For example, say I have the following string:

String s="My name is      :  John Smith         13   75,5";

I need to split the string so that everything before the : (excluding surrounding whitespace) goes into one string; then, from what's left, everything before the first number (again excluding whitespace) goes into a second string, the first number into a third, and the last number into a fourth. For the String s in my example, I would get the following output:

My name is
John Smith
13
75,5

I tried the following code:

data= s.split("\\s*\\:\\s*|\\s+|\\s+");

But the output was:

My
name
is
John
Smith
13
75,5

I've tried many other regular expressions, but with no success (I believe that proves how much of a beginner I am with regex...). Can somebody help me?

Note: I think it wouldn't be too difficult to write my own split method for my data, and it might even perform better, but I would really like to understand how to do this using regex.

If you want to keep single spaces but split on two or more spaces, the expression looks like this:

data= s.split("\\s*\\:\\s*|\\s{2,}");

This produces the output that you need:

My name is
John Smith
13
75,5

"\\s*\\:\\s*|\\s+|\\s+"

The | characters don't do what you think they might. They act as "or", so in your case the regex matches \\s+ against the single spaces between words: those spaces don't match the first alternative of your regex, but they certainly match the second (and the identical third).
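To see the "or" behaviour in action, here is a minimal sketch. The class name AlternationDemo and the helper splitWith are made up for illustration; only String.split() comes from the question. It also suggests that the repeated \\s+ alternative is redundant, since dropping it gives the same tokens for this input.

```java
import java.util.Arrays;

final class AlternationDemo {
    // Hypothetical helper (not from the question): split the input with the given regex.
    static String[] splitWith(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String s = "My name is      :  John Smith         13   75,5";

        // Regex from the question: the \\s+ alternative matches every run of
        // whitespace, including the single spaces inside "My name is".
        String original = "\\s*\\:\\s*|\\s+|\\s+";

        // Same regex with the duplicated alternative removed.
        String deduplicated = "\\s*\\:\\s*|\\s+";

        System.out.println(Arrays.toString(splitWith(s, original)));
        System.out.println(Arrays.toString(splitWith(s, deduplicated)));
        // Both lines print: [My, name, is, John, Smith, 13, 75,5]
    }
}
```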

A better solution would be something like this:

"\\s*\\:\\s*|\\s{2,}"

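For completeness, a small runnable sketch of that expression applied to the string from the question. The class name SplitFieldsDemo and the method splitFields are hypothetical names chosen for this example, and it assumes the fields are always separated either by the colon or by at least two whitespace characters.

```java
import java.util.Arrays;

final class SplitFieldsDemo {
    // Hypothetical helper: split on a colon with optional surrounding
    // whitespace, or on any run of two or more whitespace characters.
    static String[] splitFields(String input) {
        return input.split("\\s*\\:\\s*|\\s{2,}");
    }

    public static void main(String[] args) {
        String s = "My name is      :  John Smith         13   75,5";
        String[] data = splitFields(s);

        // Single spaces inside "My name is" and "John Smith" are kept,
        // because \\s{2,} needs at least two whitespace characters.
        System.out.println(Arrays.toString(data));
        // Prints: [My name is, John Smith, 13, 75,5]
    }
}
```

Note that this relies on the spacing of the data: if the name and the first number were separated by only one space, they would stay together in a single token.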