Parse {"item"="value"} with Java

It probably exists somewhere else, but I can't manage to find it.

I have a file with data saved like this: {"item1"="value1", "item2"="value2"}, and so on. All I have managed to do to get the values is .split(",") and then .split("="). After that I simply remove the quotes with .replace("\"", "").

So it "works", but it isn't very clean or efficient, especially if I have multiple {} blocks.

Is there a better way to do it? Or should I save my data another way? I'm really not good with data storage.

Thank you very much!

I can think of 2 ways to do that:

  1. Match and replace:

    Replace the "=" symbol between each key and value with ":" to turn the data into JSON, then parse it with a third-party library such as Gson. I would suggest using a regex to find the "=" between keys and values, as the symbol might also appear inside the string literal of a value and/or key.

  2. Check that the braces are balanced:

    You can use Stack to do that. Refer to here for implementation details.
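
Both ideas can be sketched with the standard library alone. The class and method names below are hypothetical, and the sketch assumes the file uses plain double quotes (") with backslash escapes; the regex is only an approximation and could misfire on unusual inputs. The output of toJson could then be handed to a JSON library such as Gson.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BraceFormatSketch {
    // Regex: a quoted string (backslash escapes allowed), then '=', then another opening quote.
    private static final String KEY_EQUALS =
            "(\"(?:[^\"\\\\]|\\\\.)*\")\\s*=\\s*(?=\")";

    /** Approach 1: rewrite "key"="value" as "key":"value" so a JSON parser can read it. */
    static String toJson(String input) {
        return input.replaceAll(KEY_EQUALS, "$1:");
    }

    /** Approach 2: check that every '{' outside a string literal has a matching '}'. */
    static boolean isBalanced(String input) {
        Deque<Character> stack = new ArrayDeque<>();
        boolean inString = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                      // skip the escaped character
                } else if (c == '"') {
                    inString = false;         // closing quote
                }
            } else if (c == '"') {
                inString = true;              // opening quote
            } else if (c == '{') {
                stack.push(c);
            } else if (c == '}') {
                if (stack.isEmpty()) {
                    return false;             // '}' without a matching '{'
                }
                stack.pop();
            }
        }
        return stack.isEmpty() && !inString;
    }

    public static void main(String[] args) {
        String line = "{\"item1\"=\"value1\", \"item2\"=\"value2\"}";
        System.out.println(isBalanced(line));
        System.out.println(toJson(line));
    }
}
```

An "=" inside a value, as in {"a"="b=c"}, is left alone because only an "=" sitting between a closing and an opening quote is rewritten.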

Note that you could do all of this in 1 line of code:

String[] array = line.replaceAll("[\\{\\}\"]", "").split(",");

If I'm not mistaken, you said that you .split(",") before you .replace("\"", "").
If that is the case, there is a problem, because you must then iterate through all the split elements to make the replacement.
Instead, make all the replacements first and split last, to get an array of pairs (item=value).
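
As a minimal sketch of that order of operations (replace first, split last), the pairs can be collected into a map. The class name PairSplitSketch is hypothetical, and the code assumes that keys and values never contain commas, braces, quotes, or "=" themselves.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PairSplitSketch {
    /** Strips braces and quotes once, then splits into pairs and into key/value. */
    static Map<String, String> parse(String line) {
        Map<String, String> result = new LinkedHashMap<>();   // keeps the file's order
        String cleaned = line.replaceAll("[\\{\\}\"]", "");
        for (String pair : cleaned.split(",")) {
            String[] kv = pair.split("=", 2);                 // split on the first '=' only
            if (kv.length == 2) {
                result.put(kv[0].trim(), kv[1].trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {item1=value1, item2=value2}
        System.out.println(parse("{\"item1\"=\"value1\", \"item2\"=\"value2\"}"));
    }
}
```

Because the braces are simply deleted, this flattens multiple or nested {} groups into one map, so it only suits the simple one-group-per-line case.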

You can use the ANTLR parser generator to parse such input from a file. The grammar you can use is as follows:

S → {Q}
Q → T | ɛ
T → A | A,T
A → I=V | {T}
I → item       // item is the regex for whatever you expect in the item field
V → value      // value is the regex for whatever you expect in the value field

The above grammar matches strings of the following types (it also supports nesting of braces):

  1. {item=value}
  2. {item=value,item=value}
  3. {item=value,item=value,item=value,item=value}
  4. {item=value,item=value,item=value,item=value,{item=value},{item=value}}
  5. {item=value,item=value,item=value,item=value,{item=value,{item=value}}}
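
To try the grammar on these examples without ANTLR, here is a rough hand-written recognizer that follows the rules S, T and A one method each. The class name is hypothetical, and the sketch assumes that item and value are both runs of letters, digits, or '_' and that the input contains no whitespace.

```java
class BraceGrammarRecognizer {
    private final String s;
    private int pos = 0;

    private BraceGrammarRecognizer(String s) {
        this.s = s;
    }

    /** Returns true if the whole input is derivable from S. */
    static boolean matches(String input) {
        BraceGrammarRecognizer r = new BraceGrammarRecognizer(input);
        return r.parseS() && r.pos == input.length();
    }

    // S -> { Q }   where   Q -> T | epsilon
    private boolean parseS() {
        if (!eat('{')) return false;
        if (peek() != '}' && !parseT()) return false;
        return eat('}');
    }

    // T -> A | A , T
    private boolean parseT() {
        if (!parseA()) return false;
        while (eat(',')) {
            if (!parseA()) return false;
        }
        return true;
    }

    // A -> I = V | { T }
    private boolean parseA() {
        if (eat('{')) return parseT() && eat('}');
        return parseWord() && eat('=') && parseWord();
    }

    // I and V: assumed here to be one or more letters, digits, or '_'
    private boolean parseWord() {
        int start = pos;
        while (pos < s.length()
                && (Character.isLetterOrDigit(s.charAt(pos)) || s.charAt(pos) == '_')) {
            pos++;
        }
        return pos > start;
    }

    private char peek() {
        return pos < s.length() ? s.charAt(pos) : '\0';
    }

    private boolean eat(char c) {
        if (peek() == c) {
            pos++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches("{item=value,item=value,{item=value,{item=value}}}"));
        System.out.println(matches("{item=value,}"));
    }
}
```

The first call should be accepted and the second rejected, since a trailing comma must be followed by another A.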

Once you parse the file input with this grammar using ANTLR, you get a parse tree that preserves the hierarchy of the item-value pairs in the original input. Using this parse tree, you can then easily extract information about the various item-value pairs.

With ANTLR, your parse-tree data structure will look something like this:

                                         ,
                                       /   \
                                     /       \
                                   /           \
                                 /               \
                               /                   \
                             =                      ,                 ===> represents {item=value,{item=value,item=value}}                                                  
                            / \                   /   \
                           /   \                /       \
                       item   value           /           \
                                             =             =
                                            /  \          /  \
                                           /    \        /    \
                                         item  value   item   value

Even if you don't want to use a tool like ANTLR for this task, you can easily write a recursive-descent parser based on this grammar. You will first have to tokenize the file input into these 6 types of tokens, and then feed the tokens to your parser:

Token class  matches
LPAREN         "{"
RPAREN         "}"
COMMA          ","
ITEM           item  //regex for identifying items
VALUE          value //regex for identifying values
EQUAL          "="
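
A tokenizer for these six token classes might look roughly like this. The class name and the word regex (\w+) are assumptions for illustration; since items and values share that regex here, a word is classified as VALUE when it directly follows "=" and as ITEM otherwise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BraceTokenizerSketch {
    enum Type { LPAREN, RPAREN, COMMA, ITEM, VALUE, EQUAL }

    static final class Token {
        final Type type;
        final String text;

        Token(Type type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + "(" + text + ")";
        }
    }

    // Optional whitespace, then either one symbol (group 1) or one word (group 2).
    private static final Pattern TOKEN = Pattern.compile("\\s*(?:([{},=])|(\\w+))");

    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        String s = input.trim();
        Matcher m = TOKEN.matcher(s);
        int pos = 0;
        while (pos < s.length()) {
            m.region(pos, s.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected input near index " + pos);
            }
            if (m.group(1) != null) {
                tokens.add(new Token(symbolType(m.group(1).charAt(0)), m.group(1)));
            } else {
                // A word right after '=' is a VALUE; any other word is an ITEM.
                boolean afterEqual = !tokens.isEmpty()
                        && tokens.get(tokens.size() - 1).type == Type.EQUAL;
                tokens.add(new Token(afterEqual ? Type.VALUE : Type.ITEM, m.group(2)));
            }
            pos = m.end();
        }
        return tokens;
    }

    private static Type symbolType(char c) {
        switch (c) {
            case '{': return Type.LPAREN;
            case '}': return Type.RPAREN;
            case ',': return Type.COMMA;
            default:  return Type.EQUAL;   // the pattern only lets '=' reach this branch
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("{item=value,{a=b}}"));
    }
}
```

The resulting list can then be consumed by a recursive-descent parser with one method per grammar rule.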
