I need a regex to match Puppet Facter facts

Puppet facts look like this:

processors => {"models"=>["AMD Opteron(tm) Processor 6172", "AMD Opteron(tm) Processor 6172", "AMD Opteron(tm) Processor 6172", "AMD Opteron(tm) Processor 6172"], "count"=>4, "physicalcount"=>2}
productname => VMware Virtual Platform
ps => ps -ef
puppetversion => 3.6.2
rubysitedir => /usr/local/brs/harmony-puppet/lib/ruby/site_ruby/2.1.0
rubyversion => 2.1.2
sshecdsakey => AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNDUmg8FQGCO/r/VGABUPwBqT8zTwzXwZCjTdBC6cXj1Mo5ypxuqO1Qtwg9uQagcS5eLNbv+SxHotpzYSXZ1R8g=
sshfp_dsa => SSHFP 2 1 42ffbd293f1501c0718b2b7b3852542329da1758
SSHFP 2 2 eb52d78a34bdadecc41b38366a5580c923bbb6cd0b81cec76de6379ce4251439
sshfp_ecdsa => SSHFP 3 1 d41abd2e3aff846b4efb59dbc8e4803875d33130
SSHFP 3 2 ae77a20a66859976e06efb7d6dd0819db4f9e9d93bc55da52a4bffff6acb1baa
sshfp_rsa => SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326
SSHFP 1 2 132856925e056d02767e6c6ca4015ed21ac4c6eddb727f7c69e5edecb8806884
sshrsakey => AAAAB3NzaC1yc2EAAAADAQABAAABAQDzcJ6158aIkY161vcDH6WKNgKAeUsxrHh+HJH9IEistcV2TUJSdHtG/p5peI+cTa0EhabbNw8ToUU3ZWYmiTmxxuZzxggAxCx6xhWNDgC/492QnouxHnqjxwpFyIYnLpdbaMRV/6t9iE7v09Gfb31TS3/DbAUh5yla1OOeHbxJQ/eUOUYgy7/6eFL43+R9SfiuP11VRK8r325mCOFaPqw8VuNeGul/rMnccBCbuFvgmQnfOo/ldwrfOL2W4qAvfE0bKyG13WrDSlauo+CFtYqDK08hCItjrbVKgVrOzLCuKGzKFuqOgF3u8Q1je23qu7eUmF7lZPYVWSEpkh0xlR0p
swapfree => 1.45 GB
swapfree_mb => 1482.82
swapsize => 1.46 GB
swapsize_mb => 1497.00
system_uptime => {"seconds"=>6034301, "hours"=>1676, "days"=>69, "uptime"=>"69 days"}
timezone => PDT

I am trying to split each fact into a key/value pair. Using this site:

http://rubular.com/

And this regex

(?m)^(\S+) => (((?!^\S+ => ).)*)$

I am able to get what I want (all the keys and values match perfectly). The problem is that I'm writing my code in Java, and using this site:

http://java-regex-tester.appspot.com/

With the same input, I am not getting the matches I want. The problem is with facts whose value contains a newline character, such as this one:

sshfp_rsa => SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326
SSHFP 1 2 132856925e056d02767e6c6ca4015ed21ac4c6eddb727f7c69e5edecb8806884

The match ends up omitting the second line of the value:

key = sshfp_rsa
value = SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326

Can anyone help me build the correct regex?

This regex should work for you:

(?ms)^(\w+) => (.*?)(?=(?:\s^\w+ =>|\z))

In Java code:

// MULTILINE: ^ matches at the start of each line; DOTALL: . also matches newlines.
Pattern p = Pattern.compile("^(\\w+) => (.*?)(?=(?:\\s^\\w+ =>|\\z))",
          Pattern.MULTILINE | Pattern.DOTALL);
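
To show how that pattern might be used end to end, here is a minimal sketch that collects the matches into a map. The class name `FactParser`, the method `parse`, and the shortened sample input are assumptions made for illustration; only the regex and the two flags come from the answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and structure are illustrative only.
final class FactParser {
    // Same pattern as above: key in group 1, (possibly multi-line) value in group 2.
    private static final Pattern FACT = Pattern.compile(
            "^(\\w+) => (.*?)(?=(?:\\s^\\w+ =>|\\z))",
            Pattern.MULTILINE | Pattern.DOTALL);

    /** Splits facter-style "key => value" text into a map that keeps input order. */
    public static Map<String, String> parse(String facterOutput) {
        Map<String, String> facts = new LinkedHashMap<>();
        Matcher m = FACT.matcher(facterOutput);
        while (m.find()) {
            facts.put(m.group(1), m.group(2));
        }
        return facts;
    }

    public static void main(String[] args) {
        // A few facts from the question, including one whose value spans two lines.
        String sample = String.join("\n",
                "puppetversion => 3.6.2",
                "sshfp_rsa => SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326",
                "SSHFP 1 2 132856925e056d02767e6c6ca4015ed21ac4c6eddb727f7c69e5edecb8806884",
                "timezone => PDT");
        parse(sample).forEach((key, value) ->
                System.out.println("key = " + key + "\nvalue = " + value + "\n"));
    }
}
```

With this input, `sshfp_rsa` should map to both SSHFP lines joined by a newline. If the text ends with a trailing newline, the last value keeps it, so you may want to trim values before using them.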


Use this expression:

(?ms)^(\S+) => (.*?(?=^\S+ => |\Z))

I kept most of your logic and changed only how the value is matched. Let's break that part down:

.*?        (?# lazily match 0+ characters)
(?=        (?# begin lookahead to end value)
  ^\S+ =>  (?# find the start of a new key)
 |         (?# OR)
  \Z       (?# end of the string)
)          (?# end lookahead)

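We add the dot-matches-newline modifier (s) and use a lazy match that is ended by a lookahead. The lookahead matches either the start of a new key or the end of the string. The modifier is the important difference between the two testers: in Ruby (which Rubular uses), (?m) makes the dot match newlines, whereas in Java (?m) only makes ^ and $ match at line boundaries, and (?s) is what lets the dot match newlines.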
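
As a rough illustration of the difference, the sketch below runs both the pattern from the question and the one from this answer in Java on a shortened sample. The class name `LookaheadFactParser`, the `parse` helper, and the step that strips one trailing line break are assumptions added for this example; the lookahead stops at the start of the next key, so the captured value otherwise keeps the line break that precedes it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison class; names are illustrative only.
final class LookaheadFactParser {
    // Pattern from the question: in Java, (?m) is MULTILINE only, so '.' stops at newlines.
    public static final Pattern RUBY_STYLE =
            Pattern.compile("(?m)^(\\S+) => (((?!^\\S+ => ).)*)$");

    // Pattern from this answer: (?s) adds DOTALL so the lazy match can span lines.
    public static final Pattern LOOKAHEAD =
            Pattern.compile("(?ms)^(\\S+) => (.*?(?=^\\S+ => |\\Z))");

    /** Collects key/value pairs; group 1 is the key and group 2 the value in both patterns. */
    public static Map<String, String> parse(Pattern pattern, String text) {
        Map<String, String> facts = new LinkedHashMap<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            // Assumption: drop the single trailing line break that the lookahead
            // pattern leaves at the end of every value except the last.
            facts.put(m.group(1), m.group(2).replaceFirst("\\r?\\n\\z", ""));
        }
        return facts;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "sshfp_rsa => SSHFP 1 1 d3f14587683138e6d10cacba92fa34364ed5d326",
                "SSHFP 1 2 132856925e056d02767e6c6ca4015ed21ac4c6eddb727f7c69e5edecb8806884",
                "swapfree => 1.45 GB");
        System.out.println("question pattern:\n" + parse(RUBY_STYLE, sample).get("sshfp_rsa"));
        System.out.println("answer pattern:\n" + parse(LOOKAHEAD, sample).get("sshfp_rsa"));
    }
}
```

The question's pattern should return only the first SSHFP line for `sshfp_rsa`, while the (?ms) pattern should return both lines.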
