Hi, I am working on a shell script. Suppose this is the data my shell script runs on:
Ownership
o Australian Owned
?
Ads for Mining Engineers
232 results for
mining engineers in All States
filtered by Mining Engineers [x] category
* [ ]
[34]get directions
Category:
[35]Mining Engineers
[36]Arrow Electrical Services in Wollongong, NSW under Mining
Engineers logo
[37]email
[38]send to mobile
[39]info
Compare (0)
* [ ]
. [40]Firefly International
Designers & Manufacturers. Service, Repair & Hire.
We are the provider of mining engineers in Mt Thorley, NSW.
25 Thrift Cl, Mt Thorley NSW 2330
ph: (02) 6574 6660
[41]http://www.fireflyint.com.au
[42]get directions
Category:
[43]Mining Engineers
[44]Firefly International in Mt Thorley, NSW under Mining Engineers
logo
[45]email
[46]send to mobile
[47]info
Compare (0)
* [ ]
[48]Materials Solutions
Materials Research & Development, Slurry Rheology & Piping Design.
We are a well established company servicing the mining industry &
associated manufacturing industries in all areas.
Thornlie WA 6108
ph: (08) 6468 4118
[49]www.materialssolutions.com.au
Category:
[50]Mining Engineers
[51]Materials Solutions in Thornlie, WA under Mining Engineers logo
[52]email
[53]send to mobile
[54]info
Compare (0)
* [ ]
. [55]ATC Williams Pty Ltd
Our services are available from concept to completion of the works.
Today, as the rebranded ATC Williams, we continue to expand our
operations across Australia and in locations around the world.
Unit 1, 21 Teddington Rd, Burswood WA 6100
ph: (08) 9355 1383
[56]www.atcwilliams.com.au
[57]get directions
Category:
[58]Mining Engineers
[59]ATC Williams Pty Ltd in Burswood, WA under Mining Engineers
logo
[60]email
[61]send to mobile
[62]info
Compare (0)
and I need to grab addresses that look like this:
* [ ]
. [55]ATC Williams Pty Ltd
Our services are available from concept to completion of the works.
Today, as the rebranded ATC Williams, we continue to expand our
operations across Australia and in locations around the world.
Unit 1, 21 Teddington Rd, Burswood WA 6100
ph: (08) 9355 1383
[56]www.atcwilliams.com.au
So what do I do? I've been working on regular expressions like
^*(.?[\\w\\W?\\s?]*)+(.com.au)$
but that's not helping. It matches the address when I give it an input file that contains only the address block I want, but when it is given the data in bulk, it doesn't work. Can somebody help me out?
I see some issues with your regex
^*(.?[\w\W?\s?]*)+(.com.au)$
 ^ ^           ^ ^ ^   ^
 1 1           2 2 1   1
1. Special characters that need escaping (the leading "*" and each "." that should match literally).
2. Greedy quantifiers that match everything up to the last ".com.au". Add a ? after the quantifier to make it lazy, so that it matches as little as possible (that is, up to the first ".com.au" found at the end of a row).
   ==> This is your main problem
3. You nest quantifiers, *)+ ; you don't need that.
4. In your example there is whitespace between the "*" and the ".", so either match the whitespace or remove the dot altogether; it will be matched by your character class anyway.
5. There is also whitespace between the start of the row and the "*".
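To make the escaping and greediness points concrete, here is a minimal sketch in Python (my choice for illustration; the question itself is about a shell script). The two-record `text` sample is shortened from the question's data, and the simplified patterns are assumptions for the demo, not the final regex.

```python
import re

# Hypothetical two-record sample, shortened from the question's data.
text = """* [ ]
. [40]Firefly International
[41]http://www.fireflyint.com.au
* [ ]
. [55]ATC Williams Pty Ltd
[56]www.atcwilliams.com.au
"""

# Escaping: an unescaped dot matches any character, so ".com.au" is too loose.
loose = re.search(r".com.au", "xcomyau") is not None
strict = re.search(r"\.com\.au", "xcomyau") is not None

# Greediness: with re.MULTILINE, "$" matches at the end of every line.
# The greedy form runs on to the LAST line ending in ".com.au";
# the lazy form (extra "?") stops at the FIRST such line.
greedy = re.search(r"^\*[\w\W]*\.com\.au$", text, re.MULTILINE).group(0)
lazy = re.search(r"^\*[\w\W]*?\.com\.au$", text, re.MULTILINE).group(0)

print("unescaped dot matches 'xcomyau':", loose)
print("escaped dot matches 'xcomyau':", strict)
print("records swallowed by greedy match:", greedy.count("* [ ]"))
print("records in lazy match:", lazy.count("* [ ]"))
```

On bulk input the greedy version swallows every record in one match, which fits the symptom described in the question.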
So, try this
^\s*\*([\w\W?\s?]*?)(\.com\.au)$
See it here on Regexr
Try this
^\s*\*\s*\[ \][^\*]+?[.]com[.]au$
Explanation:
^ # Assert position at the beginning of a line (at beginning of the string or after a line break character)
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\* # Match the character “*” literally
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\[ # Match the character “[” literally
\ # Match the character “ ” literally
\] # Match the character “]” literally
[^\*] # Match any character that is NOT a * character
+? # Between one and unlimited times, as few times as possible, expanding as needed (lazy)
[.] # Match the character “.”
com # Match the characters “com” literally
[.] # Match the character “.”
au # Match the characters “au” literally
$ # Assert position at the end of a line (at the end of the string or before a line break character)
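As a rough sketch of how the pattern above behaves on bulk input, the Python snippet below applies it in multiline mode to a shortened copy of the question's data. Python is used only for illustration. The `sample` text, the variable names and the `STATE_POSTCODE` helper pattern (which guesses that the street address is the line ending in a state abbreviation and a four-digit postcode) are assumptions, not part of the original answer.

```python
import re

# Shortened sample based on the question's data. The first record has no
# website line, so it should be skipped.
sample = """\
  * [ ]
[36]Arrow Electrical Services in Wollongong, NSW under Mining
Engineers logo
Compare (0)
  * [ ]
. [55]ATC Williams Pty Ltd
Our services are available from concept to completion of the works.
Unit 1, 21 Teddington Rd, Burswood WA 6100
ph: (08) 9355 1383
[56]www.atcwilliams.com.au
[57]get directions
"""

# The pattern from the answer. re.MULTILINE lets ^ and $ match at line
# boundaries; [^\*] keeps a match from running into the next "* [ ]" record.
RECORD = re.compile(r"^\s*\*\s*\[ \][^\*]+?[.]com[.]au$", re.MULTILINE)

# Assumption: the address is the line that ends with a state abbreviation
# followed by a four-digit postcode, e.g. "WA 6100".
STATE_POSTCODE = re.compile(r"^.*\b[A-Z]{2,3} \d{4}$", re.MULTILINE)

records = [m.strip() for m in RECORD.findall(sample)]
address = STATE_POSTCODE.search(records[0]).group(0).strip()

for record in records:
    print(record)
    print("---")
print("address:", address)
```

Because `[^\*]` refuses to cross an asterisk, a record without a ".com.au" line is dropped instead of being merged with the next record. The trade-off is that a record whose text itself contains a "*" would not match either.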