I have a file with thousands of records, and I need to filter them based on the 8th character of each line. In my case, if the 8th character is a or A, I want to extract that line and save it to a new file.
I have just put together a simple Java application with 3 items, 2 of which (the 1st and 3rd) have the data I want. I am printing to the console, but my matcher isn't working.
My code example:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ValidateDemo {
    public static void main(String[] args) {
        String pattern = "^.{7}([aA]{1})";
        // Create a Pattern object
        Pattern p = Pattern.compile(pattern);
        List<String> input = new ArrayList<String>();
        input.add("CARHALAALondon GB W");
        input.add("T(U   LRFonhai CN E");
        input.add("A$F   LAMuguni VE E");
        for (String ssn : input) {
            System.out.println(p + " -> " + ssn);
            if (p.matcher(ssn).matches()) {
                System.out.println("Match: " + ssn);
            }
        }
    }
}
Output:
^.{7}([aA]{1}) -> CARHALAALondon GB United Kingdom W
^.{7}([aA]{1}) -> T(U LRFonhai CN China E
^.{7}([aA]{1}) -> A$F LAMuguni VE Venezuela E
As you can see, only the first System.out.println prints; the "Match" line never appears. Does anyone have any idea how I can achieve what I'm trying to do?
Thanks
G
You are almost there: Matcher::matches attempts to match the whole string, and your pattern only covers the first 8 characters.
This pattern should do what you want:
String pattern = "^.{7}[aA].*";
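If you would rather keep the original pattern unchanged, Matcher also offers lookingAt() and find(), which do not require the whole string to match. Here is a minimal sketch of the difference; the class and method names are made up for illustration, and the sample lines are the first two from the question:

```java
import java.util.regex.Pattern;

public class MatchModesDemo {
    // The question's pattern without a trailing ".*" (illustrative constant name).
    static final Pattern EIGHTH_IS_A = Pattern.compile("^.{7}[aA]");

    // matches(): the pattern must consume the entire line.
    static boolean wholeLine(String s) {
        return EIGHTH_IS_A.matcher(s).matches();
    }

    // lookingAt(): the pattern must match at the start, the rest is ignored.
    static boolean prefix(String s) {
        return EIGHTH_IS_A.matcher(s).lookingAt();
    }

    // find(): the pattern may match anywhere; the "^" anchor pins it to the start.
    static boolean anywhere(String s) {
        return EIGHTH_IS_A.matcher(s).find();
    }

    public static void main(String[] args) {
        String hit = "CARHALAALondon GB W";   // 8th character is 'A'
        String miss = "T(U LRFonhai CN E";    // 8th character is not 'a' or 'A'

        System.out.println(wholeLine(hit));   // false: line is longer than 8 characters
        System.out.println(prefix(hit));      // true
        System.out.println(anywhere(hit));    // true
        System.out.println(prefix(miss));     // false
    }
}
```

With lookingAt() the leading "^" is redundant, since that method always anchors at the start of the input.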
Alternatively (simpler and more efficient):
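for (String ssn : input) {
    // Skip lines shorter than 8 characters; charAt(7) would throw on them
    if (ssn.length() < 8) {
        continue;
    }
    char eighth = ssn.charAt(7);
    if (eighth == 'a' || eighth == 'A') {
        System.out.println("Match: " + ssn);
    }
}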
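Since the real goal is to copy the matching lines from one file into another, the same check could be wrapped around a line-by-line read, roughly as sketched below. The file names records.txt and filtered.txt, the class name, and the UTF-8 encoding are assumptions for illustration, not details from the question:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class EighthCharFilter {
    // True when the line has at least 8 characters and the 8th is 'a' or 'A'.
    static boolean eighthCharIsA(String line) {
        if (line.length() < 8) {
            return false;
        }
        char eighth = line.charAt(7);
        return eighth == 'a' || eighth == 'A';
    }

    // Copies matching lines from source to target one line at a time,
    // so the whole file never has to fit in memory. Returns the number kept.
    static long copyMatchingLines(Path source, Path target) throws IOException {
        long kept = 0;
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (eighthCharIsA(line)) {
                    out.write(line);
                    out.newLine();
                    kept++;
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input and output paths; replace with your own.
        long kept = copyMatchingLines(Paths.get("records.txt"), Paths.get("filtered.txt"));
        System.out.println("Kept " + kept + " lines");
    }
}
```

The length guard is there because charAt(7) throws a StringIndexOutOfBoundsException on any line shorter than 8 characters, which is easy to hit with blank or truncated records.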
I would ditch the regular expression and just check the character with String's charAt(int) method, as I've done in the eighthCharIsACharAt method below:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ValidateDemo {
    private static boolean eighthCharIsACharAt(String s) {
        char eighthChar = s.charAt(7);
        return (eighthChar == 'a' || eighthChar == 'A');
    }

    private static boolean eighthCharIsAMatcher(String s, Pattern p) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String pattern = "^.{7}[aA].*";
        Pattern p = Pattern.compile(pattern);
        List<String> input = new ArrayList<String>();
        input.add("CARHALAALondon GB W");
        input.add("T(U   LRFonhai CN E");
        input.add("A$F   LAMuguni VE E");
        int numIterations = 10000;

        // Time the regex-based check
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < numIterations; i++) {
            for (String s : input) {
                if (eighthCharIsAMatcher(s, p)) {
                    //System.out.println(s);
                }
            }
        }
        System.out.println("Matcher elapsed time: " + (System.currentTimeMillis() - startTime) + " ms");

        // Time the charAt-based check
        startTime = System.currentTimeMillis();
        for (int i = 0; i < numIterations; i++) {
            for (String s : input) {
                if (eighthCharIsACharAt(s)) {
                    //System.out.println(s);
                }
            }
        }
        System.out.println("charAt elapsed time: " + (System.currentTimeMillis() - startTime) + " ms");
    }
}
Regular expressions are great, but even a precompiled Pattern does far more work per line than a single character comparison, so they are comparatively slow in a tight loop. In your specific case, a regex seems like overkill.
In my test comparison of charAt versus Pattern matching, charAt wins by over a factor of 10.
Run output:
Matcher elapsed time: 64 ms
charAt elapsed time: 4 ms