How to find strings that differ only by case and keep just one of them in Java

I have a text file (T1.txt) that contains a few strings. Two of them are the same except for case. I have to ignore one of the two and get the rest.

e.g. ABCD, XYZ, pqrs, aBCd.

I am using a Set to return the strings, but how can I ignore the duplicate and return only one of the two strings (either ABCD or aBCd)?

public static Set findDuplicates(File inputFile)
{
    FileInputStream fis = null;
    BufferedInputStream bis = null;
    DataInputStream dis = null;
    Set<String> set = new HashSet<String>();
    ArrayList<String> inpArrayList = new ArrayList<String>();

    try
    {
        fis = new FileInputStream(inputFile);
        bis = new BufferedInputStream(fis);
        dis = new DataInputStream(bis);

        while (dis.available() != 0)
        {
            inpArrayList.add(dis.readLine());
        }

        for (int i = 0; i < inpArrayList.size(); i++)
        {
            if (!set.contains(inpArrayList.get(i)))
                set.add(inpArrayList.get(i));
        }
    }
    catch (FileNotFoundException e)
    {
        e.printStackTrace();
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
    System.out.println(" set" + set);
    return set;
}

The returned set should contain only XYZ, pqrs, and either aBCd or ABCD, but not both.

Thanks Ramm

Create a HashMap that uses currentString.toLowerCase() as the key and the original string as the value, so that two strings differing only in case map to the same key. Because the stored value is the original string, printing the values gives you one of the original spellings rather than an all-lower-case version.
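A minimal sketch of that idea, assuming the strings have already been read into a List. The class and method names are made up for illustration, and a LinkedHashMap is swapped in for a plain HashMap only so that the result keeps the input order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class, not part of the original question.
class LowerCaseKeyDedup {

    // Keeps the first spelling seen for each lower-cased key.
    static List<String> dedupe(List<String> input) {
        Map<String, String> byLowerCase = new LinkedHashMap<>();
        for (String s : input) {
            // putIfAbsent leaves an existing entry untouched,
            // so later spellings of the same word are ignored.
            byLowerCase.putIfAbsent(s.toLowerCase(Locale.ROOT), s);
        }
        return new ArrayList<>(byLowerCase.values());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("ABCD", "XYZ", "pqrs", "aBCd");
        System.out.println(dedupe(input)); // prints [ABCD, XYZ, pqrs]
    }
}
```

If you would rather keep the last spelling seen, replacing putIfAbsent with put should do it.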

You could use a TreeSet and the String.CASE_INSENSITIVE_ORDER comparator, which I find more elegant than the suggested HashMap solutions:

Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
set.add("abc");
set.add("AbC");
set.add("aBc");
set.add("DEF");
System.out.println(set); // => "[abc, DEF]"

Note that iterating through this set gives you the elements in case-insensitive lexicographical order. If you want to preserve the insertion order as well, I'd maintain a List on the side like this:

Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
List<String> inOrder = new ArrayList<String>();
// when adding stuff inside your loop:
if (set.add(someString)) { // returns true if it was added to the set
    inOrder.add(someString);
}

inpArrayList.add(dis.readLine().toLowerCase());

Adding this line should work...

You can use the old trick of calling .toLowerCase() on each string before putting it in the set.

If you want to keep the original case, switch to a HashMap from the lower-case form to the original form, then iterate over the values.

Convert every string to lowercase before inserting it into the set, and then the set will take care of the uniqueness for you.

(If you also need to preserve the case of the input (returning abcd for AbCd is not acceptable), then you need a second set that stores lower-case variants and use checks on the second set to decide whether or not to add strings to the result set. Same principle, but one more step to program.)

Just store your strings in upper case in your set before storing them in your ArrayList result.

If you can't add a string to the set (because it already exists), don't store it in the ArrayList.
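This can be sketched roughly as follows, assuming the lines are already in a List. The class and method names are invented for the example. It relies on Set.add returning false when the element is already present:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, not part of the original question.
class SeenSetDedup {

    static List<String> dedupe(List<String> input) {
        Set<String> seenUpperCase = new HashSet<>(); // normalized keys only
        List<String> result = new ArrayList<>();     // original spellings, in input order
        for (String s : input) {
            // add() returns false if the upper-cased form was already seen.
            if (seenUpperCase.add(s.toUpperCase(Locale.ROOT))) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("ABCD", "XYZ", "pqrs", "aBCd");
        System.out.println(dedupe(input)); // prints [ABCD, XYZ, pqrs]
    }
}
```

The result keeps the original case of the first occurrence, and the set of upper-case keys is only used for the membership check.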

Just as said above, I did something similar earlier this week. You can do something like (just adjust it to your code):

HashMap<String, String> set = new HashMap<String, String>();

while (tokenizer.hasMoreTokens())
{
    String element = tokenizer.nextToken();
    String lowerCaseElement = element.toLowerCase();
    // Look up the lower-case key, not the original string
    if (!set.containsKey(lowerCaseElement))
    {
       set.put(lowerCaseElement, element);
    }
}

At the end, the values of the map 'set' (set.values()) will contain what you need.

How about using a HashMap whose key is generated by your own normalizing function? That function would return the string in lower case.

Shash

If the case of the output is not important you could use a custom FilterInputStream to do the conversion.

    bis = new BufferedInputStream(fis);
    fltis = new LowerCaseInputStream(bis);
    dis = new DataInputStream(fltis);

An example of LowerCaseInputStream comes from here .
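Since the linked example is not reproduced here, the following is only a sketch of what such a stream might look like, not the code the answer refers to. It assumes single-byte ASCII input and lower-cases only the bytes 'A' to 'Z'. The lowerAscii helper is a made-up method for demonstration:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Sketch of a lower-casing filter stream (ASCII only, an assumption).
class LowerCaseInputStream extends FilterInputStream {

    LowerCaseInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        // -1 (end of stream) falls outside 'A'..'Z' and passes through unchanged.
        return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        // When n is -1 the loop body never runs.
        for (int i = off; i < off + n; i++) {
            if (b[i] >= 'A' && b[i] <= 'Z') {
                b[i] += 'a' - 'A';
            }
        }
        return n;
    }

    // Hypothetical demo helper: pushes an ASCII string through the filter.
    static String lowerAscii(String s) {
        byte[] data = s.getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new LowerCaseInputStream(new ByteArrayInputStream(data))) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lowerAscii("ABCD aBCd")); // prints abcd abcd
    }
}
```

Because the filter works on raw bytes, it is not safe for multi-byte encodings such as UTF-8 text with non-ASCII characters; lower-casing the decoded String is the more general option.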
