Merge two values into a single value using a merge algorithm in Java

I need help merging two similar contact values. I would like to know how I can merge two similar contacts into a single contact before converting the data to an XML file using Java. In my case I have contacts like this:

Contact No1:
Contact
Arun
Arun_niit
nuraaa_iceee@yahoo.co.in
Contact
Contact No2:
Contact
Arun
Arun_niit
nuraaa_iceee@gmail.com
Contact

Contacts No. 1 and 2 have the same name, and so do contacts No. 3 and 4:

Contact No3:
Contact
Rangarajkarthik
karthik Rangaraj
kart26@gmail.com
karthikranga@yahoo.com
Contact


Contact No4:
Contact
Rangaraj
karthik 
kart26@gmail.com
karthikranga@yahoo.com
Contact

The contact above is repeated twice with the same name and email addresses. How can I merge the two entries into a single contact?

This is my .partf file. I want to convert it to an XML file, which I have already managed to do. However, I want to merge these contacts first and then create the XML using my Java code.

Contact
Arun
Arun_niit
nuraaa_iceee@yahoo.co.in
Contact
Contact
ColomboGiorgia
Giorgia Colombo
giorgiacolombo81239@libero.it
Contact
Contact
Arun
Arun_niit
nuraaa_iceee@gmail.com
Contact
Contact
KumarVeera
Veera Kumar
KUMARg_8111@yahoo.com
Contact
Contact
MarbellaFunkybuddha
Funkybuddha Marbella
http://www.facebook.com/profile.php?id=1123301493096451
Contact
Contact
Rangarajkarthik
karthik Rangaraj
kart2006@gmail.com
karthikrangaraj@yahoo.com
Contact
Contact
Rangaraj
karthik 
kart26@gmail.com
karthikranga@yahoo.com
Contact

My Java code:

package textparser;

import java.io.BufferedReader;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.util.regex.Pattern;
import org.xml.sax.ContentHandler;
import com.sun.org.apache.xml.internal.serialize.OutputFormat;
import com.sun.org.apache.xml.internal.serialize.XMLSerializer;
import com.sun.xml.internal.bind.util.AttributesImpl;

public class Item {
    public static void main(String[] args) {
        // Read the plain-text contact file and save it as an XML file.
        Item.readWrite("D:/Demo/test.part", "D:/Demo/juin202.xml");
    }

    public static void readWrite(String fromFile, String toFile) {
        try {
            // Lines matching this pattern are written as <EMail>; other extra lines as <URL>.
            Pattern p = Pattern.compile(".+@.+\\.[a-z]+");
            BufferedReader in = new BufferedReader(new FileReader(fromFile));
            FileOutputStream fos = new FileOutputStream(toFile);
            // Windows-1250 is the code page for Central European languages.
            OutputFormat of = new OutputFormat("XML", "windows-1250", true);
            of.setIndent(1);
            of.setIndenting(true); // enable indented output
            XMLSerializer serializer = new XMLSerializer(fos, of);
            // The serializer exposes a SAX ContentHandler that receives the events below.
            ContentHandler hd = serializer.asContentHandler();
            hd.startDocument();
            AttributesImpl atts = new AttributesImpl(); // empty attribute list, reused for every element
            hd.startElement("", "", "CONTACTS", atts);  // root tag <CONTACTS>
            String line = null, tag;
            while ((line = in.readLine()) != null) {
                if (line.equals("Contact")) {           // opening marker of a contact block
                    line = in.readLine();
                    hd.startElement("", "", "CONTACT", atts);
                    int i = 0;
                    // Stop at the closing marker, or at end of file if the block is unterminated.
                    while (line != null && !line.equals("Contact")) {
                        if (i == 0)
                            tag = "FirstName";
                        else if (i == 1)
                            tag = "LastName";
                        else {
                            if (p.matcher(line).matches())
                                tag = "EMail";
                            else
                                tag = "URL";
                        }
                        hd.startElement("", "", tag, atts);
                        hd.characters(line.toCharArray(), 0, line.length());
                        hd.endElement("", "", tag);
                        i++;
                        line = in.readLine();
                    }
                    hd.endElement("", "", "CONTACT");
                }
            }
            hd.endElement("", "", "CONTACTS");
            hd.endDocument();
            fos.close();
            System.out.println("Work is done and check your file specified in the directory");
            in.close();
        } catch (Exception e) {
            System.out.println("Cannot Generate XML!!!");
            e.printStackTrace(); // show the cause instead of hiding it
        }
    }
}

Am I doing something wrong, or is there a better way of doing this?

This seems simple: as you load your contacts, compare the contact you are loading to those you have already loaded. If it is a duplicate, you don't really need to merge anything; you just need to delete (or simply not load) the duplicate contact.
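A minimal sketch of that idea, assuming each contact has already been read into a list of its lines and that "duplicate" means every line matches after trimming and lower-casing. The names DuplicateFilter, keyOf and dropDuplicates are hypothetical and do not appear in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class DuplicateFilter {

    // Hypothetical helper: builds a comparison key by trimming and lower-casing every line.
    static List<String> keyOf(List<String> contactLines) {
        List<String> key = new ArrayList<>();
        for (String line : contactLines) {
            key.add(line.trim().toLowerCase(Locale.ROOT));
        }
        return key;
    }

    // Returns the contacts in their original order, skipping any whose key was already seen.
    static List<List<String>> dropDuplicates(List<List<String>> contacts) {
        Set<List<String>> seen = new HashSet<>();
        List<List<String>> unique = new ArrayList<>();
        for (List<String> contact : contacts) {
            if (seen.add(keyOf(contact))) { // add() returns false for a repeated key
                unique.add(contact);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<List<String>> contacts = Arrays.asList(
            Arrays.asList("Rangaraj", "karthik ", "kart26@gmail.com", "karthikranga@yahoo.com"),
            Arrays.asList("Arun", "Arun_niit", "nuraaa_iceee@yahoo.co.in"),
            Arrays.asList("Rangaraj", "karthik", "kart26@gmail.com", "karthikranga@yahoo.com"));
        System.out.println(dropDuplicates(contacts).size()); // prints 2
    }
}
```

This only catches exact repeats. The two Arun entries in the question carry different email addresses, so they would both be kept; combining them needs an explicit merge rule.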

I did not look into your code in detail, but I think you have a more general design problem. You should: 1) read the file and store each contact in an object (for now, let's call it InputContact); 2) specify your rules for merging contacts; 3) iterate over the list of InputContact objects, perform the comparisons, and build a new list of MergedContacts.
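The three steps could be sketched roughly as follows. The merge rule used here (two contacts match when first and last name are equal, ignoring case and surrounding spaces) is only an assumption for illustration, and every class and method name (ContactMerger, InputContact, MergedContact, parse, merge, mergeKey) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class ContactMerger {

    // One contact as read from the file: two name lines, then email/URL lines.
    static class InputContact {
        final String firstName;
        final String lastName;
        final List<String> addresses;

        InputContact(String firstName, String lastName, List<String> addresses) {
            this.firstName = firstName.trim();
            this.lastName = lastName.trim();
            this.addresses = addresses;
        }

        // Assumed merge rule: contacts match when both names are equal, ignoring case.
        String mergeKey() {
            return (firstName + "\n" + lastName).toLowerCase(Locale.ROOT);
        }
    }

    // Result of merging: one name pair plus every distinct address, in first-seen order.
    static class MergedContact {
        final String firstName;
        final String lastName;
        final Set<String> addresses = new LinkedHashSet<>();

        MergedContact(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Step 1: turn the "Contact ... Contact" blocks into InputContact objects.
    static List<InputContact> parse(List<String> lines) {
        List<InputContact> contacts = new ArrayList<>();
        List<String> block = null; // null means "outside a contact block"
        for (String line : lines) {
            String text = line.trim();
            if (!text.equals("Contact")) {
                if (block != null && !text.isEmpty()) block.add(text);
            } else if (block == null) {
                block = new ArrayList<>(); // opening marker
            } else {
                if (block.size() >= 2) {   // closing marker: need at least both names
                    contacts.add(new InputContact(block.get(0), block.get(1),
                            new ArrayList<>(block.subList(2, block.size()))));
                }
                block = null;
            }
        }
        return contacts;
    }

    // Steps 2 and 3: group by the merge key and union the addresses.
    static List<MergedContact> merge(List<InputContact> inputs) {
        Map<String, MergedContact> byKey = new LinkedHashMap<>();
        for (InputContact in : inputs) {
            MergedContact merged = byKey.computeIfAbsent(in.mergeKey(),
                    k -> new MergedContact(in.firstName, in.lastName));
            merged.addresses.addAll(in.addresses);
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Contact", "Arun", "Arun_niit", "nuraaa_iceee@yahoo.co.in", "Contact",
            "Contact", "KumarVeera", "Veera Kumar", "KUMARg_8111@yahoo.com", "Contact",
            "Contact", "Arun", "Arun_niit", "nuraaa_iceee@gmail.com", "Contact");
        for (MergedContact c : merge(parse(lines))) {
            System.out.println(c.firstName + " / " + c.lastName + " -> " + c.addresses);
        }
    }
}
```

With this rule the two Arun entries collapse into one contact carrying both email addresses. The Rangarajkarthik and Rangaraj entries would stay separate because their first names differ, so merging those would need a looser rule (for example, sharing at least one email address), which is why step 2 has to be decided before writing step 3. The merged list can then be passed to the existing XML-writing loop in place of the raw file lines.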
