Retrieving large data sets from IBM Domino

Question

I am trying to collect account data and group data from IBM Domino v9.0.

The code I wrote to collect the data uses the lotus.domino.View API. It works fine for a small data set: I ran it for 40 users without problems. When I ran it to extract the data for 10k users, it took roughly 3-4 hours, which is not acceptable.

Can you suggest another API that would let us retrieve the data at a much faster rate?

The code I am currently using is as follows:

private static void testUserDataRetrieval(String host, String userName, String password) throws Exception{
    Session s = NotesFactory.createSession(host, userName, password);
    Database db = s.getDatabase(s.getServerName(), "names");
    File outFile = new File("C:\\Users\\vishva\\Desktop\\dominoExtract.txt");
    FileWriter writer = new FileWriter(outFile);
    View view = db.getView("($Users)");
    ViewEntryCollection entryCollection = view.getAllEntries();
    writer.write("\n--------------------------------- Printing data for view:"+view.getName()+"("+view.getEntryCount()+")--------------------------------------------");
    Vector<Object> columnNames = view.getColumnNames();
    for(int i = 0; i< entryCollection.getCount(); i++){
        ViewEntry entry = entryCollection.getNextEntry();
        if(entry == null) continue;
        Vector<Object> colValues = entry.getColumnValues();
        for(int j = 0; j < columnNames.size(); j++)
            writer.write("\n"+columnNames.get(j)+":"+colValues.get(j));
        writer.write("\n*****************************************");
    }
    writer.flush();
    writer.close();
}

Please let me know which other API I should use to speed up data retrieval.

Answer 1

First of all: you read the count of the collection on every iteration of your for loop, which makes it very slow. The for loop is not necessary at all. Second: you never recycle the objects. That is MANDATORY when working with Domino objects, as otherwise your code will eat up memory faster than you can think:

private static void testUserDataRetrieval(String host, String userName, String password) throws Exception{
  Session s = NotesFactory.createSession(host, userName, password);
  Database db = s.getDatabase(s.getServerName(), "names");
  File outFile = new File("C:\\Users\\vishva\\Desktop\\dominoExtract.txt");
  FileWriter writer = new FileWriter(outFile);
  View view = db.getView("($Users)");
  ViewEntryCollection entryCollection = view.getAllEntries();
  writer.write("\n--------------------------------- Printing data for view:"+view.getName()+"("+view.getEntryCount()+")--------------------------------------------");
  Vector<Object> columnNames = view.getColumnNames();
  ViewEntry entry = entryCollection.getFirstEntry();
  ViewEntry entryNext;
  while(entry != null) {
    Vector<Object> colValues = entry.getColumnValues();
    for(int j = 0; j < columnNames.size(); j++)
        writer.write("\n"+columnNames.get(j)+":"+colValues.get(j));
    writer.write("\n*****************************************");
    entryNext = entryCollection.getNextEntry();
    entry.recycle();
    entry = entryNext ;
  }

  writer.flush();
  writer.close();
}

Answer 2

By default, your View object has the isAutoUpdate property set to true. That means it synchronizes with any changes occurring on the back end instead of just giving you a one-time snapshot. Add this:

view.setAutoUpdate(false);

Also, you should probably investigate using the ViewNavigator class. See here for an article detailing performance improvements in this class as of Domino 8.5.2. It's written by the lead developer for the classes. Also see this blog post for a lot of interesting detail.

Answer 3

Thanks, everyone, for your input. For retrieving the users I later tried the LDAP API, and it worked like a charm. I was able to retrieve 10k accounts in 23 seconds, which is quite acceptable. The code I used was as follows:

    private void extractDominoDataViaLdap(){
    String [] args = new String[]{"192.168.21.156","Administrator","password","","C:\\Users\\vishva\\Desktop"};
    String server = args[0];
    String userName = args[1];
    String password = args[2];
    String baseDN = args[3];
    String fileLocation = args[4];
    // Set up environment for creating initial context
    Hashtable<String, Object> env = new Hashtable<String, Object>(11);
    env.put(Context.INITIAL_CONTEXT_FACTORY,"com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldap://"+server+":389");

    long time = System.currentTimeMillis();
    // Authenticate with a simple bind, using the user name and password supplied above
    env.put(Context.SECURITY_AUTHENTICATION, "simple");
    env.put(Context.SECURITY_PRINCIPAL, userName);
    env.put(Context.SECURITY_CREDENTIALS, password);
    FileWriter writer = null;
    BufferedWriter out = null;
    try {
        DirContext ctx = new InitialDirContext(env);
        //fetching user data
        // Create the search controls
        SearchControls searchCtls = new SearchControls();
        // Specify the attributes to return
        String returnedAtts[] = {"FullName","dn","displayname","givenName","sn","location","mail","mailfile","mailserver"};
        searchCtls.setReturningAttributes(returnedAtts);
        // Specify the search scope
        searchCtls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        searchCtls.setCountLimit(0);
        // specify the LDAP search filter
        String searchFilter = "objectClass=inetOrgPerson";

        // Specify the Base for the search
        String searchBase = baseDN;

        // Search for objects using the filter
        NamingEnumeration<SearchResult> answer = ctx.search(searchBase,searchFilter, searchCtls);

        writer = new FileWriter(fileLocation+"\\users.csv");
        out = new BufferedWriter(writer);
        for(String attr : returnedAtts)
            out.write(attr+",");
        out.write("\n");
        int count = 0;
        // Loop through the search results
        while (answer.hasMoreElements()) {
            ++count;
            SearchResult sr = (SearchResult) answer.next();
            Attributes attrs = sr.getAttributes();
            StringBuilder sb = new StringBuilder();
            for(String attr : returnedAtts)
                sb.append("\""+((attrs.get(attr) != null)?attrs.get(attr).get():"")+"\"").append(",");
            out.write(sb.toString()+"\n");
            out.flush();
        }
        System.out.println("# of users returned: "+count);
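        out.close();
        writer.close();
        ctx.close();
    } catch (Exception e) {
        // Covers NamingException from JNDI and IOException from the file writer
        e.printStackTrace();
    }
}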
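
One detail the export loop above does not handle is attribute values that themselves contain a double quote, which is plausible for Domino names. Because each value is wrapped in quotes as-is, such a value would produce a malformed CSV row. The following is a minimal sketch of one way to quote fields, assuming the common CSV convention of doubling embedded quotes. The class name CsvRows and the helpers csvField and csvRow are hypothetical names introduced for illustration; they are not part of the original answer or of any Domino or JNDI API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the original answer.
public class CsvRows {

    // Wraps one value in double quotes and doubles any embedded quotes.
    // A null value (attribute missing from the LDAP entry) becomes an empty field.
    public static String csvField(Object value) {
        String text = (value == null) ? "" : value.toString();
        return "\"" + text.replace("\"", "\"\"") + "\"";
    }

    // Joins the quoted fields with commas, without a trailing comma.
    public static String csvRow(List<?> values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(csvField(values.get(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample values, including an embedded quote and a missing attribute.
        List<Object> sample = Arrays.<Object>asList("CN=John \"JD\" Doe/O=Acme", null, "jd@example.com");
        System.out.println(csvRow(sample));
    }
}
```

In the loop above, the per-attribute append could then be replaced by something like sb.append(CsvRows.csvField(attrs.get(attr) != null ? attrs.get(attr).get() : null)), keeping the rest of the export unchanged.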
