Retrieving large data from IBM Domino

Question

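I am trying to collect account data and group data from IBM Domino v9.0.

The code I wrote to collect the data uses the lotus.domino.View API. It works fine for small data sets: I ran it for 40 users without problems. When I ran it to extract the data for 10k users, it took about 3-4 hours, which is not acceptable.

Can you suggest some other API that would let us retrieve the data faster?

The code I am currently using is as follows: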

private static void testUserDataRetrieval(String host, String userName, String password) throws Exception{
    Session s = NotesFactory.createSession(host, userName, password);
    Database db = s.getDatabase(s.getServerName(), "names");
    File outFile = new File("C:\\Users\\vishva\\Desktop\\dominoExtract.txt");
    FileWriter writer = new FileWriter(outFile);
    View view = db.getView("($Users)");
    ViewEntryCollection entryCollection = view.getAllEntries();
    writer.write("\n--------------------------------- Printing data for view:"+view.getName()+"("+view.getEntryCount()+")--------------------------------------------");
    Vector<Object> columnNames = view.getColumnNames();
    for(int i = 0; i< entryCollection.getCount(); i++){
        ViewEntry entry = entryCollection.getNextEntry();
        if(entry == null) continue;
        Vector<Object> colValues = entry.getColumnValues();
        for(int j = 0; j < columnNames.size(); j++)
            writer.write("\n"+columnNames.get(j)+":"+colValues.get(j));
        writer.write("\n*****************************************");
    }
    writer.flush();
    writer.close();
}

Please let me know which other API I should use to speed up the data retrieval.

Answer 1

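First: you read the count of the collection on every iteration of the for loop. That makes it very slow. The for loop is not needed at all. Second: you never recycle your objects. That is a must when working with Domino objects, otherwise your code will consume memory faster than you might expect: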

private static void testUserDataRetrieval(String host, String userName, String password) throws Exception{
  Session s = NotesFactory.createSession(host, userName, password);
  Database db = s.getDatabase(s.getServerName(), "names");
  File outFile = new File("C:\\Users\\vishva\\Desktop\\dominoExtract.txt");
  FileWriter writer = new FileWriter(outFile);
  View view = db.getView("($Users)");
  ViewEntryCollection entryCollection = view.getAllEntries();
  writer.write("\n--------------------------------- Printing data for view:"+view.getName()+"("+view.getEntryCount()+")--------------------------------------------");
  Vector<Object> columnNames = view.getColumnNames();
  ViewEntry entry = entryCollection.getFirstEntry();
  ViewEntry entryNext;
  while(entry != null) {
    Vector<Object> colValues = entry.getColumnValues();
    for(int j = 0; j < columnNames.size(); j++)
        writer.write("\n"+columnNames.get(j)+":"+colValues.get(j));
    writer.write("\n*****************************************");
    entryNext = entryCollection.getNextEntry();
    entry.recycle();
    entry = entryNext ;
  }

  writer.flush();
  writer.close();
}

Answer 2

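By default, the isAutoUpdate property of your View object is set to true. This means it stays in sync with any changes happening on the back end instead of giving you a one-time snapshot. Add: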

view.setAutoUpdate(false);

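Also, you should probably look into using the ViewNavigator class. See here for an article that describes in detail the performance improvements made to that class starting with Domino 8.5.2. It was written by the lead developer of the class. Also see this blog post for many more interesting details.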
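To make the suggestion concrete, here is a minimal sketch of the same export driven by a ViewNavigator with auto-update switched off. It is an illustration rather than code from the linked articles: the class name ViewNavExport, the method export, and the buffer size of 400 are assumptions, and the block needs the Domino Java classes (lotus.domino) on the classpath to compile.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.util.Vector;

import lotus.domino.Database;
import lotus.domino.NotesException;
import lotus.domino.View;
import lotus.domino.ViewEntry;
import lotus.domino.ViewNavigator;

// Hypothetical helper class, not part of the original answer.
final class ViewNavExport {
    private ViewNavExport() {}

    /** Writes every entry of the ($Users) view to out and returns the number of entries written. */
    static int export(Database db, BufferedWriter out) throws NotesException, IOException {
        View view = db.getView("($Users)");
        view.setAutoUpdate(false);                 // one-time snapshot, no re-sync with the back end

        Vector<?> columnNames = view.getColumnNames();
        ViewNavigator nav = view.createViewNav();
        nav.setBufferMaxEntries(400);              // assumed value: entries fetched per server round trip
        nav.setEntryOptions(ViewNavigator.VN_ENTRYOPT_NOCOUNTDATA); // skip child/descendant counts

        int count = 0;
        ViewEntry entry = nav.getFirst();
        while (entry != null) {
            Vector<?> colValues = entry.getColumnValues();
            for (int j = 0; j < columnNames.size() && j < colValues.size(); j++) {
                out.write(columnNames.get(j) + ":" + colValues.get(j));
                out.newLine();
            }
            count++;
            ViewEntry next = nav.getNext();        // advance before recycling the current entry
            entry.recycle();
            entry = next;
        }
        nav.recycle();
        view.recycle();
        return count;
    }
}
```

The buffer size is only a starting point; the best value depends on how wide the view's columns are, so it is worth timing a few settings against the real directory.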

Answer 3

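Thanks, everyone, for your input. To retrieve the users I later tried the LDAP API, and it works like a charm. I was able to retrieve 10k accounts in 23 seconds, which is acceptable. The code I used is as follows: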

    private void extractDominoDataViaLdap(){
    String [] args = new String[]{"192.168.21.156","Administrator","password","","C:\\Users\\vishva\\Desktop"};
    String server = args[0];
    String userName = args[1];
    String password = args[2];
    String baseDN = args[3];
    String fileLocation = args[4];
    // Set up environment for creating initial context
    Hashtable<String, Object> env = new Hashtable<String, Object>(11);
    env.put(Context.INITIAL_CONTEXT_FACTORY,"com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, "ldap://"+server+":389");

    long time = System.currentTimeMillis();
    // Authenticate with a simple bind using the supplied user name and password
    env.put(Context.SECURITY_AUTHENTICATION, "simple");
    env.put(Context.SECURITY_PRINCIPAL, userName);
    env.put(Context.SECURITY_CREDENTIALS, password);
    FileWriter writer = null;
    BufferedWriter out = null;
    try {
        DirContext ctx = new InitialDirContext(env);
        //fetching user data
        // Create the search controls
        SearchControls searchCtls = new SearchControls();
        // Specify the attributes to return
        String returnedAtts[] = {"FullName","dn","displayname","givenName","sn","location","mail","mailfile","mailserver"};
        searchCtls.setReturningAttributes(returnedAtts);
        // Specify the search scope
        searchCtls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        searchCtls.setCountLimit(0);
        // specify the LDAP search filter
        String searchFilter = "objectClass=inetOrgPerson";

        // Specify the Base for the search
        String searchBase = baseDN;

        // Search for objects using the filter
        NamingEnumeration<SearchResult> answer = ctx.search(searchBase,searchFilter, searchCtls);

        writer = new FileWriter(fileLocation+"\\users.csv");
        out = new BufferedWriter(writer);
        for(String attr : returnedAtts)
            out.write(attr+",");
        out.write("\n");
        int count = 0;
        // Loop through the search results
        while (answer.hasMoreElements()) {
            ++count;
            SearchResult sr = (SearchResult) answer.next();
            Attributes attrs = sr.getAttributes();
            StringBuilder sb = new StringBuilder();
            for(String attr : returnedAtts)
                sb.append("\""+((attrs.get(attr) != null)?attrs.get(attr).get():"")+"\"").append(",");
            out.write(sb.toString()+"\n");
            out.flush();
        }
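        System.out.println("# of users returned: "+count);
        out.close();
        writer.close();
        ctx.close();
    } catch (Exception e) {
        // The posted snippet had no catch block and would not compile;
        // this minimal handler is an assumption, not the original error handling.
        e.printStackTrace();
    }
}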
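One detail worth tightening in the CSV output above: each value is wrapped in double quotes, but a value that itself contains a double quote would break the row. A small sketch of a row builder that doubles embedded quotes follows. The class LdapCsv and its methods are hypothetical names introduced here for illustration; only the javax.naming types come from the JDK.

```java
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;

// Hypothetical helper, not part of the original answer.
final class LdapCsv {
    private LdapCsv() {}

    /** Wraps a value in double quotes and doubles any embedded double quotes. */
    static String quote(Object value) {
        String s = (value == null) ? "" : value.toString();
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    /** Builds one CSV line from the named attributes; missing or empty attributes become empty fields. */
    static String toCsvRow(Attributes attrs, String[] names) {
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < names.length; i++) {
                if (i > 0) {
                    sb.append(',');
                }
                Attribute attr = attrs.get(names[i]);
                // Only the first value of a multi-valued attribute is used, as in the code above.
                Object value = (attr != null && attr.size() > 0) ? attr.get() : null;
                sb.append(quote(value));
            }
        } catch (NamingException e) {
            throw new IllegalStateException("Could not read LDAP attribute value", e);
        }
        return sb.toString();
    }
}
```

Inside the result loop this would replace the StringBuilder block with something like out.write(LdapCsv.toCsvRow(sr.getAttributes(), returnedAtts)) followed by out.newLine().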

Answer 1
2 2014-03-27 16:10:33

Answer 2
1 2014-03-27 18:13:28

Answer 3
0 2014-04-03 12:18:03
