Cannot connect to a SPARQLRepository with OpenRDF (Sesame) inside the mapper class of a Hadoop/MapReduce job

Question

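I wrote a Java application using the Sesame (RDF4J) API to test the availability of more than 700 SPARQL endpoints, but it takes hours to finish, so I am trying to distribute it with the Hadoop/MapReduce framework.

The problem now is that, inside the mapper class, the method that should test availability does not work; I think it cannot connect to the endpoints.

This is the code I am using: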

public class DMap extends Mapper<LongWritable, Text, Text, Text> {

    protected boolean isActive(String sourceURL)
            throws RepositoryException, MalformedQueryException, QueryEvaluationException {
        boolean t = true;
        SPARQLRepository repo = new SPARQLRepository(sourceURL);
        repo.initialize();
        RepositoryConnection con = repo.getConnection();
        TupleQuery tupleQuery = con.prepareTupleQuery(QueryLanguage.SPARQL, "SELECT * WHERE{ ?s ?p ?o . } LIMIT 1");
        tupleQuery.setMaxExecutionTime(120);
        TupleQueryResult result = tupleQuery.evaluate();
        if (!result.hasNext()) {
            t = false;
        }
        con.close();
        result.close();
        repo.shutDown();
        return t;
    }

    public void map(LongWritable key, Text value, Context context) throws InterruptedException, IOException {
        String src = value.toString();
        String val = "null";
        try {
            boolean b = isActive(src);
            if (b) {
                val = "active";
            } else {
                val = "inactive";
            }
        } catch (MalformedQueryException e) {
            e.printStackTrace();
        } catch (RepositoryException e) {
            e.printStackTrace();
        } catch (QueryEvaluationException e) {
            e.printStackTrace();
        }
        context.write(new Text(src), new Text(val));
    }
}

The input uses TextInputFormat and looks like this:
http://visualdataweb.infor.uva.es/sparql
...

The output uses TextOutputFormat, and this is what I get:
http://visualdataweb.infor.uva.es/sparql null
...

Edit 1: As suggested by @james-leigh and @ChristophE, I switched to try-with-resources statements, but it still does not work:

public class DMap extends Mapper<LongWritable, Text, Text, Text> {

    public void map(LongWritable key, Text value, Context context) throws InterruptedException, IOException {
        String src = value.toString(), val = "";
        SPARQLRepository repo = new SPARQLRepository(src);
        repo.initialize();
        try (RepositoryConnection con = repo.getConnection()) {
            TupleQuery tupleQuery = con.prepareTupleQuery(QueryLanguage.SPARQL, "SELECT * WHERE { ?s ?p ?o . } LIMIT 1");
            tupleQuery.setMaxExecutionTime(120);
            try (TupleQueryResult result = tupleQuery.evaluate()) {
                if (!result.hasNext()) {
                    val = "inactive";
                } else {
                    val = "active";
                }
            }

        }
        repo.shutDown();
        context.write(new Text(src), new Text(val));

    }

}

Thanks

Answer 1

Use a try-with-resources statement. SPARQLRepository uses background threads that must be cleaned up properly.
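To illustrate the cleanup order, here is a minimal sketch that uses only the JDK. `StubRepository`, `StubConnection`, `StubResult`, `probe`, and the `events` log are hypothetical stand-ins invented for this example, not Sesame/RDF4J classes; they only mimic the initialize / getConnection / evaluate / shutDown lifecycle. The sketch shows that try-with-resources closes resources in reverse order, that a `finally` block can run `shutDown()` even when the query throws, and that writing the exception into the status is more informative than leaving it at "null".

```java
import java.util.ArrayList;
import java.util.List;

class CleanupSketch {

    // Records lifecycle calls so the cleanup order can be inspected.
    static final List<String> events = new ArrayList<>();

    // Hypothetical stand-in for a repository (NOT the real SPARQLRepository).
    static class StubRepository {
        void initialize() { events.add("repo.initialize"); }
        StubConnection getConnection() {
            events.add("con.open");
            return new StubConnection();
        }
        void shutDown() { events.add("repo.shutDown"); }
    }

    // Hypothetical stand-in for a repository connection.
    static class StubConnection implements AutoCloseable {
        StubResult evaluate(boolean fail) {
            if (fail) {
                throw new IllegalStateException("endpoint unreachable");
            }
            events.add("result.open");
            return new StubResult();
        }
        @Override public void close() { events.add("con.close"); }
    }

    // Hypothetical stand-in for a query result.
    static class StubResult implements AutoCloseable {
        boolean hasNext() { return true; }
        @Override public void close() { events.add("result.close"); }
    }

    // Returns "active"/"inactive", or an error string instead of a silent "null".
    static String probe(boolean fail) {
        events.clear();
        StubRepository repo = new StubRepository();
        repo.initialize();
        // Resources are closed in reverse order: result first, then connection.
        try (StubConnection con = repo.getConnection();
             StubResult result = con.evaluate(fail)) {
            return result.hasNext() ? "active" : "inactive";
        } catch (RuntimeException e) {
            // Surface the failure so the job output shows why the probe failed.
            return "error: " + e.getClass().getSimpleName() + ": " + e.getMessage();
        } finally {
            // Runs on success and on failure, after the resources are closed.
            repo.shutDown();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe(false) + " " + events);
        System.out.println(probe(true) + " " + events);
    }
}
```

In the real mapper the same shape would apply to the repository, connection, and result objects, with the caught exception's message written to the output value so that a failing endpoint can be told apart from a classpath or configuration problem on the Hadoop nodes.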
