java.lang.NullPointerException in OpenNLP using RJB (Ruby Java Bridge)

I am trying to use the open-nlp Ruby gem to access the Java OpenNLP processor through RJB (Ruby Java Bridge). I am not a Java programmer, so I don't know how to solve this. Any recommendations on resolving it, debugging it, or collecting more information would be appreciated.

The environment is Windows 8, Ruby 1.9.3p448, Rails 4.0.0, JDK 1.7.0-40 x586. Gems are rjb 1.4.8 and louismullie/open-nlp 0.1.4. For the record, this file runs in JRuby, but I experience other problems in that environment and would prefer to stay with native Ruby for now.

In brief, the open-nlp gem is failing with java.lang.NullPointerException, which surfaces in Ruby as a method_missing error. I hesitate to say why this is happening because I don't know, but it appears to me that the dynamically loaded Java object opennlp.tools.postag.POSTaggerME@1b5080a cannot be accessed, perhaps because OpenNLP::Bindings::Utils.tagWithArrayList isn't being set up correctly. OpenNLP::Bindings is Ruby. Utils, and its methods, are Java. Utils is also supposedly among the "default" Jars and class files, which may be important.

What am I doing wrong here? Thanks!

The code I am running is copied straight from github/open-nlp. My copy of the code is:

class OpennlpTryer

  $DEBUG=false

  # From https://github.com/louismullie/open-nlp
  # Hints: Dir.pwd; File.expand_path('../../Gemfile', __FILE__);
  # Load the module
  require 'open-nlp'
  #require 'jruby-jars'

=begin
  # Alias "write" to "print" to monkeypatch the NoMethod write error
  java_import java.io.PrintStream
  class PrintStream
    java_alias(:write, :print, [java.lang.String])
  end
=end

=begin
  # Display path of jruby-jars jars...
  puts JRubyJars.core_jar_path # => path to jruby-core-VERSION.jar
  puts JRubyJars.stdlib_jar_path # => path to jruby-stdlib-VERSION.jar
=end
  puts ENV['CLASSPATH']

  # Set an alternative path to look for the JAR files.
  # Default is gem's bin folder.
  # OpenNLP.jar_path = '/path_to_jars/'

  OpenNLP.jar_path = File.join(ENV["GEM_HOME"],"gems/open-nlp-0.1.4/bin/")
  puts OpenNLP.jar_path
  # Set an alternative path to look for the model files.
  # Default is gem's bin folder.
  # OpenNLP.model_path = '/path_to_models/'

  OpenNLP.model_path = File.join(ENV["GEM_HOME"],"gems/open-nlp-0.1.4/bin/")
  puts OpenNLP.model_path
  # Pass some alternative arguments to the Java VM.
  # Default is ['-Xms512M', '-Xmx1024M'].
  # OpenNLP.jvm_args = ['-option1', '-option2']
  OpenNLP.jvm_args = ['-Xms512M', '-Xmx1024M']
  # Redirect VM output to log.txt
  OpenNLP.log_file = 'log.txt'
  # Set default models for a language.
  # OpenNLP.use :language
  OpenNLP.use :english          # Make sure this is lower case!!!!

# Simple tokenizer

  OpenNLP.load

  sent = "The death of the poet was kept from his poems."
  tokenizer = OpenNLP::SimpleTokenizer.new

  tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]
  puts "Tokenize #{tokens}"

# Maximum entropy tokenizer, chunker and POS tagger

  OpenNLP.load

  chunker = OpenNLP::ChunkerME.new
  tokenizer = OpenNLP::TokenizerME.new
  tagger = OpenNLP::POSTaggerME.new

  sent = "The death of the poet was kept from his poems."

  tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]
  puts "Tokenize #{tokens}"

  tags = tagger.tag(tokens).to_a
# => %w[DT NN IN DT NN VBD VBN IN PRP$ NNS .]
  puts "Tags #{tags}"

  chunks = chunker.chunk(tokens, tags).to_a
# => %w[B-NP I-NP B-PP B-NP I-NP B-VP I-VP B-PP B-NP I-NP O]
  puts "Chunks #{chunks}"


# Abstract Bottom-Up Parser

  OpenNLP.load

  sent = "The death of the poet was kept from his poems."
  parser = OpenNLP::Parser.new
  parse = parser.parse(sent)

=begin
  parse.get_text.should eql sent

  parse.get_span.get_start.should eql 0
  parse.get_span.get_end.should eql 46
  parse.get_child_count.should eql 1
=end

  child = parse.get_children[0]

  child.text # => "The death of the poet was kept from his poems."
  child.get_child_count # => 3
  child.get_head_index #=> 5
  child.get_type # => "S"

  puts "Child: #{child}"

# Maximum Entropy Name Finder*

  OpenNLP.load

  # puts File.expand_path('.', __FILE__)
  text = File.read('./spec/sample.txt').gsub("\n", "")

  tokenizer = OpenNLP::TokenizerME.new
  segmenter = OpenNLP::SentenceDetectorME.new
  puts "Tokenizer: #{tokenizer}"
  puts "Segmenter: #{segmenter}"

  ner_models = ['person', 'time', 'money']
  ner_finders = ner_models.map do |model|
    OpenNLP::NameFinderME.new("en-ner-#{model}.bin")
  end
  puts "NER Finders: #{ner_finders}"

  sentences = segmenter.sent_detect(text)
  puts "Sentences: #{sentences}"

  named_entities = []

  sentences.each do |sentence|
    tokens = tokenizer.tokenize(sentence)
    ner_models.each_with_index do |model, i|
      finder = ner_finders[i]
      name_spans = finder.find(tokens)
      name_spans.each do |name_span|
        start = name_span.get_start
        stop = name_span.get_end-1
        slice = tokens[start..stop].to_a
        named_entities << [slice, model]
      end
    end
  end
  puts "Named Entities: #{named_entities}"

# Loading specific models
# Just pass the name of the model file to the constructor. The gem will search for the file in the OpenNLP.model_path folder.

  OpenNLP.load

  tokenizer = OpenNLP::TokenizerME.new('en-token.bin')
  tagger = OpenNLP::POSTaggerME.new('en-pos-perceptron.bin')
  name_finder = OpenNLP::NameFinderME.new('en-ner-person.bin')
# etc.
  puts "Tokenizer: #{tokenizer}"
  puts "Tagger: #{tagger}"
  puts "Name Finder: #{name_finder}"

# Loading specific classes
# You may want to load specific classes from the OpenNLP library that are not loaded by default. The gem provides an API to do this:

# Default base class is opennlp.tools.
  OpenNLP.load_class('SomeClassName')
# => OpenNLP::SomeClassName

# Here, we specify another base class.
  OpenNLP.load_class('SomeOtherClass', 'opennlp.tools.namefind')
  # => OpenNLP::SomeOtherClass

end

The failing line is line 73 of my file (tokens holds the sentence being processed):

  tags = tagger.tag(tokens).to_a
# => %w[DT NN IN DT NN VBD VBN IN PRP$ NNS .]

tagger.tag calls open-nlp/classes.rb line 13, which is where the error is thrown. The code there is:

class OpenNLP::POSTaggerME < OpenNLP::Base

  unless RUBY_PLATFORM =~ /java/
    def tag(*args)
      OpenNLP::Bindings::Utils.tagWithArrayList(@proxy_inst, args[0])  # <== Line 13
    end
  end

end

The Ruby error thrown at this point is: `method_missing': unknown exception (NullPointerException). Debugging this, I found that the underlying error is java.lang.NullPointerException. args[0] is the sentence being processed. @proxy_inst is opennlp.tools.postag.POSTaggerME@1b5080a.

OpenNLP::Bindings sets up the Java environment. For example, it sets up the Jars to be loaded and the classes within those Jars. In line 54, it sets up the defaults for RJB, which should set up OpenNLP::Bindings::Utils and its methods as follows:

  # Add in Rjb workarounds.
  unless RUBY_PLATFORM =~ /java/
    self.default_jars << 'utils.jar'
    self.default_classes << ['Utils', '']
  end

utils.jar and Utils.java are in the CLASSPATH along with the other Jars being loaded. I know that location is being accessed because the other Jars throw error messages if they are not present. The CLASSPATH is:

.;C:\Program Files (x86)\Java\jdk1.7.0_40\lib;C:\Program Files (x86)\Java\jre7\lib;D:\BitNami\rubystack-1.9.3-12\ruby\lib\ruby\gems\1.9.1\gems\open-nlp-0.1.4\bin

The application's Jars are in D:\BitNami\rubystack-1.9.3-12\ruby\lib\ruby\gems\1.9.1\gems\open-nlp-0.1.4\bin and, again, if they are not there I get error messages about the other Jars. The Jars and Java files in ...\bin include:

jwnl-1.3.3.jar
opennlp-maxent-3.0.2-incubating.jar
opennlp-tools-1.5.2-incubating.jar
opennlp-uima-1.5.2-incubating.jar
utils.jar
Utils.java

Utils.java is as follows:

import java.util.Arrays;
import java.util.ArrayList;
import java.lang.String;
import opennlp.tools.postag.POSTagger;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME; // interface instead?
import opennlp.tools.util.Span;

// javac -cp '.:opennlp.tools.jar' Utils.java
// jar cf utils.jar Utils.class
public class Utils {

    public static String[] tagWithArrayList(POSTagger posTagger, ArrayList[] objectArray) {
      return posTagger.tag(getStringArray(objectArray));
    }
    public static Object[] findWithArrayList(NameFinderME nameFinder, ArrayList[] tokens) {
      return nameFinder.find(getStringArray(tokens));
    }
    public static Object[] chunkWithArrays(ChunkerME chunker, ArrayList[] tokens, ArrayList[] tags) {
      return chunker.chunk(getStringArray(tokens), getStringArray(tags));
    }
    public static String[] getStringArray(ArrayList[] objectArray) {
      String[] stringArray = Arrays.copyOf(objectArray, objectArray.length, String[].class);
          return stringArray;
    }
}

So, it should define tagWithArrayList and import opennlp.tools.postag.POSTagger. (By the way, just to try, I changed the occurrences of POSTagger to POSTaggerME in this file. It changed nothing.)

The tools Jar file, opennlp-tools-1.5.2-incubating.jar, includes the postag/POSTagger and POSTaggerME class files, as expected.

Error messages are:

D:\BitNami\rubystack-1.9.3-12\ruby\bin\ruby.exe -e $stdout.sync=true;$stderr.sync=true;load($0=ARGV.shift) D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb
.;C:\Program Files (x86)\Java\jdk1.7.0_40\lib;C:\Program Files (x86)\Java\jre7\lib;D:\BitNami\rubystack-1.9.3-12\ruby\lib\ruby\gems\1.9.1\gems\open-nlp-0.1.4\bin
D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/bin/
D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/bin/
Tokenize ["The", "death", "of", "the", "poet", "was", "kept", "from", "his", "poems", "."]
Tokenize ["The", "death", "of", "the", "poet", "was", "kept", "from", "his", "poems", "."]
D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:13:in `method_missing': unknown exception (NullPointerException)
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:13:in `tag'
    from D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:73:in `<class:OpennlpTryer>'
    from D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'
    from -e:1:in `load'
    from -e:1:in `<main>'

Modified Utils.java:

import java.util.Arrays;
import java.util.Object;
import java.lang.String;
import opennlp.tools.postag.POSTagger;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME; // interface instead?
import opennlp.tools.util.Span;

// javac -cp '.:opennlp.tools.jar' Utils.java
// jar cf utils.jar Utils.class
public class Utils {

    public static String[] tagWithArrayList(POSTagger posTagger, Object[] objectArray) {
      return posTagger.tag(getStringArray(objectArray));
    }
    public static Object[] findWithArrayList(NameFinderME nameFinder, Object[] tokens) {
      return nameFinder.find(getStringArray(tokens));
    }
    public static Object[] chunkWithArrays(ChunkerME chunker, Object[] tokens, Object[] tags) {
      return chunker.chunk(getStringArray(tokens), getStringArray(tags));
    }
    public static String[] getStringArray(Object[] objectArray) {
      String[] stringArray = Arrays.copyOf(objectArray, objectArray.length, String[].class);
          return stringArray;
    }
}

Modified error messages:

Uncaught exception: uninitialized constant OpennlpTryer::ArrayStoreException
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:81:in `rescue in <class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:77:in `<class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'

Revised error with Utils.java changed to "import java.lang.Object;":

Uncaught exception: uninitialized constant OpennlpTryer::ArrayStoreException
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:81:in `rescue in <class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:77:in `<class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'

With the rescue removed from OpennlpTryer, the error turns out to be trapped in classes.rb:

Uncaught exception: uninitialized constant OpenNLP::POSTaggerME::ArrayStoreException
    D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:16:in `rescue in tag'
    D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:13:in `tag'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:78:in `<class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'

Same error, but with all rescues removed so it's "native Ruby":

Uncaught exception: unknown exception
    D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:15:in `method_missing'
    D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp/classes.rb:15:in `tag'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:78:in `<class:OpennlpTryer>'
    D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'

Revised Utils.java:

import java.util.Arrays;
import java.util.ArrayList;
import java.lang.String;
import opennlp.tools.postag.POSTagger;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME; // interface instead?
import opennlp.tools.util.Span;

// javac -cp '.:opennlp.tools.jar' Utils.java
// jar cf utils.jar Utils.class
public class Utils {

    public static String[] tagWithArrayList(POSTagger posTagger, ArrayList[] objectArray) {
      // Debug output: runtime type of the argument, then the argument itself
      System.out.println("Tokens: ("+objectArray.getClass().getSimpleName()+"): \n"+objectArray);
      return posTagger.tag(getStringArray(objectArray));
    }
    public static Object[] findWithArrayList(NameFinderME nameFinder, ArrayList[] tokens) {
      return nameFinder.find(getStringArray(tokens));
    }
    public static Object[] chunkWithArrays(ChunkerME chunker, ArrayList[] tokens, ArrayList[] tags) {
      return chunker.chunk(getStringArray(tokens), getStringArray(tags));
    }
    public static String[] getStringArray(ArrayList[] objectArray) {
      String[] stringArray = Arrays.copyOf(objectArray, objectArray.length, String[].class);
          return stringArray;
    }
}

I ran cavaj on the Utils.class that I unzipped from utils.jar, and this is what I found. It differs from Utils.java by quite a bit. Both come installed with the open-nlp 0.1.4 gem. I don't know if this is the root cause of the problem, but this file is at the core of where it breaks, and we have a major discrepancy. Which should we use?

import java.util.ArrayList;
import java.util.Arrays;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.postag.POSTagger;

public class Utils
{

    public Utils()
    {
    }

    public static String[] tagWithArrayList(POSTagger postagger, ArrayList aarraylist[])
    {
        return postagger.tag(getStringArray(aarraylist));
    }

    public static Object[] findWithArrayList(NameFinderME namefinderme, ArrayList aarraylist[])
    {
        return namefinderme.find(getStringArray(aarraylist));
    }

    public static Object[] chunkWithArrays(ChunkerME chunkerme, ArrayList aarraylist[], ArrayList aarraylist1[])
    {
        return chunkerme.chunk(getStringArray(aarraylist), getStringArray(aarraylist1));
    }

    public static String[] getStringArray(ArrayList aarraylist[])
    {
        String as[] = (String[])Arrays.copyOf(aarraylist, aarraylist.length, [Ljava/lang/String;);
        return as;
    }
}

Utils.java in use as of 10/07, compiled and compressed into utils.jar:

import java.util.Arrays;
import java.util.ArrayList;
import java.lang.String;
import opennlp.tools.postag.POSTagger;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.namefind.NameFinderME; // interface instead?
import opennlp.tools.util.Span;

// javac -cp '.:opennlp.tools.jar' Utils.java
// jar cf utils.jar Utils.class
public class Utils {

    public static String[] tagWithArrayList(POSTagger posTagger, ArrayList[] objectArray) {
      return posTagger.tag(getStringArray(objectArray));
    }
    public static Object[] findWithArrayList(NameFinderME nameFinder, ArrayList[] tokens) {
      return nameFinder.find(getStringArray(tokens));
    }
    public static Object[] chunkWithArrays(ChunkerME chunker, ArrayList[] tokens, ArrayList[] tags) {
      return chunker.chunk(getStringArray(tokens), getStringArray(tags));
    }
    public static String[] getStringArray(ArrayList[] objectArray) {
      String[] stringArray = Arrays.copyOf(objectArray, objectArray.length, String[].class);
          return stringArray;
    }
}

Failures are occurring in BindIt::Binding::load_klass at line 110 here:

# Private function to load classes.
# Doesn't check if initialized.
def load_klass(klass, base, name=nil)
  base += '.' unless base == ''
  fqcn = "#{base}#{klass}"
  name ||= klass
  if RUBY_PLATFORM =~ /java/
    rb_class = java_import(fqcn)
    if name != klass
      if rb_class.is_a?(Array)
        rb_class = rb_class.first
      end
      const_set(name.intern, rb_class)
    end
  else
    rb_class = Rjb::import(fqcn)             # <== This is line 110
    const_set(name.intern, rb_class)
  end
end

The messages are as follows; however, they are inconsistent in terms of the particular class that is identified. Each run may name a different class: any of POSTagger, ChunkerME, or NameFinderME.

D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:110:in `import': opennlp/tools/namefind/NameFinderME (NoClassDefFoundError)
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:110:in `load_klass'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:89:in `block in load_default_classes'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:87:in `each'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:87:in `load_default_classes'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/bind-it-0.2.7/lib/bind-it/binding.rb:56:in `bind'
    from D:/BitNami/rubystack-1.9.3-12/ruby/lib/ruby/gems/1.9.1/gems/open-nlp-0.1.4/lib/open-nlp.rb:14:in `load'
    from D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:54:in `<class:OpennlpTryer>'
    from D:/BitNami/rubystack-1.9.3-12/projects/RjbTest/app/helpers/opennlp_tryer.rb:1:in `<top (required)>'
    from -e:1:in `load'
    from -e:1:in `<main>'

The interesting point about these errors is that they originate in OpennlpTryer line 54, which is:

  OpenNLP.load

At this point, OpenNLP fires up RJB, which uses BindIt to load the jars and classes. This is well before the errors that I was seeing at the beginning of this question. However, I can't help but think it is all related. I really don't understand the inconsistency of these errors at all.

I was able to add the logging function to Utils.java, compile it after adding an "import java.io.*", and compress it. However, I pulled it out because of these errors, as I didn't know whether or not it was involved. I don't think it was. In any case, because these errors occur during load, the method is never called, so logging there won't help.

For each of the other jars, the jar is loaded and then each class is imported using RJB. Utils is handled differently and is specified as the "default". From what I can tell, Utils.class is executed to load its own classes?
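One way to take RJB out of the picture when chasing a NoClassDefFoundError is to ask a plain JVM whether it can locate the classes at all. The following is a minimal sketch, not part of the gem; the class name ClassCheck and the helper isLoadable are made up for illustration, and the class names in main are the ones mentioned in this question.

```java
public class ClassCheck {

    // Hypothetical helper: true if the class loader that loaded ClassCheck
    // can locate the named class. The class is not initialized.
    public static boolean isLoadable(String fqcn) {
        try {
            Class.forName(fqcn, false, ClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // NoClassDefFoundError is a subclass of LinkageError
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the error messages and Utils.java in this post
        String[] names = {
            "opennlp.tools.postag.POSTaggerME",
            "opennlp.tools.chunker.ChunkerME",
            "opennlp.tools.namefind.NameFinderME",
            "Utils"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

To be meaningful, this would need to be run with the same JARs on -cp that the gem loads. Note that, for the JVM itself, a bare directory entry on the class path exposes loose .class files in that directory but not the JAR files inside it. Whether that matters here depends on how the gem hands the JAR paths to RJB, which this post does not show.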

Later update on 10/07:

Here is where I am, I think. First, I have some problem replacing Utils.java, as I described earlier today. That problem probably needs to be solved before I can install a fix.

Second, I now understand the difference between POSTagger and POSTaggerME: the ME means Maximum Entropy. The test code is trying to call POSTaggerME, but it looks to me like Utils.java, as implemented, supports POSTagger. I tried changing the test code to call POSTagger, but it said it couldn't find an initializer. Looking at the source for each of these (and I am guessing here), I think that POSTagger exists for the sole purpose of supporting POSTaggerME, which implements it.

The source is in the opennlp-tools file opennlp-tools-1.5.2-incubating-sources.jar.
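If POSTagger is an interface and POSTaggerME a class implementing it, as the guess above suggests, both observations would follow from ordinary Java rules: an interface has no constructor to call, and a method declared against the interface accepts any implementing object. The sketch below illustrates this with hypothetical stand-ins (Tagger, TaggerME, countTags); none of these names come from OpenNLP.

```java
import java.util.Arrays;

public class InterfaceDemo {

    // Hypothetical stand-in for an interface such as POSTagger
    public interface Tagger {
        String[] tag(String[] tokens);
    }

    // Hypothetical stand-in for an implementation such as POSTaggerME
    public static class TaggerME implements Tagger {
        @Override
        public String[] tag(String[] tokens) {
            String[] tags = new String[tokens.length];
            Arrays.fill(tags, "X"); // dummy tag for every token
            return tags;
        }
    }

    // Declared against the interface, in the style of
    // tagWithArrayList(POSTagger, ...), yet callable with a TaggerME
    public static int countTags(Tagger tagger, String[] tokens) {
        return tagger.tag(tokens).length;
    }

    public static void main(String[] args) {
        System.out.println(Tagger.class.isInterface());            // true
        System.out.println(Tagger.class.getConstructors().length); // 0
        System.out.println(countTags(new TaggerME(), new String[] {"a", "b"})); // 2
    }
}
```

Under that reading, Utils.java naming POSTagger in its signatures would not by itself prevent it from being handed a POSTaggerME instance.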

What I don't get is the whole reason for Utils in the first place. Why aren't the jars and classes provided in bindings.rb enough? This feels like a bad monkeypatch. I mean, look at what bindings.rb does in the first place:

  # Default JARs to load.
  self.default_jars = [
    'jwnl-1.3.3.jar',
    'opennlp-tools-1.5.2-incubating.jar',
    'opennlp-maxent-3.0.2-incubating.jar',
    'opennlp-uima-1.5.2-incubating.jar'
  ]

  # Default namespace.
  self.default_namespace = 'opennlp.tools'

  # Default classes.
  self.default_classes = [
    # OpenNLP classes.
    ['AbstractBottomUpParser', 'opennlp.tools.parser'],
    ['DocumentCategorizerME', 'opennlp.tools.doccat'],
    ['ChunkerME', 'opennlp.tools.chunker'],
    ['DictionaryDetokenizer', 'opennlp.tools.tokenize'],
    ['NameFinderME', 'opennlp.tools.namefind'],
    ['Parser', 'opennlp.tools.parser.chunking'],
    ['Parse', 'opennlp.tools.parser'],
    ['ParserFactory', 'opennlp.tools.parser'],
    ['POSTaggerME', 'opennlp.tools.postag'],
    ['SentenceDetectorME', 'opennlp.tools.sentdetect'],
    ['SimpleTokenizer', 'opennlp.tools.tokenize'],
    ['Span', 'opennlp.tools.util'],
    ['TokenizerME', 'opennlp.tools.tokenize'],

    # Generic Java classes.
    ['FileInputStream', 'java.io'],
    ['String', 'java.lang'],
    ['ArrayList', 'java.util']
  ]

  # Add in Rjb workarounds.
  unless RUBY_PLATFORM =~ /java/
    self.default_jars << 'utils.jar'
    self.default_classes << ['Utils', '']
  end

I don't think you're doing anything wrong at all. You're also not the only one with this problem. It looks like a bug in Utils. Creating an ArrayList[] in Java doesn't make much sense. It's technically legal, but it would be an array of ArrayLists, which a) is just plain odd, b) is terrible practice with regard to Java generics, and c) won't cast properly to String[] the way the author intends in getStringArray().

Given the way the utility is written, and the fact that OpenNLP does in fact expect to receive a String[] as input for its tag() method, my best guess is that the original author meant to have Object[] where they have ArrayList[] in the Utils class.
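To make point c) concrete, here is a small standalone sketch of how the Arrays.copyOf call used in getStringArray() behaves for different inputs. It is an illustration only: CopyOfDemo, toStringArray, and outcome are hypothetical names, and the parameter is declared as Object[] as suggested above.

```java
import java.util.ArrayList;
import java.util.Arrays;

public class CopyOfDemo {

    // Same conversion as getStringArray in Utils.java, declared with Object[]
    public static String[] toStringArray(Object[] input) {
        return Arrays.copyOf(input, input.length, String[].class);
    }

    // Reports which exception, if any, the conversion raises for an input
    public static String outcome(Object[] input) {
        try {
            toStringArray(input);
            return "ok";
        } catch (ArrayStoreException e) {
            return "ArrayStoreException";
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        // Every element is a String: the copy succeeds
        System.out.println(outcome(new Object[] {"The", "death"}));
        // An element that is an ArrayList cannot be stored in a String[]
        System.out.println(outcome(new ArrayList[] {new ArrayList<String>()}));
        // A null array fails as soon as input.length is read
        System.out.println(outcome(null));
    }
}
```

This prints ok, ArrayStoreException, and NullPointerException, in that order. It only shows what the JVM does with each kind of input; which of these cases RJB actually produces when it converts the Ruby array is not something this sketch can establish.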

Update

To output to a file in the root of your project directory, try adjusting the logging like this (I added another line for printing the contents of the input array):

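// Needs "import java.io.*;" at the top of Utils.java (java.util.Arrays is already imported)
try {
    File log = new File("log.txt");
    FileWriter fileWriter = new FileWriter(log);
    BufferedWriter bufferedWriter = new BufferedWriter(fileWriter);
    // Runtime type of the argument, followed by its default toString()
    bufferedWriter.write("Tokens ("+objectArray.getClass().getSimpleName()+"): \r\n"+objectArray.toString()+"\r\n");
    // Element-by-element contents of the input array
    bufferedWriter.write(Arrays.toString(objectArray));
    bufferedWriter.close();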
}
catch (Exception e) {
    e.printStackTrace();
}

SEE FULL CODE AT END FOR THE COMPLETE CORRECTED CLASSES.RB MODULE

I ran into the same problem today. I didn't quite understand why the Utils class was being used, so I modified the classes.rb file in the following way:

unless RUBY_PLATFORM =~ /java/
  def tag(*args)
    @proxy_inst.tag(args[0])
    #OpenNLP::Bindings::Utils.tagWithArrayList(@proxy_inst, args[0])
  end
end

That way, I can make the following test pass:

sent   = "The death of the poet was kept from his poems."
tokens = tokenizer.tokenize(sent).to_a
# => %w[The death of the poet was kept from his poems .]
tags   = tagger.tag(tokens).to_a
# => ["prop", "prp", "n", "v-fin", "n", "adj", "prop", "v-fin", "n", "adj", "punc"]

R_G Edit: I tested that change and it eliminated the error. I am going to have to do more testing to ensure the outcome is what should be expected. However, following that same pattern, I made the following changes in classes.rb as well:

def chunk(tokens, tags)
  chunks = @proxy_inst.chunk(tokens, tags)
  # chunks = OpenNLP::Bindings::Utils.chunkWithArrays(@proxy_inst, tokens,tags)
  chunks.map { |c| c.to_s }
end

...

class OpenNLP::NameFinderME < OpenNLP::Base
  unless RUBY_PLATFORM =~ /java/
    def find(*args)
      @proxy_inst.find(args[0])
      # OpenNLP::Bindings::Utils.findWithArrayList(@proxy_inst, args[0])
    end
  end
end

This allowed the entire sample test to execute without failure. I will provide a later update regarding verification of the results.

FINAL EDIT AND UPDATED CLASSES.RB per Space Pope and R_G:

As it turns out, this answer was key to the desired solution. However, the results were inconsistent after that correction. We continued to drill down into it and implemented strong typing in the calls, as specified by RJB. This converts each call to use the _invoke method, whose parameters are the name of the desired method, the type signature, and the method's arguments. Andre's recommendation was key to the solution, so kudos to him. Here is the complete module. It eliminates the need for the Utils class that was attempting to make these calls but failing. We plan to issue a GitHub pull request for the open-nlp gem to update this module:

require 'open-nlp/base'

class OpenNLP::SentenceDetectorME < OpenNLP::Base; end

class OpenNLP::SimpleTokenizer < OpenNLP::Base; end

class OpenNLP::TokenizerME < OpenNLP::Base; end

class OpenNLP::POSTaggerME < OpenNLP::Base

  unless RUBY_PLATFORM =~ /java/
    def tag(*args)
      @proxy_inst._invoke("tag", "[Ljava.lang.String;", args[0])
    end

  end
end


class OpenNLP::ChunkerME < OpenNLP::Base

  if RUBY_PLATFORM =~ /java/

    def chunk(tokens, tags)
      if !tokens.is_a?(Array)
        tokens = tokens.to_a
        tags = tags.to_a
      end
      tokens = tokens.to_java(:String)
      tags = tags.to_java(:String)
      @proxy_inst.chunk(tokens,tags).to_a
    end

  else

    def chunk(tokens, tags)
      chunks = @proxy_inst._invoke("chunk", "[Ljava.lang.String;[Ljava.lang.String;", tokens, tags)
      chunks.map { |c| c.to_s }
    end

  end

end

class OpenNLP::Parser < OpenNLP::Base

  def parse(text)

    tokenizer = OpenNLP::TokenizerME.new
    full_span = OpenNLP::Bindings::Span.new(0, text.size)

    parse_obj = OpenNLP::Bindings::Parse.new(
    text, full_span, "INC", 1, 0)

    tokens = tokenizer.tokenize_pos(text)

    tokens.each_with_index do |tok,i|
      start, stop = tok.get_start, tok.get_end
      token = text[start..stop-1]
      span = OpenNLP::Bindings::Span.new(start, stop)
      parse = OpenNLP::Bindings::Parse.new(text, span, "TK", 0, i)
      parse_obj.insert(parse)
    end

    @proxy_inst.parse(parse_obj)

  end

end

class OpenNLP::NameFinderME < OpenNLP::Base
  unless RUBY_PLATFORM =~ /java/
    def find(*args)
      @proxy_inst._invoke("find", "[Ljava.lang.String;", args[0])
    end
  end
end
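For readers unfamiliar with the "[Ljava.lang.String;" string passed to _invoke above: it is the JVM's own name for the type String[]. The sketch below, in plain Java without RJB, shows where that string comes from and what selecting a method by an explicit parameter type looks like with reflection. FakeTagger and invokeTag are hypothetical names for illustration, and the reflection call is only an analogy for an explicitly typed call, not a description of RJB's internals.

```java
import java.lang.reflect.Method;

public class SignatureDemo {

    // Hypothetical stand-in for a tagger; not an OpenNLP class
    public static class FakeTagger {
        public String[] tag(String[] tokens) {
            String[] tags = new String[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                tags[i] = "TAG"; // dummy tag for every token
            }
            return tags;
        }
    }

    // Looks up tag(String[]) by its explicit parameter type, then calls it
    public static String[] invokeTag(Object tagger, String[] tokens) {
        try {
            Method m = tagger.getClass().getMethod("tag", String[].class);
            // The cast to Object passes the array as one argument, not as varargs
            return (String[]) m.invoke(tagger, (Object) tokens);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(String[].class.getName()); // [Ljava.lang.String;
        System.out.println(invokeTag(new FakeTagger(), new String[] {"The", "death"}).length); // 2
    }
}
```

In the chunk call, two such type names are simply written back to back, one per argument.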
