
Create a Nutch 2.x plugin using Eclipse

I have to write a plugin to parse content crawled by Nutch 2.3.1. I have decided to use Eclipse because it is better than a simple editor. How can I create a plugin in Eclipse and test it with a simple use case?

You can use the following steps to get the plugin working from Eclipse.

  1. Get the Nutch source code.

    git clone https://github.com/apache/nutch.git

  2. Switch to the 2.3.1 branch: https://github.com/apache/nutch/tree/branch-2.3.1. If you want the latest 2.x development code, you can use the 2.x branch instead.

  3. Import the project in Eclipse.

  4. Build for Eclipse. Nutch uses Ant for its build and provides an eclipse target.

    ant eclipse

  5. All available plugins in Nutch are under the src/plugin directory.

  6. Your new plugin needs a similar structure, so copy one of the existing plugins to a new directory.

    cp -r lib-http my-http

  7. Now check the structure of the plugin directory. It should look like this:

    my-http/
    ├── build.xml
    ├── ivy.xml
    ├── plugin.xml
    └── src
        ├── java
        └── test
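
If you would rather see the layout built from scratch than copied, here is a minimal sketch. It assumes the plugin name `my-http` from the example above and works in a temporary scratch directory (a stand-in for `src/plugin` in a real checkout), so it does not touch any Nutch files. The descriptor files are created empty; real ones need content.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch location standing in for <nutch checkout>/src/plugin.
PLUGIN_ROOT="$(mktemp -d)/src/plugin"
PLUGIN_NAME="my-http"   # plugin name taken from the example above
PLUGIN_DIR="$PLUGIN_ROOT/$PLUGIN_NAME"

# Create the source and test trees shown in the listing.
mkdir -p "$PLUGIN_DIR/src/java" "$PLUGIN_DIR/src/test"

# Create empty placeholders for the three descriptor files.
for f in build.xml ivy.xml plugin.xml; do
  : > "$PLUGIN_DIR/$f"
done

# Print the resulting layout, sorted for stable output.
(cd "$PLUGIN_ROOT" && find "$PLUGIN_NAME" | sort)
```

Copying an existing plugin, as in step 6, is usually the more practical route because build.xml and ivy.xml then start out with working content.
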

  8. plugin.xml holds the definitions for extensions, extension points, runtime libraries, and so on. You can view it in the Eclipse plugin editor and make changes from there.

  9. Add the implementation class and its tests, then map the class to an extension in plugin.xml.

  10. Change your build.xml and ivy.xml to add the proper dependencies.

  11. You can override targets defined in src/plugin/build-plugin.xml in your build.xml. build-plugin.xml is pulled in for each plugin when the main build runs.

  12. You can test your plugin by running ant test from the plugin directory.

You can also use Eclipse to check JUnit test results: select the test class and choose Run As > JUnit Test.

  13. Add the plugin to the deploy and test targets in src/plugin/build.xml. This file is used by the main build file.

    <ant dir="my-http" target="deploy"/>

  14. Add any required dependencies in ivy/ivy.xml.

  15. Add the plugin to the plugin.includes property in conf/nutch-site.xml.

  16. Build Nutch.

    ant runtime

Now your plugin is set up to run in local or distributed mode from the runtime directory.
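
The descriptor and configuration steps can be sketched as follows. This is only an illustration under assumptions: the class `org.example.nutch.MyParseFilter`, the ids, and the provider name are made up, the extension point `org.apache.nutch.parse.ParseFilter` is assumed to be the right one for a 2.x parse filter, and the plugin ids listed before `my-http` in `plugin.includes` are placeholders. The script writes both fragments into a temporary directory so that nothing in a real checkout is overwritten.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory; in a real checkout these would be
# src/plugin/my-http/plugin.xml and conf/nutch-site.xml.
WORK_DIR="$(mktemp -d)"

# Descriptor sketch. The ids, names, and the class
# org.example.nutch.MyParseFilter are made up for illustration; the
# extension point is an assumption about the Nutch 2.x parse-filter point.
cat > "$WORK_DIR/plugin.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<plugin id="my-http" name="My example plugin" version="1.0.0"
        provider-name="example.org">
  <runtime>
    <library name="my-http.jar">
      <export name="*"/>
    </library>
  </runtime>
  <requires>
    <import plugin="nutch-extensionpoints"/>
  </requires>
  <extension id="org.example.nutch.myparsefilter"
             name="My example parse filter"
             point="org.apache.nutch.parse.ParseFilter">
    <implementation id="MyParseFilter"
                    class="org.example.nutch.MyParseFilter"/>
  </extension>
</plugin>
EOF

# plugin.includes sketch: the value is a regular expression over plugin ids.
# The ids before my-http are placeholders; start from the value in your own
# configuration and append the new plugin id.
cat > "$WORK_DIR/nutch-site-fragment.xml" <<'EOF'
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|my-http</value>
</property>
EOF

# Show the plugin id from the descriptor and count the matching lines
# in the configuration fragment.
grep -o 'plugin id="[^"]*"' "$WORK_DIR/plugin.xml"
grep -c 'my-http' "$WORK_DIR/nutch-site-fragment.xml"
```

The point of the sketch is that the same id has to appear in three places: the plugin directory name, the `id` attribute in plugin.xml, and the `plugin.includes` expression.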

You can use any editor you want to write your code. As long as you generate a jar that the Nutch plugin system can load, with the right dependencies and configuration in the XML files, everything should work. You can check https://wiki.apache.org/nutch/RunNutchInEclipse, which contains detailed instructions for opening and running Nutch within Eclipse. That makes debugging easier, but it is not required. It is especially important to run ant eclipse in your local copy of the project so that you can open the entire Nutch source code in Eclipse. Once this is done, you can create your plugin file structure and start coding.
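
As a rough sketch of that preparation, the commands could be chained like this. The repository URL and the branch name `branch-2.3.1` come from the links in this thread; the `run` helper, the `DRY_RUN` switch, and the local directory name `nutch` are assumptions added for illustration. By default the script only prints what it would do.

```shell
#!/bin/sh
set -eu

# Repository URL and branch name as given in this thread.
REPO_URL="https://github.com/apache/nutch.git"
BRANCH="branch-2.3.1"

# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually run them (needs git, ant, and network access).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

prepare_checkout() {
  run git clone "$REPO_URL" nutch
  # In a real run the remaining commands must execute inside the clone.
  if [ "$DRY_RUN" != "1" ]; then
    cd nutch
  fi
  run git checkout "$BRANCH"
  # Generates the Eclipse project files so the source can be imported.
  run ant eclipse
}

prepare_checkout
```

After a real run, import the `nutch` directory into Eclipse as an existing project.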

Hope it helps.
