
Integration of RapidMiner in Java application

I have a text classification process in RapidMiner. It reads the test data from a specified Excel sheet and does the classification. I also have a small Java application which just runs this process. Now I want to move the file input into my application, so that every time I can specify the Excel file from my application (not from RapidMiner). Any hints?

This is the code:

import com.rapidminer.RapidMiner;
import com.rapidminer.Process;
import com.rapidminer.example.Attribute;
import com.rapidminer.example.Example;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.io.ExcelExampleSource;
import com.rapidminer.tools.XMLException;

import java.io.File;
import java.io.IOException;
import java.util.Iterator;

public class Classification {

    public static void main(String[] args) throws Exception {
        ExampleSet resultSet1 = null;
        IOContainer ioInput = null;
        IOContainer ioResult;
        try {
            RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
            RapidMiner.init();
            Process pr = new Process(new File("C:\\Users\\MP-TEST\\Desktop\\Rapid_Test\\Wieder_Model.rmp"));
            Operator op = pr.getOperator("Read Excel");
            op.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, "C:\\Users\\MP-TEST\\Desktop\\Rapid_Test\\HaendlerRatings_neu.xls");
            ioResult = pr.run(ioInput);
            if (ioResult.getElementAt(0) instanceof ExampleSet) {
                resultSet1 = (ExampleSet) ioResult.getElementAt(0);

                for (Example example : resultSet1) {
                    Iterator<Attribute> allAtts = example.getAttributes().allAttributes();
                    while (allAtts.hasNext()) {
                        Attribute a = allAtts.next();
                        if (a.isNumerical()) {
                            double value = example.getValue(a);
                            System.out.println(value);
                        } else {
                            String value = example.getValueAsString(a);
                            System.out.println(value);
                        }
                    }
                }
            }
        } catch (IOException | XMLException | OperatorException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}

This is the error:

Apr 09, 2013 9:06:05 AM com.rapidminer.Process run
INFO: Process C:\Users\MP-TEST\Desktop\Rapid_Test\Wieder_Model.rmp starts
com.rapidminer.operator.UserError: A value for the parameter 'excel_file' must be specified! 
    at com.rapidminer.operator.nio.model.ExcelResultSetConfiguration.makeDataResultSet(ExcelResultSetConfiguration.java:316)
    at com.rapidminer.operator.nio.model.AbstractDataResultSetReader.createExampleSet(AbstractDataResultSetReader.java:127)
    at com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:52)
    at com.rapidminer.operator.io.AbstractExampleSource.read(AbstractExampleSource.java:1)
    at com.rapidminer.operator.io.AbstractReader.doWork(AbstractReader.java:126)
    at com.rapidminer.operator.Operator.execute(Operator.java:855)
    at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
    at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:711)
    at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
    at com.rapidminer.operator.Operator.execute(Operator.java:855)
    at com.rapidminer.Process.run(Process.java:949)
    at com.rapidminer.Process.run(Process.java:873)
    at com.rapidminer.Process.run(Process.java:832)
    at com.rapidminer.Process.run(Process.java:827)
    at Classification.main(Classification.java:29)

Best regards

Armen

Works fine for me:

  • Download RapidMiner (and unzip the file)
  • From the "lib" directory, you need:
    1. rapidminer.jar
    2. launcher.jar
    3. All jars in the "/lib/freehep" directory.
  • Put libraries 1, 2 and 3 on the classpath of your Java project (libraries)
  • Copy this code and run:


    import com.rapidminer.Process;
    import com.rapidminer.RapidMiner;
    import com.rapidminer.operator.Operator;
    import com.rapidminer.operator.OperatorException;
    import com.rapidminer.operator.io.ExcelExampleSource;
    import com.rapidminer.tools.XMLException;
    import java.io.File;
    import java.io.IOException;

    public class ReadRapidminerProcess {
      public static void main(String[] args) {
        try {
          RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
          RapidMiner.init();

          Process process = new Process(new File("/your_path/your_file.rmp"));
          process.run();

        } catch (IOException | XMLException | OperatorException ex) {
          ex.printStackTrace();
        }
      }
    }

I hope this helps; I searched a lot before finding the answer.

I see two ways to do that.

The first one would be to change the XML definition of your process programmatically. RapidMiner processes are specified by an XML file with the .rmp extension. In the file you will find the definition of the operator you wish to change. This is an excerpt from a simple process specifying the Read Excel operator:

<operator activated="true" class="read_excel" compatibility="5.3.005" expanded="true" height="60" name="Read Excel" width="90" x="313" y="75">
    <parameter key="excel_file" value="D:\file.xls"/>    <!-- HERE IS THE FILE PATH -->
    <parameter key="sheet_number" value="1"/>
    <parameter key="imported_cell_range" value="A1"/>
    <parameter key="encoding" value="SYSTEM"/>
    <parameter key="first_row_as_names" value="true"/>
    <list key="annotations"/>
    <parameter key="date_format" value=""/>
    <parameter key="time_zone" value="SYSTEM"/>
    <parameter key="locale" value="English (United States)"/>
    <list key="data_set_meta_data_information"/>
    <parameter key="read_not_matching_values_as_missings" value="true"/>
    <parameter key="datamanagement" value="double_array"/>
</operator>

I highlighted the line that contains the path to the Excel file (see the comment). You can overwrite that value in your application. Just be careful not to break the XML file.
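As a rough illustration of this first approach, the sketch below uses the JDK's built-in DOM parser to set the `excel_file` parameter of the operator named "Read Excel" and write the result to a new .rmp file. The class name `RmpExcelPathRewriter`, the method `setExcelFile`, and the three command-line arguments are hypothetical and chosen for this example only; the element and attribute names are taken from the excerpt above. Going through a parser rather than string replacement is one way to avoid breaking the XML.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of RapidMiner.
final class RmpExcelPathRewriter {

    /**
     * Sets the "excel_file" parameter of the operator whose name attribute
     * equals operatorName. Returns true if such a parameter was found.
     */
    static boolean setExcelFile(Document doc, String operatorName, String excelPath) {
        // getElementsByTagName also returns nested operators, in document order.
        NodeList operators = doc.getElementsByTagName("operator");
        for (int i = 0; i < operators.getLength(); i++) {
            Element op = (Element) operators.item(i);
            if (!operatorName.equals(op.getAttribute("name"))) {
                continue;
            }
            NodeList params = op.getElementsByTagName("parameter");
            for (int j = 0; j < params.getLength(); j++) {
                Element param = (Element) params.item(j);
                if ("excel_file".equals(param.getAttribute("key"))) {
                    param.setAttribute("value", excelPath);
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: RmpExcelPathRewriter <in.rmp> <excel file> <out.rmp>");
            System.exit(1);
        }
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File(args[0]));
        if (!setExcelFile(doc, "Read Excel", args[1])) {
            System.err.println("No excel_file parameter found on operator 'Read Excel'");
            System.exit(2);
        }
        // Write the modified process to a separate file so the original stays intact.
        TransformerFactory.newInstance()
                .newTransformer()
                .transform(new DOMSource(doc), new StreamResult(new File(args[2])));
    }
}
```

The rewritten file can then be loaded with `new Process(new File(...))` as in the other snippets in this thread.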


The other way is to modify the operator after you load the process in your Java application. You can get a reference to your operator with Process#getOperator(String name) or Process#getAllOperators(). I guess it should be an instance of one of these classes:

com.rapidminer.operator.io.ExcelExampleSource
com.rapidminer.operator.nio.ExcelExampleSource
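If it is unclear which of the two classes your RapidMiner version ships, a small probe like the one below can tell you which names are loadable from the current classpath. This is only a diagnostic sketch using plain reflection; the class `ExcelOperatorClassProbe` and its method are hypothetical names for this example. On an already loaded process, printing `op.getClass().getName()` for the operator returned by `getOperator` would be the more direct check.

```java
// Hypothetical diagnostic helper, not part of RapidMiner.
final class ExcelOperatorClassProbe {

    // The two candidate class names mentioned above.
    static final String[] CANDIDATES = {
        "com.rapidminer.operator.io.ExcelExampleSource",
        "com.rapidminer.operator.nio.ExcelExampleSource"
    };

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ExcelOperatorClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : CANDIDATES) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "not found"));
        }
    }
}
```

Run it with the same classpath as your application, otherwise the result says nothing about what your application will see.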

When you find the correct operator, you can modify the path with Operator#setParameter(String key, String value).

This code works for me with RapidMiner 5.3 (the process is just a Read Excel operator and a Write CSV operator):

package sorapid;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.io.ExcelExampleSource;
import com.rapidminer.tools.XMLException;
import java.io.File;
import java.io.IOException;

public class SOrapid {

  public static void main(String[] args) {
    try {
      RapidMiner.setExecutionMode(RapidMiner.ExecutionMode.COMMAND_LINE);
      RapidMiner.init();

      Process process = new Process(new File("c:\\Users\\Matlab\\.RapidMiner5\\repositories\\Local Repository\\processes\\test.rmp"));
      Operator op = process.getOperator("Read Excel");
      op.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, "d:\\excel.xls");
      process.run();

    } catch (IOException | XMLException | OperatorException ex) {
      ex.printStackTrace();
    }
  }
}
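To choose the Excel file at run time instead of hard-coding it as in the program above, the path could come from a command-line argument (or a file chooser) and be checked before it is handed to the operator. The following is a minimal sketch of that idea with no RapidMiner dependency; `ExcelPathArgument` and `resolveExcelPath` are hypothetical names, and the only check made is that the path points to an existing regular file.

```java
import java.io.File;

// Hypothetical helper, not part of RapidMiner.
final class ExcelPathArgument {

    /**
     * Trims the user-supplied path, checks that it names an existing regular
     * file and returns its absolute form.
     * Throws IllegalArgumentException otherwise.
     */
    static String resolveExcelPath(String rawPath) {
        if (rawPath == null || rawPath.trim().isEmpty()) {
            throw new IllegalArgumentException("No Excel file given");
        }
        File file = new File(rawPath.trim());
        if (!file.isFile()) {
            throw new IllegalArgumentException("Not an existing file: " + file);
        }
        return file.getAbsolutePath();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: ExcelPathArgument <excel file>");
            System.exit(1);
        }
        String excelPath = resolveExcelPath(args[0]);
        System.out.println("Excel file to use: " + excelPath);
        // In the program above, this value would replace the hard-coded path:
        // op.setParameter(ExcelExampleSource.PARAMETER_EXCEL_FILE, excelPath);
    }
}
```

Failing early with a clear message is a design choice here: a wrong path is reported before the process starts instead of surfacing as a RapidMiner error in the middle of a run.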

Try this:

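// Imports this method needs (package names as in RapidMiner 5.3; adjust for your version).
import com.rapidminer.Process;
import com.rapidminer.example.set.SimpleExampleSet;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.IOObject;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.nio.file.SimpleFileObject;
import com.rapidminer.tools.XMLException;
import java.io.File;
import java.io.IOException;

// Place this method inside your own class. It passes the Excel file to the process
// as an input object; the process presumably feeds that input to Read Excel.
private SimpleExampleSet ReadExcel( File processXMLFile_, File excelFile_ ) throws IOException, XMLException, OperatorException
{
    Process     readExcel    = new Process( processXMLFile_ );
    IOObject    inObject     = new SimpleFileObject( excelFile_ );
    IOContainer inParameters = new IOContainer( inObject );

    // Run the process with the file object as its first input.
    IOContainer outParameters = readExcel.run( inParameters );

    // The first output of the process is expected to be the example set.
    return (SimpleExampleSet) outParameters.getElementAt( 0 );
}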

Sorry, I cannot post an image of the RapidMiner process here; if you need it, I can send it by email.
