DIH Scheduling in Solr

I have just started playing around with Solr and I have it deployed and running on Tomcat. I have the schema and data import handler set up and it indexes the files just fine. Now I want to schedule this dataImportHandler to run every hour or so.

There is a wiki page detailing the files here.

But there are no instructions on where to create the files or how to deploy them.

A similar question has been asked on Stack Overflow before here.

The answer was to "Create classes ApplicationListener, HTTPPostScheduler and SolrDataImportProperties". I don't know where I should be creating the classes. But I took a guess and I downloaded the latest nightly build and created the classes in the org.apache.solr.handler.dataimport.scheduler package (copy pasting the classes from the wiki page). I compiled and ran the ant dist command to create the deployable jar files.

I configured dataimport.properties as per the instructions in the wiki and then added the listener to the web.xml file as instructed in the answer above. But when I started Tomcat, Solr was not deployed.

I see this error message in the log file:

INFO: Starting Servlet Engine: Apache Tomcat/7.0.14
Jun 21, 2011 5:20:47 PM org.apache.catalina.startup.HostConfig deployDescriptor
INFO: Deploying configuration descriptor solr.xml from /home/sabman/programs/apache-tomcat-7.0.14/conf/Catalina/localhost
Jun 21, 2011 5:20:47 PM org.apache.catalina.startup.HostConfig deployDescriptor
WARNING: A docBase /home/sabman/programs/apache-tomcat-7.0.14/webapps/solr.war inside the host appBase has been specified, and will be ignored
Jun 21, 2011 5:20:47 PM org.apache.catalina.startup.SetContextPropertiesRule begin
WARNING: [SetContextPropertiesRule]{Context} Setting property 'debug' to '0' did not find a matching property.
Jun 21, 2011 5:20:48 PM org.apache.catalina.core.StandardContext startInternal
SEVERE: Error listenerStart

I had to remove the listener entry from web.xml to get Solr working as it was before.

Any idea about what I could be doing wrong?

See my TimerHttpTask for a simple WAR that periodically calls any HTTP link. For example, the link can be a DIH URL that starts a delta import. The project is licensed under the LGPL. Jobs are scheduled through JNDI, so the WAR does not need to be rebuilt. The examples below direct TimerHttpTask to call a URL using Fixed Delay, with an initial delay of 15 seconds and a delay of 60 seconds between calls thereafter.

Jetty JNDI Configuration

<Call name="setProperty">
    <Arg>TIMEAPI-UTC-NOW</Arg> 
    <Arg>FD|15000|60000|http://www.timeapi.org/utc/now.json</Arg>
</Call>

Tomcat JNDI Configuration

TIMEAPI-UTC-NOW="FD|15000|60000|http://www.timeapi.org/utc/now.json"

I got this reply from the Solr mailing list:

The Wiki page describes a design for a scheduler, which has not been committed to Solr yet (I checked). I did see a patch the other day (see https://issues.apache.org/jira/browse/SOLR-2305 ) but it didn't look well tested.

I think that you're basically stuck with something like cron at this time. If your application is written in Java, take a look at the Quartz scheduler - http://www.quartz-scheduler.org/
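If pulling in Quartz feels like too much for a single hourly job, the same idea can be sketched with the JDK's own `ScheduledExecutorService`. This is only a rough illustration, not something taken from Solr: the base URL `http://localhost:8080/solr`, the `/dataimport` handler path, and the class and method names are assumptions, so adjust them to whatever your solrconfig.xml actually registers.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DihTrigger {

    // Builds the import URL. The "/dataimport" path is an assumption; it must
    // match the request handler name configured in solrconfig.xml.
    static String buildImportUrl(String solrBaseUrl, String command) {
        String base = solrBaseUrl.endsWith("/")
                ? solrBaseUrl.substring(0, solrBaseUrl.length() - 1)
                : solrBaseUrl;
        return base + "/dataimport?command=" + command;
    }

    // Sends one GET request and returns the HTTP status code.
    static int trigger(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(30_000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Hypothetical base URL; replace with your own Solr location.
        String url = buildImportUrl("http://localhost:8080/solr", "delta-import");

        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        // Run immediately, then wait one hour after each run finishes.
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                System.out.println("DIH responded with HTTP " + trigger(url));
            } catch (Exception e) {
                // Catch everything: an uncaught exception would cancel all future runs.
                System.err.println("DIH call failed: " + e.getMessage());
            }
        }, 0, 1, TimeUnit.HOURS);
    }
}
```

Because this runs as a separate process, it does not touch Solr's web.xml at all, so a problem in the scheduler cannot stop the Solr webapp from deploying.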

If you copied the source for ApplicationListener, etc. and ran a build, you may want to check that the files are actually being compiled into your distribution. You can do that by opening up the war file and looking for a jar containing .class files for the classes you mentioned, or by looking in the classes directory in the .war to see if they are there. If they're not, they won't get loaded in the web app (hence the failed deployment).
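That check can also be scripted. The following is a minimal sketch, assuming the three classes live in the org.apache.solr.handler.dataimport.scheduler package as described in the question; the `WarClassCheck` class and its `containsClass` method are hypothetical names made up for this example. It looks in WEB-INF/classes first and then inside every jar under WEB-INF/lib.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

public class WarClassCheck {

    // Returns true if the class is present in WEB-INF/classes or in any
    // jar under WEB-INF/lib of the given war file.
    static boolean containsClass(String warPath, String className) throws IOException {
        String classEntry = className.replace('.', '/') + ".class";
        try (ZipFile war = new ZipFile(warPath)) {
            if (war.getEntry("WEB-INF/classes/" + classEntry) != null) {
                return true;
            }
            Enumeration<? extends ZipEntry> entries = war.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (name.startsWith("WEB-INF/lib/") && name.endsWith(".jar")) {
                    // Read the nested jar as a stream and scan its entries.
                    try (ZipInputStream jar = new ZipInputStream(war.getInputStream(entry))) {
                        ZipEntry inner;
                        while ((inner = jar.getNextEntry()) != null) {
                            if (inner.getName().equals(classEntry)) {
                                return true;
                            }
                        }
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        String war = args[0]; // path to solr.war
        // Package name taken from the question; change it if yours differs.
        String pkg = "org.apache.solr.handler.dataimport.scheduler.";
        String[] classes = {"ApplicationListener", "HTTPPostScheduler", "SolrDataImportProperties"};
        for (String c : classes) {
            System.out.println(c + ": " + (containsClass(war, pkg + c) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the path to the war as the only argument. Any class reported as MISSING would explain why the listener named in web.xml cannot be loaded.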

You may have to compile them on your own (create your own jar file that has compiled classes) and include the jar file in the war file manually (this would be a good test, at least).

You could also just use the second answer from that Stack Overflow post, which was to call the command line from cron or the task scheduler.
