
Is testing necessary for a crawler?

Is it necessary to test crawler tools? And if so, how?

My company uses a crawler tool (through an API and a GUI) to collect data for customers. The problem is that the GUI of a target website sometimes changes, which causes errors or missing data in the crawler.

Now the boss wants to make sure that every time such a change happens, they know about it immediately (through DevOps/CI/CD, of course).

However, I'm not sure which testing method I should use. My team lead wants me to write automated tests, but that means I have to write the crawler again, by myself. That is just doing the same job twice. Unit testing?

Maybe, but is it necessary? If you want something to alert you about errors, you can do that right in your code.

Besides, as far as I know, testing means taking a standard sample data set that is smaller than the real data set. But when you test a crawler, you test all of its data, because you are able to, which means you just do what the source code did, again.

So what do you think?

Depending on the method you use to get data from the GUI, the crawler may fail if the elements your code expects are no longer on the page. For example, let's say you use Selenium to get all links with a class called "any-class". If this class is removed from the UI, for whatever reason, your crawler is going to fail. Maybe the problem is not the tool itself, but the way the crawler gets the data. If you know which pages you get data from, you can run periodic sanity checks on random pages to find errors before the official execution. You can also implement the crawler so that it keeps running on errors and reports the failed steps for further analysis.
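As a rough illustration of the sanity-check idea, here is a minimal sketch in Python. It uses only the standard library instead of Selenium so that it stays self-contained. The names `count_links`, `sanity_check` and the `fetch` callable are hypothetical and not part of any crawler tool, and "any-class" is just the example class from above. It samples a few pages, checks that the expected links are still there, and collects failures instead of stopping at the first one.

```python
from html.parser import HTMLParser
import random


class ClassLinkCounter(HTMLParser):
    """Counts <a> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # The class attribute may be missing or empty, so fall back to "".
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_links(html, css_class):
    """Return how many <a> elements in `html` have `css_class`."""
    parser = ClassLinkCounter(css_class)
    parser.feed(html)
    return parser.count


def sanity_check(urls, fetch, css_class="any-class", sample_size=3, seed=None):
    """Check a random sample of pages and return a list of (url, reason) failures.

    `fetch` is a hypothetical callable that takes a URL and returns the page HTML;
    in a real setup it could wrap Selenium or an HTTP client.
    """
    rng = random.Random(seed)
    urls = list(urls)
    sample = rng.sample(urls, min(sample_size, len(urls)))
    failures = []
    for url in sample:
        try:
            html = fetch(url)
        except Exception as exc:
            # Keep going on errors and record the failed step for later analysis.
            failures.append((url, f"fetch failed: {exc}"))
            continue
        if count_links(html, css_class) == 0:
            failures.append((url, f"no <a> with class '{css_class}'"))
    return failures
```

A scheduled CI job could call `sanity_check` before the real crawl and fail (or send an alert) when the returned list is not empty. This checks only that the page structure still matches what the crawler expects, so it does not repeat the whole crawl.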


 