How to manipulate default value retrieved from x-ray scraper (node.js)
This is my code:
var Xray = require('x-ray');
var x = Xray();
x('http://someurl.com', 'tr td:nth-child(2)', [{
  text: 'a',
  url: 'a@href'
}]).write('results.json')
I need to populate the field named "text" with only the first word from each a tag. An example of an a tag value:
"FirstWord SecondWord ThirdWord"
Actual result: text: FirstWord SecondWord ThirdWord
Desired result: text: FirstWord
I could post-process the results.json file, but I would rather not do it that way.
You can define your own functions in the filters option, as shown on the official GitHub page:
var Xray = require('x-ray');
var x = Xray({
  filters: {
    trim: function (value) {
      return typeof value === 'string' ? value.trim() : value
    },
    reverse: function (value) {
      return typeof value === 'string' ? value.split('').reverse().join('') : value
    },
    slice: function (value, start, end) {
      return typeof value === 'string' ? value.slice(start, end) : value
    }
  }
});
x('http://mat.io', {
  title: 'title | trim | reverse | slice:2,3'
})(function(err, obj) {
  /*
  {
    title: 'oi'
  }
  */
})
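Applied to the question, the same mechanism could keep only the first word of each link text. The following is a minimal sketch, not tested against a live page: the filter name firstWord is made up for this example (it is not something x-ray ships), and the selector and URL are simply copied from the question. The x-ray wiring is wrapped in a scrape() function so the filter itself can be tried on its own.

```javascript
// Hypothetical filter: keep only the first whitespace-separated word.
// Non-string values are passed through unchanged, like the README filters.
function firstWord(value) {
  if (typeof value !== 'string') return value;
  // trim() first so leading whitespace does not produce an empty first item
  return value.trim().split(/\s+/)[0];
}

// Assumed wiring, reusing the selector and URL from the question.
function scrape() {
  var Xray = require('x-ray');
  var x = Xray({
    filters: { firstWord: firstWord }
  });
  return x('http://someurl.com', 'tr td:nth-child(2)', [{
    text: 'a | firstWord', // pipe the link text through the filter
    url: 'a@href'
  }]).write('results.json');
}
```

Calling scrape() should then write results.json with the text field already shortened, so no separate post-processing step is needed.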
There is a fork of the x-ray library made by cbou.
Its custom x-ray API has a prepare function that can change the output:
https://github.com/cbou/x-ray#xrayprepare-str--fn
Example:
function uppercase(str) {
  return str.toUpperCase();
}
xray('mat.io')
  .prepare('uppercase', uppercase)
  .select('title | uppercase')
  .run(function(err, title) {
    // title == MAT.IO
  });