Strip dashes from a string?
For web scraping, I need to match the last part of a URL and replace "-" dashes with " " spaces.
The HTML looks like this:
<div class="tags">
<span class="tag" style="background-color: #5A214A;">
<a href="/Services/Research/Telecoms-software/Service-Assurance/">SA</a>
</span>
</div>
I want to be left with "Service Assurance" (this part may contain multiple "-" dashes and require multiple replacements).
Currently being used:
XPath:
//span[@class="tag"]/a/@href
Regex:
/.*/(.*)/
This produces "Service-Assurance", but does not strip out the "-".
I am told elsewhere that this replacement is not possible since I am already using regex to find the string between the final "/" slashes.
Can I do both? Can I replace the "-" dashes at the end, too?
The regex is plain, inside an app called import.io, with no particular language flavour.
Thank you very much.
Try this XPath without the regex:
//*[@class='tag-wrapper']/input[1]/@value
Alternatively, you can also try these methods:
I scrape URLs in Google Sheets all the time with XPaths and regexes, so if you want to try:
=importXML("url goes here","//span[@class='tag']/a/@href")
If you at least get the URL string back, then you know it's working, and we can modify it to this to get what you want:
=SUBSTITUTE(REGEXEXTRACT(importXML("url goes here","//span[@class='tag']/a/@href"),".*\/(.*)\/$"),"-"," ")
Let me know if you have issues. There are a couple of weird quirks with Google Sheets, but if you share the URL you're pulling that XPath from, I can at least test it myself. I use this method more than any other now; I used to use import.io, OutWit Hub, etc. a ton.
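If it helps to see the two steps outside of a spreadsheet, here is a minimal Python sketch of the same idea: capture the last path segment with the regex, then do the dash replacement as a separate step on the captured text. It is an illustration only, not how import.io works internally. The names `HTML` and `last_segment_as_words` are made up for this example, and the standard-library ElementTree parser is used only because the fragment from the question happens to be well-formed.

```python
import re
import xml.etree.ElementTree as ET

# The HTML fragment from the question. It is well-formed, so ElementTree can parse it;
# real-world pages usually need a proper HTML parser instead (assumption for this sketch).
HTML = """
<div class="tags">
<span class="tag" style="background-color: #5A214A;">
<a href="/Services/Research/Telecoms-software/Service-Assurance/">SA</a>
</span>
</div>
"""


def last_segment_as_words(href: str) -> str:
    """Return the final path segment of href with dashes turned into spaces."""
    # Step 1: the greedy first group swallows everything up to the last-but-one slash,
    # leaving the final segment in group 1.
    match = re.search(r".*/(.*)/$", href)
    if match is None:
        return ""
    # Step 2: replace every dash in the captured segment, however many there are.
    return match.group(1).replace("-", " ")


root = ET.fromstring(HTML.strip())
# ElementTree supports only a subset of XPath, so select the <a> elements
# and read the href attribute with .get() instead of using /@href.
hrefs = [a.get("href") for a in root.findall(".//span[@class='tag']/a") if a.get("href")]
labels = [last_segment_as_words(h) for h in hrefs]
print(labels)
```

The point is that the capture and the replacement are two separate operations chained together, which is what the SUBSTITUTE(REGEXEXTRACT(...)) formula does as well.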