
Extract the file name of a URL page from links

I'm trying to extract the page names from an HttpClient response. I want to use a regex to extract all the links that are in the /bts format. (This part is working fine, and I'm not getting any undesired links.) For example, when the pattern is "bts/pagename.htm">Name of link", I want that pagename to be extracted. I have it working to extract the full example above, but I can't seem to extract just the page name without the rest of the pattern. The pattern I'm matching starts at bts/, but I don't want to keep the delimiters in my output. What I really want is the page names that start with bts/ and end in .htm. Maybe it's impossible; I'm not sure.

Do you want to extract the character sequence of the file name?

I'm not very good at regular expressions, but maybe you can try this:

(?<=/)\w+(?=\.)
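If the goal is only the names between bts/ and .htm, a tighter lookaround or a capture group may be easier to reason about. The sketch below is in Python only for illustration, since the question does not say which language the HttpClient code is in; the function names and the sample HTML are made up, and the character class [\w-] assumes page names contain only letters, digits, underscores, and hyphens.

```python
import re

# Lookbehind and lookahead assert "bts/" and ".htm" without including
# them in the match, so only the page name is returned.
LOOKAROUND = re.compile(r'(?<=bts/)[\w-]+(?=\.htm)')

# Equivalent idea with a capture group: match the whole thing,
# keep only what is inside the parentheses.
CAPTURE = re.compile(r'bts/([\w-]+)\.htm')


def page_names_lookaround(html: str) -> list[str]:
    """Return page names using lookbehind/lookahead."""
    return LOOKAROUND.findall(html)


def page_names_capture(html: str) -> list[str]:
    """Return page names using a single capture group."""
    return CAPTURE.findall(html)


# Hypothetical response body, shaped like the example in the question.
sample = (
    '<a href="bts/pagename.htm">Name of link</a> '
    '<a href="bts/other-page.htm">Other</a>'
)

print(page_names_lookaround(sample))
print(page_names_capture(sample))
```

The capture-group form tends to port more easily, because some regex engines restrict what is allowed inside a lookbehind, while nearly all of them support groups.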
