Extracting a string starting with x and ending with y
First of all, I searched for this and found how to use something like String.Split() to extract a string based on a condition. However, I wasn't able to find how to extract it based on an ending condition as well.
For example, I have a file with links to images:
http://i594.photobucket.com/albums/tt27/34/444.jpghttp://i594.photobucket.com/albums/as/asfd/ghjk6.jpg
You will notice that all the images start with http:// and end with .jpg. However, each .jpg is followed directly by the next http:// without a space, which makes this a little more difficult.
So basically I'm trying to find a way (Regex?) to extract every substring that starts with http:// and ends with .jpg from a longer string.
Regex is the easiest way to do this. If you're not familiar with regular expressions, you might check out Regex Buddy. It's a relatively cheap little tool that I found extremely useful when I was learning.
For your particular case, a possible expression is:
(http://.+?\.jpg)
It probably requires some more refinement, as there are boundary cases that could trip this up, but it would work if the file is a simple list.
You can also do free quick testing of expressions here.
Per your latest comment, if you have links to other non-images as well, then you need to make sure the match doesn't start at the http:// of one link and read all the way to the .jpg of the next image. Since URLs are not allowed to contain whitespace, you can do it like this:
(http://[^\s]+?\.jpg)
This basically says, "match a string starting with http:// and ending with .jpg where there is at least one character between the two and none of those characters are whitespace". The +? keeps the match lazy, so when links run together without spaces it stops at the first .jpg instead of the last one.
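As a rough illustration of the whitespace-excluding pattern, here is a minimal sketch. The class name ImageLinkExtractor, the method Extract, and the example.com link are made up for the example; it assumes non-image links are separated from the image links by whitespace.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageLinkExtractor
{
    // "http://", then as few non-whitespace characters as possible, then ".jpg".
    private static readonly Regex JpgLink = new Regex(@"http://\S+?\.jpg");

    // Hypothetical helper: returns every image link found in the text.
    public static List<string> Extract(string text)
    {
        var links = new List<string>();
        foreach (Match m in JpgLink.Matches(text))
        {
            links.Add(m.Value);
        }
        return links;
    }

    public static void Main()
    {
        // Two concatenated image links followed by a non-image link (illustrative input).
        string input = "http://i594.photobucket.com/albums/tt27/34/444.jpg"
                     + "http://i594.photobucket.com/albums/as/asfd/ghjk6.jpg"
                     + " http://example.com/page.html";

        foreach (string link in Extract(input))
        {
            Console.WriteLine(link);
        }
    }
}
```

With this input the two photobucket links come back as separate matches and the .html link is skipped, because no .jpg follows it before the next whitespace.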
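// Requires: using System.Text.RegularExpressions;
// 'subject' is the input string containing the concatenated links.
Regex regexObj = new Regex(@"http://.+?\.jpg");
Match matchResults = regexObj.Match(subject);
while (matchResults.Success) {
    // matchResults.Value holds the current link; do something with it
    matchResults = matchResults.NextMatch();
}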
In your specific case, you could always split it by ".jpg". You will probably end up with one empty element at the end of the array, and you will have to append the .jpg to the end of each entry if you need it. Apart from that, I think it would work.
I tested the following code and it worked fine:
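public void SplitTest()
{
    string test = "http://i594.photobucket.com/albums/tt27/34/444.jpghttp://i594.photobucket.com/albums/as/asfd/ghjk6.jpg";
    // Split on ".jpg"; RemoveEmptyEntries drops the empty trailing element.
    // Each resulting item has lost its ".jpg" suffix.
    string[] items = test.Split(new string[] { ".jpg" }, StringSplitOptions.RemoveEmptyEntries);
}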
It even gets rid of the empty entry...
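Since splitting on ".jpg" strips the extension from every entry, a small sketch of putting it back may help. The class name SplitExample and the method SplitLinks are hypothetical, and the sketch assumes the input contains nothing but .jpg links run together.

```csharp
using System;
using System.Linq;

public static class SplitExample
{
    // Hypothetical helper: split on ".jpg", then re-append the suffix to each piece.
    public static string[] SplitLinks(string text)
    {
        return text
            .Split(new[] { ".jpg" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part + ".jpg")
            .ToArray();
    }

    public static void Main()
    {
        string test = "http://i594.photobucket.com/albums/tt27/34/444.jpg"
                    + "http://i594.photobucket.com/albums/as/asfd/ghjk6.jpg";

        foreach (string link in SplitLinks(test))
        {
            Console.WriteLine(link);
        }
    }
}
```

If the file could also hold non-image links or trailing text, this approach would glue ".jpg" onto pieces that never had it, so a regex is probably the safer choice there.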
The following LINQ query splits on http: and keeps only the values that end with .jpg.
var images = from i in imageList.Split(new[] { "http:" },
                                       StringSplitOptions.RemoveEmptyEntries)
             where i.EndsWith(".jpg")
             select "http:" + i;
Regex would work really well for this. Here's an example in C# (and Java) for Regex