
Extracting a string starting with x and ending with y

First of all, I did a search on this and was able to find how to use something like String.Split() to extract a string based on a condition. However, I wasn't able to find how to extract it based on an ending condition as well. For example, I have a file with links to images: http://i594.photobucket.com/albums/tt27/34/444.jpghttp://i594.photobucket.com/albums/as/asfd/ghjk6.jpg You will notice that all the images start with http:// and end with .jpg. However, .jpg is followed by http:// without a space, making this a little more difficult.

So basically I'm trying to find a way (Regex?) to extract, from a larger string, each substring that starts with http:// and ends with .jpg.

Regex is the easiest way to do this. If you're not familiar with regular expressions, you might check out RegexBuddy. It's a relatively cheap little tool that I found extremely useful when I was learning. For your particular case, a possible expression is:

(http://.+?\.jpg)

It probably requires some more refinement, as there are boundary cases that could trip this up, but it would work if the file is a simple list.
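As a rough illustration of how the expression above could be applied in C#, here is a minimal sketch. The class name ImageLinkExtractor and the method Extract are made up for this example; only the pattern comes from the answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class ImageLinkExtractor
{
    // Lazy pattern from the answer: "http://" up to the nearest following ".jpg".
    private static readonly Regex JpgLink = new Regex(@"http://.+?\.jpg");

    // Returns every non-overlapping match, in order of appearance.
    public static List<string> Extract(string text)
    {
        var links = new List<string>();
        foreach (Match m in JpgLink.Matches(text))
        {
            links.Add(m.Value);
        }
        return links;
    }
}
```

Calling ImageLinkExtractor.Extract on the two concatenated links from the question should return them as two separate strings, because the lazy quantifier stops at the first .jpg it reaches.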


Free online regex testers are also available for quick testing of expressions.


Per your latest comment, if you have links to other non-images as well, then you need to make sure the match doesn't start at the http:// of one link and read all the way to the .jpg of the next image. Since URLs are not allowed to contain whitespace, you can do it like this:

(http://[^\s]+\.jpg)

This basically says, "match a string starting with http:// and ending with .jpg where there is at least one character between the two and none of those characters are whitespace".
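To see how the two expressions differ, a small comparison sketch may help. The class PatternComparison, the method MatchAll, and the example.com URLs mentioned below are assumptions made for illustration, not part of the original answer.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for comparing patterns side by side.
public static class PatternComparison
{
    // Returns the text of every match of `pattern` found in `text`.
    public static string[] MatchAll(string pattern, string text)
    {
        return Regex.Matches(text, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

For the made-up input "http://example.com/page.html http://example.com/pic.jpg", the lazy pattern http://.+?\.jpg yields a single match that runs from the first link through the end of the second, while http://[^\s]+\.jpg yields only the image link. Note that [^\s]+ is greedy, so on input where the links run together with no whitespace (as in the question's sample) it would return the whole run as one match; the whitespace-based pattern assumes the links are separated by whitespace.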

Regex RegexObj = new Regex("http://.+?\\.jpg");
Match MatchResults = RegexObj.Match(subject);
while (MatchResults.Success) {
    // Do something with MatchResults.Value
    MatchResults = MatchResults.NextMatch();
}

In your specific case, you could always split it by ".jpg". You will probably end up with one empty element at the end of the array, and you will have to append .jpg to the end of each item if you need it. Apart from that, I think it would work.

Tested the following code and it worked fine:

public void SplitTest()
{
    string test = "http://i594.photobucket.com/albums/tt27/34/444.jpghttp://i594.photobucket.com/albums/as/asfd/ghjk6.jpg";
    string[] items = test.Split(new string[] { ".jpg" }, StringSplitOptions.RemoveEmptyEntries);
}

It even gets rid of the empty entry...
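Since Split drops the ".jpg" separator, each piece needs it appended again to be a usable link. A minimal sketch of that step follows; the class JpgSplitter and the method SplitLinks are hypothetical names, and the sketch assumes the input contains nothing but links ending in .jpg.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class JpgSplitter
{
    // Splits on ".jpg", drops empty pieces, then restores the extension.
    // Assumes the input holds only links that end in ".jpg".
    public static string[] SplitLinks(string text)
    {
        return text
            .Split(new[] { ".jpg" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part + ".jpg")
            .ToArray();
    }
}
```

If the input has trailing text after the last .jpg (a newline, for instance), that leftover would also get ".jpg" appended, so the input may need trimming or filtering first.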

The following LINQ will split on http: and make sure to only keep values that end with .jpg.

var images = from i in imageList.Split(new[] { "http:" },
                                       StringSplitOptions.RemoveEmptyEntries)
             where i.EndsWith(".jpg")
             select "http:" + i;

Regex would work really well for this. Here's an example of using Regex in C# (and Java).
