How to get a substring from a string without splitting?

Question

String str = "internet address : http://test.com Click this!";

I want to get "http://test.com", so I wrote this:

String[] split = str.split(" ");
for ( int i = 0 ; i < split.length ; i++ ) {
    if ( split[i].contains("http://") ) {
        return split[i];
    }
}

But I think this is inefficient. Is there an easier way to get it?

Answer 1

Assuming you always have the same format (some text : URL more text), this can work:

public static void main(String[] args) {
    String str = "internet address : http://test.com Click this!";
    // Drop everything before the URL.
    String first = str.substring(str.indexOf("http://"));
    // Keep everything up to the next space.
    String second = first.substring(0, first.indexOf(" "));
    System.out.println(second);
}

But a regex, as suggested in another answer, is the better option.

Answer 2

Usually, this is done either with a regular expression or with indexOf and substring.

With a regular expression, it can be done like this:

    // This is using a VERY simplified regular expression
    String str = "internet address : http://test.com Click this!";
    Pattern pattern = Pattern.compile("https?://[\\w.]*");
    Matcher matcher = pattern.matcher(str);
    if (matcher.find()) {
        System.out.println(matcher.group(0));
    }

You can read why it's simplified here: https://mathiasbynens.be/demo/url-regex. In short, the problem with URLs is that they can take so many different valid forms.
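To make the limitation concrete, here is a minimal sketch that runs a pattern in the spirit of the simplified one above against a few inputs. The class and method names (`SimplifiedUrlRegexDemo`, `firstMatch`) and the extra sample strings are assumptions made for illustration only.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimplifiedUrlRegexDemo {
    // Scheme, "//", then only word characters and dots (a deliberately simplified pattern).
    private static final Pattern SIMPLE_URL = Pattern.compile("https?://[\\w.]*");

    // Returns the first match of the simplified pattern, or an empty Optional.
    static Optional<String> firstMatch(String text) {
        Matcher matcher = SIMPLE_URL.matcher(text);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // The sample from the question is matched in full: http://test.com
        System.out.println(firstMatch("internet address : http://test.com Click this!"));
        // The path and query are cut off, because '/' and '?' are not in [\w.]: http://test.com
        System.out.println(firstMatch("see http://test.com/docs?page=2 for details"));
        // The host is cut at the hyphen, because '-' is not in [\w.]: http://my
        System.out.println(firstMatch("see http://my-site.com for details"));
    }
}
```

Whether this truncation matters depends on the URLs you expect. For input like the question's sample, the simple pattern is enough.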

With split, you could use Java's URL class:

    String[] split = str.split(" ");

    for (String value : split) {
        try {
            // The constructor throws if value is not a valid URL.
            new URL(value);
            System.out.println(value);
        } catch (MalformedURLException e) {
            // no valid url
        }
    }

You can check how it validates URLs in the OpenJDK source.
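The split-and-validate idea can also be wrapped in a method that returns every token the URL class accepts. This is only a sketch: `UrlTokenFilter` and `findUrls` are hypothetical names, and it assumes URLs are separated from the surrounding text by whitespace. Note that `java.net.URL` accepts any protocol Java has a handler for, not just http.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

class UrlTokenFilter {
    // Splits on whitespace and keeps only the tokens that java.net.URL accepts.
    static List<String> findUrls(String text) {
        List<String> urls = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            try {
                new URL(token); // throws if the token has no known protocol
                urls.add(token);
            } catch (MalformedURLException e) {
                // not a URL, skip this token
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // prints [http://test.com]
        System.out.println(findUrls("internet address : http://test.com Click this!"));
    }
}
```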

Answer 3

My attempt with a regex:

String regex = "https?:\\/\\/(www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_\\+.~#?&//=]*)";
String str = "internet address : http://test.com Click this!";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
if (matcher.find()) {
    System.out.println(matcher.group(0));
}

Result:

http://test.com

Source: here

Answer 4

Find http:// in the string, then look backwards and forwards for a space:

int pos = str.indexOf("http://");
if (pos >= 0) {
  // Look backwards for a space; the URL starts right after it
  // (lastIndexOf returns -1 if there is none, which gives start = 0).
  int start = str.lastIndexOf(' ', pos) + 1;

  // Look forwards for space.
  int end = str.indexOf(' ', pos + "http://".length());
  if (end < 0) end = str.length();

  return str.substring(start, end);
}
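Because the fragment uses return, it has to live inside a method. A minimal self-contained sketch of the same backwards-and-forwards search follows. The names `UrlBySpaces` and `extractUrl` are hypothetical, and returning null when no URL is found is an assumption rather than something the answer specifies.

```java
class UrlBySpaces {
    // Returns the space-delimited token that contains "http://", or null if there is none.
    static String extractUrl(String str) {
        int pos = str.indexOf("http://");
        if (pos < 0) {
            return null;
        }
        // First character after the previous space; lastIndexOf returns -1
        // when there is no earlier space, which gives start = 0.
        int start = str.lastIndexOf(' ', pos) + 1;
        // Next space after the scheme, or the end of the string if there is none.
        int end = str.indexOf(' ', pos + "http://".length());
        if (end < 0) {
            end = str.length();
        }
        return str.substring(start, end);
    }

    public static void main(String[] args) {
        // prints http://test.com
        System.out.println(extractUrl("internet address : http://test.com Click this!"));
    }
}
```

This version handles a URL at the very start or very end of the string without a separate special case.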

Answer 5

It is not clear whether the structure of the input string is constant; however, I would do something like this:

    String str = "internet address : http://test.com Click this!";
    // index of the first character of the URL (-1 if there is none)
    int urlStart = str.indexOf("http://");
    if (urlStart >= 0) {
        // index of the first space after the start of the URL
        int urlEnd = str.indexOf(' ', urlStart);
        // no space found: the URL runs to the end of the string
        if (urlEnd < 0) urlEnd = str.length();
        // extract the URL
        String urlString = str.substring(urlStart, urlEnd);
        System.out.println(urlString);
    }

Answer 6

I just put together a quick solution for this. It should work for you.

package Main.Kunal;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class URLOutOfString {

    public static void main(String[] args) {
        String str = "internet address : http://test.com Click this!, internet address : http://tes1t.com Click this!";
        List<String> result= new ArrayList<>();
        int counter = 0;
        final Pattern urlPattern = Pattern.compile(
                "(?:^|[\\W])((ht|f)tp(s?):\\/\\/|www\\.)"
                        + "(([\\w\\-]+\\.){1,}?([\\w\\-.~]+\\/?)*"
                        + "[\\p{Alnum}.,%_=?&#\\-+()\\[\\]\\*$~@!:/{};']*)",
                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);

        Matcher matcher = urlPattern.matcher(str);

        while (matcher.find()) {
            result.add(str.substring(matcher.start(1), matcher.end()));
            counter++;
        }

        System.out.println(result);

    }

}

This will find all URLs in your string and add them to an ArrayList. You can then use the list however your business logic requires.

Answer 7

You could use a regex for it:

String str = "internet address : http://test.com Click this!";
Pattern pattern = Pattern.compile("((http|https)\\S*)");
Matcher matcher = pattern.matcher(str);
if (matcher.find())
{
    System.out.println(matcher.group(1));
}
