
How to determine the file extension of a file from a URI

Assuming I am given a URI, and I want to find the file extension of the file that is returned, what do I have to do in Java?

For example, the file at http://www.daml.org/2001/08/baseball/baseball-ont is http://www.daml.org/2001/08/baseball/baseball-ont.owl

When I do

    URI uri = new URI(address); 
    URL url = uri.toURL();
    String file = url.getFile();
    System.out.println(file);

I am not able to see the full file name with the .owl extension, just /2001/08/baseball/baseball-ont. How do I get the file extension as well?

First, I want to make sure you know that it's impossible to find out what kind of file a URI links to, since a link ending with .jpg might let you access a .exe file (this is especially true for URLs, due to symbolic links and .htaccess files). Fetching the extension from the URI is therefore not a rock-solid solution if you want to limit the allowed file types, if that is what you're going for. So, I assume you just want to know what extension a file has based on its URI, even though this isn't completely trustworthy.

You can get the extension from any URI, URL or file path using the method below. You don't have to use any libraries or extensions, since this is basic Java functionality. This solution gets the position of the last . (period) in the URI string and creates a substring starting at the position of the period and ending at the end of the URI string.

String uri = "http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/integrating_apps/images/google_logo.png";
String extension = uri.substring(uri.lastIndexOf("."));

The code sample above will put the .png extension from the URI in the extension variable. Note that the . (period) is included in the extension; if you want the file extension without the leading period, increase the substring index by one, like this:

String extension = uri.substring(uri.lastIndexOf(".") + 1);

One advantage of this method over regular expressions (an approach other people use a lot) is that it is much cheaper to execute while giving the same result.

Additionally, you might want to make sure the URL contains a period character. Use the following code to achieve this:

String uri = "http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/integrating_apps/images/google_logo.png";
if(uri.contains(".")) {
    String extension = uri.substring(uri.lastIndexOf("."));
}
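
One caveat with a plain lastIndexOf on the whole string: every http URL has periods in its host name, so the contains check passes even when the file name has no extension. For the URL in the question it would yield ".org/2001/08/baseball/baseball-ont". Below is a minimal sketch that looks only at the last path segment, still using nothing but String methods. The class and method names are made up for illustration, and it assumes a scheme is written with "://" when present.

```java
final class UriExtensionSketch {

    /** Returns the extension including the leading period, or "" if there is none. */
    static String extensionOf(String uri) {
        // Cut off the fragment and the query string first; they may contain periods too.
        int end = uri.length();
        int hash = uri.indexOf('#');
        if (hash >= 0) {
            end = hash;
        }
        int question = uri.indexOf('?');
        if (question >= 0 && question < end) {
            end = question;
        }
        String withoutQuery = uri.substring(0, end);

        // Skip "scheme://host" so that periods in the host name are ignored.
        String path;
        int schemeEnd = withoutQuery.indexOf("://");
        if (schemeEnd >= 0) {
            int pathStart = withoutQuery.indexOf('/', schemeEnd + 3);
            path = pathStart >= 0 ? withoutQuery.substring(pathStart) : "";
        } else {
            path = withoutQuery; // assumed to be a plain file path
        }

        // Only the text after the last slash can carry the extension.
        String lastSegment = path.substring(path.lastIndexOf('/') + 1);
        int dot = lastSegment.lastIndexOf('.');
        return dot >= 0 ? lastSegment.substring(dot) : "";
    }

    public static void main(String[] args) {
        // No extension in the last segment, so this prints an empty line.
        System.out.println(extensionOf("http://www.daml.org/2001/08/baseball/baseball-ont"));
        // Prints ".zip"; the query string is ignored.
        System.out.println(extensionOf("http://www.example.com/a/b/c/stuff.zip?u=1"));
    }
}
```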

You might want to improve the functionality even further to create a more robust system. Two examples might be:

  • Validate the URI by checking that it exists, or by making sure the syntax of the URI is valid, possibly using a regular expression.
  • Trim the extension to remove unwanted white space.

I won't cover the solutions for these two features here, because that isn't what was asked in the first place.

Hope this helps!

There are two answers to this.

If a URI does not have a "file extension", then there is no way that you can infer one by looking at it textually, or by converting it to a File. In general, neither the URI nor the File needs to have an extension at all. Extensions are just a file naming convention.

What you are really after is the media type / MIME type / content type of the file. You may be able to determine the media type by doing something like this:

URLConnection conn = url.openConnection();
String type = conn.getContentType();

However, the getContentType() method will return null if the server did not set a content type in the response. (Or it could give you the wrong content type, or a non-specific content type.) At that point, you would need to resort to content type "guessing", and I don't know if that would give you a specific enough type in this case.

But if you "know" that the file should be OWL, why don't you just give it a ".owl" extension anyway?
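
To make the content-type route a bit more concrete, here is a rough sketch that asks the server with a HEAD request and falls back to URLConnection.guessContentTypeFromName when no header comes back. The class and method names are invented for illustration, and whether a given server reports a useful type for an OWL file is something you would have to check yourself.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.util.Locale;

final class ContentTypeSketch {

    /** Strips parameters such as "; charset=UTF-8" and returns the bare media type, or null. */
    static String mediaTypeOf(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return null;
        }
        int semicolon = contentTypeHeader.indexOf(';');
        String bare = semicolon >= 0
                ? contentTypeHeader.substring(0, semicolon)
                : contentTypeHeader;
        bare = bare.trim().toLowerCase(Locale.ROOT);
        return bare.isEmpty() ? null : bare;
    }

    /** Asks the server for the content type and falls back to guessing from the name. */
    static String fetchMediaType(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        if (conn instanceof HttpURLConnection) {
            // HEAD avoids downloading the body just to read one header.
            ((HttpURLConnection) conn).setRequestMethod("HEAD");
        }
        String type = mediaTypeOf(conn.getContentType());
        if (type == null) {
            // May still be null: the built-in table only knows common extensions.
            type = URLConnection.guessContentTypeFromName(url.getPath());
        }
        return type;
    }
}
```

Calling fetchMediaType needs network access, so treat the result as a hint from the server, not as proof of what the file contains.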

This link might help those who are still having problems: How I can get the mime type of a file having its Uri?

 public static String getMimeType(Context context, Uri uri) {
    String extension;

    //Check uri format to avoid null
    if (uri.getScheme().equals(ContentResolver.SCHEME_CONTENT)) {
        //If scheme is a content
        final MimeTypeMap mime = MimeTypeMap.getSingleton();
        extension = mime.getExtensionFromMimeType(context.getContentResolver().getType(uri));
    } else {
        //If scheme is a File
        //This will replace white spaces with %20 and also other special characters. This will avoid returning null values on file name with spaces and special characters.
        extension = MimeTypeMap.getFileExtensionFromUrl(Uri.fromFile(new File(uri.getPath())).toString());

    }

    return extension;
}

URLConnection.guessContentTypeFromName(url) would deliver the MIME type, as in the first answer. Maybe you simply wanted:

String extension = url.getPath().replaceFirst("^.*/[^/]*?(\\.[^\\./]*|)$", "$1");

The regular expression consumes everything up to the last slash, then up to a period, and returns either an extension like ".owl" or "". (If I'm not mistaken.)
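
The same idea can also be written with a compiled Pattern that is anchored at the end of the path, which may be easier to read than replaceFirst. This is only a sketch; the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexExtensionSketch {

    // A period followed by characters that are neither a period nor a slash,
    // reaching to the end of the path.
    private static final Pattern EXTENSION = Pattern.compile("\\.[^./]*$");

    /** Returns the extension with its leading period, or "" if the path has none. */
    static String extensionOf(String path) {
        Matcher m = EXTENSION.matcher(path);
        return m.find() ? m.group() : "";
    }
}
```

Because a slash is excluded from the match, a period in a directory name (as in "/a.b/stuff") is not mistaken for an extension.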

As other answers have explained, you don't really know the content type without inspecting the file. However, you can predict the file type from a URL.

Java almost provides this functionality as part of the URL class. The method URL::getFile will intelligently grab the file portion of a URL:

final URL url = new URL("http://www.example.com/a/b/c/stuff.zip?u=1");
final String file = url.getFile(); // file = "/a/b/c/stuff.zip?u=1"

We can use this to write our implementation:

public static Optional<String> getFileExtension(final URL url) {

    Objects.requireNonNull(url, "url is null");

    final String file = url.getFile();

    if (file.contains(".")) {

        final String sub = file.substring(file.lastIndexOf('.') + 1);

        if (sub.length() == 0) {
            return Optional.empty();
        }

        if (sub.contains("?")) {
            return Optional.of(sub.substring(0, sub.indexOf('?')));
        }

        return Optional.of(sub);
    }

    return Optional.empty();
}

This implementation should handle edge cases properly:

assertEquals(
    Optional.of("zip"), 
    getFileExtension(new URL("http://www.example.com/stuff.zip")));

assertEquals(
    Optional.of("zip"), 
    getFileExtension(new URL("http://www.example.com/a/b/c/stuff.zip")));

assertEquals(
    Optional.empty(), 
    getFileExtension(new URL("http://www.example.com")));

assertEquals(
    Optional.empty(), 
    getFileExtension(new URL("http://www.example.com/")));

assertEquals(
    Optional.empty(), 
    getFileExtension(new URL("http://www.example.com/.")));

The accepted answer does not work for a URL that contains '?' or '/' after the extension. To drop that extra string, you can use the getLastPathSegment() method of Android's Uri class. It gives you only the name from the URI, and then you can get the extension as follows:

String name = uri.getLastPathSegment();
//Here uri is your uri from which you want to get extension
String extension = name.substring(name.lastIndexOf("."));

The code above returns the extension with the . (dot). If you want to remove the dot, you can write it as follows:

String extension = name.substring(name.lastIndexOf(".") + 1);

Another useful approach that is not mentioned in the accepted answer: if you have a remote URL, you can get the MIME type from a URLConnection, like this:

  URLConnection urlConnection = new URL("http://www.google.com").openConnection();
  String mimeType = urlConnection.getContentType(); 

Now, to get the file extension from the MIME type, I'll refer to this post.
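
As far as I know, the Java standard library has no built-in lookup from MIME type to extension (on Android, MimeTypeMap.getExtensionFromMimeType does this). Below is a minimal sketch with a small hand-written table. The entries and all names here are illustrative assumptions, not a complete registry.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class MimeToExtensionSketch {

    // Hand-picked examples only; extend the table for the types you expect.
    // Map.of requires Java 9 or later.
    private static final Map<String, String> EXTENSIONS = Map.of(
            "image/png", "png",
            "image/jpeg", "jpg",
            "text/html", "html",
            "application/pdf", "pdf",
            "application/zip", "zip");

    /** Looks up an extension (without the period) for a Content-Type header value. */
    static Optional<String> extensionFor(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // Drop parameters such as "; charset=UTF-8" before the lookup.
        int semicolon = contentType.indexOf(';');
        String bare = (semicolon >= 0 ? contentType.substring(0, semicolon) : contentType)
                .trim()
                .toLowerCase(Locale.ROOT);
        return Optional.ofNullable(EXTENSIONS.get(bare));
    }
}
```

With this table, extensionFor(urlConnection.getContentType()) yields an empty Optional for any type that is not listed, which the caller has to handle.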

I am doing it this way.

You can check for any file extension with a bit more validation:

String stringUri = uri.toString();
String fileFormat = "png";

if (stringUri.contains(".") && fileFormat.equalsIgnoreCase(stringUri.substring(stringUri.lastIndexOf(".") + 1))) {
    // do anything
} else {
    // invalid file
}
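
A slightly stricter variant of this check is sketched below, under the assumption that you parse the string with java.net.URI and compare only the last path segment against a set of allowed extensions. The class name, the method name and the example set are invented for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

final class AllowedExtensionSketch {

    /**
     * Returns true if the last path segment ends in one of the allowed
     * extensions. The set is expected to hold lower-case names without a period.
     */
    static boolean hasAllowedExtension(String uriString, Set<String> allowed) {
        String path;
        try {
            path = new URI(uriString).getPath();
        } catch (URISyntaxException e) {
            return false; // not a syntactically valid URI
        }
        if (path == null) {
            return false; // opaque URI such as "mailto:someone@example.com"
        }
        // The query string and fragment are not part of getPath().
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return false; // no extension, or a trailing period
        }
        return allowed.contains(name.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

For example, hasAllowedExtension(uri.toString(), Set.of("png", "jpg")) could replace the condition above (Set.of needs Java 9 or later). As with every extension check, it says nothing about what the file actually contains.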
