How to determine appropriate file extension from MIME Type in Java

I am uploading files to an Amazon S3 bucket and have access to the InputStream and a String containing the MIME type of the file, but not the original file name. It's up to me to create the file name and extension before pushing the file up to S3. Is there a library or convenient way to determine the appropriate extension to use from the MIME type?

I've seen some references to the Apache Tika library, but that seems like overkill, and I haven't been able to get it to detect file extensions yet. From what I've been able to gather, this code should work, but I just get an empty string when my type variable is "image/jpeg":

    MimeType mimeType = null;
    try {
        mimeType = new MimeTypes().forName(type);
    } catch (MimeTypeException e) {
        Logger.error("Couldn't Detect Mime Type for type: " + type, e);
    }

    if (mimeType != null) {
        String extension = mimeType.getExtension();
        //do something with the extension
    }

As some of the commenters have pointed out, there is no universal 1:1 mapping between mimetypes and file extensions. Some mimetypes have more than one possible extension, many extensions are shared by multiple mimetypes, and some mimetypes have no extension.
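To make the mapping problem concrete, here is a minimal sketch of a hand-rolled lookup using only the JDK. The class name `MimeExtensions`, the method `preferredExtension`, and the handful of table entries are all hypothetical and chosen for illustration; this is not a complete registry and not how any particular library does it. Each type maps to an ordered list so that one extension can be treated as "preferred", and a type may legitimately map to nothing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Illustrative lookup table; the entries are examples, not a complete registry. */
final class MimeExtensions {
    // Each MIME type maps to an ordered list; the first entry is treated as preferred.
    private static final Map<String, List<String>> EXTENSIONS = new HashMap<>();
    static {
        // One type, several possible extensions.
        EXTENSIONS.put("image/jpeg", Arrays.asList(".jpg", ".jpeg", ".jpe"));
        EXTENSIONS.put("text/plain", Arrays.asList(".txt"));
        // One extension shared by two types.
        EXTENSIONS.put("application/xml", Arrays.asList(".xml"));
        EXTENSIONS.put("text/xml", Arrays.asList(".xml"));
        // A type with no file extension at all.
        EXTENSIONS.put("multipart/form-data", Collections.<String>emptyList());
    }

    private MimeExtensions() {}

    /** Returns the preferred extension (with leading dot), or empty if none is known. */
    static Optional<String> preferredExtension(String mimeType) {
        if (mimeType == null) {
            return Optional.<String>empty();
        }
        // Drop parameters such as "; charset=utf-8" and normalise case before the lookup.
        int semicolon = mimeType.indexOf(';');
        String base = (semicolon >= 0 ? mimeType.substring(0, semicolon) : mimeType)
                .trim().toLowerCase(Locale.ROOT);
        List<String> known = EXTENSIONS.getOrDefault(base, Collections.<String>emptyList());
        return known.isEmpty() ? Optional.<String>empty() : Optional.of(known.get(0));
    }
}
```

Returning an `Optional` rather than an empty string forces the caller to decide what to do when no extension applies, for example writing the file with no extension at all.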

Wherever possible, you're much better off storing the mimetype, using that going forward, and forgetting about the extension.
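As a rough sketch of that approach, the upload code could generate an extension-free key and keep the MIME type next to it. The `StoredObject` class, its fields, and the `create` helper below are hypothetical names invented for this example, and the key layout (prefix plus a random UUID) is just one possible choice.

```java
import java.util.UUID;

/** Hypothetical value object: the storage key plus the MIME type kept alongside it. */
final class StoredObject {
    final String key;
    final String contentType;

    StoredObject(String key, String contentType) {
        this.key = key;
        this.contentType = contentType;
    }

    /** Builds an extension-free key; the MIME type travels as metadata instead. */
    static StoredObject create(String prefix, String contentType) {
        // A random UUID avoids needing the original file name at all.
        String key = prefix + "/" + UUID.randomUUID();
        return new StoredObject(key, contentType);
    }
}
```

For example, `StoredObject.create("uploads", "image/jpeg")` yields a key such as `uploads/<uuid>` with the type held separately. With S3 specifically, the type can be stored as the object's Content-Type metadata at upload time, so consumers can read it back without relying on the key's suffix.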

That said, if you do want to get the most common file extension for a given mimetype, then Tika is a good way to go. Apache Tika knows about a very large set of mimetypes, and for many of these it also knows the mime magic for detection, common extensions, descriptions, etc.

If you want to get the most common extension for a JPEG file, then as shown in this Apache Tika unit test, you just need to do something like:

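  // Load the registry built from the mimetype definitions bundled with Tika
  MimeTypes allTypes = MimeTypes.getDefaultMimeTypes();
  // forName throws MimeTypeException if the type name is not valid
  MimeType jpeg = allTypes.forName("image/jpeg");
  String jpegExt = jpeg.getExtension(); // .jpg
  assertEquals(".jpg", jpegExt);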

The key thing is that you need to load the XML file that's bundled in the Tika jar to get the definitions of all the mimetypes. A registry created with new MimeTypes(), as in the question, has not loaded those definitions, which is why getExtension() returns an empty string there. If you might be dealing with custom mimetypes too, Tika supports those; replace the first line with:

  TikaConfig config = TikaConfig.getDefaultConfig();
  MimeTypes allTypes = config.getMimeRepository();

By using the TikaConfig method to get the MimeTypes, Tika will also check your classpath for custom mimetype definitions and include those too.
