简体   繁体   English

如何从内容类型获取文件扩展名?

[英]How to get file extension from content type?

I'm using Apache Tika, and I have files (without extension) of particular content type that need to be renamed to have extension that reflect the content type.我正在使用 Apache Tika,并且我有特定内容类型的文件(没有扩展名),需要重命名以具有反映内容类型的扩展名。

Any idea if there is something I could use instead of programming that from scratch based on content type names ?知道有什么我可以使用而不是根据内容类型名称从头开始编程的吗?

The two key classes for you are MediaTypeRegistry and MimeTypes .您的两个关键类是MediaTypeRegistryMimeTypes Using these, you can do mime type magic based detection, and get information on the mime types and their relationships.使用这些,您可以进行基于 mime 类型魔术的检测,并获取有关 mime 类型及其关系的信息。

(That said, if you want to do a full detection, potentially involving some parsing of the File using extra logic in the Tika Parsers jar for container-based formats, you should be using TikaConfig.getDetector() and/or DefaultDetector .) (也就是说,如果您想进行完整检测,可能涉及使用 Tika Parsers jar 中基于容器格式的额外逻辑对文件进行一些解析,您应该使用TikaConfig.getDetector()和/或DefaultDetector 。)

// Load your Tika config, find all the Tika classes etc
TikaConfig config = TikaConfig.getDefaultConfig();

// Do the detection. Use DefaultDetector / getDetector() for more advanced detection
Metadata metadata = new Metadata();    
InputStream stream = TikaInputStream.get(new File(file), metadata);
MediaType mediaType = config.getMimeRepository().detect(stream);

// Fest the most common extension for the detected type
MimeType mimeType = config.getMimeRepository().forName(mediaType.toString());
String extension = mimeType.getExtension();

The problem with this question is that several extensions can produce the same MimeType, vice versa is simple and you can find it in many utils classes for example: MimetypesFileTypeMap , but I found this useful class MimeTypes and I did a little trick adding an extensions Mapping, remember this return the default extension by MimeType but can be inaccurate.这个问题的问题是几个扩展可以产生相同的 MimeType,反之亦然很简单,你可以在许多 utils 类中找到它,例如: MimetypesFileTypeMap ,但我发现这个有用的类MimeTypes并且我做了一个添加扩展映射的小技巧,请记住这会通过 MimeType 返回默认扩展名,但可能不准确。

// Copyright (c) 2003-2009, Jodd Team (jodd.org). All Rights Reserved.

 package co.tmunited.fs.service.util;

 import java.util.HashMap;

 /**
  * Map file extensions to MIME types. Based on the Apache mime.types file.
  * http://www.iana.org/assignments/media-types/
 */
public class MimeTypes {

public static final String MIME_APPLICATION_ANDREW_INSET = "application/andrew-inset";
public static final String MIME_APPLICATION_JSON = "application/json";
public static final String MIME_APPLICATION_ZIP = "application/zip";
public static final String MIME_APPLICATION_X_GZIP = "application/x-gzip";
public static final String MIME_APPLICATION_TGZ = "application/tgz";
public static final String MIME_APPLICATION_MSWORD = "application/msword";
public static final String MIME_APPLICATION_MSWORD_2007 = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
public static final String MIME_APPLICATION_VND_TEXT = "application/vnd.oasis.opendocument.text";
public static final String MIME_APPLICATION_POSTSCRIPT = "application/postscript";
public static final String MIME_APPLICATION_PDF = "application/pdf";
public static final String MIME_APPLICATION_JNLP = "application/jnlp";
public static final String MIME_APPLICATION_MAC_BINHEX40 = "application/mac-binhex40";
public static final String MIME_APPLICATION_MAC_COMPACTPRO = "application/mac-compactpro";
public static final String MIME_APPLICATION_MATHML_XML = "application/mathml+xml";
public static final String MIME_APPLICATION_OCTET_STREAM = "application/octet-stream";
public static final String MIME_APPLICATION_ODA = "application/oda";
public static final String MIME_APPLICATION_RDF_XML = "application/rdf+xml";
public static final String MIME_APPLICATION_JAVA_ARCHIVE = "application/java-archive";
public static final String MIME_APPLICATION_RDF_SMIL = "application/smil";
public static final String MIME_APPLICATION_SRGS = "application/srgs";
public static final String MIME_APPLICATION_SRGS_XML = "application/srgs+xml";
public static final String MIME_APPLICATION_VND_MIF = "application/vnd.mif";
public static final String MIME_APPLICATION_VND_MSEXCEL = "application/vnd.ms-excel";
public static final String MIME_APPLICATION_VND_MSEXCEL_2007 = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
public static final String MIME_APPLICATION_VND_SPREADSHEET = "application/vnd.oasis.opendocument.spreadsheet";
public static final String MIME_APPLICATION_VND_MSPOWERPOINT = "application/vnd.ms-powerpoint";
public static final String MIME_APPLICATION_VND_RNREALMEDIA = "application/vnd.rn-realmedia";
public static final String MIME_APPLICATION_X_BCPIO = "application/x-bcpio";
public static final String MIME_APPLICATION_X_CDLINK = "application/x-cdlink";
public static final String MIME_APPLICATION_X_CHESS_PGN = "application/x-chess-pgn";
public static final String MIME_APPLICATION_X_CPIO = "application/x-cpio";
public static final String MIME_APPLICATION_X_CSH = "application/x-csh";
public static final String MIME_APPLICATION_X_DIRECTOR = "application/x-director";
public static final String MIME_APPLICATION_X_DVI = "application/x-dvi";
public static final String MIME_APPLICATION_X_FUTURESPLASH = "application/x-futuresplash";
public static final String MIME_APPLICATION_X_GTAR = "application/x-gtar";
public static final String MIME_APPLICATION_X_HDF = "application/x-hdf";
public static final String MIME_APPLICATION_X_JAVASCRIPT = "application/x-javascript";
public static final String MIME_APPLICATION_X_KOAN = "application/x-koan";
public static final String MIME_APPLICATION_X_LATEX = "application/x-latex";
public static final String MIME_APPLICATION_X_NETCDF = "application/x-netcdf";
public static final String MIME_APPLICATION_X_OGG = "application/x-ogg";
public static final String MIME_APPLICATION_X_SH = "application/x-sh";
public static final String MIME_APPLICATION_X_SHAR = "application/x-shar";
public static final String MIME_APPLICATION_X_SHOCKWAVE_FLASH = "application/x-shockwave-flash";
public static final String MIME_APPLICATION_X_STUFFIT = "application/x-stuffit";
public static final String MIME_APPLICATION_X_SV4CPIO = "application/x-sv4cpio";
public static final String MIME_APPLICATION_X_SV4CRC = "application/x-sv4crc";
public static final String MIME_APPLICATION_X_TAR = "application/x-tar";
public static final String MIME_APPLICATION_X_RAR_COMPRESSED = "application/x-rar-compressed";
public static final String MIME_APPLICATION_X_TCL = "application/x-tcl";
public static final String MIME_APPLICATION_X_TEX = "application/x-tex";
public static final String MIME_APPLICATION_X_TEXINFO = "application/x-texinfo";
public static final String MIME_APPLICATION_X_TROFF = "application/x-troff";
public static final String MIME_APPLICATION_X_TROFF_MAN = "application/x-troff-man";
public static final String MIME_APPLICATION_X_TROFF_ME = "application/x-troff-me";
public static final String MIME_APPLICATION_X_TROFF_MS = "application/x-troff-ms";
public static final String MIME_APPLICATION_X_USTAR = "application/x-ustar";
public static final String MIME_APPLICATION_X_WAIS_SOURCE = "application/x-wais-source";
public static final String MIME_APPLICATION_VND_MOZZILLA_XUL_XML = "application/vnd.mozilla.xul+xml";
public static final String MIME_APPLICATION_XHTML_XML = "application/xhtml+xml";
public static final String MIME_APPLICATION_XSLT_XML = "application/xslt+xml";
public static final String MIME_APPLICATION_XML = "application/xml";
public static final String MIME_APPLICATION_XML_DTD = "application/xml-dtd";
public static final String MIME_IMAGE_BMP = "image/bmp";
public static final String MIME_IMAGE_CGM = "image/cgm";
public static final String MIME_IMAGE_GIF = "image/gif";
public static final String MIME_IMAGE_IEF = "image/ief";
public static final String MIME_IMAGE_JPEG = "image/jpeg";
public static final String MIME_IMAGE_TIFF = "image/tiff";
public static final String MIME_IMAGE_PNG = "image/png";
public static final String MIME_IMAGE_SVG_XML = "image/svg+xml";
public static final String MIME_IMAGE_VND_DJVU = "image/vnd.djvu";
public static final String MIME_IMAGE_WAP_WBMP = "image/vnd.wap.wbmp";
public static final String MIME_IMAGE_X_CMU_RASTER = "image/x-cmu-raster";
public static final String MIME_IMAGE_X_ICON = "image/x-icon";
public static final String MIME_IMAGE_X_PORTABLE_ANYMAP = "image/x-portable-anymap";
public static final String MIME_IMAGE_X_PORTABLE_BITMAP = "image/x-portable-bitmap";
public static final String MIME_IMAGE_X_PORTABLE_GRAYMAP = "image/x-portable-graymap";
public static final String MIME_IMAGE_X_PORTABLE_PIXMAP = "image/x-portable-pixmap";
public static final String MIME_IMAGE_X_RGB = "image/x-rgb";
public static final String MIME_AUDIO_BASIC = "audio/basic";
public static final String MIME_AUDIO_MIDI = "audio/midi";
public static final String MIME_AUDIO_MPEG = "audio/mpeg";
public static final String MIME_AUDIO_X_AIFF = "audio/x-aiff";
public static final String MIME_AUDIO_X_MPEGURL = "audio/x-mpegurl";
public static final String MIME_AUDIO_X_PN_REALAUDIO = "audio/x-pn-realaudio";
public static final String MIME_AUDIO_X_WAV = "audio/x-wav";
public static final String MIME_CHEMICAL_X_PDB = "chemical/x-pdb";
public static final String MIME_CHEMICAL_X_XYZ = "chemical/x-xyz";
public static final String MIME_MODEL_IGES = "model/iges";
public static final String MIME_MODEL_MESH = "model/mesh";
public static final String MIME_MODEL_VRLM = "model/vrml";
public static final String MIME_TEXT_PLAIN = "text/plain";
public static final String MIME_TEXT_RICHTEXT = "text/richtext";
public static final String MIME_TEXT_RTF = "text/rtf";
public static final String MIME_TEXT_HTML = "text/html";
public static final String MIME_TEXT_CALENDAR = "text/calendar";
public static final String MIME_TEXT_CSS = "text/css";
public static final String MIME_TEXT_SGML = "text/sgml";
public static final String MIME_TEXT_TAB_SEPARATED_VALUES = "text/tab-separated-values";
public static final String MIME_TEXT_VND_WAP_XML = "text/vnd.wap.wml";
public static final String MIME_TEXT_VND_WAP_WMLSCRIPT = "text/vnd.wap.wmlscript";
public static final String MIME_TEXT_X_SETEXT = "text/x-setext";
public static final String MIME_TEXT_X_COMPONENT = "text/x-component";
public static final String MIME_VIDEO_QUICKTIME = "video/quicktime";
public static final String MIME_VIDEO_MPEG = "video/mpeg";
public static final String MIME_VIDEO_VND_MPEGURL = "video/vnd.mpegurl";
public static final String MIME_VIDEO_X_MSVIDEO = "video/x-msvideo";
public static final String MIME_VIDEO_X_MS_WMV = "video/x-ms-wmv";
public static final String MIME_VIDEO_X_SGI_MOVIE = "video/x-sgi-movie";
public static final String MIME_X_CONFERENCE_X_COOLTALK = "x-conference/x-cooltalk";

private static HashMap<String, String> mimeTypeMapping;
private static HashMap<String, String> extMapping;

static {
    mimeTypeMapping = new HashMap<String, String>(200) {
        private void put1(String key, String value) {
            if (put(key, value) != null) {
                throw new IllegalArgumentException("Duplicated extension: " + key);
            }
        }

        {
            put1("xul", MIME_APPLICATION_VND_MOZZILLA_XUL_XML);
            put1("json", MIME_APPLICATION_JSON);
            put1("ice", MIME_X_CONFERENCE_X_COOLTALK);
            put1("movie", MIME_VIDEO_X_SGI_MOVIE);
            put1("avi", MIME_VIDEO_X_MSVIDEO);
            put1("wmv", MIME_VIDEO_X_MS_WMV);
            put1("m4u", MIME_VIDEO_VND_MPEGURL);
            put1("mxu", MIME_VIDEO_VND_MPEGURL);
            put1("htc", MIME_TEXT_X_COMPONENT);
            put1("etx", MIME_TEXT_X_SETEXT);
            put1("wmls", MIME_TEXT_VND_WAP_WMLSCRIPT);
            put1("wml", MIME_TEXT_VND_WAP_XML);
            put1("tsv", MIME_TEXT_TAB_SEPARATED_VALUES);
            put1("sgm", MIME_TEXT_SGML);
            put1("sgml", MIME_TEXT_SGML);
            put1("css", MIME_TEXT_CSS);
            put1("ifb", MIME_TEXT_CALENDAR);
            put1("ics", MIME_TEXT_CALENDAR);
            put1("wrl", MIME_MODEL_VRLM);
            put1("vrlm", MIME_MODEL_VRLM);
            put1("silo", MIME_MODEL_MESH);
            put1("mesh", MIME_MODEL_MESH);
            put1("msh", MIME_MODEL_MESH);
            put1("iges", MIME_MODEL_IGES);
            put1("igs", MIME_MODEL_IGES);
            put1("rgb", MIME_IMAGE_X_RGB);
            put1("ppm", MIME_IMAGE_X_PORTABLE_PIXMAP);
            put1("pgm", MIME_IMAGE_X_PORTABLE_GRAYMAP);
            put1("pbm", MIME_IMAGE_X_PORTABLE_BITMAP);
            put1("pnm", MIME_IMAGE_X_PORTABLE_ANYMAP);
            put1("ico", MIME_IMAGE_X_ICON);
            put1("ras", MIME_IMAGE_X_CMU_RASTER);
            put1("wbmp", MIME_IMAGE_WAP_WBMP);
            put1("djv", MIME_IMAGE_VND_DJVU);
            put1("djvu", MIME_IMAGE_VND_DJVU);
            put1("svg", MIME_IMAGE_SVG_XML);
            put1("ief", MIME_IMAGE_IEF);
            put1("cgm", MIME_IMAGE_CGM);
            put1("bmp", MIME_IMAGE_BMP);
            put1("xyz", MIME_CHEMICAL_X_XYZ);
            put1("pdb", MIME_CHEMICAL_X_PDB);
            put1("ra", MIME_AUDIO_X_PN_REALAUDIO);
            put1("ram", MIME_AUDIO_X_PN_REALAUDIO);
            put1("m3u", MIME_AUDIO_X_MPEGURL);
            put1("aifc", MIME_AUDIO_X_AIFF);
            put1("aif", MIME_AUDIO_X_AIFF);
            put1("aiff", MIME_AUDIO_X_AIFF);
            put1("mp3", MIME_AUDIO_MPEG);
            put1("mp2", MIME_AUDIO_MPEG);
            put1("mp1", MIME_AUDIO_MPEG);
            put1("mpga", MIME_AUDIO_MPEG);
            put1("kar", MIME_AUDIO_MIDI);
            put1("mid", MIME_AUDIO_MIDI);
            put1("midi", MIME_AUDIO_MIDI);
            put1("dtd", MIME_APPLICATION_XML_DTD);
            put1("xsl", MIME_APPLICATION_XML);
            put1("xml", MIME_APPLICATION_XML);
            put1("xslt", MIME_APPLICATION_XSLT_XML);
            put1("xht", MIME_APPLICATION_XHTML_XML);
            put1("xhtml", MIME_APPLICATION_XHTML_XML);
            put1("src", MIME_APPLICATION_X_WAIS_SOURCE);
            put1("ustar", MIME_APPLICATION_X_USTAR);
            put1("ms", MIME_APPLICATION_X_TROFF_MS);
            put1("me", MIME_APPLICATION_X_TROFF_ME);
            put1("man", MIME_APPLICATION_X_TROFF_MAN);
            put1("roff", MIME_APPLICATION_X_TROFF);
            put1("tr", MIME_APPLICATION_X_TROFF);
            put1("t", MIME_APPLICATION_X_TROFF);
            put1("texi", MIME_APPLICATION_X_TEXINFO);
            put1("texinfo", MIME_APPLICATION_X_TEXINFO);
            put1("tex", MIME_APPLICATION_X_TEX);
            put1("tcl", MIME_APPLICATION_X_TCL);
            put1("sv4crc", MIME_APPLICATION_X_SV4CRC);
            put1("sv4cpio", MIME_APPLICATION_X_SV4CPIO);
            put1("sit", MIME_APPLICATION_X_STUFFIT);
            put1("swf", MIME_APPLICATION_X_SHOCKWAVE_FLASH);
            put1("shar", MIME_APPLICATION_X_SHAR);
            put1("sh", MIME_APPLICATION_X_SH);
            put1("cdf", MIME_APPLICATION_X_NETCDF);
            put1("nc", MIME_APPLICATION_X_NETCDF);
            put1("latex", MIME_APPLICATION_X_LATEX);
            put1("skm", MIME_APPLICATION_X_KOAN);
            put1("skt", MIME_APPLICATION_X_KOAN);
            put1("skd", MIME_APPLICATION_X_KOAN);
            put1("skp", MIME_APPLICATION_X_KOAN);
            put1("js", MIME_APPLICATION_X_JAVASCRIPT);
            put1("hdf", MIME_APPLICATION_X_HDF);
            put1("gtar", MIME_APPLICATION_X_GTAR);
            put1("spl", MIME_APPLICATION_X_FUTURESPLASH);
            put1("dvi", MIME_APPLICATION_X_DVI);
            put1("dxr", MIME_APPLICATION_X_DIRECTOR);
            put1("dir", MIME_APPLICATION_X_DIRECTOR);
            put1("dcr", MIME_APPLICATION_X_DIRECTOR);
            put1("csh", MIME_APPLICATION_X_CSH);
            put1("cpio", MIME_APPLICATION_X_CPIO);
            put1("pgn", MIME_APPLICATION_X_CHESS_PGN);
            put1("vcd", MIME_APPLICATION_X_CDLINK);
            put1("bcpio", MIME_APPLICATION_X_BCPIO);
            put1("rm", MIME_APPLICATION_VND_RNREALMEDIA);
            put1("ppt", MIME_APPLICATION_VND_MSPOWERPOINT);
            put1("mif", MIME_APPLICATION_VND_MIF);
            put1("grxml", MIME_APPLICATION_SRGS_XML);
            put1("gram", MIME_APPLICATION_SRGS);
            put1("smil", MIME_APPLICATION_RDF_SMIL);
            put1("smi", MIME_APPLICATION_RDF_SMIL);
            put1("rdf", MIME_APPLICATION_RDF_XML);
            put1("ogg", MIME_APPLICATION_X_OGG);
            put1("oda", MIME_APPLICATION_ODA);
            put1("dmg", MIME_APPLICATION_OCTET_STREAM);
            put1("lzh", MIME_APPLICATION_OCTET_STREAM);
            put1("so", MIME_APPLICATION_OCTET_STREAM);
            put1("lha", MIME_APPLICATION_OCTET_STREAM);
            put1("dms", MIME_APPLICATION_OCTET_STREAM);
            put1("bin", MIME_APPLICATION_OCTET_STREAM);
            put1("mathml", MIME_APPLICATION_MATHML_XML);
            put1("cpt", MIME_APPLICATION_MAC_COMPACTPRO);
            put1("hqx", MIME_APPLICATION_MAC_BINHEX40);
            put1("jnlp", MIME_APPLICATION_JNLP);
            put1("ez", MIME_APPLICATION_ANDREW_INSET);
            put1("txt", MIME_TEXT_PLAIN);
            put1("ini", MIME_TEXT_PLAIN);
            put1("c", MIME_TEXT_PLAIN);
            put1("h", MIME_TEXT_PLAIN);
            put1("cpp", MIME_TEXT_PLAIN);
            put1("cxx", MIME_TEXT_PLAIN);
            put1("cc", MIME_TEXT_PLAIN);
            put1("chh", MIME_TEXT_PLAIN);
            put1("java", MIME_TEXT_PLAIN);
            put1("csv", MIME_TEXT_PLAIN);
            put1("bat", MIME_TEXT_PLAIN);
            put1("cmd", MIME_TEXT_PLAIN);
            put1("asc", MIME_TEXT_PLAIN);
            put1("rtf", MIME_TEXT_RTF);
            put1("rtx", MIME_TEXT_RICHTEXT);
            put1("html", MIME_TEXT_HTML);
            put1("htm", MIME_TEXT_HTML);
            put1("zip", MIME_APPLICATION_ZIP);
            put1("rar", MIME_APPLICATION_X_RAR_COMPRESSED);
            put1("gzip", MIME_APPLICATION_X_GZIP);
            put1("gz", MIME_APPLICATION_X_GZIP);
            put1("tgz", MIME_APPLICATION_TGZ);
            put1("tar", MIME_APPLICATION_X_TAR);
            put1("gif", MIME_IMAGE_GIF);
            put1("jpeg", MIME_IMAGE_JPEG);
            put1("jpg", MIME_IMAGE_JPEG);
            put1("jpe", MIME_IMAGE_JPEG);
            put1("tiff", MIME_IMAGE_TIFF);
            put1("tif", MIME_IMAGE_TIFF);
            put1("png", MIME_IMAGE_PNG);
            put1("au", MIME_AUDIO_BASIC);
            put1("snd", MIME_AUDIO_BASIC);
            put1("wav", MIME_AUDIO_X_WAV);
            put1("mov", MIME_VIDEO_QUICKTIME);
            put1("qt", MIME_VIDEO_QUICKTIME);
            put1("mpeg", MIME_VIDEO_MPEG);
            put1("mpg", MIME_VIDEO_MPEG);
            put1("mpe", MIME_VIDEO_MPEG);
            put1("abs", MIME_VIDEO_MPEG);
            put1("doc", MIME_APPLICATION_MSWORD);
            put1("docx", MIME_APPLICATION_MSWORD_2007);
            put1("odt", MIME_APPLICATION_VND_TEXT);
            put1("xls", MIME_APPLICATION_VND_MSEXCEL);
            put1("xlsx", MIME_APPLICATION_VND_MSEXCEL_2007);
            put1("ods", MIME_APPLICATION_VND_SPREADSHEET);
            put1("eps", MIME_APPLICATION_POSTSCRIPT);
            put1("ai", MIME_APPLICATION_POSTSCRIPT);
            put1("ps", MIME_APPLICATION_POSTSCRIPT);
            put1("pdf", MIME_APPLICATION_PDF);
            put1("exe", MIME_APPLICATION_OCTET_STREAM);
            put1("dll", MIME_APPLICATION_OCTET_STREAM);
            put1("class", MIME_APPLICATION_OCTET_STREAM);
            put1("jar", MIME_APPLICATION_JAVA_ARCHIVE);
        }
    };
}

static {
    extMapping = new HashMap<String, String>(200) {
        private void put1(String key, String value) {
            if (put(key, value) != null) {
                throw new IllegalArgumentException("Duplicated Mimetype: " + key);
            }
        }

        {
            put1(MIME_APPLICATION_VND_MOZZILLA_XUL_XML, "xul");
            put1(MIME_APPLICATION_JSON, "json");
            put1(MIME_X_CONFERENCE_X_COOLTALK, "ice");
            put1(MIME_VIDEO_X_SGI_MOVIE, "movie");
            put1(MIME_VIDEO_X_MSVIDEO, "avi");
            put1(MIME_VIDEO_X_MS_WMV, "wmv");
            put1(MIME_VIDEO_VND_MPEGURL, "m4u");
            put1(MIME_TEXT_X_COMPONENT, "htc");
            put1(MIME_TEXT_X_SETEXT, "etx");
            put1(MIME_TEXT_VND_WAP_WMLSCRIPT, "wmls");
            put1(MIME_TEXT_VND_WAP_XML, "wml");
            put1(MIME_TEXT_TAB_SEPARATED_VALUES, "tsv");
            put1(MIME_TEXT_SGML, "sgml");
            put1(MIME_TEXT_CSS, "css");
            put1(MIME_TEXT_CALENDAR, "ics");
            put1(MIME_MODEL_VRLM, "vrlm");
            put1(MIME_MODEL_MESH, "mesh");
            put1(MIME_MODEL_IGES, "iges");
            put1(MIME_IMAGE_X_RGB, "rgb");
            put1(MIME_IMAGE_X_PORTABLE_PIXMAP, "ppm");
            put1(MIME_IMAGE_X_PORTABLE_GRAYMAP, "pgm");
            put1(MIME_IMAGE_X_PORTABLE_BITMAP, "pbm");
            put1(MIME_IMAGE_X_PORTABLE_ANYMAP, "pnm");
            put1(MIME_IMAGE_X_ICON, "ico");
            put1(MIME_IMAGE_X_CMU_RASTER, "ras");
            put1(MIME_IMAGE_WAP_WBMP, "wbmp");
            put1(MIME_IMAGE_VND_DJVU, "djvu");
            put1(MIME_IMAGE_SVG_XML, "svg");
            put1(MIME_IMAGE_IEF, "ief");
            put1(MIME_IMAGE_CGM, "cgm");
            put1(MIME_IMAGE_BMP, "bmp");
            put1(MIME_CHEMICAL_X_XYZ, "xyz");
            put1(MIME_CHEMICAL_X_PDB, "pdb");
            put1(MIME_AUDIO_X_PN_REALAUDIO, "ra");
            put1(MIME_AUDIO_X_MPEGURL, "m3u");
            put1(MIME_AUDIO_X_AIFF, "aiff");
            put1(MIME_AUDIO_MPEG, "mp3");
            put1(MIME_AUDIO_MIDI, "midi");
            put1(MIME_APPLICATION_XML_DTD, "dtd");
            put1(MIME_APPLICATION_XML, "xml");
            put1(MIME_APPLICATION_XSLT_XML, "xslt");
            put1(MIME_APPLICATION_XHTML_XML, "xhtml");
            put1(MIME_APPLICATION_X_WAIS_SOURCE, "src");
            put1(MIME_APPLICATION_X_USTAR, "ustar");
            put1(MIME_APPLICATION_X_TROFF_MS, "ms");
            put1(MIME_APPLICATION_X_TROFF_ME, "me");
            put1(MIME_APPLICATION_X_TROFF_MAN, "man");
            put1(MIME_APPLICATION_X_TROFF, "roff");
            put1(MIME_APPLICATION_X_TEXINFO, "texi");
            put1(MIME_APPLICATION_X_TEX, "tex");
            put1(MIME_APPLICATION_X_TCL, "tcl");
            put1(MIME_APPLICATION_X_SV4CRC, "sv4crc");
            put1(MIME_APPLICATION_X_SV4CPIO, "sv4cpio");
            put1(MIME_APPLICATION_X_STUFFIT, "sit");
            put1(MIME_APPLICATION_X_SHOCKWAVE_FLASH, "swf");
            put1(MIME_APPLICATION_X_SHAR, "shar");
            put1(MIME_APPLICATION_X_SH, "sh");
            put1(MIME_APPLICATION_X_NETCDF, "cdf");
            put1(MIME_APPLICATION_X_LATEX, "latex");
            put1(MIME_APPLICATION_X_KOAN, "skm");
            put1(MIME_APPLICATION_X_JAVASCRIPT, "js");
            put1(MIME_APPLICATION_X_HDF, "hdf");
            put1(MIME_APPLICATION_X_GTAR, "gtar");
            put1(MIME_APPLICATION_X_FUTURESPLASH, "spl");
            put1(MIME_APPLICATION_X_DVI, "dvi");
            put1(MIME_APPLICATION_X_DIRECTOR, "dir");
            put1(MIME_APPLICATION_X_CSH, "csh");
            put1(MIME_APPLICATION_X_CPIO, "cpio");
            put1(MIME_APPLICATION_X_CHESS_PGN, "pgn");
            put1(MIME_APPLICATION_X_CDLINK, "vcd");
            put1(MIME_APPLICATION_X_BCPIO, "bcpio");
            put1(MIME_APPLICATION_VND_RNREALMEDIA, "rm");
            put1(MIME_APPLICATION_VND_MSPOWERPOINT, "ppt");
            put1(MIME_APPLICATION_VND_MIF, "mif");
            put1(MIME_APPLICATION_SRGS_XML, "grxml");
            put1(MIME_APPLICATION_SRGS, "gram");
            put1(MIME_APPLICATION_RDF_SMIL, "smil");
            put1(MIME_APPLICATION_RDF_XML, "rdf");
            put1(MIME_APPLICATION_X_OGG, "ogg");
            put1(MIME_APPLICATION_ODA, "oda");
            put1(MIME_APPLICATION_MATHML_XML, "mathml");
            put1(MIME_APPLICATION_MAC_COMPACTPRO, "cpt");
            put1(MIME_APPLICATION_MAC_BINHEX40, "hqx");
            put1(MIME_APPLICATION_JNLP, "jnlp");
            put1(MIME_APPLICATION_ANDREW_INSET, "ez");
            put1(MIME_TEXT_PLAIN, "txt");
            put1(MIME_TEXT_RTF, "rtf");
            put1(MIME_TEXT_RICHTEXT, "rtx");
            put1(MIME_TEXT_HTML, "html");
            put1(MIME_APPLICATION_ZIP, "zip");
            put1(MIME_APPLICATION_X_RAR_COMPRESSED, "rar");
            put1(MIME_APPLICATION_X_GZIP, "gzip");
            put1(MIME_APPLICATION_TGZ, "tgz");
            put1(MIME_APPLICATION_X_TAR, "tar");
            put1(MIME_IMAGE_GIF, "gif");
            put1(MIME_IMAGE_JPEG, "jpg");
            put1(MIME_IMAGE_TIFF, "tiff");
            put1(MIME_IMAGE_PNG, "png");
            put1(MIME_AUDIO_BASIC, "au");
            put1(MIME_AUDIO_X_WAV, "wav");
            put1(MIME_VIDEO_QUICKTIME, "mov");
            put1(MIME_VIDEO_MPEG, "mpg");
            put1(MIME_APPLICATION_MSWORD, "doc");
            put1(MIME_APPLICATION_MSWORD_2007, "docx");
            put1(MIME_APPLICATION_VND_TEXT, "odt");
            put1(MIME_APPLICATION_VND_MSEXCEL, "xls");
            put1(MIME_APPLICATION_VND_SPREADSHEET, "ods");
            put1(MIME_APPLICATION_POSTSCRIPT, "ps");
            put1(MIME_APPLICATION_PDF, "pdf");
            put1(MIME_APPLICATION_OCTET_STREAM, "exe");
            put1(MIME_APPLICATION_JAVA_ARCHIVE, "jar");
        }
    };
}

/**
 * Registers MIME type for provided extension. Existing extension type will be overriden.
 */
public static void registerMimeType(String ext, String mimeType) {
    mimeTypeMapping.put(ext, mimeType);
}

/**
 * Returns the corresponding MIME type to the given extension.
 * If no MIME type was found it returns 'application/octet-stream' type.
 */
public static String getMimeType(String ext) {
    String mimeType = lookupMimeType(ext);
    if (mimeType == null) {
        mimeType = MIME_APPLICATION_OCTET_STREAM;
    }
    return mimeType;
}

/**
 * Simply returns MIME type or <code>null</code> if no type is found.
 */
public static String lookupMimeType(String ext) {
    return mimeTypeMapping.get(ext.toLowerCase());
}

/**
 * Simply returns Ext or <code>null</code> if no Mimetype is found.
 */
public static String lookupExt(String mimeType) {
    return extMapping.get(mimeType.toLowerCase());
}

/**
 * Returns the default Ext to the given MimeType.
 * If no MIME type was found it returns 'unknown' ext.
 */
public static String getDefaultExt(String mimeType) {
    String ext = lookupExt(mimeType);
    if (ext == null) {
        ext = "unknown";
    }
    return ext;
}
}

And you can use it like this:你可以像这样使用它:

@Test
public void mimetypePdf() throws Exception {

    String ext = MimeTypes.getDefaultExt(MimeTypes.MIME_APPLICATION_PDF);
    Assert.assertEquals("Not equals", "pdf", ext);
}

You want to look at file tika-mimetypes.xml -> check out tika's source code and :您想查看文件 tika-mimetypes.xml -> 查看 tika 的源代码和:

org.apache.tika.mime.MimeTypesReader

     } else if (nodeElement.getTagName().equals(GLOB_TAG)) {
         boolean useRegex = Boolean.valueOf(nodeElement.getAttribute(ISREGEX_ATTR));
         types.addPattern(type, nodeElement.getAttribute(PATTERN_ATTR), useRegex);

You can then work wit然后你可以机智地工作

org.apache.tika.mime.MimeTypes

      private Patterns patterns = new Patterns(registry);

If you are using Apache tike version 1.24.1 then try following code如果您使用的是 Apache 1.24.1版, 1.24.1尝试以下代码

// File Input Stream  is "inputStream"
try 
{
    TikaConfig tikaConfig= new TikaConfig();                    
    Detector detector = tikaConfig.getDetector();

    TikaInputStream stream = TikaInputStream.get(item.getInputStream());

    Metadata metadata = new Metadata();
    metadata.add(Metadata.RESOURCE_NAME_KEY, item.getName());
    MediaType mediaType = detector.detect(stream, metadata);
     
    MimeType mimeType = tikaConfig.getMimeRepository().forName(mediaType.toString());
    String extension =  mimeType.getExtension().split("\\.")[1];
    System.out.println("File Extentions is : "+ extension);
                
} 
catch (TikaException | IOException e1) 
{
    e1.printStackTrace();
}

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM