
How to find out the extension of a file through its URL?

I am trying to find out what extension a particular URL has. Here is what I am trying to do:

$pathinfo = pathinfo('http://imgur.com/9P54j');
$extension = $pathinfo['extension'];
echo $extension;

The URL 'http://imgur.com/9P54j' actually points to the image 9P54j.gif, but that is not evident from the URL itself. How do I extract the extension .gif of the file '9P54j'?

That URL is not a URL to the .gif image, but to a page that contains the image in its HTML. You will want to parse the HTML for the URL of the image. Try this: right-click the image on the page you linked above and click "Open image" or "View image" to see the full URL.
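
Parsing the HTML for the image URL could look roughly like the sketch below. The helper name find_image_urls and the sample markup are invented for illustration; a real page would first be fetched (for example with file_get_contents) and may contain many unrelated images, so some filtering would be needed.

```php
<?php
// Hypothetical helper: return the src of every <img> element in an HTML string.
function find_image_urls(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid markup, so collect parser
    // warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $urls[] = $src;
        }
    }
    return $urls;
}

// Sample markup made up for this example, not the real imgur page.
$html = '<html><body><img src="http://i.imgur.com/9P54j.gif" alt=""></body></html>';
$urls = find_image_urls($html);

// Once the direct image URL is known, pathinfo() works on its path component.
$extension = pathinfo(parse_url($urls[0], PHP_URL_PATH), PATHINFO_EXTENSION);
```

This relies on PHP's DOM extension being enabled, which is an assumption about the environment.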

Even that URL may not have an extension, because the data may be streamed to the user via PHP. If that's the case, just check the Content-Type header to find out what the extension is.

You can use a regex to extract it, something like this:

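$url = 'http://imgur.com/9P54j';
$content = file_get_contents($url);
$explode = explode('/', $url);
// Escape the image ID so it is matched literally inside the pattern.
$id = preg_quote(array_pop($explode), '/');
// [A-Za-z] rather than [A-z]: the latter also matches [ \ ] ^ _ and `
preg_match('/http:\/\/i\.imgur\.com\/' . $id . '(\.[A-Za-z]{1,4})/', $content, $matches);
$ext = $matches[1] ?? null; // e.g. '.gif', or null if nothing matched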

My answer assumes that you would like to get the file's extension from URLs that have no extension in the URL itself.

Using pathinfo() will not work, as it retrieves the extension by text processing, and this URL simply contains no extension.
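
A small check illustrates this. The second URL, with an explicit .gif, is assumed here only for contrast:

```php
<?php
$withoutExtension = pathinfo('http://imgur.com/9P54j');
$withExtension = pathinfo('http://i.imgur.com/9P54j.gif');

// There is no dot in the last path segment, so the 'extension' key is absent.
// Reading $withoutExtension['extension'] directly would raise a warning.
$hasKey = array_key_exists('extension', $withoutExtension);

// With the PATHINFO_EXTENSION flag the result is an empty string instead.
$flagged = pathinfo('http://imgur.com/9P54j', PATHINFO_EXTENSION);
```

Since pathinfo() never contacts the server, it can only report what is literally written in the string.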

One approach is to use lower-level HTTP functionality that sends an HTTP request to the URL and fetches the response headers. The response headers should normally contain a 'Content-Type:' header that gives the MIME type of the content.

Once you have the 'Content-Type' header, you can use a translation table to map the MIME type to a file extension. The list of supported extensions will of course be limited, and some MIME types can map to more than one extension. In such cases you would have to investigate the file's content itself.
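
One possible form of such an investigation is to look at the leading bytes of the content, which many image formats use as a signature. The following is only a sketch: the function name is hypothetical, and it covers just three formats.

```php
<?php
// Hypothetical helper: guess an extension from the first bytes of the content.
function guess_extension_from_bytes(string $bytes): ?string
{
    // Leading signatures ("magic numbers") of a few common image formats.
    $signatures = [
        'gif' => ["GIF87a", "GIF89a"],
        'png' => ["\x89PNG\r\n\x1a\n"],
        'jpg' => ["\xFF\xD8\xFF"],
    ];

    foreach ($signatures as $extension => $prefixes) {
        foreach ($prefixes as $prefix) {
            // Compare only as many bytes as the signature is long.
            if (strncmp($bytes, $prefix, strlen($prefix)) === 0) {
                return $extension;
            }
        }
    }

    return null; // content not recognised
}
```

The first few bytes of the downloaded body are enough as input, so the whole file does not need to be kept in memory for this check.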

As a real PHP program would be too large for this answer, I'll give just pseudocode:

<?php

function get_extension_from_mimetype($url) {
    // static translation table, to be extended
    static $translationTable = array (
        'image/gif' => 'gif',
        'image/jpeg' => 'jpg',
        'image/png' => 'png',
        'text/xml' => 'xml',
        'text/html' => 'html'
    );

    // fallback for MIME types missing from the table
    $defaultExtension = 'dat';

    // you'll have to write this function; it should return
    // the MIME type from the response's Content-Type header
    $mimetype = get_mimetype_by_url($url);

    if(isset($translationTable[$mimetype])) {
        $extension = $translationTable[$mimetype];
    } else {
        $extension = $defaultExtension;
    }

    return $extension;
}
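
The missing get_mimetype_by_url() could be sketched as follows, assuming that PHP's built-in get_headers() is acceptable and that allow_url_fopen is enabled. The helper extract_mimetype() is a hypothetical name introduced here so that the header parsing can be tried without a network connection.

```php
<?php
// Hypothetical helper: pull the media type out of a list of raw
// header lines in the form returned by get_headers().
function extract_mimetype(array $headerLines): ?string
{
    $mimetype = null;
    foreach ($headerLines as $line) {
        // Header names are case-insensitive. Parameters such as
        // "; charset=UTF-8" follow the media type and are dropped.
        if (preg_match('/^Content-Type:\s*([^;\s]+)/i', $line, $m)) {
            // Keep the last match: after redirects, the headers of the
            // final response come last in the list.
            $mimetype = strtolower($m[1]);
        }
    }
    return $mimetype;
}

function get_mimetype_by_url(string $url): ?string
{
    // get_headers() sends a request and returns the response header
    // lines, or false on failure.
    $headerLines = @get_headers($url);
    return $headerLines === false ? null : extract_mimetype($headerLines);
}
```

For example, extract_mimetype(['HTTP/1.1 200 OK', 'Content-Type: image/gif']) yields 'image/gif'. When no Content-Type header is present the sketch returns null, which the caller should treat as "unknown" and map to the default extension.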
