Download files from Azure Data Lake

I uploaded my files to Azure Data Lake and am trying to download one of them through an ASP.NET MVC application. I have the ADL path for the file. I can download files smaller than 150 MB, but downloading files larger than 150 MB fails with a timeout error.

My code is below:

public ActionResult Download(string adlpath)
{
    String header = adlpath;
    Console.WriteLine(header);
    string[] splitedStr = header.Split('/');
    var path = GenerateDownloadPaths(adlpath);
    string filename = path["fileName"];
    HttpResponseMessage val = DataDownloadFile(path["fileSrcPath"]);
    byte[] filedata = val.Content.ReadAsByteArrayAsync().Result;
    string contentType = MimeMapping.GetMimeMapping(filename);
    var cd = new System.Net.Mime.ContentDisposition
    {
        FileName = filename,
        Inline = true,
    };
    Response.AppendHeader("Content-Disposition", cd.ToString());

    return File(filedata, contentType);
}

public HttpResponseMessage DataDownloadFile(string srcFilePath)
{
    string DownloadUrl = "https://{0}.azuredatalakestore.net/webhdfs/v1/{1}?op=OPEN&read=true";
    var fullurl = string.Format(DownloadUrl, _datalakeAccountName, srcFilePath);

    using (var client = new HttpClient())
    {
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", _accesstoken.access_token);
        using (var formData = new MultipartFormDataContent())
        {
            resp = client.GetAsync(fullurl).Result;
        }
    }
    return resp;
}


You should modify your code to use async and await. Your implementation blocks while retrieving the file, and that is probably what times out:

public async Task<HttpResponseMessage> DataDownloadFile(string srcFilePath)
{
    string DownloadUrl = "https://{0}.azuredatalakestore.net/webhdfs/v1/{1}?op=OPEN&read=true";
    var fullurl = string.Format(DownloadUrl, _datalakeAccountName, srcFilePath);

    using (var client = new HttpClient())
    {
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", _accesstoken.access_token);
        // Await the request instead of blocking on .Result.
        // The unused MultipartFormDataContent wrapper is dropped: a GET sends no form data.
        return await client.GetAsync(fullurl);
    }
}

The return type of the method is changed to Task<HttpResponseMessage> and the async modifier is added.

The call to client.GetAsync now uses await instead of blocking on the Result property.

Your code may still time out. I believe there is a configurable limit on how long a request can take before it is aborted; if you still get a timeout, you should investigate that.
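Because DataDownloadFile now returns a Task, the calling action has to await it too, otherwise the blocking simply moves up one level. Here is a minimal sketch of the Download action from the question, assuming ASP.NET MVC 5 and assuming GenerateDownloadPaths returns a string-to-string dictionary (its body is not shown in the question, so both helpers are left abstract here and the class name is hypothetical):

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using System.Web;
using System.Web.Mvc;

// Hypothetical base class: the two helpers come from the question,
// but their bodies and exact signatures are assumptions.
public abstract class DownloadControllerBase : Controller
{
    protected abstract IDictionary<string, string> GenerateDownloadPaths(string adlpath);
    protected abstract Task<HttpResponseMessage> DataDownloadFile(string srcFilePath);

    public async Task<ActionResult> Download(string adlpath)
    {
        var path = GenerateDownloadPaths(adlpath);
        string filename = path["fileName"];

        // Await the download instead of calling .Result.
        using (HttpResponseMessage val = await DataDownloadFile(path["fileSrcPath"]))
        {
            // Fail early on 4xx/5xx instead of returning an error body as the file.
            val.EnsureSuccessStatusCode();

            // Await the body read as well; this still buffers the whole file in memory.
            byte[] filedata = await val.Content.ReadAsByteArrayAsync();

            string contentType = MimeMapping.GetMimeMapping(filename);
            var cd = new System.Net.Mime.ContentDisposition
            {
                FileName = filename,
                Inline = true,
            };
            Response.AppendHeader("Content-Disposition", cd.ToString());

            return File(filedata, contentType);
        }
    }
}
```

This only removes the blocking waits. It does not raise any timeout limit by itself.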

As I understand it, you could try increasing HttpClient.Timeout (100 seconds by default) for your HttpClient instance.

HttpClient.Timeout

Gets or sets the timespan to wait before the request times out.

The default value is 100,000 milliseconds (100 seconds).
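As a rough illustration of raising that limit, the sketch below builds an HttpClient with a caller-supplied timeout. The DataLakeHttp class and CreateClient method are hypothetical names, not part of the original code, and the ten-minute value in the usage note is only an example:

```csharp
using System;
using System.Net.Http;

// Hypothetical helper, not from the original post.
public static class DataLakeHttp
{
    // Returns an HttpClient whose Timeout replaces the 100-second default.
    // Timeout must be set before the client sends its first request.
    public static HttpClient CreateClient(TimeSpan timeout)
    {
        return new HttpClient { Timeout = timeout };
    }
}
```

For example, `var client = DataLakeHttp.CreateClient(TimeSpan.FromMinutes(10));` could replace `new HttpClient()` in DataDownloadFile. Pick a value that fits the largest file you expect.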

Moreover, if you host your application on Azure Web App, you may run into the Azure Load Balancer's idle timeout of 4 minutes. The idle timeout setting can be changed for Azure VMs and Azure Cloud Services. You can find the details here.
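An idle timeout concerns connections on which no data is flowing, so one option that may help is to stream the file to the browser while it is still being read from Data Lake, rather than buffering it all first. This is a sketch only, with no guarantee that it avoids the load balancer limit. DataLakeStreaming, BuildOpenUrl and CopyFileToAsync are hypothetical names; the URL format is the one from the question:

```csharp
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not from the original post.
public static class DataLakeStreaming
{
    // Builds the WebHDFS OPEN URL used in the question.
    public static string BuildOpenUrl(string accountName, string srcFilePath)
    {
        return string.Format(
            "https://{0}.azuredatalakestore.net/webhdfs/v1/{1}?op=OPEN&read=true",
            accountName, srcFilePath);
    }

    // Copies the remote file to 'destination' chunk by chunk instead of
    // loading the whole body into a byte array.
    public static async Task CopyFileToAsync(
        HttpClient client, string url, string accessToken,
        Stream destination, CancellationToken cancellationToken)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, url))
        {
            request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", accessToken);

            // ResponseHeadersRead completes as soon as the headers arrive,
            // so the body can be read as a stream.
            using (var response = await client.SendAsync(
                request, HttpCompletionOption.ResponseHeadersRead, cancellationToken))
            {
                response.EnsureSuccessStatusCode();
                using (var source = await response.Content.ReadAsStreamAsync())
                {
                    // 81920 bytes is the default copy buffer size in .NET.
                    await source.CopyToAsync(destination, 81920, cancellationToken);
                }
            }
        }
    }
}
```

In an MVC action the destination could be Response.OutputStream with Response.BufferOutput set to false, so bytes reach the client as they arrive. The response headers, such as Content-Disposition, must be set before the first byte is written.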
