How to handle Transfer-Encoding: chunked when downloading a file with .NET Core HttpClient.PostAsync

Question

Situation

I am using HttpClient (System.Net.Http, Version=4.2.1.0) to POST an HTTP request with multipart form data to a web API. The form data includes a string parameter (benchmark) and a file (addressFile) whose contents are held in stream. The API call returns a CSV file which I want to save to disk.

The response contains the header Transfer-Encoding: chunked, and the data in responseBytes includes the chunk headers. I would expect the HttpClient library to strip out these headers, which are metadata rather than part of the actual content. Instead, it simply includes the header rows in the Content.

Question

What is the correct way to handle these chunk headers?

I could of course write a method to handle the headers myself, but I find it hard to believe that the HttpClient library doesn't already have this functionality baked in somewhere.

Code

using (var client = new HttpClient())
{
    var content = new MultipartFormDataContent();
    content.Add(new StringContent("Public_AR_Current"), "benchmark");
    content.Add(new ByteArrayContent(stream.ToArray()), "addressFile", "addressFile.csv");

    var response = await client.PostAsync("https://geocoding.geo.census.gov/geocoder/locations/addressbatch", content);

    var responseBytes = await response.Content.ReadAsByteArrayAsync();
    saveResponse(responseBytes);

    var geocodedItems = ParseGeocodeResponse(responseBytes);
    var parsedItems = geocodedItems.Select(gi => gi.ToEpaHandlerUsCensusGeocode());
    return parsedItems;
}

Result

Note the chunk headers on the first and subsequent lines (0fe8, 0060, 0fe8).

0fe8
0fe8
"AK0000036228","500 HOLLYWOOD DR, ANCHORAGE, AK, 99501","Match","Exact","500 HOLLYWOOD DR, ANCHORAGE, AK, 99501","-149.87424,61.23034","190797469","R"
"AK0000363994","3155 E 18TH CIR, ANCHORAGE, AK, 99508","Match","Non_Exact","3155 E 18TH CIR, ANCHORAGE, AK, 99508","-149.82193,61.20462","190799569","L"
...
0060
28712","N 65 DEG 35 15 W 167 DEG 55 18, WALES, AK, 99734","No_Match"
"AK0000112227","KODIAK ARPR
...
0fe8
T AREA, KODIAK, AK, 99615","No_Match"
"AK0000033902","2130 E DIMOND BLVD, ANCHORAGE, AK, 99515","Match","Non_Exact","2130 W DIMOND BLVD, ANCHORAGE, AK, 99515","-149.91881,61.1375","190795925","L"
"AK0000562769","3100 TONGASS AVE, KETCHIKAN, AK, 99901-5746","No_Match"

Expected Result

I would expect the headers to be stripped out by the HttpClient library.

"AK0000036228","500 HOLLYWOOD DR, ANCHORAGE, AK, 99501","Match","Exact","500 HOLLYWOOD DR, ANCHORAGE, AK, 99501","-149.87424,61.23034","190797469","R"
"AK0000363994","3155 E 18TH CIR, ANCHORAGE, AK, 99508","Match","Non_Exact","3155 E 18TH CIR, ANCHORAGE, AK, 99508","-149.82193,61.20462","190799569","L"
"AK0000228718","1050 ASPEN ST, FAIRBANKS, AK, 99709-5501","Match","Exact","1050 ASPEN ST, FAIRBANKS, AK, 99709","-147.7731,64.8535","605310042","L"
"AK0000536714","SMITH COVE IN SMITH LAGOON T74S R86E CRM S17 & 20, KASAAN, AK, 99901","No_Match"
"AK0001413822","USS-12403, N BANK WOOD RIVER, ALEKNAGIK, AK, 99555","No_Match"
"AK0000489567","BREAKWATER BTWN WESTERN AVE & TAIT ST, METLAKATLA, AK, 99926","No_Match"

Answer 1

I ended up writing this extension method, which performs sufficiently well for my use case.

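    using System;
    using System.Globalization;
    using System.IO;
    using System.Net.Http;
    using System.Threading.Tasks;

    // The class name is illustrative; extension methods must live in a static class.
    public static class HttpContentExtensions
    {
        public static async Task<Stream> ReadAsStreamAsync(this HttpContent content, bool isChunked)
        {
            var stream = await content.ReadAsStreamAsync();
            if (!isChunked)
            {
                return stream;
            }

            var outputStream = new MemoryStream();
            var buffer = new char[1024 * 1024];

            // No using() so that we don't dispose stream.
            var tr = new StreamReader(stream);
            var tw = new StreamWriter(outputStream);

            while (!tr.EndOfStream)
            {
                // Each chunk starts with its size in hexadecimal on its own line.
                var chunkSizeStr = tr.ReadLine().Trim();
                if (chunkSizeStr.Length == 0) continue;
                var chunkSize = int.Parse(chunkSizeStr, NumberStyles.HexNumber);
                if (chunkSize == 0) break; // a zero-length chunk marks the end of the body

                // Chunk sizes count bytes, so reading chars is only correct for single-byte text.
                // Copy in buffer-sized pieces in case a chunk is larger than the buffer.
                var remaining = chunkSize;
                while (remaining > 0)
                {
                    var read = tr.ReadBlock(buffer, 0, Math.Min(buffer.Length, remaining));
                    if (read == 0) break;
                    tw.Write(buffer, 0, read);
                    remaining -= read;
                }
                tr.ReadLine(); // consume the line break that follows the chunk data
            }

            // Flush buffered text and rewind so callers can read from the start.
            tw.Flush();
            outputStream.Position = 0;
            return outputStream;
        }
    }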
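
Chunk sizes in the chunked transfer coding count bytes, not characters, so a reader-based approach like the one above is only safe when every character is a single byte. A byte-oriented variant might look like the sketch below. ChunkedDecoder and Decode are hypothetical names and are not part of System.Net.Http. The sketch assumes the chunk framing really is present in the stream, skips chunk extensions, and ignores trailers.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class ChunkedDecoder
{
    // Hypothetical helper: copies the payload of a chunk-framed stream to output.
    public static void Decode(Stream input, Stream output)
    {
        var buffer = new byte[8192];
        while (true)
        {
            var sizeLine = ReadLine(input);
            if (sizeLine == null) break;            // end of stream
            if (sizeLine.Length == 0) continue;     // tolerate stray blank lines

            // Drop any chunk extension after ';' and parse the hexadecimal size.
            var semi = sizeLine.IndexOf(';');
            var hex = (semi >= 0 ? sizeLine.Substring(0, semi) : sizeLine).Trim();
            var remaining = int.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            if (remaining == 0) break;              // zero-length chunk ends the body

            // Copy exactly 'remaining' bytes of chunk data.
            while (remaining > 0)
            {
                var read = input.Read(buffer, 0, Math.Min(buffer.Length, remaining));
                if (read == 0) throw new EndOfStreamException("Stream ended inside a chunk.");
                output.Write(buffer, 0, read);
                remaining -= read;
            }
            ReadLine(input);                        // consume the CRLF after the chunk data
        }
    }

    // Reads bytes up to LF and returns them as text without CR/LF; null at end of stream.
    private static string ReadLine(Stream input)
    {
        var sb = new StringBuilder();
        var sawAny = false;
        int b;
        while ((b = input.ReadByte()) != -1)
        {
            sawAny = true;
            if (b == '\n') return sb.ToString();
            if (b != '\r') sb.Append((char)b);
        }
        return sawAny ? sb.ToString() : null;
    }
}
```

Under these assumptions, the decoded payload could be written straight to a file by passing the stream from response.Content.ReadAsStreamAsync() as input and a stream from File.Create as output.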

Answer 2

If the response is text only, HttpClient handles it automatically (this applies only when reading the content as a string):

string result = await response.Content.ReadAsStringAsync();
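
For a text payload such as the CSV in the question, the string could then be written to disk. The following is a minimal sketch, assuming the whole body fits comfortably in memory. TextResponseSaver and SaveTextAsync are hypothetical names introduced only for illustration.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class TextResponseSaver
{
    // Hypothetical helper: reads the body as text, writes it to path, and returns the text.
    public static async Task<string> SaveTextAsync(HttpContent content, string path)
    {
        string text = await content.ReadAsStringAsync();
        File.WriteAllText(path, text);
        return text;
    }
}
```

A call might look like await TextResponseSaver.SaveTextAsync(response.Content, "geocoded.csv"), where the file name is only an example.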

2 answers

Answer 1: 3, accepted, 2018-11-23 23:04:48

Answer 2: 0, 2020-10-28 10:50:56
