Creating a REST API to allow upload of large data sets
I am currently creating a suite of REST APIs that will be used to upload a non-determined number of rows of information to our database. These APIs will be used by developers from a third-party company.
The amount of information would start with a daily bulk upload of about 4k rows, with an estimated increase of up to 5k more rows in about 4 months. My question is: what would be the best way to design this upload API?
Before I write down some of the ideas I've been reading about, here are some considerations to take into account.
The overall structure of a row of information looks like this, times 4k:
"data": [
    {"InfoID": 1, "InfoName": "HELLO", "InfoValue": 1.00, "InfoDate": "2019-01-01"},
    {"InfoID": 2, "InfoName": "WORLD", "InfoValue": 2.00, "InfoDate": "2019-01-02"}
]
Some of the ideas I've read about for designing this type of API are:
Any opinions, recommendations, and ideas would be helpful in making a design decision.
I would suggest a single endpoint that accepts POST requests. Let the body of the request be the entire batch of data in whatever formats you choose to accept it in: JSON, XML, CSV, etc. Have clients specify the Content-Type header to indicate what format they're sending the information in, and parse that format to apply the batch of changes. If it's going to take more than a second or so to reply, send a 202 Accepted right away along with a Location header pointing at an endpoint where they can get a progress report on how the batch processing is going.
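The 202 Accepted + Location pattern can be sketched in framework-agnostic Python. The handler names, URL shape, and in-memory job store below are illustrative assumptions, not anything specified in the question:

```python
import uuid

# In-memory job store; a real service would use a database or a queue.
jobs = {}

def submit_batch(rows):
    """Accept a batch and return the pieces of a 202 Accepted response:
    a status code and a Location header for polling progress."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "processing", "total": len(rows), "done": 0}
    # A real implementation would hand `rows` off to a worker here.
    return 202, {"Location": f"/uploads/{job_id}/status"}

def get_progress(job_id):
    """Handler for the Location URL: report how the batch is going."""
    job = jobs.get(job_id)
    if job is None:
        return 404, None
    return 200, job
```

The client POSTs the batch, reads the Location header from the 202 response, and polls that URL until the job reports completion.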
Note that you'll have to decide how to handle uploads that contain some bad entries: either fail the whole batch, or accept what you can.
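Both error-handling policies can hang off one validation pass. A minimal sketch, assuming the four fields from the question's example are all required (the function and flag names are made up for illustration):

```python
REQUIRED_FIELDS = ("InfoID", "InfoName", "InfoValue", "InfoDate")

def validate_batch(rows, fail_whole_batch=True):
    """Split a batch into valid rows and per-row error reports.

    With fail_whole_batch=True, any bad row rejects the entire upload;
    otherwise the valid rows are accepted and the errors reported back.
    """
    valid, errors = [], []
    for i, row in enumerate(rows):
        missing = [f for f in REQUIRED_FIELDS if f not in row]
        if missing:
            errors.append({"index": i, "missing": missing})
        else:
            valid.append(row)
    if errors and fail_whole_batch:
        return [], errors  # reject everything, but still report what was wrong
    return valid, errors
```

Either way, returning the per-row errors (with indexes) in the response body gives the third-party developers something actionable to fix and resubmit.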
Pagination is probably overkill. Based on the example you gave, 5k entries is probably less than a single megabyte. Weigh that against the annoyance of the client having to futz with pagination. As a client, I wouldn't want to have to do that.
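A quick back-of-the-envelope check of that size claim, building 5k rows shaped like the example in the question (the row contents themselves are made up for illustration):

```python
import json

# Build 5,000 rows shaped like the example payload.
rows = [
    {"InfoID": i, "InfoName": "HELLO", "InfoValue": 1.00,
     "InfoDate": "2019-01-01"}
    for i in range(5000)
]
payload = json.dumps({"data": rows})
size_bytes = len(payload.encode("utf-8"))
# For rows of this shape, the whole batch serializes to well under a megabyte.
```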
Requiring clients to POST 4k times to get all their data up is probably not the right idea, because of the performance cost. It's also unlikely that clients will want to parse the data themselves to write that loop.