How to check the record count of a CSV file uploaded to Azure Blob Storage?
I am uploading a 2 GB CSV file to my Blob storage, and I want the record count (number of rows) of this file so that I can validate it after it gets loaded into ADW. Is there any way to get the record count (like the column count) in Azure itself?
Thanks in advance.
Azure blobs are not like local files: you'd have to download (or stream) your blob to something that reads through the file in order to perform whatever calculation you're trying to do.
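To illustrate the streaming option, here is a minimal sketch in Python. It assumes the `azure-storage-blob` package is installed; the connection string, container name, and blob name are placeholders you would supply yourself. It counts newline characters chunk by chunk, so the 2 GB file is never held in memory at once. It also assumes one record per line: if any quoted field contains an embedded line break, the result will overcount.

```python
from typing import Iterable


def count_lines(chunks: Iterable[bytes]) -> int:
    """Count lines across byte chunks, including a final line with no trailing newline."""
    lines = 0
    last_byte = b"\n"  # empty input has no pending partial line
    for chunk in chunks:
        if not chunk:
            continue
        lines += chunk.count(b"\n")
        last_byte = chunk[-1:]
    if last_byte != b"\n":
        lines += 1  # the file ended without a newline, so count the last line
    return lines


def count_blob_records(conn_str: str, container: str, blob_name: str,
                       has_header: bool = True) -> int:
    """Stream a blob and return its row count (arguments are placeholders, not real names)."""
    # Imported lazily so count_lines can be used without the Azure SDK installed.
    from azure.storage.blob import BlobClient

    blob = BlobClient.from_connection_string(
        conn_str, container_name=container, blob_name=blob_name
    )
    # download_blob() returns a stream; chunks() yields it piece by piece as bytes.
    total = count_lines(blob.download_blob().chunks())
    return max(total - 1, 0) if has_header else total
```

Something like `count_blob_records(conn_str, "mycontainer", "data.csv")` would then give a number to compare against the row count in ADW after the load. The data still has to leave storage and pass through whatever machine runs this code.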
Alternatively, you could mount your blob storage to something like Databricks (a Spark cluster) and write your code there (same basic concept).
Or... you could do your record count prior to (or during) your upload to blob storage.
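Counting before the upload could look something like the sketch below, which uses only Python's standard `csv` module. The function name and the `has_header` flag are illustrative assumptions. Unlike a plain newline count, `csv.reader` treats a quoted field that spans several lines as a single record, at the cost of being slower on a large file.

```python
import csv


def count_csv_records(path: str, has_header: bool = True) -> int:
    """Count the data records in a local CSV file before it is uploaded."""
    # newline="" lets the csv module handle line breaks inside quoted fields itself.
    with open(path, newline="", encoding="utf-8") as f:
        total = sum(1 for _ in csv.reader(f))
    return max(total - 1, 0) if has_header else total
```

You could store the resulting number alongside the upload (for example in a log or as blob metadata) and compare it with the row count in ADW afterwards.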
Ultimately, how you perform this counting is really up to you. Blob storage is just bulk storage and knows nothing about file formats.