Time difference reading files from Blob-Storage-Container

We are using Blobfuse for "mounting" our blob-storage-container to an Azure virtual machine as well as to Azure ML Studio.
In our blob-storage-container there are around 400 files, each about 1.5 MB.

With the Azure VM, the algorithm needs 45 seconds to read all files.
With Azure ML Studio, the same algorithm needs 5 minutes to read all files.

The Azure VM resource and the Azure ML Studio resource are in the same tenant.
These resources use two different computes, but the computes have the same specifications.

Why does it take so much longer to read all the files when using Azure ML Studio compared to the Azure VM?
Is it possible to reduce the time needed for reading all files when using Azure ML Studio without changing the storage file hierarchy in any way?

It shouldn't take more time to read the files with ML Studio than with the VM.

  1. Verify how authentication to the storage account is configured; a misconfiguration here can cause a performance drop.

  2. Check the end-to-end (E2E) latency for both reads and writes.

  3. Verify the connection to your data in Azure storage services through Azure Machine Learning datastores.

  4. Azure Machine Learning requires additional configuration steps to communicate with a storage account that is behind a firewall or within a virtual network.

  5. If the storage account is behind a firewall, you can add your client's IP address to an allow list via the Azure portal.
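
To make the latency check in the list above concrete, a minimal sketch like the following could be run on both the VM and the ML Studio compute and the numbers compared. It only uses the Python standard library and reads through the mounted directory; the function names `time_file_reads` and `summarize` are made up for this illustration, and the mount path you pass in is whatever your Blobfuse mount point happens to be.

```python
import time
from pathlib import Path


def time_file_reads(mount_dir):
    """Read every regular file under mount_dir and time each read.

    Returns a list of (path, size_in_bytes, seconds) tuples.
    """
    timings = []
    for path in sorted(Path(mount_dir).rglob("*")):
        if not path.is_file():
            continue  # skip directories and other non-file entries
        start = time.perf_counter()
        size = len(path.read_bytes())  # full read, as the algorithm would do
        elapsed = time.perf_counter() - start
        timings.append((str(path), size, elapsed))
    return timings


def summarize(timings):
    """Aggregate the per-file timings into a few comparable numbers."""
    total_bytes = sum(size for _, size, _ in timings)
    total_seconds = sum(elapsed for _, _, elapsed in timings)
    slowest = max(timings, key=lambda t: t[2]) if timings else None
    return {
        "files": len(timings),
        "total_mb": total_bytes / 1e6,
        "total_seconds": total_seconds,
        "slowest": slowest,
    }
```

Calling `summarize(time_file_reads(mount_dir))` on each machine, with `mount_dir` set to the local mount point, gives the file count, total size, total read time and the slowest single file. If the per-file times are uniformly higher on ML Studio, that would point toward latency or network configuration rather than a few problematic files.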

Reference links: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-connect-data-ui?tabs=credential https://learn.microsoft.com/en-us/azure/machine-learning/concept-optimize-data-processing
