
Azure Data Factory V2 multiple environments like in SSIS

I come from a long SSIS background. We're looking to use Azure Data Factory v2, but I'm struggling to find a clear way of working with multiple environments. In SSIS we would have project parameters tied to the Visual Studio project configuration (e.g. development/test/production). Say there were two parameters, SourceServerName and DestinationServerName: these would point to different servers depending on whether we were in development or test.

From my initial playing around I can't see any way to do this in Data Factory. I've searched Google of course, but the information I've found seems to be about CI/CD, then talks about Git 'branches', and is difficult to follow.

I'm basically looking for a very simple explanation and example of how this would be achieved in Azure Data Factory v2 (if it is even possible).

It works differently. You create one instance of Data Factory per environment, and the environment is effectively embedded in each instance.

So here's one simple approach:

  1. Create three data factories: dev, test, prod
  2. Create your linked services in the dev environment, pointing at dev sources and targets
  3. Create linked services with the same names in test, but of course these point at your test systems
  4. Now when you "migrate" your pipelines from dev to test, they use the same logical name (just like a connection manager)

So you don't designate an environment at execution time or map variables or anything. Everything in test just runs against test because that's the way the linked services have been defined.
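
As a rough illustration of that idea (a toy Python model, not ADF code), each factory can be thought of as its own lookup table of linked services. The factory names, linked service names, pipeline name, and hostnames below are all made up for the example.

```python
# Toy model: every data factory defines linked services under the SAME logical
# names, but each factory points them at its own environment's systems.
# All names and hostnames here are hypothetical.
FACTORIES = {
    "adf-dev": {
        "SourceDb": "dev-source.example.com",
        "DestinationDb": "dev-dest.example.com",
    },
    "adf-test": {
        "SourceDb": "test-source.example.com",
        "DestinationDb": "test-dest.example.com",
    },
    "adf-prod": {
        "SourceDb": "prod-source.example.com",
        "DestinationDb": "prod-dest.example.com",
    },
}

# The pipeline only knows logical linked service names, never hostnames,
# so the same definition can be copied unchanged between factories.
PIPELINE = {"name": "CopyCustomers", "source": "SourceDb", "sink": "DestinationDb"}


def resolve(factory_name: str, pipeline: dict) -> dict:
    """Return the physical hosts the pipeline would use in the given factory."""
    services = FACTORIES[factory_name]
    return {
        "source": services[pipeline["source"]],
        "sink": services[pipeline["sink"]],
    }


if __name__ == "__main__":
    for factory in FACTORIES:
        print(factory, resolve(factory, PIPELINE))
```

The pipeline definition never changes; only the factory it runs in decides which servers it talks to.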

That's the first step.

The next step is to connect only the dev ADF instance to Git. If you're a newcomer to Git it can be daunting, but it's just a version control system. You save your code to it and it remembers every change you made.

Once your pipeline code is in Git, the theory is that you migrate code out of Git into higher environments in an automated fashion.

If you go through the links provided in the other answer, you'll see how to set it up.

I do have an issue with this approach though: you have to look up all of your environment values in a keystore, which seems silly to me. Why do we need to designate the test server's hostname every time we deploy to test?

One last thing: if you have a pipeline that doesn't use a linked service (say a REST pipeline), I haven't found a way to make it environment-aware. I ended up building logic around the current data factory's name to dynamically change endpoints.
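
In ADF the factory name is available to pipeline expressions as the system variable `@pipeline().DataFactory`. The branching logic built on it might look something like this Python sketch. The naming convention (factory names ending in `-dev`, `-test`, `-prod`) and the endpoint URLs are assumptions made for the example, not something the approach requires.

```python
# Hypothetical convention: the factory name ends with its environment suffix,
# e.g. "contoso-adf-dev". The URLs are placeholders.
ENDPOINTS = {
    "dev": "https://api-dev.example.com",
    "test": "https://api-test.example.com",
    "prod": "https://api.example.com",
}


def endpoint_for(factory_name: str) -> str:
    """Pick the REST endpoint from the environment suffix of the factory name."""
    suffix = factory_name.rsplit("-", 1)[-1].lower()
    if suffix not in ENDPOINTS:
        raise ValueError(
            f"Cannot infer environment from factory name {factory_name!r}"
        )
    return ENDPOINTS[suffix]


if __name__ == "__main__":
    print(endpoint_for("contoso-adf-test"))
```

Failing loudly on an unrecognised name is a deliberate choice here: silently falling back to a default endpoint could send test traffic to production.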

This is a bit of a brain dump, but feel free to ask questions.

Although it's not recommended, yes, you can do it.
Take a look at the Linked Service. In this case, I have a connection to Azure SQL Database:
[image]
You can use dynamic content for both the server name and the database name. Just add a parameter to your pipeline, pass it to the Linked Service, and use it in the required field.
Let me know whether I explained it clearly enough.
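
For reference, a parameterized linked service definition might look roughly like the JSON this Python sketch prints. The `@{linkedService().<parameter>}` expression syntax is ADF's own; the linked service name, the parameter names `ServerName` and `DatabaseName`, and the example values are placeholders, and credentials are deliberately left out. The `resolve_connection_string` helper is only a simulation of the substitution ADF performs at runtime.

```python
import json


def build_linked_service(name: str) -> dict:
    """Build a linked service definition with two parameters (names are illustrative)."""
    return {
        "name": name,
        "properties": {
            "type": "AzureSqlDatabase",
            "parameters": {
                "ServerName": {"type": "String"},
                "DatabaseName": {"type": "String"},
            },
            "typeProperties": {
                # Credentials omitted on purpose; only the parameterized parts are shown.
                "connectionString": (
                    "Server=tcp:@{linkedService().ServerName},1433;"
                    "Database=@{linkedService().DatabaseName};"
                )
            },
        },
    }


def resolve_connection_string(definition: dict, values: dict) -> str:
    """Simulate the substitution of parameter values into the connection string."""
    conn = definition["properties"]["typeProperties"]["connectionString"]
    for key, value in values.items():
        conn = conn.replace("@{linkedService()." + key + "}", value)
    return conn


if __name__ == "__main__":
    definition = build_linked_service("AzureSqlDb")
    print(json.dumps(definition, indent=2))
    print(
        resolve_connection_string(
            definition,
            {
                "ServerName": "dev-sql.database.windows.net",
                "DatabaseName": "SalesDev",
            },
        )
    )
```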

Yes, it's possible, although not as simple as it was in VS for SSIS.
1) First of all: there is no desktop application for developing ADF, only the browser.
Therefore developers should make their changes in the DEV environment and, for many reasons, the best way to do that is to work with a Git repository connected.
2) Then you "only" need to:
a) publish the changes (this creates/updates the adf_publish branch in Git)
b) with Azure DevOps, deploy the code from adf_publish, replacing the required parameters for the target environment. I know that at the beginning it sounds horrible, but the sooner you set up an environment like this, the more time you save while developing pipelines.
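
Replacing the parameters for the target environment amounts to overriding values in an ARM template parameters file. A minimal Python sketch of that override follows, assuming the standard ARM parameters-file layout (`parameters`, then the parameter name, then `value`). The parameter names and values are invented for illustration, and in a real release the Azure DevOps deployment step would typically apply the overrides rather than a script like this.

```python
import copy
import json


def override_parameters(params_doc: dict, overrides: dict) -> dict:
    """Return a copy of an ARM parameters document with the given values replaced."""
    result = copy.deepcopy(params_doc)
    for name, value in overrides.items():
        if name not in result["parameters"]:
            # Refuse typos instead of silently adding an unused parameter.
            raise KeyError(f"Unknown template parameter: {name}")
        result["parameters"][name]["value"] = value
    return result


# Hypothetical parameters as they might come out of the dev factory.
DEV_PARAMETERS = {
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "factoryName": {"value": "adf-dev"},
        "SourceDb_connectionString": {"value": "Server=dev-source.example.com;"},
    },
}

# Hypothetical values for the test environment.
TEST_OVERRIDES = {
    "factoryName": "adf-test",
    "SourceDb_connectionString": "Server=test-source.example.com;",
}

if __name__ == "__main__":
    print(json.dumps(override_parameters(DEV_PARAMETERS, TEST_OVERRIDES), indent=2))
```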

How to do these things step by step?
I describe all the steps in the following posts:
- Setting up Code Repository for Azure Data Factory v2
- Deployment of Azure Data Factory with Azure DevOps

I hope this helps.

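Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you republish them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.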

 