
Azure SQL failover group, what does the grace period mean?

I am currently reading this: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-auto-failover-group, and I have a hard time understanding the automatic failover policy:

By default, a failover group is configured with an automatic failover policy. The SQL Database service triggers failover after the failure is detected and the grace period has expired. The system must verify that the outage cannot be mitigated by the built-in high availability infrastructure of the SQL Database service due to the scale of the impact. If you want to control the failover workflow from the application, you can turn off automatic failover.

When defining the failover group in an ARM template:

{
  "condition": "[equals(parameters('redundancyId'), 'pri')]",
  "type": "Microsoft.Sql/servers",
  "kind": "v12.0",
  "name": "[variables('sqlServerPrimaryName')]",
  "apiVersion": "2014-04-01-preview",
  "location": "[parameters('location')]",
  "properties": {
    "administratorLogin": "[parameters('sqlServerPrimaryAdminUsername')]",
    "administratorLoginPassword": "[parameters('sqlServerPrimaryAdminPassword')]",
    "version": "12.0"
  },
  "resources": [
    {
      "condition": "[equals(parameters('redundancyId'), 'pri')]",
      "apiVersion": "2015-05-01-preview",
      "type": "failoverGroups",
      "name": "[variables('sqlFailoverGroupName')]",
      "properties": {
        "serverName": "[variables('sqlServerPrimaryName')]",
        "partnerServers": [
          {
            "id": "[resourceId('Microsoft.Sql/servers/', variables('sqlServerSecondaryName'))]"
          }
        ],
        "readWriteEndpoint": {
          "failoverPolicy": "Automatic",
          "failoverWithDataLossGracePeriodMinutes": 60
        },
        "readOnlyEndpoint": {
          "failoverPolicy": "Disabled"
        },
        "databases": [
          "[resourceId('Microsoft.Sql/servers/databases', variables('sqlServerPrimaryName'), variables('sqlDatabaseName'))]"
        ]
      },
      "dependsOn": [
        "[variables('sqlServerPrimaryName')]",
        "[resourceId('Microsoft.Sql/servers/databases', variables('sqlServerPrimaryName'), variables('sqlDatabaseName'))]",
        "[resourceId('Microsoft.Sql/servers', variables('sqlServerSecondaryName'))]"
      ]
    },
    {
      "condition": "[equals(parameters('redundancyId'), 'pri')]",
      "name": "[variables('sqlDatabaseName')]",
      "type": "databases",
      "apiVersion": "2014-04-01-preview",
      "location": "[parameters('location')]",
      "dependsOn": [
        "[variables('sqlServerPrimaryName')]"
      ],
      "properties": {
        "edition": "[variables('sqlDatabaseEdition')]",
        "requestedServiceObjectiveName": "[variables('sqlDatabaseServiceObjective')]"
      }
    }
  ]
},
{
  "condition": "[equals(parameters('redundancyId'), 'pri')]",
  "type": "Microsoft.Sql/servers",
  "kind": "v12.0",
  "name": "[variables('sqlServerSecondaryName')]",
  "apiVersion": "2014-04-01-preview",
  "location": "[variables('sqlServerSecondaryRegion')]",
  "properties": {
    "administratorLogin": "[parameters('sqlServerSecondaryAdminUsername')]",
    "administratorLoginPassword": "[parameters('sqlServerSecondaryAdminPassword')]",
    "version": "12.0"
  }
}

I specify the readWriteEndpoint like this:

    "readWriteEndpoint": {
      "failoverPolicy": "Automatic",
      "failoverWithDataLossGracePeriodMinutes": 60
    }

Here failoverWithDataLossGracePeriodMinutes is set to 60 minutes.

What does this mean? I cannot find a clear answer anywhere. Does it mean that:

  1. When an outage is happening in my primary region where my primary database resides, the read/write endpoint points to the primary and only after 60 minutes it fails over to my secondary, which becomes the new primary. In those 60 minutes, is the only way to read my data to use the readOnlyEndpoint directly? OR
  2. My read/write endpoint is switched instantly, if they can somehow detect that there is no data left to be synced?

I think it boils down to: if I detect an outage, and I don't care about data loss but want to be able to write to my database, do I have to perform the failover manually?

Bonus question: is the reason why the grace period is present because there can be unsynced data on the primary, that will be overwritten, or tossed away, if the secondary becomes the new primary (if I switch manually)?

Sorry, I can't keep it to only one question. I have read a lot and I really need to know this.

What does this mean?

It means that:

"when an outage is happening in my primary region where my primary database resides, the read/write endpoint points to the primary and only after 60 minutes it fails over to my secondary, which becomes the new primary."

It can't fail over automatically even when the data is synced, because the high-availability solution in the primary region is trying to do the same thing, and almost all of the time your primary database will come back quickly in the primary region. Performing an automatic cross-region failover would interfere with this.

And

"the reason why the grace period is present is because there can be unsynced data on the primary, that will be overwritten, or tossed away, if the secondary becomes the new primary"

And to allow time for the database to fail over within the primary region.
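
For readers who would rather not wait out the grace period: the documentation quoted in the question notes that automatic failover can be turned off so that the application controls the workflow. Below is a minimal sketch of how the readWriteEndpoint from the question's template might look in that case. It assumes the "Manual" value of failoverPolicy, which does not appear in the original post, and it leaves out failoverWithDataLossGracePeriodMinutes because that property applies to the automatic policy.

```json
"readWriteEndpoint": {
  "failoverPolicy": "Manual"
}
```

With this setting nothing fails over on its own, so the decision to promote the secondary, and to accept any data loss that comes with it, rests with you or your application.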
