
involuntary disruptions / SIGKILL handling in microservice following saga pattern

Should I engineer my microservice to handle involuntary disruptions like hardware failure? Are these disruptions frequent enough that they need to be handled in a service running on an AWS managed EKS cluster?
Should I consider some design change in the service to handle an unexpected SIGKILL, for example by persisting the data at each step, or would that be considered over-engineering?

What standard way would you suggest for handling these involuntary disruptions if the service is:
a) a RESTful service that typically responds in 1 s (and follows the saga pattern), or
b) a service that processes a large 1 GB file over the course of 1 hour?
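
To make the "persisting the data at each step" idea concrete for case (b), this is roughly what I have in mind: write a progress checkpoint to durable storage after every chunk, so a replacement Pod can resume instead of starting over. The paths and helper names (checkpointPath, processChunk) are placeholders, not existing code:

```go
// Hypothetical sketch: resumable processing of a large file.
// After each chunk, the current offset is persisted so a restarted
// Pod can resume instead of reprocessing the whole file.
package main

import (
	"io"
	"log"
	"os"
	"strconv"
)

const checkpointPath = "/data/progress.offset" // must live on durable storage (e.g. a PVC)

func loadOffset() int64 {
	b, err := os.ReadFile(checkpointPath)
	if err != nil {
		return 0 // no checkpoint yet: start from the beginning
	}
	off, _ := strconv.ParseInt(string(b), 10, 64)
	return off
}

func saveOffset(off int64) error {
	return os.WriteFile(checkpointPath, []byte(strconv.FormatInt(off, 10)), 0o644)
}

// processChunk stands in for the real per-chunk work; it must be idempotent,
// because a SIGKILL can land after the work but before the checkpoint is written.
func processChunk(chunk []byte) error { return nil }

func main() {
	f, err := os.Open("/data/input.bin")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Resume from the last checkpoint, if any.
	offset := loadOffset()
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		log.Fatal(err)
	}

	buf := make([]byte, 4<<20) // 4 MiB chunks
	for {
		n, err := f.Read(buf)
		if n > 0 {
			if perr := processChunk(buf[:n]); perr != nil {
				log.Fatal(perr)
			}
			offset += int64(n)
			if cerr := saveOffset(offset); cerr != nil {
				log.Fatal(cerr)
			}
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
	}
	log.Printf("done, processed %d bytes", offset)
}
```

For case (a) the equivalent would be persisting each saga step's outcome (with an idempotency key) before acknowledging it, so a retried request does not repeat completed steps.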

There are a couple of ways to handle those disruptions. As mentioned here:

Here are some ways to mitigate involuntary disruptions:

  • Ensure your pod requests the resources it needs.
  • Replicate your application if you need higher availability. (Learn about running replicated stateless and stateful applications.)
  • For even higher availability when running replicated applications, spread applications across racks (using anti-affinity) or across zones (if using a multi-zone cluster.)

The frequency of voluntary disruptions varies.

So:

  • if your budget allows it, spread your app across zones or racks; you can use Node affinity to schedule Pods on certain nodes,
  • make sure to configure replicas; this ensures that when one Pod receives SIGKILL, the load is automatically directed to another Pod. You can read more about this here.
  • consider using DaemonSets, which ensure each Node runs a copy of a Pod.
  • use Deployments for stateless apps and StatefulSets for stateful apps.
  • the last thing you can do is to write your app to be disruption tolerant (see the sketch after this list).
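
As a rough illustration of that last point: Kubernetes normally sends SIGTERM first and only escalates to SIGKILL after the termination grace period (30 seconds by default), so a disruption-tolerant service should at least stop accepting new requests and drain in-flight work when SIGTERM arrives. A minimal Go sketch, where the port and drain timeout are assumptions:

```go
// Minimal sketch of a disruption-tolerant HTTP service:
// on SIGTERM (sent by Kubernetes before SIGKILL) it stops accepting
// new requests and drains in-flight ones before exiting.
package main

import (
	"context"
	"log"
	"net/http"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/work", func(w http.ResponseWriter, r *http.Request) {
		// the "typically 1 s" saga step would run here
		w.Write([]byte("ok"))
	})

	srv := &http.Server{Addr: ":8080", Handler: mux}

	// ctx is cancelled when SIGTERM or SIGINT arrives.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatal(err)
		}
	}()

	<-ctx.Done() // termination requested by Kubernetes

	// Drain within the grace period; SIGKILL cannot be caught, so finish before it.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		log.Printf("forced shutdown: %v", err)
	}
}
```

SIGKILL itself cannot be caught, so anything that must survive it (saga state, file-processing progress) still has to be persisted outside the Pod, as discussed in the question.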

I hope this clears things up a little bit for you; feel free to ask more questions.
