
Accumulo cluster deployment in Kubernetes

I am trying to use the container from https://github.com/cybermaggedon/accumulo-docker to create a 3-node deployment in Google Kubernetes Engine. My main problem is how to make the nodes aware of each other. For example, the accumulo/conf/slaves config file contains a list of all the nodes (either names or IPs, one per line) and needs to be replicated across all the nodes. Also, a single Accumulo node is designated as the master, and all slave nodes point to it by making it the only name/IP in the conf/masters file.

The documentation for the Accumulo Docker container configures each container in this manner through environment variables, which the container startup script then uses to rewrite that container's configuration files, e.g.:

  docker run -d --ip=10.10.10.11 --net my_network \
      -e ZOOKEEPERS=10.10.5.10,10.10.5.11,10.10.5.12 \
      -e HDFS_VOLUMES=hdfs://hadoop01:9000/accumulo \
      -e NAMENODE_URI=hdfs://hadoop01:9000/ \
      -e MY_HOSTNAME=10.10.10.11 \
      -e GC_HOSTS=10.10.10.10 \
      -e MASTER_HOSTS=10.10.10.10 \
      -e SLAVE_HOSTS=10.10.10.10,10.10.10.11,10.10.10.12 \
      -e MONITOR_HOSTS=10.10.10.10 \
      -e TRACER_HOSTS=10.10.10.10 \
      --link hadoop01:hadoop01 \
      --name acc02 cybermaggedon/accumulo:1.8.1h

This starts one of the slave nodes: it includes itself in SLAVE_HOSTS and points to the master in MASTER_HOSTS.

If I implement my scaling as a StatefulSet under Kubernetes, how can I achieve a similar result? I can modify the container as needed; I have no problem creating my own version.

Disclaimer: Just because it runs on Docker doesn't necessarily mean that it can run on Kubernetes. Accumulo is part of the Hadoop/HDFS ecosystem, and many of its components are not necessarily production ready. See my other answers: 1, 2.

Kubernetes assigns pod IP addresses from a pod CIDR, and those addresses are only visible within the cluster. A pod's IP address is also not fixed: it can change when the pod moves from one node to another or when pods are stopped and started. Services and pods in a cluster are generally discovered through DNS. So for the master and slave options, for example, you will probably have to specify Kubernetes DNS names (given that you are using a StatefulSet, which names its pods with ordinal numbers):

MASTER_HOSTS=accumulo-0.accumulo.default.svc.cluster.local
SLAVE_HOSTS=accumulo-0.accumulo.default.svc.cluster.local,accumulo-1.accumulo.default.svc.cluster.local,accumulo-2.accumulo.default.svc.cluster.local
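
As a rough sketch of how a container startup script could build these lists instead of hard-coding them: the helper below (statefulset_hosts is a hypothetical name, not part of the accumulo-docker image) assumes the StatefulSet is named accumulo, is governed by a headless Service also named accumulo in the default namespace, and that the cluster uses the default cluster.local DNS domain. Adjust all four values to match your own manifests.

```shell
#!/bin/sh
# Build a comma-separated list of per-pod DNS names for a StatefulSet.
# Arguments: StatefulSet name, headless Service name, namespace, replica count.
# Assumption: the cluster DNS domain is the default "cluster.local".
statefulset_hosts() {
    set_name=$1
    service=$2
    namespace=$3
    replicas=$4
    hosts=""
    i=0
    while [ "$i" -lt "$replicas" ]; do
        host="${set_name}-${i}.${service}.${namespace}.svc.cluster.local"
        # Prepend a comma only when the list already has an entry.
        hosts="${hosts:+${hosts},}${host}"
        i=$((i + 1))
    done
    printf '%s\n' "$hosts"
}

# Hypothetical layout: pod 0 acts as the master, all three pods run tablet servers.
MASTER_HOSTS=$(statefulset_hosts accumulo accumulo default 1)
SLAVE_HOSTS=$(statefulset_hosts accumulo accumulo default 3)
export MASTER_HOSTS SLAVE_HOSTS
```

Because the names depend only on the ordinal, every pod can compute the same lists at startup, and they stay valid when a pod is rescheduled and receives a new IP address. Per-pod DNS names like these only resolve when the StatefulSet's serviceName points at a headless Service (clusterIP: None).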

Since Accumulo is a distributed K/V store, you can take cues from how Cassandra can be deployed on a Kubernetes cluster. Hope it helps!
