Swarms
Understanding Swarm clusters
A swarm is a group of machines that are running Docker and joined into a cluster. After that has happened, you continue to run the Docker commands you're used to, but now they are executed on a cluster by a swarm manager. The machines in a swarm can be physical or virtual. After joining a swarm, they are referred to as nodes.
In short, a swarm is Docker's cluster mode. Once swarm mode is enabled, commands can only be executed on a swarm manager. The machines that join the cluster can be physical or virtual; after joining, they are referred to as nodes.
Swarm managers can use several strategies to run containers, such as "emptiest node" – which fills the least utilized machines with containers – or "global", which ensures that each machine gets exactly one instance of the specified container.
You instruct the swarm manager to use these strategies in the Compose file, just like the one you have already been using.
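In a version 3 Compose file these strategies surface in the service's deploy: section. Here is a minimal sketch (a hypothetical file, not the one from part 3) showing the two modes:
# write a minimal compose file that spells out the scheduling mode
cat > docker-compose.sketch.yml <<'EOF'
version: "3"
services:
  web:
    image: octowhale/friendlyhello:latest
    deploy:
      replicas: 5        # replicated mode (the default): spread tasks across the least-loaded nodes
      # mode: global     # alternative: exactly one task on every node (drop the replicas line)
EOF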
Swarm managers are the only machines in a swarm that can execute your commands, or authorize other machines to join the swarm as workers. Workers are just there to provide capacity and do not have the authority to tell any other machine what it can and cannot do.
Up until now, you have been using Docker in a single-host mode on your local machine. But Docker also can be switched into swarm mode, and that's what enables the use of swarms. Enabling swarm mode instantly makes the current machine a swarm manager. From then on, Docker will run the commands you execute on the swarm you're managing, rather than just on the current machine.
Set up your swarm
A swarm is made up of multiple nodes, which can be either physical or virtual machines. The basic concept is simple enough: run docker swarm init to enable swarm mode and make your current machine a swarm manager, then run docker swarm join on other machines to have them join the swarm as workers.
docker swarm init: create a swarm, with the current machine becoming the swarm manager (the boss).
docker swarm join: join an existing swarm as a worker (the hired hand).
Create a cluster
The original tutorial uses docker-machine to create two virtual machines. The Ubuntu1604Server host running docker-ce here is already a virtual machine, so VirtualBox cannot be installed inside it. Instead, two Ubuntu1604Server machines were brought up, each with docker-ce installed:
| Node | Role | IP address | OS version | Docker version |
|---|---|---|---|---|
| S12 | Manager | 192.168.56.212 | Ubuntu 1604.03 | Docker-ce 17.06 |
| S13 | Worker | 192.168.56.213 | Ubuntu 1604.03 | Docker-ce 17.06 |
Initialize the swarm
Run docker swarm init on S12. Once the command completes, S12 becomes the swarm manager of this cluster, and all commands that control the cluster are issued from S12.
Because S12 has multiple IPs, --advertise-addr 192.168.56.212 must be passed when initializing the swarm to specify which IP the swarm binds to.
[user@S012 04.swarm_sample]$ docker swarm init --advertise-addr 192.168.56.212
Swarm initialized: current node (z2yzvrbh0mv2w2yzhoc9ryzmb) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-4750ptov1hfil71va7ofrwvkihvwupn4gs8akpz4hqis13y7u5-59w992ywlsrxme1glcr6axc9a 192.168.56.212:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
Once initialization finishes, the output confirms that the current host is now a manager (the boss), and prints a docker swarm join command showing other hosts how to join the cluster.
To add a worker to this swarm, run the following command:
docker swarm join \
--token <token> \
<ip>:<port>
Run the docker swarm join command on S13 to join the swarm that was just created.
user@S013:~$ docker swarm join --token SWMTKN-1-4750ptov1hfil71va7ofrwvkihvwupn4gs8akpz4hqis13y7u5-59w992ywlsrxme1glcr6axc9a 192.168.56.212:2377
This node joined a swarm as a worker.
# This node has successfully joined the swarm as a worker (the hired hand).
Note: when running docker swarm join, if the node also has multiple IPs, the --advertise-addr ipaddr option must be used as well. Otherwise Docker picks a network interface based on routing or other criteria, which may not be the one you expect.
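A sketch of such a join from S13, pinning the advertised address (the token and manager address are the ones printed by docker swarm init above):
# join the swarm and advertise a specific local IP
docker swarm join \
  --advertise-addr 192.168.56.213 \
  --token <token> \
  192.168.56.212:2377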
Now go back to S12 and use docker node ls to check the state of the nodes.
$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
re0g5dw09cwd7h4ciab68uaw1 S013 Ready Active
z2yzvrbh0mv2w2yzhoc9ryzmb * S012 Ready Active Leader
Both S12 and S13 are now nodes of the cluster, and in the MANAGER STATUS column S12 is marked as Leader.
Deploy your app on a cluster
Staying on S12, deploy the application to the cluster from the swarm manager.
Take the docker-compose.yml from the 03.service section and deploy the application with docker stack deploy.
[user@S012 03.service_sample]$ docker stack deploy -c docker-compose.yml getstartedlab
Creating network getstartedlab_webnet
Creating service getstartedlab_web
Note: because the octowhale/friendlyhello:tag image did not exist on S13, every container started on the S13 node failed with "No such image: ...", and all 5 containers ended up running on S12. (The real cause was a typo in the image name: it should have been latest but was written as tag.)
[user@S012 03.service_sample]$ docker stack ps getstartedlab
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
tqxgpkg81j3d getstartedlab_web.1 octowhale/friendlyhello:tag S013 Ready Preparing 1 second ago
lylttxrahjsm \_ getstartedlab_web.1 octowhale/friendlyhello:tag S012 Shutdown Rejected 1 second ago "No such image: octowhale/friend…"
... (output truncated) ...
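The later stack ps output shows the image as octowhale/friendlyhello:latest, so the compose file was presumably corrected at some point; a hypothetical sketch of that one-line fix:
# in docker-compose.yml, point the service at the tag that actually exists
#     image: octowhale/friendlyhello:latest   # was octowhale/friendlyhello:tag
sed -i 's|octowhale/friendlyhello:tag|octowhale/friendlyhello:latest|' docker-compose.yml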
First, on S12, run docker stack rm to take the service down.
[user@S012 03.service_sample]$ docker stack rm getstartedlab
Removing service getstartedlab_web
Removing network getstartedlab_webnet
Then switch to S13 and use docker pull to pull the octowhale/friendlyhello:latest image.
$ docker pull octowhale/friendlyhello:latest
latest: Pulling from octowhale/friendlyhello
ad74af05f5a2: Pull complete
a36a1c51ab4d: Pull complete
be169522399f: Pull complete
286703095347: Pull complete
1c1fda0fa4c6: Pull complete
0951c86a6675: Pull complete
0302b40f6cd6: Pull complete
Finally, go back to S12 and deploy the application again.
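This is the same command that was used for the first deployment:
# redeploy from S12
docker stack deploy -c docker-compose.yml getstartedlab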
Check the result with docker stack ps getstartedlab.
[user@S012 03.service_sample]$ docker stack ps getstartedlab
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
tcw0rpmg3jmh getstartedlab_web.1 octowhale/friendlyhello:latest S013 Running Running 8 seconds ago
v2ww0v172uch getstartedlab_web.2 octowhale/friendlyhello:latest S013 Running Running 8 seconds ago
yhlyjw4quyol getstartedlab_web.3 octowhale/friendlyhello:latest S012 Running Running 9 seconds ago
5yyyq98y7ad4 getstartedlab_web.4 octowhale/friendlyhello:latest S013 Running Running 8 seconds ago
qpbonq3g97sv getstartedlab_web.5 octowhale/friendlyhello:latest S012 Running Running 9 seconds ago
The containers are now running on both nodes.
Note: on the second deployment S13 still failed at first, with "starting container failed: Ad…". The likely cause was the port mapping published by docker-compose.yml, "80:80": running as an ordinary (non-root) user, port 80 could not be opened. After changing the published port to 8080, redeploying worked normally.
# Error message
j9hnjmfghnmb \_ getstartedlab_web.2 octowhale/friendlyhello:latest S013 Shutdown Failed 31 seconds ago "starting container failed: Ad…"
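A hypothetical sketch of that port change, assuming the container itself still listens on port 80 and only the published host-side port moves to 8080:
# in docker-compose.yml, publish 8080 on the host instead of 80
#     ports:
#       - "8080:80"
sed -i 's/"80:80"/"8080:80"/' docker-compose.yml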
Accessing your cluster
You can access your app from the IP address of either S12 or S13. The network you created is shared between them and load-balanced.
The reason both IP addresses work is that nodes in a swarm participate in an ingress routing mesh. This ensures that a service deployed at a certain port within your swarm always has that port reserved to itself, no matter what node is actually running the container. (The original guide includes a diagram of how a routing mesh for a service called my-web, published at port 8080 on a three-node swarm, would look.)
Having connectivity trouble?
Keep in mind that in order to use the ingress network in the swarm, you need to have the following ports open between the swarm nodes before you enable swarm mode:
Port 7946 TCP/UDP for container network discovery.
Port 4789 UDP for the container ingress network.
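For example, a sketch for Ubuntu hosts using ufw (assuming ufw is the active firewall; 2377/tcp is the cluster-management port already visible in the join address above):
# allow swarm traffic between the nodes
sudo ufw allow 2377/tcp   # cluster management (the port in the join address)
sudo ufw allow 7946/tcp   # container network discovery
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp   # ingress overlay network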
Iterating and scaling your app
From here you can do everything you learned about in part 3.
Scale the app by changing the docker-compose.yml file.
Change the app behavior by editing code.
In either case, simply run docker stack deploy again to deploy these changes, as sketched below.
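For example, a sketch of scaling, assuming the replicas: 5 value that produced the five containers above:
# bump the replica count in docker-compose.yml (e.g. 5 -> 7) and redeploy
sed -i 's/replicas: 5/replicas: 7/' docker-compose.yml
docker stack deploy -c docker-compose.yml getstartedlab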
You can join any machine, physical or virtual, to this swarm, using the same docker swarm join command you used on S13, and capacity will be added to your cluster. Just run docker stack deploy afterwards, and your app will take advantage of the new resources.
Cleanup
You can tear down the stack with docker stack rm. For example:
# run stack rm on S12 to remove the service
docker stack rm getstartedlab
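The stack rm above removes the service but leaves the swarm itself in place. If you also want to dismantle the swarm, a sketch:
# on S13: the worker leaves the swarm
docker swarm leave
# on S12: the (last) manager leaves, dissolving the swarm
docker swarm leave --force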