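Docker is currently the most widely used container runtime engine. When running Kubernetes, we rely on Docker to start and stop the underlying containers. After a fresh Docker installation, a user found that containers started with docker run on the default bridge network could not connect to external networks. This article walks through how we investigated that behavior.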

Narrowing down the problem

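Start a container on the bridge network with docker run -it --rm alpine:3.6 /bin/sh and ping an external IP address from inside it. The ping gets no reply and the command hangs. Exit that container and start another one on the host network with docker run -it --rm --network=host alpine:3.6 /bin/sh. This time the external address is reachable, which narrows the problem down to Docker's default bridge network.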

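When a container on the bridge network pings an external address, any packet whose outgoing interface is not docker0 (the bridge Docker creates at startup) is source-NATed (SNAT) and then leaves through the host's network interface. We therefore suspected the SNAT step and looked at the Docker-related iptables rules. The command output is as follows: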

# iptables -t nat -nvL POSTROUTING
Chain POSTROUTING (policy ACCEPT 845 packets, 57086 bytes)
 pkts bytes target     prot opt in     out     source               destination
 1414 96107 POSTROUTING_direct  all  --  *      *       0.0.0.0/0            0.0.0.0/0
 1414 96107 POSTROUTING_ZONES_SOURCE  all  --  *      *       0.0.0.0/0            0.0.0.0/0
 1414 96107 POSTROUTING_ZONES  all  --  *      *       0.0.0.0/0            0.0.0.0/0

The output shows that the iptables rule -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE is missing. We suspected that the user had made a mistake when configuring Docker.
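A check like this can also be scripted. The following is a minimal sketch, assuming the column layout of the `iptables -t nat -nvL POSTROUTING` listing shown above (pkts, bytes, target, prot, opt, in, out, source, destination). The function name `has_bridge_masquerade` and the sample "healthy" row are illustrative assumptions, not output captured from the affected host.

```python
def has_bridge_masquerade(nvl_output, subnet="172.17.0.0/16", bridge="docker0"):
    """Return True if the listing contains a MASQUERADE rule for the bridge subnet.

    Expects the text printed by `iptables -t nat -nvL POSTROUTING`.
    """
    for line in nvl_output.splitlines():
        fields = line.split()
        # Skip blank lines, the "Chain ..." line, and the column header.
        if len(fields) < 9 or fields[0] in ("Chain", "pkts"):
            continue
        target, out_iface, source = fields[2], fields[6], fields[7]
        # "! -o docker0" is printed as "!docker0" in the out column.
        if target == "MASQUERADE" and source == subnet and out_iface == "!" + bridge:
            return True
    return False


# Listing from the affected host (copied from above): no MASQUERADE row.
broken = """\
Chain POSTROUTING (policy ACCEPT 845 packets, 57086 bytes)
 pkts bytes target     prot opt in     out     source               destination
 1414 96107 POSTROUTING_direct  all  --  *      *       0.0.0.0/0            0.0.0.0/0
 1414 96107 POSTROUTING_ZONES_SOURCE  all  --  *      *       0.0.0.0/0            0.0.0.0/0
 1414 96107 POSTROUTING_ZONES  all  --  *      *       0.0.0.0/0            0.0.0.0/0
"""

# Hypothetical row for a host where Docker added the rule (counters are made up).
healthy = broken + "    0     0 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0\n"

print(has_bridge_masquerade(broken))
print(has_bridge_masquerade(healthy))
```

Matching on the target, source, and out columns rather than on the whole line keeps the check independent of the packet and byte counters, which change constantly on a live host.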

Investigation

The previous section showed that the iptables SNAT rule is the problem, so the next step is to see how Docker generates that rule.

Checking the configuration

We start with the official documentation, which has this passage on the SNAT-related configuration:

--iptables=false prevents the Docker daemon from adding iptables rules. If multiple daemons manage iptables rules, they may overwrite rules set by another daemon. Be aware that disabling this option requires you to manually add iptables rules to expose container ports. If you prevent Docker from adding iptables rules, Docker will also not add IP masquerading rules, even if you set --ip-masq to true. Without IP masquerading rules, Docker containers will not be able to connect to external hosts or the internet when using network other than default bridge.

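So for the missing rule to exist, dockerd must start with both --iptables=true and --ip-masq=true. Otherwise Docker does not generate the rule when it creates the default bridge, and containers cannot reach external networks. Both options default to true, however. We asked the user to log in to the affected machine and check these two settings in /etc/docker/daemon.json, and ip-masq had indeed been set to false (--ip-masq=false). That is a setting we enable in our Kubernetes environment. Unfamiliar with Docker's configuration, the user had copied the configuration from the Kubernetes environment during the new installation, which overrode Docker's defaults and left containers unable to reach external networks.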
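The same reasoning can be expressed as a small check against the contents of daemon.json. This is only a sketch: it assumes the two flags appear in daemon.json under the keys `iptables` and `ip-masq`, that a missing key means the default (true), and the function name `masquerade_expected` is our own. The sample JSON strings are hypothetical, since the real file from the affected host is not reproduced in this article.

```python
import json


def masquerade_expected(daemon_json_text):
    """Decide whether dockerd should add the bridge MASQUERADE rule.

    Keys absent from daemon.json fall back to Docker's defaults (both true).
    Returns (expected, reason).
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    if not config.get("iptables", True):
        return False, "iptables is false, so Docker adds no iptables rules at all"
    if not config.get("ip-masq", True):
        return False, "ip-masq is false, so Docker skips the masquerade rule"
    return True, "both options are true (set explicitly or by default)"


# Hypothetical daemon.json contents for illustration.
print(masquerade_expected('{"ip-masq": false}'))
print(masquerade_expected('{}'))
```

The iptables key is checked first because, as the documentation quoted above says, disabling it suppresses the masquerade rule even when ip-masq is true.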

The root cause is now clear. Next, let's look at how dockerd uses the --ip-masq parameter to generate the iptables rule -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE.

Checking the logs

We enabled dockerd's debug mode and restarted the dockerd service with --ip-masq=false. The log looks like this:

dockerd startup log with ip-masq disabled

Nov 18 10:36:25 localhost systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Unit docker.service has begun starting up.
Nov 18 10:36:25 localhost dockerd[12450]: time="2019-11-18T10:36:25.645346653+08:00" level=debug msg="Listener created for HTTP on unix (/var/run/docker.sock)"
Nov 18 10:36:25 localhost dockerd[12450]: time="2019-11-18T10:36:25.703480025+08:00" level=warning msg="libcontainerd: could not upgrade state files during live restore for container b4d6a03f5d78537aed03e54307a3074f8c7c9fe4b8b60618c4c61abfe897ea06: json: cannot unmarshal number into Go struct field State.init_process_start of type string"
Nov 18 10:36:25 localhost dockerd[12450]: time="2019-11-18T10:36:25.824283436+08:00" level=warning msg="libcontainerd: could not upgrade state files during live restore for container 2af4603ecccde03ace1e0c652f6cda75adba1502b729db50bb15de92f75c7021: json: cannot unmarshal number into Go struct field State.init_process_start of type string"
Nov 18 10:36:25 localhost dockerd[12450]: time="2019-11-18T10:36:25.852366488+08:00" level=info msg="libcontainerd: new containerd process, pid: 12468"
Nov 18 10:36:25 localhost dockerd[12450]: time="2019-11-18T10:36:25.884451175+08:00" level=debug msg="containerd: grpc api on /var/run/docker/libcontainerd/docker-containerd.sock"
Nov 18 10:36:25 localhost dockerd[12450]: time="2019-11-18T10:36:25.891178698+08:00" level=debug msg="containerd: read past events" count=624
Nov 18 10:36:25 localhost dockerd[12450]: time="2019-11-18T10:36:25.895318127+08:00" level=debug msg="containerd: container restored" id=00ea5e25d0e3f815e37d54dac5286b1c4c85c97457a9a159a2819e603d38ba38
Nov 18 10:36:25 localhost dockerd[12450]: time="2019-11-18T10:36:25.896635095+08:00" level=debug msg="containerd: container restored" id=01f7de8633244f2ed7094a706141e714a1e9a745d317a3389e7d73a6498bfd3e
Nov 18 10:36:26 localhost dockerd[12450]: time="2019-11-18T10:36:26.311542843+08:00" level=debug msg="containerd: supervisor running" cpus=10 memory=15884 runtime=docker-runc runtimeArgs=[] stateDir="/var/run/docker/libcontainerd/containerd" id=b4d6a03f5d78537aed03e54307a3074f8c7c9fe4b8b60618c4c61abfe897ea06 pid=init status=1 systemPid=9915
Nov 18 10:36:26 localhost dockerd[12450]: time="2019-11-18T10:36:26.75219215+08:00" level=debug msg="truncating event log"
Nov 18 10:36:26 localhost dockerd[12450]: time="2019-11-18T10:36:26.885192866+08:00" level=debug msg="Using default logging driver json-file"
Nov 18 10:36:26 localhost dockerd[12450]: time="2019-11-18T10:36:26.885394029+08:00" level=debug msg="Golang's threads limit set to 114120"
Nov 18 10:36:27 localhost dockerd[12450]: time="2019-11-18T10:36:27.037816649+08:00" level=info msg="[graphdriver] using prior storage driver: overlay"
Nov 18 10:36:27 localhost dockerd[12450]: time="2019-11-18T10:36:27.037886696+08:00" level=debug msg="Using graph driver overlay"
Nov 18 10:36:47 localhost dockerd[12450]: time="2019-11-18T10:36:47.231452175+08:00" level=debug msg="Max Concurrent Downloads: 3"
Nov 18 10:36:47 localhost dockerd[12450]: time="2019-11-18T10:36:47.231546307+08:00" level=debug msg="Max Concurrent Uploads: 5"
Nov 18 10:36:47 localhost dockerd[12450]: time="2019-11-18T10:36:47.765935586+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Nov 18 10:36:47 localhost dockerd[12450]: time="2019-11-18T10:36:47.767705404+08:00" level=info msg="Loading containers: start."
Nov 18 10:36:48 localhost dockerd[12450]: time="2019-11-18T10:36:48.127053639+08:00" level=debug msg="Loaded container 00ea5e25d0e3f815e37d54dac5286b1c4c85c97457a9a159a2819e603d38ba38"
...
Nov 18 10:37:02 localhost dockerd[12450]: time="2019-11-18T10:37:02.537544686+08:00" level=debug msg="libcontainerd: restore container d3675c3093e6d9798a1f0d4ac1adf701e14ae5c576838b1922cc287b23643369 state running"
Nov 18 10:37:24 localhost dockerd[12450]: time="2019-11-18T10:37:24.872529481+08:00" level=debug msg="Option Experimental: false"
Nov 18 10:37:24 localhost dockerd[12450]: time="2019-11-18T10:37:24.872623250+08:00" level=debug msg="Option DefaultDriver: bridge"
Nov 18 10:37:24 localhost dockerd[12450]: time="2019-11-18T10:37:24.872645419+08:00" level=debug msg="Option DefaultNetwork: bridge"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.137877612+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.154943092+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.170953079+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.185271347+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D PREROUTING]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.198304019+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D OUTPUT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.212138653+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -F DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.226024377+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -X DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.240471905+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -F DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.260903009+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -X DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.277556321+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -F DOCKER-ISOLATION]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.296239533+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -X DOCKER-ISOLATION]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.309454170+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -n -L DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.319968029+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -N DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.330245091+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -n -L DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.342196683+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -n -L DOCKER-ISOLATION]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.356755687+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C DOCKER-ISOLATION -j RETURN]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.371465691+08:00" level=debug msg="/usr/sbin/iptables, [--wait -A DOCKER-ISOLATION -j RETURN]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.475654176+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -i docker0 -o docker0 -j DROP]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.492682638+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -i docker0 -o docker0 -j ACCEPT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.506983374+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -i docker0 ! -o docker0 -j ACCEPT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.522621189+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.535000682+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.548194608+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C OUTPUT -m addrtype --dst-type LOCAL -j DOCKER ! --dst 127.0.0.0/8]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.559555080+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -A OUTPUT -m addrtype --dst-type LOCAL -j DOCKER ! --dst 127.0.0.0/8]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.573701209+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -j DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.587198568+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -j DOCKER]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.599839612+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.612769657+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.628283870+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -j DOCKER-ISOLATION]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.643898371+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -j DOCKER-ISOLATION]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.661793237+08:00" level=debug msg="/usr/sbin/iptables, [--wait -I FORWARD -j DOCKER-ISOLATION]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.682111845+08:00" level=debug msg="Network (aa8157c) restored"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.685520213+08:00" level=debug msg="Endpoint (2a75234) restored to network (aa8157c)"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.686140799+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C DOCKER -p tcp -d 0/0 --dport 8031 -j DNAT --to-destination 172.17.0.5:8031 ! -i docker0]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.728470794+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -A DOCKER -p tcp -d 0/0 --dport 8031 -j DNAT --to-destination 172.17.0.5:8031 ! -i docker0]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.743250375+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.5 --dport 8031 -j ACCEPT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.761040207+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -A DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.5 --dport 8031 -j ACCEPT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.781458609+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C POSTROUTING -p tcp -s 172.17.0.5 -d 172.17.0.5 --dport 8031 -j MASQUERADE]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.817543904+08:00" level=debug msg="Endpoint (9c4b929) restored to network (aa8157c)"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.817645126+08:00" level=debug msg="Endpoint (ca32793) restored to network (aa8157c)"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.817965521+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C DOCKER -p tcp -d 0/0 --dport 8200 -j DNAT --to-destination 172.17.0.6:8500 ! -i docker0]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.835336546+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -A DOCKER -p tcp -d 0/0 --dport 8200 -j DNAT --to-destination 172.17.0.6:8500 ! -i docker0]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.852558621+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.6 --dport 8500 -j ACCEPT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.870653385+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -A DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.6 --dport 8500 -j ACCEPT]"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.919703349+08:00" level=debug msg="Endpoint (e532036) restored to network (aa8157c)"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.919785071+08:00" level=debug msg="Endpoint (08d870c) restored to network (aa8157c)"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.970019076+08:00" level=debug msg="Allocating IPv4 pools for network bridge (aa8157c4bbf581e542fcfb6d0c1291b17b0d8b70d0ce3fd92830a9dfdf69877b)"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.970079737+08:00" level=debug msg="RequestPool(LocalDefault, 172.17.0.0/16, , map[], false)"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.970245816+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.1, map[RequestAddressType:com.docker.network.gateway])"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974256136+08:00" level=debug msg="Assigning addresses for endpoint elastic_hypatia's interface on network bridge"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974354929+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.2, map[])"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974428765+08:00" level=debug msg="Assigning addresses for endpoint heuristic_lovelace's interface on network bridge"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974458297+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.3, map[])"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974498116+08:00" level=debug msg="Assigning addresses for endpoint zen_montalcini's interface on network bridge"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974529968+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.9, map[])"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974567495+08:00" level=debug msg="Assigning addresses for endpoint sharp_yalow's interface on network bridge"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974593215+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.5, map[])"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974634826+08:00" level=debug msg="Assigning addresses for endpoint agitated_ardinghelli's interface on network bridge"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974660075+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.8, map[])"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974690759+08:00" level=debug msg="Assigning addresses for endpoint consul_tttt's interface on network bridge"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974773988+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.7, map[])"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974803107+08:00" level=debug msg="Assigning addresses for endpoint happy_gates's interface on network bridge"
Nov 18 10:37:25 localhost dockerd[12450]: time="2019-11-18T10:37:25.974829205+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.4, map[])"
Nov 18 10:37:33 localhost dockerd[12450]: time="2019-11-18T10:37:33.275111037+08:00" level=info msg="There are old running containers, the network config will not take affect"
Nov 18 10:37:43 localhost dockerd[12450]: time="2019-11-18T10:37:43.537678207+08:00" level=info msg="Loading containers: done."
Nov 18 10:37:43 localhost dockerd[12450]: time="2019-11-18T10:37:43.877029545+08:00" level=info msg="Daemon has completed initialization"
Nov 18 10:37:43 localhost dockerd[12450]: time="2019-11-18T10:37:43.877112287+08:00" level=info msg="Docker daemon" commit=cec0b72 graphdriver=overlay version=17.06.2-ce
Nov 18 10:37:43 localhost dockerd[12450]: time="2019-11-18T10:37:43.877314877+08:00" level=debug msg="Registering routers"
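Reading a log of this size by eye is error-prone, so a short script can pull out the iptables calls instead. The sketch below is illustrative only: the function names are our own, and it assumes dockerd records each call in the `iptables, [args]` form seen in the log above. It also filters on the source address, because the log does contain a MASQUERADE entry, but that one is a per-port rule whose source is a single container IP (172.17.0.5), not the bridge subnet.

```python
import re

# dockerd's debug log records each call as: msg="/usr/sbin/iptables, [<args>]"
IPTABLES_CALL = re.compile(r"iptables, \[([^\]]*)\]")


def iptables_calls(log_text):
    """Return the argument list of every iptables call found in the log text."""
    return [match.group(1).split() for match in IPTABLES_CALL.finditer(log_text)]


def subnet_masquerade_calls(log_text, subnet="172.17.0.0/16"):
    """Return the POSTROUTING MASQUERADE calls whose source (-s) is the bridge subnet."""
    hits = []
    for args in iptables_calls(log_text):
        if "POSTROUTING" not in args or "MASQUERADE" not in args or "-s" not in args:
            continue
        source_index = args.index("-s") + 1
        if source_index < len(args) and args[source_index] == subnet:
            hits.append(args)
    return hits


# Per-port rule taken from the log above: the source is a single container IP.
per_port = 'msg="/usr/sbin/iptables, [--wait -t nat -C POSTROUTING -p tcp -s 172.17.0.5 -d 172.17.0.5 --dport 8031 -j MASQUERADE]"'
# The rule we are looking for, written in the same log format for comparison.
subnet_rule = 'msg="/usr/sbin/iptables, [--wait -t nat -I POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE]"'

print(subnet_masquerade_calls(per_port))
print(len(subnet_masquerade_calls(subnet_rule)))
```

Splitting the arguments into tokens and comparing the value after -s avoids a false positive from a plain substring search for MASQUERADE.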

The dockerd startup log contains no entry that sets -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE. Next, here is the dockerd service log with --ip-masq=true:

dockerd log with ip-masq enabled

Nov 20 19:56:25 localhost systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- Unit docker.service has begun starting up.
Nov 20 19:56:25 localhost dockerd[29194]: time="2019-11-20T19:56:25.360829851+08:00" level=debug msg="Listener created for HTTP on unix (/var/run/docker.sock)"
Nov 20 19:56:25 localhost dockerd[29194]: time="2019-11-20T19:56:25.363662244+08:00" level=info msg="libcontainerd: new containerd process, pid: 29205"
Nov 20 19:56:25 localhost dockerd[29194]: time="2019-11-20T19:56:25.413798954+08:00" level=debug msg="containerd: grpc api on /var/run/docker/libcontainerd/docker-containerd.sock"
Nov 20 19:56:25 localhost dockerd[29194]: time="2019-11-20T19:56:25.43069298+08:00" level=debug msg="containerd: read past events" count=889
Nov 20 19:56:25 localhost dockerd[29194]: time="2019-11-20T19:56:25.431920253+08:00" level=debug msg="containerd: supervisor running" cpus=10 memory=15884 runtime=docker-runc runtimeArgs=[] stateDir="/var/run/docker/libcontainerd/containerd"
Nov 20 19:56:26 localhost dockerd[29194]: time="2019-11-20T19:56:26.367814654+08:00" level=debug msg="Using default logging driver json-file"
Nov 20 19:56:26 localhost dockerd[29194]: time="2019-11-20T19:56:26.367968513+08:00" level=debug msg="Golang's threads limit set to 114120"
Nov 20 19:56:26 localhost dockerd[29194]: time="2019-11-20T19:56:26.482988509+08:00" level=info msg="[graphdriver] using prior storage driver: overlay"
Nov 20 19:56:26 localhost dockerd[29194]: time="2019-11-20T19:56:26.483074356+08:00" level=debug msg="Using graph driver overlay"
Nov 20 19:56:27 localhost dockerd[29194]: time="2019-11-20T19:56:27.012441762+08:00" level=debug msg="Max Concurrent Downloads: 3"
Nov 20 19:56:27 localhost dockerd[29194]: time="2019-11-20T19:56:27.012511378+08:00" level=debug msg="Max Concurrent Uploads: 5"
Nov 20 19:56:27 localhost dockerd[29194]: time="2019-11-20T19:56:27.978489276+08:00" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Nov 20 19:56:27 localhost dockerd[29194]: time="2019-11-20T19:56:27.981563636+08:00" level=info msg="Loading containers: start."
Nov 20 19:56:27 localhost dockerd[29194]: time="2019-11-20T19:56:27.998301236+08:00" level=debug msg="Loaded container 0019277d551acd628a2a261700084b140f4eb5200d325224f9939c7d8b369210"
...
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.733974785+08:00" level=debug msg="Option Experimental: false"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.734076895+08:00" level=debug msg="Option DefaultDriver: bridge"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.734109116+08:00" level=debug msg="Option DefaultNetwork: bridge"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.845959144+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.874553702+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER]"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.901055240+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.922319105+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D PREROUTING]"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.944060777+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D OUTPUT]"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.967324383+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -F DOCKER]"
Nov 20 19:56:55 localhost dockerd[29194]: time="2019-11-20T19:56:55.987413083+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -X DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.014868917+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -F DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.036961301+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -X DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.055275108+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -F DOCKER-ISOLATION]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.074053719+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -X DOCKER-ISOLATION]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.090371208+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -n -L DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.108678619+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -N DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.131947559+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -n -L DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.148991887+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -n -L DOCKER-ISOLATION]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.164710366+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C DOCKER-ISOLATION -j RETURN]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.181511594+08:00" level=debug msg="/usr/sbin/iptables, [--wait -A DOCKER-ISOLATION -j RETURN]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.284105227+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.308756270+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C DOCKER -i docker0 -j RETURN]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.335641210+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -I DOCKER -i docker0 -j RETURN]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.363230612+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -i docker0 -o docker0 -j DROP]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.383616696+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -i docker0 -o docker0 -j ACCEPT]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.402559510+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -i docker0 ! -o docker0 -j ACCEPT]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.421681542+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.443809710+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.468883034+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C OUTPUT -m addrtype --dst-type LOCAL -j DOCKER ! --dst 127.0.0.0/8]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.490892976+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -A OUTPUT -m addrtype --dst-type LOCAL -j DOCKER ! --dst 127.0.0.0/8]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.516061320+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -j DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.535614080+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -j DOCKER]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.554691848+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.575195396+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.594073056+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -j DOCKER-ISOLATION]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.612002882+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -j DOCKER-ISOLATION]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.633447313+08:00" level=debug msg="/usr/sbin/iptables, [--wait -I FORWARD -j DOCKER-ISOLATION]"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.656813751+08:00" level=debug msg="Network (a1a0658) restored"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.696105374+08:00" level=debug msg="Allocating IPv4 pools for network bridge (a1a065871eb32a252c404df0612daa76062e75d30d19f168a3964da23c637b39)"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.696189927+08:00" level=debug msg="RequestPool(LocalDefault, 172.17.0.0/16, , map[], false)"
Nov 20 19:56:56 localhost dockerd[29194]: time="2019-11-20T19:56:56.696395058+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.1, map[RequestAddressType:com.docker.network.gateway])"
Nov 20 19:56:57 localhost dockerd[29194]: time="2019-11-20T19:56:57.128664833+08:00" level=info msg="Removing stale sandbox c2652096fb8d27c58a0f43b2a81290cc04606ff6a8ae9fb12020991bcc159e45 (a206fe6accc38d656cf33e31bde6390dfcb94a7d57d7be0f4f3580377cfec219)"
Nov 20 19:56:58 localhost dockerd[29194]: time="2019-11-20T19:56:58.015474182+08:00" level=error msg="getEndpointFromStore for eid efdaba4bad3c4609bf9e121658cb8e41fc2f1e3830449e6fe43bc54cc8d336d8 failed while trying to build sandbox for cleanup: could not find endpoint efdaba4bad3c4609bf9e121658cb8e41fc2f1e3830449e6fe43bc54cc8d336d8: []"
...
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.206143067+08:00" level=info msg="Fixing inconsistent endpoint_cnt for network none. Expected=0, Actual=4"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.284957847+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.305198036+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.327371741+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C DOCKER -i docker0 -j RETURN]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.346494574+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -D DOCKER -i docker0 -j RETURN]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.369452177+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -i docker0 -o docker0 -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.385374448+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -i docker0 -o docker0 -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.403740178+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -i docker0 ! -o docker0 -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.420012781+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -i docker0 ! -o docker0 -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.437025163+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -j DOCKER]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.451082839+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -j DOCKER]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.465296414+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -o docker0 -j DOCKER]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.481200659+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.497479356+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.513781726+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.585051073+08:00" level=debug msg="releasing IPv4 pools from network bridge (a1a065871eb32a252c404df0612daa76062e75d30d19f168a3964da23c637b39)"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.585130601+08:00" level=debug msg="ReleaseAddress(LocalDefault/172.17.0.0/16, 172.17.0.1)"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.585188268+08:00" level=debug msg="ReleasePool(LocalDefault/172.17.0.0/16)"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.618285456+08:00" level=debug msg="cleanupServiceBindings for a1a065871eb32a252c404df0612daa76062e75d30d19f168a3964da23c637b39"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.689639084+08:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.689744084+08:00" level=debug msg="Allocating IPv4 pools for network bridge (8d788c8c16ae53e5afc4b42e85f37bf6d90a534d4a912a0e526c1fee9eb372e7)"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.689771299+08:00" level=debug msg="RequestPool(LocalDefault, 172.17.0.0/16, , map[], false)"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.689858293+08:00" level=debug msg="RequestAddress(LocalDefault/172.17.0.0/16, 172.17.0.1, map[RequestAddressType:com.docker.network.gateway])"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.690245805+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.706716889+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -I POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.730492852+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C DOCKER -i docker0 -j RETURN]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.751247682+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -I DOCKER -i docker0 -j RETURN]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.774815794+08:00" level=debug msg="/usr/sbin/iptables, [--wait -D FORWARD -i docker0 -o docker0 -j DROP]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.790259210+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -i docker0 -o docker0 -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.807369393+08:00" level=debug msg="/usr/sbin/iptables, [--wait -I FORWARD -i docker0 -o docker0 -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.826121363+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t filter -C FORWARD -i docker0 ! -o docker0 -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.846859743+08:00" level=debug msg="/usr/sbin/iptables, [--wait -I FORWARD -i docker0 ! -o docker0 -j ACCEPT]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.867987977+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.887877305+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]"
Nov 20 19:57:58 localhost dockerd[29194]: time="2019-11-20T19:57:58.908672750+08:00" level=debug msg="/usr/sbin/iptables, [--wait -t nat -C OUTPUT -m addrtype --dst-type LOCAL -j DOCKER ! 
–dst 127.0.0.0/8]” Nov 20 19:57:58 localhost dockerd[29194]: time=“2019-11-20T19:57:58.927645010+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -t nat -C OUTPUT -m addrtype –dst-type LOCAL -j DOCKER ! –dst 127.0.0.0/8]” Nov 20 19:57:58 localhost dockerd[29194]: time=“2019-11-20T19:57:58.950293283+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -t filter -C FORWARD -o docker0 -j DOCKER]” Nov 20 19:57:58 localhost dockerd[29194]: time=“2019-11-20T19:57:58.968980254+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -I FORWARD -o docker0 -j DOCKER]” Nov 20 19:57:58 localhost dockerd[29194]: time=“2019-11-20T19:57:58.987675944+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -t filter -C FORWARD -o docker0 -m conntrack –ctstate RELATED,ESTABLISHED -j ACCEPT]” Nov 20 19:57:59 localhost dockerd[29194]: time=“2019-11-20T19:57:59.003987393+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -I FORWARD -o docker0 -m conntrack –ctstate RELATED,ESTABLISHED -j ACCEPT]” Nov 20 19:57:59 localhost dockerd[29194]: time=“2019-11-20T19:57:59.022481990+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -t filter -C FORWARD -j DOCKER-ISOLATION]” Nov 20 19:57:59 localhost dockerd[29194]: time=“2019-11-20T19:57:59.039402584+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -D FORWARD -j DOCKER-ISOLATION]” Nov 20 19:57:59 localhost dockerd[29194]: time=“2019-11-20T19:57:59.058339359+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -I FORWARD -j DOCKER-ISOLATION]” Nov 20 19:57:59 localhost dockerd[29194]: time=“2019-11-20T19:57:59.139123493+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -t filter -n -L DOCKER-USER]” Nov 20 19:57:59 localhost dockerd[29194]: time=“2019-11-20T19:57:59.157298644+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -t filter -C DOCKER-USER -j RETURN]” Nov 20 19:57:59 localhost dockerd[29194]: time=“2019-11-20T19:57:59.174001382+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -t filter -C FORWARD -j DOCKER-USER]” Nov 20 19:57:59 localhost 
dockerd[29194]: time=“2019-11-20T19:57:59.188689266+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -D FORWARD -j DOCKER-USER]” Nov 20 19:57:59 localhost dockerd[29194]: time=“2019-11-20T19:57:59.208030003+08:00” level=debug msg=“/usr/sbin/iptables, [–wait -I FORWARD -j DOCKER-USER]” Nov 20 19:58:17 localhost dockerd[29194]: time=“2019-11-20T19:58:17.326531605+08:00” level=info msg=“Loading containers: done.” Nov 20 19:58:17 localhost dockerd[29194]: time=“2019-11-20T19:58:17.649466743+08:00” level=info msg=“Daemon has completed initialization” Nov 20 19:58:17 localhost dockerd[29194]: time=“2019-11-20T19:58:17.649603664+08:00” level=info msg=“Docker daemon” commit=cec0b72 graphdriver=overlay version=17.06.2-ce Nov 20 19:58:17 localhost dockerd[29194]: time=“2019-11-20T19:58:17.649854403+08:00” level=debug msg=“Registering routers”

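From the log we can see that, with --ip-masq enabled, dockerd first deletes and then regenerates the two firewall rules shown above. Next, let's look at how the source code handles this rule.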
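
When reviewing a long debug log like this one, it can help to pull out just the iptables calls. The following is a minimal Go sketch and not part of Docker: the `iptCall` type and the `parseIptCall` helper are hypothetical names, and the sketch assumes log lines in the raw `msg="/usr/sbin/iptables, [--wait ...]"` form with plain ASCII quotes and double hyphens.

```go
package main

import (
	"fmt"
	"strings"
)

// iptCall is a hypothetical record describing one iptables invocation
// extracted from a dockerd debug log line.
type iptCall struct {
	Table string // defaults to "filter" when the call has no -t flag
	Op    string // "check", "insert", "append" or "delete"
	Chain string
	Rule  string // the rule arguments that follow the chain name
}

// parseIptCall looks for `iptables, [ ... ]` in a log line and decodes the
// argument list. It reports false for lines that carry no rule operation.
func parseIptCall(line string) (iptCall, bool) {
	const marker = "iptables, ["
	start := strings.Index(line, marker)
	if start < 0 {
		return iptCall{}, false
	}
	rest := line[start+len(marker):]
	end := strings.Index(rest, "]")
	if end < 0 {
		return iptCall{}, false
	}
	args := strings.Fields(rest[:end])
	ops := map[string]string{"-C": "check", "-I": "insert", "-A": "append", "-D": "delete"}
	call := iptCall{Table: "filter"}
	for i := 0; i+1 < len(args); i++ {
		if args[i] == "-t" {
			call.Table = args[i+1]
			i++
			continue
		}
		if op, found := ops[args[i]]; found {
			call.Op, call.Chain = op, args[i+1]
			call.Rule = strings.Join(args[i+2:], " ")
			return call, true
		}
	}
	return iptCall{}, false
}

func main() {
	line := `level=debug msg="/usr/sbin/iptables, [--wait -t nat -I POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE]"`
	if call, ok := parseIptCall(line); ok {
		fmt.Printf("%s %s/%s: %s\n", call.Op, call.Table, call.Chain, call.Rule)
	}
}
```

Filtering the parsed calls on the `nat` table and the `POSTROUTING` chain makes the check-then-insert pair for the MASQUERADE rule easy to spot.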

Reading the source code

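Download the Docker source for 17.06.2-ce, the version running in the affected environment. Searching for the keywords MASQUERADE or POSTROUTING shows that the relevant logic lives in libnetwork/drivers/bridge/setup_ip_tables.go.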

func setupIPTablesInternal(bridgeIface string, addr net.Addr, icc, ipmasq, hairpin, enable bool) error {

	var (
		address   = addr.String()
		natRule   = iptRule{table: iptables.Nat, chain: "POSTROUTING", preArgs: []string{"-t", "nat"}, args: []string{"-s", address, "!", "-o", bridgeIface, "-j", "MASQUERADE"}}
		hpNatRule = iptRule{table: iptables.Nat, chain: "POSTROUTING", preArgs: []string{"-t", "nat"}, args: []string{"-m", "addrtype", "--src-type", "LOCAL", "-o", bridgeIface, "-j", "MASQUERADE"}}
		skipDNAT  = iptRule{table: iptables.Nat, chain: DockerChain, preArgs: []string{"-t", "nat"}, args: []string{"-i", bridgeIface, "-j", "RETURN"}}
		outRule   = iptRule{table: iptables.Filter, chain: "FORWARD", args: []string{"-i", bridgeIface, "!", "-o", bridgeIface, "-j", "ACCEPT"}}
	)

	// Set NAT.
	if ipmasq {
		if err := programChainRule(natRule, "NAT", enable); err != nil {
			return err
		}
	}
	...
}

As the code shows, the POSTROUTING rule is created only when ipmasq is true. Tracing further up, setupIPTablesInternal is called from setupIPTables in setup_ip_tables.go, which looks like this:

// setup_ip_tables.go
func (n *bridgeNetwork) setupIPTables(config *networkConfiguration, i *bridgeInterface) error {
	var err error

	d := n.driver
	d.Lock()
	driverConfig := d.config
	d.Unlock()

	// Sanity check.
	if driverConfig.EnableIPTables == false {
		return errors.New("Cannot program chains, EnableIPTable is disabled")
	}
	...

	if config.Internal {
		...
	} else {
		if err = setupIPTablesInternal(config.BridgeName, maskedAddrv4, config.EnableICC, config.EnableIPMasquerade, hairpinMode, true); err != nil {
			return fmt.Errorf("Failed to Setup IP tables: %s", err.Error())
		}
		n.registerIptCleanFunc(func() error {
			return setupIPTablesInternal(config.BridgeName, maskedAddrv4, config.EnableICC, config.EnableIPMasquerade, hairpinMode, false)
		})
	}

	return nil
}

Here ipmasq corresponds to config.EnableIPMasquerade. Looking for the places where config.EnableIPMasquerade is set, we find that it is assigned when the bridge is initialized.
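
The debug log earlier showed each rule being probed with `-C` before it was inserted or deleted. As a rough illustration of that check-then-apply behaviour, here is a simplified Go model. It is not the libnetwork implementation: `planRule` and `masqueradeRule` are hypothetical helpers, and the `exists` flag stands in for the result of the `iptables -C` probe.

```go
package main

import "fmt"

// planRule is a simplified, hypothetical model of a check-then-apply step.
// exists is the result of an `iptables -C` probe and enable says whether the
// rule is wanted. It returns the iptables arguments to run, or nil when the
// current state already matches.
func planRule(table, chain string, rule []string, exists, enable bool) []string {
	var op string
	switch {
	case enable && !exists:
		op = "-I" // rule wanted but missing: insert it
	case !enable && exists:
		op = "-D" // rule unwanted but present: delete it
	default:
		return nil
	}
	args := []string{"-t", table, op, chain}
	return append(args, rule...)
}

// masqueradeRule builds the rule body used for outbound SNAT of a bridge subnet.
func masqueradeRule(subnet, bridge string) []string {
	return []string{"-s", subnet, "!", "-o", bridge, "-j", "MASQUERADE"}
}

func main() {
	rule := masqueradeRule("172.17.0.0/16", "docker0")
	for _, ipmasq := range []bool{true, false} {
		if !ipmasq {
			fmt.Println("ipmasq=false: the NAT rule is never programmed")
			continue
		}
		fmt.Println("ipmasq=true:", planRule("nat", "POSTROUTING", rule, false, true))
	}
}
```

With ipmasq disabled the planning step is skipped entirely, which matches the missing POSTROUTING rule observed on the affected host.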

Next, let's follow how ipmasq is handled from dockerd startup through the initialization of the bridge and the network.

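The startup flow in detail

First, dockerd reads the command-line arguments and calls daemonCli.start(opts). That function reads /etc/docker/daemon.json, compares it with the command-line arguments and merges the two into one configuration, then calls NewDaemon to initialize the daemon. Most of the daemon logic is carried out in that function.

NewDaemon first calls verifyDaemonSettings to validate the parameters: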
func verifyDaemonSettings(conf *config.Config) error {
	...
	// If iptables is false, ip-masq has no effect even when it is true, which matches the official documentation quoted above.
	if !conf.BridgeConfig.EnableIPTables && conf.BridgeConfig.EnableIPMasq {
		conf.BridgeConfig.EnableIPMasq = false
	}
	...
	return nil
}
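
To check a given daemon.json by hand, the same rule can be modelled in a few lines. This is a minimal sketch, assuming the file uses the `iptables` and `ip-masq` keys and that both default to true when absent; `daemonFile` and `effectiveIPMasq` are hypothetical names, not Docker APIs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// daemonFile holds the two daemon.json keys discussed above. Pointers let us
// tell "not set" (nil) apart from an explicit false.
type daemonFile struct {
	IPTables *bool `json:"iptables"`
	IPMasq   *bool `json:"ip-masq"`
}

// effectiveIPMasq reports whether masquerading would still be enabled after
// the validation step: it needs both iptables and ip-masq to be true.
func effectiveIPMasq(raw []byte) (bool, error) {
	var f daemonFile
	if err := json.Unmarshal(raw, &f); err != nil {
		return false, err
	}
	iptables, ipMasq := true, true // assumed defaults when a key is absent
	if f.IPTables != nil {
		iptables = *f.IPTables
	}
	if f.IPMasq != nil {
		ipMasq = *f.IPMasq
	}
	return iptables && ipMasq, nil
}

func main() {
	for _, cfg := range []string{`{}`, `{"ip-masq": false}`, `{"iptables": false, "ip-masq": true}`} {
		enabled, err := effectiveIPMasq([]byte(cfg))
		fmt.Println(cfg, "->", enabled, err)
	}
}
```

The second sample corresponds to the configuration copied from the Kubernetes environment in this incident.
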
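Restoring data

Once the configuration has been processed, the daemon sets up the log driver, thread limits and so on, and initializes the storage state of each component from the data directory (/var/lib/docker). After that comes the data restore step, which reconfigures and starts the container networks and containers that previously existed on this host. The function is d.restore(); let's step into it.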

// daemon/daemon.go
func (daemon *Daemon) restore() error {
	var (
		currentDriver = daemon.GraphDriverName()
		containers    = make(map[string]*container.Container)
	)

	logrus.Info("Loading containers: start.")

	dir, err := ioutil.ReadDir(daemon.repository)
	if err != nil {
		return err
	}

	for _, v := range dir {
		id := v.Name()
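		// Read every directory under `/var/lib/docker/containers`, collect all container information and load it into memory.
		// A `Loaded container ` log line is printed each time a container has been loaded.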
	}

	removeContainers := make(map[string]*container.Container)
	restartContainers := make(map[*container.Container]chan struct{})
	activeSandboxes := make(map[string]interface{})
	for id, c := range containers {
		// Walk the loaded containers and migrate the configuration or volumes of containers created by older versions, for backward compatibility.
		...
	}

	var wg sync.WaitGroup
	var mapLock sync.Mutex
	for _, c := range containers {
		wg.Add(1)
		go func(c *container.Container) {
			defer wg.Done()
			// One goroutine per container: `Running` and `Paused` containers are handled differently; the handling includes mount, remove and similar operations.
			...
		}(c)
	}
	wg.Wait()
	// Initialize the network
	daemon.netController, err = daemon.initNetworkController(daemon.configStore, activeSandboxes)
	if err != nil {
		return fmt.Errorf("Error initializing network controller: %v", err)
	}
	...
	logrus.Info("Loading containers: done.")

	return nil
}
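
The part of restore() that matters for ordering is the fan-out followed by wg.Wait(): networking is initialized only after every container goroutine has finished. A stripped-down sketch of that pattern follows, with a hypothetical `restoreAll` helper and a caller-supplied `restoreOne` callback standing in for the per-container work.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// restoreAll is a hypothetical stand-in for the fan-out in restore(): it runs
// restoreOne for every container ID in its own goroutine and records the IDs
// that report an active sandbox. The mutex guards the shared map.
func restoreAll(ids []string, restoreOne func(id string) bool) []string {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		active = make(map[string]bool)
	)
	for _, id := range ids {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			if restoreOne(id) {
				mu.Lock()
				active[id] = true
				mu.Unlock()
			}
		}(id)
	}
	wg.Wait() // every container is handled before networking is initialized

	out := make([]string, 0, len(active))
	for id := range active {
		out = append(out, id)
	}
	sort.Strings(out)
	return out
}

func main() {
	running := map[string]bool{"c1": true, "c3": true}
	got := restoreAll([]string{"c1", "c2", "c3"}, func(id string) bool { return running[id] })
	fmt.Println("active sandboxes:", got)
}
```
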
Step into initNetworkController to see the implementation.
// daemon/daemon_unix.go
func (daemon *Daemon) initNetworkController(config *config.Config, activeSandboxes map[string]interface{}) (libnetwork.NetworkController, error) {
	// Process the network configuration and add the default bridge network plugin to it
	netOptions, err := daemon.networkOptions(config, daemon.PluginStore, activeSandboxes)
	if err != nil {
		return nil, err
	}

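	// Create the network controller:
	// 1. By default, `/var/lib/docker/network/files/local-kv.db` is used as the network database that stores container network information.
	// 2. Initialize every network type Docker supports, such as bridge, host and none.
	// 3. Clean up the iptables rules Docker created earlier and handle the FORWARD rules.
	// See the implementation of `libnetwork.New(netOptions...)` for this part.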
	controller, err := libnetwork.New(netOptions...)
	if err != nil {
		return nil, fmt.Errorf("error obtaining controller instance: %v", err)
	}

	// Initialize default network on "null"
	...
	// Initialize default network on "host"
	...
	// Clean up the stale bridge network
	if n, err := controller.NetworkByName("bridge"); err == nil {
		if err = n.Delete(); err != nil {
			return nil, fmt.Errorf("could not delete the default bridge network: %v", err)
		}
	}

	if !config.DisableBridge {
		// Initialize default driver "bridge"
		if err := initBridgeDriver(controller, config); err != nil {
			return nil, err
		}
	} else {
		removeDefaultBridgeInterface()
	}

	return controller, nil
}

func initBridgeDriver(controller libnetwork.NetworkController, config *config.Config) error {
	bridgeName := bridge.DefaultBridgeName
	if config.BridgeConfig.Iface != "" {
		bridgeName = config.BridgeConfig.Iface
	}
	netOption := map[string]string{
		bridge.BridgeName:         bridgeName,
		bridge.DefaultBridge:      strconv.FormatBool(true),
		netlabel.DriverMTU:        strconv.Itoa(config.Mtu),
		// config.BridgeConfig.EnableIPMasq is the value already adjusted by verifyDaemonSettings above.
		bridge.EnableIPMasquerade: strconv.FormatBool(config.BridgeConfig.EnableIPMasq),
		bridge.EnableICC:          strconv.FormatBool(config.BridgeConfig.InterContainerCommunication),
	}

	// --ip processing
	if config.BridgeConfig.DefaultIP != nil {
		netOption[bridge.DefaultBindingIP] = config.BridgeConfig.DefaultIP.String()
	}
	...
	// NewNetwork initializes the bridge; step into this function
	// Initialize default network on "bridge" with the same name
	_, err = controller.NewNetwork("bridge", "bridge", "",
		libnetwork.NetworkOptionEnableIPv6(config.BridgeConfig.EnableIPv6),
		libnetwork.NetworkOptionDriverOpts(netOption),
		libnetwork.NetworkOptionIpam("default", "", v4Conf, v6Conf, nil),
		libnetwork.NetworkOptionDeferIPv6Alloc(deferIPv6Alloc))
	if err != nil {
		return fmt.Errorf("Error creating default \"bridge\" network: %v", err)
	}
	return nil
}
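
Note that the daemon hands its typed booleans to the bridge driver as strings in a generic option map, so the driver has to parse them back. The round trip can be sketched as follows. The label strings are the ones that usually appear in `docker network inspect bridge` output and are treated here as an assumption; `encodeOpts` and `decodeBool` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strconv"
)

// Label keys as they usually appear in `docker network inspect bridge`;
// treated here as an assumption rather than read from the libnetwork source.
const (
	labelIPMasq = "com.docker.network.bridge.enable_ip_masquerade"
	labelICC    = "com.docker.network.bridge.enable_icc"
)

// encodeOpts turns typed settings into the string map handed to the driver.
func encodeOpts(ipMasq, icc bool) map[string]string {
	return map[string]string{
		labelIPMasq: strconv.FormatBool(ipMasq),
		labelICC:    strconv.FormatBool(icc),
	}
}

// decodeBool reads one boolean option back, falling back to def when unset.
func decodeBool(opts map[string]string, key string, def bool) (bool, error) {
	raw, ok := opts[key]
	if !ok {
		return def, nil
	}
	return strconv.ParseBool(raw)
}

func main() {
	opts := encodeOpts(false, true)
	masq, err := decodeBool(opts, labelIPMasq, true)
	fmt.Println(opts[labelIPMasq], masq, err)
}
```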

NewNetwork initializes the bridge. Stepping into it, we find that it calls the createNetwork method in github.com/docker/libnetwork/drivers/bridge/bridge.go, and createNetwork in turn ends up calling setupIPTables, the method we saw at the start, to program the iptables rules.

// github.com/docker/libnetwork/drivers/bridge/bridge.go
func (d *driver) createNetwork(config *networkConfiguration) error {

	...
	network := &bridgeNetwork{
		id:         config.ID,
		endpoints:  make(map[string]*bridgeEndpoint),
		config:     config,
		portMapper: portmapper.New(d.config.UserlandProxyPath),
		driver:     d,
	}
	...
	// Conditionally queue setup steps depending on configuration values.
	for _, step := range []struct {
		Condition bool
		Fn        setupStep
	}{
		...
		// Setup IPTables.
		{d.config.EnableIPTables, network.setupIPTables},
		...
	} {
		if step.Condition {
			bridgeSetup.queueStep(step.Fn)
		}
	}

	return nil
}
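
The table of {condition, function} pairs in createNetwork is a compact way to make whole setup steps optional. Below is a self-contained sketch of the same pattern. `conditionalStep` and `runSteps` are hypothetical names, and unlike the excerpt this version also runs the queue so the effect is visible.

```go
package main

import (
	"errors"
	"fmt"
)

type setupStep func() error

type conditionalStep struct {
	Condition bool
	Fn        setupStep
}

// runSteps queues only the steps whose condition holds, then runs them in
// order and stops at the first error.
func runSteps(steps []conditionalStep) error {
	var queue []setupStep
	for _, s := range steps {
		if s.Condition {
			queue = append(queue, s.Fn)
		}
	}
	for _, fn := range queue {
		if err := fn(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	enableIPTables := false
	err := runSteps([]conditionalStep{
		{true, func() error { fmt.Println("create bridge device"); return nil }},
		{enableIPTables, func() error { return errors.New("unreachable while iptables is off") }},
	})
	fmt.Println("setup error:", err)
}
```

Because the iptables step is gated on EnableIPTables, turning that option off means setupIPTables is never queued at all.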

At this point we understand how Docker handles the --ip-masq parameter and how it sets up the iptables rules based on it.

Implementation of libnetwork.New(netOptions...)
// github.com/docker/libnetwork/controller.go
// New creates a new instance of network controller.
func New(cfgOptions ...config.Option) (NetworkController, error) {
	c := &controller{
		id:              stringid.GenerateRandomID(),
		cfg:             config.ParseConfigOptions(cfgOptions...),
		sandboxes:       sandboxTable{},
		svcRecords:      make(map[string]svcInfo),
		serviceBindings: make(map[serviceKey]*service),
		agentInitDone:   make(chan struct{}),
		networkLocker:   locker.New(),
	}

	// Create the client used to read the network database
	if err := c.initStores(); err != nil {
		return nil, err
	}

	drvRegistry, err := drvregistry.New(c.getStore(datastore.LocalScope), c.getStore(datastore.GlobalScope), c.RegisterDriver, nil, c.cfg.PluginGetter)
	if err != nil {
		return nil, err
	}

    // Add and initialize every network type. getInitializers is shown below; these are the network types Docker supports.
    // We mainly care about the bridge type, that is, the bridge.Init method.
    /*
        func getInitializers(experimental bool) []initializer {
            in := []initializer{
                {bridge.Init, "bridge"},
                {host.Init, "host"},
                {macvlan.Init, "macvlan"},
                {null.Init, "null"},
                {remote.Init, "remote"},
                {overlay.Init, "overlay"},
            }

            if experimental {
                in = append(in, additionalDrivers()...)
            }
            return in
        }
    */
	for _, i := range getInitializers(c.cfg.Daemon.Experimental) {
		var dcfg map[string]interface{}

		// External plugins don't need config passed through daemon. They can
		// bootstrap themselves
		if i.ntype != "remote" {
			dcfg = c.makeDriverConfig(i.ntype)
		}

		if err := drvRegistry.AddDriver(i.ntype, i.fn, dcfg); err != nil {
			return nil, err
		}
	}
	...

	return c, nil
}
// Implementation of bridge.Init.
// Init registers a new instance of bridge driver
func Init(dc driverapi.DriverCallback, config map[string]interface{}) error {
	d := newDriver()
	if err := d.configure(config); err != nil {
		return err
	}

	c := driverapi.Capability{
		DataScope:         datastore.LocalScope,
		ConnectivityScope: datastore.LocalScope,
	}
	return dc.RegisterDriver(networkType, d, c)
}
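
The loop in New and the RegisterDriver call in Init together form a small plugin registry: each driver type supplies an init function, and that function registers the driver through a callback. The following is a toy model of that handshake, not the drvregistry API. All type and function names here are hypothetical.

```go
package main

import "fmt"

// registerFn is a hypothetical callback through which a driver announces itself.
type registerFn func(name string) error

// initializer pairs an init function with its network type, like the
// {bridge.Init, "bridge"} entries quoted above.
type initializer struct {
	fn    func(register registerFn, cfg map[string]interface{}) error
	ntype string
}

type registry struct{ drivers map[string]map[string]interface{} }

// addDriver runs the init function and lets it register itself once.
func (r *registry) addDriver(in initializer, cfg map[string]interface{}) error {
	return in.fn(func(name string) error {
		if _, dup := r.drivers[name]; dup {
			return fmt.Errorf("driver %q already registered", name)
		}
		r.drivers[name] = cfg
		return nil
	}, cfg)
}

// simpleInit returns an init function that only registers the given name.
func simpleInit(name string) func(registerFn, map[string]interface{}) error {
	return func(register registerFn, _ map[string]interface{}) error { return register(name) }
}

func main() {
	r := &registry{drivers: map[string]map[string]interface{}{}}
	for _, in := range []initializer{{simpleInit("bridge"), "bridge"}, {simpleInit("remote"), "remote"}} {
		var cfg map[string]interface{}
		if in.ntype != "remote" { // external plugins get no daemon-side config
			cfg = map[string]interface{}{"EnableIPTables": true}
		}
		if err := r.addDriver(in, cfg); err != nil {
			fmt.Println("init failed:", err)
			return
		}
	}
	fmt.Println("registered drivers:", len(r.drivers))
}
```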

func (d *driver) configure(option map[string]interface{}) error {
	var (
		config         *configuration
		err            error
		natChain       *iptables.ChainInfo
		filterChain    *iptables.ChainInfo
		isolationChain *iptables.ChainInfo
	)
	...

	if config.EnableIPTables {
		if _, err := os.Stat("/proc/sys/net/bridge"); err != nil {
			if out, err := exec.Command("modprobe", "-va", "bridge", "br_netfilter").CombinedOutput(); err != nil {
				logrus.Warnf("Running modprobe bridge br_netfilter failed with message: %s, error: %v", out, err)
			}
		}
		// Clean up the Docker iptables rules left from before; the log in the earlier section shows these rules being deleted
		removeIPChains()
		natChain, filterChain, isolationChain, err = setupIPChains(config)
		if err != nil {
			return err
		}
		// Make sure on firewall reload, first thing being re-played is chains creation
		iptables.OnReloaded(func() { logrus.Debugf("Recreating iptables chains on firewall reload"); setupIPChains(config) })
	}

	// FORWARD handling happens here; it reads the value in /proc/sys/net/ipv4/ip_forward
	if config.EnableIPForwarding {
		err = setupIPForwarding(config.EnableIPTables)
		if err != nil {
			logrus.Warn(err)
			return err
		}
	}
	...

	return nil
}
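
The kernel setting mentioned in the last comment can be inspected directly. This is a minimal sketch that only reads the value. It does not reproduce whatever setupIPForwarding does after reading it, and `parseIPForward` and `ipForwardEnabled` are hypothetical helper names.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// parseIPForward interprets the content of the ip_forward file: "1" means
// the kernel forwards IPv4 packets, anything else is treated as disabled.
func parseIPForward(data []byte) bool {
	return bytes.Equal(bytes.TrimSpace(data), []byte("1"))
}

// ipForwardEnabled reads the given procfs path and parses its content.
func ipForwardEnabled(path string) (bool, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return false, err
	}
	return parseIPForward(data), nil
}

func main() {
	on, err := ipForwardEnabled("/proc/sys/net/ipv4/ip_forward")
	if err != nil {
		fmt.Println("cannot read ip_forward:", err)
		return
	}
	fmt.Println("ip forwarding enabled:", on)
}
```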

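What the local db file stores

By default Docker uses boltdb to store network configuration. Let's look at the format of the data kept in this db file; it may help later when reading other parts of the Docker source or implementing our own components.

With a little code we can read the Docker network configuration stored on the host. We read the values under the key prefix docker/network/v1.0/bridge, which holds the information for Docker's default bridge. The format is as follows: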

{
    "BridgeIfaceCreator": 2,
    "Internal": false,
    "DefaultBridge": true,
    "EnableICC": true,
    "EnableIPv6": false,
    "Mtu": 1500,
    "DefaultGatewayIPv6": "<nil>",
    "EnableIPMasquerade": true,
    "DefaultGatewayIPv4": "<nil>",
    "AddressIPv4": "172.17.0.1/16",
    "ContainerIfacePrefix": "",
    "ID": "6ce3c0c4d5c392d732a06a7ea9d6293bf04201056d563f7a6ad5ef9d8b3822db",
    "BridgeName": "docker0",
    "DefaultBindingIP": "0.0.0.0"
}

We can see that EnableIPMasquerade is set to true, which is why the -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE rule mentioned earlier gets created.
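
The stored record also explains where the rule's source subnet comes from: masking AddressIPv4 (172.17.0.1/16) with its prefix length gives 172.17.0.0/16. The following sketch decodes a record of this shape and derives the rule parameters. `bridgeRecord` and `masqSource` are hypothetical names, and only the three fields needed here are declared.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// bridgeRecord declares only the fields of the stored JSON that matter for
// the masquerade rule; all other keys are ignored by encoding/json.
type bridgeRecord struct {
	BridgeName         string
	AddressIPv4        string
	EnableIPMasquerade bool
}

// masqSource returns the subnet and bridge name a masquerade rule would use,
// plus whether masquerading is enabled in the record.
func masqSource(raw []byte) (string, string, bool, error) {
	var r bridgeRecord
	if err := json.Unmarshal(raw, &r); err != nil {
		return "", "", false, err
	}
	_, ipnet, err := net.ParseCIDR(r.AddressIPv4)
	if err != nil {
		return "", "", false, err
	}
	return ipnet.String(), r.BridgeName, r.EnableIPMasquerade, nil
}

func main() {
	raw := []byte(`{"BridgeName": "docker0", "AddressIPv4": "172.17.0.1/16", "EnableIPMasquerade": true}`)
	subnet, bridge, enabled, err := masqSource(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("masquerade=%v rule: -s %s ! -o %s -j MASQUERADE\n", enabled, subnet, bridge)
}
```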

// Example code for reading boltdb
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/docker/libkv"
	"github.com/docker/libkv/store"
	"github.com/docker/libkv/store/boltdb"
)

func init() {
	// Register boltdb store to libkv
	boltdb.Register()
}

func main() {
	client := "./local-kv.db"

	// Initialize a new store
	kv, err := libkv.NewStore(
		store.BOLTDB,
		[]string{client},
		&store.Config{
			Bucket:            "libnetwork",
			ConnectionTimeout: 10 * time.Second,
		},
	)
	if err != nil {
		log.Fatalf("Cannot create store: %v", err)
	}

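	// List every entry stored under the default bridge key prefix.
	pairs, err := kv.List("docker/network/v1.0/bridge")
	if err != nil {
		log.Fatalf("Cannot list keys: %v", err)
	}
	for _, p := range pairs {
		fmt.Println(p.Key)
		fmt.Println(string(p.Value))
	}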
}

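Summary

The flow is as follows:

  1. dockerd first obtains the values of --iptables and --ip-masq from the configuration file and the command-line arguments.
  2. It validates the parameters, and the validation decides whether ip-masq is actually enabled.
  3. When dockerd initializes the bridge network, it first cleans up the old iptables rules and then adds the new ones in order.
  4. If ip-masq is enabled, it creates the POSTROUTING rule.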
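
To confirm the outcome of step 4 on a host, one option is to scan the output of `iptables -t nat -S POSTROUTING` for the expected rule. Below is a minimal sketch that works on captured output rather than invoking iptables itself. `hasMasqueradeRule` is a hypothetical helper, and the sample text imitates the firewalld-managed chain seen on the affected host.

```go
package main

import (
	"fmt"
	"strings"
)

// hasMasqueradeRule reports whether rules-spec output (one rule per line, as
// printed by `iptables -t nat -S POSTROUTING`) contains the bridge SNAT rule.
func hasMasqueradeRule(rules, subnet, bridge string) bool {
	want := fmt.Sprintf("-A POSTROUTING -s %s ! -o %s -j MASQUERADE", subnet, bridge)
	for _, line := range strings.Split(rules, "\n") {
		if strings.TrimSpace(line) == want {
			return true
		}
	}
	return false
}

func main() {
	// Sample output from a host that is missing the rule.
	broken := "-P POSTROUTING ACCEPT\n-A POSTROUTING -j POSTROUTING_direct\n"
	fixed := broken + "-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE\n"
	fmt.Println("broken host has rule:", hasMasqueradeRule(broken, "172.17.0.0/16", "docker0"))
	fmt.Println("fixed host has rule:", hasMasqueradeRule(fixed, "172.17.0.0/16", "docker0"))
}
```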