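Running a Tomcat Container with runc and Inspecting Its Runtime State
Although Docker has become almost synonymous with containers, you do not need Docker to create a container environment. runc is a command-line container tool built on the same "engine" as Docker, libcontainer. Compared with Docker, runc is lower-level: it mainly provides OCI-compliant container creation, execution, and state queries. This article uses runc to build, step by step, an Alpine Linux container that runs Tomcat.
Note: all scripts and code in this article were run successfully on 64-bit Ubuntu 18.04 Server.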
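Preparation
runc
On Ubuntu, install runc with sudo apt install runc, or download the executable from the official releases on GitHub. Building from the Go source also works, and that is the approach the author took.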
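Alpine Linux filesystem
To create a container, a base Linux filesystem is indispensable. The base filesystem is simply the set of files and directories a system needs in order to run, such as /sys, /proc, and /dev. This article uses the Alpine mini root filesystem. Like Ubuntu and Fedora, Alpine Linux is a Linux distribution, but it is stripped down, which makes it well suited to devices with limited storage.
After downloading the filesystem, extract it into a directory named rootfs, for example /home/yourname/alpine-bundle/rootfs/. After extraction, rootfs contains directories such as /bin, /dev, and /etc.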
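Tomcat
Download Tomcat from the official website, extract it, and place it in the rootfs directory created above. The rootfs directory should then contain the following directories:
bin dev etc home lib media mnt opt proc root run sbin srv sys tmp tomcat-9.0.29 usr var
Here tomcat-9.0.29 is the extracted Tomcat directory.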
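Before moving on, it can help to confirm that the bundle has the expected layout. The following is a minimal sketch, not part of the original setup: the function name missing_rootfs_entries and the shortened EXPECTED list are illustrative assumptions, and the check only looks for directories under rootfs.

```python
from pathlib import Path

# A subset of the directories listed above; adjust to taste (illustrative choice).
EXPECTED = ("bin", "dev", "etc", "proc", "sys", "tomcat-9.0.29")


def missing_rootfs_entries(bundle, expected=EXPECTED):
    """Return the expected directory names that are absent from <bundle>/rootfs."""
    rootfs = Path(bundle) / "rootfs"
    return [name for name in expected if not (rootfs / name).is_dir()]
```

Calling missing_rootfs_entries("/home/yourname/alpine-bundle") should return an empty list once both the Alpine filesystem and Tomcat have been extracted.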
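Configuring config.json
The OCI runtime specification describes the container runtime lifecycle in detail and defines the format and meaning of the container configuration file; see the OCI repository on GitHub for details. To start a container with runc, you create a config.json file in the bundle (the directory that contains rootfs). The OCI specification defines the contents of this file, which describe the program to run when the container starts, the devices to mount, and so on.
Running runc spec in the bundle directory generates a default config.json, which you can then modify. The Tomcat container built here is essentially an Alpine Linux container with Tomcat started inside it. Docker's Tomcat, Python, and MySQL containers work in a similar way, except that they are based on Debian (another Linux distribution) rather than Alpine.
The complete config.json is listed later in this article.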
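The edits to the generated file can also be applied programmatically. The sketch below is only an illustration: the helper name customize_spec is hypothetical, it touches only a few of the fields that appear in this article's config.json (start command, writable root, hooks), and the capability and namespace changes still have to be made separately.

```python
import copy


def customize_spec(spec, init_script="init.sh"):
    """Return a modified copy of a config.json dict generated by `runc spec`."""
    cfg = copy.deepcopy(spec)

    # Start command used in this article: /bin/ash init.sh, run from "/".
    process = cfg.setdefault("process", {})
    process["args"] = ["/bin/ash", init_script]
    process["cwd"] = "/"

    # init.sh writes to /etc and installs packages, so rootfs must be writable.
    root = cfg.setdefault("root", {})
    root.setdefault("path", "rootfs")
    root["readonly"] = False

    # Hook entries copied from the config.json in this article.
    cfg["hooks"] = {
        "prestart": [{"path": "./prestart.out", "args": ["prestart.out"]}],
        "poststop": [{"path": "/bin/bash", "args": ["bash", "./poststop.sh"]}],
    }
    return cfg
```

A typical use would be to load config.json with json.load, pass the dict through customize_spec, and write the result back with json.dump.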
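process
This section specifies the program to run when the container starts. For the Tomcat container, the JDK must be installed and JAVA_HOME and JRE_HOME set after startup, and only then is Tomcat's startup.sh run. The startup script used in this article is shown below; the container's start command is /bin/ash init.sh. Before starting the container, copy init.sh into rootfs (or write the script directly under rootfs).
init.sh
The first part performs setup (interface attributes, routing table, package repositories, the JDK, and so on); the script then starts Tomcat and /bin/ash. In that shell, the ps command shows the running Tomcat process.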
# setup network and start /bin/ash
ifconfig veth985 up
ifconfig veth985 10.1.1.2
ifconfig veth985 netmask 255.255.255.0
route add default gw 10.1.1.1
echo 'nameserver 202.38.64.56' > /etc/resolv.conf
echo -e 'http://mirrors.ustc.edu.cn/alpine/v3.10/main\nhttp://mirrors.ustc.edu.cn/alpine/v3.10/community' > /etc/apk/repositories
echo 'installing openjdk...'
apk add openjdk11
echo 'jdk installed, starting tomcat-9.0.29'
./tomcat-9.0.29/bin/startup.sh
/bin/ash
echo 'stopping'
apk del openjdk11
echo 'jdk uninstalled'
route del default gw 10.1.1.1
ip link del veth985
echo 'bye'
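hooks
The hooks section of the configuration file specifies work to be done in the host environment at three stages: prestart, poststart, and poststop (before the container process starts, after it starts, and after the container is destroyed), for example configuring the container's network. Concretely, it names the programs to run at these stages together with their arguments and environment variables. Each program runs at its stage, and the container's current state is passed to it on stdin. Because the OCI specification does not cover creating network devices, this article uses hooks to create the container's network interface and to map the container's port 8080 (the Tomcat default) to port 4399 on the host.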
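prestart
The hook has to read the container's process ID (pid) from its input in order to move the network interface into the container's namespace, so it is written in C++ and compiled into an executable. The flow is read pid --> generate shell script --> exec(bash, script); the source code is listed later in this article.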
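The same flow can be sketched in Python to make the steps easier to follow. This is an illustration rather than the hook used in this article: the function name build_prestart_commands is hypothetical, the commands are copied from prestart.cpp (without sudo), and the state passed in the usage example is a made-up sample.

```python
import json


def build_prestart_commands(state_text, host_if="veth211",
                            container_if="veth985", uplink="ens33"):
    """Parse the container state from stdin text and return the setup commands."""
    state = json.loads(state_text)
    pid = int(state["pid"])  # pid of the container's init process
    return [
        # Create a veth pair and give the host end the gateway address.
        f"ip link add {host_if} type veth peer name {container_if}",
        f"ifconfig {host_if} 10.1.1.1/24 up",
        # Move the other end into the container's network namespace.
        f"ip link set {container_if} netns {pid}",
        # NAT and port-forwarding rules, as in prestart.cpp.
        "iptables -t nat -A POSTROUTING -s 10.1.1.0/24 ! -d 10.1.1.0/24 -j MASQUERADE",
        f"iptables -t nat -A PREROUTING -i {uplink} -p tcp --dport 4399 -j DNAT --to 10.1.1.2:8080",
        "iptables -A FORWARD -p tcp -d 10.1.1.1 --dport 8080 -j ACCEPT",
    ]


if __name__ == "__main__":
    sample_state = '{"ociVersion":"1.0.1-dev","id":"helo","pid":4242,"status":"created"}'
    for command in build_prestart_commands(sample_state):
        print(command)
```

Parsing the state with a JSON library avoids the manual substring search, at the cost of depending on a runtime that the C++ version does not need.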
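poststop
When the container exits, the virtual network device is removed automatically, so poststop only needs to delete the iptables rules (NAT and port mapping) that prestart added.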
sudo iptables -t nat -D PREROUTING -i ens33 -p tcp --dport 4399 -j DNAT --to 10.1.1.2:8080
sudo iptables -t filter -D FORWARD -p tcp -d 10.1.1.1 --dport 8080 -j ACCEPT
sudo iptables -t nat -D POSTROUTING -s 10.1.1.0/24 ! -d 10.1.1.0/24 -j MASQUERADE
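Each delete rule has to match the rule that prestart appended, so one way to keep the two in sync is to derive the -D form from the -A form. A minimal sketch, assuming the rule uses the short -A option; the helper name to_delete_rule is hypothetical:

```python
def to_delete_rule(add_rule):
    """Turn an `iptables ... -A CHAIN ...` command into the matching `-D` command."""
    tokens = add_rule.split()
    if "-A" not in tokens:
        raise ValueError("not an append rule: " + add_rule)
    # Only the action changes; the chain and match arguments stay identical.
    tokens[tokens.index("-A")] = "-D"
    return " ".join(tokens)
```

For example, passing the MASQUERADE rule from prestart through this function yields the corresponding POSTROUTING delete command.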
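Environment variables and permissions
The permission and environment-variable settings ensure that processes inside the container are allowed to configure network devices and to read and write directories, so that the container starts correctly. See config.json for details.
Complete config.json, prestart.cpp, and poststop.sh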
{
"ociVersion": "1.0.1-dev",
"process": {
"terminal": true,
"user": {
"uid": 0,
"gid": 0
},
"args": [
"/bin/ash",
"init.sh"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-11-openjdk/bin:/usr/lib/jvm/java-11-openjdk/jre/bin",
"JAVA_HOME=/usr/lib/jvm/java-11-openjdk/",
"JRE_HOME=/usr/lib/jvm/java-11-openjdk/jre/",
"TERM=xterm"
],
"cwd": "/",
"capabilities": {
"bounding": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_MKNOD",
"CAP_FOWNER",
"CAP_CHOWN",
"CAP_SYS_CHROOT",
"CAP_NET_BIND_SERVICE",
"CAP_NET_ADMIN",
"CAP_NET_RAW",
"CAP_SETUID",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETFCAP",
"CAP_SYS_ADMIN"
],
"effective": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_MKNOD",
"CAP_FOWNER",
"CAP_CHOWN",
"CAP_SYS_CHROOT",
"CAP_NET_BIND_SERVICE",
"CAP_NET_ADMIN",
"CAP_NET_RAW",
"CAP_SETUID",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETFCAP",
"CAP_SYS_ADMIN"
],
"inheritable": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_MKNOD",
"CAP_FOWNER",
"CAP_CHOWN",
"CAP_SYS_CHROOT",
"CAP_NET_BIND_SERVICE",
"CAP_NET_ADMIN",
"CAP_NET_RAW",
"CAP_SETUID",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETFCAP",
"CAP_SYS_ADMIN"
],
"permitted": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_MKNOD",
"CAP_FOWNER",
"CAP_CHOWN",
"CAP_SYS_CHROOT",
"CAP_NET_BIND_SERVICE",
"CAP_NET_ADMIN",
"CAP_NET_RAW",
"CAP_SETUID",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETFCAP",
"CAP_SYS_ADMIN"
],
"ambient": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_MKNOD",
"CAP_FOWNER",
"CAP_CHOWN",
"CAP_SYS_CHROOT",
"CAP_NET_BIND_SERVICE",
"CAP_NET_ADMIN",
"CAP_NET_RAW",
"CAP_SETUID",
"CAP_SETGID",
"CAP_SETPCAP",
"CAP_SETFCAP",
"CAP_SYS_ADMIN"
]
},
"rlimits": [
{
"type": "RLIMIT_NOFILE",
"hard": 1024,
"soft": 1024
}
]
},
"root": {
"path": "rootfs",
"readonly": false
},
"hostname": "runc",
"mounts": [
{
"destination": "/proc",
"type": "proc",
"source": "proc"
},
{
"destination": "/dev",
"type": "tmpfs",
"source": "tmpfs",
"options": [
"nosuid",
"strictatime",
"mode=755",
"size=65536k"
]
},
{
"destination": "/dev/pts",
"type": "devpts",
"source": "devpts",
"options": [
"nosuid",
"noexec",
"newinstance",
"ptmxmode=0666",
"mode=0620",
"gid=5"
]
},
{
"destination": "/dev/shm",
"type": "tmpfs",
"source": "shm",
"options": [
"nosuid",
"noexec",
"nodev",
"mode=1777",
"size=65536k"
]
},
{
"destination": "/dev/mqueue",
"type": "mqueue",
"source": "mqueue",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/sys",
"type": "sysfs",
"source": "sysfs",
"options": [
"nosuid",
"noexec",
"nodev",
"ro"
]
},
{
"destination": "/sys/fs/cgroup",
"type": "cgroup",
"source": "cgroup",
"options": [
"nosuid",
"noexec",
"nodev",
"relatime",
"ro"
]
}
],
"hooks": {
"prestart": [
{
"path": "./prestart.out",
"args": [
"prestart.out"
]
}
],
"poststop": [
{
"path": "/bin/bash",
"args": [
"bash",
"./poststop.sh"
]
}
]
},
"linux": {
"resources": {
"devices": [
{
"allow": false,
"access": "rwm"
}
]
},
"namespaces": [
{
"type": "pid"
},
{
"type": "network"
},
{
"type": "ipc"
},
{
"type": "uts"
},
{
"type": "mount"
}
],
"maskedPaths": [
"/proc/acpi",
"/proc/asound",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/sys/firmware",
"/proc/scsi"
],
"readonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
}
}
// prestart.cpp
#include <unistd.h>

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <iterator>
#include <string>

using namespace std;

FILE *fp_log = nullptr;

// Write an optional message to the log, close it, and abort the hook.
void fail(const string &msg = "") {
    if (fp_log != nullptr) {
        if (!msg.empty()) fprintf(fp_log, "%s\n", msg.c_str());
        fclose(fp_log);
    }
    exit(EXIT_FAILURE);
}

int main(int argc, char const *argv[]) {
    fp_log = fopen("./pre_start.log", "w");
    // runc passes the container state (JSON) on stdin; read all of it.
    string container_state((istreambuf_iterator<char>(cin)), istreambuf_iterator<char>());
    if (fp_log != nullptr) fprintf(fp_log, "%s\n", container_state.c_str());
    if (container_state.empty()) fail("container state size 0");

    // Extract the value that follows "pid": up to the next ',' or '}'.
    const string key = "\"pid\":";
    size_t st = container_state.find(key);
    if (st == string::npos) fail("cannot find 'pid' inside state");
    st += key.size();
    size_t ed = container_state.find_first_of(",}", st);
    if (ed == string::npos) fail("cannot find ',' after 'pid' inside state");
    string pid_netns = container_state.substr(st, ed - st);

    string veth1 = "veth211";  // host end of the veth pair
    string veth2 = "veth985";  // container end, configured by init.sh
    string lines[] = {
        "sudo ip link add " + veth1 + " type veth peer name " + veth2,
        "sudo ifconfig " + veth1 + " 10.1.1.1/24 up",
        "sudo ip link set " + veth2 + " netns " + pid_netns,
        "sudo iptables -t nat -A POSTROUTING -s 10.1.1.0/24 ! -d 10.1.1.0/24 -j MASQUERADE",
        "sudo iptables -A PREROUTING -t nat -i ens33 -p tcp --dport 4399 -j DNAT --to 10.1.1.2:8080",
        "sudo iptables -A FORWARD -p tcp -d 10.1.1.1 --dport 8080 -j ACCEPT"
    };

    // Write the commands to a script, then replace this process with bash running it.
    string hook_path = "./prestart.sh";
    FILE *fp_sh = fopen(hook_path.c_str(), "w");
    if (fp_sh == nullptr) fail("cannot create " + hook_path);
    for (auto &&line : lines) {
        fprintf(fp_sh, "%s\n", line.c_str());
    }
    fclose(fp_sh);
    if (fp_log != nullptr) fclose(fp_log);
    execl("/bin/bash", "bash", hook_path.c_str(), (char *)nullptr);
    return EXIT_FAILURE;  // reached only if execl fails
}
poststop.sh
sudo iptables -t nat -D PREROUTING -i ens33 -p tcp --dport 4399 -j DNAT --to 10.1.1.2:8080
sudo iptables -t filter -D FORWARD -p tcp -d 10.1.1.1 --dport 8080 -j ACCEPT
sudo iptables -t nat -D POSTROUTING -s 10.1.1.0/24 ! -d 10.1.1.0/24 -j MASQUERADE
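Analyzing container resource usage
runc events --interval 1s id continuously reports statistics on the container's CPU, memory, and I/O usage. ab is the server load-testing tool provided by Apache; it is installed together with the Apache HTTP server or a related package, and is used as ab -c numOfConcurrency -n numOfRequests http://yourweb.com/path. The analysis proceeds as follows.
- In a new terminal, run runc events to record the container's statistics (helo is the container ID used here):
sudo runc events --interval 0.01s helo > perf_analysis/runc_events.txt
- In a new terminal, run the load test with ApacheBench:
ab -c 1000 -n 6000 http://10.1.1.2:8080/ > perf_analysis/ab_out.txt
- Analyze the data with Python: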
import json

import matplotlib.pyplot as plt

x_axis = []
cpu_usage = []
mem_usage = []
act_anon = []
pgfault = []
pgpgin = []
pgpgout = []
rss = []
pids_current = []


def read_data():
    i = 1
    with open("./runc_events.txt", encoding="utf-8", mode="r") as fp:
        lines = fp.readlines()
        mb = 1024 * 1024
        for line in lines:
            # Stop at the first blank line (readlines() keeps the trailing newline).
            if line.strip() == "":
                break
            data_line = json.loads(line)["data"]
            # Total CPU time is reported in nanoseconds; convert to milliseconds.
            cpu_usage.append(int(data_line["cpu"]["usage"]["total"]) / 1000000)
            mem_usage.append(data_line["memory"]["usage"]["usage"] / mb)
            data_line_memraw = data_line["memory"]["raw"]
            act_anon.append(int(data_line_memraw["active_anon"]) / mb)
            pgfault.append(int(data_line_memraw["pgfault"]))
            pgpgin.append(int(data_line_memraw["pgpgin"]))
            pgpgout.append(int(data_line_memraw["pgpgout"]))
            rss.append(int(data_line_memraw["rss"]) / mb)
            pids_current.append(int(data_line["pids"]["current"]))
            x_axis.append(i)
            i += 1


def subplt(rows, index, title, x, y, cls=1):
    plt.subplot(rows, cls, index)
    plt.plot(x, y)
    plt.title(title)


def show():
    plt.figure(figsize=(30, 80))
    rows, cols = 4, 1
    plt.subplot(rows, cols, 1)
    plt.plot(x_axis, cpu_usage)
    plt.title('cpu usage(ms)')
    plt.subplot(rows, cols, 2)
    plt.plot(x_axis, mem_usage)
    plt.title('memory usage(mb)')
    plt.subplot(rows, cols, 3)
    plt.plot(x_axis, pgfault)
    plt.title('memory raw page fault')
    plt.subplot(rows, cols, 4)
    plt.plot(x_axis, pids_current)
    plt.title('pids current')
    # plt.show()
    plt.savefig("./events_runc_tomcat.png")


read_data()
show()
The final result is shown in the figure below.
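One note on reading the CPU curve: the total CPU time reported for a cgroup is a running total, so the plotted line can only rise. To see how much CPU time was spent in each sampling interval, the differences between consecutive samples can be plotted instead. A minimal sketch, where per_sample_deltas is a hypothetical helper and not part of the script above:

```python
def per_sample_deltas(cumulative):
    """Convert a cumulative series into the increase between consecutive samples."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]
```

The result has one element fewer than the input, so it would be plotted against x_axis[1:] rather than x_axis.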