Running a Tomcat Container with runc and Analyzing Its Resource Usage

Running a Tomcat container with runc and checking its status

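Although Docker has become almost synonymous with containers, you do not need Docker to create a container environment. runc is a command-line container tool that uses the same "engine" as Docker: libcontainer. Compared with Docker, runc sits at a lower level and mainly provides OCI-compliant container creation, execution, and state queries. This article uses runc to build, step by step, an Alpine Linux container that runs Tomcat.

Note: all scripts and code in this article were run successfully on 64-bit Ubuntu 18.04 Server.

Preparation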

runc

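On Ubuntu, install runc with sudo apt install runc, or download the executable from the official releases on GitHub. Building from the Go source also works, and that is the approach I took.

Alpine Linux filesystem

To create a container, an underlying root filesystem is indispensable. This root filesystem is simply the set of files and directories a system needs in order to run, such as /sys, /proc, and /dev. Here we use the Alpine Mini Root Filesystem. Alpine Linux is a Linux distribution just like Ubuntu or Fedora, but it has been pared down, which makes it well suited to devices with limited storage.

After downloading the filesystem, extract it into a folder named rootfs, for example /home/yourname/alpine-bundle/rootfs/. Once extracted, rootfs will contain folders such as /bin, /dev, and /etc.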

Tomcat

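Download Tomcat from its official website. After downloading, extract it and place it in the rootfs created above. The rootfs folder should then contain the following: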

bin  dev  etc  home lib  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  tomcat-9.0.29  usr  var

Here tomcat-9.0.29 is the extracted Tomcat folder.
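Before moving on, it can help to confirm that rootfs looks as expected. The following is a minimal sketch and not part of the original setup; the function name missing_rootfs_entries and the REQUIRED list are assumptions chosen for illustration, and the demo runs against a throwaway directory rather than a real bundle.

```python
import os
import tempfile

# Entries this article expects inside rootfs (an illustrative subset).
REQUIRED = ["bin", "dev", "etc", "proc", "sys", "tomcat-9.0.29"]


def missing_rootfs_entries(rootfs, required=REQUIRED):
    """Return the required directories that are absent from rootfs."""
    return [name for name in required
            if not os.path.isdir(os.path.join(rootfs, name))]


if __name__ == "__main__":
    # Demo on a temporary directory; point this at your own rootfs instead.
    with tempfile.TemporaryDirectory() as rootfs:
        for name in ["bin", "dev", "etc"]:
            os.mkdir(os.path.join(rootfs, name))
        print(missing_rootfs_entries(rootfs))
```

An empty list means every expected entry is present.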

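Configuring config.json

The OCI runtime specification describes the container runtime lifecycle in detail and defines the format and meaning of the container configuration file; see the OCI repository on GitHub for details. When runc starts a container, a config.json file must exist in the bundle (the folder that contains rootfs). Its contents are defined by the OCI specification and describe the program to run once the container starts, the devices to mount, and so on.

Running runc spec in the bundle directory generates a config.json automatically, which you can then modify. The Tomcat container built here is essentially an Alpine Linux container with Tomcat started inside it. The Tomcat, Python, and MySQL containers in Docker work in a similar way, except that they are based on Debian (a Linux distribution) rather than Alpine.

The complete config.json is given at the end of the article.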
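The edits to the generated file can also be scripted. The sketch below is one possible way to apply the changes this article makes (start command, writable root, hooks); the function name customize_spec is hypothetical, and the small dictionary in the demo is only a stand-in for the much larger file that runc spec produces.

```python
import copy
import json


def customize_spec(spec):
    """Return a copy of an OCI spec adjusted the way this article describes."""
    spec = copy.deepcopy(spec)
    # Run the start-up script instead of the default shell.
    spec.setdefault("process", {})["args"] = ["/bin/ash", "init.sh"]
    # init.sh installs packages, so the root filesystem must be writable.
    spec.setdefault("root", {})["readonly"] = False
    # Host-side programs that set up and tear down the container network.
    spec["hooks"] = {
        "prestart": [{"path": "./prestart.out", "args": ["prestart.out"]}],
        "poststop": [{"path": "/bin/bash", "args": ["bash", "./poststop.sh"]}],
    }
    return spec


if __name__ == "__main__":
    # Stand-in for the generated file; in practice load the real one,
    # e.g. with json.load(open("config.json")).
    generated = {
        "process": {"args": ["sh"]},
        "root": {"path": "rootfs", "readonly": True},
    }
    print(json.dumps(customize_spec(generated), indent=2))
```

Working on a deep copy keeps the generated spec untouched, which makes it easy to compare the two versions.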

process

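Specifies the program to run when the container starts. For the Tomcat container, the JDK must be installed and JAVA_HOME and JRE_HOME configured after the container starts, and only then is Tomcat's startup.sh executed. The start-up script used in this article is shown below. The start command is /bin/ash init.sh (as set in config.json), and init.sh must be copied into rootfs, or written directly there, before the container is started.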

init.sh

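The first part is preparation: network interface settings, the routing table, package repositories, the JDK, and so on. The script then starts Tomcat and /bin/ash; running ps in that shell shows the running Tomcat process.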

# setup network and start /bin/ash
ifconfig veth985 up
ifconfig veth985 10.1.1.2
ifconfig veth985 netmask 255.255.255.0
route add default gw 10.1.1.1
echo 'nameserver 202.38.64.56' > /etc/resolv.conf
echo -e 'http://mirrors.ustc.edu.cn/alpine/v3.10/main\nhttp://mirrors.ustc.edu.cn/alpine/v3.10/community' > /etc/apk/repositories
echo 'installing openjdk...'
apk add openjdk11
echo 'jdk installed, starting tomcat-9.0.29'
./tomcat-9.0.29/bin/startup.sh
/bin/ash
echo 'stopping'
apk del openjdk11
echo 'jdk uninstalled'
route del default gw 10.1.1.1
ip link del veth985
echo 'bye'

hooks

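hooks are part of the configuration file. They specify the work the host environment must perform at three stages, prestart, poststart, and poststop (before the container process starts, after it starts, and after the container is destroyed), such as configuring the container network. Concretely, each hook names a program to execute along with its arguments and environment variables. These programs run at the three stages above, and the container's current state is sent to them on stdin. Because the OCI specification does not cover the creation of network devices, this article uses hooks to create a network interface for the container and to map the container's port 8080 (Tomcat's default) to port 4399 on the host.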

prestart

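Because the hook must read the container's process ID (pid) from its input in order to move the network interface into the container's namespace, it is written in C++ and compiled into an executable. The flow is read pid --> generate shell script --> exec(bash, script); the source code is given below.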
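To make the read pid --> generate script flow concrete, here is a rough Python sketch of the same idea. It is not the article's hook (which is the C++ program): the function name build_prestart_commands is hypothetical, the sample state string is made up, and the state is parsed as JSON instead of being scanned by hand. Only the interface-related commands are shown.

```python
import json


def build_prestart_commands(state_json, host_if="veth211", container_if="veth985"):
    """Parse the container state and return the host-side network commands."""
    state = json.loads(state_json)
    pid = int(state["pid"])
    return [
        # Create a veth pair: one end stays on the host, one goes to the container.
        f"ip link add {host_if} type veth peer name {container_if}",
        f"ifconfig {host_if} 10.1.1.1/24 up",
        # Move the container end into the network namespace of the container pid.
        f"ip link set {container_if} netns {pid}",
    ]


if __name__ == "__main__":
    # Hypothetical sample of the state a hook might receive on stdin.
    sample = '{"ociVersion":"1.0.1-dev","id":"helo","pid":4242,"bundle":"/tmp/bundle"}'
    for cmd in build_prestart_commands(sample):
        print(cmd)
```

A real hook would read the state with sys.stdin.read() and then execute the commands with root privileges.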

poststop

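When the container exits, the virtual network interface is removed automatically, so poststop only needs to delete the forwarding and port-mapping rules that prestart added to iptables.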

sudo iptables -t nat -D PREROUTING -i ens33 -p tcp --dport 4399 -j DNAT --to 10.1.1.2:8080
sudo iptables -t filter -D FORWARD -p tcp -d 10.1.1.1 --dport 8080 -j ACCEPT
sudo iptables -t nat -D POSTROUTING -s 10.1.1.0/24 ! -d 10.1.1.0/24 -j MASQUERADE

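Environment variables and permissions

The permission and environment-variable settings ensure that processes inside the container are allowed to configure network devices and to read and write folders, so that the container can start normally. See config.json for details.

Complete config.json, prestart.cpp, and poststop.sh

config.json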
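The network setup in init.sh (ifconfig, route) needs CAP_NET_ADMIN inside the container. As a small sanity check, the sketch below reports which capability sets of a spec lack a given capability; the function name sets_missing_capability and the trimmed spec fragment in the demo are assumptions made for illustration.

```python
CAP_SETS = ["bounding", "effective", "inheritable", "permitted", "ambient"]


def sets_missing_capability(spec, cap="CAP_NET_ADMIN"):
    """Return the capability sets in the spec that do not list `cap`."""
    caps = spec.get("process", {}).get("capabilities", {})
    return [name for name in CAP_SETS if cap not in caps.get(name, [])]


if __name__ == "__main__":
    # Trimmed, hypothetical spec fragment used only for demonstration.
    spec = {"process": {"capabilities": {
        "bounding": ["CAP_KILL", "CAP_NET_ADMIN"],
        "effective": ["CAP_KILL"],
    }}}
    print(sets_missing_capability(spec))
```

Run against the full config.json from this article, the function should return an empty list, since all five sets include CAP_NET_ADMIN.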

{
	"ociVersion": "1.0.1-dev",
	"process": {
		"terminal": true,
		"user": {
			"uid": 0,
			"gid": 0
		},
		"args": [
			"/bin/ash",
			"init.sh"
		],
		"env": [
			"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/java-11-openjdk/bin:/usr/lib/jvm/java-11-openjdk/jre/bin",
			"JAVA_HOME=/usr/lib/jvm/java-11-openjdk/",
			"JRE_HOME=/usr/lib/jvm/java-11-openjdk/jre/",
			"TERM=xterm"
		],
		"cwd": "/",
		"capabilities": {
			"bounding": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_MKNOD",
				"CAP_FOWNER",
				"CAP_CHOWN",
				"CAP_SYS_CHROOT",
				"CAP_NET_BIND_SERVICE",
				"CAP_NET_ADMIN",
				"CAP_NET_RAW",
				"CAP_SETUID",
				"CAP_SETGID",
				"CAP_SETPCAP",
				"CAP_SETFCAP",
				"CAP_SYS_ADMIN"
			],
			"effective": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_MKNOD",
				"CAP_FOWNER",
				"CAP_CHOWN",
				"CAP_SYS_CHROOT",
				"CAP_NET_BIND_SERVICE",
				"CAP_NET_ADMIN",
				"CAP_NET_RAW",
				"CAP_SETUID",
				"CAP_SETGID",
				"CAP_SETPCAP",
				"CAP_SETFCAP",
				"CAP_SYS_ADMIN"
			],
			"inheritable": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_MKNOD",
				"CAP_FOWNER",
				"CAP_CHOWN",
				"CAP_SYS_CHROOT",
				"CAP_NET_BIND_SERVICE",
				"CAP_NET_ADMIN",
				"CAP_NET_RAW",
				"CAP_SETUID",
				"CAP_SETGID",
				"CAP_SETPCAP",
				"CAP_SETFCAP",
				"CAP_SYS_ADMIN"
			],
			"permitted": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_MKNOD",
				"CAP_FOWNER",
				"CAP_CHOWN",
				"CAP_SYS_CHROOT",
				"CAP_NET_BIND_SERVICE",
				"CAP_NET_ADMIN",
				"CAP_NET_RAW",
				"CAP_SETUID",
				"CAP_SETGID",
				"CAP_SETPCAP",
				"CAP_SETFCAP",
				"CAP_SYS_ADMIN"
			],
			"ambient": [
				"CAP_AUDIT_WRITE",
				"CAP_KILL",
				"CAP_MKNOD",
				"CAP_FOWNER",
				"CAP_CHOWN",
				"CAP_SYS_CHROOT",
				"CAP_NET_BIND_SERVICE",
				"CAP_NET_ADMIN",
				"CAP_NET_RAW",
				"CAP_SETUID",
				"CAP_SETGID",
				"CAP_SETPCAP",
				"CAP_SETFCAP",
				"CAP_SYS_ADMIN"
			]
		},
		"rlimits": [
			{
				"type": "RLIMIT_NOFILE",
				"hard": 1024,
				"soft": 1024
			}
		]
	},
	"root": {
		"path": "rootfs",
		"readonly": false
	},
	"hostname": "runc",
	"mounts": [
		{
			"destination": "/proc",
			"type": "proc",
			"source": "proc"
		},
		{
			"destination": "/dev",
			"type": "tmpfs",
			"source": "tmpfs",
			"options": [
				"nosuid",
				"strictatime",
				"mode=755",
				"size=65536k"
			]
		},
		{
			"destination": "/dev/pts",
			"type": "devpts",
			"source": "devpts",
			"options": [
				"nosuid",
				"noexec",
				"newinstance",
				"ptmxmode=0666",
				"mode=0620",
				"gid=5"
			]
		},
		{
			"destination": "/dev/shm",
			"type": "tmpfs",
			"source": "shm",
			"options": [
				"nosuid",
				"noexec",
				"nodev",
				"mode=1777",
				"size=65536k"
			]
		},
		{
			"destination": "/dev/mqueue",
			"type": "mqueue",
			"source": "mqueue",
			"options": [
				"nosuid",
				"noexec",
				"nodev"
			]
		},
		{
			"destination": "/sys",
			"type": "sysfs",
			"source": "sysfs",
			"options": [
				"nosuid",
				"noexec",
				"nodev",
				"ro"
			]
		},
		{
			"destination": "/sys/fs/cgroup",
			"type": "cgroup",
			"source": "cgroup",
			"options": [
				"nosuid",
				"noexec",
				"nodev",
				"relatime",
				"ro"
			]
		}
	],
	"hooks": {
		"prestart": [
			{
				"path": "./prestart.out",
				"args": [
					"prestart.out"
				]
			}
		],
		"poststop": [
			{
				"path": "/bin/bash",
				"args": [
					"bash",
					"./poststop.sh"
				]
			}
		]
	},
	"linux": {
		"resources": {
			"devices": [
				{
					"allow": false,
					"access": "rwm"
				}
			]
		},
		"namespaces": [
			{
				"type": "pid"
			},
			{
				"type": "network"
			},
			{
				"type": "ipc"
			},
			{
				"type": "uts"
			},
			{
				"type": "mount"
			}
		],
		"maskedPaths": [
			"/proc/acpi",
			"/proc/asound",
			"/proc/kcore",
			"/proc/keys",
			"/proc/latency_stats",
			"/proc/timer_list",
			"/proc/timer_stats",
			"/proc/sched_debug",
			"/sys/firmware",
			"/proc/scsi"
		],
		"readonlyPaths": [
			"/proc/bus",
			"/proc/fs",
			"/proc/irq",
			"/proc/sys",
			"/proc/sysrq-trigger"
		]
	}
}

prestart.cpp

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
#include <unistd.h>

using namespace std;

FILE *fp_log = nullptr;

// Write an optional message to the log, close it, and exit with failure.
void fail(const string &msg = "") {
    if (fp_log != nullptr) {
        if (!msg.empty()) fprintf(fp_log, "%s\n", msg.c_str());
        fclose(fp_log);
        fp_log = nullptr;
    }
    exit(EXIT_FAILURE);
}

int main(int argc, char const *argv[]) {
    fp_log = fopen("./pre_start.log", "w");
    // runc sends the container state (JSON) to the hook on stdin.
    string container_state;
    cin >> container_state;
    if (fp_log != nullptr) fprintf(fp_log, "%s\n", container_state.c_str());
    const int sz_container_state = container_state.size();
    if (sz_container_state <= 0) fail("container state size 0");
    // Find the "pid" key in the state string.
    int st = 0;
    while (st + 2 < sz_container_state && container_state.substr(st, 3) != "pid") st++;
    if (st + 2 >= sz_container_state) fail("cannot find 'pid' inside state");
    // Skip over `pid":` so that st points at the first digit.
    st = st + 5;
    int ed = st;
    while (ed < sz_container_state && container_state[ed] != ',') ed++;
    if (ed >= sz_container_state) fail("cannot find ',' after 'pid' inside state");
    string pid_netns = container_state.substr(st, ed - st);
    string veth1 = "veth211";  // host end of the veth pair
    string veth2 = "veth985";  // container end of the veth pair
    string lines[] = {
        "sudo ip link add " + veth1 + " type veth peer name " + veth2,
        "sudo ifconfig " + veth1 + " 10.1.1.1/24 up",
        "sudo ip link set " + veth2 + " netns " + pid_netns,
        "sudo iptables -t nat -A POSTROUTING -s 10.1.1.0/24 ! -d 10.1.1.0/24 -j MASQUERADE",
        "sudo iptables -A PREROUTING -t nat -i ens33 -p tcp --dport 4399 -j DNAT --to 10.1.1.2:8080",
        "sudo iptables -A FORWARD -p tcp -d 10.1.1.1 --dport 8080 -j ACCEPT"
    };
    // Write the commands to a shell script, then replace this process with bash.
    string hook_path = "./prestart.sh";
    FILE *fp_sh = fopen(hook_path.c_str(), "w");
    if (fp_sh == nullptr) fail("cannot create prestart.sh");
    for (auto &&line : lines) {
        fprintf(fp_sh, "%s\n", line.c_str());
    }
    fclose(fp_sh);
    if (fp_log != nullptr) fclose(fp_log);

    execl("/bin/bash", "bash", hook_path.c_str(), nullptr);
    // execl only returns if it failed.
    return EXIT_FAILURE;
}

poststop.sh

sudo iptables -t nat -D PREROUTING -i ens33 -p tcp --dport 4399 -j DNAT --to 10.1.1.2:8080
sudo iptables -t filter -D FORWARD -p tcp -d 10.1.1.1 --dport 8080 -j ACCEPT
sudo iptables -t nat -D POSTROUTING -s 10.1.1.0/24 ! -d 10.1.1.0/24 -j MASQUERADE

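Analyzing container resource usage

runc events --interval 1s id continuously reports statistics on the container's CPU, memory, and I/O usage. ab is the server load-testing tool provided by Apache; it becomes available after installing the Apache server or a related package, and is used as ab -c numOfConcurrency -n numOfRequests http://yourweb.com/path. The analysis proceeds as follows.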

  • New terminal: run runc events to watch the container's statistics
    sudo runc events --interval 0.01s helo > perf_analysis/runc_events.txt
    
  • New terminal: benchmark the server with ApacheBench
    ab -c 1000 -n 6000 http://10.1.1.2:8080/ > perf_analysis/ab_out.txt
    
  • Analyze the data with Python
    import json
    import matplotlib.pyplot as plt

    x_axis = []
    cpu_usage = []
    mem_usage = []
    act_anon = []
    pgfault = []
    pgpgin = []
    pgpgout = []
    rss = []
    pids_current = []

    def read_data():
        """Read one JSON event per line and collect the metrics of interest."""
        i = 1
        mb = 1024 * 1024
        with open("./runc_events.txt", encoding="utf-8", mode="r") as fp:
            for line in fp:
                # Skip blank lines (for example a trailing newline at the end).
                if not line.strip():
                    continue
                data_line = json.loads(line)["data"]
                # CPU time is divided by 10^6 and plotted as milliseconds.
                cpu_usage.append(
                    int(data_line["cpu"]["usage"]["total"]) / 1000000)
                mem_usage.append(data_line["memory"]["usage"]["usage"] / mb)
                data_line_memraw = data_line["memory"]["raw"]
                act_anon.append(int(data_line_memraw["active_anon"]) / mb)
                pgfault.append(int(data_line_memraw["pgfault"]))
                pgpgin.append(int(data_line_memraw["pgpgin"]))
                pgpgout.append(int(data_line_memraw["pgpgout"]))
                rss.append(int(data_line_memraw["rss"]) / mb)
                pids_current.append(int(data_line["pids"]["current"]))
                x_axis.append(i)
                i += 1

    def subplt(rows, index, title, x, y, cols=1):
        """Draw one titled line plot in the given subplot slot."""
        plt.subplot(rows, cols, index)
        plt.plot(x, y)
        plt.title(title)

    def show():
        plt.figure(figsize=(30, 80))
        rows = 4
        subplt(rows, 1, 'cpu usage(ms)', x_axis, cpu_usage)
        subplt(rows, 2, 'memory usage(mb)', x_axis, mem_usage)
        subplt(rows, 3, 'memory raw page fault', x_axis, pgfault)
        subplt(rows, 4, 'pids current', x_axis, pids_current)
        # plt.show()
        plt.savefig("./events_runc_tomcat.png")

    read_data()
    show()
    

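The final result is shown in the figure below.
[Figure: events_runc_tomcat.png, with panels for CPU usage (ms), memory usage (MB), page faults, and current pids]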
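The CPU time and page-fault values collected above are cumulative counters, so their curves can only rise. If the load per sampling interval is of more interest, the increments between consecutive samples can be plotted instead. The helper below is a minimal sketch of that idea; the name usage_deltas and the sample numbers are made up for illustration.

```python
def usage_deltas(cumulative):
    """Turn a list of cumulative counter readings into per-sample increments."""
    return [later - earlier
            for earlier, later in zip(cumulative, cumulative[1:])]


if __name__ == "__main__":
    # Made-up cumulative CPU times in ms, for illustration only.
    samples = [100.0, 100.0, 130.0, 190.0]
    print(usage_deltas(samples))
```

The result has one element fewer than the input, so it should be plotted against x_axis[1:] when used with the script above.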
