MemSQL Translation, Day 2: Installation and Configuration Tuning

Original article: http://docs.memsql.com/docs/installation-best-practices

Installation Best Practices

This section lists the best practices for properly configuring your MemSQL installation.

MemSQL Database Configuration

Configure Linux ulimit settings

Most Linux operating systems provide ways to control the usage of system resources such as threads, files and network connections at an individual user or process level. The per-user limits on resources are called ulimits, and they prevent a single user from consuming too many system resources. For optimal performance, MemSQL recommends setting ulimits to higher values than the Linux defaults. The ulimit settings can be configured in the /etc/security/limits.conf file, or directly via shell commands.

Increase File Descriptor Limit

The MemSQL cluster uses a substantial number of client and server connections between aggregators and leaves to run queries and cluster operations. We recommend setting the Linux file descriptor limit to the highest possible value (at least 64,000) to account for these connections. Failing to increase this limit can significantly degrade performance and even cause connection limit errors.

Permanently increase this limit for all users by editing the /etc/security/limits.conf file as root and adding the lines:

    * soft NOFILE 1000000
    * hard NOFILE 1000000

Alternatively, you can set the value for your current session by running the following command in your shell:

    ulimit -n 1000000

For more information about setting the file descriptor limit on Linux, see:
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files
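To confirm that a new limit is in effect, the current shell's soft and hard file descriptor limits can be compared with the 64,000 minimum recommended above. The following is only a sketch: the helper names limit_ok and report are made up for this example and are not part of MemSQL.

```shell
#!/bin/sh
# Minimum file descriptor limit recommended in the text above.
MIN_NOFILE=64000

# limit_ok VALUE MINIMUM: succeed if VALUE is "unlimited" or >= MINIMUM.
# (Hypothetical helper, named for this example only.)
limit_ok() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# report LABEL VALUE: print whether VALUE satisfies MIN_NOFILE.
report() {
    if limit_ok "$2" "$MIN_NOFILE"; then
        echo "$1 limit $2: ok"
    else
        echo "$1 limit $2: below $MIN_NOFILE"
    fi
}

# ulimit -Sn / -Hn print the soft and hard limits of the current shell.
report soft "$(ulimit -Sn)"
report hard "$(ulimit -Hn)"
```

The check only describes the shell it runs in. A limit added to /etc/security/limits.conf normally shows up only in sessions started after the change.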

Increase Maximum Process Limit

MemSQL also recommends increasing the number of processes that users are allowed to run. Permanently increase this limit for all users by editing the /etc/security/limits.conf file as root and adding the lines:

    * soft NPROC 128000
    * hard NPROC 128000

Alternatively, you can set the value for your current session by running the following command in your shell:

    ulimit -u 128000
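After editing the file, it can help to read back what it actually sets. The sketch below prints the value configured for the "*" domain in a limits.conf-style file. The function name limits_value is an assumption of this example, the match is case-insensitive because the entries above are written in uppercase, and only a single file is inspected.

```shell
#!/bin/sh
# limits_value FILE TYPE ITEM: print the last value FILE sets for domain "*"
# with the given type (soft or hard) and item (e.g. nproc), skipping comments.
# (Hypothetical helper, named for this example only.)
limits_value() {
    awk -v type="$2" -v item="$3" '
        /^[[:space:]]*#/ { next }
        $1 == "*" && tolower($2) == type && tolower($3) == item { value = $4 }
        END { if (value != "") print value }
    ' "$1"
}

conf=/etc/security/limits.conf
if [ -r "$conf" ]; then
    for item in nofile nproc; do
        for type in soft hard; do
            echo "$type $item: $(limits_value "$conf" "$type" "$item")"
        done
    done
fi
```

An empty value in the output means the file contains no matching entry for that type and item.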

Configure Linux vm settings

MemSQL recommends applying the following settings via sysctl to minimize the likelihood of memory errors. You can use the /sbin/sysctl command to view, set and automate settings in the /proc/sys/ directory. Permanently set these variables by editing the /etc/sysctl.conf file as root and adding these lines:

    vm.max_map_count=1000000000
    vm.min_free_kbytes=500000
    vm.swappiness=10

If MemSQL Ops is used to configure the vm setting defaults, it will set vm.min_free_kbytes to the minimum of 1% of system RAM and 4 GB.
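The "minimum of 1% of system RAM and 4 GB" rule is easy to work out by hand. The sketch below does the arithmetic in kB. It assumes that "4 GB" means 4 GiB (4194304 kB) and that the result is rounded down. MemSQL Ops may compute the value differently, and min_free_kbytes is just a name chosen for this example.

```shell
#!/bin/sh
# min_free_kbytes TOTAL_KB: print min(1% of TOTAL_KB, 4 GiB expressed in kB).
# (Hypothetical helper; the 4 GiB reading of "4 GB" is an assumption.)
min_free_kbytes() (
    one_percent=$(( $1 / 100 ))
    cap=$(( 4 * 1024 * 1024 ))
    if [ "$one_percent" -lt "$cap" ]; then
        echo "$one_percent"
    else
        echo "$cap"
    fi
)

# MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    if [ -n "$total_kb" ]; then
        echo "vm.min_free_kbytes=$(min_free_kbytes "$total_kb")"
    fi
fi
```

For a host with 16 GiB of RAM (16777216 kB) the rule gives 167772 kB, and the 4 GiB cap only applies to hosts with more than 400 GiB of RAM.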

Create swap files

It is recommended that you explicitly create a swap file so that the operating system does not immediately start killing processes if MemSQL runs out of memory. Here is a sample shell script that creates the swap file:

    snapshot_dir="/var/lib/memsql/data/snapshots"
    snapshot_file="$snapshot_dir/memsql.swp"
    mkdir -p $snapshot_dir
    # Allocate a 10240 MB file and restrict it to root
    dd if=/dev/zero of=$snapshot_file bs=1M count=10240
    chmod 600 $snapshot_file
    # Format the file as swap and enable it now
    mkswap $snapshot_file
    swapon $snapshot_file
    # Enable it again on every boot
    echo "$snapshot_file swap swap defaults 0 0" | tee -a /etc/fstab

For more information on configuring swap files, see:
http://www.cyberciti.biz/faq/linux-add-a-swap-file-howto/
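Whether swap is actually active can be read from the SwapTotal line of /proc/meminfo. This is a minimal sketch, and swap_total_kb is a name made up for this example.

```shell
#!/bin/sh
# swap_total_kb FILE: print the SwapTotal value (in kB) from a meminfo-style file.
# (Hypothetical helper, named for this example only.)
swap_total_kb() {
    awk '/^SwapTotal:/ { print $2 }' "$1"
}

meminfo=/proc/meminfo
if [ -r "$meminfo" ]; then
    swap_kb=$(swap_total_kb "$meminfo")
    if [ "${swap_kb:-0}" -gt 0 ]; then
        echo "swap configured: ${swap_kb} kB"
    else
        echo "no swap configured"
    fi
fi
```

Running it before and after the swapon step shows whether the new swap file was picked up.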

Ensure port 3306 is available on every cluster host

The default port used by MemSQL is 3306, which is configurable. For the smoothest user experience, make sure this port is accessible and unoccupied. If another process is already running on the system and occupying port 3306, you will see an error in MemSQL similar to:

    120501 3:04:15 [ERROR] Can't start server: Bind on TCP/IP port: Address already in use
    120501 3:04:15 [ERROR] Do you already have another mysqld server running on port: 3306 ?
    120501 3:04:15 [ERROR] Aborting

To work around this, update the MemSQL port value in the memsql.cnf file to an available port. Thereafter, connect to the database using the new port, i.e.

    mysql -u root -h 127.0.0.1 -P <port> --prompt="memsql> "
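Before starting MemSQL, the port can be checked against the list of listening TCP sockets. The sketch below filters the output of ss -ltn, whose fourth column is the local address and port. The function name port_listening is an assumption of this example, and the check is skipped when ss is not installed.

```shell
#!/bin/sh
# port_listening PORT: read `ss -ltn`-style lines on stdin and succeed if a
# socket is listening on PORT. (Hypothetical helper for this example.)
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")      # the port is the last ":" field
            if (parts[n] == port) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

if command -v ss >/dev/null 2>&1; then
    if ss -ltn | port_listening 3306; then
        echo "port 3306 is already in use"
    else
        echo "port 3306 is free"
    fi
fi
```

The same function works for any other port, for example the one chosen as a replacement in memsql.cnf.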

Recommendations for Optimal On-Premise Columnstore Performance

We recommend the P3600/P3700 Intel SSDs. Update the devices to the latest firmware using the Intel SSD Data Center Tool (https://downloadcenter.intel.com/download/23931/Intel-SSD-Data-Center-Tool). We support the EXT4 filesystem using the discard and noatime mount options. We currently do not support XFS.

Many improvements have been made recently in Linux for NVMe devices, so we recommend using a 3.0+ series kernel. For example, CentOS 7.2 uses the 3.10 kernel.

Set the following parameters in Linux (make them permanent in /etc/rc.local):

    # Set ${DEVICE_NUMBER} for each device
    echo 0 > /sys/block/nvme${DEVICE_NUMBER}n1/queue/add_random
    echo 1 > /sys/block/nvme${DEVICE_NUMBER}n1/queue/rq_affinity
    echo none > /sys/block/nvme${DEVICE_NUMBER}n1/queue/scheduler
    echo 1023 > /sys/block/nvme${DEVICE_NUMBER}n1/queue/nr_requests
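With several NVMe devices, the four writes can be wrapped in a loop over the device numbers. The sketch below is one way to do that. The function name apply_nvme_settings and its sysfs-root argument are assumptions of this example (the root is a parameter so the function can be tried against a scratch directory), and the values are the ones listed above.

```shell
#!/bin/sh
# apply_nvme_settings SYSFS_ROOT DEVICE_NUMBER...: write the four queue
# settings for each nvme<N>n1 device below SYSFS_ROOT (normally /sys).
# (Hypothetical helper, named for this example only.)
apply_nvme_settings() (
    root=$1
    shift
    for DEVICE_NUMBER in "$@"; do
        queue="$root/block/nvme${DEVICE_NUMBER}n1/queue"
        if [ ! -d "$queue" ]; then
            echo "skipping missing $queue" >&2
            continue
        fi
        echo 0    > "$queue/add_random"
        echo 1    > "$queue/rq_affinity"
        echo none > "$queue/scheduler"
        echo 1023 > "$queue/nr_requests"
    done
)

# Example, run as root for devices 0 and 1:
#   apply_nvme_settings /sys 0 1
```

Calling the function from /etc/rc.local would reapply the settings at boot, in line with the note above.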

Disable requiretty

For the smoothest user experience when automatically provisioning the MemSQL cluster, MemSQL Ops needs to be able to SSH into other hosts outside an interactive shell or session. It therefore requires that requiretty is disabled. This can be done by modifying the /etc/sudoers file and commenting out any lines that reference requiretty.
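To find the lines that still need commenting out, the sudoers file can be scanned for uncommented requiretty entries. This is a sketch only: active_requiretty is a name invented for this example, and it skips lines that already negate the option with !requiretty.

```shell
#!/bin/sh
# active_requiretty FILE: print, with line numbers, the uncommented lines of a
# sudoers-style file that enable requiretty. (Hypothetical helper.)
active_requiretty() {
    awk '
        /^[[:space:]]*#/ { next }                     # skip comment lines
        /requiretty/ && !/!requiretty/ { print NR ": " $0 }
    ' "$1"
}

# Example, run as root because /etc/sudoers is not world-readable:
#   active_requiretty /etc/sudoers
```

No output means nothing needs to change. When lines are reported, editing the file through visudo lets the syntax be checked before the change is saved.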

Ensure working SSH between cluster hosts

For MemSQL Agent to successfully download and install itself on other hosts in the cluster, it needs to connect via SSH from the primary host (where the primary MemSQL Agent resides) to the other hosts in the cluster. For the best user experience, check that SSH is possible between the primary host and the other hosts in the cluster. If this is not possible, you will need to install the MemSQL Agent manually on all cluster hosts.

Start a new login shell after running install.sh

The install.sh script creates a memsql user and a memsql group on the Linux machine, and adds the user running install.sh to the memsql group. To complete the addition of your user to the memsql group, you will need to restart your shell.

Create the same Linux user with sudo permissions on every cluster host

Setting up MemSQL using the MemSQL Agent is easiest if the user and password are the same on every node. The MemSQL Agent will use this user to connect to all hosts in the cluster and install / deploy MemSQL.

Ensure port 9000 is accessible on every cluster host

The default port opened by MemSQL Ops is 9000, which is configurable. For the smoothest user experience, open this port on every host and ensure it is accessible via a browser. For deployments on public cloud platforms (e.g. AWS, Azure), make sure your cluster hosts have a security group configured that allows public access to port 9000. If it is not possible to open port 9000, MemSQL Ops agents can be started on other ports using the --port flag in memsql-ops start.

Install numactl on machines with multiple sockets

For optimal performance, MemSQL Ops automatically detects whether a machine has multiple sockets and recommends that MemSQL be deployed in a NUMA-aware manner. Specifically, MemSQL Ops will run numactl commands to bind individual MemSQL nodes to CPUs. This allows faster access to in-memory data, since individual MemSQL nodes only access data that's collocated with their corresponding CPU. For MemSQL Ops to enable NUMA, you need to install the numactl package on your Linux host:

    sudo apt-get install numactl
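To see whether a host is a multi-socket machine at all, the distinct "physical id" values in /proc/cpuinfo can be counted. This sketch is not how MemSQL Ops performs its own detection. The function name socket_count is an assumption of this example, and on platforms whose cpuinfo has no "physical id" field it prints 0.

```shell
#!/bin/sh
# socket_count FILE: count the distinct "physical id" values in a
# cpuinfo-style file. (Hypothetical helper, named for this example only.)
socket_count() {
    awk -F: '
        /^physical id/ { seen[$2] = 1 }
        END { n = 0; for (id in seen) n++; print n }
    ' "$1"
}

if [ -r /proc/cpuinfo ]; then
    sockets=$(socket_count /proc/cpuinfo)
    if [ "$sockets" -gt 1 ] && ! command -v numactl >/dev/null 2>&1; then
        echo "$sockets sockets detected but numactl is not installed"
    else
        echo "sockets: $sockets"
    fi
fi
```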

Ensure /tmp has free space or change TMPDIR

MemSQL Ops, like many Unix utilities, writes temporary data to /tmp and requires available free space. It is possible to change the temporary directory by setting the canonical Unix environment variable TMPDIR.
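The free space in the temporary directory can be read from df. The sketch below checks TMPDIR when it is set and /tmp otherwise. The function name free_kb is made up for this example, and the text above gives no minimum, so the script only reports the number.

```shell
#!/bin/sh
# free_kb DIR: print the available space, in kB, of the filesystem holding DIR.
# With -P, df prints one line per filesystem and "Available" is column 4.
# (Hypothetical helper, named for this example only.)
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

tmpdir=${TMPDIR:-/tmp}
echo "free space in $tmpdir: $(free_kb "$tmpdir") kB"
```

If the number is too small, exporting TMPDIR to a directory on a larger filesystem before starting MemSQL Ops is the alternative described above.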

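Setting | Default | Functionality
bind-address | 0.0.0.0 | If the address is 0.0.0.0, connections are accepted on all network interfaces; otherwise only on the interface with the specified IP address.
flush_before_replicate | OFF | If set, data is flushed to local disk before it is sent to a slave. This increases replication latency but guarantees that a slave never gets ahead of its master.
master_aggregator | no default value | If set without a value, this node is the master aggregator; otherwise set it to host:port, the address of the master aggregator.
maximum_memory | 90% of System RAM | Maximum memory MemSQL may use (MB). The default is expressed as a percentage of the machine's total memory.
maximum_table_memory | 90% of maximum_memory | Memory available for table storage. The default is a percentage of the previous setting.
port | 3306 | Port number used for connections.
port_open_timeout | 0 | How long to wait after finding that the server port is in use (0 means do not wait).
redundancy_level | 1 | If set to 1, there is no replication. If set to 2, MemSQL's High Availability Mode is turned on.
reported_hostname | no default value | IP address or hostname of this machine, used for communicating with other machines.
snapshot_trigger_size | 268435456 bytes | Log size at which a new snapshot is started.
snapshots_to_keep | 2 | Number of snapshots and log files to keep for backup and replication.
user, u | current user | Run memsqld as this user.
datadir | ./data | Data directory. It contains snapshots, logs and columnstore data.
plancachedir | ./plancache | Plancache directory. It contains the compiled plans produced by code generation.
tracelogsdir | ./tracelogs | Tracelogs directory. It contains log files, including memsql.log and the query log.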
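The two memory defaults in the table stack: maximum_table_memory is 90% of maximum_memory, which is itself 90% of system RAM. The sketch below works this out for a hypothetical host with 64 GiB of RAM. Rounding down to whole megabytes is an assumption of this example, not documented MemSQL behaviour.

```shell
#!/bin/sh
# percent_of VALUE PERCENT: integer percentage of VALUE, rounded down.
# (Hypothetical helper, named for this example only.)
percent_of() {
    echo $(( $1 * $2 / 100 ))
}

ram_mb=65536                                              # hypothetical host: 64 GiB
maximum_memory=$(percent_of "$ram_mb" 90)                 # default: 90% of system RAM
maximum_table_memory=$(percent_of "$maximum_memory" 90)   # default: 90% of maximum_memory

echo "maximum_memory=$maximum_memory MB"
echo "maximum_table_memory=$maximum_table_memory MB"
```

Under these assumptions, tables can use about 81% of system RAM by default (0.9 × 0.9), here 53083 MB out of 65536 MB.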
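
Cluster Management Settings
Setting | Default | Functionality
aggregator_failure_detection | ON | Whether aggregators should detect failures of other aggregators.
auto_attach | ON | When redundancy_level is 1, whether an aggregator should automatically attach nodes again after a failure.
distributed_heartbeat_timeout | 10 | Timeout for distributed heartbeat queries.
leaf_failure_detection | ON | Whether the master aggregator should detect failures of leaf nodes.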

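Connection Management Settings
Setting | Default | Functionality
connect_timeout | 10 | Number of seconds the memsqld server waits for a connection.
interactive_timeout | 28800 | Number of seconds the server waits for an unresponsive application before closing the connection.
max_allowed_packet | 104857600 | Maximum size of a protocol packet.
max_connect_errors | 10 | If a host has more interrupted connections than this number, future connections from that host are blocked.
max_connections | 151 | Number of simultaneous client connections.
max_connection_threads | 8192 | Maximum number of kernel threads used to handle connections.
max_pooled_connections | 1024 | Maximum number of pooled connections kept per leaf.
skip_name_resolve | AUTO | Whether to skip name resolution: OFF, ON, or AUTO. The default, AUTO, performs reverse DNS lookups when host-based security policies are present.
sync_slave_timeout | 10000 | Time in milliseconds that the master waits for an acknowledgement from a synchronously replicated slave.
wait_timeout | 28800 | Time in seconds that the server waits for activity on a connection before closing it.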

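Database Optimization Settings
Setting | Default | Functionality
buffered_rows | 1024 | Number of rows buffered in a distributed join.
columnstore_disk_insert_threshold | 0.5 | Threshold (as a fraction of columnar_segment_rows) at which multi-inserts into a columnstore are written directly to disk.
columnar_segment_rows | 102400 | Maximum number of rows in a segment file.
columnstore_window_size | 2147483648 | Total size of replicated columnstore data kept on slaves. A larger value helps prevent replicated columnstore tables on slaves from falling out of sync.
default_partitions_per_leaf | 8 | Default number of partitions per leaf node.
load_data_read_size | 8192 | Number of bytes LOAD DATA reads at a time.
load_data_write_size | 8192 | Number of bytes LOAD DATA writes at a time.
lock_wait_timeout | 60 | Maximum time in seconds to wait for a row lock before returning an error.
max_prepared_stmt_count | 16382 | Maximum number of simultaneous prepared statements.
multi_insert_tuple_count | 1000 | Number of tuples aggregators send to leaf nodes in a multi-insert.
net_first_packet_read_timeout | 30 | Time in seconds to wait for an idle transaction that is blocking a DDL operation.
net_read_timeout | 3600 | Time in seconds to wait for more data from a connection before giving up on the read.
net_write_timeout | 3600 | Time in seconds to wait for data to be written to a connection before giving up on the write.
optimize_columnar_tables | ON | Optimize columnstore tables in a background thread.
plan_expiration_minutes | 720 | Lifetime of a query plan, in minutes.
query_parallelism | 0 | Maximum number of queries run concurrently.
recovery_batch_size | 5 | Number of rows read at a time when recovering from disk.
replication_timeout_ms | 60000 | Replication timeout in milliseconds.
rows_per_period | 1 | Number of rows pulled per period by SimpleStreamingIterator.
transaction_buffer | 67108864 | Size of MemSQL's in-memory transaction buffer.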

Geospatial Settings
Setting | Default | Functionality
geo_sphere_radius | 6367444.657120 | Radius of the sphere, generally used in distance calculations.

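Logging Settings
Setting | Default | Functionality
core_file | ON | Generate a full core dump on crash.
critical_diagnostics | ON | Send usage and critical error diagnostics to MemSQL.
debug_mode | OFF | Produce debug information during code generation.
general_log | OFF | Log connections and queries to a table or log file. OFF: no logging; ON: logging enabled; PARTIAL: log only when the load is light.
general_log_file | /var/lib/memsql/tracelogs/query.log | File to which connections and queries are logged.
warn_level | WARNINGS | How unusual syntax is handled: ERRORS, WARNINGS or EXPERIMENTAL.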

Security Settings
Setting | Default | Functionality
ssl_ca | none | CA file, for SSL.
ssl_capath | none | CA directory, for SSL.
ssl_cert | none | Certificate file, for SSL.
ssl_cipher | none | Cipher, for SSL.
ssl_key | none | Key file (public/private key pair), for SSL.
