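1. Overview
Once a business grows past a certain point, splitting databases and tables (sharding) becomes unavoidable. Splitting databases isolates resources, while splitting tables reduces the amount of data in any single table and improves access efficiency.
For a long time there have been two schools of thought on how to implement sharding:
- A centralized proxy, which speaks the MySQL protocol so that applications are unaware of it
- A distributed proxy, which enhances the application at the code level and acts as an embedded routing layer
Each approach has trade-offs. The advantage of a centralized proxy is that the business code is unaware of it and everything is under the DBA's control. A distributed proxy supports only a limited set of languages; ShardingSphere-JDBC, the subject of this article, supports only Java.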
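It is worth knowing that a centralized proxy is very complex to implement. The reason goes back to how MySQL processes a SQL statement. Since that is not the focus of this article, only a few points are mentioned:
- The parser turns the SQL statement into an abstract syntax tree
- The optimizer derives an execution plan from it
- Once parsing and planning are done, the request is handed to the storage engine
For this reason, most middleware implements its own SQL parser and query optimizer. Below is an architecture diagram of dble, a well-known middleware:
Wherever statements have to be parsed, the performance overhead is considerable, so this can be regarded as a heavyweight solution.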
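By contrast, ShardingSphere-JDBC works as shown in the following diagram:
Each service holds its own Sharding-JDBC instance. It is shipped as a JAR and can be thought of as an enhanced JDBC driver. It needs some sharding configuration, but application developers do not have to modify their business code, and it is easy to integrate with Spring Boot, ORM frameworks, and the like.
This architecture is not perfect either. Because every service embeds its own proxy, a large number of connections are opened on the MySQL servers, and maintaining them adds load to MySQL, although the increase is usually too small to notice.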
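To make the connection growth concrete, here is a rough back-of-the-envelope sketch. The number of service instances is a made-up figure, and the pool size of 10 is HikariCP's default maximumPoolSize; neither value comes from a measurement in this article.

```java
// Rough estimate of how many connections one MySQL instance may see
// when every service instance embeds its own connection pool.
// The inputs are illustrative assumptions, not measured values.
final class ConnectionEstimate {

    /** Upper bound: each service instance keeps one pool per MySQL instance. */
    static int connectionsPerMysqlInstance(int serviceInstances, int poolSizePerDataSource) {
        return serviceInstances * poolSizePerDataSource;
    }

    public static void main(String[] args) {
        int serviceInstances = 50; // hypothetical number of running service instances
        int poolSize = 10;         // HikariCP's default maximumPoolSize
        System.out.println(connectionsPerMysqlInstance(serviceInstances, poolSize)); // prints 500
    }
}
```

With a centralized proxy the same services would share the proxy's pool, which is why the connection count is raised as a concern for the embedded approach.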
For more details on ShardingSphere, refer to its official documentation.
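2. Implementation
Sharding requires multiple data sources to begin with. We start two mysqld processes listening on ports 3306 and 3307, using a multi-instance setup to simulate multiple data sources.
Databases are sharded by user ID, and tables are sharded by each table's own primary key. Here are the example tables:
-- Note: this is a logical table; it does not physically exist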
create table t_order
(
order_id bigint not null auto_increment primary key,
user_id bigint not null,
name varchar(100)
);
CREATE TABLE `t_order_item` (
`order_id` bigint(20) NOT NULL,
`item` varchar(100) DEFAULT NULL,
`user_id` bigint(20) NOT NULL,
PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
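There are two data sources, and on each one the data is split into two tables by order_id. In other words, each instance should have the following tables: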
create table t_order0
(
order_id bigint not null auto_increment primary key,
user_id bigint not null,
name varchar(100)
);
create table t_order1
(
order_id bigint not null auto_increment primary key,
user_id bigint not null,
name varchar(100)
);
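-- This is the broadcast table. Creating it on one node was enough for this test, though ShardingSphere normally expects a broadcast table to exist in every data source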
CREATE TABLE `t_config` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) DEFAULT NULL,
`config` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
CREATE TABLE `t_order_item0` (
`order_id` bigint(20) NOT NULL,
`item` varchar(100) DEFAULT NULL,
`user_id` bigint(20) NOT NULL,
PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `t_order_item1` (
`order_id` bigint(20) NOT NULL,
`item` varchar(100) DEFAULT NULL,
`user_id` bigint(20) NOT NULL,
PRIMARY KEY (`order_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
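The layout above can be summarized as a two-step routing rule: user_id picks the database, order_id picks the table. The following is only a sketch of that rule, assuming user_id is 0 or 1 and that tables are chosen by order_id modulo 2; the class and method names are hypothetical and are not part of ShardingSphere.

```java
// Sketch of the two-level routing used in this article:
// database chosen by user_id, table chosen by order_id % 2.
// ShardRouteSketch and route() are illustrative names only.
final class ShardRouteSketch {

    static String route(long userId, long orderId) {
        String dataSource = "ds" + userId;                      // ds0 or ds1
        String table = "t_order" + Math.floorMod(orderId, 2L);  // t_order0 or t_order1
        return dataSource + "." + table;
    }

    public static void main(String[] args) {
        System.out.println(route(0L, 10L)); // prints ds0.t_order0
        System.out.println(route(1L, 11L)); // prints ds1.t_order1
    }
}
```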
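A DBA who cannot step in as a Java programmer when the company needs one should be ashamed, so I know Java.
With Spring Boot, a RESTful web service can be built quickly. Here is the content of application.properties:
# Register all data sources here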
spring.shardingsphere.datasource.names=ds0,ds1
# Configuration for data source 0
spring.shardingsphere.datasource.ds0.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds0.jdbc-url=jdbc:mysql://localhost:3306/test?serverTimezone=GMT%2B8
spring.shardingsphere.datasource.ds0.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds0.username=root
spring.shardingsphere.datasource.ds0.password=
# Configuration for data source 1
spring.shardingsphere.datasource.ds1.type=com.zaxxer.hikari.HikariDataSource
spring.shardingsphere.datasource.ds1.jdbc-url=jdbc:mysql://localhost:3307/test?serverTimezone=GMT%2B8
spring.shardingsphere.datasource.ds1.driver-class-name=com.mysql.jdbc.Driver
spring.shardingsphere.datasource.ds1.username=root
spring.shardingsphere.datasource.ds1.password=
# Database sharding strategy
# The database sharding column is user_id
spring.shardingsphere.sharding.default-database-strategy.standard.sharding-column=user_id
spring.shardingsphere.sharding.default-database-strategy.standard.precise-algorithm-class-name=com.example.demo.sharding.PreciseShardingAlgorithmImpl
# Table sharding strategy
spring.shardingsphere.sharding.tables.t_order.actual-data-nodes=ds$->{0..1}.t_order$->{0..1}
spring.shardingsphere.sharding.tables.t_order.table-strategy.inline.sharding-column=order_id
spring.shardingsphere.sharding.tables.t_order.table-strategy.inline.algorithm-expression=t_order$->{order_id % 2}
spring.shardingsphere.sharding.tables.t_order.key-generator.column=order_id
spring.shardingsphere.sharding.tables.t_order.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.t_order_item.actual-data-nodes=ds$->{0..1}.t_order_item$->{0..1}
spring.shardingsphere.sharding.tables.t_order_item.table-strategy.inline.sharding-column=order_id
spring.shardingsphere.sharding.tables.t_order_item.table-strategy.inline.algorithm-expression=t_order_item$->{order_id % 2}
spring.shardingsphere.sharding.binding-tables=t_order, t_order_item
# Broadcast table; its primary node is ds0
spring.shardingsphere.sharding.broadcast-tables=t_config
spring.shardingsphere.sharding.tables.t_config.actual-data-nodes=ds$->{0}.t_config
spring.jpa.show-sql=true
server.address=10.1.20.96
server.port=8080
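The inline expression ds$->{0..1}.t_order$->{0..1} in actual-data-nodes is shorthand for every combination of data source and physical table. The sketch below spells that expansion out in plain Java; it imitates the result of the expression and does not call ShardingSphere itself. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java imitation of what an expression such as
// ds$->{0..1}.t_order$->{0..1} enumerates. Not ShardingSphere code.
final class DataNodeExpansion {

    static List<String> expand(String logicTable, int dataSources, int tablesPerSource) {
        List<String> nodes = new ArrayList<>();
        for (int d = 0; d < dataSources; d++) {
            for (int t = 0; t < tablesPerSource; t++) {
                nodes.add("ds" + d + "." + logicTable + t);
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // prints [ds0.t_order0, ds0.t_order1, ds1.t_order0, ds1.t_order1]
        System.out.println(expand("t_order", 2, 2));
    }
}
```

These four data nodes correspond to the physical t_order0 and t_order1 tables created on each of the two MySQL instances.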
Here is build.gradle. Only the ShardingSphere-related dependencies are listed; for how to set up a Spring Boot project, consult a Spring Boot book or other material:
dependencies {
compile group: 'org.apache.shardingsphere', name: 'sharding-jdbc-spring-boot-starter', version: '4.0.0-RC1'
compile group: 'org.apache.shardingsphere', name: 'sharding-jdbc-spring-namespace', version: '4.0.0-RC1'
}
The figure below shows the project's code structure, for reference:
Now for the code:
The entities are the simplest part:
package com.example.demo.entity;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;
import java.util.StringJoiner;
@Entity
@Table(name = "t_order")
public class Order {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private long orderId;
@Column(name = "user_id")
private long userId;
@Column(name = "name")
private String name;
public long getOrderId() {
return orderId;
}
public void setOrderId(long orderId) {
this.orderId = orderId;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public long getUserId() {
return userId;
}
public void setUserId(long userId) {
this.userId = userId;
}
@Override
public String toString() {
return new StringJoiner(", ", Order.class.getSimpleName() + "[", "]")
.add("orderId=" + orderId)
.add("userId=" + userId)
.add("name='" + name + "'")
.toString();
}
}
package com.example.demo.entity;
import com.google.common.base.MoreObjects;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
@Entity
@Table(name = "t_order_item")
public class OrderItem {
@Id
@Column(name = "order_id")
private long orderId;
@Column(name = "user_id")
private long userId;
@Column(name = "item")
private String item;
public long getOrderId() {
return orderId;
}
public void setOrderId(long orderId) {
this.orderId = orderId;
}
public long getUserId() {
return userId;
}
public void setUserId(long userId) {
this.userId = userId;
}
public String getItem() {
return item;
}
public void setItem(String item) {
this.item = item;
}
@Override
public String toString() {
return MoreObjects.toStringHelper(this)
.add("orderId", orderId)
.add("userId", userId)
.add("item", item)
.toString();
}
}
package com.example.demo.entity;
import com.google.common.base.MoreObjects;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;
@Entity
@Table(name = "t_config")
public class TConfig {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private int id;
@Column(name = "user_id")
private long userId;
@Column(name = "config")
private String config;
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public long getUserId() {
return userId;
}
public void setUserId(long userId) {
this.userId = userId;
}
public String getConfig() {
return config;
}
public void setConfig(String config) {
this.config = config;
}
@Override
public String toString() {
return MoreObjects.toStringHelper(this)
.add("id", id)
.add("userId", userId)
.add("config", config)
.toString();
}
}
For the DAO layer, Spring Boot (through Spring Data JPA) leaves hardly any code to write; declaring an interface is enough:
package com.example.demo.dao;
import com.example.demo.entity.Order;
import org.springframework.data.jpa.repository.JpaRepository;
public interface OrderDao extends JpaRepository<Order, Long> {
}
Here I use the @Query annotation to write an HQL statement:
package com.example.demo.dao;
import com.example.demo.entity.OrderItem;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;
import java.util.Optional;
public interface OrderItemDao extends JpaRepository<OrderItem, Long> {
// Used to test the binding tables
@Query(value = "select n from Order t inner join OrderItem n on t.orderId = n.orderId where n.orderId=:orderId")
Optional<OrderItem> getOrderItemByOrderId(@Param("orderId") Long orderId);
}
package com.example.demo.dao;
import com.example.demo.entity.TConfig;
import org.springframework.data.jpa.repository.JpaRepository;
public interface ConfigDao extends JpaRepository<TConfig, Integer> {
}
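The join query in OrderItemDao relies on the binding-tables setting from application.properties. As I understand the ShardingSphere documentation, binding tables share the same sharding rule, so a join only needs to pair physical tables with the same suffix instead of every combination. The sketch below merely counts the pairs in both cases; it is an illustration with hypothetical names, not ShardingSphere's routing code.

```java
import java.util.ArrayList;
import java.util.List;

// Enumerates the physical joins needed for t_order JOIN t_order_item
// within one data source, with and without a binding relationship.
// Illustration only; names are hypothetical.
final class BindingTableSketch {

    static List<String> joinPairs(int shards, boolean bound) {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < shards; i++) {
            for (int j = 0; j < shards; j++) {
                if (bound && i != j) {
                    continue; // bound tables only join matching suffixes
                }
                pairs.add("t_order" + i + " JOIN t_order_item" + j);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(joinPairs(2, true).size());  // prints 2
        System.out.println(joinPairs(2, false).size()); // prints 4
    }
}
```

In the query used here the WHERE clause already fixes order_id, so only one table pair should be touched; the difference matters most for joins that do not filter on the sharding column.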
The controller layer is implemented as follows:
package com.example.demo.controller;
import com.example.demo.dao.OrderDao;
import com.example.demo.entity.Order;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import java.util.Optional;
@RestController
public class OrderController {
@Autowired
private OrderDao orderDao;
@RequestMapping(value = "/order", method = RequestMethod.GET)
public Optional<Order> getOrderById(@RequestParam("id") Long id) {
return this.orderDao.findById(id);
}
@RequestMapping(value = "/order/save", method = RequestMethod.POST)
public Order saveOrder(@RequestParam("name") String name, @RequestParam("userid") Long userId) {
Order order = new Order();
order.setName(name);
order.setUserId(userId);
return this.orderDao.save(order);
}
}
package com.example.demo.controller;
import com.example.demo.dao.OrderItemDao;
import com.example.demo.entity.OrderItem;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import java.util.Optional;
@RestController
public class OrderItemController {
@Autowired
private OrderItemDao orderItemDao;
@RequestMapping(value = "/orderItem", method = RequestMethod.GET)
public Optional<OrderItem> getOrderItemById(@RequestParam(name = "id") Long id) {
return this.orderItemDao.findById(id);
}
@RequestMapping(value = "/orderItem/save", method = RequestMethod.POST)
public OrderItem saveOrderItem(@RequestParam("item") String item, @RequestParam("userid") Long userId, @RequestParam("orderid") Long orderId) {
OrderItem orderItem = new OrderItem();
orderItem.setUserId(userId);
orderItem.setItem(item);
orderItem.setOrderId(orderId);
return this.orderItemDao.save(orderItem);
}
@RequestMapping(value = "/orderItem/query", method = RequestMethod.GET)
public Optional<OrderItem> getOrderItemByOrderId(@RequestParam(name = "orderid") Long orderId) {
return this.orderItemDao.getOrderItemByOrderId(orderId);
}
}
package com.example.demo.controller;
import com.example.demo.dao.ConfigDao;
import com.example.demo.entity.TConfig;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;
import java.util.List;
@RestController
public class ConfigController {
@Autowired
private ConfigDao configDao;
@RequestMapping(value = "/listConfig", method = RequestMethod.GET)
public List<TConfig> getConfig() {
return this.configDao.findAll();
}
}
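Once these three pieces of code are written, the basic functionality is complete. However, as mentioned in the configuration, the goal is to shard databases by user_id: a row with user_id=0 goes to ds0, and a row with user_id=1 goes to ds1. This requires implementing the database sharding algorithm ourselves. ShardingSphere provides an interface for this, and we only need to implement it: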
package com.example.demo.sharding;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingValue;
import java.util.Collection;
public class PreciseShardingAlgorithmImpl implements PreciseShardingAlgorithm<Long> {
    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<Long> shardingValue) {
        // Build the expected data source name straight from user_id: 0 -> ds0, 1 -> ds1
        String dbName = "ds";
        Long val = shardingValue.getValue();
        dbName += val;
        // Return the configured data source whose name matches
        for (String each : availableTargetNames) {
            if (each.equals(dbName)) {
                return each;
            }
        }
        // No configured data source matches this user_id
        throw new IllegalArgumentException("No data source found for " + dbName);
    }
}
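The code is simple; only a few points need explaining.
availableTargetNames: the list of data source names, which here should be ds0 and ds1;
shardingValue: the value of the sharding column; we only need its value part.
A loop then walks the ["ds0", "ds1"] collection, and when dbName equals one of its elements we have found the right data source. This is a simple way to route rows precisely by user_id. Note that any user_id other than 0 or 1 produces a name that matches no data source, so the method throws IllegalArgumentException.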
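If user_id can take arbitrary values, one common alternative is to take it modulo the number of data sources before building the name. This is not what the code above does; it is a minimal sketch of that variant, written as a plain static method so it runs without the ShardingSphere dependency. It assumes the data sources are named ds0 through dsN-1, and the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;

// Modulo-based variant of the data source selection, shown without the
// ShardingSphere interfaces. Assumes data sources are named ds0..ds(N-1).
final class ModuloDataSourceChooser {

    static String choose(Collection<String> availableTargetNames, long userId) {
        // floorMod keeps the index non-negative even for a negative userId
        long index = Math.floorMod(userId, (long) availableTargetNames.size());
        String expected = "ds" + index;
        for (String each : availableTargetNames) {
            if (each.equals(expected)) {
                return each;
            }
        }
        throw new IllegalArgumentException("No data source named " + expected);
    }

    public static void main(String[] args) {
        Collection<String> targets = Arrays.asList("ds0", "ds1");
        System.out.println(choose(targets, 7L)); // prints ds1
    }
}
```

The same body could be dropped into doSharding by passing shardingValue.getValue() as userId.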
Here are the actual values of shardingValue and availableTargetNames in a test run:
The request used for this test was:
curl -X POST \
'http://10.1.20.96:8080/order/save?name=LiLei&userid=0'
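Now for the actual result. This is the data in ds0:
This is the data in ds1:
As you can see, rows are distributed to different databases by user_id and to different tables by the parity of order_id.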
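The order_id values come from the SNOWFLAKE key generator configured earlier. The sketch below splits such an ID into fields, assuming the common Snowflake layout of a millisecond timestamp offset, a 10-bit worker ID, and a 12-bit sequence; I believe ShardingSphere's generator follows this layout, but treat the bit widths as an assumption. The sample ID is the order_id used in this article's requests, and the class name is hypothetical.

```java
// Splits a Snowflake-style ID into its fields.
// Bit widths (10-bit worker, 12-bit sequence) are an assumed layout.
final class SnowflakeFields {

    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 10;

    static long sequence(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }

    static long workerId(long id) {
        return (id >> SEQUENCE_BITS) & ((1L << WORKER_BITS) - 1);
    }

    /** Milliseconds since the generator's epoch. */
    static long timestampOffsetMillis(long id) {
        return id >> (SEQUENCE_BITS + WORKER_BITS);
    }

    /** Table suffix under the order_id % 2 rule used in this article. */
    static long tableSuffix(long id) {
        return id % 2;
    }

    public static void main(String[] args) {
        long id = 371698107924086785L;
        System.out.println("sequence=" + sequence(id) + ", worker=" + workerId(id));
        System.out.println("table suffix=" + tableSuffix(id)); // prints table suffix=1
    }
}
```

Because the table is chosen from the lowest bit of the ID, the evenness of the table distribution depends on how the generator's sequence field behaves.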
Note the order_id returned by the t_order save request, and build a POST request that writes to the t_order_item table:
curl -X POST \
'http://10.1.20.96:8080/orderItem/save?item=pen&userid=0&orderid=371698107924086785'
The result is as follows:
Use this order_id to run the join query:
curl -X GET \
'http://10.1.20.96:8080/orderItem/query?orderid=371698107924086785'
The response is as follows:
To test the broadcast table, use the following request:
curl -X GET \
http://10.1.20.96:8080/listConfig
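The result:
This is only a simple implementation of sharding. Any real sharded cluster is complex and inevitably involves sharding, read/write splitting, and configuration distribution through a config center. I have verified most of these and will write them up in detail later.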