Python Server Operations Notes: Chapter 1, Databases in Depth - 1.1.9 Indexes

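Foreword: These are my course notes for the "Server Operations Development Engineer" track of NetEase's micro-specialization "Python Full-Stack Engineer". Comments and discussion are welcome, and many thanks to the instructors for their excellent teaching!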

1. Course Objectives

What an index is
How to create an index
Index design principles
Analyzing index usage with explain

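2. Detailed Walkthrough

2.1. What Is an Index

An index works like the table of contents of a book. Without one, we have to flip through page after page to find what we want (a full table scan); once the content is organized by an index, we can jump straight to it.

In the figure above, with no index, finding the members whose city is 021 means checking every member one by one, and the more members there are, the longer it takes. If the members are organized by city, then city is the index on the member data.

MySQL uses the B-Tree as its main index data structure, although each storage engine implements it differently; the InnoDB engine builds its indexes as B+Trees. Besides B-Tree indexes there are also HASH indexes (for example in the MEMORY engine).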
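To make the book analogy concrete, here is a minimal Python sketch. It is not how MySQL implements indexes: a sorted list of (city, member_id) pairs stands in for the B-Tree, and `bisect` plays the role of the tree descent. The member rows and the helper names `full_scan` and `index_lookup` are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical member rows: (member_id, city)
members = [(1, "021"), (2, "010"), (3, "021"), (4, "0512"), (5, "010")]

def full_scan(rows, city):
    """Check every row, like a full table scan."""
    return [member_id for member_id, c in rows if c == city]

# The "index": (city, member_id) pairs kept in sorted order.
city_index = sorted((c, member_id) for member_id, c in members)

def index_lookup(index, city):
    """Binary-search the sorted index for the run of entries with this city."""
    lo = bisect_left(index, (city,))
    hi = bisect_right(index, (city, float("inf")))
    return [member_id for _, member_id in index[lo:hi]]

print(full_scan(members, "021"))
print(index_lookup(city_index, "021"))
```

Both calls return the same member ids, but the scan touches every row while the lookup only needs a number of comparisons proportional to the logarithm of the row count before it reads the matching entries.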

2.2. Index Strategy

Suppose we have the following table:

2.3. Creating Indexes

Specifying indexes when creating a table:

create table `table_name` (
	id int not null,
	field int not null,
	field2 char(25) not null,
	primary key(id),
	key `index_name`(`field`)
);

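Example 1: creating indexes

create table `mycms`.`goods`(
	goods_id int not null auto_increment,
	goods_type int not null,
	goods_name varchar(100) not null,  # string type, required by the full-text index added on this column later (the length 100 is an assumed value)
	goods_price dec not null,
	seller char(25) not null,
	primary key (goods_id),  # primary key index
	key `goods_type_index` (`goods_type`), # ordinary index
	key `seller_index` (`seller`), # ordinary index
	key `union_index` (`goods_type`, `seller`) # composite index; column order matters
);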

Adding an index to an existing table:

# Statement 1
create index
	index_name
on table_name(field);

# Statement 2
alter table `table_name`
add index `index_name`(`field`);

Dropping an index:

ALTER TABLE `table_name`
DROP INDEX `index_name`;
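The create/drop cycle can be tried without a MySQL server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so the syntax differs slightly: SQLite drops an index with `DROP INDEX name` rather than MySQL's `ALTER TABLE ... DROP INDEX`. The table, the index name, and the helper `index_names` are placeholders chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table goods (goods_id integer primary key, goods_type int, seller text)"
)

def index_names(conn, table):
    """Return the names of the indexes currently defined on `table`."""
    return [row[1] for row in conn.execute(f"pragma index_list({table})")]

conn.execute("create index seller_index on goods (seller)")
after_create = index_names(conn, "goods")

# SQLite syntax; in MySQL this would be ALTER TABLE goods DROP INDEX seller_index
conn.execute("drop index seller_index")
after_drop = index_names(conn, "goods")

print(after_create, after_drop)
```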

Example 2: modifying an index

ALTER TABLE `mycms`.`goods`
DROP INDEX `seller_index`,
ADD INDEX `seller_index` (`goods_price` ASC);

Example 3: adding a full-text index

ALTER TABLE `mycms`.`goods`
ADD FULLTEXT INDEX `name_index` (`goods_name`);

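2.4. How Indexes Change Query Performance

2.4.1. Performance Gains from Indexes

To measure how much an index helps, first generate a large amount of data by following the steps below:

Step 1: Create the table users in the database mycms

# users table structure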
CREATE TABLE `users` (
   `user_id` int(11) NOT NULL AUTO_INCREMENT,
   `username` varchar(30) DEFAULT NULL,
   `realname` varchar(45) DEFAULT NULL,
   `password` varchar(45) DEFAULT NULL,
   `province` varchar(45) DEFAULT NULL,
   `city` varchar(5) DEFAULT NULL,
   `age` tinyint(2) DEFAULT NULL,
   `sex` enum('男','女') DEFAULT NULL,
   `mylike` set('釣魚','旅遊','看書','唱歌') DEFAULT NULL,
   `reg_date` datetime DEFAULT NULL,
   `last_login` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
   PRIMARY KEY (`user_id`),
   UNIQUE KEY `username_UNIQUE` (`username`)
 ) ENGINE=InnoDB AUTO_INCREMENT=560026 DEFAULT CHARSET=utf8;

Step 2: Generate data in bulk

import random
import time
from datetime import datetime, timedelta
from threading import Thread, active_count

from mysql.connector import pooling


# Bulk-fill the registration time, last-login time, province and city
# for every row of `users` whose province is still NULL.
provinces = ['江蘇','浙江', '上海','北京','重慶','廣東','山東','湖北']
citys = {}
citys['江蘇'] = ["南京", "蘇州", "無錫", "常州", "泰州", "揚州"]
citys['浙江'] = ["杭州", "寧波", "嘉興", "麗水", "台州", "溫州"]
citys['上海'] = ['黃埔區', '靜安區', '楊浦區', '虹口區', '松江區', '浦東新區']
citys['北京'] = ['東城區', '西城區', '朝陽區', '豐臺區', '昌平區', '通州區']
citys['重慶'] = ['萬州區', '江北區', '渝北區', '南岸區', '巴南區', '江津區']
citys['廣東'] = ['廣州', '佛山', '肇慶', '中山', '珠海', '江門']
citys['山東'] = ['濟南', '青島', '煙臺', '威海', '菏澤', '臨沂']
citys['湖北'] = ['武漢', '宜昌', '黃石', '十堰', '襄陽', '鄂州']

pooling.CNX_POOL_MAXSIZE = 32
cnxpool = pooling.MySQLConnectionPool(pool_name="mypool", pool_size=30,
                                      user='root', password='root',
                                      host='localhost', database='mycms')
print("pool:", type(cnxpool))


def set_fields(user_id):
    """Fill in a random province, city and dates for one user row."""
    print(user_id)
    # Placeholders let the driver quote the values instead of str.format().
    sql = ("update users set `province`=%s, `city`=%s, "
           "`reg_date`=%s, `last_login`=%s "
           "where `user_id`=%s")
    province = random.choice(provinces)
    reg_date = datetime.now() - timedelta(days=random.randint(0, 365))
    params = (
        province,
        random.choice(citys[province]),
        reg_date,
        reg_date + timedelta(days=random.randint(0, 300)),
        user_id,
    )
    try:
        cnx = cnxpool.get_connection()
    except Exception as e:
        print("could not get a connection:", e)
        return False
    cursor = cnx.cursor()
    try:
        cursor.execute(sql, params)
        cnx.commit()
    except Exception as e:
        print("update failed for user_id", user_id, e)
    finally:
        cursor.close()
        cnx.close()  # hands the connection back to the pool


def run():
    last_id = 0
    while True:
        # Throttle: keep fewer than 10 threads alive (main thread included).
        if active_count() >= 10:
            time.sleep(0.01)
            continue
        try:
            cnx = cnxpool.get_connection()
        except Exception:
            print("waiting for a connection to be released")
            time.sleep(0.01)
            continue
        cursor = cnx.cursor()
        try:
            # Fetch the next batch of 10 rows that still need filling in.
            cursor.execute(
                "select user_id from users "
                "where user_id > %s and province is null "
                "order by user_id limit 10",
                (last_id,))
            ids = cursor.fetchall()
        finally:
            cursor.close()
            cnx.close()
        if not ids:
            break
        last_id = ids[-1][0]
        for (user_id,) in ids:
            Thread(target=set_fields, args=(user_id,)).start()


run()

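Step 3: Search the users table for a specific city (before any index is created):

select * from mycms.users where city="蘇州";

Result (note the execution time):

Step 4: Create an index on the users table, then rerun the query from Step 3:

ALTER TABLE `mycms`.`users`
ADD INDEX `city`(`city` ASC);

select * from mycms.users where city="蘇州";

Result (note the execution time, which is much shorter):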
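The effect can be reproduced in miniature without a MySQL server. The sketch below repeats Steps 3 and 4 against an in-memory SQLite database through Python's built-in `sqlite3` module. It is only a stand-in: the table is a cut-down version of users, the city values are synthetic (`city0` to `city499`), and the helper `timed_lookup` and the index name `city_index` are made up here. Absolute timings will differ from MySQL's and from machine to machine.

```python
import random
import sqlite3
import time

random.seed(0)

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table users (user_id integer primary key, username text, city text)"
)
# 100,000 synthetic rows spread over 500 made-up city values.
conn.executemany(
    "insert into users (username, city) values (?, ?)",
    ((f"user{i}", f"city{random.randrange(500)}") for i in range(100_000)),
)

def timed_lookup(city):
    """Run the city lookup and return (row count, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute("select * from users where city = ?", (city,)).fetchall()
    return len(rows), time.perf_counter() - start

before_rows, before_secs = timed_lookup("city42")   # no index yet: full table scan
conn.execute("create index city_index on users (city)")
after_rows, after_secs = timed_lookup("city42")     # same query, index available

print(f"without index: {before_rows} rows in {before_secs:.5f}s")
print(f"with index:    {after_rows} rows in {after_secs:.5f}s")
```

Both runs return the same rows; on a table of this size the second run is typically noticeably faster, because only the matching entries are visited.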

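Note:
When the filtered column is indexed, a count(*) query only has to count the matching index entries and does not need to read each row, so it runs faster.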

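Multi-column indexes:
Column order matters in a multi-column index: the index only helps when the condition includes its leftmost column(s). For example, given an index made up of the three columns field1, field2, field3, the following conditions behave differently:

Condition	Result
`where field1 = val`	index can be used
`where field1 = val and field2 = val2`	index can be used
`where field2 = val`	index cannot be used
`where field3 = val`	index cannot be used
`where field2 = val and field3 = val`	index cannot be used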
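The leftmost-column rule in the table above can be observed directly. The sketch below is an illustration using SQLite (via Python's built-in `sqlite3`) rather than MySQL, so the plan text differs from MySQL's explain output, but the same rule applies. The table `t`, the index name `multi_index`, and the helper `plan` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer primary key, "
    "field1 int, field2 int, field3 int, payload text)"
)
conn.execute("create index multi_index on t (field1, field2, field3)")

def plan(where):
    """Return SQLite's query-plan text for `select * from t where ...`."""
    rows = conn.execute(
        f"explain query plan select * from t where {where}"
    ).fetchall()
    return " | ".join(row[-1] for row in rows)

for where in (
    "field1 = 1",
    "field1 = 1 and field2 = 2",
    "field2 = 2",
    "field3 = 3",
    "field2 = 2 and field3 = 3",
):
    print(f"{where:28} -> {plan(where)}")
```

Conditions that start with field1 are reported as a SEARCH using `multi_index`; the others fall back to a SCAN of the whole table.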

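Example 4:

A multi-column index is ordered. Replace the single-column index `city` created in Step 4 with a two-column one (the old index has to be dropped first, because the name is already taken):

ALTER TABLE `mycms`.`users`
DROP INDEX `city`,
ADD INDEX `city`(`province` ASC, `city` ASC);

Now run a test query:

select sql_no_cache count(*) from mycms.users where province="江蘇" and city="蘇州";

Result (took 0.0046s):

The following query takes longer, because city is not the leftmost column of the index, so MySQL can no longer jump straight to the matching entries:

select sql_no_cache count(*) from mycms.users where city="蘇州";

Result (took 0.120s):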

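2.4.2. Analyzing Index Usage with `explain`

The explain statement shows how a query uses indexes:

explain query_statement

Column	Meaning
select_type	The type of `select`. `simple`: a simple table query, `primary`: the outermost query, `union`: a query after `union`, `subquery`: a `select` inside a subquery
table	The table this output row refers to
type	Access type. `ALL`: full table scan, `index`: full index scan, `range`: index range scan, `ref`: lookup on a non-unique index or on a prefix of a unique index, `eq_ref`: lookup on a unique index, `const`: lookup of a single row by primary key or unique key
possible_keys	Indexes the query could use
key	The index actually used
key_len	Length in bytes of the part of the index that is used
rows	Estimated number of rows to examine
Extra	Additional information about how the query is executed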
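When many statements need checking, the `type` column is the quickest signal. The sketch below is a hypothetical helper (`flag_full_scans` is not part of any library) that takes explain rows as dictionaries, the shape returned by a `mysql.connector` cursor created with `dictionary=True`, and reports the ones that scan a whole table or a whole index. The sample rows are hand-written for illustration and are not real output.

```python
# Access types that read every row of the table or every entry of an index.
FULL_SCAN_TYPES = {"ALL", "index"}

def flag_full_scans(explain_rows):
    """Return (table, type) for each explain row that performs a full scan."""
    return [
        (row["table"], row["type"])
        for row in explain_rows
        if row["type"] in FULL_SCAN_TYPES
    ]

# Hand-written sample rows, for illustration only.
sample = [
    {"table": "users", "type": "const", "key": "PRIMARY", "rows": 1},
    {"table": "users", "type": "ALL", "key": None, "rows": 560000},
]

print(flag_full_scans(sample))
```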

Examples:

explain select * from mycms.users where user_id=1000;

Result:

explain select * from mycms.users where user_id > 1000 and user_id < 1300;

explain select * from mycms.users where username="luxp";

Result:

explain select * from mycms.users where province="江蘇" and city="蘇州";

Result:

explain select * from mycms.users where province="江蘇";

Result:

explain select * from mycms.users where city="蘇州";

Result:

# reg_date is a DATETIME column, so it is compared with a DATETIME (60 days ago) rather than a Unix timestamp
explain select sql_no_cache count(*) from mycms.users where reg_date > now() - interval 60 day;

Result (this is the output when no index has been created):

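2.5. Index Design Principles

Principle 1: Index the columns that appear in where conditions, not the columns that appear in the select list.

Principle 2: Consider how the values in the column are distributed. With 1,000,000 records indexed by sex, each index value still covers about 500,000 rows, so the index does little good; indexing varied columns such as province or city is far more worthwhile.

Principle 3: Keep indexed columns as short as possible. If a prefix of the column is enough to tell rows apart, for example the surname part of a real name, index only that prefix rather than the whole column.

Principle 4: More indexes are not always better, because every index slows down inserts.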
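Principles 2 and 3 can be turned into numbers by measuring selectivity: the count of distinct values divided by the count of rows. In MySQL the same figure comes from a query of the form `select count(distinct col) / count(*) from table_name`. The Python sketch below shows the calculation on made-up sample columns; the helper names `selectivity` and `prefix_selectivity` are hypothetical.

```python
def selectivity(values):
    """Distinct values divided by total values; closer to 1.0 is more selective."""
    values = list(values)
    return len(set(values)) / len(values) if values else 0.0

def prefix_selectivity(values, length):
    """Selectivity if only the first `length` characters were indexed."""
    return selectivity(v[:length] for v in values)

# Made-up sample columns, for illustration only.
sex = ["男", "女"] * 50
realname = ["王小明", "王大偉", "李小紅", "張三", "李四", "趙六"]

print(selectivity(sex))                  # 2 distinct values in 100 rows
print(prefix_selectivity(realname, 1))   # surname only
print(prefix_selectivity(realname, 2))   # first two characters
```

A column like sex scores very low and gains little from an index, while a short prefix is worth using once its selectivity is close to that of the full column.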

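3. Course Summary

01. Index concepts
02. Creating and modifying indexes
03. Analyzing queries with explain

Indexes are an important concept in MySQL. We should know how to create and modify them, and how to use explain to analyze queries.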
