Python Web Scraping in Practice (Image Scraping): I Want Every LOL Hero and Skin

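This article uses the requests library to scrape every League of Legends hero and skin. Beginners who find anything unclear can refer to the introductory article: Python Web Scraping in Practice (Introduction).
Open the page on the official League of Legends site that lists all heroes, in order to obtain each hero's ID:
https://lol.qq.com/data/info-heros.shtml
Right-click and choose "Inspect Element" (or press F12), click the "Network" tab, then press F5 to refresh so that no files are missing from the list. Scroll down to find a file named hero_list.js, which stores the information for all heroes. Click it, and the headers panel on the right shows the request URL https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js. This is the URL we are looking for.
Opening https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js in the browser shows a dense, unformatted block of code: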

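Press Ctrl+A to select all of it, copy it, and paste it into the JSON formatter at https://www.json.cn/ so that it is easier to read:
At the time of writing there are 145 heroes. Expand the hero list: the heroId field of each entry is what we need. Note that the heroId values do not run consecutively from 1 to 145 (a pitfall to watch for), so a simple counting loop will not work; the IDs have to be read from the file.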
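The point about non-consecutive IDs can be illustrated with a minimal sketch. It assumes only the structure described above (a top-level "hero" list whose entries carry a heroId string); the sample values and the helper name detail_urls are made up for illustration and are not taken from the real file.

```python
# Sample data shaped like the "hero" list in hero_list.js.
# The heroId values here are illustrative, not copied from the real file.
sample = {"hero": [{"heroId": "1"}, {"heroId": "2"}, {"heroId": "266"}]}

DETAIL_URL = "https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js"


def detail_urls(hero_list):
    """Build one detail URL per hero from the heroId stored in each entry."""
    return [DETAIL_URL.format(hero["heroId"]) for hero in hero_list["hero"]]


urls = detail_urls(sample)
print(urls[-1])
```

Iterating over the entries themselves, rather than over range(1, 146), means gaps in the numbering cannot produce requests for files that do not exist.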
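Open a single hero's data file to see that hero's skins and their names (the steps are the same as above):
Here Annie has 13 skins.
What remains is handling the details and writing the code.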
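Before looking at the full script, here is a small offline sketch of the skin-handling step. It assumes each entry in a hero's "skins" list has heroName, name, and mainImg fields, as the script in this article does; the sample entries, the example.com links, and the helper name skins_with_images are hypothetical.

```python
# Sample data shaped like one hero's detail file; all values are made up.
sample_detail = {
    "skins": [
        {"heroName": "Annie", "name": "Skin A", "mainImg": "https://example.com/a.jpg"},
        {"heroName": "Annie", "name": "Skin A variant", "mainImg": ""},
        {"heroName": "Annie", "name": "Skin B/C", "mainImg": "https://example.com/b.jpg"},
    ]
}


def skins_with_images(detail):
    """Return one item per skin that has an image link, with a path-safe name."""
    items = []
    for skin in detail["skins"]:
        if not skin["mainImg"]:  # no image link, so there is nothing to download
            continue
        items.append({
            "heroName": skin["heroName"],
            "skinName": skin["name"].replace("/", "_"),  # "/" would split the path
            "skinImage": skin["mainImg"],
        })
    return items


items = skins_with_images(sample_detail)
print(len(items))
```

Entries with an empty mainImg are skipped, and any slash in a skin name is replaced so the name can be used as a file name.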

Complete code for scraping all League of Legends heroes and skins:

import requests
import os

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"}

def get_hero():
    # hero_list.js holds the information for every hero
    url = "https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js"
    res = requests.get(url, headers=headers).json()
    for hero in res['hero']:
        hero_id = hero['heroId']    # the hero's ID (not consecutive, so read it from the list)
        detail_line = 'https://game.gtimg.cn/images/lol/act/img/js/hero/' + hero_id + '.js'  # string concatenation
        # Equivalent ways to build the same URL:
        #detail_line = 'https://game.gtimg.cn/images/lol/act/img/js/hero/%s.js' % hero_id  # %-style formatting
        #detail_line = f'https://game.gtimg.cn/images/lol/act/img/js/hero/{hero_id}.js'  # f-string (Python 3.6+)
        #detail_line = 'https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js'.format(hero_id)  # str.format()
        get_skin(detail_line)

def get_skin(url):
    res = requests.get(url, headers=headers).json()
    for skin in res["skins"]:
        if not skin["mainImg"]:     # skip entries that have no image link
            continue
        item = {}
        item["heroName"] = skin["heroName"]     # the hero's name
        item["skinName"] = skin["name"].replace("/", "_")    # the skin's name, with any slash replaced by an underscore
        item["skinImage"] = skin["mainImg"]     # link to the skin image
        print(item)
        save(item)

def save(item):
    # Build one directory per hero under ./images/
    hero_path = './images/' + item['heroName'] + '/'
    if not os.path.exists(hero_path):   # create the directory if it does not exist
        os.makedirs(hero_path)
    res = requests.get(item["skinImage"], headers=headers)   # request the image
    with open(hero_path + item["skinName"] + ".png", "wb") as f:
        f.write(res.content)    # write the raw image bytes

if __name__ == "__main__":
    get_hero()
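The script above only replaces the slash in skin names. As a possible extension (not part of the original script), the sketch below also replaces the other characters that Windows does not allow in file names; the helper name safe_filename and the sample string are hypothetical.

```python
import re

# Characters that Windows rejects in file names: \ / : * ? " < > |
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def safe_filename(name):
    """Replace characters that are invalid in Windows file names with "_"."""
    return _INVALID_CHARS.sub("_", name).strip()


print(safe_filename("K/DA: Ahri?"))
```

Under this assumption, save() could call safe_filename(item["skinName"]) when building the output path instead of relying on the single replace("/", "_").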

