Scraping All League of Legends Champion Skins with Python

Original

New_Yao

2020-06-30 11:56

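1. Getting the data for all champions

Inspecting the all-champions page linked from the League of Legends home page shows that the data for every champion is stored as JSON in a single .js file. Its address is the one used by get_hero_json in the code below: https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js

Formatting this JSON makes its structure readable. The top-level keys are version, fileName, fileTime, and hero, all of which the code below reads.

The hero key holds the list of champions, and the meaning of each field in an entry is easy to guess from its name.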
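As a rough illustration of the structure described above, the sketch below parses a small hand-written sample instead of the live file. The sample values, the name SAMPLE_HERO_LIST, and the helper summarize_hero_list are assumptions made for this example; only the key names (hero, heroId, version, fileName, fileTime) come from the article's code.

```python
import json

# Hypothetical sample that mimics the layout of hero_list.js.
# The second entry and all version/time values are made up for illustration.
SAMPLE_HERO_LIST = '''
{
  "hero": [
    {"heroId": "1", "name": "黑暗之女", "title": "安妮"},
    {"heroId": "2", "name": "made-up name", "title": "made-up title"}
  ],
  "version": "0.0",
  "fileName": "hero_list.js",
  "fileTime": "2020-01-01 00:00:00"
}
'''


def summarize_hero_list(text):
    """Parse the hero list JSON text and return (version, list of heroId values)."""
    data = json.loads(text)
    hero_ids = [entry.get('heroId') for entry in data.get('hero', [])]
    return data.get('version'), hero_ids


version, hero_ids = summarize_hero_list(SAMPLE_HERO_LIST)
print(version, hero_ids)
```

Note that heroId is treated as a string here, which matches how the full script concatenates it directly into a URL.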

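2. Working out how champion data maps to the skin files

Take Annie as an example. Her data file is at https://game.gtimg.cn/images/lol/act/img/js/hero/1.js, and formatting it makes it easy to inspect.

After formatting, it becomes clear that in the trailing n.js of the URL, n is the champion's heroId. Looking further into the file, it contains a hero object and a skins list.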
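The URL rule can be sketched as a one-line helper. The function name hero_detail_url and the constant name are assumptions for this example; the prefix and the .js suffix are the ones used in the article's code.

```python
# URL prefix taken from the article; the helper name is hypothetical.
HERO_URL_PREFIX = 'https://game.gtimg.cn/images/lol/act/img/js/hero/'


def hero_detail_url(hero_id):
    """Build the per-champion data URL from a heroId given as str or int."""
    return f'{HERO_URL_PREFIX}{hero_id}.js'


print(hero_detail_url('1'))
```

Using an f-string means the helper accepts the heroId either as a string or as an integer.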

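The field worth explaining here is chromas: a value of 0 marks a base skin and 1 marks a chroma (recolor) variant, so this field lets us tell chroma skins apart from regular ones.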
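A minimal sketch of that filter follows. The sample entries, the example.invalid URL, and the helper name base_skins are made up for illustration; the field names (name, chromas, mainImg) and the string value '0' are taken from the article's code.

```python
# Hypothetical skin entries; only the field names come from the article.
SAMPLE_SKINS = [
    {'name': 'base skin (made up)', 'chromas': '0',
     'mainImg': 'https://example.invalid/base.jpg'},
    {'name': 'chroma variant (made up)', 'chromas': '1', 'mainImg': ''},
]


def base_skins(skins):
    """Return (name, mainImg) pairs for entries whose chromas flag is '0'."""
    return [
        (skin.get('name'), skin.get('mainImg'))
        for skin in skins
        # str() lets the check work whether the flag arrives as '0' or 0
        if str(skin.get('chromas')) == '0'
    ]


print(base_skins(SAMPLE_SKINS))
```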

3. Code

# author: jh
# date: 2020-02-28 15:00:03
import json
import os
import re
import random
import requests
from requests.exceptions import RequestException

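# Local directory where the skins are saved
base_path = 'D:\\lol_hero_skin'
# Browser-like User-Agent so the requests are not rejected as coming from a bot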
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/80.0.3987.122 Safari/537.36'}


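# Sanitize file names: Windows forbids the characters \ / : * ? " < > |
# (the K/DA skins triggered this problem)
def handle_str(_str):
    # Raw string avoids the invalid "\/" escape of the original pattern
    temp = re.sub(r'[\\/:*?"<>|]', '', _str)
    if len(temp) == 0:
        # Nothing left after stripping: fall back to 10 random digits
        return ''.join(str(random.choice(range(10))) for _ in range(10))
    return temp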


# Download one image and save it as <name>.jpg under _base_path
def download_img(img_url, _base_path, name):
    r = requests.get(img_url, headers=headers)
    print(name, r.status_code)  # HTTP status code
    if r.status_code == 200:
        name = handle_str(name)
        # The with block closes the file even if the write fails
        with open(os.path.join(_base_path, name + '.jpg'), 'wb') as f:
            f.write(r.content)  # write the image bytes
        print("done")


def load_hero_skin(heroId):
    hero_img_url_prefix = 'https://game.gtimg.cn/images/lol/act/img/js/hero/'
    hero_img_url_suffix = '.js'
    response = requests.get(hero_img_url_prefix + heroId + hero_img_url_suffix, headers=headers)
    html = json.loads(response.text)  # parse the response body as JSON
    skinsList = html.get('skins')  # list of skins
    heroName = html.get('hero').get('name')  # e.g. 黑暗之女
    heroTitle = html.get('hero').get('title')  # e.g. 安妮 (Annie)
    heroName = handle_str(heroName)
    heroTitle = handle_str(heroTitle)
    hero_skins_path = base_path + '\\' + heroName + ' ' + heroTitle
    if not os.path.exists(hero_skins_path):
        print('Directory does not exist, creating...')
        # The mode must be octal; the decimal literal 755 sets the wrong bits
        os.makedirs(hero_skins_path, 0o755)
    for n in skinsList:
        skinName = n.get('name')
        _chromas = n.get('chromas')  # '0': base skin, '1': chroma
        mainImg = n.get('mainImg')  # URL of the skin image
        # print(skinName)
        # print(_chromas)
        # print(mainImg)
        if _chromas == '0':
            # Download only base skins, skipping chromas
            download_img(mainImg, hero_skins_path, skinName)


# Fetch the JSON describing all champions, then download each champion's skins
def get_hero_json():
    try:
        hero_list_url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
        response = requests.get(hero_list_url, headers=headers)
        html = json.loads(response.text)  # parse the response body as JSON
        print('Version:', html.get('version'))
        print('File name:', html.get('fileName'))
        print('File updated:', html.get('fileTime'))
        print('Total champions:', len(html.get('hero')))
        for i in html.get('hero'):
            heroId = i.get('heroId')
            load_hero_skin(heroId)
    except RequestException as e:
        # Report network errors instead of failing silently
        print('Request failed:', e)
        return None


def main():
    get_hero_json()


# Runs only when this file is executed directly; when the file is imported
# as a module, the block under if __name__ == '__main__' is skipped.
if __name__ == '__main__':
    main()

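After the download finishes, each champion has its own folder under base_path, named with the champion's name and title, and containing one .jpg file per base skin.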
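The resulting layout can be sketched with pathlib. This is only an illustration of how the script composes paths, not part of the original code: the helper name skin_path and the skin name used in the call are assumptions, and PureWindowsPath is used so the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath


def skin_path(base, hero_name, hero_title, skin_name):
    """Compose <base>\\<hero_name> <hero_title>\\<skin_name>.jpg as a Windows path."""
    folder = PureWindowsPath(base) / f'{hero_name} {hero_title}'
    return folder / f'{skin_name}.jpg'


# 'example skin' is a made-up skin name used only for this demonstration
print(skin_path('D:\\lol_hero_skin', '黑暗之女', '安妮', 'example skin'))
```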

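Usage and notes

Change base_path to the directory you want and the script is ready to run.
Windows does not allow the characters \ / : * ? " < > | in file names, and the K/DA skins ran into this, so the code strips those characters from names before saving.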
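The name handling can be tried in isolation with the sketch below. It is a variant written for this example rather than the script's own function: safe_filename and its fallback parameter are assumptions, and it returns a fixed fallback instead of random digits so the result is predictable.

```python
import re

# Characters that Windows rejects in file names: \ / : * ? " < > |
FORBIDDEN = re.compile(r'[\\/:*?"<>|]')


def safe_filename(name, fallback='unnamed'):
    """Strip forbidden characters; return fallback if nothing is left."""
    cleaned = FORBIDDEN.sub('', name)
    return cleaned if cleaned else fallback


print(safe_filename('K/DA Ahri'))
print(safe_filename('?*'))
```

A fixed fallback makes reruns overwrite the same file, while the random digits in the script avoid collisions between several names that are stripped to nothing; which one fits better depends on how the output is used.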
