Scraping images from a girls' photo site with a crawler

Chit-chat

I'll skip this part here. If you're interested, read it on my self-hosted blog: www.jojo-m.cn

Implementation

import requests
from lxml import etree
import time
import re
import os

header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
}


def Download(url):
    # Example post URLs:
    # url = 'https://www.xxxx.com/9456.html'
    # url = 'https://www.xxxx.com/13487.html'

    # Fetch the page and return its HTML text
    def getPage(url):
        response = requests.get(url, headers=header)
        return response.text

    # Parse the page and read the post title, which names the output folder
    html = etree.HTML(getPage(url))
    dir_name = html.xpath('//h1[@class="post-title h3"]/text()')
    if not dir_name:
        # No title found, so this is not a post page and there is nothing to download
        return
    save_dir = os.path.join("./spider/vmgirls/Down/vmgirls", dir_name[0])
    # Create the folder; if it already exists, the post was downloaded before and is skipped
    if not os.path.exists(save_dir):
        os.makedirs(save_dir)
        # Download the images: they are either wrapped in a link or sit directly in the paragraph
        imgs = html.xpath('//div[@class="post-content"]/div/p/a/img/@data-src')
        if not imgs:
            imgs = html.xpath('//div[@class="post-content"]/div/p/img/@data-src')
        # print(imgs)
        for img in imgs:
            time.sleep(1)  # pause between image requests to go easy on the server
            file_name = img.split('/')[-1]
            print(file_name)
            response = requests.get(img, headers=header)
            with open(os.path.join(save_dir, file_name), 'wb') as f:
                f.write(response.content)


# num = 9017
# range(a, b) counts upward, so a is the lower bound and b the (exclusive) upper bound
a = int(input("Enter lower bound: "))
b = int(input("Enter upper bound: "))
for num in range(a, b):
    time.sleep(1)  # pause between page requests
    page_url = 'https://www.xxxx.com/' + str(num) + '.html'
    print(page_url)
    # Only pages that actually exist are handed to Download
    if requests.get(page_url, headers=header).status_code == 200:
        Download(page_url)
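
One thing the script above does not handle: the folder name comes straight from the page title, and a title containing characters such as / or : cannot be used as a folder name on every system. Here is a minimal sketch of one way to clean it up. The helper name safe_dir_name and the fallback value "untitled" are my own assumptions for illustration, not part of the original script.

```python
import re

# Hypothetical helper, not part of the original script.
# Characters that Windows or POSIX file systems reject in a file or folder name.
_UNSAFE = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_dir_name(title, fallback="untitled"):
    """Turn a page title into a name that is safe to use as a folder."""
    # Replace each unsafe character with an underscore
    cleaned = _UNSAFE.sub("_", title)
    # Trim surrounding whitespace, then the trailing dots/spaces Windows does not allow
    cleaned = cleaned.strip().rstrip(". ")
    # An empty result would point at the parent folder, so use the fallback instead
    return cleaned or fallback


print(safe_dir_name("  Summer / Beach: Day 1?  "))  # prints: Summer _ Beach_ Day 1_
```

In Download, this would be applied to dir_name[0] before building the path.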

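This looks like a personally run site, and it seems to get a fair number of visitors, so I put sleep(1) in two places (once per page and once per image) to avoid putting too much load on its server and possibly taking it down.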
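
A fixed sleep(1) always waits the full second, even when the previous request already took that long. As a rough sketch of an alternative, the class below only sleeps for whatever is left of the minimum gap. The Throttle class and its clock/sleep parameters are hypothetical names I made up for this example; the script above just uses plain time.sleep.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last = None    # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            # Sleep only for the part of the interval that has not yet elapsed
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    # Short interval for the demo; the article's value would be Throttle(1.0)
    throttle = Throttle(0.2)
    for num in range(3):
        throttle.wait()  # call this right before each requests.get(...)
        print("would fetch page", num)
```

Sharing one Throttle instance between the page loop and the image loop would keep every request to the site at least min_interval seconds apart.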

You can also install the Pydroid3 app on your phone and run the script there (so you can look at the pretty girls on your phone too (smirk)).
