Using a post-login cookie in Scrapy to request a Renren profile page

Getting the cookie

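First, open the Renren login page manually in a local browser, enter your account and password to log in to your profile page, and then click through to the 大鵬董成鵬 page. Copy the cookie string for the current page from the browser. It looks like this: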
anonymid=k6li2urqmmt9jn; _r01_=1; taihe_bi_sdk_uid=47b05d5b6248dcc6bfdb17ccb7e300ea; jebe_key=47aa4a50-a3c8-40ca-9db7-4f2bc30b2698%7Cf53e32baff17067fc2a3314328abf6f5%7C1581644213414%7C1%7C1581644225324; depovince=ZGQT; JSESSIONID=abcfFdaSA0QD2d1p4sdmx; ick_login=fc47a3b9-0c47-44fe-ade3-9e6d5b3fe632; taihe_bi_sdk_session=eda59d8bd55d0b638faa357f44e151fe; ick=e432aac9-4d6a-433c-8fd6-1f3d55c32c2c; __utma=151146938.130496749.1593488178.1593488178.1593488178.1; __utmc=151146938; __utmz=151146938.1593488178.1.1.utmcsr=renren.com|utmccn=(referral)|utmcmd=referral|utmcct=/; __utmb=151146938.1.10.1593488178; jebecookies=8d94a419-6a35-4e35-8777-f1f26d9a950b|||||; _de=5389E98C12C7F6A2406083BE9D97C8CE; p=d35644c2155672cbf8a17f61337261418; first_login_flag=1;

Overriding the start_requests method in the Scrapy spider file

    def start_requests(self):
        cookies_str = "anonymid=k6li2urqmmt9jn; _r01_=1; taihe_bi_sdk_uid=47b05d5b6248dcc6bfdb17ccb7e300ea; jebe_key=47aa4a50-a3c8-40ca-9db7-4f2bc30b2698%7Cf53e32baff17067fc2a3314328abf6f5%7C1581644213414%7C1%7C1581644225324; depovince=ZGQT; JSESSIONID=abcfFdaSA0QD2d1p4sdmx; ick_login=fc47a3b9-0c47-44fe-ade3-9e6d5b3fe632; taihe_bi_sdk_session=eda59d8bd55d0b638faa357f44e151fe; ick=e432aac9-4d6a-433c-8fd6-1f3d55c32c2c; __utma=151146938.130496749.1593488178.1593488178.1593488178.1; __utmc=151146938; __utmz=151146938.1593488178.1.1.utmcsr=renren.com|utmccn=(referral)|utmcmd=referral|utmcct=/; __utmb=151146938.1.10.1593488178; jebecookies=8d94a419-6a35-4e35-8777-f1f26d9a950b|||||; _de=5389E98C12C7F6A2406083BE9D97C8CE; p=d35644c2155672cbf8a17f61337261418; first_login_flag=1; ln_uact=17835704219; ln_hurl=http://head.xiaonei.com/photos/0/0/men_main.gif; t=97a71838c4d1d0e0bb518da9263f56de8; societyguester=97a71838c4d1d0e0bb518da9263f56de8; id=974301188; xnsid=fcb2f778; ver=7.0; loginfrom=null; wp_fold=0; jebe_key=47aa4a50-a3c8-40ca-9db7-4f2bc30b2698%7C7509c9b571970a73aca0fcf7b197de35%7C1593488543944%7C1%7C1593488546025"
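        # Split each pair on the first "=" only: some values (e.g. __utmz) contain "=" themselves.
        # If a name appears twice (jebe_key here), the later value wins.
        cookies_dict = dict(i.split("=", 1) for i in cookies_str.split("; "))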
        yield scrapy.Request(
            self.start_urls[0],
            callback=self.parse,
            cookies=cookies_dict
        )

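Note that the raw cookie string cannot be passed as-is: the `cookies` argument of `scrapy.Request` expects a dict (or a list of dicts), so the string has to be converted first.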
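The conversion step can also be written as a small standalone helper. The following is a minimal sketch, not part of the original spider; the function name `cookie_string_to_dict` is hypothetical. It assumes the input is a browser-style `name=value; name=value` string, tolerates a trailing semicolon, and keeps any `=` characters that appear inside a value.

```python
def cookie_string_to_dict(cookie_str):
    """Convert a browser-style cookie string into a dict (hypothetical helper)."""
    cookies = {}
    for pair in cookie_str.split(";"):
        pair = pair.strip()
        if not pair:
            # Skip the empty piece left by a trailing ";".
            continue
        # partition() splits on the first "=" only, so values that
        # themselves contain "=" (such as __utmz) stay intact.
        name, _, value = pair.partition("=")
        cookies[name] = value  # a repeated name keeps the later value
    return cookies


if __name__ == "__main__":
    sample = "first_login_flag=1; __utmz=1.1.utmcsr=renren.com|utmcmd=referral;"
    print(cookie_string_to_dict(sample))
```

The returned dict could then be passed as `cookies=` to `scrapy.Request` in place of the inline expression.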

Verifying in the parse method that the request succeeded

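    def parse(self, response):
        # Requires `import re` at the top of the spider module.
        # A non-empty list means the page name was found in the response.
        print(re.findall(r"大鵬董成鵬", response.text))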

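This checks whether the fetched response contains the string "大鵬董成鵬".
The verification result is as follows:
[Image: screenshot of the verification output]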
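The same check can be expressed as a plain boolean test outside of Scrapy. This is a minimal sketch under stated assumptions: the helper name `page_contains` is hypothetical, the page is assumed to be UTF-8 encoded, and the sample bodies below are made up for illustration rather than taken from a real Renren response.

```python
def page_contains(body, marker, encoding="utf-8"):
    """Return True if the decoded page body contains the marker string."""
    # errors="replace" avoids an exception if a few bytes are malformed.
    text = body.decode(encoding, errors="replace")
    return marker in text


if __name__ == "__main__":
    # Made-up response bodies, for illustration only.
    logged_in = "<title>大鵬董成鵬</title>".encode("utf-8")
    logged_out = "<title>login</title>".encode("utf-8")
    print(page_contains(logged_in, "大鵬董成鵬"))
    print(page_contains(logged_out, "大鵬董成鵬"))
```

Inside `parse`, a check like this could be called with `response.body`, giving a True/False result instead of the list returned by `re.findall`.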
