current position:Home>Python crawler actual combat, requests module, python realizes the full set of skin to capture the glory of the king

Python crawler actual combat, requests module, python realizes the full set of skin to capture the glory of the king

2022-01-30 13:53:42 Dai mubai

Little knowledge , Great challenge ! This article is participating in “ A programmer must have a little knowledge ” Creative activities .

Preface

Today, I'll take you to climb the full set of skin of the king's glory , I don't say much nonsense , Just start ~

development tool

Python edition : 3.6.4

Related modules :

requests modular ;

urllib modular ;

As well as some Python Built in modules .

Environment building

install Python And add to environment variable ,pip Install the relevant modules required .

Thought analysis

1、 Open the official King's glory wallpaper website Web site address :pvp.qq.com/web201605/w…

2、 Shortcut key F12, Call out the console to capture packets

 Grab the bag

3、 Find the right link and analyze

URL Address

4、 View the return data format

 data format 1  data format 2

5、 analysis url link

 analysis url link

6、 see url Whether the content is the required picture , The discovery is actually a thumbnail

 thumbnail

7、 Then analyze the website , Just click on a piece of wallpaper , View links in the specified format

 Link in specified format

8、 Find the target address

 Destination address

9、 Analyze the differences between the target link and the thumbnail link

 thumbnail :http://shp.qpic.cn/ishow/2735090714/1599460171_84828260_8311_sProdImgNo_6.jpg/200

 Goal map :http://shp.qpic.cn/ishow/2735090714/1599460171_84828260_8311_sProdImgNo_6.jpg/0
 Copy code 

You can know , Will specify the format of the thumbnail address after 200 Replace with 0 It's the real picture of the target

Code implementation

import os, time, requests, json, re
from retrying import retry
from urllib import parse
 
class HonorOfKings:
    ''' This is a main Class, the file contains all documents. One document contains paragraphs that have several sentences It loads the original file and converts the original file to new content Then the new content will be saved by this class '''
    def __init__(self, save_path='./heros'):
        self.save_path = save_path
        self.time = str(time.time()).split('.')
        self.url = 'https://apps.game.qq.com/cgi-bin/ams/module/ishow/V1.0/query/workList_inc.cgi?activityId=2735&sVerifyCode=ABCD&sDataType=JSON&iListNum=20&totalpage=0&page={}&iOrder=0&iSortNumClose=1&iAMSActivityId=51991&_everyRead=true&iTypeId=2&iFlowId=267733&iActId=2735&iModuleId=2735&_=%s' % self.time[0]
 
    def hello(self):
        ''' This is a welcome speech :return: self '''
        print("*" * 50)
        print(' ' * 18 + ' Download the glory Wallpaper of the king ')
        print(' ' * 5 + ' author : Felix Date: 2020-05-20 13:14')
        print("*" * 50)
        return self
 
    def run(self):
        ''' The program entry '''
        print('↓' * 20 + '  Format selection : ' + '↓' * 20)
        print('1. thumbnail  2.1024x768 3.1280x720 4.1280x1024 5.1440x900 6.1920x1080 7.1920x1200 8.1920x1440')
        size = input(' Please enter the serial number of the format you want to download , Default 6:')
        size = size if size and int(size) in [1,2,3,4,5,6,7,8] else 6
 
        print('--- Download start ...')
        page = 0
        offset = 0
        total_response = self.request(self.url.format(page)).text
        total_res = json.loads(total_response)
        total_page = --int(total_res['iTotalPages'])
        print('--- in total  {}  page ...' . format(total_page))
        while True:
            if offset > total_page:
                break
            url = self.url.format(offset)
            response = self.request(url).text
            result = json.loads(response)
            now = 0
            for item in result["List"]:
                now += 1
                hero_name = parse.unquote(item['sProdName']).split('-')[0]
                hero_name = re.sub(r'[【】:.<>|·@#$%^&() ]', '', hero_name)
                print('--- Downloading section  {}  page  {}  hero   speed of progress {}/{}...' . format(offset, hero_name, now, len(result["List"])))
                hero_url = parse.unquote(item['sProdImgNo_{}'.format(str(size))])
                save_path = self.save_path + '/' + hero_name
                save_name = save_path + '/' + hero_url.split('/')[-2]
                if not os.path.exists(save_path):
                    os.makedirs(save_path)
                if not os.path.exists(save_name):
                    with open(save_name, 'wb') as f:
                        response_content = self.request(hero_url.replace("/200", "/0")).content
                        f.write(response_content)
            offset += 1
        print('--- Download complete ...')
 
 @retry(stop_max_attempt_number=3)
    def request(self, url):
        ''' Send a request :param url: the url of request :param timeout: the time of request :return: the result of request '''
        response = requests.get(url, timeout=10)
        assert response.status_code == 200
        return response
 
if __name__ == "__main__":
    HonorOfKings().hello().run()
 Copy code 

For the complete source code of this issue, please refer to the profile of your home page

Code run results

 Code runs

 result

copyright notice
author[Dai mubai],Please bring the original link to reprint, thank you.
https://en.pythonmana.com/2022/01/202201301353405579.html

Random recommended