
Python crawler in practice: using the requests module to scrape wallpapers from a website

2022-01-30 19:58:19 Dai mubai

"This is day 1 of my participation in the November writing challenge. For event details, see: 2021 Final Writing Challenge."


This post uses Python to scrape desktop wallpapers. Without further ado, let's get started.

Development tools

Python version: 3.6.4

Related modules:

requests module;

re module;

as well as some Python built-in modules.

Environment setup

Install Python and add it to the environment variables, then use pip to install the required modules.
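Before running the crawler, it can help to confirm that the modules listed above are importable. The following is a minimal sketch; the helper name missing_modules is hypothetical and not part of the original article.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "requests" is installed with pip; "re" ships with Python itself.
    missing = missing_modules(["requests", "re"])
    if missing:
        print("Install these with pip first:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Checking with find_spec avoids actually importing the modules, so the check itself has no side effects.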

Approach

Target website…

Scrolling down the site's home page shows a list of wallpaper sets. Clicking any image opens its details page, which holds a group of pictures as both large images and thumbnails.

The details page disables the right mouse button, so press Ctrl+U to view the page source instead. The image links appear directly in the source. Each picture has two links, and comparing them shows that one carries the extra parameter _360_360. The link without this parameter is the original HD image; the one with it is the standard-definition version.

The details pages are reached from links on the home page. Back on the home page, pressing Ctrl+U shows that the source contains the links to the details pages, so both the home page and the details pages appear to be statically loaded.

Scrolling down the home page keeps loading more data without changing the URL. Clicking the page-turning control at the bottom, however, does change the URL. So to turn pages we only need to change the page parameter in the URL.
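The page-turning idea can be sketched as below. The article does not show the site's real URL pattern, so the template here (example.com and a page query parameter) is purely an assumption for illustration, as are the names PAGE_URL_TEMPLATE and page_urls.

```python
# Hypothetical URL template: the real site's pattern is not given in the article.
PAGE_URL_TEMPLATE = "https://example.com/wallpapers/?page={page}"


def page_urls(first_page, last_page, template=PAGE_URL_TEMPLATE):
    """Build the list-page URLs for first_page..last_page (inclusive)."""
    return [template.format(page=n) for n in range(first_page, last_page + 1)]


if __name__ == "__main__":
    for url in page_urls(1, 3):
        print(url)
```

Each generated URL could then be passed to main() in turn, once the template is replaced with the pattern observed on the actual site.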

Core code

import re

# get_response and url_data are defined in the complete source
def main(html_url):  # html_url: the home page URL
    response = get_response(html_url)  # Request the home page
    # Extract the details-page URLs, keeping matches 31 through 46
    urls = re.findall('<a href="(.*?)" target="_blank">.*?</a>', response.text)[31:47]
    for link in urls:
        response_ = get_response(link)  # Request each details page
        # Extract the image URLs, skipping the first match
        image_url = re.findall('src="(.*?)"', response_.text)[1:]
        url_data(image_url)  # Pass the image URLs on to url_data
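To see how the two regular expressions in the core code behave, here is a small offline sketch that runs them against a made-up HTML fragment. The sample markup, the example.com URLs, and the helper names are all assumptions; the real pages will differ, which is why the original code also slices the results.

```python
import re

# Made-up HTML standing in for a fetched page (not the real site's markup).
SAMPLE_HTML = """
<a href="https://example.com/detail/1.html" target="_blank"><img src="https://example.com/t/1_360_360.jpg"></a>
<a href="https://example.com/detail/2.html" target="_blank">Second set</a>
<a href="https://example.com/about.html">About</a>
"""

DETAIL_LINK = re.compile(r'<a href="(.*?)" target="_blank">.*?</a>')
IMAGE_SRC = re.compile(r'src="(.*?)"')


def extract_detail_links(html):
    """Return the href of every link that opens in a new tab."""
    return DETAIL_LINK.findall(html)


def extract_image_urls(html):
    """Return the value of every src attribute in the page."""
    return IMAGE_SRC.findall(html)


if __name__ == "__main__":
    print(extract_detail_links(SAMPLE_HTML))
    print(extract_image_urls(SAMPLE_HTML))
```

The lazy quantifier .*? stops at the first closing quote, so each match captures a single attribute value rather than running on to the end of the line.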

Filtering out unwanted data

[Image: removing data with a regular expression]
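The article's own code for this step is not reproduced in the text. One plausible reading, based on the earlier observation that the HD link is the thumbnail link without _360_360, is to strip that parameter with re.sub. The function name to_hd_url and the example URLs are hypothetical.

```python
import re


def to_hd_url(image_url):
    """Remove the `_360_360` thumbnail parameter to get the original image URL."""
    return re.sub(r"_360_360", "", image_url)


if __name__ == "__main__":
    print(to_hd_url("https://example.com/img/abc_360_360.jpg"))
```

A URL that does not contain the parameter is returned unchanged, so the function is safe to apply to every extracted link.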

The complete source code is available via the profile on the author's home page.

Saving the data locally

[Image: saving the data]

Copyright notice
Author: Dai mubai. Please include the original link when reprinting. Thank you.
