
Use Python to get information on popular positions and see what skills a high salary requires

2022-02-02 09:49:29 Hi, learn programming


It's the end of the year. Those with jobs are thinking about changing jobs next year, and those without are thinking about finding one. So have you figured out what you want to do?

Not sure how to get a handle on the market? Don't worry: we'll use Python to pull the data and analyse it in one go!

1. Preparation

1. Software used

Python 3.8
PyCharm 2021 Professional

2. Built-in modules used

pprint >>> # pretty-printing module
csv >>> # for saving CSV files
re >>> # regular expressions
time >>> # time module

3. Third-party modules to install

requests >>> # data request (HTTP) module

Press Win + R, type cmd, and run the install command pip install <module name>. If the install fails with a wall of red errors, it is probably a network timeout; switch to a domestic mirror source.
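For example, installing requests through the Tsinghua mirror (one commonly used domestic mirror; any mirror you trust works the same way with pip's -i option):

pip install requests -i https://pypi.tuna.tsinghua.edu.cn/simple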

If you really can't get it working, see my pinned article.

2. The approach

A crawler simulates a browser: it sends a request to the server and reads the data the server sends back in the response.
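As a minimal sketch of that idea (using httpbin.org as a neutral test endpoint rather than the job site analysed below), sending a request with a browser-style User-Agent looks like this:

import requests  # third-party HTTP client

# Pretend to be a browser by sending a browser-like User-Agent header.
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
response = requests.get('https://httpbin.org/get', headers=headers)
print(response.status_code)  # 200 means the server accepted the request
print(response.json())       # httpbin echoes the request back as JSON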

Data source analysis

First, set a goal: decide what data you want and work out where on the site it comes from.

Then figure out which URL the data is requested from, what kind of request is sent, and which request headers go with it (capture and analyse the packets with the browser's developer tools).

What we analyse is the data returned by the server, not the Elements panel: Elements shows the page content after the front-end code has rendered it.

Code implementation steps

  1. Send a request: a POST request to the URL (the data packet) we just analysed, with the request parameters and the request headers;
  2. Get the data: read the response body, i.e. the data returned by the server;
  3. Parse the data: extract what we want, choosing the parsing method that best fits the returned data;
  4. Save the data: write it to a local database / text file / spreadsheet;
  5. Crawl multiple pages.

3. The code

Here comes everyone's favourite part: the code.

import requests  # data request (HTTP) module
import pprint  # pretty-printing module
import csv  # for saving CSV files
import time  # time module


# Open the output file. mode='a' means append; newline='' avoids blank rows in the CSV on Windows.
f = open('lagou.csv', mode='a', encoding='utf-8', newline='')
csv_writer = csv.DictWriter(f, fieldnames=[
    'Title',
    'Company',
    'City',
    'Salary',
    'Experience',
    'Education',
    'Detail page',
])
csv_writer.writeheader()  # write the header row
for page in range(1, 11):
    # 1. Send a request. The f-string's {} is a placeholder for the page number.
    print(f'=================== Scraping page {page} ===================')
    time.sleep(2)  # wait 2 seconds between pages
    url = 'https://www.lagou.com/jobs/v2/positionAjax.json'  # the request URL
    # headers: a crawler simulates a browser sending a request to the server and reading its response.
    # The request headers disguise the Python code as a browser when it sends the request;
    # this gets past simple anti-scraping checks.
    # user-agent is the user agent: the identity of the browser.
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36'
    }
    # data: the request parameters. A POST request needs form data to be passed along.
    # PyCharm tip 1: Ctrl + R does fast batch replacement with regular-expression matching rules.
    # PyCharm tip 2: a translation plug-in can be installed.
    data = {
        'first': 'true',
        'needAddtionalResult': 'false',
        'city': '全国',  # "nationwide"
        'px': 'new',
        'pn': page,
        'fromSearch': 'true',
        'kd': 'python',
    }
    # Send a POST request to the URL through the requests module, passing the data parameters
    # and the headers; the response is received in the `response` variable.
    response = requests.post(url=url, data=data, headers=headers)
    # <Response [200]> is the returned response object; status code 200 means the request succeeded.
    # 2. Get the data: response.json() gives the JSON as a dict, response.text gives the text
    #    (string) data, response.content gives the raw bytes.
    # print(response.text)
    # pprint.pprint(response.json())
    # 3. Parse the data. The response is dict data, so extract values by key:
    #    the key on the left of the colon gives the value on its right.
    result = response.json()['content']['positionResult']['result']
    # pprint.pprint(result)
    for index in result:  # loop over the list and handle each element
        title = index['positionName']  # job title
        company_name = index['companyFullName']  # company name
        city = index['city']  # city
        money = index['salary']  # salary
        workYear = index['workYear']  # experience required
        edu = index['education']  # education required
        href = f'https://www.lagou.com/wn/jobs/{index["positionId"]}.html'  # detail-page URL
        # (json.loads() turns a JSON string into dict data; response.json() already did that here.)
        dit = {
            'Title': title,
            'Company': company_name,
            'City': city,
            'Salary': money,
            'Experience': workYear,
            'Education': edu,
            'Detail page': href,
        }
        csv_writer.writerow(dit)
        print(dit)
f.close()  # close the output file once all pages are written
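To take a first look at the results (the analysis the title promises), a minimal sketch like the one below works off the CSV written above. It assumes pandas is installed (pip install pandas); pandas is not part of the original script.

import pandas as pd  # extra dependency, not used by the scraper above

df = pd.read_csv('lagou.csv')
# Which cities post the most Python jobs?
print(df['City'].value_counts().head(10))
# Which salary ranges (as listed, e.g. "15k-30k") appear most often?
print(df['Salary'].value_counts().head(10))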

If you found this useful, remember to give it a like, a favourite and a comment!


Copyright notice
Author: Hi, learn programming. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/02/202202020949267169.html
