
Introduction to Python (III): Network Requests and Parsing

2022-01-30 13:20:13 baiyuliang

Install the network request module, Requests:

pip install requests


Does this look familiar? Does it remind you of Node.js?

A simple test. First, import the requests module:

import requests

A GET request:

response = requests.get("https://www.baidu.com")
print(response)

Result:

<Response [200]>

This indicates that the request succeeded. We can inspect the response object's contents in the editor.

Print response.text: it's the content of the Baidu home page, but the Chinese characters are garbled. Don't worry; just add one step:

response = requests.get("https://www.baidu.com")
response.encoding = response.apparent_encoding
print(response.text)

OK! Isn't that simple?
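Why this works: response.encoding is the codec requests guessed from the HTTP headers, while response.apparent_encoding is detected from the response bytes themselves. Here is a minimal offline sketch of the same mojibake effect, using a hand-built Response object so it runs without network access (the sample text is made up; _content is a private attribute, used here only for illustration):

```python
import requests

# Build a Response by hand so the sketch runs without network access.
resp = requests.models.Response()
resp._content = "百度一下，你就知道".encode("utf-8")  # the real bytes are UTF-8

resp.encoding = "ISO-8859-1"      # a wrong guess, like one taken from headers
garbled = resp.text               # decoded with the wrong codec -> mojibake

resp.encoding = "utf-8"           # the correct codec (apparent_encoding
                                  # detects this from the bytes at run time)
fixed = resp.text

print(garbled)
print(fixed)                      # 百度一下，你就知道
```

Setting response.encoding = response.apparent_encoding simply automates the second assignment above.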

Of course, requests is not just for GET; it also supports POST, PUT, DELETE, and so on.
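The other verbs mirror requests.get: each is a module-level helper that returns a Response. A quick sketch (the URL is a placeholder and the calls are commented out so nothing is actually sent):

```python
import requests

url = "https://example.com/api/items"   # placeholder URL

# Each HTTP verb has a helper with the same shape as requests.get:
# requests.post(url, data={"name": "test"})       # create
# requests.put(url + "/1", data={"name": "new"})  # update
# requests.delete(url + "/1")                     # delete

# All of these helpers exist as callables on the module:
for verb in ("get", "post", "put", "delete", "head", "patch", "options"):
    print(verb, callable(getattr(requests, verb)))
```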

Whatever the request method, it also accepts headers, params, and so on:

requests.get("https://www.baidu.com", headers={...}, params={...})
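What params actually does is URL-encode the dict into the query string. We can see this offline with requests' own Request/PreparedRequest machinery (the header and parameter values here are made-up examples):

```python
import requests

req = requests.Request(
    "GET",
    "https://www.baidu.com/s",
    headers={"User-Agent": "my-test-client/1.0"},  # made-up UA string
    params={"wd": "python", "pn": 10},             # example query parameters
)
prepared = req.prepare()  # builds the final request without sending it

print(prepared.url)                    # params encoded into the query string
print(prepared.headers["User-Agent"])  # my-test-client/1.0
```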

These are the essentials of any network request library!

Let's take 360's wallpaper API as an example and make a paged request: wallpaper.apc.360.cn/index.php?c…

Of course, we can request the link directly with GET, or make a POST request and pass in the parameters:

params = {
    'c': 'WallPaperAndroid',
    'a': 'getAppsByCategory',
    'cid': 9,
    'start': 0,
    'count': 10
}
response = requests.post("http://wallpaper.apc.360.cn/index.php", params=params)
print(response.text)

The request returns a JSON-formatted result.

Parsing the JSON:

import json

json_data = json.loads(response.text)
print('errno=%s,errmsg=%s' % (json_data['errno'], json_data['errmsg']))
data_list = json_data['data']
print("count=" + str(len(data_list)))

Result: the error code, error message, and item count are printed.

Note: when printing logs, + can only concatenate strings with other strings, so we need str() to convert the int to a string first!
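The same parsing works on any JSON string, and an f-string avoids the str() dance entirely. A self-contained sketch (the payload below is made up to mimic the shape of the API's response):

```python
import json

# A made-up payload shaped like the wallpaper API's response.
raw = '{"errno": "0", "errmsg": "ok", "data": [{"id": 1}, {"id": 2}]}'

json_data = json.loads(raw)
items = json_data["data"]

# f-strings format any type without manual str() conversion:
print(f"errno={json_data['errno']},errmsg={json_data['errmsg']}")
print(f"count={len(items)}")  # len(items) is an int; no str() needed
```

(For a real response, response.json() is a shortcut for json.loads(response.text).)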

OK, JSON parsing is done. But what if we want to parse a web page? A long time ago, when I was parsing web pages in Java, I used a tool called jsoup, which I'm sure many of you have used: it parses the page as if it were XML, with all its nodes, elements, and so on...

Python has a similar, powerful web page parsing tool: BeautifulSoup. (Note: Python also ships with xml.sax and xml.dom parsers, but you'll have to judge for yourself how convenient they are!)

The BeautifulSoup documentation

BeautifulSoup supports several parsers, each with its own advantages and disadvantages. When parsing web page data, we usually use the second one: BeautifulSoup(markup, "lxml")!

Install BeautifulSoup: pip install bs4

A simple test, using the Baidu home page as an example:

from bs4 import BeautifulSoup

response = requests.get("https://www.baidu.com")
response.encoding = response.apparent_encoding
print(response.text)
soup = BeautifulSoup(response.text, "lxml")
title = soup.find(name='title').text  # name can be omitted: soup.find('title')
print(title)

An error is reported during execution:

Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

Solution:

1. Install virtualenv: pip install virtualenv
2. Install lxml: pip install lxml

Re-run the .py program; this time the page title is printed correctly.
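BeautifulSoup works on any HTML string, not just live responses, so here is a self-contained sketch on a made-up snippet (it uses Python's built-in "html.parser" backend so it runs even without lxml installed; swap in "lxml" for the faster parser):

```python
from bs4 import BeautifulSoup

# A made-up HTML snippet standing in for a downloaded page.
html = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <a href="/a" class="link">first</a>
    <a href="/b" class="link">second</a>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")  # stdlib backend; "lxml" also works
print(soup.find("title").text)             # Demo page
for a in soup.find_all("a", class_="link"):
    print(a["href"], a.text)               # href attribute and link text
```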

copyright notice
Author: baiyuliang. Please include the original link when reprinting. Thank you.
https://en.pythonmana.com/2022/01/202201301320113205.html
