
Learn Python quickly and take a shortcut~

2022-02-01 04:23:09 Vegetable farmer said

Hello everyone, I'm Small Dish, a man who hopes to one day talk architecture with a bit of swagger! If that's the kind of person you want to become too, follow along and keep me company, so Small Dish is no longer alone!

This article is an introduction to getting started with Python.

Refer to it as needed.

If it helps, don't forget to give it a thumbs-up.

The WeChat official account "Vegetable farmer said" is now open. If you haven't followed it yet, remember to follow!

Hello, everyone. Cai Bucai here, the former name of Vegetable farmer said. The name and the avatar have changed, so don't get lost~

Recently, to broaden the languages I know, I spent this week learning how Python works, and afterwards I found, ah, it really is delightful. I wonder whether you also find a language kind of fun when you first pick it up and want to try everything.

When Python comes up, the first things that come to mind are probably crawlers and automated testing; using Python for web development is mentioned much less often. Relatively speaking, web development in China is more often done in Java~ But that doesn't mean Python is unsuitable for web development. As far as I know, commonly used web frameworks include Django and Flask~

Django is a heavyweight framework. It provides many convenient tools and wraps up a lot of things for you, so you don't have to reinvent many wheels yourself.

Flask's advantage is that it is small, but that is also its drawback: flexibility means you have to build more wheels yourself or spend more time on configuration.

But the point of this article is neither Python web development nor basic Python syntax; it is an introduction to automated testing and crawlers with Python~

In my opinion, if you already have development experience in another language, Small Dish recommends starting straight from a worked example and learning as you go. The syntax is much the same (a later post will cover learning Python coming from Java), and you can read most of the code right away. If you have no programming experience at all, it is better to study systematically from the beginning. Videos and books are both good choices; here I recommend Mr. Liao Xuefeng's blog, whose Python course is quite good.

I. Automated testing

There is a lot Python can do, and a lot of it is fun.

To learn a language faster you of course need to find something interesting to do with it, for example scraping the pictures or videos from some website, right~

What is automated testing? It is automation + testing: you write a script (a .py file), run it, and it carries out the test steps for you automatically. A good tool for this is Selenium.

Selenium is a web automation testing tool that simulates a real user's operations in the browser. It supports the mainstream browsers, such as IE, Chrome, Firefox, Safari and Opera. Python is used for the demonstration here, but that does not mean Selenium only supports Python: it has client drivers for multiple programming languages. Here is a brief look at the syntax~ Let's do a simple demonstration!

1) Preparation

To make sure the demonstration goes smoothly, we need to do some preparation first; otherwise the browser may fail to open~

Step 1

Check the browser version. We use Edge below; enter edge://version in the address bar to see the browser version, then go to the driver download page and get the driver matching that version: Microsoft Edge - Webdriver (windows.net)

Step 2

Then unzip the downloaded driver into the Scripts folder under your Python installation directory.


2) Browser operation

With the preparation done, let's look at the following simple code:

Including the import, it is only 4 lines of code. Run python autoTest.py in the terminal and you get the following result:

You can see that the script automatically opens the browser, maximizes the window and opens the Baidu page: three automated operations. That takes our learning one step further. Interesting, isn't it~ You'll gradually get hooked!
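The three operations can be sketched as a small function that works with any driver object. This is a minimal sketch rather than the author's original listing: `open_maximized` is a hypothetical helper name, and the commented usage assumes Selenium 4 and a matching Edge driver are installed.

```python
def open_maximized(driver, url):
    """Maximize the browser window, then load the given page."""
    driver.maximize_window()
    driver.get(url)
    return driver


# Usage with a real browser (assumes Selenium and the Edge driver are installed):
# from selenium import webdriver
# driver = open_maximized(webdriver.ChromiumEdge(), "http://baidu.com")
```

Passing the driver in, instead of creating it inside the function, makes it easy to swap Edge for another browser later.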

Here are some common methods for browser operation:

Method                        Description
webdriver.xxx()               Create a browser object
maximize_window()             Maximize the window
get_window_size()             Get the browser window size
set_window_size()             Set the browser window size
get_window_position()         Get the browser window position
set_window_position(x, y)     Set the browser window position
close()                       Close the current tab/window
quit()                        Close all tabs/windows
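A few of these methods can be combined as follows. This is only a sketch: `resize_and_report` is a hypothetical helper, the default 1280x800 size is an arbitrary choice, and `driver` is assumed to be a Selenium WebDriver created elsewhere.

```python
def resize_and_report(driver, width=1280, height=800):
    """Resize the window, move it to the top-left corner, and report its geometry."""
    driver.set_window_size(width, height)
    driver.set_window_position(0, 0)
    return {
        "size": driver.get_window_size(),          # dict with 'width' and 'height'
        "position": driver.get_window_position(),  # dict with 'x' and 'y'
    }


# Usage with a real browser (assumes Selenium and the Edge driver are installed):
# from selenium import webdriver
# driver = webdriver.ChromiumEdge()
# print(resize_and_report(driver))
# driver.quit()  # closes every window and ends the session
```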

These are of course just Selenium's basic routine operations; the better part is still to come~

Once the browser is open, we of course want to do more than just open a page. After all, a programmer's ambition knows no bounds! We also want to operate on page elements automatically, which brings us to Selenium's element locating.

3) Locating elements

Locating page elements is nothing new to front-end developers; it is easy with JS, for example:

  • Locate by id

document.getElementById("id")

  • Locate by name

document.getElementsByName("name")

  • Locate by tag name

document.getElementsByTagName("tagName")

  • Locate by class

document.getElementsByClassName("className")

  • Locate by CSS selector

document.querySelectorAll("css selector")

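All of these select and locate elements. But the protagonist of this section is Selenium, and as a mainstream automated testing tool it is not to be outdone~ It offers 8 ways of locating page elements. In current Selenium 4 releases they are all reached through find_element(By.<strategy>, value); the find_element_by_* helpers seen in older tutorials were removed in Selenium 4.3. The 8 ways are as follows: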

  1. id locating

driver.find_element(By.ID, "id")

Open the Baidu page and you will find that the id of the search input box is kw.

Once the element's id is known, we can locate the element by id, as follows:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Load the Edge driver
driver = webdriver.ChromiumEdge()
# Maximize the window
driver.maximize_window()
# Open the Baidu page
driver.get("http://baidu.com")

# Locate the element by id (replaces find_element_by_id, removed in Selenium 4.3)
i = driver.find_element(By.ID, "kw")
# Type a value into the input box
i.send_keys("Vegetable farmer said")

  2. name attribute locating

driver.find_element(By.NAME, "name")

Locating by name works much like locating by id: find the value of the name attribute, then call the corresponding API, as follows:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Load the Edge driver
driver = webdriver.ChromiumEdge()
# Maximize the window
driver.maximize_window()
# Open the Baidu page
driver.get("http://baidu.com")

# Locate the element by name (replaces find_element_by_name, removed in Selenium 4.3)
i = driver.find_element(By.NAME, "wd")
# Type a value into the input box
i.send_keys("Vegetable farmer said")
  3. Class name locating

driver.find_element(By.CLASS_NAME, "className")

As with id and name, we need to find the element's class name first and then locate by it~

  4. Tag name locating

driver.find_element(By.TAG_NAME, "tagName")

This method is rarely used in practice. In HTML, tags define function: input is an input field, table is a table, and so on. Every element is a tag, and one tag usually defines a whole class of functions, so a page may contain many div, input or table elements. That makes it hard to pinpoint a single element by tag~
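The ambiguity is easy to see by counting matches. The sketch below assumes `driver` is a Selenium 4 WebDriver; `count_by_tag` is a hypothetical helper, and the strategy is written as a plain string so the snippet has no imports.

```python
TAG_NAME = "tag name"  # same string value as selenium's By.TAG_NAME


def count_by_tag(driver, tag):
    """Return how many elements on the current page use the given tag."""
    # find_elements (plural) returns a list of every match, possibly empty
    return len(driver.find_elements(TAG_NAME, tag))


# Usage with a real browser (assumes Selenium and the Edge driver are installed):
# from selenium import webdriver
# driver = webdriver.ChromiumEdge()
# driver.get("http://baidu.com")
# print(count_by_tag(driver, "input"))  # usually more than one
```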

  5. CSS selector locating

driver.find_element(By.CSS_SELECTOR, "cssValue")

This approach requires knowing the five kinds of CSS selectors.

Five selectors

  1. Element selector

The most common CSS selector is the element selector. In an HTML document it usually refers to a kind of HTML element, for example:

html {background-color: black;}
p {font-size: 30px; background-color: gray;}
h2 {background-color: red;}

  2. Class selector

A dot (.) followed by a class name forms a class selector, for example:

.deadline { color: red;}
span.deadline { font-style: italic;}

  3. id selector

ID selectors are somewhat similar to class selectors, but the differences are significant. First, unlike the class attribute, which may list several classes, an element can have only one ID, and that ID should be unique in the document. An ID selector is written as a pound sign # followed by the id value, for example:

#top { ...}

  4. Attribute selector

We can select elements by their attributes and attribute values, for example:

a[href][title] { ...}

  5. Descendant selector

Also called the context selector, it uses the document's DOM structure to select elements, for example:

body li { ...}
h1 span { ...}

Of course, this is only a brief introduction to selectors; for more, please refer to the documentation~

Now that we know the selectors, we can happily locate elements with a CSS selector:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Load the Edge driver
driver = webdriver.ChromiumEdge()
# Maximize the window
driver.maximize_window()
# Open the Baidu page
driver.get("http://baidu.com")

# Locate the element with an id selector. Use find_element (singular):
# find_elements returns a list, which has no send_keys method
i = driver.find_element(By.CSS_SELECTOR, "#kw")
# Type a value into the input box
i.send_keys("Vegetable farmer said")
  6. Link text locating

driver.find_element(By.LINK_TEXT, "linkText")

This method is specifically for locating text links. For example, on Baidu's home page we can see links such as News, hao123, Map and so on.

We can then locate them by their link text:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Load the Edge driver
driver = webdriver.ChromiumEdge()
# Maximize the window
driver.maximize_window()
# Open the Baidu page
driver.get("http://baidu.com")

# Locate the element by its link text and click it
driver.find_element(By.LINK_TEXT, "hao123").click()

  7. Partial link text locating

driver.find_element(By.PARTIAL_LINK_TEXT, "partialLinkText")

This method complements link text locating. Sometimes a hyperlink's text is very long, and typing all of it is both tedious and ugly.

In fact we only need to give part of the string, enough for Selenium to understand which link we mean, and that is what partial link text is for~
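A minimal sketch of partial link text locating, assuming `driver` is a Selenium 4 WebDriver. `click_link_containing` is a hypothetical helper, and "hao" is just an example fragment of the hao123 link mentioned earlier.

```python
PARTIAL_LINK_TEXT = "partial link text"  # same string value as selenium's By.PARTIAL_LINK_TEXT


def click_link_containing(driver, fragment):
    """Click the first link whose visible text contains the given fragment."""
    link = driver.find_element(PARTIAL_LINK_TEXT, fragment)
    link.click()
    return link


# Usage with a real browser (assumes Selenium and the Edge driver are installed):
# from selenium import webdriver
# driver = webdriver.ChromiumEdge()
# driver.get("http://baidu.com")
# click_link_containing(driver, "hao")
```

If several links contain the fragment, only the first match is clicked, so the fragment should be long enough to be unambiguous.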

  8. XPath expression locating

driver.find_element(By.XPATH, "xpathName")

The previous locating methods all assume the ideal case, where every element has a unique id, name, class or link text, so we can locate it through that unique value. But sometimes the element we want has no id, name or class attribute, or several elements share the same values, or the values change when the page is refreshed. Then we can only locate it through XPath or CSS. Of course, you don't need to work out the XPath by hand: open the page, press F12, find the element, right-click it and choose Copy XPath.

Then locate it in the code:

from selenium import webdriver
from selenium.webdriver.common.by import By

# Load the Edge driver
driver = webdriver.ChromiumEdge()
# Maximize the window
driver.maximize_window()
# Open the Baidu page
driver.get("http://www.baidu.com")

# Locate the input box by XPath and type a value into it
driver.find_element(By.XPATH, "//*[@id='kw']").send_keys("Vegetable farmer said")
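To see what an expression like //*[@id='kw'] matches without launching a browser, it can be tried against a small document using the standard library. This is a sketch under assumptions: the markup below is a made-up stand-in for a search page, and ElementTree supports only a subset of XPath, so it illustrates the idea rather than Selenium's full XPath engine.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed stand-in for a search page (hypothetical markup).
PAGE = """
<html>
  <body>
    <form>
      <input id="kw" name="wd" class="s_ipt" />
      <input id="su" type="submit" value="Search" />
    </form>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# ElementTree paths are relative, so //*[@id='kw'] is written .//*[@id='kw'].
box = root.find(".//*[@id='kw']")
print(box.tag, box.attrib["name"])

# A positional path, like the ones copied from the browser's developer tools,
# reaches the same element but breaks as soon as the layout changes.
same_box = root.find("./body/form/input[1]")
print(same_box is box)
```

The attribute-based expression keeps working if the form is moved elsewhere on the page, which is why it is usually preferable to a long positional path.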

4) Element operation

Of course, what we want is not just to select an element but to operate on it once selected. The demonstrations above already used two operations, click() and send_keys("value"); here are a few more~

Method name           Description
click()               Click the element
send_keys("value")    Simulate keyboard input
clear()               Clear the element's content, e.g. an input box
submit()              Submit the form
text                  Get the element's text content (a property, not a method)
is_displayed()        Check whether the element is visible

Does it feel familiar after reading that? Aren't these just the basic operations in JS~!
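Several of these operations can be chained on one element. The sketch assumes `driver` is a Selenium 4 WebDriver; `search` is a hypothetical helper, and the default id "kw" is the Baidu search box id mentioned earlier.

```python
ID = "id"  # same string value as selenium's By.ID


def search(driver, text, box_id="kw"):
    """Clear the search box, type the text, and submit the surrounding form."""
    box = driver.find_element(ID, box_id)
    box.clear()          # remove anything already typed
    box.send_keys(text)  # simulate keyboard input
    box.submit()         # submit the form that contains the box
    return box


# Usage with a real browser (assumes Selenium and the Edge driver are installed):
# from selenium import webdriver
# driver = webdriver.ChromiumEdge()
# driver.get("http://baidu.com")
# search(driver, "Vegetable farmer said")
```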

5) Practical exercises

Having learned the operations above, we can simulate a shopping flow on the Xiaomi Mall. The code is as follows:

from selenium import webdriver
from selenium.webdriver.common.by import By

item_url = "https://www.mi.com/buy/detail?product_id=10000330"

# Load the Edge driver
driver = webdriver.ChromiumEdge()
# Maximize the window
driver.maximize_window()
# Open the product page
driver.get(item_url)
# Implicit wait: every later find_element call retries for up to 30 seconds,
# which covers slow page loads. It is a session-wide setting, so one call is enough
driver.implicitly_wait(30)

# The XPaths below were copied from the page when this was written
# and need updating whenever the page layout changes

# Select address
driver.find_element(By.XPATH, "//*[@id='app']/div[3]/div/div/div/div[2]/div[2]/div[3]/div/div/div[1]/a").click()
# Click to choose the address manually
driver.find_element(By.XPATH,
                    "//*[@id='stat_e3c9df7196008778']/div[2]/div[2]/div/div/div/div/div/div/div["
                    "1]/div/div/div[2]/span[1]").click()
# Choose Fujian
driver.find_element(By.XPATH,
                    "//*[@id='stat_e3c9df7196008778']/div[2]/div[2]/div/div/div/div/div/div/div/div/div/div["
                    "1]/div[2]/span[13]").click()
# Choose the city
driver.find_element(By.XPATH,
                    "//*[@id='stat_e3c9df7196008778']/div[2]/div[2]/div/div/div/div/div/div/div/div/div/div["
                    "1]/div[2]/span[1]").click()
# Choose the district
driver.find_element(By.XPATH,
                    "//*[@id='stat_e3c9df7196008778']/div[2]/div[2]/div/div/div/div/div/div/div/div/div/div["
                    "1]/div[2]/span[1]").click()
# Choose the street
driver.find_element(By.XPATH,
                    "//*[@id='stat_e3c9df7196008778']/div[2]/div[2]/div/div/div/div/div/div/div/div/div/div["
                    "1]/div[2]/span[1]").click()

# Click Add to cart
driver.find_element(By.CLASS_NAME, "sale-btn").click()

# Click to go to the shopping cart
driver.find_element(By.XPATH, "//*[@id='app']/div[2]/div/div[1]/div[2]/a[2]").click()

# Click to check out
driver.find_element(By.XPATH, "//*[@id='app']/div[2]/div/div/div/div[1]/div[4]/span/a").click()

# Click to accept the agreement
driver.find_element(By.XPATH, "//*[@id='stat_e3c9df7196008778']/div[2]/div[2]/div/div/div/div[3]/button[1]").click()

The effect is as follows :

That is our learning put into practice. Of course, if you run into a flash sale, you might as well write a script to practice on~ If the item is out of stock, we can add a while loop to keep polling the page!
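The polling idea can be sketched as a bounded loop rather than an endless while. Everything here is illustrative: `poll_until` is a hypothetical helper, and `check` stands for whatever function reports that the item is in stock (for example, one that refreshes the page and looks for the buy button).

```python
import time


def poll_until(check, interval=1.0, max_attempts=10, sleep=time.sleep):
    """Call check() until it returns a truthy value or the attempts run out.

    Returns the 1-based number of the successful attempt, or None on failure.
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        if attempt < max_attempts:
            sleep(interval)  # pause so the site is not hammered with requests
    return None


# Example with a stand-in check that succeeds on the third call.
answers = iter([False, False, True])
print(poll_until(lambda: next(answers), interval=0.01))
```

Capping the number of attempts and pausing between them keeps the script from looping forever or flooding the site with requests.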

II. Crawler test

Above we used Selenium for automated testing (make sure any such use is legal~). Next, let's show another powerful use of Python: crawlers.

Before learning to write a crawler, we need to know a few necessary tools.

1) Page downloader

The Python standard library already provides modules such as urllib, urllib2 and httplib for HTTP requests (in Python 3 these live in urllib.request and http.client), but their APIs are neither convenient nor elegant~ Even the simplest task takes a lot of work and a lot of method overriding. Programmers of course can't stand that, so heroes from all quarters have developed all kinds of easy-to-use third-party libraries~

  • requests

requests is an HTTP library written in Python and released under the Apache2 license. It wraps the lower-level HTTP machinery (it is built on urllib3) in a much friendlier API, so users can more conveniently perform every operation a browser can when making network requests~

  • scrapy

The difference between requests and scrapy is roughly this: scrapy is a relatively heavyweight framework for site-level crawling, while requests works at the level of individual pages, and in concurrency and performance requests is not as good as scrapy.

2) Page parser

  • BeautifulSoup

BeautifulSoup is a module that takes an HTML or XML string and parses it into a structured document; you can then use the methods it provides to quickly find specific elements, which makes locating elements in HTML or XML easy.

  • scrapy.Selector

Selector is built on parsel, a relatively high-level wrapper that selects parts of an HTML document through specific XPath or CSS expressions. It is built on top of the lxml library, which means the two are very similar in speed and parsing accuracy.

For details, see the Scrapy documentation, which covers this quite thoroughly.

3) data storage

Once we have scraped the content, we need somewhere to store it.

Specific database operations will be covered in a later blog post on web development~

  • txt text

The usual file read/write operations.

  • sqlite3

SQLite is a lightweight database, an ACID-compliant relational database management system contained in a relatively small C library.

  • mysql

No need to say much, everyone knows it: web development's old flame.
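Of the options above, SQLite is the easiest to try because the sqlite3 module ships with Python. The sketch below is illustrative only: the `chapter` table, its columns and the `save_chapters` helper are assumptions for the example, and an in-memory database is used so nothing is written to disk.

```python
import sqlite3


def save_chapters(conn, chapters):
    """Store (title, content) pairs, overwriting rows with the same title."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chapter ("
        "title TEXT PRIMARY KEY, content TEXT NOT NULL)"
    )
    # INSERT OR REPLACE lets a re-run overwrite an earlier crawl of the same title
    conn.executemany(
        "INSERT OR REPLACE INTO chapter (title, content) VALUES (?, ?)",
        chapters,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # pass a file path instead to keep the data
save_chapters(conn, [("Intro", "Hello"), ("Install", "Steps")])
save_chapters(conn, [("Intro", "Hello again")])
rows = conn.execute("SELECT title, content FROM chapter ORDER BY title").fetchall()
print(rows)
```

The `?` placeholders let sqlite3 do the quoting, so scraped text containing quotes cannot break the statement.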

4) Practical exercises

A web crawler is perhaps easier to understand under the name "network data collection": it requests data (HTML) from a web server programmatically, then parses the HTML and extracts the data you want.

We can simply divide it into 3 steps:

  • Fetch the HTML from the given URL
  • Parse the HTML and extract the target data
  • Store the data

Of course, all this assumes you know basic Python syntax and basic HTML~

Let's practice with the combination of requests + BeautifulSoup + txt. Suppose we want to scrape the content of teacher Liao Xuefeng's Python course~
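The three steps can be sketched with the standard library alone. Note the substitution: this sketch uses html.parser in place of BeautifulSoup so that it runs without extra packages, and `LinkCollector`, `extract_links` and `crawl` are hypothetical names. Fetching and storing are passed in as functions, so the parsing step can be tried on a plain string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (absolute_url, link_text) pairs from every <a href=...> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None  # href of the <a> currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            url = urljoin(self.base_url, self._href)
            self.links.append((url, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def crawl(url, fetch, store):
    html = fetch(url)                 # step 1: fetch the HTML
    links = extract_links(html, url)  # step 2: parse out the target data
    store(links)                      # step 3: store it
    return len(links)
```

For example, `crawl(url, fetch=lambda u: some_html_string, store=print)` runs the whole flow on a string without touching the network.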

# Import the requests library
import requests
# File and path handling
import codecs
import os
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Request headers that imitate the Chrome browser
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36'}
server = 'https://www.liaoxuefeng.com/'
# Address of Liao Xuefeng's Python tutorial
book = 'https://www.liaoxuefeng.com/wiki/1016959663602400'
# Where to store the downloaded chapters
save_path = 'D:/books/python'
os.makedirs(save_path, exist_ok=True)


# Fetch the text of one chapter
def get_contents(chapter):
    req = requests.get(url=chapter, headers=headers)
    html_doc = str(req.content, 'utf8')
    bf = BeautifulSoup(html_doc, 'html.parser')
    texts = bf.find_all(class_="x-wiki-content")
    # Take the text of the first element with class x-wiki-content;
    # a run of four \xa0 (non-breaking spaces) is turned into a newline
    content = texts[0].text.replace('\xa0' * 4, '\n')
    return content


# Append the content to a file
def write_txt(chapter, content, code):
    with codecs.open(chapter, 'a', encoding=code) as f:
        f.write(content)


# Main entry point
def main():
    res = requests.get(book, headers=headers)
    html_doc = str(res.content, 'utf8')
    # Parse the HTML
    soup = BeautifulSoup(html_doc, 'html.parser')
    # Collect the links to all chapters
    a = soup.find('div', id='1016959663602400').find_all('a')
    print('Total number of articles: %d' % len(a))
    for each in a:
        try:
            # urljoin builds a clean URL even when href already starts with "/"
            chapter = urljoin(server, each.get('href'))
            content = get_contents(chapter)
            chapter = save_path + "/" + each.string.replace("?", "") + ".txt"
            write_txt(chapter, content, 'utf8')
        except Exception as e:
            print(e)


if __name__ == '__main__':
    main()

After running the program, we can see the tutorial content we scraped under D:/books/python!
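The script above only strips "?" from chapter titles before using them as file names, but Windows rejects several other characters as well. A slightly more defensive helper might look like this; `safe_filename` is a hypothetical name, and the "untitled" fallback is an arbitrary choice for the sketch.

```python
import re

# Characters that Windows does not allow in file names.
_FORBIDDEN = re.compile(r'[\\/:*?"<>|]')


def safe_filename(title, replacement=""):
    """Replace characters that are illegal in Windows file names."""
    cleaned = _FORBIDDEN.sub(replacement, title).strip()
    return cleaned or "untitled"  # fall back if nothing usable is left


print(safe_filename("What is Python?"))
print(safe_filename("I/O: read and write"))
```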

And with that we have a simple crawler. But crawl with care~!

In this article we got to know how Python is used from two angles, automated testing and crawlers. I hope it sparks your interest~

Don't just talk, don't be lazy, and become an engineer who talks architecture with a bit of swagger together with Small Dish~ Follow along and keep me company, so Small Dish is no longer alone. See you next time!

Read it without giving a like? You rascal!

Work a little harder today, and tomorrow you will have one less favor to ask!

I'm Small Dish, a man growing stronger together with you.
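One concrete way to be careful is to check a site's robots.txt before fetching a page. The sketch below uses the standard library's urllib.robotparser on an inline example; the rules, the "my-crawler" user agent and the example.com URLs are all made up for illustration, and a real crawler would load the site's actual robots.txt instead.

```python
from urllib.robotparser import RobotFileParser

# Example rules (made up): everything is allowed except /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_text, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed(ROBOTS_TXT, "my-crawler", "https://example.com/wiki/1"))
print(allowed(ROBOTS_TXT, "my-crawler", "https://example.com/private/data"))
```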


copyright notice
author[Vegetable farmer said],Please bring the original link to reprint, thank you.
https://en.pythonmana.com/2022/02/202202010423061337.html
