[Python data collection] The Selenium automated testing framework
2022-01-30 19:37:29 【liedmirror】
Preface
Many pages are rendered dynamically with JavaScript, and the data is often not returned as plain text but passed through some encryption algorithm first, so a plain HTTP request does not give us the correct data. In these cases we can use the Selenium automated testing framework to drive a real browser and achieve true "if you can see it, you can scrape it".
Principle
Selenium works by imitating browser behavior: it opens a browser and then executes the operations we have scripted in advance, which is how the data is obtained.
Versions
Selenium comes in two different versions:
- Selenium RC (Remote Control): the traditional Selenium framework.
- Selenium WebDriver: the newer automation interface, which overcomes some of the limitations of Selenium 1.
The commonly used version is Selenium WebDriver, and it is the one used in the rest of this article.
Workflow
- A driver is created and connected to the browser;
- The driver contains an HTTP server that receives HTTP requests;
- The HTTP server drives the browser to perform the steps named in each request;
- The browser returns the result of each step to the HTTP server;
- The HTTP server returns the result to the Selenium script.
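The requests in the steps above can be sketched as plain data. This is a minimal illustration, assuming the driver speaks the W3C WebDriver protocol and that chromedriver was started by hand on its default port 9515. The helper names and the session id "abc123" are hypothetical, and nothing is actually sent over the network.

```python
import json

# Assumption: chromedriver was started manually and listens on its default port.
DRIVER_URL = "http://127.0.0.1:9515"


def new_session_request():
    """Build the request that asks the driver to open a browser."""
    body = {"capabilities": {"alwaysMatch": {"browserName": "chrome"}}}
    return ("POST", f"{DRIVER_URL}/session", json.dumps(body))


def navigate_request(session_id, url):
    """Build the request that tells the browser to load a page."""
    return ("POST", f"{DRIVER_URL}/session/{session_id}/url", json.dumps({"url": url}))


def page_source_request(session_id):
    """Build the request that asks for the rendered HTML of the current page."""
    return ("GET", f"{DRIVER_URL}/session/{session_id}/source", None)


if __name__ == "__main__":
    # "abc123" is a placeholder; a real id comes back in the new-session response.
    for request in (
        new_session_request(),
        navigate_request("abc123", "https://juejin.cn/"),
        page_source_request("abc123"),
    ):
        print(request)
```

In normal use the selenium library builds and sends these requests for you; the sketch only shows what travels between the script and the driver's HTTP server.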
Installation
- Install the selenium library
Run the following command:
pip install selenium
- Install the Chrome driver
(Windows only) Download the chromedriver.exe driver program, in a version that matches your installed Chrome, then copy it into the Scripts directory of your Python installation or virtual environment.
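To confirm that the driver ended up somewhere Python can find it, a quick check along these lines may help. It is only a sketch: the function name find_driver is hypothetical, and it assumes the Scripts directory is on your PATH, which is normally the case for an activated virtual environment.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable, or None if it is not on PATH."""
    # On Windows, shutil.which also tries the usual extensions such as .exe.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver not found on PATH")
    else:
        print(f"chromedriver found at {path}")
```

Note that Selenium 4.6 and later ship with Selenium Manager, which can download a suitable driver automatically, so the manual copy may not be needed on recent versions.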
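Usage
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

url = "https://juejin.cn/"

# Run Chrome without opening a visible window
chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')

# Selenium 4 takes the options through `options`; `chrome_options=` is deprecated
driver = webdriver.Chrome(options=chrome_options)
try:
    driver.get(url)
    # page_source holds the HTML after the browser has rendered the page
    html = driver.page_source
finally:
    driver.quit()  # always close the browser, even if loading fails

soup = BeautifulSoup(html, "lxml")
# From here on, use BeautifulSoup to extract the data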
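The extraction step mentioned in the last comment could look roughly like the following. This is a sketch only: SAMPLE_HTML stands in for driver.page_source, and the markup, the class name "title", and the function name extract_titles are made up for illustration rather than taken from the real page. It uses the built-in "html.parser" so that it runs without lxml installed.

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source; the markup and class names are hypothetical.
SAMPLE_HTML = """
<div class="entry"><a class="title" href="/post/1">First post</a></div>
<div class="entry"><a class="title" href="/post/2">Second post</a></div>
"""


def extract_titles(html):
    """Return (text, href) pairs for every link with class "title"."""
    soup = BeautifulSoup(html, "html.parser")
    return [(a.get_text(strip=True), a["href"]) for a in soup.select("a.title")]


if __name__ == "__main__":
    for text, href in extract_titles(SAMPLE_HTML):
        print(text, href)
```

For a real page you would inspect the rendered HTML in the browser's developer tools and adjust the CSS selector accordingly.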
Summary
Selenium's advantage is that you can simply sleep while the page loads, and so ignore the JavaScript logic entirely. But it also has a fatal disadvantage: it is easy to detect, so its use is heavily restricted.
Copyright notice
Author: liedmirror. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201301937268573.html