[Python crawler] multithreaded daemon & join() blocking
2022-01-30 22:31:01 【Dream, killer】
Series
Are Python crawlers slow? Learn about concurrent programming
Daemon threads
In Python multithreading, when the main thread's code finishes while child threads are still running, the process waits for those child threads to finish before it exits. This creates a problem: if a child thread runs an infinite (or very long) loop, the whole Python program cannot end. Let's take a look.
import threading
import time


# Non-daemon thread: prints once per second for a long time
def normal_thread():
    for i in range(10000):
        time.sleep(1)
        print(f'normal thread {i}')


print(threading.current_thread().name, 'Thread start')
thread1 = threading.Thread(target=normal_thread)
thread1.start()
print(threading.current_thread().name, 'Thread end')
Running this shows that although the main thread (MainThread) has finished, the child thread keeps running, and the process only really ends once the child thread has finished. If you want unfinished threads to be terminated when the main thread ends, set them as daemon threads: if only daemon threads are still running when the main program ends, the Python program exits normally. The threading module provides two ways to mark a thread as a daemon thread, and both must be applied before start() is called:
threading.Thread(target=daemon_thread, daemon=True)
thread.setDaemon(True)
Note that setDaemon() has been deprecated since Python 3.10; the equivalent attribute assignment thread.daemon = True is preferred in new code.
import threading
import time


# Daemon thread (sleeps 1 s before each print)
def daemon_thread():
    for i in range(5):
        time.sleep(1)
        print(f'daemon thread {i}')


# Non-daemon thread (no sleep)
def normal_thread():
    for i in range(5):
        print(f'normal thread {i}')


print(threading.current_thread().name, 'Thread start')
thread1 = threading.Thread(target=daemon_thread, daemon=True)
thread2 = threading.Thread(target=normal_thread)
# thread1.setDaemon(True)  # alternative to daemon=True; must come before start()
thread1.start()
thread2.start()
print(threading.current_thread().name, 'Thread end')
Here thread1 is set as a daemon thread. The program exits as soon as the non-daemon thread and the main thread (MainThread) have finished, so the print statements in daemon_thread() never get a chance to run. In the output, lines from normal_thread() can still appear after "MainThread Thread end": normal_thread() runs in a non-daemon thread, so the interpreter waits for it to finish after the main thread's code ends, and only then exits and abandons the daemon thread. The exact interleaving of the two depends on thread scheduling.
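The rule that the daemon flag has to be set before the thread starts can be checked with a minimal sketch like the one below. The names worker and error_message are illustrative and not from the original article; the attribute form t.daemon = True is used instead of setDaemon().

```python
import threading
import time


def worker():
    # Short sleep so the thread is still alive when we try to change the flag.
    time.sleep(0.1)


t = threading.Thread(target=worker)
t.daemon = True  # attribute form; allowed because the thread has not started
t.start()

try:
    t.daemon = False  # too late: the thread has already been started
except RuntimeError as exc:
    error_message = str(exc)
else:
    error_message = None

t.join()
print('daemon flag:', t.daemon)
print('error when changing it after start():', error_message)
```

The flag stays True, and the attempt to change it after start() is rejected with a RuntimeError.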
Inheritance of daemon threads
A new thread inherits the daemon attribute of the thread that creates it. The main thread is a non-daemon thread, so threads created in the main thread are non-daemon by default. A thread created inside a daemon thread inherits daemon=True, so it is a daemon thread as well.
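The inheritance rule can be observed directly by reading the daemon attribute of freshly created Thread objects. This is a small sketch that assumes it is run as a normal script (so the creating thread at top level is the non-daemon main thread); noop, parent, and the inherited dictionary are illustrative names.

```python
import threading

inherited = {}


def noop():
    pass


def parent():
    # This Thread object is created inside a daemon thread,
    # so its daemon flag defaults to True.
    child = threading.Thread(target=noop)
    inherited['child_of_daemon'] = child.daemon
    child.start()
    child.join()


# Created at top level by the non-daemon main thread: the flag defaults to False.
child_of_main = threading.Thread(target=noop)
inherited['child_of_main'] = child_of_main.daemon

daemon_parent = threading.Thread(target=parent, daemon=True)
daemon_parent.start()
daemon_parent.join()

print(inherited)
```

Run as a script, this should report False for child_of_main and True for child_of_daemon.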
join() blocking
In a multithreaded crawler, several threads usually crawl different pages at the same time, and the results are then analyzed, aggregated, and stored together. That later processing has to wait until all child threads have finished, and this is where the join() method comes in.
Calling join() on a thread blocks (suspends) the calling thread until the thread being joined has finished; the caller then resumes. Any thread that the caller has not started yet is delayed as a side effect, simply because the caller cannot reach its start() call while blocked. Look at an example.
import threading
import time


def block(second):
    print(threading.current_thread().name, 'The thread is running')
    time.sleep(second)
    print(threading.current_thread().name, 'Thread end')


print(threading.current_thread().name, 'The thread is running')
thread1 = threading.Thread(target=block, name='thread test 1', args=[3])
thread2 = threading.Thread(target=block, name='thread test 2', args=[1])
thread1.start()
thread1.join()  # the main thread blocks here until thread1 finishes
thread2.start()
print(threading.current_thread().name, 'Thread end')
Only thread1 is joined here. Pay attention to where join() is called: before thread2.start(). The main thread is therefore suspended at thread1.join(), and thread2 has not even been started yet. Only after thread1 finishes does the main thread continue and start thread2. Because thread2 is not a daemon thread, it keeps running after the main thread (MainThread) has finished.
Do you see the problem? If you follow the execution flow of the code above, the whole program has effectively become single-threaded, and the cause is that join() is called in the wrong place. Let's change the code slightly.
import threading
import time


def block(second):
    print(threading.current_thread().name, 'The thread is running')
    time.sleep(second)
    print(threading.current_thread().name, 'Thread end')


print(threading.current_thread().name, 'The thread is running')
thread1 = threading.Thread(target=block, name='thread test 1', args=[3])
thread2 = threading.Thread(target=block, name='thread test 2', args=[1])
thread1.start()
thread2.start()
thread1.join()  # both threads are already running; only the main thread waits
print(threading.current_thread().name, 'Thread end')
Now the program is truly multithreaded. When join() is called this time, only the main thread is suspended, while thread1 and thread2 run concurrently. Once thread1 has finished, the main thread continues.
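The difference between the two placements of join() can also be measured. The sketch below is not from the original article: it uses shorter sleeps (0.2 s each) and a hypothetical helper run() that times both orderings, so that the serialized version takes roughly the sum of the sleeps and the concurrent version roughly the longest one.

```python
import threading
import time


def block(seconds):
    time.sleep(seconds)


def run(join_before_second_start):
    """Time two 0.2 s threads with join() placed before or after the second start()."""
    t1 = threading.Thread(target=block, args=(0.2,))
    t2 = threading.Thread(target=block, args=(0.2,))
    began = time.perf_counter()
    t1.start()
    if join_before_second_start:
        t1.join()  # the caller waits here, so t2 cannot start until t1 is done
    t2.start()
    t1.join()  # joining an already finished thread returns immediately
    t2.join()
    return time.perf_counter() - began


sequential = run(join_before_second_start=True)
concurrent = run(join_before_second_start=False)
print(f'join() before second start(): {sequential:.2f}s')
print(f'join() after both start():    {concurrent:.2f}s')
```

Exact timings vary by machine, but the first figure should be close to twice the second.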
Finally, note that join() blocks the caller regardless of which thread it is called on: it makes no difference whether the joined thread is a daemon thread, or whether the caller is the main thread. To get real multithreading, start all the child threads first and only then call join(); otherwise the program degrades to single-threaded execution!
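For the crawler scenario, the "start everything, then join everything" pattern might look like the sketch below. fetch_page is a hypothetical stand-in that only sleeps instead of making a real HTTP request, and the results dictionary and lock are assumptions made for illustration.

```python
import threading
import time


def fetch_page(page, results, lock):
    # Hypothetical stand-in for a real request: sleep instead of hitting the network.
    time.sleep(0.05)
    with lock:
        results[page] = f'content of page {page}'


results = {}
lock = threading.Lock()
threads = [
    threading.Thread(target=fetch_page, args=(page, results, lock))
    for page in range(1, 6)
]

# Start every thread first so they all run concurrently ...
for t in threads:
    t.start()

# ... and only then wait for each of them.
for t in threads:
    t.join()

# Every page has been fetched at this point, so unified processing can begin.
print(sorted(results))
```

Because every thread is joined before the final step, the processing code can rely on all five pages being present.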
That is all for this article. If you found it useful, please give it a like!
If you are a Python beginner or want to get started with Python, you can search WeChat for 【Python New horizons】 to exchange ideas and study together. We all started as novices: sometimes a simple problem can hold you up for a long time, and a word of advice from someone else makes it suddenly clear. I sincerely hope we can all make progress together.
copyright notice
Author: [Dream, killer]. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201302231000486.html