
How fast Python sync and async execute

2022-01-31 02:52:00 Chen Siyu


Preface

Newer versions of Python support the async/await syntax. Many articles claim that code written with this syntax becomes very fast, but that speed-up depends on the scenario. This article tries to explain why async code is faster than sync code in some scenarios.

For the latest revision, see the original post. Follow the official account <Bo Hai picks up shellfish diary> to receive new posts promptly.

1. A simple example

First, look at the difference between the two calling styles through an example. To make the difference in running time clearly visible, each version is called 10000 times. The code is as follows:

import asyncio
import time


n_call = 10000


# Time the sync calls
def demo(n: int) -> int:
    return n ** n

s_time = time.time()
for i in range(n_call):
    demo(i)
print(time.time() - s_time)

# Time the async calls
async def sub_demo(n: int) -> int:
    return n ** n

async def async_main() -> None: 
    for i in range(n_call):
        await sub_demo(i)

loop = asyncio.get_event_loop()
s_time = time.time()
loop.run_until_complete(async_main())
print(time.time() - s_time)

# Output
# 5.310615682601929
# 5.614157438278198

As you can see, the sync syntax is the familiar one, while the async syntax is quite different: functions must be defined with async def, and calling an async def function requires await. To run it, you first get the thread's event loop and then run async_main through that loop to achieve the same effect. The output shows, however, that the sync version is slightly faster than the async version. (Because of Python's GIL, multi-core performance cannot be used here; the code effectively runs on a single core.)

The reason is that when both versions are executed by the same thread (on a single CPU core), each async call also has to pass through some extra calls in the event loop. This adds a small overhead, so the async version takes longer than the sync version. This is also a purely CPU-bound example, whereas the strength of async is network I/O. It has no advantage in this scenario, but it shines in high-concurrency scenarios, because async code runs as coroutines while concurrent sync code relies on threads.

NOTE: At present the async syntax mainly supports network I/O. Asynchronous I/O for the file system is not yet well supported, so asynchronous file reads and writes are handled by wrapping multiple threads rather than by coroutines. For details, see: github.com/python/asyn…
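The thread-based wrapping mentioned in the note can be sketched roughly as follows. This is a minimal illustration, not how any particular library does it: read_file_blocking, read_file_async, and the temporary demo.txt file are hypothetical names invented for the example. It relies on loop.run_in_executor, which hands a blocking function to a thread pool and lets the coroutine await the result.

```python
import asyncio
import os
import tempfile


def read_file_blocking(path: str) -> str:
    # Ordinary blocking file read; this runs inside a worker thread.
    with open(path, encoding="utf-8") as f:
        return f.read()


async def read_file_async(path: str) -> str:
    # Hypothetical helper: passing None selects the loop's default
    # thread pool, so the event loop stays free while the file is read.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, read_file_blocking, path)


async def main() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")  # hypothetical demo file
        with open(path, "w", encoding="utf-8") as f:
            f.write("hello from a worker thread")
        return await read_file_async(path)


content = asyncio.run(main())
print(content)
```

The coroutine looks asynchronous to its caller, but the actual read still happens in a thread, which is the point the note makes.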

2. An I/O example

To understand the advantage of async in I/O scenarios, assume the following I/O scenario: a web backend service usually has to handle many requests, all sent by different clients, as shown in the figure.

(Figure: I/O scenario)

In this scenario, clients send many requests within a short period. To process a large number of requests quickly and avoid processing delays, the server supports concurrency or parallelism in some form.

NOTE: Concurrency, in operating-system terms, means that several programs are between start and completion during the same period and all run on the same processor, but only one of them is running on the processor at any given moment. Parallelism means that a computer system executes two or more processes at the same time.

With the sync syntax, this web backend can be implemented with processes, threads, or a combination of both. The concurrency/parallelism it offers is limited by the number of workers. For example, when 5 clients send requests at the same time and the server has only 4 workers, one request blocks and waits until one of the 4 running workers finishes. To provide better service, we usually configure enough workers. Since processes are well isolated but each process occupies its own resources, services are usually deployed as a small number of processes plus a large number of threads.
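The 5-requests-on-4-workers situation can be reproduced with a small sketch. It assumes a thread pool stands in for the server's workers and that time.sleep stands in for the I/O wait of a request; handle_request, N_WORKERS, N_REQUESTS, and IO_DELAY are hypothetical names chosen for the illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

N_WORKERS = 4    # workers available on the "server"
N_REQUESTS = 5   # simultaneous client requests
IO_DELAY = 0.1   # simulated I/O wait per request, in seconds


def handle_request(request_id: int) -> str:
    # Simulate a blocking I/O call, then report which worker handled it.
    time.sleep(IO_DELAY)
    return threading.current_thread().name


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
    worker_names = list(pool.map(handle_request, range(N_REQUESTS)))
elapsed = time.perf_counter() - start

print(f"distinct workers used: {len(set(worker_names))}")
print(f"elapsed: {elapsed:.2f}s (a single request takes about {IO_DELAY}s)")
```

Because the fifth request has to wait for a free worker, the total time is roughly two rounds of IO_DELAY rather than one.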

NOTE: A process is the smallest unit of resource allocation. Too many processes consume a lot of system resources, so backend services generally do not start many processes. A thread is the smallest unit of scheduling, so the scheduling discussion below is described in terms of threads.

However, this approach consumes a lot of system resources (compared with coroutines). Threads are executed by the CPU, and CPU capacity is limited: only a fixed number of workers can run at the same time, and the other threads have to wait to be scheduled. Each thread can therefore work for only one time slice, after which the scheduler moves it to the blocked or ready state to make way for other threads, and it cannot continue until it gets its next time slice. To simulate many threads running at once and to keep other threads from starving, each time slice is very short and switching between threads is very frequent. The more processes and threads are started, the more frequent scheduling becomes.

The cost of the scheduling decision itself is not large, however. The larger overhead comes from the context switching and contention that thread scheduling causes (for details, see the material on process scheduling in "Introduction to Computers"; only a brief explanation is given here). When the CPU executes code, it has to load data into its cache before running. When the running thread's time slice ends, the thread's latest state is saved, and the CPU then loads the state of the next scheduled thread and runs it. Saving and restoring this state (registers and related context) is fairly quick in itself, but the incoming thread's working data is often no longer in the CPU cache and must be fetched again from memory, which is much slower than the cache. So every switch between threads costs some time to reload data, plus contention among threads for the cache.

Compared with the context switching and preemption caused by thread scheduling, the concurrency implemented by the async syntax is non-preemptive. Coroutine scheduling is controlled by a loop, which acts as a very efficient task manager and scheduler. Because the scheduling is just ordinary code logic running in one thread, the CPU does not have to switch threads, so there is no thread context-switch overhead and no contention when loading the cache.

Take the figure above as an example. When the service starts, it first starts an event loop. When a request arrives, the loop creates a task to handle the request sent by the client. The task obtains the right to execute from the event loop and has the whole thread to itself until it reaches an external event it has to wait for, such as waiting for the database to return data. At that point the task tells the event loop which event it is waiting for and hands back the right to execute, and the event loop passes it to the task that most needs to run. When the task that gave up execution later receives the database response, the event loop places it at the front of the ready list (event loop implementations may differ) and returns the right to execute to it at the next switch, so it continues until it meets the next event it has to wait for.
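The request flow described above can be sketched in a few lines. This is an illustration under stated assumptions rather than a real server: fake_db_query, handle_request, serve, and the constants are hypothetical, and asyncio.sleep stands in for waiting on a database response.

```python
import asyncio
import time

N_REQUESTS = 100   # simulated client requests arriving at once
DB_DELAY = 0.1     # simulated database latency, in seconds


async def fake_db_query(request_id: int) -> int:
    # The task gives up execution here; the event loop runs other
    # tasks until this wait completes.
    await asyncio.sleep(DB_DELAY)
    return request_id * 2


async def handle_request(request_id: int) -> int:
    # One task per request, as in the description above.
    return await fake_db_query(request_id)


async def serve() -> list:
    tasks = [asyncio.create_task(handle_request(i)) for i in range(N_REQUESTS)]
    return await asyncio.gather(*tasks)


start = time.perf_counter()
responses = asyncio.run(serve())
elapsed = time.perf_counter() - start

print(f"{len(responses)} requests handled in {elapsed:.2f}s on one thread")
```

All the waits overlap inside a single thread, so the total time stays close to one DB_DELAY instead of N_REQUESTS times DB_DELAY.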

This way of switching between coroutines is called cooperative multitasking. Because it all runs within a single process or a single thread, the context does not need to change when switching coroutines and the CPU does not need to reload its cache, which saves some overhead. As described above, a coroutine switch happens only when the coroutine itself voluntarily gives up execution. Threads, by contrast, are preemptive: even when a thread has not hit an I/O event, it can be moved from the running state to the ready state until it is scheduled again, which adds a lot of scheduling overhead. A coroutine keeps running and does not switch until it reaches a yield point, so coroutines are scheduled far less often than threads. It also follows that the points at which a coroutine can be scheduled are chosen by the developer (for example, waiting for the database to return data, as mentioned above), and that scheduling is non-preemptive: while one coroutine is running, no other coroutine can run, and they must wait until the running coroutine gives up execution. Developers must therefore make sure that a task does not hold the CPU for too long, or the other tasks will starve.
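A small sketch makes the starvation risk concrete. The names cooperative, hog, and run_pair are hypothetical; await asyncio.sleep(0) is used here only as an explicit yield point, and the hog coroutine simply never awaits.

```python
import asyncio


async def cooperative(name: str, log: list, steps: int = 3) -> None:
    for step in range(steps):
        log.append(f"{name}:{step}")
        await asyncio.sleep(0)  # yield point: hand execution back to the loop


async def hog(name: str, log: list, steps: int = 3) -> None:
    for step in range(steps):
        log.append(f"{name}:{step}")  # no await, so it never gives up execution


async def run_pair(first, second) -> list:
    log = []
    await asyncio.gather(first("first", log), second("second", log))
    return log


fair_log = asyncio.run(run_pair(cooperative, cooperative))
starved_log = asyncio.run(run_pair(hog, cooperative))

print(fair_log)     # steps of the two coroutines interleave
print(starved_log)  # "second" only starts after "first" has finished
```

In a real service the hog would be a long CPU-bound computation or a blocking call inside a coroutine, and every other request on that event loop would wait behind it.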

3. Summary

In I/O scenarios, the cost of I/O is much higher than the cost of the CPU executing code logic. Looked at another way, while I/O is in progress the code logic has to wait and the CPU is idle, so coroutines and threads are both ways of multiplexing the CPU to squeeze more out of it. Assuming the sync and async versions execute the same code logic, comparing their execution speed comes down to comparing the overhead of coroutines with that of multiple processes/threads, that is, the cost of event-loop scheduling versus the cost of process/thread scheduling. The cost of event-loop scheduling is roughly constant (or changes little), while the cost of multiple processes/threads grows as the number of workers grows. Once concurrency is high enough, the cost of multiple processes/threads exceeds that of coroutine switching, and at that point the async version runs faster than the sync version. So in ordinary scenarios the sync syntax executes faster than the async syntax, but in scenarios where I/O time outweighs CPU time and concurrency is high, the async syntax executes faster than the sync syntax.
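To try this comparison on your own machine, a rough harness like the following may help. It is only a sketch under simplifying assumptions: one thread per request on the sync side, one coroutine per request on the async side, and a sleep standing in for I/O. All names and constants are hypothetical, and the measured gap depends heavily on the platform and on N_REQUESTS, so no particular result is claimed here.

```python
import asyncio
import threading
import time

N_REQUESTS = 100   # hypothetical concurrency level; raise it to see the trend
IO_DELAY = 0.05    # simulated I/O wait per request, in seconds


def sync_request(results: list, i: int) -> None:
    time.sleep(IO_DELAY)  # blocking wait occupies a whole thread
    results[i] = i


def run_threads():
    results = [None] * N_REQUESTS
    threads = [
        threading.Thread(target=sync_request, args=(results, i))
        for i in range(N_REQUESTS)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, results


async def async_request(i: int) -> int:
    await asyncio.sleep(IO_DELAY)  # non-blocking wait inside one thread
    return i


async def gather_requests() -> list:
    return await asyncio.gather(*(async_request(i) for i in range(N_REQUESTS)))


def run_coroutines():
    start = time.perf_counter()
    results = asyncio.run(gather_requests())
    return time.perf_counter() - start, results


thread_time, thread_results = run_threads()
coro_time, coro_results = run_coroutines()
print(f"threads:    {thread_time:.3f}s")
print(f"coroutines: {coro_time:.3f}s")
```

With a small N_REQUESTS the two timings are likely to be close; the argument above predicts that the thread version falls behind as the number of concurrent requests grows.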

Copyright notice
Author: Chen Siyu. Please include the original link when reprinting. Thank you.
https://en.pythonmana.com/2022/01/202201310251548172.html
