
Using Python to implement multitasking with processes

2022-01-29 17:28:02 Jacko's it journey

Source: official account 【Jay's IT The journey】

Author: Alaska

ID: Jake_Internet


One 、 Process introduction

Process: a program that is being executed, that is, one execution of a program. It consists of the program, its data, and a process control block, and it is the basic unit of resource allocation.

Program: code that is not being executed; it is static.

Two 、 Comparison between threads and processes

de08e3f34da587beaec41dbeb812ed3d.png

The picture shows that the computer is currently running 9 application processes, and that one process can correspond to multiple threads. We can conclude that:

Process: can carry out multiple tasks at once, for example running several instances of QQ on one computer at the same time.

Thread: can also carry out multiple tasks at once, for example several chat windows inside one QQ instance.

Fundamental difference: a process is the basic unit of operating system resource allocation, while a thread is the basic unit of task scheduling and execution.

Advantages of using multiple processes:

1、 Each process has its own GIL:

Because of the GIL (Global Interpreter Lock) in CPython, multithreading in Python cannot take full advantage of multiple cores: among the threads of one process, only one can execute Python code at any given moment. With multiple processes, every process has its own GIL, so on a multi-core processor the processes are not held back by each other's GIL. Multiple processes can therefore make better use of multiple cores.

2、 High efficiency

Of course, for I/O-intensive tasks such as web crawlers, multithreading and multiprocessing perform about the same. For compute-intensive tasks, however, Python multiprocessing can be several times more efficient on a multi-core machine than multithreading.
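As a rough illustration of the compute-intensive case, the sketch below spreads a CPU-bound function over several worker processes with multiprocessing.Pool (covered later in this article) and returns the same answers a plain loop would. The names count_primes and count_in_parallel and the input sizes are made up for this example; the actual speedup depends on the number of cores and is not measured here.

```python
import multiprocessing


def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        # all() over an empty range is True, so 2 and 3 count as prime
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_in_parallel(limits, workers=2):
    # Each worker is a separate process with its own interpreter and GIL
    with multiprocessing.Pool(workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == '__main__':
    print(count_in_parallel([10, 100, 1000]))
```

Because every call to count_primes runs in its own worker process, the calls can occupy different cores at the same time, which threads inside a single CPython process cannot do for pure Python code.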

bae15838ac6e2d0681363eafc7d21dc5.png

Three 、 Implementing multiple processes in Python

Let's get a feel for it with some examples:

3.1 Using the Process class

import multiprocessing


def process(index):
    # Runs in the child process
    print(f'Process: {index}')


if __name__ == '__main__':
    for i in range(5):
        # (i,) is a one-element tuple; note the trailing comma
        p = multiprocessing.Process(target=process, args=(i,))
        p.start()

This is the most basic way to implement multiple processes: create a new child process by constructing a Process object. The target argument is the function to run, and args holds that function's arguments as a tuple whose elements correspond one-to-one to the parameters of the called function process.

Note: args is normally passed as a tuple. If there is only one argument, add a comma after it, as in (i,). Without the comma, (i) is just the value i in parentheses and not a tuple, which causes problems when the arguments are passed. After creating the process, we call its start method to launch it.
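The effect of the trailing comma can be checked directly in the interpreter. The helper name describe below is made up for this illustration.

```python
def describe(args):
    """Return the type name of whatever was passed as args."""
    return type(args).__name__


with_comma = (1,)     # a one-element tuple
without_comma = (1)   # just the integer 1 in parentheses

print(describe(with_comma))     # tuple
print(describe(without_comma))  # int
```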

The output is as follows:

Process: 0 
Process: 1 
Process: 2
Process: 3 
Process: 4

As you can see, we ran 5 child processes, and each one called the process function. The index parameter of process is passed in through the args of Process, taking the 5 values 0 to 4, and is then printed; at that point all 5 child processes have finished. Because the child processes run concurrently, the order of the lines may differ from run to run.
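The example above starts the child processes but never waits for them. If the parent needs to know that all children have finished, one common pattern is to keep the Process objects and call join on each of them. A minimal sketch, in which the helper name run_all is made up for this illustration:

```python
import multiprocessing


def process(index):
    print(f'Process: {index}')


def run_all(count):
    """Start count child processes, wait for them, and return their exit codes."""
    children = [multiprocessing.Process(target=process, args=(i,))
                for i in range(count)]
    for p in children:
        p.start()
    for p in children:
        p.join()  # block until this child has exited
    return [p.exitcode for p in children]


if __name__ == '__main__':
    print(run_all(5))
```

An exit code of 0 means the child ended normally, so a list of zeros indicates that every child completed without an unhandled exception.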

3.2 Inheriting from the Process class

from multiprocessing import Process
import time


class MyProcess(Process):
    def __init__(self, loop):
        super().__init__()
        self.loop = loop  # number of iterations, stored on the instance

    def run(self):
        # Executed in the child process once start() is called
        for count in range(self.loop):
            time.sleep(1)
            print(f'Pid:{self.pid} LoopCount: {count}')


if __name__ == '__main__':
    for i in range(2, 5):
        p = MyProcess(i)
        p.start()

We first define a constructor that receives a loop parameter, representing the number of iterations, and stores it as an instance attribute. In the run method, we use this loop attribute to iterate loop times, printing the current process ID and the iteration count each time.

In the calling code, we use range to get the three numbers 2, 3, and 4, create a MyProcess with each of them, and then call start to launch the process.

Note: the execution logic of the process must be implemented in the run method. To launch the process, call start; run is then executed in the new child process.
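To see why start has to be called rather than run, the sketch below records which process run actually executes in. PidReporter, pid_via_start and pid_via_run are hypothetical names introduced only for this illustration.

```python
import os
from multiprocessing import Process, Queue


class PidReporter(Process):
    """Hypothetical helper: puts the PID it runs under onto a queue."""

    def __init__(self, results):
        super().__init__()
        self.results = results

    def run(self):
        self.results.put(os.getpid())


def pid_via_start():
    results = Queue()
    p = PidReporter(results)
    p.start()            # run() executes in a new child process
    pid = results.get()
    p.join()
    return pid


def pid_via_run():
    results = Queue()
    p = PidReporter(results)
    p.run()              # plain method call: executes in the current process
    return results.get()


if __name__ == '__main__':
    print('parent pid :', os.getpid())
    print('via start():', pid_via_start())
    print('via run()  :', pid_via_run())
```

Calling run directly reports the parent's own PID, because no new process is created; only start produces a different PID.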

The output is as follows:

Pid:12976 LoopCount: 0
Pid:15012 LoopCount: 0
Pid:11976 LoopCount: 0
Pid:12976 LoopCount: 1
Pid:15012 LoopCount: 1
Pid:11976 LoopCount: 1
Pid:15012 LoopCount: 2
Pid:11976 LoopCount: 2
Pid:11976 LoopCount: 3

Note that pid here is the process ID; the values will differ between machines and between runs.

Four 、 Communication between processes

4.1 Queue: a first-in, first-out (FIFO) queue

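from multiprocessing import Process, Queue

SENTINEL = None  # marker meaning "no more data"


def download(q):
    # Simulate downloading data and put each item on the queue
    lst = [11, 22, 33, 44]
    for item in lst:
        q.put(item)
    q.put(SENTINEL)  # tell the consumer that everything has been sent
    print('The data has been downloaded successfully ....')


def savedata(q):
    lst = []
    while True:
        data = q.get()  # blocks until an item is available
        # Checking q.empty() here would be unreliable: the queue can be
        # momentarily empty while download() is still producing
        if data is SENTINEL:
            break
        lst.append(data)
    print(lst)


def main():
    q = Queue()

    t1 = Process(target=download, args=(q,))
    t2 = Process(target=savedata, args=(q,))

    t1.start()
    t2.start()
    t1.join()
    t2.join()


if __name__ == '__main__':
    main()

Running results :

The data has been downloaded successfully ....
[11, 22, 33, 44]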

4.2 Shared global variables are not suitable for multiprocess programming

import multiprocessing

a = 1


def demo1():
    global a
    a += 1


def demo2():
    print(a)

def main():
    t1 = multiprocessing.Process(target=demo1)
    t2 = multiprocessing.Process(target=demo2)

    t1.start()
    t2.start()

if __name__ == '__main__':
    main()

Running results :

1

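The result shows that global variables are not shared between processes: each child process works on its own copy of a, so the increment made in demo1 is not visible to demo2.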
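If a value really does need to be visible to several processes, the multiprocessing module offers shared objects such as Value. The following is a minimal sketch of the same increment done through a shared integer. The function names increment and run_demo are made up for this illustration, and this is only one possible approach (the Queue from section 4.1 is another).

```python
import multiprocessing


def increment(counter):
    # get_lock() guards the read-modify-write against other processes
    with counter.get_lock():
        counter.value += 1


def run_demo():
    counter = multiprocessing.Value('i', 1)  # shared C int, initial value 1
    p = multiprocessing.Process(target=increment, args=(counter,))
    p.start()
    p.join()
    return counter.value  # read in the parent after the child has finished


if __name__ == '__main__':
    print(run_demo())  # 2
```

Unlike the plain global a, the Value object lives in shared memory, so the change made in the child is visible to the parent.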

Five 、 Communication between process pools

5.1 Process pool introduction

When only a few child processes are needed, you can create them directly with Process from multiprocessing. But if there are hundreds or even thousands of targets, creating the processes by hand is a huge amount of work. In that case you can use the Pool class provided by the multiprocessing module.

from multiprocessing import Pool
import os,time,random

def worker(a):
    t_start = time.time()
    print('%s Start execution , The process number is %d'%(a,os.getpid()))

    time.sleep(random.random()*2)
    t_stop = time.time()
    print(a," Execution completed , Time consuming %0.2f"%(t_stop-t_start))


if __name__ == '__main__':
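    po = Pool(3)        # create a process pool with at most 3 worker processes
    for i in range(0,10):
        po.apply_async(worker,(i,))    # submit a worker task to the pool without blocking

    print("--start--")
    po.close()      # stop accepting new tasks

    po.join()       # wait for all tasks to finish; must be called after close()
    print("--end--")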

Running results :

--start--
0 Start execution , The process number is 6664
1 Start execution , The process number is 4772
2 Start execution , The process number is 13256
0  Execution completed , Time consuming 0.18
3 Start execution , The process number is 6664
2  Execution completed , Time consuming 0.16
4 Start execution , The process number is 13256
1  Execution completed , Time consuming 0.67
5 Start execution , The process number is 4772
4  Execution completed , Time consuming 0.87
6 Start execution , The process number is 13256
3  Execution completed , Time consuming 1.59
7 Start execution , The process number is 6664
5  Execution completed , Time consuming 1.15
8 Start execution , The process number is 4772
7  Execution completed , Time consuming 0.40
9 Start execution , The process number is 6664
6  Execution completed , Time consuming 1.80
8  Execution completed , Time consuming 1.49
9  Execution completed , Time consuming 1.36
--end--

This process pool holds at most 3 worker processes. A queued task can start only after one of the workers has finished its current task, so the same three processes are reused over and over, as the repeating process numbers in the output show.
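The worker above only prints, so the value returned by apply_async is ignored. When a task returns something, apply_async hands back an AsyncResult object whose get method retrieves it. A minimal sketch, with square and squares_with_pool as made-up names for this illustration:

```python
from multiprocessing import Pool


def square(n):
    return n * n


def squares_with_pool(numbers, workers=3):
    with Pool(workers) as pool:
        # apply_async returns immediately with an AsyncResult handle
        handles = [pool.apply_async(square, (n,)) for n in numbers]
        # get() blocks until the task is done and re-raises any
        # exception that occurred in the worker
        return [h.get() for h in handles]


if __name__ == '__main__':
    print(squares_with_pool(range(10)))
```

Calling get is also a simple way to notice failures: without it, an exception raised inside a pooled task goes unreported.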

bea002a93fa0104c1be090459601dfdb.png

Six 、 Case study: batch copying of files

Approach:

  • Get the name of the folder to copy
  • Create a new folder
  • Get all the file names to be copied in the folder
  • Create a process pool
  • Add tasks to the process pool

The code is as follows :

Import the modules

import multiprocessing
import os
import time

Custom file copy function

def copy_file(Q, oldfolderName, newfolderName, file_name):
    # Copy one file; nothing needs to be returned
    time.sleep(0.5)
    # print('\rCopying from folder %s to folder %s: file %s' % (oldfolderName, newfolderName, file_name), end='')

    # Read the file to be copied
    with open(os.path.join(oldfolderName, file_name), 'rb') as old_file:
        content = old_file.read()

    # Write out the new copy
    with open(os.path.join(newfolderName, file_name), 'wb') as new_file:
        new_file.write(content)

    Q.put(file_name)  # report the finished file name on the queue

Define the main function

def main():
    # Step 1: get the name of the folder to copy (it can be created manually
    # or through code; here it was created manually)
    oldfolderName = input('Please enter the name of the folder to copy: ')
    newfolderName = oldfolderName + '_copy'
    # Step 2: create the new folder
    if not os.path.exists(newfolderName):
        os.mkdir(newfolderName)

    # Step 3: get the names of all files to copy in the folder
    filenames = os.listdir(oldfolderName)
    # print(filenames)

    pool = multiprocessing.Pool(5)  # Step 4: create a process pool

    Q = multiprocessing.Manager().Queue()  # queue used by workers to report progress
    for file_name in filenames:
        # Step 5: add tasks to the process pool
        pool.apply_async(copy_file, args=(Q, oldfolderName, newfolderName, file_name))
    pool.close()  # no more tasks will be submitted

    copy_file_num = 0
    file_count = len(filenames)
    # We don't know in advance when copying finishes, so keep reading
    # from the queue until every file has been reported
    while copy_file_num < file_count:
        file_name = Q.get()
        copy_file_num += 1
        time.sleep(0.2)
        # Show the copy progress on a single line
        print('\rCopy progress %.2f %%' % (copy_file_num * 100 / file_count), end='')

    pool.join()  # wait for the worker processes to exit

Run the program

if __name__ == '__main__':
    main()

The results are shown in the following figure :

c6dc5eff8e51b7328cfce372e4b842b2.png

Comparison of the directory structure before and after running

Before running

80332fa3cb283c8ce9ae4293750129f3.png

After running

1326f4e6c8ec5851ae31f760258a28fb.png

That is the overall result. Because the test folder only contains some test files pasted in at random, their contents are not shown here.
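For comparison, the same idea can be written more compactly. The sketch below is not the code used above: it swaps in shutil.copyfile for the manual read and write, uses Pool.starmap instead of apply_async with a Manager queue, skips subdirectories, and has no progress display. The names copy_one and copy_folder and the throwaway demo directory are assumptions made for this illustration.

```python
import multiprocessing
import os
import shutil
import tempfile


def copy_one(src_dir, dst_dir, file_name):
    """Copy a single file and return its name."""
    shutil.copyfile(os.path.join(src_dir, file_name),
                    os.path.join(dst_dir, file_name))
    return file_name


def copy_folder(src_dir, dst_dir, workers=5):
    """Copy every regular file in src_dir into dst_dir using a process pool."""
    os.makedirs(dst_dir, exist_ok=True)
    names = [n for n in os.listdir(src_dir)
             if os.path.isfile(os.path.join(src_dir, n))]
    jobs = [(src_dir, dst_dir, n) for n in names]
    with multiprocessing.Pool(workers) as pool:
        # starmap unpacks each tuple into the arguments of copy_one
        return sorted(pool.starmap(copy_one, jobs))


if __name__ == '__main__':
    # Demo on a temporary directory so that no real files are touched
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, 'test')
        os.mkdir(src)
        for i in range(3):
            with open(os.path.join(src, f'file{i}.txt'), 'w') as f:
                f.write(f'content {i}')
        print(copy_folder(src, src + '_copy'))
```

Because starmap blocks until every task is done and re-raises worker exceptions, a file that cannot be copied causes a visible error instead of leaving the program waiting on the queue.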

The end of this paper .



copyright notice
author[Jacko's it journey],Please bring the original link to reprint, thank you.
https://en.pythonmana.com/2022/01/202201291728009817.html