
Save time and effort: clean up duplicate files on your computer automatically with 10 lines of Python

2022-01-30 05:29:43 Yunyun yyds

This article shows how to use Python to clean up duplicate files on your computer automatically, in about 10 lines of code. Each step is explained in detail, so it should be a useful reference for study or work. The task: given a folder, use Python to check it for duplicate files and delete any duplicates that are found.

The main topics covered are:

  1. Using the os module
  2. Using the glob module
  3. Comparing two files with the filecmp module


1. Step analysis

The program's logic is: traverse the given folder to collect every file in it, then use nested loops to compare the files pairwise. When two files are the same, delete the second one.

2. How to judge whether two files are the same?

Here we can use the filecmp module. Its official documentation says:

  • filecmp.cmp(f1, f2, shallow=True)
  • Compares the files named f1 and f2, returning True if they seem equal and False otherwise.
  • If shallow is true, files with identical os.stat() signatures (file type, size, and modification time) are taken to be equal without their contents being read. Otherwise, the contents of the files are compared.

So it can be used like this:

import filecmp

# Assume x and y are the paths of two identical files
print(filecmp.cmp(x, y))
# True
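Because the result is used to decide which files to delete, the default shallow=True deserves a closer look. The sketch below is not from the original article: it builds two temporary files with the same size and modification time but different contents (the make_file helper and the file names are made up for illustration) and compares them both ways.

```python
import filecmp
import os
import tempfile


def make_file(path, data, mtime):
    """Write data to path and force a fixed modification time."""
    with open(path, "wb") as fh:
        fh.write(data)
    os.utime(path, (mtime, mtime))


with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.txt")
    b = os.path.join(tmp, "b.txt")
    # Same size and same timestamp, but different contents.
    make_file(a, b"hello", 1_600_000_000)
    make_file(b, b"world", 1_600_000_000)

    # shallow=True (the default) trusts the os.stat() signature alone.
    shallow_result = filecmp.cmp(a, b)
    # shallow=False reads and compares the bytes.
    deep_result = filecmp.cmp(a, b, shallow=False)

print(shallow_result, deep_result)
```

The shallow comparison reports the files as equal while the byte-by-byte comparison does not, so passing shallow=False is the safer choice when a match leads to deletion.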

3. Python implementation steps

1. Import the required libraries and set the target folder path

import os
import glob
import filecmp
 
dir_path = r'C:\xxxx'

2. Next, traverse the folder to get the paths of everything inside it. The glob module's ** wildcard, combined with the recursive parameter, does this. The skeleton looks like this:

for file in glob.glob(dir_path + '/**/*', recursive=True):
  pass
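To see what the recursive flag changes, here is a small sketch run against a throwaway directory. The layout (top.txt and sub/nested.txt) is invented for the example and is not part of the original article.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: root/top.txt and root/sub/nested.txt
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("top.txt", os.path.join("sub", "nested.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("x")

    pattern = os.path.join(root, "**", "*")

    # With recursive=True, ** matches zero or more directory levels.
    recursive_hits = sorted(
        os.path.relpath(p, root) for p in glob.glob(pattern, recursive=True)
    )
    # Without it, ** behaves like a single * (exactly one level).
    flat_hits = sorted(os.path.relpath(p, root) for p in glob.glob(pattern))

print(recursive_hits)
print(flat_hits)
```

The recursive call returns the sub folder itself as well as both files, which is why the next step has to filter out directories.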

3. Each item returned by the traversal may be a file or a folder, so we need to check that it is a file. If it is, its path is stored in a list. Two things are needed here:

  1. First create an empty list, to which file paths are later added with list.append(i)
  2. Then use os.path.isfile(i) to check whether the item is a file, and append it only when the check returns True

The code is as follows:

file_lst = []
 
for i in glob.glob(dir_path + '/**/*', recursive=True):
  if os.path.isfile(i):
    file_lst.append(i)
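One caveat: by default, glob does not match names that begin with a dot, so hidden files are skipped by the loop above. If those should be included, os.walk is one alternative. The list_files helper below is a hypothetical name used only for this sketch, not something from the original article.

```python
import os
import tempfile


def list_files(root):
    """Return the paths of all regular files under root, including dot-files."""
    found = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if os.path.isfile(path):
                found.append(path)
    return sorted(found)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout with one hidden file and one nested file.
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", ".hidden", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("x")

    names = [os.path.relpath(p, root) for p in list_files(root)]

print(names)
```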

4. The previous step gives us the paths of all files in the target folder. Next, traverse the path list with nested loops, using filecmp.cmp to compare the files and os.remove to delete the duplicate.

for x in file_lst:
  for y in file_lst:
    if x != y:
      if filecmp.cmp(x, y):
        os.remove(y)

5. The code above implements the general logic, but one detail still needs attention: a later iteration may reach a file that an earlier comparison has already deleted, and the program then raises an error because the file no longer exists. To avoid this, use os.path.exists to check that both files still exist, as shown below:

for x in file_lst:
  for y in file_lst:
    if x != y and os.path.exists(x) and os.path.exists(y):
      if filecmp.cmp(x, y):
        os.remove(y)
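The nested loops visit every pair twice (x, y and y, x). A possible variation, offered here as a sketch rather than as the article's method, visits each unordered pair once with itertools.combinations, tracks deleted paths in a set instead of asking the file system, and compares contents with shallow=False. The remove_duplicates function and its dry_run parameter are hypothetical names introduced for this example.

```python
import filecmp
import itertools
import os
import tempfile


def remove_duplicates(paths, dry_run=True):
    """Compare each unordered pair once; return the paths judged duplicate."""
    removed = set()
    for keep, candidate in itertools.combinations(paths, 2):
        if keep in removed or candidate in removed:
            continue  # one of the two was already handled in an earlier pair
        if filecmp.cmp(keep, candidate, shallow=False):
            removed.add(candidate)
            if not dry_run:
                os.remove(candidate)
    return sorted(removed)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files: a.txt and b.txt share the same content.
    paths = []
    for name, text in (("a.txt", "same"), ("b.txt", "same"), ("c.txt", "other")):
        path = os.path.join(tmp, name)
        with open(path, "w") as fh:
            fh.write(text)
        paths.append(path)

    deleted = [os.path.basename(p) for p in remove_duplicates(paths, dry_run=False)]
    remaining = sorted(os.listdir(tmp))

print(deleted, remaining)
```

Leaving dry_run at its default of True returns the list of duplicates without deleting anything, which makes it possible to review the result before running the script for real.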

4. Complete code

With that, a simple script for removing duplicate files is complete.

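import os
import glob
import filecmp
 
# Target folder to scan (placeholder path)
dir_path = r'C:\xxxx'
 
# Collect the paths of all files under the folder, recursively
file_lst = []
 
for i in glob.glob(dir_path + '/**/*', recursive=True):
  if os.path.isfile(i):
    file_lst.append(i)
 
# Compare every pair and delete the second file of each identical pair
for x in file_lst:
  for y in file_lst:
    # Skip self-comparison and files already deleted earlier
    if x != y and os.path.exists(x) and os.path.exists(y):
      if filecmp.cmp(x, y):
        os.remove(y)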
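Pairwise comparison does a number of comparisons that grows with the square of the file count. For large folders, a different technique is common: group files by size, then by a content hash. The sketch below swaps in that approach using hashlib and is not the article's method. The file_digest and group_duplicates functions are hypothetical names, and the sketch only reports groups of duplicates rather than deleting anything.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_duplicates(paths):
    """Return lists of paths whose contents hash to the same digest."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(g for g in by_hash.values() if len(g) > 1)
    return groups


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files; c.txt has the same size as a.txt but differs.
    samples = (("a.txt", "same"), ("b.txt", "same"),
               ("c.txt", "diff"), ("d.txt", "longer text"))
    paths = []
    for name, text in samples:
        path = os.path.join(tmp, name)
        with open(path, "w") as fh:
            fh.write(text)
        paths.append(path)

    groups = [[os.path.basename(p) for p in g] for g in group_duplicates(paths)]

print(groups)
```

From each reported group, one file can be kept and the rest passed to os.remove.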

Writing this script gives a sense of how powerful Python is for office automation. There are many more ways to automate office work. Follow me for more practical tips in later posts.


copyright notice
Author: Yunyun yyds. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201300529414710.html
