Python notes (23): the regular expression module (re)
2022-01-30 19:07:48 【A bowl week】
Hello everyone, I'm A bowl week, a front-end developer who would rather stay out of the rat race. If you enjoy this article, I'll count myself lucky.
Python provides a module for working with regular expressions: the re module.
Regular expression flags
Modifier | Description | Full name |
---|---|---|
re.I | Perform case-insensitive matching | re.IGNORECASE |
re.A | Make \w, \W, \b, \B, \d, \D, \s and \S match ASCII only instead of full Unicode | re.ASCII |
re.L | Make matching locale-aware (bytes patterns only in Python 3) | re.LOCALE |
re.M | Multi-line matching: ^ and $ also match at the start and end of each line, not only at the start and end of the whole string | re.MULTILINE |
re.S | Make . match any character, including a newline | re.DOTALL |
re.U | Interpret characters according to the Unicode character set. This flag affects \w, \W, \b, \B; it is already the default for str patterns in Python 3 | re.UNICODE |
re.X | Allow more readable patterns: whitespace in the pattern is ignored and # starts a comment | re.VERBOSE |
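The flags in the table can be tried out with a short sketch like the one below. The sample text and variable names are made up for illustration; several flags can be combined with the | operator.

```python
import re

text = "Hello\nhello world"  # illustrative sample text

# re.I: ignore case
ci = re.findall(r"hello", text, flags=re.I)

# re.M: ^ matches at the start of every line
starts = re.findall(r"^\w+", text, flags=re.M)

# re.S: . also matches the newline between the two words
with_dotall = re.match(r"Hello.hello", text, flags=re.S)
without_dotall = re.match(r"Hello.hello", text)

# re.X: whitespace and comments inside the pattern are ignored
year_month = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
""", re.X)
ym = year_month.match("2022-01-30").groups()

# Flags are combined with |
both = re.findall(r"^hello", text, flags=re.I | re.M)

print(ci, starts, bool(with_dotall), without_dotall, ym, both)
```

Without re.S the dot refuses to cross the line break, so the same pattern returns None.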
Find a single match
match
re.match
If zero or more characters at the beginning of string match the regular expression pattern, re.match returns a corresponding match object. If there is no match, it returns None; note that this is different from a zero-length match.
Syntax
re.match(pattern, string, flags=0)
- pattern: the regular expression to match.
- string: the string to match against.
- flags: optional flags that control how the pattern is matched, for example case-insensitive or multi-line matching.
Sample code
"""
-*- coding:utf-8 -*-
author: Xiaotian
time:2020/5/30
"""
import re
string1 = "hello python"
string2 = "hell5o python"
pattern = r"[a-z]+\s\w+"  # one or more lowercase letters, one whitespace character, then one or more word characters
print(re.match(pattern, string1)) # <re.Match object; span=(0, 12), match='hello python'>
print(re.match(pattern, string2)) # None
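The code first imports the re module. The r"" prefix marks a raw string, so backslashes in the pattern are passed to re unchanged instead of being treated as Python string escapes.
string2 does not match because the digit 5 interrupts the run of lowercase letters: [a-z]+ stops after "hell", and the next character is not whitespace.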
group
Match.group
group() is a method of the match object, not a function of the re module. Called without arguments it defaults to group 0, which is the entire match. Capturing groups are numbered from 1 in the order of their opening parentheses. A group can also be given a name with (?P<name>...) and retrieved by that name.
Sample code
"""
-*- coding:utf-8 -*-
author: Xiaotian
time:2020/5/30
"""
import re
string1 = "hello python"
string2 = "hell5o python"
pattern = r"[a-z]+\s\w+"
pattern1 = r"(\w+)(\s)(\w+)"
pattern2 = r"(?P<first>\w+\s)(?P<last>\w+)" # named groups
print(re.match(pattern, string1)) # <re.Match object; span=(0, 12), match='hello python'>
print(re.match(pattern, string1).group()) # hello python
print(re.match(pattern, string2)) # None
print(re.match(pattern1, string2).group(0)) # hell5o python
print(re.match(pattern1, string2).group(1)) # hell5o
print(re.match(pattern1, string2).group(2)) # the single space matched by (\s)
print(re.match(pattern1, string2).group(3)) # python
print(re.match(pattern2, string2).group("last")) # python
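Besides group(), a match object offers a few related accessors. The sketch below uses a pattern with two named groups chosen for illustration (it is not one of the patterns above) to show groups(), groupdict(), span() and index access.

```python
import re

m = re.match(r"(?P<first>\w+)\s(?P<last>\w+)", "hello python")

all_groups = m.groups()      # every capturing group as a tuple
named = m.groupdict()        # named groups as a dict
pair = m.group(1, 2)         # several groups at once
last_span = m.span("last")   # (start, end) indexes of the group "last"
by_index = m[2]              # shorthand for m.group(2), Python 3.6+

print(all_groups, named, pair, last_span, by_index)
```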
search
re.search
Scans through the string looking for the first location where the pattern matches, and returns a corresponding match object. If no position in the string matches, it returns None; note that this is different from finding a zero-length match. The syntax is the same as for match.
Sample code
"""
-*- coding:utf-8 -*-
author: Xiaotian
time:2020/5/30
"""
import re
string = "Hi World Hello python"
pattern = r"Hello python"
print(re.search(pattern, string).group()) # Hello python
print(re.match(pattern, string)) # None
The difference between the two
re.match only matches at the beginning of the string: if the pattern does not match there, the match fails and the function returns None. re.search scans the whole string until it finds a match.
fullmatch
re.fullmatch
If the whole string matches the regular expression, re.fullmatch returns a corresponding match object. Otherwise it returns None; note that this is different from a zero-length match.
The syntax is the same as for match.
Sample code
"""
-*- coding:utf-8 -*-
author: Xiaotian
time:2020/5/30
"""
import re
string = "Hi World Hello python"
pattern = r"Hi World Hello python"
pattern1 = r"hi World hello python"
print(re.fullmatch(pattern, string)) # <re.Match object; span=(0, 21), match='Hi World Hello python'>
print(re.fullmatch(pattern1, string)) # None (matching is case-sensitive by default)
Differences among the three
- match: matches only at the start of the string
- search: finds a match anywhere in the string
- fullmatch: the entire string must match the regular expression
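To see the three functions side by side, the following sketch runs one pattern against three sample strings. The helper name which_match and the sample strings are made up for illustration.

```python
import re

pattern = r"\d+"
samples = ["123", "123abc", "abc123"]

def which_match(text):
    """Report which of the three functions finds a match (hypothetical helper)."""
    return {
        "match": re.match(pattern, text) is not None,
        "search": re.search(pattern, text) is not None,
        "fullmatch": re.fullmatch(pattern, text) is not None,
    }

results = {s: which_match(s) for s in samples}
for s, r in results.items():
    print(s, r)
```

Only "123" satisfies all three; "123abc" fails fullmatch because of the trailing letters, and "abc123" is found by search alone.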
Match objects
A match object always has a boolean value of True. Since match() and search() return None when there is no match, a simple if statement is enough to test whether a match was found.
Sample code
import re
string = "Hi World Hello python"
pattern = r"Hello python"
match1 = re.search(pattern, string)
match2 = re.match(pattern, string)
if match1:
    print(match1.group()) # Hello python
if match2: # match2 is None, so this branch is not executed
    print(match2.group())
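On Python 3.8 or later, the test and the assignment can be combined with an assignment expression. This is only a sketch; the helper name first_word_after_hello is hypothetical.

```python
import re

def first_word_after_hello(text):
    """Return the word that follows 'Hello', or None (hypothetical helper)."""
    # The walrus operator binds the match object and tests it in one step
    if (m := re.search(r"Hello (\w+)", text)) is not None:
        return m.group(1)
    return None

print(first_word_after_hello("Hi World Hello python"))
print(first_word_after_hello("Hi World"))
```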
Find multiple matches
compile
re.compile
Compiles a regular expression pattern into a regular expression object (re.Pattern), whose methods such as match(), search() and findall() can then be used for matching.
Syntax
re.compile(pattern, flags=0)
- pattern: the regular expression to compile.
- flags: optional flags that control how the pattern is matched, for example case-insensitive or multi-line matching.
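Compiling is mostly worthwhile when the same pattern is used many times. A minimal sketch, with sample lines invented for illustration:

```python
import re

# Compile once, including the flag, then reuse the object
word_re = re.compile(r"\bpython\b", re.IGNORECASE)

lines = ["Hello Python", "pythonic code", "I like PYTHON"]
hits = [line for line in lines if word_re.search(line)]

# The compiled object remembers its source pattern and flags
source = word_re.pattern
ignores_case = bool(word_re.flags & re.IGNORECASE)

print(hits, source, ignores_case)
```

"pythonic code" is not a hit because \b requires a word boundary right after "python".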
findall
re.findall
Finds all non-overlapping substrings of the string that match the regular expression and returns them as a list. If no match is found, it returns an empty list. Unlike match and search, which return at most one match, findall returns all of them.
Syntax
re.findall(pattern, string, flags=0)
Pattern.findall(string[, pos[, endpos]])
- pattern: the regular expression to match (module-level function only).
- string: the string to search.
- pos: optional, compiled pattern method only; the index in the string where the search starts. The default is 0.
- endpos: optional, compiled pattern method only; the index where the search stops. The default is the length of the string.
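The return value of findall depends on the number of capturing groups in the pattern, and pos/endpos are only available on a compiled pattern. A small sketch with made-up sample text:

```python
import re

text = "x=1, y=22, z=333"

no_group = re.findall(r"\d+", text)          # whole matches
one_group = re.findall(r"\w=(\d+)", text)    # one group: list of that group's text
two_groups = re.findall(r"(\w)=(\d+)", text) # several groups: list of tuples

digits = re.compile(r"\d+")
from_pos = digits.findall(text, 5)           # search starts at index 5
in_window = digits.findall(text, 5, 9)       # search limited to text[5:9]

nothing = re.findall(r"\d+", "no digits")    # empty list when nothing matches

print(no_group, one_group, two_groups, from_pos, in_window, nothing)
```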
finditer
re.finditer
Returns an iterator that yields match objects over all non-overlapping matches of pattern in string. The string is scanned from left to right, and matches are returned in the order found. Empty matches are included in the result.
The syntax is the same as for match.
Sample code
import re
from collections.abc import Iterator  # abstract base class used to check whether an object is an iterator
string = "hello python hi javascript"
pattern = r"\b\w+\b"
pattern_object = re.compile(pattern)
print(type(pattern_object)) # <class 're.Pattern'>
findall = pattern_object.findall(string)
for i in findall:
    print(i)
finditer = re.finditer(pattern, string)
# Check whether it is an iterator
print(isinstance(finditer, Iterator)) # True
for _ in range(4):
    finditer1 = next(finditer) # take the next match object
    print(finditer1.group())
'''
-- Output of each loop --
hello
python
hi
javascript
'''
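When there are many matches, finditer is usually the better choice than findall: it produces match objects one at a time instead of building the whole list in memory, which is the usual difference between an iterator and a list.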
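Because finditer yields match objects rather than strings, positions are available too, and iteration can stop early. A sketch reusing the sample string from the example above (the variable names are illustrative):

```python
import re

text = "hello python hi javascript"

# Each item is a match object, so the span is available alongside the text
positions = [(m.group(), m.span()) for m in re.finditer(r"\w+", text)]

# Stop at the first word longer than five characters; later words are never scanned
first_long = next(
    (m.group() for m in re.finditer(r"\w+", text) if len(m.group()) > 5),
    None,
)

print(positions)
print(first_long)
```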
Division split
re.split
Splits the string at each occurrence of the pattern and returns the resulting list.
Syntax
re.split(pattern, string, maxsplit=0, flags=0)
- pattern: the regular expression used as the separator.
- string: the string to split.
- maxsplit: the maximum number of splits; maxsplit=1 splits once. The default is 0, which means no limit.
- flags: optional flags that control how the pattern is matched, for example case-insensitive or multi-line matching.
Sample code
import re
string = '''hello hi good morning
goodnight
python
javascript
Linux
'''
pattern = r'\s+' # one or more whitespace characters (spaces, tabs, newlines)
print(re.split(pattern, string)) # no limit on the number of splits
# ['hello', 'hi', 'good', 'morning', 'goodnight', 'python', 'javascript', 'Linux', '']
print(re.split(pattern, string, maxsplit=5)) # split at most 5 times
# ['hello', 'hi', 'good', 'morning', 'goodnight', 'python\njavascript\nLinux\n']
Unlike the split method of str, re.split accepts a regular expression as the separator.
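The contrast with str.split shows up as soon as the separators vary. In this sketch the sample text is invented; it also shows that a capturing group in the pattern keeps the separators in the result.

```python
import re

text = "a, b;c  d"

str_split = text.split(", ")                        # one fixed separator only
re_split = re.split(r"[,;\s]+", text)               # any run of commas, semicolons, whitespace
with_seps = re.split(r"([,;])\s*", text)            # capturing group: separators are kept
limited = re.split(r"[,;\s]+", text, maxsplit=1)    # split once

print(str_split, re_split, with_seps, limited)
```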
Replace
sub
re.sub
Replaces the matches of a pattern in a string.
Syntax
re.sub(pattern, repl, string, count=0, flags=0)
- pattern: the regular expression pattern.
- repl: the replacement string; it can also be a function, which is called with each match object and returns the replacement text.
- string: the original string to search and replace in.
- count: the maximum number of replacements. The default 0 replaces all matches.
- flags: the matching flags, given as re constants (integer bit flags).
As a small example, the following code filters certain words out of a comment, the way the comment section of a short-video app might mask unwanted terms:
import re
string = input("Please enter a comment: ")
pattern = r"beautiful|lovely|generous" # words to detect
print(re.sub(pattern, " ' ", string))
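The repl and count parameters described above can be sketched as follows. The helper name double and the sample strings are illustrative only.

```python
import re

# Backreferences in repl: \1 and \2 refer to the captured groups
swapped = re.sub(r"(\w+) (\w+)", r"\2, \1", "hello python")

# A function as repl: it receives each match object and returns the replacement
def double(m):
    return str(int(m.group()) * 2)

doubled = re.sub(r"\d+", double, "3 apples and 10 pears")

# count limits the number of replacements
limited = re.sub(r"a", "*", "banana", count=2)

print(swapped, doubled, limited, sep="\n")
```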
subn
Behaves the same as sub(), but returns a tuple (new string, number of replacements).
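A minimal sketch of subn, collapsing runs of whitespace in a made-up string and unpacking the returned tuple:

```python
import re

# Collapse every run of whitespace into a single space
new_text, replacements = re.subn(r"\s+", " ", "hello   python \n world")

print(new_text)      # the rewritten string
print(replacements)  # how many runs were replaced
```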
escape
re.escape(pattern)
Escapes the special characters in pattern, such as regular expression metacharacters.
Sample code
import re
pattern = r'\w\s*\d\d.'
# Print pattern with its special characters escaped
print(re.escape(pattern)) # \\w\\s\*\\d\\d\.
This is useful for matching an arbitrary literal string that may contain regular expression metacharacters. It is easy to misuse, though, and for short fixed patterns escaping by hand is often clearer.
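A typical case for re.escape is searching for text that comes from a user. In this sketch, user_input and text are made-up sample values.

```python
import re

user_input = "1+1=2 (really?)"   # contains the metacharacters + ( ? )
text = "He wrote 1+1=2 (really?) on the board."

# Used as-is, the metacharacters are interpreted, so the literal text is not found
unescaped = re.search(user_input, text)

# Escaped, every character is matched literally
escaped = re.search(re.escape(user_input), text)

print(unescaped)
print(escaped.group())
```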
purge
re.purge()
Clears the regular expression cache.
Copyright notice
Author: A bowl week. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201301907447369.html