Python notes - day14 - regular expressions
2022-05-15 06:03:51【Wandering mage 12】
Preface
Notes from learning Python syntax, left here for anyone who needs them.
# coding=utf8
# @time:2022/4/22 20:14
# Author Haoyu
# The re module is Python's built-in module for working with regular expressions
from re import fullmatch,findall,search
# I. Matching symbols in regular expressions
# 1. What is a regular expression
# A regular expression is a tool that makes many string-processing tasks much simpler
# A regular expression describes a rule for strings using a set of special symbols
# The regular expression syntax is largely the same across programming languages, but the way a pattern is written differs: Python uses a string ('pattern'); JavaScript uses a literal (/pattern/)
# 2. Regular expression symbols
# 1) Ordinary characters - an ordinary character stands for itself in a regular expression
# fullmatch(pattern, string) - checks whether the whole string conforms to the rule described by the pattern; returns a match object if it does, otherwise None
'''''''''
re_str = 'abc' # Rule: the string has exactly three characters: a, b, and c, in that order
result = fullmatch(re_str,'abc')
print(result) # <re.Match object; span=(0, 3), match='abc'>
'''''''''
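Because fullmatch returns either a match object or None, its result can be used directly as a yes/no test. A minimal sketch, assuming a hypothetical helper name `is_exact_match` that is not part of the original notes:

```python
from re import fullmatch

def is_exact_match(pattern, text):
    """Return True when the whole of text matches pattern (hypothetical helper)."""
    return fullmatch(pattern, text) is not None

print(is_exact_match('abc', 'abc'))   # True
print(is_exact_match('abc', 'abcd'))  # False: the trailing d is not covered by the pattern
print(is_exact_match('abc', 'ab'))    # False: the c is missing
```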
# 2) Special characters
# First: . - matches any single character (except a newline, by default)
'''''''''
'a.b' - matches a string of length 3: the first character is a, the last is b, and the middle is any single character
re_str = 'a.b'
result = fullmatch(re_str,'a+b')
print(result) # <re.Match object; span=(0, 3), match='a+b'>
# 'xy..' - matches a string of length 4: the first two characters are xy and the last two are any characters
'''''''''
# Second: \d - matches any single digit
'''''''''
re_str = r'a\db' # Matches a string of length 3: the first character is a, the last is b, and the middle is any digit (the r prefix keeps the backslash intact for the regex engine)
result = fullmatch(re_str,'a5b')
print(result) # <re.Match object; span=(0, 3), match='a5b'>
'''''''''
# Third: \s - matches a single whitespace character
# Whitespace characters: a space, '\n' (newline), '\t' (tab)
'''''''''
re_str = r'abc\s123'
result = fullmatch(re_str,'abc 123')
print(result) # <re.Match object; span=(0, 7), match='abc 123'>
'''''''''
# Fourth: \w (just be aware of it) - matches a single letter, digit, or underscore (for str patterns, many non-ASCII characters such as Chinese characters also match)
'''''''''
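re_str = r'\d\w\d'
result = fullmatch(re_str,'2 see 8') # pass the variable re_str, not the literal string 're_str'
print(result) # None: \w matches exactly one word character, but ' see ' is five characters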
'''''''''
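The remark about characters outside the ASCII table can be checked directly. The sketch below is illustrative only; the sample strings and variable names are assumptions, and the `re.ASCII` flag is used to show the restricted behaviour for comparison:

```python
import re

pattern = r'\d\w\d'

letter = re.fullmatch(pattern, '2a8')        # a letter is a word character
underscore = re.fullmatch(pattern, '2_8')    # so is the underscore
chinese = re.fullmatch(pattern, '2看8')       # and, for str patterns, a Chinese character
space = re.fullmatch(pattern, '2 8')         # a space is not a word character
ascii_only = re.fullmatch(pattern, '2看8', flags=re.ASCII)  # re.ASCII limits \w to [a-zA-Z0-9_]

print(letter is not None)      # True
print(underscore is not None)  # True
print(chinese is not None)     # True
print(space is not None)       # False
print(ascii_only is not None)  # False
```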
# Fifth: \D - matches any single non-digit character
'''''''''
re_str = r'a\Db'
result = fullmatch(re_str,'aab')
print(result) # <re.Match object; span=(0, 3), match='aab'>
'''''''''
# Sixth: \S - matches any single non-whitespace character
'''''''''
re_str = r'a\Sb'
result = fullmatch(re_str,'acb')
print(result) # <re.Match object; span=(0, 3), match='acb'>
'''''''''
# Seventh: [character set] - matches any single character from the set
'''''''''
Note: a. One [] matches exactly one character
b. Inside [], a - between two characters denotes a range; the code point of the character before the - must not be greater than that of the character after it
c. Inside [], - only has a special meaning between two characters (a range); at the very start or end of the set it stands for - itself
[a1+] - matches a, 1, or +
[\dxy] - matches any digit, x, or y
[1-9] - matches any digit from 1 to 9
[a-z] - matches any lowercase letter
[A-Z] - matches any uppercase letter
[\u4e00-\u9fa5] - matches any common Chinese character (code points U+4E00 to U+9FA5)
[a-z+=\\] - matches any lowercase letter, +, =, or \ (a backslash inside [] must itself be escaped)
'''''''''
'''''''''
re_str = 'a[xyz]b' # Matches a string of length 3: the first character is a, the third is b, and the middle is any one of x, y, z
result = fullmatch(re_str,'axb')
print(result) # <re.Match object; span=(0, 3), match='axb'>
'''''''''
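The range rules listed above can be tried out with a few small patterns. This is a sketch with made-up variable names and sample inputs, not code from the original notes:

```python
from re import fullmatch

hex_digit = r'[0-9a-fA-F]'        # three ranges inside one set
signed_digit = r'[+-]\d'          # - at the end of the set is a literal minus sign
chinese_char = r'[\u4e00-\u9fa5]' # range given by code points

is_hex_c = fullmatch(hex_digit, 'c') is not None
is_hex_g = fullmatch(hex_digit, 'g') is not None
minus_five = fullmatch(signed_digit, '-5') is not None
is_chinese = fullmatch(chinese_char, '中') is not None

print(is_hex_c)    # True
print(is_hex_g)    # False: g is outside a-f
print(minus_five)  # True
print(is_chinese)  # True
```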
# Eighth: [^character set] - matches any single character NOT in the set
'''''''''
[^abc] - matches any character other than a, b, or c
[^\d] - matches any character other than a digit
[^a-z] - matches any character other than a lowercase letter
Note: ^ negates the set only when it is the first character inside []; anywhere else in the set it stands for ^ itself
re_str = r'\d[^abc]\d'
result = fullmatch(re_str,'374')
print(result) # <re.Match object; span=(0, 3), match='374'>
'''''''''
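A short sketch of the note about where ^ sits inside the set (variable names and inputs are illustrative assumptions):

```python
from re import fullmatch

not_abc_d = fullmatch(r'[^abc]', 'd') is not None  # d is outside the set a, b, c
not_abc_a = fullmatch(r'[^abc]', 'a') is not None  # a is excluded by the negation
caret_literal = fullmatch(r'[a^]', '^') is not None  # ^ is not first, so it is a literal ^

print(not_abc_d)      # True
print(not_abc_a)      # False
print(caret_literal)  # True
```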
# II. Detection symbols (zero-width checks)
# Each matching symbol consumes one character of the string; a detection symbol consumes nothing and does not affect the matched length. It only checks whether the position where it appears meets a requirement
# 1. Word boundaries
# \b - checks that the position is a word boundary
# \B - checks that the position is not a word boundary
# Word boundary: a position between a word character (\w) and a non-word character, or the start or end of the string next to a word character. For example: around spaces and punctuation such as commas, and at the start and end of the string
'''''''''
re_str = r'abc,\b123' # The r prefix makes this a raw string, so Python leaves backslashes alone (without it, '\b' would become a backspace character); the regular expression itself is unaffected
result = fullmatch(re_str,'abc,123')
print(result) # <re.Match object; span=(0, 7), match='abc,123'>
Note: the characters are matched first, then the check is applied: is there a word boundary after the comma, that is, in front of the 1? There is (',' is not a word character and '1' is), so the match succeeds
'''''''''
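A common use of \b is to find a word only when it stands alone. The sketch below uses an invented sample sentence to compare the plain pattern with the bounded ones:

```python
from re import findall

text = 'cat concat cat, category'

anywhere = findall(r'cat', text)         # every occurrence, even inside longer words
whole_word = findall(r'\bcat\b', text)   # only cat with a boundary on both sides
word_ending = findall(r'\Bcat\b', text)  # cat at the end of a longer word (concat)

print(len(anywhere))  # 4
print(whole_word)     # ['cat', 'cat']
print(word_ending)    # ['cat']
```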
# 2. ^ - checks that the position is the start of the string
'''''''''
re_str = r'^\d\d\d'
result = findall(re_str,'123asd')
print(result) # ['123']
'''''''''
# 3. $ - checks that the position is the end of the string
'''''''''
re_str = r'\d\d\d$'
result = findall(re_str,'sd123')
print(result) # ['123']
'''''''''
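The effect of ^ and $ is easiest to see with a function that otherwise scans the whole string, such as search. A minimal sketch with illustrative inputs:

```python
from re import search

starts_with_digits = search(r'^\d\d\d', '123asd') is not None
digits_not_at_start = search(r'^\d\d\d', 'asd123') is not None
ends_with_digits = search(r'\d\d\d$', 'asd123') is not None
digits_not_at_end = search(r'\d\d\d$', '123asd') is not None

print(starts_with_digits)   # True
print(digits_not_at_start)  # False: the digits exist but not at the start
print(ends_with_digits)     # True
print(digits_not_at_end)    # False: the digits exist but not at the end
```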
# III. Controlling the number of matches
# 1. + - one or more times (at least once)
'''''''''
a+ - the character a, one or more times (at least one a)
\d+ - any digit, one or more times
.+ - any character, one or more times
re_str = 'xa+y'
result = fullmatch(re_str,'xaay')
print(result) # <re.Match object; span=(0, 4), match='xaay'>
re_str = r'x\d+y'
result = fullmatch(re_str,'x12y')
print(result) # <re.Match object; span=(0, 4), match='x12y'>
'''''''''
# 2. * - zero or more times (any number of times)
'''''''''
re_str = 'xa*y'
result = fullmatch(re_str,'xaaaaaay')
print(result) # <re.Match object; span=(0, 8), match='xaaaaaay'>
'''''''''
# 3. ? - zero or one time
'''''''''
re_str = 'xa?y'
result = fullmatch(re_str,'xay')
result1 = fullmatch(re_str,'xy')
print(result) # <re.Match object; span=(0, 3), match='xay'>
print(result1) # <re.Match object; span=(0, 2), match='xy'>
'''''''''
# 4. {} - an explicit number of times
'''''''''
1) {N} - exactly N times
2) {M,N} - from M to N times, inclusive ([M, N])
3) {M,} - at least M times
4) {,N} - at most N times (from 0 to N)
'''''''''
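The four {} forms have no example of their own, so here is a small sketch (the variable names and sample strings are illustrative):

```python
from re import fullmatch

exactly_three = fullmatch(r'\d{3}', '123') is not None
too_few = fullmatch(r'\d{3}', '12') is not None
in_range = fullmatch(r'a{2,4}', 'aaa') is not None
over_range = fullmatch(r'a{2,4}', 'aaaaa') is not None
at_least_two = fullmatch(r'a{2,}', 'aaaaaaa') is not None
at_most_two_empty = fullmatch(r'a{,2}', '') is not None
at_most_two_over = fullmatch(r'a{,2}', 'aaa') is not None

print(exactly_three)      # True
print(too_few)            # False
print(in_range)           # True
print(over_range)         # False: five is more than four
print(at_least_two)       # True
print(at_most_two_empty)  # True: zero repetitions are allowed
print(at_most_two_over)   # False
```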
# 5. (Important!) Greedy and non-greedy matching
# When the number of repetitions is not fixed, matching is either greedy or non-greedy; the default is greedy.
# 1) Greedy: quantifiers with a variable count (+ * ? {M,N} {M,} {,N}) are greedy by default
# Among all the ways the match can succeed, take the one with the most repetitions. (If 3, 4, and 6 repetitions would all succeed, 6 is chosen.)
# re_str = r'a.+b'
# print(findall(re_str,'amsnbsdhdnb')) # ['amsnbsdhdnb']
# 2) Non-greedy: add ? after the quantifier (+? *? ?? {M,N}?)
# Among all the ways the match can succeed, take the one with the fewest repetitions. (If 3, 4, and 6 repetitions would all succeed, 3 is chosen.)
# re_str = r'a.+?b'
# print(findall(re_str,'amsnbsdhdnb')) # ['amsnb']
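The difference matters most when a delimiter can occur several times. A sketch with an invented tag-like string, not taken from the original notes:

```python
from re import findall

text = '<a><b>'

greedy = findall(r'<.+>', text)  # .+ runs to the last > in the string
lazy = findall(r'<.+?>', text)   # .+? stops at the first > it can

print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']
```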
# IV. Grouping and alternation
# 1. () - grouping
# Grouping wraps part of a regular expression in () so that it can be treated as a single unit
# Each () in a regular expression is one group
'''''''''
# 1) Operating on a group as a whole
print(fullmatch(r'(\d{2}[a-z]{3})+','22asd')) # <re.Match object; span=(0, 5), match='22asd'>
# 2) Repeating a group's match
\M - repeats the content matched by the M-th group (a backreference)
print(fullmatch(r'(\d{2})ab\1','22ab22')) # <re.Match object; span=(0, 6), match='22ab22'>
# 3) Filtering (capturing)
'''''''''
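The filtering use of groups has no example in the notes. As a sketch under that reading: when a pattern contains one group, findall returns only what the group matched, and a match object exposes the same content through group(). The sample text is invented:

```python
from re import findall, fullmatch

text = 'ab12 cd345 ef6'

whole = findall(r'[a-z]+\d{2}', text)     # no group: the full matches are returned
digits = findall(r'[a-z]+(\d{2})', text)  # one group: only the group's content is returned

m = fullmatch(r'(\d{2})ab\1', '22ab22')
first_group = m.group(1)                  # content of group 1
mismatch = fullmatch(r'(\d{2})ab\1', '22ab33')  # \1 must repeat 22, so 33 fails

print(whole)        # ['ab12', 'cd34']
print(digits)       # ['12', '34']
print(first_group)  # 22
print(mismatch)     # None
```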
# 2. | - alternation (branches)
'''''''''
# Exercise: write a regular expression that matches abc followed by either three digits or three uppercase letters
# 'abc827', 'abcKNM'
re_str = r'abc(\d{3}|[A-Z]{3})' # | has the lowest precedence, so the group limits the alternation to the part after abc
print(fullmatch(re_str,'abcKNM')) # <re.Match object; span=(0, 6), match='abcKNM'>
'''''''''
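Where the parentheses go changes what | chooses between. The following sketch compares a grouped and an ungrouped pattern on the same inputs; the pattern names and the sample list are illustrative:

```python
from re import fullmatch

grouped = r'abc(\d{3}|[A-Z]{3})'  # abc, then digits OR capitals
ungrouped = r'abc\d{3}|[A-Z]{3}'  # (abc then digits) OR (capitals alone)

samples = ['abc827', 'abcKNM', 'ASD']
grouped_results = [fullmatch(grouped, s) is not None for s in samples]
ungrouped_results = [fullmatch(ungrouped, s) is not None for s in samples]

print(grouped_results)    # [True, True, False]
print(ungrouped_results)  # [True, False, True]
```

Without the group, | splits the whole pattern in two, so 'ASD' is accepted and 'abcKNM' is rejected.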
# 3. Escaping
# Putting \ in front of a symbol that has a special meaning in a regular expression removes that meaning, so the symbol stands for itself
# re_str = r'\d{2}\.a' - two digits, followed by a literal dot, followed by an a
# Note: apart from the symbols that are special inside [] (such as ^ and -), symbols inside [] stand for themselves
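A sketch of the escaping rules above, plus `re.escape`, a standard-library function that adds the backslashes automatically when a whole string should be treated literally (the sample strings are assumptions):

```python
from re import escape, fullmatch

escaped_dot = fullmatch(r'\d{2}\.a', '12.a') is not None   # \. accepts only a real dot
escaped_dot_x = fullmatch(r'\d{2}\.a', '12xa') is not None
plain_dot = fullmatch(r'\d{2}.a', '12xa') is not None      # an unescaped . accepts any character
plus_in_set = fullmatch(r'[.+*]', '+') is not None         # inside [], + needs no backslash

literal = '1+1=2'
auto_escaped = fullmatch(escape(literal), literal) is not None
unescaped = fullmatch(literal, literal) is not None        # here + is still a quantifier

print(escaped_dot)    # True
print(escaped_dot_x)  # False
print(plain_dot)      # True
print(plus_in_set)    # True
print(auto_escaped)   # True
print(unescaped)      # False
```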
from re import fullmatch,match,search,findall,split,sub
# V. The re module
# 1. fullmatch(pattern, string) - checks whether the whole string matches the pattern; returns a match object if it does, otherwise None
# 2. match(pattern, string) - checks whether the beginning of the string matches the pattern; returns a match object if it does, otherwise None
'''''''''
re_str = r'\d{3}'
print(match(re_str,'123asd asd')) # <re.Match object; span=(0, 3), match='123'>
'''''''''
# 3. search(pattern, string) - finds the first substring of the string that matches the pattern; returns a match object if there is one, otherwise None
'''''''''
re_str = r'\d{3}'
print(search(re_str,'123asd321asd')) # <re.Match object; span=(0, 3), match='123'>
'''''''''
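The three functions differ only in where the match is allowed to sit. A sketch on one string whose digits are not at the start (the input is illustrative), which also shows how to read a match object:

```python
from re import fullmatch, match, search

pattern = r'\d{3}'
text = 'asd123asd321'

full = fullmatch(pattern, text)  # the whole string must match
start = match(pattern, text)     # the match must begin at index 0
first = search(pattern, text)    # the first match anywhere in the string

print(full)           # None
print(start)          # None
print(first.group())  # 123
print(first.span())   # (3, 6)
```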
# 4. findall(pattern, string) - finds all non-overlapping substrings that match the pattern and returns them as a list
'''''''''
print(findall(r'\d{3}','123asd432123')) # ['123', '432', '123']
'''''''''
# 5. split(pattern, string) - splits the string, using every substring that matches the pattern as a separator, and returns a list
'''''''''
print(split('[ab]','dsadbdsfe')) # ['ds', 'd', 'dsfe']
'''''''''
# 6. sub(pattern, string1, string2) - replaces every substring of string2 that matches the pattern with string1
'''''''''
print(sub(r'\d','+','jiushi123')) # jiushi+++
'''''''''
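split and sub combine naturally with the symbols from the earlier sections. A brief sketch with invented inputs, using the `count` parameter of sub and a group reference in the replacement:

```python
from re import split, sub

fields = split(r'[,;\s]+', 'a, b;c  d')              # any run of commas, semicolons, or whitespace separates fields
first_only = sub(r'\d', '+', 'jiushi123', count=1)   # count limits the number of replacements
swapped = sub(r'(\d+)-(\d+)', r'\2-\1', '12-34')     # \1 and \2 refer to the groups

print(fields)      # ['a', 'b', 'c', 'd']
print(first_only)  # jiushi+23
print(swapped)     # 34-12
```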
Copyright notice
Author: Wandering mage 12. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/131/202205110607424172.html