A quick reference to common regular expressions, essential for Python text processing

2022-02-01 07:26:16 Can you guarantee it

I. Expressions for validating numbers
  1. Digits only: ^[0-9]*$
  2. An n-digit number: ^\d{n}$
  3. A number with at least n digits: ^\d{n,}$
  4. A number with m to n digits: ^\d{m,n}$
  5. Zero, or a number that does not start with zero: ^(0|[1-9][0-9]*)$
  6. A number starting with a nonzero digit, with at most two decimal places: ^[1-9][0-9]*(\.[0-9]{1,2})?$
  7. A positive or negative number with 1-2 decimal places: ^(-)?\d+(\.\d{1,2})?$
  8. Positive numbers, negative numbers, and decimals: ^[-+]?\d+(\.\d+)?$
  9. A positive real number with two decimal places: ^[0-9]+(\.[0-9]{2})?$
  10. A positive real number with 1-3 decimal places: ^[0-9]+(\.[0-9]{1,3})?$
  11. Nonzero positive integer: ^[1-9]\d*$ or ^([1-9][0-9]*){1,3}$ or ^\+?[1-9][0-9]*$
  12. Nonzero negative integer: ^-[1-9][0-9]*$ or ^-[1-9]\d*$
  13. Non-negative integer: ^\d+$ or ^([1-9]\d*|0)$
  14. Non-positive integer: ^(-[1-9]\d*|0)$ or ^((-\d+)|(0+))$
  15. Non-negative floating point number: ^\d+(\.\d+)?$ or ^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$
  16. Non-positive floating point number: ^((-\d+(\.\d+)?)|(0+(\.0+)?))$ or ^((-([1-9]\d*\.\d*|0\.\d*[1-9]\d*))|0?\.0+|0)$
  17. Positive floating point number: ^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ or ^(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*))$
  18. Negative floating point number: ^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ or ^(-(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*)))$
  19. Floating point number: ^(-?\d+)(\.\d+)?$ or ^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$
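
The number patterns above can be tried out in Python roughly as follows. This is a minimal sketch, not a definitive implementation: the constant names and the is_match helper are made up for illustration, and the decimal point is written as \. so that it matches a literal dot rather than any character.

```python
import re

# Illustrative names; the patterns follow items 13, 8 and 6 of the list.
# The ^ and $ anchors are dropped because fullmatch() anchors both ends.
NON_NEGATIVE_INT = re.compile(r"\d+")
SIGNED_DECIMAL = re.compile(r"[-+]?\d+(\.\d+)?")
UP_TO_TWO_PLACES = re.compile(r"[1-9][0-9]*(\.[0-9]{1,2})?")


def is_match(pattern: re.Pattern, text: str) -> bool:
    """Return True only when the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


print(is_match(NON_NEGATIVE_INT, "2022"))    # True
print(is_match(SIGNED_DECIMAL, "-3.14"))     # True
print(is_match(SIGNED_DECIMAL, "3."))        # False
print(is_match(UP_TO_TWO_PLACES, "12.34"))   # True
print(is_match(UP_TO_TWO_PLACES, "12.345"))  # False
```

One reason to prefer fullmatch() over re.match() with ^...$: in Python, $ also matches just before a trailing newline, so re.match(r"^\d+$", "123\n") succeeds, while fullmatch() rejects that input.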

II. Expressions for validating characters

  1. Chinese characters: ^[\u4e00-\u9fa5]{0,}$
  2. English letters and digits: ^[A-Za-z0-9]+$ or ^[A-Za-z0-9]{4,40}$
  3. Any characters, length 3-20: ^.{3,20}$
  4. A string made up of the 26 English letters: ^[A-Za-z]+$
  5. A string made up of the 26 uppercase letters: ^[A-Z]+$
  6. A string made up of the 26 lowercase letters: ^[a-z]+$
  7. A string made up of digits and the 26 English letters: ^[A-Za-z0-9]+$
  8. A string made up of digits, the 26 English letters, or underscores: ^\w+$ or ^\w{3,20}$ (in Python 3, \w on a str pattern also matches non-ASCII word characters unless re.ASCII is set)
  9. Chinese, English letters, and digits, including underscores: ^[\u4E00-\u9FA5A-Za-z0-9_]+$
  10. Chinese, English letters, and digits, excluding underscores and other symbols: ^[\u4E00-\u9FA5A-Za-z0-9]+$ or ^[\u4E00-\u9FA5A-Za-z0-9]{2,20}$
  11. Input that contains none of the characters %&',;=?$" (the leading ^ negates the class): [^%&',;=?$\x22]+
  12. Input that contains no ~ (nor a double quote, \x22): [^~\x22]+
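
A short sketch of how the character patterns behave in Python 3. The constant names and sample strings are made up for illustration. The point to watch is that \w on a str pattern is Unicode-aware by default, so it is not equivalent to [A-Za-z0-9_] unless re.ASCII is passed.

```python
import re

CHINESE_ONLY = re.compile(r"[\u4e00-\u9fa5]+")
WORD_UNICODE = re.compile(r"\w+")            # default behaviour for str patterns
WORD_ASCII = re.compile(r"\w+", re.ASCII)    # restricts \w to [A-Za-z0-9_]

print(CHINESE_ONLY.fullmatch("正则表达式") is not None)  # True
print(WORD_UNICODE.fullmatch("用户_01") is not None)     # True
print(WORD_ASCII.fullmatch("用户_01") is not None)       # False
print(WORD_ASCII.fullmatch("user_01") is not None)       # True

# findall() pulls every run of Chinese characters out of mixed text.
print(CHINESE_ONLY.findall("Python 正则 re 模块"))        # ['正则', '模块']
```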

Other:

  1. Any run of characters other than \n: .*
  2. Chinese characters: [\u4E00-\u9FA5]
  3. Full-width symbols: [\uFF00-\uFFFF]
  4. Half-width symbols: [\u0000-\u00FF]
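
The newline behaviour of the dot is easy to check in Python. The sketch below uses made-up sample strings; re.DOTALL is the standard flag that lets the dot match \n as well.

```python
import re

text = "first line\nsecond line"

# By default "." stops at a newline, so ".*" covers only the first line.
print(re.match(r".*", text).group())                     # first line

# With re.DOTALL, "." also matches "\n" and ".*" covers the whole string.
print(re.match(r".*", text, re.DOTALL).group() == text)  # True

# Full-width punctuation falls in U+FF00-U+FFFF; the sample contains a
# full-width comma (U+FF0C) and a full-width exclamation mark (U+FF01).
FULL_WIDTH = re.compile(r"[\uFF00-\uFFFF]")
sample = "Hello\uff0cworld\uff01"
print(len(FULL_WIDTH.findall(sample)))                   # 2
```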

III. Expressions for special requirements

  1. Email address: ^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$
  2. Domain name: [a-zA-Z0-9][-a-zA-Z0-9]{0,62}(\.[a-zA-Z0-9][-a-zA-Z0-9]{0,62})+\.?
  3. Internet URL: [a-zA-Z]+://[^\s]* or ^http://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?$
  4. Mobile phone number (mainland China prefixes): ^(13[0-9]|14[57]|15[0-35-9]|18[0-35-9])\d{8}$
  5. Telephone number ("XXX-XXXXXXX", "XXXX-XXXXXXXX", "XXX-XXXXXXXX", "XXXXXXX", and "XXXXXXXX"): ^(\d{3,4}-)?\d{7,8}$
  6. Domestic telephone number (0511-4405222, 021-87888822): \d{3}-\d{8}|\d{4}-\d{7}
  7. ID number (15 or 18 digits): ^(\d{15}|\d{18})$
  8. Short ID number (digits, optionally ending with the letter x): ^([0-9]){7,18}(x|X)?$ or ^(\d{8,18}|[0-9x]{8,18}|[0-9X]{8,18})$
  9. Account name validity (starts with a letter, 5-16 characters, letters, digits, and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$
  10. Password (starts with a letter, 6-18 characters long, only letters, digits, and underscores): ^[a-zA-Z]\w{5,17}$
  11. Strong password (must combine uppercase letters, lowercase letters, and digits, no special characters, 8-10 characters long): ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z0-9]{8,10}$
  12. Date format: ^\d{4}-\d{1,2}-\d{1,2}$
  13. The 12 months of a year (01-09 and 1-12): ^(0?[1-9]|1[0-2])$
  14. The 31 days of a month (01-09 and 1-31): ^((0?[1-9])|((1|2)[0-9])|30|31)$
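
A few of the expressions above, sketched in Python. The names EMAIL, STRONG_PASSWORD, DATE_SHAPE and is_real_date are hypothetical. Two details are assumptions on my part rather than something the list states: the dot before the top-level domain is escaped, and the password body is restricted to letters and digits so that it agrees with the "no special characters" description. The date pattern only checks the shape, so the sketch hands the calendar check to datetime.

```python
import re
from datetime import datetime

EMAIL = re.compile(r"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*")
STRONG_PASSWORD = re.compile(r"(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[A-Za-z0-9]{8,10}")
DATE_SHAPE = re.compile(r"\d{4}-\d{1,2}-\d{1,2}")


def is_real_date(text: str) -> bool:
    """Check the yyyy-mm-dd shape first, then let datetime check the calendar."""
    if DATE_SHAPE.fullmatch(text) is None:
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


print(EMAIL.fullmatch("user.name+tag@example.com") is not None)  # True
print(EMAIL.fullmatch("user@example") is not None)               # False
print(STRONG_PASSWORD.fullmatch("Passw0rd1") is not None)        # True
print(STRONG_PASSWORD.fullmatch("password1") is not None)        # False
print(is_real_date("2020-02-29"))                                # True
print(is_real_date("2022-02-30"))                                # False
```

Patterns like these are a first filter only. An address that passes the email expression can still be undeliverable.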

Input format for money:

  1. We accept four forms of an amount: "10000.00" and "10,000.00", and, without the cents, "10000" and "10,000": ^[1-9][0-9]*$
  2. That matches any number that does not start with 0, but it also rejects the single character "0", so we use the following form instead: ^(0|[1-9][0-9]*)$
  3. That matches a single 0 or a number that does not start with 0. We can also allow a leading minus sign: ^(0|-?[1-9][0-9]*)$
  4. That matches a 0, or a possibly negative number that does not start with 0. Let us allow leading zeros after all, and drop the minus sign too, since an amount of money should not be negative. What we add next is the optional decimal part: ^[0-9]+(\.[0-9]+)?$
  5. Note that at least one digit must follow the decimal point, so "10." is rejected while "10" and "10.2" are accepted. To require exactly two decimal places: ^[0-9]+(\.[0-9]{2})?$
  6. That requires two digits after the decimal point. If that seems too strict, use: ^[0-9]+(\.[0-9]{1,2})?$
  7. That lets the user write just one decimal place. Now consider commas inside the number: ^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]{1,2})?$
  8. That is 1 to 3 digits followed by any number of "comma + 3 digits" groups. To make the commas optional rather than required: ^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?$

Note: this is the final result. Remember that + can be replaced with * if you consider an empty string acceptable (strange, why would you?). Finally, watch the backslashes when you pass a pattern to a function. In Python, write the pattern as a raw string (r"...") so that they reach the regex engine unchanged. Most mistakes happen here.
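
The final money expression can be used for validation before conversion. The sketch below makes some assumptions: parse_money is a hypothetical helper, the decimal point is escaped as \., and Decimal is chosen here only because it preserves the digits exactly as typed.

```python
import re
from decimal import Decimal

# Final pattern from the walkthrough; fullmatch() replaces the ^ and $ anchors.
MONEY = re.compile(r"([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?")


def parse_money(text: str) -> Decimal:
    """Validate an amount such as '10,000.00' and convert it to Decimal."""
    if MONEY.fullmatch(text) is None:
        raise ValueError(f"not a valid amount: {text!r}")
    # Commas are only separators, so drop them before converting.
    return Decimal(text.replace(",", ""))


for sample in ("10000", "10,000.00", "10000.5"):
    print(sample, "->", parse_money(sample))
```

Inputs such as "10." or "1,00" fail the fullmatch() check and raise ValueError.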

  1. XML file name: ^([a-zA-Z]+-?)+[a-zA-Z0-9]+\.[xX][mM][lL]$
  2. Chinese characters: [\u4e00-\u9fa5]
  3. Double-byte characters: [^\x00-\xff] (includes Chinese characters; can be used to compute the length of a string, counting a double-byte character as 2 and an ASCII character as 1)
  4. Blank lines: \n\s*\r (can be used to delete blank lines)
  5. HTML tags: <(\S*?)[^>]*>.*?</\1>|<.*? /> (the versions circulating online are poor; this one covers only some cases and still cannot handle complex nested tags)
  6. Leading and trailing whitespace: ^\s*|\s*$ or (^\s*)|(\s*$) (can be used to delete whitespace at the start and end of a line, including spaces, tabs, form feeds, and so on; a very useful expression)
  7. QQ number: [1-9][0-9]{4,} (QQ numbers start from 10000)
  8. Chinese postal code: [1-9]\d{5}(?!\d) (Chinese postal codes have 6 digits)
  9. IP address: \d+\.\d+\.\d+\.\d+ (useful for extracting IP addresses)
  10. IP address: ((?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d))
  11. IPv4 address: \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b (useful for extracting IP addresses)
  12. IPv6 address validation: (([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))
  13. Subnet mask: ((?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)) (checks the dotted-quad shape only, not whether the mask bits are contiguous)
  14. Date validation: ^(?:(?!0000)[0-9]{4}-(?:(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:0[13-9]|1[0-2])-(?:29|30)|(?:0[13578]|1[02])-31)|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)-02-29)$ (validates dates in "yyyy-mm-dd" format; common years and leap years are both handled)
  15. Extract comments :``
  16. Find CSS properties: ^\s*[a-zA-Z\-]+\s*[:]{1}\s[a-zA-Z0-9\s.#]+[;]{1}
  17. Extract page hyperlinks: (<a\s*(?!.*\brel=)[^>]*)(href="https?:\/\/)((?!(?:(?:www\.)?'.implode('|(?:www\.)?', $follow_list).'))[^"]+)"((?!.*\brel=)[^>]*)(?:[^>]*)> (written as a PHP string fragment: '.implode(...).' splices the domains in $follow_list into the pattern)
  18. Extract images from a web page: < *img[^>]*src *= *["']?([^"' >]*)
  19. Match web page color codes: ^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$
  20. File extension validation: ^([a-zA-Z]\:|\\)\\([^\\]+\\)*[^\/:*?"<>|]+\.txt(l)?$
  21. Detect the IE version: ^.*MSIE [5-8](?:\.[0-9]+)?(?!.*Trident\/[5-9]\.0).*$

Copyright notice
Author: [Can you guarantee it]. Please include the original link when reprinting. Thank you.
https://en.pythonmana.com/2022/02/202202010726148354.html
