Filtering emoticons from text with Python
2022-01-31 21:15:42 【Zong AI's life】
Background
In our project we need to filter text. The emoticons in the text are mainly WeChat emoticons, which take the form "[emoticon name]". Because they are all plain strings, I use regular expressions to match and replace them.
Steps
Find a WeChat emoticon library
To replace emoticons we first need a list of them, so the first step is to find out which emoticons WeChat has. A search turned up a page, "WeChat (Wechat) Emoticon list", that lists the WeChat emoticons. Since it is a web page, the names cannot simply be copied and pasted, so we need another way to extract them.
Get the list of WeChat emoticon names
For the page content, we can use JavaScript to read the values of the DOM nodes. Open the browser console, inspect the nodes, and, based on their structure, write a short piece of JS code that prints the emoticon names.
var doms = document.getElementsByClassName('emoji_card_list');
for (var i = 0; i < doms.length; i++) {
    var tds = doms[i].getElementsByTagName('td');
    for (var j = 0; j < tds.length; j++) {
        var text = tds[j].innerText;
        if (text.indexOf('[') === 0 || text.indexOf('/') === 0) {
            console.log(text);
        }
    }
}
After pasting this into the console and running it, copy the output, which looks like this (only part is shown):
[ Let me see ] debugger eval code:7:21
[666] debugger eval code:7:21
[ roll one's eye ] debugger eval code:7:21
/ smile debugger eval code:7:21
/ Pout debugger eval code:7:21
/ color debugger eval code:7:21
/ Shyness debugger eval code:7:21
/ Shut up debugger eval code:7:21
/ sleep debugger eval code:7:21
/ Wangchai debugger eval code:7:21
/ Ladybug
Content in this form can be cleaned up with find-and-replace in a text editor to strip the trailing part of each line. Of course, Python can do it as well.
Process the WeChat emoticon names
In the Python console this is easy to handle. The following code produces the final list of emoticon names.
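data = """paste the console output copied above here"""
# Keep only the name: the text before the first space on each non-empty line
data_list = [x.strip().split(' ')[0] for x in data.split('\n') if x.strip()]
emoji_list = []
for x in data_list:
    if x[0] == '/':
        # Convert the "/name" form to "[name]"
        emoji_list.append('[%s]' % x[1:])
    else:
        emoji_list.append(x)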
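As a self-contained sketch of this step, the snippet below runs the same kind of cleanup on a short hard-coded sample of console output. The function name `parse_console_output` and the sample string are assumptions made for illustration. Each line is cut at the console's ` debugger eval code` suffix rather than at the first space, so a name that itself contains spaces would survive intact.

```python
# Hypothetical sample of copied console output (not the full list).
SAMPLE = """\
[666] debugger eval code:7:21
/smile debugger eval code:7:21
/Ladybug"""


def parse_console_output(raw):
    """Turn copied console lines into a list of '[name]' strings."""
    names = []
    for line in raw.splitlines():
        # Drop the console's source-location suffix, if present.
        name = line.partition(" debugger eval code")[0].strip()
        if not name:
            continue  # skip blank lines
        if name.startswith("/"):
            # Normalise the '/name' form to '[name]'.
            name = "[%s]" % name[1:].strip()
        names.append(name)
    return names


emoji_list = parse_console_output(SAMPLE)
print(emoji_list)
```

Skipping blank lines avoids an index error on empty input, which is easy to hit when the pasted text ends with a newline.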
Match emoticons with a regular expression
With the emoticon list in hand, we match the text with a regular expression. The matching code is as follows:
import re

def remove_emoji(text):
    """
    Remove emoticons of the form "[emoticon name]" from text.
    return: str
    """
    if '[' not in text:
        return text
    # Escape each name so that '[' and ']' are matched literally
    reg_expression = '|'.join(re.escape(x) for x in emoji_list)
    pattern = re.compile(reg_expression)
    matched_words = pattern.findall(text)
    # De-duplicate with set() so each distinct emoticon is replaced only once
    for matched_word in set(matched_words):
        text = text.replace(matched_word, '')
    return text
The test result is as follows and meets the requirements:
>>> remove_emoji("[ ha-ha ][ Shut up ][ Shut up ]")
'[ ha-ha ]'
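The function above finds the matches first and then removes each distinct one with `str.replace`. A possible variant, sketched here under assumptions, compiles the pattern once and deletes all matches in a single pass with `pattern.sub`. The names `EMOJI_NAMES`, `build_emoji_pattern` and `remove_emoji_sub` are hypothetical, and the short list stands in for the full scraped list.

```python
import re

# Hypothetical sample list; in practice this would be the scraped names.
EMOJI_NAMES = ["[Shut up]", "[smile]", "[666]"]


def build_emoji_pattern(names):
    """Compile one alternation pattern that matches any of the given names."""
    # Try longer names first so a shorter name cannot shadow a longer one.
    ordered = sorted(names, key=len, reverse=True)
    # re.escape makes '[' and ']' match literally.
    return re.compile("|".join(re.escape(n) for n in ordered))


# Compiled once at import time instead of on every call.
EMOJI_PATTERN = build_emoji_pattern(EMOJI_NAMES)


def remove_emoji_sub(text, pattern=EMOJI_PATTERN):
    """Delete every known emoticon from text in a single pass."""
    return pattern.sub("", text)


print(remove_emoji_sub("[ha-ha][Shut up][Shut up]"))
```

Compiling once matters mostly when the function is called on many messages, since the alternation over the full name list is rebuilt on every call otherwise.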
Closing notes
Our project deals with non-standard emoticons rather than Unicode emoji. If your project uses standard Unicode emoji, you can use Python's emoji package, which includes a list of standard emoji. For details, see an online article on filtering emoji from text in a Python environment.
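For the standard Unicode case, a minimal sketch might look like the following. It assumes the third-party `emoji` package at version 2.0 or later, where `emoji.replace_emoji` is available. The wrapper name `remove_unicode_emoji` is hypothetical, and the import is guarded so the snippet still loads when the package is not installed.

```python
try:
    import emoji  # third-party package; version 2.0 or later is assumed
except ImportError:
    emoji = None


def remove_unicode_emoji(text):
    """Strip standard Unicode emoji from text using the emoji package."""
    if emoji is None:
        raise RuntimeError("the 'emoji' package is not installed")
    # replace_emoji substitutes every emoji it recognises with `replace`.
    return emoji.replace_emoji(text, replace="")
```

This handles only Unicode emoji. Bracketed WeChat names such as "[666]" still need the regular-expression approach described earlier.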
Copyright notice
Author: Zong AI's life. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201312115400887.html