I used Python to crawl my WeChat friends, and this is what they look like
2022-02-01 19:04:52 【Manon attack 666】
As WeChat has grown in popularity, more and more people use it. It has gradually changed from a simple social app into a way of life: people rely on it for everyday conversation and for communication at work. Every friend in WeChat represents one of the different roles a person plays in society.
This article uses Python to analyze data about my WeChat friends. The dimensions chosen are gender, avatar, signature, and location, and the results are presented mainly as charts and word clouds. For the text data, we use word-frequency analysis and sentiment analysis. As the saying goes, a craftsman who wants to do good work must first sharpen his tools. So before we start, here is a brief introduction to the third-party modules used in this article:
- itchat: a Python wrapper around the WeChat web interface, used here to fetch friend information.
- jieba: the Python version of the "Jieba" (stutter) Chinese word segmenter, used here to segment text.
- matplotlib: Python's chart-drawing module, used here to draw bar charts and pie charts.
- snownlp: a Python library for Chinese text processing, used here for sentiment analysis of text.
- PIL: Python's image-processing module, used here to process pictures.
- numpy: Python's numerical computing module, used here together with the wordcloud module.
- wordcloud: Python's word-cloud module, used here to draw word-cloud images.
- TencentYoutuyun: the Python SDK provided by Tencent Youtu, used here to detect faces and extract image tags.
All of the modules above can be installed with pip. For detailed usage of each module, please refer to its own documentation.
1 Data analysis
Before we can analyze WeChat friend data, we have to obtain it. The itchat module makes this very simple; the following lines are all we need:
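import itchat

# Log in by scanning a QR code; hotReload caches the session between runs
itchat.auto_login(hotReload = True)
# Fetch the up-to-date friend list
friends = itchat.get_friends(update = True)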
Just as with the web version of WeChat, we log in by scanning a QR code with a phone. The friends object returned here is a list whose first element is the current user. For that reason, the analysis below always uses friends[1:] as the raw input. Each element of the list is a dictionary. Taking my own record as an example, you can see that it contains the five fields Sex, City, Province, HeadImgUrl, and Signature, and the analysis below starts from these fields:
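To make the record structure concrete, here is a minimal sketch that keeps only the fields analyzed below. The `sample_friends` list is made-up data standing in for what `itchat.get_friends()` returns, and `pick_fields` is a hypothetical helper, not part of itchat; real records carry many more keys.

```python
# Fields analysed in the rest of the article
FIELDS = ('Sex', 'City', 'Province', 'HeadImgUrl', 'Signature')

def pick_fields(friends):
    """Return one small dict per friend, skipping friends[0] (the current user)."""
    return [{key: friend.get(key, '') for key in FIELDS} for friend in friends[1:]]

# Hypothetical sample data, not real itchat output
sample_friends = [
    {'NickName': 'me', 'Sex': 1, 'City': "Xi'an", 'Province': 'Shaanxi',
     'HeadImgUrl': '/head/0', 'Signature': 'hello'},
    {'NickName': 'friend A', 'Sex': 2, 'City': 'Yinchuan', 'Province': 'Ningxia',
     'HeadImgUrl': '/head/1', 'Signature': 'keep going'},
    {'NickName': 'friend B', 'Sex': 0, 'City': '', 'Province': '',
     'HeadImgUrl': '/head/2', 'Signature': ''},
]

rows = pick_fields(sample_friends)
print(len(rows))            # 2, because the current user is skipped
print(rows[0]['Province'])  # Ningxia
```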
2 Friend gender
To analyze friends' gender, we first need the gender of every friend. We extract the Sex field from each friend's record and count how many are Male, Female, and Unknown. With these three values assembled into a list, we can draw a pie chart with the matplotlib module. The code is as follows:
from collections import Counter
import matplotlib.pyplot as plt

def analyseSex(friends):
    # Sex is 0 (unknown), 1 (male) or 2 (female); friends[0] is the current user
    sexs = list(map(lambda x:x['Sex'],friends[1:]))
    sex_counter = Counter(sexs)
    # Look the counts up by key so that they line up with the labels below
    counts = [sex_counter.get(key, 0) for key in (0, 1, 2)]
    labels = ['Unknown','Male','Female']
    colors = ['red','yellowgreen','lightskyblue']
    plt.figure(figsize=(8,5), dpi=80)
    plt.axes(aspect=1)
    plt.pie(counts, # Gender statistics
            labels=labels, # Gender display labels
            colors=colors, # Colors of the pie slices
            labeldistance = 1.1, # Distance between the labels and the center
            autopct = '%3.1f%%', # Format of the percentage text
            shadow = False, # Whether the pie chart has a shadow
            startangle = 90, # Starting angle of the pie chart
            pctdistance = 0.6 # Distance between the percentage text and the center
    )
    plt.legend(loc='upper right',)
    plt.title(u'Gender composition of %s\'s WeChat friends' % friends[0]['NickName'])
    plt.show()

A brief explanation of this code: the WeChat Sex field takes three values, Unknown, Male, and Female, encoded as 0, 1, and 2. Counter() from the collections module tallies these values, and its items() method returns (value, count) tuples.
Those tuples come back in the order in which the values were first seen, not sorted by key, so the code looks up the counts for the keys 0, 1, and 2 explicitly to keep them aligned with the labels. We pass the counts to matplotlib, which computes the percentages itself. The figure below is the gender distribution of my friends as drawn by matplotlib:
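One detail of this counting step is worth checking in isolation: `Counter` keeps keys in first-seen order, so counts taken straight from `items()` may not line up with a fixed label list. The sketch below uses made-up Sex values, and `count_by_sex` is a hypothetical helper name.

```python
from collections import Counter

def count_by_sex(sex_values):
    """Return counts ordered as [unknown, male, female] for the codes 0, 1, 2."""
    counter = Counter(sex_values)
    return [counter.get(code, 0) for code in (0, 1, 2)]

sample = [1, 2, 1, 0, 1, 2]           # hypothetical Sex values
print(list(Counter(sample).items()))  # [(1, 3), (2, 2), (0, 1)]
print(count_by_sex(sample))           # [1, 3, 2]
```

Looking the counts up by key also returns 0 for a category that never occurs, which keeps the list at three entries to match the three labels.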
3 Friend avatars
We analyze friends' avatars from two angles. First, what proportion of friends use a face as their avatar? Second, what useful keywords can be extracted from the avatars?
To do this, we download each avatar to local disk based on the HeadImgUrl field, then use the face-recognition APIs provided by Tencent Youtu to detect whether the picture contains a face and to extract the tags in the picture. The former is a simple count, which we present as a pie chart; the latter is text analysis, which we present as a word cloud. The key code is as follows:
import os
import time

import itchat
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from wordcloud import WordCloud

def analyseHeadImage(friends):
    # Init Path
    baseFolder = os.path.join(os.path.abspath('.'), 'HeadImages')
    if not os.path.exists(baseFolder):
        os.makedirs(baseFolder)

    # Analyse Images
    faceApi = FaceAPI()  # the author's wrapper around the Tencent Youtu SDK (not shown here)
    use_face = 0
    not_use_face = 0
    image_tags = ''
    for index in range(1,len(friends)):
        friend = friends[index]
        # Save HeadImages
        imgFile = os.path.join(baseFolder, 'Image%s.jpg' % str(index))
        imgData = itchat.get_head_img(userName = friend['UserName'])
        if not os.path.exists(imgFile):
            with open(imgFile,'wb') as file:
                file.write(imgData)

        # Detect Faces (sleep to avoid calling the API too quickly)
        time.sleep(1)
        result = faceApi.detectFace(imgFile)
        if result == True:
            use_face += 1
        else:
            not_use_face += 1

        # Extract Tags (trailing comma keeps tags of different friends apart)
        result = faceApi.extractTags(imgFile)
        image_tags += ','.join(list(map(lambda x:x['tag_name'],result))) + ','

    labels = [u'Face avatar',u'No face avatar']
    counts = [use_face,not_use_face]
    colors = ['red','yellowgreen','lightskyblue']
    plt.figure(figsize=(8,5), dpi=80)
    plt.axes(aspect=1)
    plt.pie(counts, # Face / no-face counts
            labels=labels, # Display labels
            colors=colors, # Colors of the pie slices
            labeldistance = 1.1, # Distance between the labels and the center
            autopct = '%3.1f%%', # Format of the percentage text
            shadow = False, # Whether the pie chart has a shadow
            startangle = 90, # Starting angle of the pie chart
            pctdistance = 0.6 # Distance between the percentage text and the center
    )
    plt.legend(loc='upper right',)
    plt.title(u'Use of face avatars among %s\'s WeChat friends' % friends[0]['NickName'])
    plt.show()

    # Tag word cloud
    image_tags = image_tags.encode('iso8859-1').decode('utf-8')
    back_coloring = np.array(Image.open('face.jpg'))
    wordcloud = WordCloud(
        font_path='simfang.ttf',
        background_color="white",
        max_words=1200,
        mask=back_coloring,
        max_font_size=75,
        random_state=45,
        width=800,
        height=480,
        margin=15
    )
    wordcloud.generate(image_tags)
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.show()
Here we create a HeadImages directory under the current directory to store the avatars of all friends. We then use a class named FaceAPI, which wraps the Tencent Youtu SDK. **It calls two API interfaces, face detection and image tag recognition.** The former counts how many friends do and do not use a face avatar; the latter accumulates the tags extracted from each avatar. The analysis results are shown in the figure below:
As the figure shows, close to 1/4 of my WeChat friends use a face as their avatar, while close to 3/4 do not. This suggests that friends who are confident in their looks make up only about 25% of the total; put another way, about 75% of my friends keep a low profile and prefer not to use a face as their WeChat avatar.
**Second, since Tencent Youtu cannot truly recognize a "face",** we also extract the tags from friends' avatars to help us understand which keywords appear in them. The analysis results are shown in the figure:
From the word cloud we can see that the most frequent keywords among the avatar tags are: **girl, tree, house, text, screenshot, cartoon, group photo, sky, sea.** This suggests that my friends' avatars come mainly from four sources: daily life, travel, scenery, and screenshots.
In terms of style, the avatars are mostly cartoon-like, and the common elements are sky, sea, houses, and trees. Looking through all of my friends' avatars by hand, I found that 15 people use a personal photo, 53 use a picture from the Internet, 25 use an anime picture, 3 use a group photo, 5 use a photo of a child, 13 use a landscape picture, and 18 use a photo of a girl. This is basically consistent with the results of the image tag extraction.
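The tag accumulation described above can also be expressed as explicit frequency counting, which makes the "most frequent keywords" easy to read off directly. This is only a sketch on made-up tag lists; `count_tags` is a hypothetical helper, and the real tags would come from the author's FaceAPI wrapper. A dictionary of counts like this one can be passed to `WordCloud.generate_from_frequencies` instead of a comma-joined string.

```python
from collections import Counter

def count_tags(tags_per_avatar):
    """Count how many avatars carry each tag (a tag counts once per avatar)."""
    counter = Counter()
    for tags in tags_per_avatar:
        counter.update(set(tags))  # set() drops repeats within one avatar
    return counter

# Hypothetical tag lists, one list per avatar
sample_tags = [
    ['girl', 'sky'],
    ['cartoon'],
    ['sky', 'sea', 'tree'],
    ['girl', 'group photo', 'girl'],
]

tag_counts = count_tags(sample_tags)
for tag, count in tag_counts.most_common(3):
    print(tag, count)
```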
4 Friend signature
Next we analyze friends' signatures. The signature is the richest piece of text in a friend's record. Following the common habit of "labeling" people, a signature can tell us about a person's state during a certain period, just as people laugh when they are happy and cry when they are sad: crying and laughing show whether someone is happy or sad.
We process signatures in two ways. The first is to segment them with jieba and generate a word cloud, in order to see which keywords appear in friends' signatures and which ones are relatively frequent. **The second is to use SnowNLP to analyze the sentiment of friends' signatures,** that is, whether signatures are on the whole positive, negative, or neutral, and in what proportions. Extracting the Signature field is all that is needed here. The core code is as follows:
import re

import jieba.analyse
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from snownlp import SnowNLP
from wordcloud import WordCloud

def analyseSignature(friends):
    signatures = ''
    emotions = []
    # Matches the emoji code left over from WeChat's emoji markup
    pattern = re.compile(r"1f\d.+")
    # Skip friends[0], the current user, as in the other analyses
    for friend in friends[1:]:
        signature = friend['Signature']
        if signature is not None:
            signature = signature.strip().replace('span', '').replace('class', '').replace('emoji', '')
            signature = pattern.sub('', signature)
            if len(signature) > 0:
                nlp = SnowNLP(signature)
                emotions.append(nlp.sentiments)
                # Trailing space keeps keywords of different friends apart
                signatures += ' '.join(jieba.analyse.extract_tags(signature,5)) + ' '
    with open('signatures.txt','wt',encoding='utf-8') as file:
        file.write(signatures)

    # Signature WordCloud
    back_coloring = np.array(Image.open('flower.jpg'))
    wordcloud = WordCloud(
        font_path='simfang.ttf',
        background_color="white",
        max_words=1200,
        mask=back_coloring,
        max_font_size=75,
        random_state=45,
        width=960,
        height=720,
        margin=15
    )
    wordcloud.generate(signatures)
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.show()
    wordcloud.to_file('signatures.jpg')

    # Signature Emotional Judgment
    count_good = len(list(filter(lambda x:x>0.66,emotions)))
    count_normal = len(list(filter(lambda x:x>=0.33 and x<=0.66,emotions)))
    count_bad = len(list(filter(lambda x:x<0.33,emotions)))
    labels = [u'Negative',u'Neutral',u'Positive']
    values = (count_bad,count_normal,count_good)
    plt.rcParams['font.sans-serif'] = ['simHei']
    plt.rcParams['axes.unicode_minus'] = False
    plt.xlabel(u'Sentiment')
    plt.ylabel(u'Frequency')
    plt.xticks(range(3),labels)
    # One color per bar; a plain 'rgb' string is not a valid color in current matplotlib
    plt.bar(range(3), values, color = ['r','g','b'])
    plt.title(u'Sentiment analysis of %s\'s WeChat friends\' signatures' % friends[0]['NickName'])
    plt.show()
From the word cloud we can see that the relatively frequent keywords in my friends' signatures are: strive, grow, happy, life, happiness, distance, time, and take a walk.
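Keyword quality depends on how cleanly the emoji markup is stripped from each signature before segmentation. The sketch below is an alternative cleaning step, not the author's code: it assumes emoji arrive as `<span class="emoji emoji1f4aa"></span>` placeholders and removes the whole tag in one pass. Both `EMOJI_SPAN` and `clean_signature` are hypothetical names, and the pattern would need adjusting if the real markup differs.

```python
import re

# Assumed markup for an emoji placeholder: <span class="emoji emoji1f4aa"></span>
EMOJI_SPAN = re.compile(r'<span class="emoji emoji[0-9a-f]+"></span>')

def clean_signature(signature):
    """Strip emoji placeholder tags and surrounding whitespace; None becomes ''."""
    if signature is None:
        return ''
    return EMOJI_SPAN.sub('', signature).strip()

print(clean_signature('Keep going <span class="emoji emoji1f4aa"></span>'))  # Keep going
print(repr(clean_signature(None)))                                           # ''
```

Removing the complete tag avoids leaving stray angle brackets and quotes in the text that is later handed to jieba and SnowNLP.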
From the bar chart below we can see that, among my friends' signatures, about 55.56% are judged positive, about 32.10% neutral, and about 12.35% negative. This is basically consistent with what the word cloud shows: about 87.66% of the signatures are neutral or positive, so on the whole they convey a positive attitude.
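The percentages above come from bucketing each sentiment score with the 0.33 and 0.66 thresholds used in the code. The following sketch reproduces that bucketing on made-up scores; the helper names are hypothetical, and real scores would come from `SnowNLP(text).sentiments`, which lies between 0 and 1.

```python
def bucket_sentiments(scores, low=0.33, high=0.66):
    """Split scores in [0, 1] into negative / neutral / positive counts."""
    negative = sum(1 for s in scores if s < low)
    positive = sum(1 for s in scores if s > high)
    neutral = len(scores) - negative - positive
    return {'negative': negative, 'neutral': neutral, 'positive': positive}

def to_percentages(buckets):
    """Convert bucket counts into percentages rounded to two decimals."""
    total = sum(buckets.values())
    if total == 0:
        return {name: 0.0 for name in buckets}
    return {name: round(100.0 * count / total, 2) for name, count in buckets.items()}

sample_scores = [0.9, 0.8, 0.5, 0.1, 0.7, 0.4, 0.95, 0.2]  # hypothetical scores
buckets = bucket_sentiments(sample_scores)
print(buckets)                  # {'negative': 2, 'neutral': 2, 'positive': 4}
print(to_percentages(buckets))  # {'negative': 25.0, 'neutral': 25.0, 'positive': 50.0}
```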
5 Friend location
Friend locations are analyzed mainly by extracting the Province and City fields. Map visualization in Python is done mainly with the Basemap module, which needs to download map data from overseas websites and is very inconvenient to use.
Baidu's ECharts is used more on the front end. The community does provide the pyecharts project, but I noticed that, because of a policy change, ECharts no longer supports exporting maps, so customizing maps remains a problem. The mainstream approach is to supply JSON data for provinces and cities across the country.
What I use here is BDP Personal Edition, a zero-programming solution: we export a CSV file from Python, upload it to BDP, and create a map visualization by simple drag and drop. It could hardly be simpler. Here we only show the code that generates the CSV file:
import csv

def analyseLocation(friends):
    headers = ['NickName','Province','City']
    # newline='' lets the csv module control line endings
    with open('location.csv','w',encoding='utf-8',newline='') as csvFile:
        writer = csv.DictWriter(csvFile, headers)
        writer.writeheader()
        # Skip friends[0], the current user
        for friend in friends[1:]:
            row = {}
            row['NickName'] = friend['NickName']
            row['Province'] = friend['Province']
            row['City'] = friend['City']
            writer.writerow(row)
The figure below is the geographical distribution of my WeChat friends generated in BDP. You can see that my friends are concentrated mainly in Ningxia and Shaanxi.
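The same concentration can be checked without a map by counting rows per province in the exported file. The sketch below reads made-up CSV content in the same layout as location.csv; `count_provinces` is a hypothetical helper, and with the real file you would pass `open('location.csv', encoding='utf-8', newline='')` instead of the in-memory sample.

```python
import csv
import io
from collections import Counter

def count_provinces(csv_file):
    """Count friends per province from a file object with a 'Province' column."""
    reader = csv.DictReader(csv_file)
    # Friends who left the field empty are grouped under 'Unknown'
    return Counter(row['Province'] or 'Unknown' for row in reader)

# Hypothetical CSV content in the same layout as location.csv
sample_csv = io.StringIO(
    'NickName,Province,City\n'
    'friend A,Ningxia,Yinchuan\n'
    "friend B,Shaanxi,Xi'an\n"
    'friend C,Ningxia,Yinchuan\n'
    'friend D,,\n'
)

province_counts = count_provinces(sample_csv)
print(province_counts.most_common(1))  # [('Ningxia', 2)]
```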
6 Summary
This article is another of my experiments in data analysis. It carries out a simple analysis of my WeChat friends along four dimensions, gender, avatar, signature, and location, and presents the results mainly as charts and word clouds. In short, "data visualization is a means, not an end". What matters is not that we produced these pictures, but what the phenomena they reflect can teach us. I hope this article gives you some inspiration.
Source: the Internet. In case of infringement, please contact us for removal.
Copyright notice
Author: [Manon attack 666]. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/02/202202011904490210.html