Winter is coming: use Python to pick an elegant coat for Mom! Otherwise it's really hard to know where to start!
2022-01-31 21:25:15 【Programming small code farmer】
Preface
Winter is coming. I still envy the north, where it is already snowing, while here in the south the sun is still out. It is cold all the same, though, often only a few degrees above zero, and it must be even colder for Mom back in our hometown. So today we will use Python to pick the best coat for Mom and make her happy!
Today we will scrape the customer reviews of one coat on NetEase Yanxuan (you.163.com) and see what buyers have to say about choosing a color and a size.
Target data
This time we collect six fields: color, size, comment time, membership level, rating (the `star` field, labelled "Comment like" in the code), and comment text.
Finally, the results are presented through data visualization for anyone who has trouble deciding,
so that making a choice becomes easy ~
Web page analysis
Press F12 to open the browser's developer tools; all the data we want is visible in the responses there.
Next, we find the request URL so that we can imitate the browser's request and fetch the data.
Remember to add headers.
Send a request
With the page analysis done, we send the request. `page` is the page number of the comment list (1 here; loop over it to fetch more pages).
import requests

page = 1
url = f'http://you.163.com/xhr/comment/listByItemByTag.json?__timestamp=1636785180888&itemId=3532002&tag=%E5%85%A8%E9%83%A8&size=20&page={page}&orderBy=0&oldItemTag=%E5%85%A8%E9%83%A8&oldItemOrderBy=0&tagChanged=0'
headers = {
'Cookie': 'yx_from=web_search_baidu; yx_aui=ada226e7-929f-419c-af99-53ad3eda94f0; mail_psc_fingerprint=01caf6305f28d3e4b8cfe162559acaac; yx_s_device=92db99a-a0c8-22cd-47c3-61d5b24664; yx_but_id=c18807c330874f4aaae2799cd51cdf9fd04f970cabedddc6_v1_nl; P_INFO=18392144506|1636766426|1|yanxuan_web|00&99|null=zyQS1ZWw9NX5Xw50muitOHx0kkWJG3WT51-8azp0ZDa&wd=&eqid=ca2a415f0000e3c900000005618f20af; _ntes_nnid=f1c2812145357d2b883902c65c421256,1636769976319; yx_delete_cookie_flag=true; yx_stat_seesionId=ada226e7-929f-419c-af99-53ad3eda94f01636769983423; yx_stat_ypmList=; yx_show_painted_egg_shell=false; yx_new_user_modal_show=1; yx_page_key_list=http%3A//you.163.com/search%3Fkeyword%3D%25E6%25A3%2589%25E8%25A2%2584%25E5%25A5%25B3%26timestamp%3D1636769989980%26_stat_search%3Dhistory%26searchWordSource%3D5%26_stat_referer%3Dindex%23page%3D1%26sortType%3D0%26descSorted%3Dtrue%26categoryId%3D0%26matchType%3D0%2Chttp%3A//you.163.com/item/detail%3Fid%3D3991647%26_stat_area%3D1%26_stat_referer%3Dsearch%26_stat_query%3D%25E6%25A3%2589%25E8%25A2%2584%25E5%25A5%25B3%26_stat_count%3D169%26_stat_searchversion%3Dmmoe_model-1.1.0-1.3; yx_stat_seqList=v_315469b8cb%7Cv_f72eac7e53%3B-1%3Bv_0e93fce746%3Bc_6b9da68e5d%3Bv_315469b8cb%3B-1',
'Referer': 'http://you.163.com/item/detaierer=search&_stat_query=%E6%96%87%E8%83%B8&_stat_count=132&_stat_searchversion=dcn_model-1.1.0-1.3',
'User-Agent': 'Mozilla/5.0 (WindowHTML, like Gecko) Chrome/96.0.4664.9 Safari/537.36'
}
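resp = requests.get(url, headers=headers)
comts_List = []  # stays empty if the request fails
if resp.status_code == 200:
    # The comments sit under data -> commentList in the JSON response
    comts_List = resp.json()['data']['commentList']
    print(comts_List)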
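The long query string above can also be assembled from a dictionary, which makes the page number easier to vary. The sketch below is only an illustration: `BASE_URL`, `build_comment_url` and its defaults are names made up here, the `__timestamp` parameter is left out on the assumption that the server does not require it, and the two `%E5%85%A8%E9%83%A8` values are simply the URL-encoded form of the tag 全部 ("all").

```python
from urllib.parse import urlencode

# Endpoint taken from the URL used in this post
BASE_URL = 'http://you.163.com/xhr/comment/listByItemByTag.json'


def build_comment_url(page, item_id=3532002, size=20):
    """Build the comment-list URL for one page (hypothetical helper)."""
    params = {
        'itemId': item_id,
        'tag': '全部',  # urlencode turns this into %E5%85%A8%E9%83%A8
        'size': size,
        'page': page,
        'orderBy': 0,
        'oldItemTag': '全部',
        'oldItemOrderBy': 0,
        'tagChanged': 0,
    }
    return f'{BASE_URL}?{urlencode(params)}'


# URLs for the first three pages
for page in range(1, 4):
    print(build_comment_url(page))
```

Each URL can then be passed to `requests.get(url, headers=headers)` exactly as above.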
The data was fetched successfully. Next we extract the six fields we want (color, size, comment time, membership level, rating, and comment text) as follows:
import time

for item in comts_List:
    # Color
    colors = item['skuInfo'][0]
    # Size
    size = item['skuInfo'][1]
    # Comment time (createTime is a timestamp in milliseconds)
    times = item['createTime']
    content_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(times / 1000))
    # Membership level
    memberLevel = item['memberLevel']
    # Rating (the 'star' field)
    stars = item['star']
    # Comment content
    content = item['content']
    print(colors, size, content_time, memberLevel, stars, content)
'''
Color : Tibetan green hoodless Size :M 2020-12-13 00:11:23 4 5 The color is good to match the clothes , The clothes are light and thin , Keep warm OK, Very satisfied with ~~~
Color : Black hoodless Size :M 2021-01-16 11:08:59 2 5 High cost performance , Slippery and soft , No more than bosden 400 Difference , No problem with workmanship , No thread, no smell , Very satisfied , Those who hesitate can start , I bought it with more than 160 coupons
Color : Black hoodless Size :M 2020-12-26 17:03:07 2 5 Sure It's light Satisfied
Color : Pink Hooded Size :S 2019-11-11 23:10:51 3 5 Clothes received ! Very good , It's really a light fit , And very portable , Anyway, it's great !
Color : Black hoods Size :L 2019-11-10 06:01:12 1 5 Clothes received , You can wear it right now , Light and warm , It's worth having , Strictly selected things have never disappointed me .
Color : Black hoodless Size :M 2019-11-26 17:47:00 1 5 The light down jacket is good , Very comfortable to wear , It's cold. Sometimes I don't want to wear two other heavy clothes
Color : Tibetan green hoodless Size :M 2020-12-11 12:32:16 5 5 Nice clothes , Is too small .
Color : Black hoodless Size :S 2021-10-23 22:13:17 4 5 great , It's warm , It's still a little hot now
Color : Tibetan green hoodless Size :M 2021-08-19 17:21:03 6 5 good
Color : Tibetan green hoodless Size :M 2020-12-25 16:27:10 4 5 Two strictly selected down jackets around a month , Needless to say
Color : Khaki hoodless Size :S 2021-01-13 14:42:49 3 5 The third time I bought . This time is to help colleagues place an order . continue ~
Color : Black hoodless Size :M 2020-12-12 20:16:34 3 5 The quality can be , Lightweight comfort , Believe in Yan Xuan !
Color : Pink Hooded Size :L 2020-10-16 09:03:03 4 5 Although there was a problem with the delivery , But Yan Xuan's overall service is still very good , The problem was handled in a timely manner ! Products also OK! For mom , The old man is very satisfied !
Color : Red hood Size :M 2021-08-20 08:59:04 3 5 It feels good , Fluffy and soft , No color difference, just like the picture , Buy what you like
Color : Black hoodless Size :S 2021-11-10 18:01:39 1 5 good
Color : Tibetan green hoodless Size :M 2021-09-14 10:39:14 3 5 excellent , Love it , Light, easy to carry, easy to collect , And keep warm
Color : Red hood Size :M 2021-09-08 09:52:49 5 5 It's the right size , Take... When you travel .
Color : Tibetan green hooded Size :M 2021-09-07 14:49:21 4 5 It's really a very comfortable and lightweight down jacket
Color : Tibetan green hooded Size :L 2021-09-04 00:44:29 4 5 be cheap and at the same time very good , The arrival is very fast , There are no other problems at present . Buy back if necessary , Such a cheap price is a free gift ! The price is beautiful ! It's super cost-effective ! Cost performance is also very high ! Logistics , The seller's delivery speed is also very fast , The service was also up to standard , Fast ~ Good service ! I've bought this before , But it's really not as good as this one , I really made money this time , ha-ha !
Color : Red hood Size :L 2021-09-03 00:48:47 5 5 very good
'''
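The extraction step can be wrapped in a small function so that it is easy to try on a single comment. This is a sketch, not the author's code: `parse_comment`, `CST`, and the sample item are made up for illustration, and the fixed UTC+8 offset is an assumption (the original uses the machine's local time via `time.localtime`).

```python
from datetime import datetime, timedelta, timezone

# Assumption: comment times are shown in China Standard Time (UTC+8)
CST = timezone(timedelta(hours=8))


def parse_comment(item):
    """Return the six fields of one comment dict as a tuple."""
    sku = item.get('skuInfo') or []
    colors = sku[0] if len(sku) > 0 else None
    size = sku[1] if len(sku) > 1 else None
    # createTime is a Unix timestamp in milliseconds
    created = datetime.fromtimestamp(item['createTime'] / 1000, tz=CST)
    content_time = created.strftime('%Y-%m-%d %H:%M:%S')
    return (colors, size, content_time,
            item.get('memberLevel'), item.get('star'), item.get('content'))


# Hypothetical item with the same keys as the API response
sample = {
    'skuInfo': ['Color: black hoodless', 'Size: M'],
    'createTime': 0,
    'memberLevel': 2,
    'star': 5,
    'content': 'good',
}
print(parse_comment(sample))
```

Returning `None` for a missing color or size keeps one incomplete comment from stopping the whole run; such rows can be dropped later during cleaning.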
Save the data
We save the collected data to an Excel file for the data analysis and visualization later on.
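import time
import openpyxl as op

ws = op.Workbook()             # the workbook
wb = ws.create_sheet(index=0)  # its first sheet (variable names follow the original code)
# Header row
wb.cell(row=1, column=1, value=' Color ')
wb.cell(row=1, column=2, value=' Size ')
wb.cell(row=1, column=3, value=' Comment on time ')
wb.cell(row=1, column=4, value=' Membership level ')
wb.cell(row=1, column=5, value=' Comment like ')
wb.cell(row=1, column=6, value=' Comment content ')
# One row per comment, starting at row 2 (row 1 holds the header).
# When saving several pages, keep count running across pages.
for count, item in enumerate(comts_List, start=2):
    content_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(item['createTime'] / 1000))
    wb.cell(row=count, column=1, value=item['skuInfo'][0])
    wb.cell(row=count, column=2, value=item['skuInfo'][1])
    wb.cell(row=count, column=3, value=content_time)
    wb.cell(row=count, column=4, value=item['memberLevel'])
    wb.cell(row=count, column=5, value=item['star'])
    wb.cell(row=count, column=6, value=item['content'])
ws.save(' Netease strictly selects overcoats .xlsx')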
[Figure: sample of the saved data]
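If an Excel file is not required, the same rows can also be written as CSV with the standard library. This swaps openpyxl for the `csv` module and is only a sketch: `write_rows`, the English header names, and the sample row are assumptions made for illustration.

```python
import csv
import io

HEADER = ['Color', 'Size', 'Comment time', 'Membership level', 'Rating', 'Comment content']


def write_rows(rows, fileobj):
    """Write the header followed by one line per comment tuple."""
    writer = csv.writer(fileobj)
    writer.writerow(HEADER)
    writer.writerows(rows)


# Hypothetical row in the same field order as the extraction step
rows = [('Color: black hoodless', 'Size: M', '2020-12-26 17:03:07', 2, 5, 'Light, satisfied')]

# StringIO stands in for a real file here; for a file on disk use
# open('comments.csv', 'w', newline='', encoding='utf-8-sig')
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

The `utf-8-sig` encoding mentioned in the comment helps Excel recognize Chinese text when it opens the CSV file.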
Data cleaning
Next we use pandas to remove duplicate records and missing values from the collected data.
The color and size columns also need converting: the "Color:" and "Size:" labels are stripped from the values.
import pandas as pd

# Read the data
rcv_data = pd.read_excel(' Netease strictly selects overcoats .xlsx')
# Delete duplicate records and missing values first, so the saved file is clean too
rcv_data = rcv_data.drop_duplicates().dropna()
# Remove the ' Color :' label
rcv_data[' Color 1'] = rcv_data[' Color '].str.replace(' Color :', '', regex=False)
# Remove the ' Size :' label
rcv_data[' Size 1'] = rcv_data[' Size '].str.replace(' Size :', '', regex=False)
# Store the cleaned data (index=False keeps the row index out of the file)
rcv_data.to_excel(' Netease strictly selects overcoats .xlsx', index=False)
# Show a random sample
print(rcv_data.sample(5))
'''
Color Size Comment on time Membership level Comment like Comment content Color 1 Size 1
299 Color : Khaki hoodless Size :L 2019-11-21 17:47:33 4 5 For mom , It fits well , Very light and comfortable , It's also warm . Khaki hoodless L
1217 Color : Khaki hoods Size :XL 2019-11-15 11:19:12 4 5 like , It's warm , The color is also beautiful Khaki hoods XL
896 Color : Pink hoodless Size :L 2019-11-26 22:38:45 2 5 For my daughter , Down is very good , comfortable to wear . Pink hoodless L
1551 Color : Pink Hooded Size :XL 2019-11-05 09:53:28 5 5 Cheap and fine Pink Hooded XL
1036 Color : Khaki hoodless Size :S 2019-11-21 21:18:47 2 5 good Khaki hoodless S
'''
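To make the three cleaning steps concrete, here is the same idea on plain Python lists, without pandas: drop rows with empty or missing values, drop exact duplicates, and strip the label in front of the colon. `clean_rows`, `strip_label`, and the sample rows are hypothetical, and the labels are assumed to end with either an ASCII or a full-width colon.

```python
import re


def strip_label(value):
    """'Color: black' -> 'black'; values without a colon are returned unchanged."""
    parts = re.split(r'\s*[::]\s*', value, maxsplit=1)
    return parts[-1].strip()


def clean_rows(rows):
    """Drop rows with missing fields and duplicates, then strip the labels."""
    complete = [r for r in rows if all(f is not None and f != '' for f in r)]
    unique = list(dict.fromkeys(complete))  # keeps the first occurrence, preserves order
    return [(strip_label(color), strip_label(size), text)
            for color, size, text in unique]


rows = [
    ('Color: khaki hoodless', 'Size: L', 'fits well'),
    ('Color: khaki hoodless', 'Size: L', 'fits well'),  # duplicate
    ('Color: pink hooded', 'Size: XL', None),           # missing comment
    ('Color:black hooded', 'Size:M', 'good'),         # full-width colon
]
print(clean_rows(rows))
```

Only the first colon is treated as the label separator, so a color name that itself contains a colon is left intact.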
Word cloud
The next step is to use jieba and stylecloud (which is built on wordcloud) to draw a nice word cloud of the comments.
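import jieba
from stylecloud import gen_stylecloud

# Assumption: the post does not show its stop-word list; fill in your own
stop_words = []
c_title = rcv_data[' Comment content '].tolist()
# Word cloud of the comments: segment the text with jieba, then join with spaces
wordlist = jieba.cut(''.join(c_title))
result = ' '.join(wordlist)
pic = 'img.jpg'
gen_stylecloud(text=result,
               icon_name='fas fa-tshirt',
               font_path='msyh.ttc',
               background_color='white',
               custom_stopwords=stop_words,
               output_name=pic,
               )
print('Drawing successful!')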
Word frequency display
Let's find the ten most frequent words in the comments and display them.
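from collections import Counter

# Word frequency: keep words longer than one character that are not stop words
# (stop_words is the same list used for the word cloud)
all_words = [word for word in result.split(' ') if len(word) > 1 and word not in stop_words]
wordcount = Counter(all_words).most_common(10)
# Split the (word, count) pairs into one tuple of words and one of counts
x1_data, y1_data = list(zip(*wordcount))
print(x1_data)
print(y1_data)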
'''
(' Pretty good ', ' very ', ' Satisfied ', ' like ', ' quality ', ' clothes ', ' Thin and light ', ' Color ', ' Yan Xuan ', ' To keep warm ')
(274, 236, 216, 205, 203, 156, 136, 128, 123, 119)
'''
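How `most_common` and `zip(*...)` fit together is easier to see on a tiny input. The tokens and the stop-word set below are made up; in the post they come from jieba and from the author's `stop_words` list.

```python
from collections import Counter

tokens = ['good', 'warm', 'the', 'good', 'a', 'light', 'warm', 'good']
stop_words = {'the'}

# Same filter as above: longer than one character and not a stop word
words = [w for w in tokens if len(w) > 1 and w not in stop_words]

# most_common returns (word, count) pairs sorted by count, highest first
top = Counter(words).most_common(2)

# zip(*pairs) transposes the pairs into one tuple of words and one of counts
x_data, y_data = zip(*top)
print(x_data)  # ('good', 'warm')
print(y_data)  # (3, 2)
```

If no words survive the filter, the unpacking raises a `ValueError`, so it is worth checking for an empty result before plotting.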
Word frequency bubble chart
Word frequency pie chart
Combined dashboard for large-screen display
Coat size display
As the chart shows, sizes L and M together account for more than 50% of the purchases, which suggests that most of the young women buying this coat are between 165 and 170 cm tall.
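The share of each size behind such a chart can be computed with a `Counter`. The list of sizes below is invented for illustration and does not reproduce the post's data; `share_by_value` is a hypothetical helper.

```python
from collections import Counter


def share_by_value(values):
    """Return {value: percentage of all values}, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.most_common()}


sizes = ['M', 'L', 'M', 'S', 'L', 'M', 'XL', 'L']  # made-up sample
shares = share_by_value(sizes)
print(shares)
print(shares['M'] + shares['L'])  # combined share of M and L
```

The same helper works unchanged for the color column.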
Comment popularity ranking chart
Let's take a look at the three most common comments:
good
very good
Pretty good
Coat color distribution chart
Black hooded and black hoodless top the list. You can't go wrong following the young ladies' choice ~
copyright notice
author[Programming small code farmer],Please bring the original link to reprint, thank you.
https://en.pythonmana.com/2022/01/202201312125139818.html