
When winter comes, let Python pick a stylish coat for Mom! Otherwise, there's really no way to choose!

2022-01-31 21:25:15 Programming small code farmer

Preface

Winter is here. I still envy the north, where it is already snowing, while here in the south the sun is still shining. It is cold all the same, though, often only a few degrees, and it must be even colder in Mom's hometown. So today we'll use Python to choose the best coat for Mom and make her happy!

Today, let's crawl the comments on one coat brand from NetEase Yanxuan and see what buyers think about which color and size to choose.

Target data

We collect six fields this time: color, size, comment time, membership level, comment likes, and comment content.

Finally, data visualization presents the results intuitively for anyone who has trouble choosing, so that making a choice is easy~

Web page analysis

Press F12 to open the browser's developer tools. All the data we want to collect is in there.

Next, we find the request URL so that we can simulate the browser's request and fetch the data.

Remember to add headers.

Send a request

With the page analysis done, we send the request:

import requests

page = 1  # page of comments to request; the URL below interpolates it
url = f'http://you.163.com/xhr/comment/listByItemByTag.json?__timestamp=1636785180888&itemId=3532002&tag=%E5%85%A8%E9%83%A8&size=20&page={page}&orderBy=0&oldItemTag=%E5%85%A8%E9%83%A8&oldItemOrderBy=0&tagChanged=0'
headers = {
      'Cookie': 'yx_from=web_search_baidu; yx_aui=ada226e7-929f-419c-af99-53ad3eda94f0; mail_psc_fingerprint=01caf6305f28d3e4b8cfe162559acaac; yx_s_device=92db99a-a0c8-22cd-47c3-61d5b24664; yx_but_id=c18807c330874f4aaae2799cd51cdf9fd04f970cabedddc6_v1_nl; P_INFO=18392144506|1636766426|1|yanxuan_web|00&99|null=zyQS1ZWw9NX5Xw50muitOHx0kkWJG3WT51-8azp0ZDa&wd=&eqid=ca2a415f0000e3c900000005618f20af; _ntes_nnid=f1c2812145357d2b883902c65c421256,1636769976319; yx_delete_cookie_flag=true; yx_stat_seesionId=ada226e7-929f-419c-af99-53ad3eda94f01636769983423; yx_stat_ypmList=; yx_show_painted_egg_shell=false; yx_new_user_modal_show=1; yx_page_key_list=http%3A//you.163.com/search%3Fkeyword%3D%25E6%25A3%2589%25E8%25A2%2584%25E5%25A5%25B3%26timestamp%3D1636769989980%26_stat_search%3Dhistory%26searchWordSource%3D5%26_stat_referer%3Dindex%23page%3D1%26sortType%3D0%26descSorted%3Dtrue%26categoryId%3D0%26matchType%3D0%2Chttp%3A//you.163.com/item/detail%3Fid%3D3991647%26_stat_area%3D1%26_stat_referer%3Dsearch%26_stat_query%3D%25E6%25A3%2589%25E8%25A2%2584%25E5%25A5%25B3%26_stat_count%3D169%26_stat_searchversion%3Dmmoe_model-1.1.0-1.3; yx_stat_seqList=v_315469b8cb%7Cv_f72eac7e53%3B-1%3Bv_0e93fce746%3Bc_6b9da68e5d%3Bv_315469b8cb%3B-1',
      'Referer': 'http://you.163.com/item/detaierer=search&_stat_query=%E6%96%87%E8%83%B8&_stat_count=132&_stat_searchversion=dcn_model-1.1.0-1.3',
      'User-Agent': 'Mozilla/5.0 (WindowHTML, like Gecko) Chrome/96.0.4664.9 Safari/537.36'
        }

resp = requests.get(url, headers = headers)

if resp.status_code == 200:
    comts_List = resp.json()['data']['commentList']
    print(comts_List)
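The long query string in the request URL is easier to read, and to page through, once it is split into named parameters. Here is a minimal sketch using only the standard library. `BASE_URL`, `build_url`, `item_id` and `size` are names made up for this illustration, and the `tag` value "全部" ("All") is what you get by percent-decoding the URL above. Whether the server accepts other page numbers or an old `__timestamp` is an assumption this sketch cannot confirm.

```python
from urllib.parse import urlencode

BASE_URL = "http://you.163.com/xhr/comment/listByItemByTag.json"


def build_url(page, item_id=3532002, size=20):
    """Rebuild the comment-list URL for one page of results."""
    params = {
        "__timestamp": 1636785180888,  # copied from the captured request
        "itemId": item_id,
        "tag": "全部",  # "All"; urlencode percent-encodes it as UTF-8
        "size": size,
        "page": page,
        "orderBy": 0,
        "oldItemTag": "全部",
        "oldItemOrderBy": 0,
        "tagChanged": 0,
    }
    return f"{BASE_URL}?{urlencode(params)}"


# URLs for the first three pages (hypothetical page range)
urls = [build_url(page) for page in range(1, 4)]
print(urls[0])
```

Each URL can then be passed to `requests.get(url, headers=headers)` exactly as in the code above.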

The data was fetched successfully. Next, we extract the six fields we want (color, size, comment time, membership level, comment likes, and comment content) as follows:

import time

for item in comts_List:
    # Color (first entry of skuInfo)
    colors = item['skuInfo'][0]

    # Size (second entry of skuInfo)
    size = item['skuInfo'][1]

    # Comment time: createTime is in milliseconds, so divide by 1000
    times = item['createTime']
    content_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(times/1000))

    # Membership level
    memberLevel = item['memberLevel']

    # Comment likes
    stars = item['star']

    # Comment content
    content = item['content']

    print(colors, size, content_time, memberLevel, stars, content)
    
    '''
     Color : Tibetan green hoodless   Size :M 2020-12-13 00:11:23 4 5  The color is good to match the clothes , The clothes are light and thin , Keep warm OK, Very satisfied with ~~~
     Color : Black hoodless   Size :M 2021-01-16 11:08:59 2 5  High cost performance , Slippery and soft , No more than bosden 400 Difference , No problem with workmanship , No thread, no smell , Very satisfied , Those who hesitate can start , I bought it with more than 160 coupons 
     Color : Black hoodless   Size :M 2020-12-26 17:03:07 2 5  Sure   It's light   Satisfied 
     Color : Pink Hooded   Size :S 2019-11-11 23:10:51 3 5  Clothes received ! Very good , It's really a light fit , And very portable , Anyway, it's great !
     Color : Black hoods   Size :L 2019-11-10 06:01:12 1 5  Clothes received , You can wear it right now , Light and warm , It's worth having , Strictly selected things have never disappointed me .
     Color : Black hoodless   Size :M 2019-11-26 17:47:00 1 5  The light down jacket is good , Very comfortable to wear , It's cold. Sometimes I don't want to wear two other heavy clothes 
     Color : Tibetan green hoodless   Size :M 2020-12-11 12:32:16 5 5  Nice clothes , Is too small .
     Color : Black hoodless   Size :S 2021-10-23 22:13:17 4 5  great , It's warm , It's still a little hot now 
     Color : Tibetan green hoodless   Size :M 2021-08-19 17:21:03 6 5  good 
     Color : Tibetan green hoodless   Size :M 2020-12-25 16:27:10 4 5  Two strictly selected down jackets around a month , Needless to say 
     Color : Khaki hoodless   Size :S 2021-01-13 14:42:49 3 5  The third time I bought . This time is to help colleagues place an order . continue ~
     Color : Black hoodless   Size :M 2020-12-12 20:16:34 3 5  The quality can be , Lightweight comfort , Believe in Yan Xuan !
     Color : Pink Hooded   Size :L 2020-10-16 09:03:03 4 5  Although there was a problem with the delivery , But Yan Xuan's overall service is still very good , The problem was handled in a timely manner ! Products also OK! For mom , The old man is very satisfied !
     Color : Red hood   Size :M 2021-08-20 08:59:04 3 5  It feels good , Fluffy and soft , No color difference, just like the picture , Buy what you like 
     Color : Black hoodless   Size :S 2021-11-10 18:01:39 1 5  good 
     Color : Tibetan green hoodless   Size :M 2021-09-14 10:39:14 3 5  excellent , Love it , Light, easy to carry, easy to collect , And keep warm 
     Color : Red hood   Size :M 2021-09-08 09:52:49 5 5  It's the right size , Take... When you travel .
     Color : Tibetan green hooded   Size :M 2021-09-07 14:49:21 4 5  It's really a very comfortable and lightweight down jacket 
     Color : Tibetan green hooded   Size :L 2021-09-04 00:44:29 4 5  be cheap and at the same time very good , The arrival is very fast , There are no other problems at present . Buy back if necessary , Such a cheap price is a free gift ! The price is beautiful ! It's super cost-effective ! Cost performance is also very high ! Logistics , The seller's delivery speed is also very fast , The service was also up to standard , Fast ~ Good service ! I've bought this before , But it's really not as good as this one , I really made money this time , ha-ha !
     Color : Red hood   Size :L 2021-09-03 00:48:47 5 5  very good 
    '''
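The loop above indexes straight into each comment, so a single comment without `skuInfo` would stop the whole run with an exception. Here is a hedged sketch of a more tolerant version. `parse_comment` is a helper invented for this illustration, the `sample` dict is hand-written rather than real API output, and whether the API ever omits these keys is an assumption.

```python
import time


def parse_comment(item):
    """Pull the six fields out of one comment dict, tolerating missing keys."""
    sku = item.get("skuInfo") or []
    colors = sku[0] if len(sku) > 0 else None
    size = sku[1] if len(sku) > 1 else None

    created_ms = item.get("createTime")
    content_time = None
    if created_ms is not None:
        # createTime is in milliseconds; time.localtime expects seconds
        content_time = time.strftime(
            "%Y-%m-%d %H:%M:%S", time.localtime(created_ms / 1000)
        )

    return (
        colors,
        size,
        content_time,
        item.get("memberLevel"),
        item.get("star"),
        item.get("content"),
    )


# Hand-written stand-in for one entry of comts_List
sample = {
    "skuInfo": ["Color: Black hoodless", "Size: M"],
    "createTime": 1607789483000,
    "memberLevel": 4,
    "star": 5,
    "content": "good",
}
row = parse_comment(sample)
partial = parse_comment({"content": "no sku info"})
print(row)
print(partial)
```

Missing fields come back as `None`, which the later `dropna()` step can then remove.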

Save the data

We save the collected data to an Excel file for the data analysis and visualization that follow.

import openpyxl as op

ws = op.Workbook()             # despite the name, `ws` is the workbook
wb = ws.create_sheet(index=0)  # and `wb` is the worksheet we write to

# Header row
wb.cell(row=1, column=1, value=' Color ')
wb.cell(row=1, column=2, value=' Size ')
wb.cell(row=1, column=3, value=' Comment on time ')
wb.cell(row=1, column=4, value=' Membership level ')
wb.cell(row=1, column=5, value=' Comment like ')
wb.cell(row=1, column=6, value=' Comment content ')

# Data rows: in the full script these six calls belong inside the extraction
# loop. Assumption: `count` starts at 2 (the row below the header) and is
# increased by 1 after each comment.
count = 2
wb.cell(row=count, column=1, value=colors)
wb.cell(row=count, column=2, value=size)
wb.cell(row=count, column=3, value=content_time)
wb.cell(row=count, column=4, value=memberLevel)
wb.cell(row=count, column=5, value=stars)
wb.cell(row=count, column=6, value=content)

ws.save(' Netease strictly selects overcoats .xlsx')

Part of the data is shown below:

Data cleaning

Next, we use pandas to remove duplicates and missing values from the collected data.

The color and size columns also need to be converted.

import pandas as pd

# Read the data
rcv_data = pd.read_excel(' Netease strictly selects overcoats .xlsx')

# Strip the ' Color :' prefix into a new column
rcv_data.loc[:, ' Color 1'] = rcv_data[' Color '].str.replace(' Color :', '')
# Strip the ' Size :' prefix into a new column
rcv_data.loc[:, ' Size 1'] = rcv_data[' Size '].str.replace(' Size :', '')

# Drop duplicate records and missing values
rcv_data = rcv_data.drop_duplicates()
rcv_data = rcv_data.dropna()

# Store the data after cleaning, so the file matches the DataFrame
rcv_data.to_excel(' Netease strictly selects overcoats .xlsx')

# Show a random sample
print(rcv_data.sample(5))

'''
            Color       Size                   Comment on time    Membership level    Comment like                         Comment content     Color 1  Size 1
299    Color : Khaki hoodless     Size :L  2019-11-21 17:47:33     4     5   For mom , It fits well , Very light and comfortable , It's also warm .   Khaki hoodless    L
1217   Color : Khaki hoods    Size :XL  2019-11-15 11:19:12     4     5               like , It's warm , The color is also beautiful    Khaki hoods   XL
896    Color : Pink hoodless     Size :L  2019-11-26 22:38:45     2     5           For my daughter , Down is very good , comfortable to wear .   Pink hoodless    L
1551   Color : Pink Hooded    Size :XL  2019-11-05 09:53:28     5     5                        Cheap and fine    Pink Hooded   XL
1036   Color : Khaki hoodless     Size :S  2019-11-21 21:18:47     2     5                           good    Khaki hoodless    S
'''

Word cloud display

The next step is to use jieba and stylecloud (which builds on wordcloud) to draw a nice word cloud.

import jieba
from stylecloud import gen_stylecloud

# Assumption: the original stop-word list is not shown; fill in your own
stop_words = []

c_title = rcv_data[' Comment content '].tolist()
# Comment word cloud
wordlist = jieba.cut(''.join(c_title))
result = ' '.join(wordlist)
pic = 'img.jpg'
gen_stylecloud(text=result,
               icon_name='fas fa-tshirt',
               font_path='msyh.ttc',
               background_color='white',
               custom_stopwords=stop_words,
               output_name=pic,
               )
print(' Drawing successful !')

Word frequency display

Let's pick out the ten most frequent words in the comments and display them.

from collections import Counter

# Word frequency: keep words longer than one character that are not stop words
all_words = [word for word in result.split(' ') if len(word) > 1 and word not in stop_words]
wordcount = Counter(all_words).most_common(10)

# Split the (word, count) pairs into one tuple of words and one of counts
x1_data, y1_data = list(zip(*wordcount))
print(x1_data)
print(y1_data)

'''
(' Pretty good ', ' very ', ' Satisfied ', ' like ', ' quality ', ' clothes ', ' Thin and light ', ' Color ', ' Yan Xuan ', ' To keep warm ')
(274, 236, 216, 205, 203, 156, 136, 128, 123, 119)
'''
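To see what the filtering and the `zip(*...)` unpacking do, here is a small self-contained sketch. The tokens and the stop-word set are made up for illustration (they are not from the scraped comments), and `top_words` is a hypothetical helper name.

```python
from collections import Counter

# Illustrative tokens only; in the article they come from jieba.cut
tokens = ["warm", "light", "warm", "a", "very", "light", "warm", "soft"]
stop_words = {"very"}  # hypothetical stop-word set


def top_words(tokens, stop_words, n=10):
    """Count tokens longer than one character that are not stop words."""
    kept = [w for w in tokens if len(w) > 1 and w not in stop_words]
    return Counter(kept).most_common(n)


wordcount = top_words(tokens, stop_words, n=2)

# zip(*pairs) turns [(word, count), ...] into (words...), (counts...)
words, counts = zip(*wordcount)
print(words)
print(counts)
```

Note that the unpacking fails on an empty result, so it is worth checking that `wordcount` is not empty before splitting it.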

Word frequency bubble chart

Word frequency pie chart

Combined dashboard for large-screen display

Coat size display

As the chart shows, sizes L and M together account for more than 50%, so most of the women buying this coat are probably between 165 and 170 cm tall.
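A size share like this is simple to compute. The sketch below uses only the sizes of the 20 comments printed in the extraction step earlier, so it covers one page of results rather than the full data set behind the chart. `size_shares` is a name invented for this example.

```python
from collections import Counter

# Sizes of the 20 comments printed in the extraction step
sizes = ["M", "M", "M", "S", "L", "M", "M", "S", "M", "M",
         "S", "M", "L", "M", "S", "M", "M", "M", "L", "L"]


def size_shares(sizes):
    """Return each size's share of all comments, in percent."""
    counts = Counter(sizes)
    total = sum(counts.values())
    return {size: round(100 * n / total, 1) for size, n in counts.most_common()}


shares = size_shares(sizes)
print(shares)
```

In this small sample M alone makes up 60% and L another 20%, which points the same way as the full chart.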


Comment popularity ranking chart

Let's take a look at the top three comments :

good

very good

Pretty good

Coat color distribution chart

Black hooded and black hoodless top the list. You can't go wrong following the other buyers' choice~

copyright notice
Author: [Programming small code farmer]. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201312125139818.html
