Python crawler in practice: using the pyecharts module and Python data analysis to find out which goods are popular on Xianyu (Idle Fish)
2022-01-31 13:08:58 【Dai mubai】
"This is day 12 of my participation in the November writing challenge. Event details: 2021 final writing challenge."
Preface
This article uses Python automation to collect the best-selling goods in a given category for reference. Without further ado,
let's get started.
Development tools
Python version: 3.6.4
Related modules:
pyecharts module;
and some built-in Python modules.
Environment setup
Install Python and add it to your environment variables, then use pip to install the required modules.
Preparation
1. Set up the Android ADB development environment.
2. Install the pocoui dependency library in a Python virtual environment.
# pocoui
pip3 install pocoui
# Data visualization charts
pip3 install pyecharts -U
Steps
The feature can be implemented in 7 steps: open the target app, search for a keyword to reach the product list, calculate the best swipe distance, filter products, get each product's share link, write the data to a file and sort and count the products, and configure the parameters.
Step 1: use pocoui to open the target app automatically.
def __pre(self):
    """Preparation: restart the app and wait for the main screen."""
    home()
    stop_app(package_name)
    start_my_app(package_name, activity)
    # Wait for the bottom tabs, which signals that the main screen has loaded.
    # The labels are shown translated here; match the text your app actually displays.
    self.poco(text='Idle fish').wait_for_appearance()
    self.poco(text='Fish pond').wait_for_appearance()
    self.poco(text='news').wait_for_appearance()
    self.poco(text='my').wait_for_appearance()
    print('Entered the Xianyu main screen')
After the Xianyu home screen opens, the app reads the clipboard. If the clipboard holds text that matches its share-code pattern, a dialog pops up immediately, so we need to simulate closing that dialog.
# If the share-code dialog appears within the allotted time, close it
for i in range(10, -1, -1):
    close_element = self.poco('com.taobao.idlefish:id/ivClose')
    if close_element.exists():
        close_element.click()
        break
    time.sleep(1)
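The polling loop above can be factored into a small reusable helper. The sketch below is hypothetical: `dismiss_if_present`, `exists`, `click`, and `sleep` are names introduced here for illustration, and plain callables stand in for the Poco element methods so the logic can be exercised without a device.

```python
import time
from typing import Callable


def dismiss_if_present(
    exists: Callable[[], bool],
    click: Callable[[], None],
    attempts: int = 11,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll up to `attempts` times and click once if the element shows up.

    Returns True if the element was found and clicked, False otherwise.
    The default of 11 attempts mirrors range(10, -1, -1) in the loop above.
    """
    for _ in range(attempts):
        if exists():
            click()
            return True
        sleep(interval)
    return False
```

With a real Poco element this could be called as `dismiss_if_present(close_element.exists, close_element.click)`; the boolean result tells the caller whether a dialog was actually closed.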
Step 2: search for a keyword to reach the product list.
Type the search keyword into the input box, tap the search button, and wait for the result list to appear.
To make the data easier to process, also switch the product list to list mode, so that each row shows a single product.
def __input_key_word(self):
    """Enter the search keyword and open the result list."""
    # Open the search screen
    perform_click(self.poco('com.taobao.idlefish:id/bar_tx'))
    # Type the keyword into the search box
    self.poco('com.taobao.idlefish:id/search_term').set_text(self.good_msg)
    # Keep tapping the search button until the result list appears
    while True:
        if not self.poco('com.taobao.idlefish:id/list_recyclerview').exists():
            perform_click(self.poco('com.taobao.idlefish:id/search_button', text='Search for'))
        else:
            break
    # Wait for the product list to appear
    self.poco('com.taobao.idlefish:id/list_recyclerview').wait_for_appearance()
    # Switch to list mode
    perform_click(self.poco('com.taobao.idlefish:id/switch_search'))
Step 3: calculate the best swipe distance.
To crawl efficiently, work out the best distance for each swipe.
First dump the UI control tree of the current screen, then use the control's ID attribute to get the coordinates of a product item, and from those its height.
Finally, multiply that height by the number of products observed on one screen to get the best swipe distance.
def __get_good_swipe_distance(self):
    """Return the most suitable distance for each swipe."""
    element = Element()
    # Save the current UI tree locally
    element.get_current_ui_tree()
    # Coordinates of the first product item
    position_item = element.find_elment_position_by_id_and_index("com.taobao.idlefish:id/card_root",
                                                                 "1")
    # Height of one product item
    item_height = position_item[1][1] - position_item[0][1]
    # By observation, the current screen shows 3 products
    return item_height * 3
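The arithmetic behind the swipe distance can be shown in isolation. The following is a minimal sketch, assuming the item bounds arrive as a pair of `(x, y)` corner points (top-left, bottom-right), which is what the indexing `position_item[1][1] - position_item[0][1]` suggests. The name `swipe_distance` and the `visible_items` parameter are introduced here for illustration.

```python
from typing import Tuple

Point = Tuple[int, int]


def swipe_distance(bounds: Tuple[Point, Point], visible_items: int = 3) -> int:
    """Return item height times the number of items visible on one screen."""
    (_, top), (_, bottom) = bounds
    item_height = bottom - top
    if item_height <= 0:
        raise ValueError("bottom edge must be below top edge")
    return item_height * visible_items


# An item spanning y=400..900 is 500 px tall, so three items cover 1500 px.
print(swipe_distance(((0, 400), (1080, 900))))
```

Making the visible item count a parameter avoids hard-coding the value 3, which depends on the screen size of the device being used.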
Step 4: filter products.
With the best swipe distance from the previous step, keep swiping the page and iterate over the child items of the list.
Note that, to avoid errors caused by swipe inertia, each swipe should last at least 2 seconds.
For each product item, keep those whose "want" count reaches the preset threshold.
# Container for the "want" count
want_element_parent = item.offspring('com.taobao.idlefish:id/search_item_flowlayout')
if want_element_parent.exists():
    # "Want" count / number of payments
    want_element = want_element_parent.children()[0]
    want_content = want_element.get_text()
    # Skip listings marked as paid and the like; keep only personal listings
    if 'People want it' not in want_content:
        continue
    # Extract the exact "want" count, which indicates how popular the product is
    want_num = get_num(want_content)
    if int(want_num) < self.num_assign:
        # Below the threshold, so filter it out
        pass
    else:
        # The "want" count meets the threshold; add the product to the statistics
        ...
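The snippet above calls `get_num`, which the article does not show. One possible shape for it is sketched below, under the assumption that the label contains a plain run of digits (for example "128 people want it"). Abbreviated counts or other formats are not handled, and the real helper may differ.

```python
import re


def get_num(text: str) -> int:
    """Return the first run of digits in `text` as an int, or 0 if none is found."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else 0


print(get_num("128 people want it"))
```

Returning 0 when no digits are present means such items fall below any positive threshold and are filtered out rather than raising an error.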
Step 5: get the product share link.
For each product that passes the filter in the previous step, tap the product item to open its details page.
Then tap the share button in the upper right corner, and the share dialog pops up.
Then tap the share-code control, and a prompt confirms that the share code was copied to the system clipboard.
# Tap "more" until the share options appear
while True:
    if self.poco('com.taobao.idlefish:id/ftShareName').exists():
        break
    print('Tap More')
    perform_click(self.poco(text='more'))
# Tap the control that copies the share code
perform_click(self.poco('com.taobao.idlefish:id/ftShareName', text='Ambush'))
# Read the share code text
taobao_code_element = self.poco('com.taobao.idlefish:id/tvWarnDetail')
taobao_code = taobao_code_element.get_text()
Step 6: write, sort, and count the product data.
Write the product title, "want" count, and share address obtained above to a CSV file.
Then read the data file back and sort it in reverse on the second column, so that the products are listed in descending order of "want" count.
def __sort_result(self):
    """Sort the crawl results by "want" count, highest first."""
    # Read everything into memory before rewriting the same file
    with open(self.file_path) as csvfile:
        reader = csv.reader(csvfile, delimiter=",")
        # Header row
        head_title = next(reader)
        # Sort on the second column in descending order
        sortedlist = sorted(reader, key=lambda x: int(x[1]), reverse=True)
    # Write the header row
    write_to_csv(self.file_path, [(head_title[0], head_title[1], head_title[2])], False)
    for value in sortedlist:
        write_to_csv(self.file_path, [(value[0], value[1], value[2])], False)
    return sortedlist
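The sorting step can be tried on its own without any files. The sketch below works on in-memory CSV text. The function name `sort_rows_by_count` and the sample rows are made up for illustration, and the third column name `link` is an assumption (the article only shows the `title` and `num` columns).

```python
import csv
import io


def sort_rows_by_count(csv_text: str, column: int = 1):
    """Return (header, rows) with rows sorted descending on a numeric column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Skip blank lines, which csv.reader yields as empty lists
    rows = [row for row in reader if row]
    rows.sort(key=lambda row: int(row[column]), reverse=True)
    return header, rows


sample = "title,num,link\nA,5,x\nB,120,y\nC,30,z\n"
header, rows = sort_rows_by_count(sample)
print(header)
print(rows)
```

The `int()` conversion in the sort key matters: compared as strings, "5" would rank above "120", so the product with the fewest wants would come first.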
Finally, take the top 10 rows and generate a bar chart with pyecharts.
from pyecharts import options as opts
from pyecharts.charts import Bar

def draw_image(self, sortedlist):
    """Draw a bar chart of the hottest items.

    :param sortedlist: the sorted crawl results
    """
    # Product titles
    titles = []
    # "Want" counts
    sales_num = []
    # Read the titles and counts from the crawl results
    with open(self.file_path, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            titles.append(row['title'])
            # Convert to int so the chart receives numeric values
            sales_num.append(int(row['num']))
    # Keep at most self.num items
    titles = titles[:self.num]
    sales_num = sales_num[:self.num]
    # Build and render the chart
    bar = (
        Bar()
        .add_xaxis(titles)
        .add_yaxis("What's good to sell", sales_num)
        .set_global_opts(title_opts=opts.TitleOpts(title="I want to sell"))
    )
    bar.render('%s.html' % self.good_msg)
Step 7: configure the parameters.
Write a YAML file that specifies the product keyword to crawl, the crawl time, the threshold for the "want" count, and the number of items to keep.
goods:
  # Product search 1: search keyword, thresholds, and crawl time
  good1:
    key_word: 'Information'  # search keyword
    key_num: 100  # threshold for the "want" count
    num: 10  # keep only the hottest N items
    time: 600  # crawl time (seconds)
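Once the YAML file is loaded into a dictionary (for example with PyYAML's `yaml.safe_load`), its entries can be turned into typed settings. The sketch below is one possible way to do that and takes the already-loaded dictionary as input so it runs without extra dependencies. `GoodConfig` and `parse_goods` are hypothetical names, not part of the article's code.

```python
from typing import Any, Dict, List, NamedTuple


class GoodConfig(NamedTuple):
    key_word: str  # search keyword
    key_num: int   # threshold for the "want" count
    num: int       # how many of the hottest items to keep
    time: int      # crawl time in seconds


def parse_goods(config: Dict[str, Any]) -> List[GoodConfig]:
    """Convert the `goods` mapping of a loaded config into GoodConfig entries."""
    goods = config.get("goods") or {}
    return [
        GoodConfig(
            key_word=str(entry["key_word"]).strip(),
            key_num=int(entry["key_num"]),
            num=int(entry["num"]),
            time=int(entry["time"]),
        )
        for entry in goods.values()
    ]


loaded = {"goods": {"good1": {"key_word": "Information", "key_num": 100, "num": 10, "time": 600}}}
print(parse_goods(loaded))
```

Converting the values up front means a malformed entry fails when the configuration is read, rather than partway through a ten-minute crawl.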
Results
With the keyword, crawl time, and other parameters configured in advance, the script crawls the best-selling products that meet the criteria and presents them as a chart.
Copyright notice
Author: Dai mubai. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201311308565821.html