
Python crawler - fund information storage

2022-02-01 07:32:44 first quarter of the moon

This is day 7 of my participation in the November writing challenge. For event details, see: 2021 Last Writing Challenge.

The home of starship humans is not Earth. Do not return; this is not home! Your home is far away!

1 Preface

Earlier articles mentioned data storage many times, and the last article finally completed the database design. This article begins the actual storage work: it stores the fund list captured earlier, the basic fund information, the fund change information, and the ETF information.

2 Information storage

2.1 Fund basic information storage

Fund information comes from two sources: on-exchange funds and over-the-counter (OTC) funds. The previous articles already completed the code for fetching OTC fund and ETF information, so here we only need to write the results to the database. This raises a problem: fund information may change or be updated at any time. When saving, we could check whether the fund code already exists, update the row if it does, and insert it if not, but that is somewhat inefficient because every fund needs a lookup followed by a write. Instead, we use what was covered in the previous article: a single SQL statement with on duplicate key update does the job. An example is shown below:

INSERT INTO `tb_fund_list`(`code`, `name`, `fund_type`) VALUES ('000363', 'Cathay Pacific Juxin has a mixture of value advantages C', 'mixed type - flexible')
ON DUPLICATE KEY UPDATE `code` = '000363', `name` = 'Cathay Pacific Juxin has a mixture of value advantages C', `fund_type` = 'mixed type - flexible';

If fund 000363 already exists, its row is updated; if it does not exist, a new row is inserted. The specific implementation code is shown in the figure below:
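As a rough illustration of how this upsert might be driven from Python, here is a minimal sketch. It is not the author's implementation: the helper names `build_upsert` and `save_funds` are hypothetical, and the code assumes a MySQL driver that follows the Python DB-API with `%s` placeholders (PyMySQL is one such driver) and a unique key on `code`. It uses MySQL's `VALUES(col)` form so each row's values do not have to be repeated in the update clause.

```python
from typing import Dict, Iterable, List


def build_upsert(table: str, columns: Iterable[str], key: str = "code") -> str:
    """Build a parameterized INSERT ... ON DUPLICATE KEY UPDATE statement."""
    cols = list(columns)
    col_sql = ", ".join(f"`{c}`" for c in cols)
    placeholders = ", ".join(["%s"] * len(cols))
    # The unique key itself does not need to be rewritten on update.
    updates = ", ".join(f"`{c}` = VALUES(`{c}`)" for c in cols if c != key)
    return (
        f"INSERT INTO `{table}` ({col_sql}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


def save_funds(conn, funds: List[Dict[str, str]]) -> int:
    """Insert or update every fund in one batch; returns the number of rows sent.

    `conn` is assumed to be an open DB-API connection to MySQL.
    """
    columns = ("code", "name", "fund_type")
    sql = build_upsert("tb_fund_list", columns)
    rows = [tuple(fund[c] for c in columns) for fund in funds]
    cur = conn.cursor()
    try:
        cur.executemany(sql, rows)
    finally:
        cur.close()
    conn.commit()
    return len(rows)
```

Passing the values as parameters instead of formatting them into the SQL string also avoids quoting problems when a fund name contains an apostrophe.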

2.2 Fund change information acquisition

Fund change information is obtained in the same way for on-exchange and OTC funds, so it can be handled with one shared piece of logic: the same approach used earlier to capture fund change information and fund price information.

3 What needs to be improved
3.1 Type of Fund

At present, the fund type in the basic fund information is still stored as Chinese text, which does not follow common coding conventions. I had not dealt with it earlier because I did not know how many fund types there were. Now that all funds have been fetched, we can query every distinct fund type and then define an enumeration to represent them.

#  Get all fund type information 
select distinct fund_type from tb_fund_list;

Based on the fund types returned by the query, the final fund type definitions are as follows:

fund_type_dic = {
    "QDII": "1",
    "Commodity (excluding QDII)": "2",
    "Stock type": "3",
    "Index type - Stock": "4",
    "mixed type - bond-biased": "51",
    "mixed type - equity-biased": "52",
    "mixed type - balanced": "53",
    "mixed type - flexible": "61",
    "Bond type - Medium and short-term debt": "62",
    "Bond type - Convertible bond": "63",
    "Bond type - Mixed debt": "64",
    "Bond type - Long debt": "65"
}
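A mapping like this could be applied when a row is prepared for storage. The sketch below is only illustrative: `FUND_TYPE_CODES` is a shortened excerpt of the mapping, while `UNKNOWN_TYPE_CODE` and `to_type_code` are hypothetical names that do not appear in the original code. It falls back to a placeholder code when a type name is not in the mapping, so that a newly introduced fund type does not break the batch.

```python
# Shortened excerpt of the mapping, for illustration only.
FUND_TYPE_CODES = {
    "QDII": "1",
    "Stock type": "3",
    "mixed type - flexible": "61",
}

# Hypothetical placeholder for type names that are not in the mapping yet.
UNKNOWN_TYPE_CODE = "0"


def to_type_code(type_name: str, mapping: dict = FUND_TYPE_CODES) -> str:
    """Translate a fund type name into its code, ignoring surrounding spaces."""
    return mapping.get(type_name.strip(), UNKNOWN_TYPE_CODE)


fund = {"code": "000363", "fund_type": " mixed type - flexible "}
fund["fund_type"] = to_type_code(fund["fund_type"])
```

Whether unknown types should map to a placeholder or raise an error is a design choice. A placeholder keeps a long batch running, while an error surfaces new types immediately.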

From experience, bond funds are relatively numerous. If you are interested in bond funds, you can update their data from time to time. The follow-up analysis mainly uses non-bond funds, whose total data volume is relatively small, so a batch update takes relatively little time.

3.2 Update order of funds

In the earlier fetching process, the pieces of information were obtained in no particular order, so when the final results are stored, the fetched information has to be combined and assembled. The final update order is:

  • 1 Update the OTC fund list (add or update)
  • 2 Update the ETF information list (add or update)
  • 3 Query the basic information of each fund and update it
  • 4 Query the stage change information of each fund and update it
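The update order above could be expressed as a small ordered pipeline. This is a sketch under assumptions, not the author's code: the four step functions are hypothetical placeholders standing in for the crawler and storage code from the earlier articles, and `run_in_order` is an illustrative helper.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_in_order(steps: List[Step]) -> List[str]:
    """Run the steps one after another and return the names of those completed.

    An exception in any step aborts the run, so later steps never work on
    lists that failed to refresh.
    """
    done: List[str] = []
    for name, step in steps:
        step()
        done.append(name)
    return done


# Hypothetical placeholders; each would call the real fetch-and-store code.
def update_otc_fund_list() -> None: ...
def update_etf_list() -> None: ...
def update_fund_basic_info() -> None: ...
def update_fund_stage_changes() -> None: ...


UPDATE_STEPS: List[Step] = [
    ("otc_fund_list", update_otc_fund_list),
    ("etf_list", update_etf_list),
    ("fund_basic_info", update_fund_basic_info),
    ("fund_stage_changes", update_fund_stage_changes),
]

completed = run_in_order(UPDATE_STEPS)
```

The two list updates come first because the later steps look up details for fund codes that must already be in the table.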

4 Summary

With that, the fund information has been fetched and stored successfully. The next article will introduce how to build a linear model to score funds and support quantitative analysis of fund investments.

copyright notice
Author: first quarter of the moon. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/02/202202010732404023.html
