Van * Python | A script for publishing one article to multiple platforms

2022-01-31 23:12:55 coder_pig

0x1. Introduction

There is a reason for everything. I recently noticed that my new articles on Nuggets (Juejin) get far fewer reads than before...

I claim not to care much about this (not true), but once something is written and published, I do hope people read it and discuss it, because that is how progress happens. Otherwise, wouldn't cloud notes do just as well?

A quick analysis suggests three possible reasons for the drop in reads:

  • Prize essay campaigns attracted a wave of new creators and produced a large number of articles. With more articles competing for the same readers, each one naturally gets less traffic;
  • The personalized recommendation algorithm: old articles fall out of favor, and new articles start out buried. The official explanation can be found in: Description of personalized recommended style feedback
  • The quality of my own articles has declined. How is that possible (shifting the blame)? All the expressive memes are there, and the content is all hands-on.

Until now my articles have debuted on Juejin. After that I use my self-developed converter, hzwz-markdown-wx, to turn the md into HTML with custom styles, paste it into my WeChat official account, and call it done.

I have been too lazy to post to other platforms, since copying and pasting is tiring. I did think about writing an automation script, but it was shelved for various reasons and then forgotten.

It came to mind again recently: how can a good article stay buried? It should go out on other platforms too. As usual, first check whether a wheel already exists, because if it does there is no need to build one. So I asked around in a group chat:

Hmm... it seems there isn't one. In that case, just build one; it isn't too complicated. My arch-enemy, the product manager, happens to be on leave, so daily work is nothing more than tweaking UI, which leaves plenty of time to slack off. Let's get started!!!

Without reverse-engineering each site's API, the most efficient and simplest way to handle multiple sites is → browser automation that simulates the clicks one by one

First, a list of the sites I want to publish to. If you have others to add, leave a message in the comments ~

The previous two installments were:

《Van * Python | Simple crawling of a site course 》 《Van*Python | Simple crawling of a planet 》

For those I was afraid of a lawyer's letter and only dared to share the scripts privately. This one is different: the script is open source, so feel free to clone it, try it out, and make suggestions ~

0x2. Tactical Analysis

Publishing an article can be divided into three stages: before publishing, during publishing, and after publishing. Each stage is then refined into specific steps:

A brief explanation of the main points ~

Before publishing

This stage is just preparation. First comes the article content, in two parts: body text + additional information. For the body, the editors supported by different platforms differ slightly, so to be prepared, get the following three forms ready:

  • MD text → supported by most platforms ~
  • MD file → some platforms do not support pasted MD text but do support importing an MD file, for example Zhihu;
  • Rendered text → some platforms do not support MD at all, so you may need to copy the rendered output, for example the WuKong editor in the old version of the 51CTO blog;
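
The three forms above can be chosen per platform with a small lookup. This is a minimal sketch, not the project's code: the `CONTENT_KIND` table, its site keys, and the `pick_content` helper are hypothetical names. The two listed entries follow the bullets above, and every other site falls back to MD text.

```python
# Hypothetical mapping from a site key to the content form it accepts.
# Only the two exceptions named in the article are listed; the keys are made up.
CONTENT_KIND = {
    "zhihu": "md_file",       # imports an MD file instead of pasted MD text
    "51cto_old": "rendered",  # old WuKong editor: paste the rendered output
}


def pick_content(site, md_content=None, md_file=None, render_content=None):
    """Return (kind, value) for the given site, or raise if that form is missing."""
    kind = CONTENT_KIND.get(site, "md_text")  # default: plain MD text
    value = {
        "md_text": md_content,
        "md_file": md_file,
        "rendered": render_content,
    }[kind]
    if value is None:
        raise ValueError("{} needs '{}' content, but none was prepared".format(site, kind))
    return kind, value
```

Failing early when a form is missing keeps the problem visible before the browser is even opened.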

The additional information is more miscellaneous, roughly:

Title, summary, cover, tags, category

Then the login-related information: account + password. Other fields can be added if needed ~

During publishing

Every platform requires login before posting, so before publishing we need a login status check. Usually, visiting the article publishing page while logged out redirects automatically to the login page. There are exceptions: Juejin, for example, stays on the edit page but won't let you publish, so the jump to login has to be triggered by the script.

Next comes automatic login, which simulates a human logging in: find the node elements, click or type the corresponding information, then submit. In addition, some sites detect abnormal logins and trigger various CAPTCHAs (slider, click, text, rotation, etc.). In that case, notify the user to handle it manually, then wait by polling with a timeout or by sleeping for a while.
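
The "poll with a timeout" wait can be sketched with nothing but asyncio. The `wait_until` helper below is a hypothetical name and is not taken from the project. It assumes the caller supplies an async predicate, for example one that checks whether the login button has disappeared.

```python
import asyncio
import time


async def wait_until(predicate, timeout=60.0, interval=0.5):
    """Poll an async predicate until it returns a truthy value or time runs out.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if await predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        # Give the user (and the page) some time before checking again.
        await asyncio.sleep(interval)
```

Returning a boolean instead of raising lets the caller decide whether a timeout means "ask the user again" or "give up on this site".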

Once login is handled, the next step is filling in the body text. For text nodes that support direct input, type straight in. For those that don't: click to get focus → write the body content to the clipboard → Ctrl+A to select all → Ctrl+V to paste.

Then comes filling in the additional information: find the node, then click or type.

The last step is publishing the article. Some sites may need extra operations when publishing; if not, just perform the publish click.

After publishing

Publishing does not always go smoothly, and exceptions occur occasionally. Exception information should be written to a file so that the user can publish manually, or a retry mechanism can be introduced to publish again ~
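
One possible shape for that retry-and-log step is sketched below. The `publish_with_retry` name, the log file name, and the retry counts are illustrative assumptions. The sketch also assumes the publish callable lets its exceptions propagate instead of swallowing them.

```python
import asyncio
import traceback


async def publish_with_retry(publish, retries=2, delay=1.0, error_log="publish_errors.log"):
    """Run the async callable `publish`, retrying when it raises.

    Every failure is appended to `error_log`. Returns True on success and
    False once all attempts are used up, so the user can publish manually.
    """
    for attempt in range(1, retries + 2):  # first try + `retries` retries
        try:
            await publish()
            return True
        except Exception:
            with open(error_log, "a", encoding="utf-8") as f:
                f.write("attempt {} failed:\n{}\n".format(attempt, traceback.format_exc()))
            if attempt <= retries:
                await asyncio.sleep(delay)
    return False
```

A call might look like `await publish_with_retry(site.load_write_page)`, where `site` is one of the publisher objects.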

0x3. Detailed design

With the analysis mostly done, on to code design. First, the entities. From the above, two are needed: article information + account and password. The latter is generally bound to a site, so it doesn't need to stand alone. Start with the article information entity:

class Article:
    def __init__(self, md_file=None, md_content=None, render_content=None, tags=None, avatar=None, summary=None, category=None, column=None, title=None):
        """Initialization method.

        Args:
            md_file: MD file
            md_content: MD text
            render_content: rendered text
            tags: tags
            avatar: cover
            summary: summary
            category: category
            column: column
            title: title
        """
        self.md_file = md_file
        self.md_content = md_content
        self.render_content = render_content
        self.tags = tags
        self.avatar = avatar
        self.summary = summary
        self.category = category
        self.column = column
        self.title = title

Next, publishing. The behavior on each site is similar, so extract the common attributes and methods into a parent class, and let subclasses override them as needed:

import logging


class Publish:
    def __init__(self, website_name=None, write_page_url=None, login_url=None, account=None, password=None, is_publish=True, page=None, article=None):
        """Common attributes for publishing an article.

        Args:
            website_name: site name
            write_page_url: URL of the publishing page
            login_url: URL of the login page
            account: account
            password: password
            is_publish: whether to publish, True by default
            page: Pyppeteer Page instance, representing one browser page
            article: the Article to publish
        """
        self.website_name = website_name
        self.write_page_url = write_page_url
        self.login_url = login_url
        self.account = account
        self.password = password
        self.is_publish = is_publish
        self.page = page
        self.article = article
        self.logger = logging.getLogger(self.website_name)
        self.logger.setLevel(logging.INFO)

    # Pass in the Page and Article
    def set_page(self, page, article):
        self.page = page
        self.article = article

    # Load the publishing page
    def load_write_page(self):
        self.logger.info("Loading the article page: {}".format(self.write_page_url))

    # Check login status
    def check_login_status(self):
        self.logger.info("Checking login status...")

    # Automatic login
    def auto_login(self):
        self.logger.info("Starting automatic login: {}".format(self.login_url))

    # Fill in the body
    def fill_content(self):
        self.logger.info("Filling in the body...")

    # Fill in everything else
    def fill_else(self):
        self.logger.info("Filling in the other fields...")

    # Publish
    def publish_article(self):
        self.logger.info("Publishing the article...")

    # Handle the result
    def deal_result(self):
        self.logger.info("Article published...")

Next, taking publishing to Juejin as an example, here is how it works in practice ~

0x4. Worked example: the Nuggets (Juejin) posting process

① Login status detection

Override the corresponding methods of the parent class as the situation requires. First, visit the post page: juejin.cn/editor/draf…

When not logged in, the page can still be visited and does not redirect automatically, so we have to judge the state ourselves. Compare the page before and after login:

2333, it's not hard to spot: when logged in, the user's avatar appears in the upper right corner. So check whether this node exists. Looking at the node information:

It's not hard to write code like this:

import asyncio

from pyppeteer import errors


class JueJinPublish(Publish):
    async def load_write_page(self):
        super().load_write_page()
        # Load the article publishing page, with a one-minute timeout
        await self.page.goto(self.write_page_url, options={'timeout': 60000})
        await asyncio.sleep(1)
        await self.check_login_status()

    async def check_login_status(self):
        super().check_login_status()
        try:
            # Wait up to 3 seconds for the node that only shows up when logged in
            await self.page.waitForXPath("//nav//div[@class='toggle-btn']", {'visible': True, 'timeout': 3000})
            self.logger.info("Logged in...")
            await self.fill_content()
        except errors.TimeoutError as e:
            self.logger.warning(e)
            self.logger.info("Not logged in, performing automatic login...")
            await self.auto_login()

② Automatic login

Flow: go to the page → click the login button in the upper right corner → choose other login methods → enter the account → enter the password → click login

And then a slider verification pops up... lovely:

Wait for the user to complete the verification. How do we know when it is done? Simply wait until the login button is no longer visible, with a timeout of 1 minute, then go to the article editing page ~

    async def auto_login(self):
        super().auto_login()
        try:
            await self.page.goto(self.login_url, options={'timeout': 60000})
            await asyncio.sleep(2)
            # Click the login button in the upper right corner
            login_bt = await self.page.Jx("//button[@class='login-button']")
            await login_bt[0].click()
            # Switch to the other login methods (account + password)
            prompt_box = await self.page.Jx("//div[@class='prompt-box']/span")
            await prompt_box[0].click()
            # Enter the account and password
            account_input = await self.page.Jx("//input[@name='loginPhoneOrEmail']")
            await account_input[0].type(self.account)
            password = await self.page.Jx("//input[@name='loginPassword']")
            await password[0].type(self.password)
            login_btn = await self.page.Jx("//button[@class='btn']")
            await login_btn[0].click()
            self.logger.info("Waiting for user verification...")
            # Wait, with a timeout, for the login button to disappear; the user may need to complete a verification
            await self.page.waitForXPath("//button[@class='login-button']", {'hidden': True, 'timeout': 60000})
            self.logger.info("User verification succeeded...")
            await self.load_write_page()
        except errors.TimeoutError:
            self.logger.info("User verification failed...")
            self.logger.error("Login timed out")
            await self.page.close()
        except Exception as e:
            self.logger.error(e)

③ Filling in the body text

Back on the post page, the article filling process is:

Fill in the title → fill in the body → choose the Markdown theme → choose the code highlight style

The title is easy: get the input element and type into it. The body cannot be typed into directly, so use the clipboard approach. Then comes selecting the Markdown theme and the code highlight style, which is trickier:

The options list pops up dynamically when the node gets focus, and as soon as you switch to the Elements panel to locate the node, the list disappears, so the node information cannot be read. My workaround:

While the node has focus and the list is displayed, print the page source, then work through it step by step

The same applies to several items on the publish dialog. Check whether the option's text matches the preset and, if so, click it directly. It's not hard to write code like this:

    async def fill_content(self):
        super().fill_content()

        # Set the title
        title_input = await self.page.Jx("//input[@class='title-input title-input']")
        await title_input[0].type(self.article.title)

        # The body is not a plain text input: click to focus it, then select all and paste ~
        content_input = await self.page.Jx("//div[@class='CodeMirror-scroll']")
        await content_input[0].click()
        cp_utils.set_copy_text(self.article.md_content)
        await cp_utils.hot_key(self.page, "Control", "KeyA")
        await cp_utils.hot_key(self.page, "Control", "KeyV")

        # Juejin compresses the images, so wait a while before continuing
        await asyncio.sleep(3)

        # Choose the Markdown theme and the code highlight style
        md_theme = await self.page.Jx("//div[@bytemd-tippy-path='16']")
        await md_theme[0].hover()

        # Pick your favorite theme, for example: smartblue
        md_theme_choose = await self.page.Jx(
            "//div[@class='bytemd-dropdown-item-title' and text()='{}']".format('smartblue'))
        await md_theme_choose[0].click()

        # Likewise pick your favorite code style, for example: androidstudio
        code_theme = await self.page.Jx("//div[@bytemd-tippy-path='17']")
        await code_theme[0].hover()
        code_theme_choose = await self.page.Jx(
            "//div[@class='bytemd-dropdown-item-title' and text()='{}']".format('androidstudio'))
        await code_theme_choose[0].click()

        # Additional information
        await self.fill_else()
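
The `cp_utils` helpers called above are not shown in the article. What follows is only a guess at their shape, under two assumptions: `page.keyboard` offers Pyppeteer's `down`, `press` and `up` coroutines, and the clipboard is written with the third-party `pyperclip` package. The project may do either differently.

```python
async def hot_key(page, modifier, key):
    """Press `modifier`+`key` (e.g. "Control" + "KeyA") on a Pyppeteer-style page."""
    await page.keyboard.down(modifier)
    try:
        await page.keyboard.press(key)
    finally:
        # Always release the modifier so later typing is not affected.
        await page.keyboard.up(modifier)


def set_copy_text(text):
    """Put `text` on the system clipboard.

    Assumption: relies on the third-party `pyperclip` package.
    """
    import pyperclip  # imported lazily so hot_key works without it installed

    pyperclip.copy(text)
```

Releasing the modifier in a `finally` block avoids leaving Control held down if the key press fails halfway.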

④ Filling in additional information

The process of filling in additional information is as follows:

Click the Publish button in the upper right corner → select a category → add tags → upload the article cover → choose a column (optional) → enter the summary → click confirm and publish

As shown here:

The category is easy: check whether the text matches the preset and click it. Adding tags works the same way as the theme selection above. To upload the cover, find the //input[@type='file'] node and call its uploadFile() method to complete the upload.

The summary is filled in just like the body, using the clipboard paste trick. Finally, click confirm and publish. The following code is likewise not hard to write:

    async def fill_else(self):
        super().fill_else()

        # Click the Publish button
        publish_bt = await self.page.Jx("//button[@class='xitu-btn']")
        await publish_bt[0].click()

        # Select the category
        category_check = await self.page.Jx("//div[@class='item' and text()=' {} ']".format(self.article.category))
        await category_check[0].click()

        # Add tags
        for tag in self.article.tags:
            tag_input = await self.page.Jx("//input[@class='byte-select__input']")
            await tag_input[0].type(tag)
            await asyncio.sleep(1)
            # Pick the first suggestion, which is highlighted by default
            tag_li = await self.page.Jx("//li[@class='byte-select-option byte-select-option--hover']")
            await tag_li[0].click()

        # Upload the cover
        upload_avatar = await self.page.Jx("//input[@type='file']")
        await upload_avatar[0].uploadFile(self.article.avatar)

        # Fill in the summary via the clipboard
        summary_textarea = await self.page.Jx("//textarea[@class='byte-input__textarea']")
        await summary_textarea[0].click()
        cp_utils.set_copy_text(self.article.summary)
        await cp_utils.hot_key(self.page, "Control", "KeyA")
        await cp_utils.hot_key(self.page, "Control", "KeyV")
        await self.publish_article()

    async def publish_article(self):
        super().publish_article()
        # Click the second button in the button container to confirm and publish
        publish_btn = await self.page.Jx("//div[@class='btn-container']/button")
        await publish_btn[1].click()
        await asyncio.sleep(2)
        await self.deal_result()

⑤ Handling the publish result

After publishing, the page redirects and shows a success message. Here we can simply check whether a node related to successful publishing exists:
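
The article does not show the code for this check, so the sketch below only illustrates the idea. The `is_published` helper is a hypothetical name, and the caller must supply the real XPath of the success node. It assumes `page` offers Pyppeteer's `Jx(xpath)` coroutine, which resolves to a list of matching element handles.

```python
async def is_published(page, success_xpath):
    """Return True if at least one node matches `success_xpath` on the page.

    `success_xpath` is whatever XPath identifies the success hint on the
    target site; no real value is assumed here.
    """
    nodes = await page.Jx(success_xpath)
    return len(nodes) > 0
```

Inside `deal_result`, the returned boolean could decide whether to log success or to record the failure for a later manual publish.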

How to record the other results can be refined later. Now run it and see the effect of publishing the article:

Being lazy has never felt so good!!!

That's the basic prototype. What remains is writing the scripts for the other sites, adding a config file, supporting publishing to multiple sites at once, handling publish results, and some logic optimizations ~

0x5. Summary

Here's the repository link first: ChaoMdPublish. If you're interested, give it a Star first. I'll seize the afternoon slacking time to grind through the rest and add a whole lot of details. Thank you ~

copyright notice
author[coder_pig], please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201312312499012.html
