# Some thoughts on pooling in neural networks (2022)

2022-07-25 14:26:03

# Introduction

Pooling in neural networks downsamples the input data, reducing its resolution.
Besides batch size, the resolution of the data and the number of feature channels are the key factors determining a network's computation and parameter counts. Pooling reduces the data resolution and therefore the amount of computation. From another angle, low-resolution data can be fitted with fewer feature channels (network parameters), which also reduces the parameter count.
This post summarizes the mainstream pooling methods and puts forward some conjectures:

• The benefit of pooling shows up most clearly on high-resolution inputs; low-resolution data may not need pooling at all
• The effectiveness of pooling may not come from the specific pooling method; on the contrary, simply reducing the data resolution often yields good results

Pooling methods: structured or unstructured? That is, does the method depend on the structure of the input data?

# 1. Structured (average pooling, max pooling?)

• I will call pooling methods that operate on a matrix window *structured* pooling.

## 1.1 Average pooling, max pooling, etc.

The most widely used pooling methods today are average pooling and max pooling: simple and effective [1].

There are many other pooling methods: stochastic pooling, mixed pooling, pyramid pooling, and so on [2].
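As a concrete illustration, here is a minimal NumPy sketch of non-overlapping average and max pooling; the function name and window handling are my own, not taken from any of the cited papers:

```python
import numpy as np

def pool2d(x, k=2, mode="max"):
    """Non-overlapping k x k pooling over a 2-D array (stride = k).

    Assumes the input height and width are divisible by k.
    """
    h, w = x.shape
    # Reshape so that each k x k window gets its own pair of axes.
    windows = x.reshape(h // k, k, w // k, k)
    if mode == "max":
        return windows.max(axis=(1, 3))
    return windows.mean(axis=(1, 3))  # average pooling

x = np.array([[ 1.,  2.,  3.,  4.],
              [ 5.,  6.,  7.,  8.],
              [ 9., 10., 11., 12.],
              [13., 14., 15., 16.]])
print(pool2d(x, mode="max"))  # [[ 6.  8.] [14. 16.]]
print(pool2d(x, mode="avg"))  # [[ 3.5  5.5] [11.5 13.5]]
```

Either way, the 4x4 input becomes a 2x2 output: the resolution (and thus downstream computation) drops by a factor of four.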

## 1.2 Strip Pooling

Paper [3]: Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

Strip pooling is a novel and bold pooling method. The pooling methods mentioned so far all operate on an $n \times n$ window, which is quite regular; it is surprising that pooling over a single row or column ($1 \times n$) also achieves such good results:

A comparison with the previous pooling methods is shown in the figure below:
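A minimal NumPy sketch of the strip idea, assuming the simplest possible variant: average each row and each column, then broadcast both strips back to the full map and fuse them by addition. The paper's actual module also adds 1-D convolutions and sigmoid gating on top of this:

```python
import numpy as np

def strip_pool(x):
    """Toy strip pooling over a 2-D feature map.

    Each output position mixes information from its entire row and
    entire column, instead of from a local n x n window.
    """
    h_strip = x.mean(axis=1, keepdims=True)  # (H, 1): average along each row
    v_strip = x.mean(axis=0, keepdims=True)  # (1, W): average along each column
    # Broadcasting expands both strips back to (H, W) before fusing.
    return h_strip + v_strip

x = np.array([[1., 2.],
              [3., 4.]])
print(strip_pool(x))  # [[3.5 4.5] [5.5 6.5]]
```

The long, thin pooling window is what lets strip pooling capture banded structures (roads, poles) that square windows average away.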

## 1.3 Edge (face) collapse pooling on triangular meshes

All of the methods above pool over regular matrix windows, which does not suit three-dimensional data. Taking triangular meshes as an example, extending traditional mesh simplification into network pooling is a promising direction.

Paper [4]: MeshCNN: A Network with an Edge

MeshCNN uses edge collapse as its pooling method, extending pooling to triangular meshes:

I tested this: without collapse pooling, using only the convolution from the paper, classification performance on SHREC (500 faces) is about the same, while segmentation accuracy on COSEG drops by 2-3%.

Paper [5]: Subdivision-Based Mesh Convolution Networks

Directly introducing face collapse into mesh pooling is complicated, so SubdivNet takes another route: remesh first, then pool.

Subdivision gives the triangular mesh a regular structure similar to an image's.
The approach above requires constructing a matrix that maps features from before pooling to after pooling.
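That feature mapping matrix can be sketched as a single matrix multiply. In this hypothetical toy example (sizes and weights are mine, not from the paper), each row of the pooling matrix `P` holds averaging weights over the input elements merged into one output element:

```python
import numpy as np

# 4 input elements with 2 feature channels each.
features = np.array([[1., 0.],
                     [3., 2.],
                     [5., 4.],
                     [7., 6.]])

# Row i of P defines pooled element i as a weighted average of inputs,
# so the whole pooling step is just: pooled = P @ features.
P = np.array([[0.5, 0.5, 0.0, 0.0],   # output 0 = mean of inputs 0 and 1
              [0.0, 0.0, 0.5, 0.5]])  # output 1 = mean of inputs 2 and 3

pooled = P @ features
print(pooled)  # [[2. 1.] [6. 5.]]
```

In practice such a matrix is sparse and is built from the mesh connectivity, but the algebra is the same.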

# 2. Unstructured (clustering, top-k?)

• I call pooling methods based on clustering or sorting *unstructured* pooling.

## 2.1 3-D point clouds

Take point clouds as an example: a point cloud is discrete, unordered, and has no structure. The current mainstream method for pooling or downsampling is farthest point sampling (FPS).

FPS downsamples the point cloud by ordering points according to distance, and max pooling is then used to handle the point cloud's lack of order. One doubt: does relying on max pooling alone throw away a lot of potential information?
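A minimal NumPy sketch of greedy farthest point sampling (a toy implementation of mine, not the batched CUDA kernels used in practice):

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Greedy FPS: repeatedly pick the point farthest from the chosen set.

    points: (N, 3) array; returns the indices of k sampled points.
    """
    n = points.shape[0]
    chosen = [0]                # start from an arbitrary point
    dist = np.full(n, np.inf)   # distance from each point to the chosen set
    for _ in range(k - 1):
        # Update distances with the most recently chosen point only:
        # the min over chosen points is maintained incrementally.
        d = np.linalg.norm(points - points[chosen[-1]], axis=1)
        dist = np.minimum(dist, d)
        chosen.append(int(dist.argmax()))
    return np.array(chosen)

pts = np.array([[0., 0., 0.],
                [1., 0., 0.],
                [2., 0., 0.],
                [10., 0., 0.]])
print(farthest_point_sampling(pts, 3))  # [0 3 2]
```

Note how the isolated point at x=10 is picked immediately: FPS maximizes coverage of the cloud, which is exactly why it is preferred over random sampling for downsampling.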

## 2.2 Graphs

There are many graph pooling methods based on clustering or top-k selection [7].
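A rough sketch of top-k style graph pooling in the spirit of gPool-type methods. The source of the node scores (normally a learned projection of node features) is omitted, and all names here are hypothetical:

```python
import numpy as np

def topk_pool(x, scores, k):
    """Keep the k highest-scoring nodes and gate their features.

    x:      (N, C) node feature matrix
    scores: (N,)   one importance score per node
    Gating by sigmoid(score) keeps the scores in the gradient path
    when this is used inside a trained network.
    """
    idx = np.argsort(scores)[::-1][:k]         # indices of the k best nodes
    gate = 1.0 / (1.0 + np.exp(-scores[idx]))  # sigmoid gate
    return x[idx] * gate[:, None], idx
```

Cluster-based methods differ in that they merge groups of nodes instead of discarding the low-scoring ones, but both reduce the number of nodes the next layer sees.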

Paper [8]: Primal-Dual Mesh Convolutional Neural Networks

PD-MeshNet converts the triangular mesh into a graph and introduces graph vertex merging as its pooling method.

## 2.3 Transformers - images

Regular pooling extended from two dimensions to three; now it comes back from three dimensions to two.
In recent years, Transformers on 2-D images have introduced not only image positional encoding and self-attention, but also token pooling...

Paper [9]: PSViT: Better Vision Transformer via Token Pooling and Attention Sharing
Paper [10]: Token Pooling in Vision Transformers
Paper [11]: MetaFormer Is Actually What You Need for Vision

The most impressive is the MetaFormer paper: no fancy pooling at all, just simple average pooling, and it still works remarkably well:
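A 1-D NumPy sketch of that idea, assuming a simplified reading of the paper's PoolFormer block: self-attention is replaced by a local average pool minus the identity (the real model pools in 2-D and wraps this in normalization and MLP layers):

```python
import numpy as np

def pool_mixer(tokens, k=3):
    """PoolFormer-style token mixer, 1-D for clarity.

    tokens: (N, C) sequence of token features.
    Each token is replaced by the average of its k-token neighbourhood,
    minus the token itself (the identity is subtracted because the
    surrounding block already has a residual connection).
    """
    n, _ = tokens.shape
    pad = k // 2
    padded = np.pad(tokens, ((pad, pad), (0, 0)), mode="edge")
    # Local average over a window of k neighbouring tokens.
    pooled = np.stack([padded[i:i + k].mean(axis=0) for i in range(n)])
    return pooled - tokens
```

Unlike self-attention, this mixer has no parameters at all, which is the paper's point: much of the benefit comes from the overall architecture rather than the token-mixing operator.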

It shows the power of the network architecture itself...

# 3. Discussion

There is still plenty of debate around pooling:

As things stand, pooling seems indispensable: both the latest research and many deployed neural networks use it.
But if a network's input is only low-resolution data, I think pooling could be dropped:

• A possible pipeline: preprocessing (downsampling) -> network (no pooling) -> post-processing (upsampling or other)