Notes on Python deep learning (11): loss function
2022-02-02 09:50:02 【Tanlyn】
1. Definition
The general form of a loss function is L(y, f(x)). It measures the inconsistency between the true value y and the prediction f(x); smaller is generally better. To make different loss functions easier to compare, the loss is often written as a function of a single variable: in regression problems that variable is the residual y − f(x), while in classification problems it is yf(x), which indicates whether the prediction and the label agree in sign.
A neural network with multiple outputs may have multiple loss functions, one per output. However, gradient descent must be driven by a single scalar loss value, so for a network with multiple losses, all of them need to be combined (for example, averaged) into one scalar.
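As a minimal NumPy sketch of the point above (the two heads, their targets, and the equal-weight averaging are all hypothetical choices, not from the original text):

```python
import numpy as np

# Hypothetical two-output network: one head trained with MSE, one with MAE.
# Gradient descent needs a single scalar, so the per-head losses are
# combined here by a simple mean into one value.
mse_head = np.mean((np.array([1.0, 2.0]) - np.array([0.9, 2.1])) ** 2)  # = 0.01
mae_head = np.mean(np.abs(np.array([0.5]) - np.array([0.7])))           # = 0.2
total_loss = (mse_head + mae_head) / 2  # single scalar used for backprop
print(total_loss)
```

In practice the heads are usually weighted rather than averaged equally, since their losses can be on very different scales.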
2. Categories
2.1 Classification problems
For binary classification, the label y ∈ {+1, −1}, and the loss function is often expressed as a monotonically decreasing function of yf(x), as shown in the figure below:
yf(x) is called the margin. Clearly, yf(x) > 0 means the classifier is correct, and yf(x) < 0 means it is wrong, where f(x) = 0 defines the classifier's separating hyperplane. This recalls classification models such as the perceptron and the support vector machine, for which minimizing the loss function is a process of maximizing the margin.
2.1.1 0-1 loss (0-1 loss)
The formula is (reconstructed here, as the original image is missing):

L(y, f(x)) = 1 if yf(x) < 0, else 0

The 0-1 loss treats every misclassified point the same, even points far from the boundary. It is a discrete, non-convex function and is therefore hard to optimize directly, so surrogate loss functions are usually optimized instead.
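A small NumPy sketch of the 0-1 loss just described (the labels and scores are made-up examples):

```python
import numpy as np

def zero_one_loss(y, fx):
    """0-1 loss: 1 when the sign of f(x) disagrees with the label y, else 0."""
    return np.where(y * fx > 0, 0.0, 1.0)

y  = np.array([1.0, -1.0,  1.0, -1.0])    # labels in {+1, -1}
fx = np.array([2.0,  0.5, -0.3, -1.2])    # classifier scores
print(zero_one_loss(y, fx))               # samples 2 and 3 are misclassified
```

Note that the loss is the same (1) for a barely wrong score like −0.3 and a badly wrong one, which is exactly why it treats all misclassified points alike.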
2.1.2 Cross-entropy loss (cross entropy loss)
Cross-entropy loss is the loss function most commonly used for classification problems in deep learning. Cross-entropy is also an important concept in information theory, mainly used to measure the difference between two probability distributions. The theory is not covered in detail here; only the concrete code implementations are shown.
Background for using cross-entropy:
When solving a classification problem with a neural network, the network is usually given k output nodes, where k is the number of classes, as shown below:
Each output node emits a score for its corresponding class, e.g. scores [44, 10, 22, 5] for [cat, dog, car, pedestrian]. But these are raw scores, not a probability distribution, so cross-entropy cannot yet compare the prediction with the ground truth. The solution is to append a softmax layer to the output, which converts the scores into a probability distribution.
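A quick sketch of that softmax step, using the [44, 10, 22, 5] scores from the text:

```python
import numpy as np

# Scores for [cat, dog, car, pedestrian] from the text.
scores = np.array([44.0, 10.0, 22.0, 5.0])

def softmax(x):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

probs = softmax(scores)
print(probs)        # a valid probability distribution
print(probs.sum())  # sums to 1
```

With scores this far apart, nearly all the probability mass lands on "cat"; softmax preserves the ranking while making the outputs sum to 1.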
Binary classification: binary cross-entropy (binary_cross_entropy)
TensorFlow implementation:
# batch_size = 1, one label in total, output shape (1, 1, 1)
import tensorflow as tf
y_true = [[[0.]]]
y_pred = [[[0.5]]]
# Using the built-in function
loss = tf.keras.losses.binary_crossentropy(y_true, y_pred)
print(loss.numpy())
# Computing by hand from the formula: loss = -(1/N) * sum[y*log(p) + (1-y)*log(1-p)]
loss_1 = -(1/1) * (0 * tf.math.log(0.5) + (1 - 0) * tf.math.log(1 - 0.5))
print(loss_1)
PyTorch implementation:
import numpy as np
import torch
import torch.nn.functional as F
y_true = np.array([0., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
y_pred = np.array([0.2, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8])
# Computing by hand from the formula: loss = -[y*log(p) + (1-y)*log(1-p)]
my_loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
mean_my_loss = np.mean(my_loss)
print('my_loss:', mean_my_loss)
# Using PyTorch's built-in function
torch_pred = torch.tensor(y_pred)
torch_true = torch.tensor(y_true)
bce_loss = F.binary_cross_entropy(torch_pred, torch_true)
print('bce_loss:', bce_loss)
Multi-class classification: categorical cross-entropy (categorical_cross_entropy)
TensorFlow implementation:
import tensorflow as tf
y_true = [[[0.,1.]]]
y_pred = [[[0.4, 0.6]]]  # assumed to be post-softmax, so the entries sum to 1
# Using the built-in function
loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(loss.numpy())
# Computing from the formula: loss = -sum(y_true * log(y_pred))
loss = -(0 * tf.math.log(0.4) + 1 * tf.math.log(0.6))
print(loss.numpy())
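The same formula can be checked from scratch in NumPy (the batched helper function here is an illustrative sketch, not part of the original post):

```python
import numpy as np

# From-scratch categorical cross-entropy, assuming each row of y_pred has
# already been through softmax (rows sum to 1).
def categorical_cross_entropy(y_true, y_pred):
    # per-sample loss: -sum_k y_true[k] * log(y_pred[k])
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0]])
y_pred = np.array([[0.4, 0.6]])
print(categorical_cross_entropy(y_true, y_pred))  # -log(0.6), matching the TF call
```

Because y_true is one-hot, only the log-probability of the true class contributes to the sum.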
2.2 Regression problems
Learning in a regression problem amounts to function fitting: choosing a function curve that fits the known data well and predicts unknown data well. In regression, both y and f(x) are real numbers (∈ ℝ), and the residual y − f(x) measures the inconsistency between the predicted value and the true value.
2.2.1 Mean squared error loss (MSE, L2 loss)
Mean squared error is also called the L2 loss; its expression is (reconstructed, as the original image is missing):

MSE = (1/n) Σᵢ (yᵢ − f(xᵢ))²

This is the most common loss function. It is convex, so gradient descent can be used to optimize it. However, it is quite sensitive to points far from the true value: such points incur a very large loss, which makes the mean squared error loss less robust.
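A minimal NumPy sketch of that outlier sensitivity, using made-up data:

```python
import numpy as np

# Toy data: the last target value is an outlier.
y      = np.array([1.0, 2.0, 3.0, 100.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.0])

mse = np.mean((y - y_pred) ** 2)
print(mse)  # dominated by the single outlier term, 97**2 = 9409
```

Three near-perfect predictions contribute almost nothing; the one outlier drives the whole loss, which is exactly the robustness problem described above.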
2.2.2 Absolute error loss (MAE, L1 loss)
The absolute error loss is also called the L1 loss; its expression is (reconstructed, as the original image is missing):

MAE = (1/n) Σᵢ |yᵢ − f(xᵢ)|

Compared with the mean squared error, the absolute error loss handles outlying points better, but it is non-differentiable at y = f(x). Moreover, the magnitude of the MAE gradient is always the same, so it may keep a large gradient even near the optimum and overshoot the optimal value.
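A NumPy sketch of both properties just mentioned, on made-up data with one outlier:

```python
import numpy as np

# Toy data: the last target value is an outlier.
y      = np.array([1.0, 2.0, 3.0, 100.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.0])

mae  = np.mean(np.abs(y - y_pred))
grad = np.sign(y_pred - y)  # subgradient of |y_pred - y| w.r.t. y_pred
print(mae)                  # the outlier contributes linearly, not quadratically
print(grad)                 # magnitude is always 1, near or far from the target
```

The constant-magnitude gradient is why MAE can hover around the optimum instead of settling onto it, unless the learning rate is decayed.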
3. Summary
1. Squared loss is the most commonly used; its drawback is that it penalizes outliers heavily, so it is not very robust.
2. Absolute loss resists interference from outliers, but its derivative is discontinuous at y = f(x), which makes it harder to optimize.
3. Huber loss combines the two: when |y − f(x)| is less than a pre-specified value δ, it behaves like the squared loss; when it is greater than δ, it becomes similar to the absolute loss, so it is also a fairly robust loss function.
4. If outliers represent important anomalies that need to be detected, use MSE (regression problems often use the MSE loss). If outliers are just treated as corrupted data, use MAE.
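The Huber loss from point 3 can be sketched in a few lines of NumPy (the data and δ = 1.0 are illustrative choices):

```python
import numpy as np

def huber(y, y_pred, delta=1.0):
    """Huber loss: quadratic for |residual| <= delta, linear beyond it,
    so it is differentiable everywhere yet robust to outliers."""
    r = y - y_pred
    small = np.abs(r) <= delta
    return np.where(small, 0.5 * r**2, delta * (np.abs(r) - 0.5 * delta))

y      = np.array([1.0, 2.0, 100.0])   # last point is an outlier
y_pred = np.array([1.5, 2.0, 3.0])
print(huber(y, y_pred))  # outlier contributes linearly, not quadratically
```

Inside the δ band the loss matches 0.5·(y − f(x))²; outside it the two pieces are stitched together so the value and slope agree at |r| = δ.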
Copyright notice
Author: Tanlyn. Please include a link to the original when reprinting, thank you.
https://en.pythonmana.com/2022/02/202202020949590594.html