Python: Andrew Ng deep learning assignment 17 -- Deep learning and art: neural style transfer (NST)

2022-07-25 12:47:00 Puzzle harvester

Deep learning and art: neural style transfer

In this assignment, you will learn about neural style transfer. The algorithm was created by Gatys et al. (2015) (https://arxiv.org/abs/1508.06576).

In this assignment, you will:

  • Implement the neural style transfer algorithm
  • Use the algorithm to generate novel artistic images

Most of the algorithms you have studied so far optimize a loss function to obtain a set of parameter values. In neural style transfer, you will optimize a loss function to obtain pixel values!

import os
import sys
import scipy.io
import scipy.misc
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
from nst_utils import *
import numpy as np
import tensorflow as tf

%matplotlib inline

1 Problem statement

Neural style transfer (NST) is one of the most interesting techniques in deep learning. It merges two images, a "content" image (C) and a "style" image (S), to create a "generated" image (G). The generated image G combines the content of image C with the style of image S.

In this example, you will take a picture of the Louvre Museum in Paris (content image C) and blend it with a painting by Claude Monet, a leader of the Impressionist movement (style image S), to produce a new image.

Let's see how to do this.

2 Transfer learning

Neural style transfer (NST) uses a previously trained convolutional network and builds on top of it. The idea of taking a network trained on one task and applying it to a new task is called transfer learning.

Following the original NST paper, we will use the VGG network. Specifically, we will use VGG-19, the 19-layer version of the VGG network. The model has already been trained on the very large ImageNet database, so it has learned to recognize a variety of low-level features (in the shallower layers) and high-level features (in the deeper layers).

model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")
print(model)
{'input': <tf.Variable 'Variable:0' shape=(1, 300, 400, 3) dtype=float32_ref>, 'conv1_1': <tf.Tensor 'Relu:0' shape=(1, 300, 400, 64) dtype=float32>, 'conv1_2': <tf.Tensor 'Relu_1:0' shape=(1, 300, 400, 64) dtype=float32>, 'avgpool1': <tf.Tensor 'AvgPool:0' shape=(1, 150, 200, 64) dtype=float32>, 'conv2_1': <tf.Tensor 'Relu_2:0' shape=(1, 150, 200, 128) dtype=float32>, 'conv2_2': <tf.Tensor 'Relu_3:0' shape=(1, 150, 200, 128) dtype=float32>, 'avgpool2': <tf.Tensor 'AvgPool_1:0' shape=(1, 75, 100, 128) dtype=float32>, 'conv3_1': <tf.Tensor 'Relu_4:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_2': <tf.Tensor 'Relu_5:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_3': <tf.Tensor 'Relu_6:0' shape=(1, 75, 100, 256) dtype=float32>, 'conv3_4': <tf.Tensor 'Relu_7:0' shape=(1, 75, 100, 256) dtype=float32>, 'avgpool3': <tf.Tensor 'AvgPool_2:0' shape=(1, 38, 50, 256) dtype=float32>, 'conv4_1': <tf.Tensor 'Relu_8:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_2': <tf.Tensor 'Relu_9:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_3': <tf.Tensor 'Relu_10:0' shape=(1, 38, 50, 512) dtype=float32>, 'conv4_4': <tf.Tensor 'Relu_11:0' shape=(1, 38, 50, 512) dtype=float32>, 'avgpool4': <tf.Tensor 'AvgPool_3:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_1': <tf.Tensor 'Relu_12:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_2': <tf.Tensor 'Relu_13:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_3': <tf.Tensor 'Relu_14:0' shape=(1, 19, 25, 512) dtype=float32>, 'conv5_4': <tf.Tensor 'Relu_15:0' shape=(1, 19, 25, 512) dtype=float32>, 'avgpool5': <tf.Tensor 'AvgPool_4:0' shape=(1, 10, 13, 512) dtype=float32>}
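
The spatial sizes in this printout shrink from 300 x 400 at the input to 10 x 13 after the fifth average-pooling layer. A minimal sketch that reproduces those numbers, assuming each pooling layer halves the height and width and rounds up (the behaviour of 2 x 2, stride-2 pooling with 'SAME' padding; the padding mode is an assumption, because load_vgg_model is not shown in this post, and pooled_shapes is a hypothetical helper):

```python
import math

def pooled_shapes(height, width, num_pools):
    """Return the (height, width) after each of num_pools halvings, rounding up."""
    shapes = []
    for _ in range(num_pools):
        height, width = math.ceil(height / 2), math.ceil(width / 2)
        shapes.append((height, width))
    return shapes

shapes = pooled_shapes(300, 400, 5)
print(shapes)  # [(150, 200), (75, 100), (38, 50), (19, 25), (10, 13)]
```

These match the avgpool1 to avgpool5 shapes in the dictionary above, which is a quick way to check which layer you are reading activations from.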

The model is stored in a Python dictionary, where each variable name is a key and the corresponding value is a tensor containing the value of that variable. To run an image through this network, you just have to feed the image to the model. In TensorFlow, you can do this with the tf.assign function. In particular, you will use the assign function like this:

model["input"].assign(image)

This assigns the image as the input to the model. After that, if you want to access the activations of a particular layer, say layer 4_2, when the network is run on this image, you run a TensorFlow session on the corresponding tensor conv4_2, as follows:

sess.run(model["conv4_2"])

3 Neural style transfer

We will build the NST algorithm in three steps:

  • Build the content loss function $J_{content}(C,G)$.
  • Build the style loss function $J_{style}(S,G)$.
  • Put them together to get $J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)$.

3.1 Computing the content loss

In our running example, the content image C is a picture of the Louvre Museum in Paris. Run the code below to see the picture of the Louvre.

content_image = scipy.misc.imread("images/louvre.jpg")
imshow(content_image)
d:\vr\virtual_environment\lib\site-packages\ipykernel_launcher.py:1: DeprecationWarning:     `imread` is deprecated!
    `imread` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
    Use ``imageio.imread`` instead.
  """Entry point for launching an IPython kernel.

The content image (C) shows the Louvre pyramid surrounded by old Parisian buildings, under a clear sky with a few clouds.

3.1.1 How do you ensure that the generated image G matches the content of image C?

As we saw in the course, the shallower layers of a ConvNet tend to detect low-level features such as edges and simple textures, while the deeper layers tend to detect higher-level features such as more complex textures.

We would like the generated image G to have content similar to the input image C. Suppose you have chosen the activations of some layer to represent the content of an image. In practice, you will get the most visually pleasing results if you choose a layer in the middle of the network, neither too shallow nor too deep. (After you have finished this exercise, feel free to come back and experiment with different layers to see how the results vary.)

So, suppose you have picked one particular hidden layer to use. Now, set the image C as the input to the pretrained VGG network and run forward propagation. Let $a^{(C)}$ be the hidden layer activations in the layer you have chosen. (In the course, we wrote this as $a^{[l](C)}$, but here we drop the superscript $[l]$ to simplify the notation.) This will be an $n_H \times n_W \times n_C$ tensor. Repeat this process with the image G: set G as the input and run forward propagation. Let $a^{(G)}$ be the corresponding hidden layer activations. We define the content loss function as:

$$J_{content}(C,G) = \frac{1}{4 \times n_H \times n_W \times n_C}\sum _{ \text{all entries}} (a^{(C)} - a^{(G)})^2\tag{1}$$

Here, $n_H$, $n_W$ and $n_C$ are the height, width and number of channels of the hidden layer you have chosen; they appear in the normalization term of the loss. Note that $a^{(C)}$ and $a^{(G)}$ are the volumes corresponding to a hidden layer's activations. To compute the loss $J_{content}(C,G)$, it can be convenient to unroll these 3D volumes into 2D matrices of shape $(n_C, n_H \times n_W)$. (Technically, this unrolling step is not needed to compute $J_{content}$, but it is good practice for when you need to carry out a similar operation later to compute the style loss $J_{style}$.)
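
The unrolling step is easy to get subtly wrong, so here is a small NumPy sketch of it. It mirrors the reshape-then-transpose order used in this assignment's TensorFlow code; the array sizes are illustrative only and not taken from the VGG model.

```python
import numpy as np

# Toy activation volume with batch size 1 (the sizes here are illustrative only).
n_H, n_W, n_C = 2, 3, 4
a = np.arange(n_H * n_W * n_C, dtype=np.float32).reshape(1, n_H, n_W, n_C)

# Reshape to (n_H * n_W, n_C), then transpose to (n_C, n_H * n_W).
a_unrolled = a.reshape(n_H * n_W, n_C).T

# Row k of the result holds every spatial position of channel k.
channel_rows_match = all(
    np.array_equal(a_unrolled[k], a[0, :, :, k].ravel()) for k in range(n_C)
)

# Reshaping straight to (n_C, n_H * n_W) has the right shape but mixes channels.
a_wrong = a.reshape(n_C, n_H * n_W)
print(a_unrolled.shape, channel_rows_match, np.array_equal(a_wrong, a_unrolled))
```

For the content loss the ordering does not change the result, because the loss sums over all entries. It does matter for the Gram matrix later, where each row must correspond to one filter.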

Exercise: Compute the "content loss" using TensorFlow.

Instructions: The 3 steps to implement this function are:

  1. Retrieve the dimensions from a_G:
    • To retrieve the dimensions from a tensor X, use: X.get_shape().as_list()
  2. Unroll a_C and a_G as described above
    • If you are stuck, take a look at Hint1 and Hint2.
  3. Compute the content loss:

def compute_content_cost(a_C, a_G):
    """
    Computes the content cost.

    Parameters:
        a_C -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing the content of image C.
        a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing the content of image G.

    Returns:
        J_content -- scalar, computed using equation 1 above.
    """
    # Retrieve the dimensions from a_G
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Unroll a_C and a_G from 3D volumes into 2D matrices of shape (n_C, n_H * n_W)
    a_C_unrolled = tf.transpose(tf.reshape(a_C, [n_H * n_W, n_C]))
    a_G_unrolled = tf.transpose(tf.reshape(a_G, [n_H * n_W, n_C]))

    # Compute the content cost
    J_content = 1 / (4 * n_H * n_W * n_C) * tf.reduce_sum(tf.square(tf.subtract(a_C_unrolled, a_G_unrolled)))
    return J_content

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    a_C = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    J_content = compute_content_cost(a_C, a_G)
    print("J_content = " + str(J_content.eval()))
    
    test.close()
J_content = 6.7655935

You should remember

  • The content loss takes a hidden layer activation of the neural network and measures how different $a^{(C)}$ and $a^{(G)}$ are.
  • When we minimize the content loss later, this will help make sure $G$ has content similar to $C$.
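
As a sanity check on equation (1), here is a minimal NumPy sketch of the same computation. The function name content_cost_np is made up for this example, and the random inputs come from NumPy's generator, so the numbers will not match the TensorFlow test output above.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """Equation (1) for activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)

rng = np.random.default_rng(0)
a_C = rng.normal(loc=1.0, scale=4.0, size=(1, 4, 4, 3))
a_G = rng.normal(loc=1.0, scale=4.0, size=(1, 4, 4, 3))

J_same = content_cost_np(a_C, a_C)          # identical activations
J_diff = content_cost_np(a_C, a_G)          # different activations
J_shift = content_cost_np(a_C, a_C + 2.0)   # every entry off by 2
print(J_same, J_diff, J_shift)
```

When every entry differs by 2, the sum of squares is 4 per entry, so the factor of 4 in the denominator makes the loss come out at about 1 regardless of the layer size.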

3.2 Computing the style loss

We will use the following style image for our running example:

style_image = scipy.misc.imread("images/monet_800600.jpg")
imshow(style_image)

This painting is in the Impressionist style.

Let's see how to define the "style" cost function $J_{style}(S,G)$.

3.2.1 Style matrix

The style matrix is also called the "Gram matrix". In linear algebra, the Gram matrix G of a set of vectors $(v_{1},\dots ,v_{n})$ is the matrix of dot products, whose entries are $G_{ij} = v_{i}^T v_{j} = np.dot(v_{i}, v_{j})$. In other words, $G_{ij}$ measures how similar $v_i$ is to $v_j$: if they are highly similar, you would expect them to have a large dot product, and thus $G_{ij}$ to be large.

Note that there is an unfortunate collision in the variable names used here. We follow the common terminology used in the literature, but $G$ is used to denote both the style matrix (or Gram matrix) and the generated image. We will make sure it is clear from the context which $G$ we are referring to.

In NST, you can compute the style matrix by multiplying the "unrolled" filter matrix by its transpose. With the unrolled activations $A$ of shape $(n_C, n_H \times n_W)$, this is $G_A = AA^T$.

The result is a matrix of dimension $(n_C,n_C)$, where $n_C$ is the number of filters. The value $G_{ij}$ measures how similar the activations of filter $i$ are to the activations of filter $j$.

One important property of the Gram matrix is that the diagonal elements, such as $G_{ii}$, also measure how active filter $i$ is. For example, suppose filter $i$ detects vertical textures in the image. Then $G_{ii}$ measures how common vertical textures are in the image as a whole: if $G_{ii}$ is large, the image has a lot of vertical texture.

By capturing the prevalence of different types of features ($G_{ii}$), as well as how often different features occur together ($G_{ij}$), the style matrix $G$ measures the style of an image.

Exercise:
Using TensorFlow, implement a function that computes the Gram matrix of a matrix A. The formula is: the Gram matrix of A is $G_A = AA^T$. If you are stuck, take a look at Hint 1 and Hint 2.

def gram_matrix(A):
    """
    Parameters: A -- matrix of shape (n_C, n_H * n_W)
    Returns: GA -- Gram matrix of A, of shape (n_C, n_C)
    """
    GA = tf.matmul(A, tf.transpose(A))
    return GA

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    A = tf.random_normal([3, 2*1], mean=1, stddev=4)
    GA = gram_matrix(A)
    
    print("GA = " + str(GA.eval()))
GA = [[ 6.422305 -4.429122 -2.096682]
 [-4.429122 19.465837 19.563871]
 [-2.096682 19.563871 20.686462]]
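
To make the reading of the diagonal and off-diagonal entries concrete, here is a small NumPy sketch with hand-made values. The three "filters" and the name gram_matrix_np are invented for this example; only the formula $G_A = AA^T$ comes from the text above.

```python
import numpy as np

def gram_matrix_np(A):
    """Gram matrix A A^T for A of shape (n_C, n_H * n_W)."""
    return A @ A.T

# Three hand-made "filters" over four spatial positions (illustrative values).
A = np.array([
    [1.0, 1.0, 1.0, 1.0],   # filter 0: active everywhere
    [1.0, 1.0, 0.0, 0.0],   # filter 1: active only at the first two positions
    [0.0, 0.0, 0.0, 2.0],   # filter 2: never active where filter 1 is
])
GA = gram_matrix_np(A)
print(GA)
# [[4. 2. 2.]
#  [2. 2. 0.]
#  [2. 0. 4.]]
```

The entry for filters 1 and 2 is 0 because they never fire at the same position, while the diagonal entries grow with how strongly each filter fires overall.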

3.2.2 Style loss

After generating the style matrix (Gram matrix), your goal is to minimize the distance between the Gram matrix of the "style" image S and that of the "generated" image G. For now, we are using only a single hidden layer $a^{[l]}$, and the corresponding style loss for this layer is defined as:

$$J_{style}^{[l]}(S,G) = \frac{1}{4 \times {n_C}^2 \times (n_H \times n_W)^2} \sum _{i=1}^{n_C}\sum_{j=1}^{n_C}(G^{(S)}_{ij} - G^{(G)}_{ij})^2\tag{2}$$

where $G^{(S)}$ and $G^{(G)}$ are respectively the Gram matrices of the "style" image and the "generated" image, computed using the hidden layer activations for a particular hidden layer in the network.

Exercise: Compute the style loss for a single layer.

Instructions: The steps to implement this function are:

  1. Retrieve the dimensions from the hidden layer activations a_G:
    • To retrieve the dimensions from a tensor X, use: X.get_shape().as_list()
  2. Unroll the hidden layer activations a_S and a_G into 2D matrices, as described above.
  3. Compute the style matrices of the images S and G. (Use the function you wrote previously.)
  4. Compute the style loss:

def compute_layer_style_cost(a_S, a_G):

    # Retrieve the dimensions from a_G
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Reshape the activations to (n_H * n_W, n_C); they are transposed to (n_C, n_H * n_W) below
    a_S = tf.reshape(a_S, shape=(n_H * n_W, n_C))
    a_G = tf.reshape(a_G, shape=(n_H * n_W, n_C))

    # Compute the Gram matrices of images S and G
    GS = gram_matrix(tf.transpose(a_S))
    GG = gram_matrix(tf.transpose(a_G))

    # Compute the loss
    J_style_layer = tf.reduce_sum(tf.square(tf.subtract(GS, GG))) / (4 * (n_C * n_C) * (n_W * n_H) * (n_W * n_H))

    return J_style_layer

tf.reset_default_graph()

with tf.Session() as test:
    tf.set_random_seed(1)
    a_S = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    a_G = tf.random_normal([1, 4, 4, 3], mean=1, stddev=4)
    J_style_layer = compute_layer_style_cost(a_S, a_G)
    
    print("J_style_layer = " + str(J_style_layer.eval()))
J_style_layer = 9.190278
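
One way to see what equation (2) measures is to shuffle the spatial positions of an activation volume: the content moves around, but the Gram matrix, and so the style loss against the original, stays the same up to rounding. Below is a NumPy sketch of that check; layer_style_cost_np is a name made up for this example, and the inputs are random, so the values differ from the TensorFlow output above.

```python
import numpy as np

def layer_style_cost_np(a_S, a_G):
    """Equation (2) for activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Unroll to (n_C, n_H * n_W), one row per filter.
    S = a_S.reshape(n_H * n_W, n_C).T
    G = a_G.reshape(n_H * n_W, n_C).T
    GS, GG = S @ S.T, G @ G.T
    return np.sum((GS - GG) ** 2) / (4 * n_C**2 * (n_H * n_W) ** 2)

rng = np.random.default_rng(1)
a_S = rng.normal(loc=1.0, scale=4.0, size=(1, 4, 4, 3))
a_G = rng.normal(loc=1.0, scale=4.0, size=(1, 4, 4, 3))

# Shuffle the 16 spatial positions while keeping each position's channel vector intact.
perm = rng.permutation(16)
a_S_shuffled = a_S.reshape(16, 3)[perm].reshape(1, 4, 4, 3)

J_diff = layer_style_cost_np(a_S, a_G)
J_shuffled = layer_style_cost_np(a_S, a_S_shuffled)
print(J_diff, J_shuffled)
```

This is why the style loss captures texture statistics rather than layout, while the content loss, which compares activations position by position, would penalize the shuffle heavily.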

3.2.3 Style weights

So far you have captured the style from only one layer. We will get better results if we "merge" the style losses from several different layers. After completing this exercise, feel free to come back and experiment with different weights to see how they change the generated image $G$. But for now, this is a reasonable default:

STYLE_LAYERS = [
    ('conv1_1', 0.2),
    ('conv2_1', 0.2),
    ('conv3_1', 0.2),
    ('conv4_1', 0.2),
    ('conv5_1', 0.2)]

You can combine the style losses of the different layers as follows:
$$J_{style}(S,G) = \sum_{l} \lambda^{[l]} J^{[l]}_{style}(S,G)$$
where the values of $\lambda^{[l]}$ are given in STYLE_LAYERS.

We have implemented a compute_style_cost(...) function. It simply calls your compute_layer_style_cost(...) for each layer and weights the results using the values in STYLE_LAYERS. Read over it to make sure you understand what it is doing.

1. Initialize the overall style cost J_style to 0.
2. Loop over (layer_name, coeff) from STYLE_LAYERS:
         a. Select the output tensor of the current layer. For example, to get the tensor from layer "conv1_1", you would do: out = model["conv1_1"]
         b. Get the style of the style image from the current layer by running the session on the tensor "out"
         c. Get a tensor representing the style of the generated image from the current layer. It is just "out".
         d. Now that you have both styles, use the function you implemented above to compute the style_cost for the current layer
         e. Add (style_cost x coeff) of the current layer to the overall style cost (J_style)
3. Return J_style, which should now be the sum of (style_cost x coeff) over all layers.

def compute_style_cost(model, STYLE_LAYERS):
    """
    Computes the overall style cost from several chosen layers.

    Parameters:
        model -- our TensorFlow model
        STYLE_LAYERS -- a Python list containing:
            - the names of the layers we would like to extract style from
            - a coefficient for each of them

    Returns:
        J_style -- tensor representing a scalar value, the style cost defined above by equation (2)
    """
    # Initialize the overall style cost
    J_style = 0

    for layer_name, coeff in STYLE_LAYERS:

        # Select the output tensor of the currently selected layer
        out = model[layer_name]

        # Set a_S to be the hidden layer activation of the selected layer, by running the session on out.
        # Note: this relies on the global session `sess` created in part 4.
        a_S = sess.run(out)

        # Set a_G to be the hidden layer activation from the same layer. Here, a_G references
        # model[layer_name] and is not evaluated yet. Later in the code, we will assign the image G
        # as the model input, so that when we run the session, these will be the activations
        # drawn from the appropriate layer, with G as input.
        a_G = out

        # Compute the style_cost for the current layer
        J_style_layer = compute_layer_style_cost(a_S, a_G)

        # Add coeff * J_style_layer of this layer to the overall style cost
        J_style += coeff * J_style_layer

    return J_style

Note: In the for loop above, a_G is a tensor and has not been evaluated yet. It will be evaluated and updated at each iteration when we run the TensorFlow graph in model_nn() below.

How do you choose the coefficients for each layer? The deeper layers capture more complex features, and the features in the deeper layers are less localized in the image relative to each other. So if you want the generated image to softly follow the style image, try choosing larger weights for the deeper layers and smaller weights for the first layers. In contrast, if you want the generated image to strongly follow the style image, try choosing smaller weights for the deeper layers and larger weights for the first layers.

You should remember

  • The style of an image can be represented using the Gram matrix of a hidden layer's activations. However, we get even better results by combining this representation from multiple different layers. This is in contrast to the content representation, where usually a single hidden layer is sufficient.
  • Minimizing the style loss will cause the image $G$ to follow the style of the image $S$.
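
The weighted sum over layers can be tried out without TensorFlow. The sketch below uses made-up per-layer costs (they are not measured from any image) and compares the default uniform weights with a hypothetical weighting, DEEP_HEAVY, that emphasizes the deeper layers.

```python
# Hypothetical per-layer style costs, shallow to deep (made-up numbers for illustration).
layer_costs = {"conv1_1": 8.0, "conv2_1": 4.0, "conv3_1": 2.0, "conv4_1": 1.0, "conv5_1": 0.5}

STYLE_LAYERS = [("conv1_1", 0.2), ("conv2_1", 0.2), ("conv3_1", 0.2), ("conv4_1", 0.2), ("conv5_1", 0.2)]
# An alternative weighting that emphasizes deeper layers (also illustrative).
DEEP_HEAVY = [("conv1_1", 0.05), ("conv2_1", 0.1), ("conv3_1", 0.15), ("conv4_1", 0.3), ("conv5_1", 0.4)]

def weighted_style_cost(costs, style_layers):
    """Sum of coeff * cost over the chosen layers."""
    return sum(coeff * costs[name] for name, coeff in style_layers)

J_uniform = weighted_style_cost(layer_costs, STYLE_LAYERS)
J_deep = weighted_style_cost(layer_costs, DEEP_HEAVY)
print(J_uniform, J_deep)
```

With these numbers the uniform weighting gives about 3.1 and the deep-heavy weighting about 1.6, which shows how the weights change which layers dominate the gradient that reaches the pixels.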

3.3 Defining the total loss to optimize

Finally, let's create a loss function that minimizes both the style and the content loss. The formula is:

$$J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)$$

Exercise: Implement the total loss function, which includes both the content loss and the style loss.

def total_cost(J_content, J_style, alpha = 10, beta = 40):
    J = alpha * J_content + beta * J_style
    return J

tf.reset_default_graph()

with tf.Session() as test:
    np.random.seed(3)
    J_content = np.random.randn()    
    J_style = np.random.randn()
    J = total_cost(J_content, J_style)
    print("J = " + str(J))
J = 35.34667875478276

You should remember

  • The total loss is a linear combination of the content loss $J_{content}(C,G)$ and the style loss $J_{style}(S,G)$
  • $\alpha$ and $\beta$ are hyperparameters that control the relative weighting between content and style
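
As a worked check of the formula, the sketch below plugs in the content and style costs printed at iteration 0 of the training log in part 4 of this post, with alpha = 10 and beta = 40. The function name total_cost_value is made up here so that it does not shadow the assignment's total_cost.

```python
def total_cost_value(J_content, J_style, alpha=10, beta=40):
    """J = alpha * J_content + beta * J_style, on plain Python numbers."""
    return alpha * J_content + beta * J_style

# Values printed at iteration 0 of the training log.
J_content_0, J_style_0 = 7881.85, 123420350.0
J_0 = total_cost_value(J_content_0, J_style_0)

# Fraction of the total cost contributed by the weighted style term.
style_share = 40 * J_style_0 / J_0
print(J_0, style_share)
```

The result agrees with the logged total cost of 4936893000.0 to within float32 rounding, and the style term accounts for almost all of it at the start, which is consistent with the style cost falling quickly in the log while the content cost rises slightly.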

4 Solving the optimization problem

Finally, let's put everything together to implement neural style transfer!

Here is what the program has to do:

  1. Create an interactive session
  2. Load the content image
  3. Load the style image
  4. Randomly initialize the image to be generated
  5. Load the VGG-19 model
  6. Build the TensorFlow graph:
    • Run the content image through the VGG-19 model and compute the content loss
    • Run the style image through the VGG-19 model and compute the style loss
    • Compute the total loss
    • Define the optimizer and the learning rate
  7. Initialize the TensorFlow graph and run it for a large number of iterations, updating the generated image at every step.

Let's go through the individual steps in detail.

You have previously implemented the total loss $J(G)$. We will now set up TensorFlow to optimize it with respect to $G$. To do so, your program has to reset the graph and use an "Interactive Session". Unlike a regular session, the interactive session installs itself as the default session to build a graph. This allows you to run variables without constantly needing to refer to the session object, which simplifies the code.

Let's start the interactive session.

#  Reset graph 
tf.reset_default_graph()

#  Start an interactive session 
sess = tf.InteractiveSession()

Let's load, reshape, and normalize our "content" image (the Louvre Museum picture):

content_image = scipy.misc.imread("images/louvre_small.jpg")
content_image = reshape_and_normalize_image(content_image)

Let's load, reshape, and normalize our "style" image (Claude Monet's painting):

style_image = scipy.misc.imread("images/monet.jpg")
style_image = reshape_and_normalize_image(style_image)

Now, we initialize the "generated" image as a noisy image created from the content_image. Initializing the pixels of the generated image to be mostly noise but still slightly correlated with the content image helps the content of the "generated" image match the content of the "content" image more rapidly. (Feel free to look in nst_utils.py to see the details of generate_noise_image(...); to do so, click "File --> Open..." at the upper-left corner of this Jupyter notebook.)
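
The helper generate_noise_image lives in nst_utils.py and is not reproduced in this post, so the following is only a sketch of the general idea: blend uniform noise with the content image. The function name noisy_start_image, the noise ratio of 0.6 and the noise range of plus or minus 20 are assumptions made for illustration, not values confirmed by the source.

```python
import numpy as np

def noisy_start_image(content_image, noise_ratio=0.6, noise_range=20.0, seed=None):
    """Blend uniform noise with the content image (hypothetical helper, not the nst_utils code)."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-noise_range, noise_range, size=content_image.shape).astype(np.float32)
    return noise_ratio * noise + (1.0 - noise_ratio) * content_image

# Stand-in for a normalized content image of shape (1, height, width, 3).
content = np.random.default_rng(0).uniform(-100, 100, size=(1, 30, 40, 3)).astype(np.float32)
start = noisy_start_image(content, seed=1)

# The starting image should remain positively correlated with the content image.
correlation = np.corrcoef(start.ravel(), content.ravel())[0, 1]
print(start.shape, correlation)
```

Raising noise_ratio makes the starting point more random; lowering it starts the optimization closer to the content image.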

generated_image = generate_noise_image(content_image)
imshow(generated_image[0])
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).

Next, as described in part 2, let's load the VGG-19 model.

model = load_vgg_model("pretrained-model/imagenet-vgg-verydeep-19.mat")

To get the program to compute the content loss, we will now assign a_C and a_G to be the appropriate hidden layer activations. We will use layer conv4_2 to compute the content loss. The code below does the following:

  1. Assign the content image to be the input to the VGG model.
  2. Set a_C to be the tensor giving the hidden layer activation for layer "conv4_2".
  3. Set a_G to be the tensor giving the hidden layer activation for the same layer.
  4. Compute the content loss using a_C and a_G.

# Assign the content image to be the input of the VGG model.
sess.run(model['input'].assign(content_image))

# Select the output tensor of layer conv4_2
out = model['conv4_2']

# Set a_C to be the hidden layer activation from the layer we have selected
a_C = sess.run(out)

# Set a_G to be the hidden layer activation from the same layer. Here, a_G references model['conv4_2']
# and is not evaluated yet. Later in the code, we will assign the image G as the model input, so that
# when we run the session, these will be the activations drawn from the appropriate layer, with G as input.
a_G = out

# Compute the content cost
J_content = compute_content_cost(a_C, a_G)

Note: At this point, a_G is a tensor and has not been evaluated. It will be evaluated and updated at each iteration when we run the TensorFlow graph in model_nn() below.

# Assign the "style" image to be the input of the model
sess.run(model['input'].assign(style_image))

# Compute the style cost
J_style = compute_style_cost(model, STYLE_LAYERS)

Exercise: Now that you have J_content and J_style, compute the total loss J by calling total_cost(). Use alpha = 10 and beta = 40.

J = total_cost(J_content, J_style, alpha = 10, beta = 40)

You have previously learned how to set up the Adam optimizer in TensorFlow. Here we use a learning rate of 2.0. See reference.

optimizer = tf.train.AdamOptimizer(2.0)
train_step = optimizer.minimize(J)
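
To illustrate what it means for the optimizer to update pixels rather than network weights, here is a NumPy sketch of textbook Adam applied directly to an image array. It is not a reimplementation of tf.train.AdamOptimizer, and the loss is a toy stand-in (mean squared distance to a fixed target) rather than the NST loss $J(G)$; the helper name adam_on_pixels and the array sizes are made up for this example.

```python
import numpy as np

def adam_on_pixels(G, grad_fn, learning_rate=2.0, num_iterations=200,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """Textbook Adam updates applied directly to an image array G."""
    G = G.astype(np.float64).copy()
    m = np.zeros_like(G)   # running mean of gradients
    v = np.zeros_like(G)   # running mean of squared gradients
    for t in range(1, num_iterations + 1):
        g = grad_fn(G)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)   # bias-corrected first moment
        v_hat = v / (1 - beta2**t)   # bias-corrected second moment
        G -= learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return G

# Toy stand-in for J(G): mean squared distance to a fixed target image.
rng = np.random.default_rng(0)
target = rng.uniform(-100, 100, size=(1, 8, 8, 3))
G0 = rng.uniform(-20, 20, size=(1, 8, 8, 3))

loss = lambda G: np.mean((G - target) ** 2)
grad = lambda G: 2 * (G - target) / G.size

G_final = adam_on_pixels(G0, grad)
print(loss(G0), loss(G_final))
```

Because Adam normalizes each step by the gradient's running magnitude, a learning rate of 2.0 moves each pixel by roughly 2 units per step at first, which is a sensible scale for pixel values that span a couple of hundred units.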

Exercise: Implement the model_nn() function, which initializes the variables of the TensorFlow graph, assigns the input image (the initial generated image) as the input of the VGG-19 model, and runs train_step for a large number of steps.

def model_nn(sess, input_image, num_iterations = 200):
    # Initialize global variables (you need to run the session on the initializer)
    sess.run(tf.global_variables_initializer())

    # Run the noisy input image (the initial generated image) through the model. Use assign().
    generated_image = sess.run(model['input'].assign(input_image))

    for i in range(num_iterations):

        # Run the session on train_step to minimize the total cost
        sess.run(train_step)

        # Compute the generated image by running the session on the current model['input']
        generated_image = sess.run(model['input'])

        # Print every 20 iterations.
        if i % 20 == 0:
            Jt, Jc, Js = sess.run([J, J_content, J_style])
            print("Iteration " + str(i) + " :")
            print("total cost = " + str(Jt))
            print("content cost = " + str(Jc))
            print("style cost = " + str(Js))

            # Save the current generated image in the "output/" directory
            save_image("output/" + str(i) + ".png", generated_image)

    # Save the last generated image
    save_image('output/generated_image.jpg', generated_image)

    return generated_image

Run the following cell to generate an artistic image. It takes about 3 minutes on a CPU for every 20 iterations, but you start to see good results after about 140 iterations. Neural style transfer is usually trained on GPUs.

model_nn(sess, generated_image)
Iteration 0 :
total cost = 4936893000.0
content cost = 7881.85
style cost = 123420350.0
Iteration 20 :
total cost = 931792700.0
content cost = 15150.729
style cost = 23291030.0
Iteration 40 :
total cost = 476977900.0
content cost = 16802.03
style cost = 11920246.0
Iteration 60 :
total cost = 306887600.0
content cost = 17398.729
style cost = 7667841.0
Iteration 80 :
total cost = 224318640.0
content cost = 17652.709
style cost = 5603553.0
Iteration 100 :
total cost = 177715900.0
content cost = 17879.422
style cost = 4438427.5
Iteration 120 :
total cost = 147169620.0
content cost = 18050.78
style cost = 3674727.5
Iteration 140 :
total cost = 125411320.0
content cost = 18213.465
style cost = 3130729.5
Iteration 160 :
total cost = 108912420.0
content cost = 18361.072
style cost = 2718220.2
Iteration 180 :
total cost = 96001230.0
content cost = 18497.363
style cost = 2395406.5






You're done! After running this, click "File" in the upper bar of the notebook and then "Open". Go to the "/output" directory to see all the saved images. Open "generated_image" to see the generated image!

You should see the image shown below:
 [Figure: generated image]

We did not want you to wait too long to see an initial result, so we set the hyperparameters accordingly. To get the best-looking results, running the optimization algorithm longer (and perhaps with a smaller learning rate) might work better. After completing and submitting this assignment, we encourage you to come back and play more with this notebook, and see if you can generate even better-looking images.

Here are a few other examples:

  • The beautiful ruins of the ancient city of Persepolis (Iran) in the style of Van Gogh (The Starry Night)

  • The tomb of Cyrus the Great in the style of ceramics from Ispahan

  • A scientific study of turbulence in the style of an abstract blue fluid painting.

5 Test with your own image

Finally, you can also rerun the algorithm on your own images!

To do so, go back to part 4 and change the content image and style image to your own pictures. Here is what you should do:

  1. Click on "File -> Open"
  2. Go to "/images" and upload your images (requirement as stated in the assignment: (WIDTH = 300, HEIGHT = 225); note that the model printed in part 2 has an input of shape (1, 300, 400, 3), so the size must match what your model expects), and rename them, for example, "my_content.png" and "my_style.png"
  3. Change the code in part 4 from:
content_image = scipy.misc.imread("images/louvre.jpg")  
style_image = scipy.misc.imread("images/claude-monet.jpg")

to:

content_image = scipy.misc.imread("images/my_content.jpg")  
style_image = scipy.misc.imread("images/my_style.jpg")

Then rerun the cells (you may need to restart the kernel).

You can also tune your hyperparameters:

  • Which layers are responsible for representing the style? STYLE_LAYERS
  • How many iterations do you want to run the algorithm? num_iterations
  • What is the relative weighting between content and style? alpha / beta

6 Summary

Congratulations on completing this assignment! You are now able to use neural style transfer to generate artistic images. This is also your first time building a model in which the optimization algorithm updates the pixel values rather than the neural network's parameters. Deep learning has many different types of models, and this is only one of them!

You should remember

  • Neural style transfer is an algorithm that, given a content image C and a style image S, can generate an artistic image
  • It uses representations (hidden layer activations) based on a pretrained ConvNet.
  • The content loss function is computed using one hidden layer's activations.
  • The style loss function for one layer is computed using the Gram matrix of that layer's activations. The overall style loss function is obtained using several hidden layers.
  • Optimizing the total loss function results in synthesizing new images.

copyright notice
Author: Puzzle harvester. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/206/202207251212172792.html
