A Functionally Separate Autoencoder
Jinxin Wei1, Qunying Ren2
1Vocational School of Juancheng, Juancheng 274600, China
2Bureau of Emergency Management of Juancheng County, Juancheng 274600, China
Abstract: Following the way kids learn, an auto-encoder that can be split into two parts is designed, and the two parts work well separately. The top half is an abstract network, trained by supervised learning, which can be used for classification and regression. The bottom half is a concrete network, obtained through the inverse function of the top half and trained by self-supervised learning; it generates the input of the abstract network from a concept or label. The network achieves its intended functionality in tests on the MNIST dataset with a convolutional neural network. A round function is added between the abstract network and the concrete network in order to obtain a representative generation for each class. The generation ability can be improved by adding jump connections and negative feedback. The characteristics of the network are then discussed: the encoder can change the input into any form, and the decoder can change it back through the inverse function; the concrete network can be seen as memory stored in the parameters; and lethe (forgetting) arises because training on new knowledge changes those parameters. Finally, the applications of the network are discussed. The network can be used for logic generation through deep reinforcement learning, and also for language translation, zip and unzip, encryption and decryption, compilation and decompilation, and modulation and demodulation.
Keywords: auto-encoder,
supervised learning, self-supervised learning, inverse function,
abstract network, concrete network, computer vision, jump connection,
negative feedback
Why Do We Design the Concrete Network?
An abstract network (namely a prediction network, typically used for classification and regression) is common now, but a concrete network (namely a generation network), which generates concrete information from a concept or label, is rare. Why do we need a concrete network? I take kids' learning process as an example. A kindergarten teacher takes out several pictures, each of which contains a horse. She points at one of the pictures and explains that it is a horse, what features the horse has, and what shapes those features have. In this way the kids learn how to recognize a horse; this is supervised learning. Then the teacher asks a question: how do you draw the horse? The kids can draw their own horse using the features they have recognized. This is the process from label or concept to concrete information, and it is self-supervised learning. I design the network according to this process in order to simulate it.
How Do We Design the Concrete Network?
An abstract network (prediction network) is easy to design, but how do we design the concrete network? The details are as follows. The top half is an abstract network trained by supervised learning, and the bottom half is a concrete network obtained through the inverse function of the top half and trained by self-supervised learning. We train the top half first, then set it as untrainable, and then train the bottom half. As an example, I take a two-layer, fully connected abstract network (binary classification) and a two-layer concrete network, as shown in Fig. 1.
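As a minimal sketch of this two-stage procedure (train the top half, freeze it, then train the bottom half), the following Keras-style code is my own illustration, not the paper's implementation; the layer sizes, the 784-dimensional flattened MNIST input, and the restriction to digits 0 and 1 for binary classification are assumptions made for the example.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Binary-classification subset of MNIST (digits 0 and 1), flattened to 784 features.
(x, y), _ = tf.keras.datasets.mnist.load_data()
mask = y < 2
x_train = x[mask].reshape(-1, 784).astype("float32") / 255.0
y_train = y[mask].astype("float32")

# Top half: two-layer abstract (prediction) network, trained by supervised learning.
inp = tf.keras.Input(shape=(784,))
h = layers.Dense(64, activation="relu")(inp)
out = layers.Dense(1, activation="sigmoid")(h)
abstract_net = Model(inp, out, name="abstract")
abstract_net.compile(optimizer="adam", loss="binary_crossentropy")
abstract_net.fit(x_train, y_train, epochs=3, batch_size=128)

# Freeze the top half before training the bottom half.
abstract_net.trainable = False

# Bottom half: two-layer concrete (generation) network, label -> input.
code = tf.keras.Input(shape=(1,))
g = layers.Dense(64, activation="relu")(code)
recon = layers.Dense(784, activation="sigmoid")(g)
concrete_net = Model(code, recon, name="concrete")

# Self-supervised training: the concrete net must rebuild the input
# from the label produced by the frozen abstract net.
x_in = tf.keras.Input(shape=(784,))
x_hat = concrete_net(abstract_net(x_in))
composite = Model(x_in, x_hat)
composite.compile(optimizer="adam", loss="mse")
composite.fit(x_train, x_train, epochs=3, batch_size=128)
```

After training, concrete_net can be used on its own, mapping a label (0 or 1) to a generated image, which is the separate use of the bottom half described above.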
The Inverse Function
The following are the inversions of the functions. The function of a fully connected layer is

$y = wx + b$ (1)

so its inverse function is

$x = w^{-1}(y - b)$ (2)
Because the inverse of a linear function is also linear, the fully connected layers of the concrete network have the same structure as those of the abstract network. Since w is a matrix, we need the inverse of w, which requires w to be a square matrix. In practice, however, the dimensions of w are determined by the numbers of neurons in the two adjacent layers, so the network cannot reproduce the inputs exactly; it can only approximate them (see Fig. 1).
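To make the dimension argument concrete, the following NumPy snippet is my own illustration (not the paper's code): a fully connected layer that maps 784 inputs to 2 outputs has a non-square w, so the closest substitute for $w^{-1}$ is the Moore-Penrose pseudo-inverse, and the recovered input is only an approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Forward pass of a fully connected layer, y = w x + b,
# mapping 784 inputs down to 2 outputs (w is not square).
w = rng.normal(size=(2, 784))
b = rng.normal(size=2)
x = rng.normal(size=784)
y = w @ x + b

# Exact inversion x = w^{-1}(y - b) is impossible because w has no inverse;
# the Moore-Penrose pseudo-inverse gives the least-squares estimate instead.
x_hat = np.linalg.pinv(w) @ (y - b)

print(np.allclose(x, x_hat))        # False: the input is not reproduced exactly
print(np.linalg.norm(x - x_hat))    # the approximation error is non-zero
```

This matches the statement above: the concrete network can only approximate the inputs rather than reproduce them.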