A Functionally Separate Autoencoder
Jinxin Wei1, Qunying Ren2
1Vocational School of Juancheng, Juancheng 274600, China
2Bureau of Emergency Management of Juancheng County, Juancheng 274600, China
Abstract: Inspired by children's learning process, we design an auto-encoder that can be split into two parts, each of which works well on its own. The top half is an abstract network trained by supervised learning, which can be used for classification and regression. The bottom half is a concrete network constructed from inverse functions and trained by self-supervised learning; it generates the input of the abstract network from a concept or label. Tests with the MNIST dataset and a convolutional neural network show that the network achieves its intended functionality. A round function is added between the abstract network and the concrete network to obtain a representative generation for each class, and the generation ability can be improved by adding jump connections and negative feedback. We then discuss the characteristics of the network: the encoder can change the input into any form and the decoder can change it back through the inverse function, and the concrete network can be seen as memory stored in the parameters, with forgetting (lethe) occurring because training on new knowledge changes those parameters. Finally, we discuss applications of the network: it can be used for logic generation through deep reinforcement learning, as well as for language translation, compression and decompression, encryption and decryption, compilation and decompilation, and modulation and demodulation.
Keywords: auto-encoder, supervised learning, self-supervised learning, inverse function, abstract network, concrete network, computer vision, jump connection, negative feedback
Why Do We Design the Concrete Network?
Abstract networks (namely prediction networks, commonly used for classification and regression) are ubiquitous now, but concrete networks (namely generation networks), which generate concrete information from a concept or label, are rare. Why do we need a concrete network? Take children's learning process as an example. A kindergarten teacher takes out several pictures, each showing a horse. She points at one of the pictures and explains that it is a horse, that the horse has certain features, and that the features have certain shapes. In this way the kids learn how to recognize a horse; this is supervised learning. Then the teacher asks how to draw the horse, and the kids can draw their own horse using the features they have learned to recognize. This is the process from label or concept to concrete information, and it is self-supervised learning. We design the network to simulate this process.
How to Design the Concrete Network?
The abstract network (prediction network) is easy to design; how do we design the concrete network? The details are as follows. The top half is an abstract network trained by supervised learning, and the bottom half is a concrete network constructed from the inverse functions of the top half and trained by self-supervised learning. We train the top half first, then set it as untrainable, and then train the bottom half. As an example, we take a fully connected two-layer abstract network (for binary classification) and a two-layer concrete network, as shown in Fig. 1.
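To make the two-stage procedure concrete, the following is a minimal PyTorch sketch under stated assumptions: the layer sizes (flattened 28x28 MNIST inputs, a two-class output), the optimizer settings, and the `train_loader` iterable yielding (inputs, labels) batches are illustrative placeholders, not the exact configuration used in the paper.

```python
# Minimal sketch of the two-stage training described above (assumed setup).
import torch
import torch.nn as nn

# Top half: abstract (prediction) network, two fully connected layers.
abstract_net = nn.Sequential(
    nn.Linear(784, 64),
    nn.Linear(64, 2),            # binary classification, as in the example
)

# Bottom half: concrete (generation) network with the mirrored structure,
# since the inverse of a linear map is again linear.
concrete_net = nn.Sequential(
    nn.Linear(2, 64),
    nn.Linear(64, 784),
)

# Stage 1: train the abstract network with supervised learning.
clf_loss = nn.CrossEntropyLoss()
opt_top = torch.optim.Adam(abstract_net.parameters(), lr=1e-3)
for inputs, labels in train_loader:
    opt_top.zero_grad()
    loss = clf_loss(abstract_net(inputs), labels)
    loss.backward()
    opt_top.step()

# Stage 2: freeze the abstract network (set it as untrainable) ...
for p in abstract_net.parameters():
    p.requires_grad = False

# ... then train the concrete network to reconstruct the abstract
# network's input from its output (self-supervised learning).
rec_loss = nn.MSELoss()
opt_bottom = torch.optim.Adam(concrete_net.parameters(), lr=1e-3)
for inputs, _ in train_loader:
    opt_bottom.zero_grad()
    with torch.no_grad():
        codes = abstract_net(inputs)          # label/concept representation
    loss = rec_loss(concrete_net(codes), inputs)
    loss.backward()
    opt_bottom.step()
```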
The Inverse Function
The following are the inverses of the functions used. The function of a fully connected layer is

y = w x + b    (1)

so its inverse function is

x = w^{-1} (y - b)    (2)
Because the inverse of a linear function is also linear, the fully connected layers of the concrete network have the same structure as those of the abstract network. Since w is a matrix, Eq. (2) requires the inverse of w, so w must be a square matrix. In reality, however, the dimensions of w are determined by the numbers of neurons in the two adjacent layers, so w is generally not square; the network therefore cannot reproduce the inputs exactly, but can only approximate them (see Fig. 1).
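To illustrate Eqs. (1) and (2), the following NumPy sketch (with arbitrary sizes, not taken from the paper) shows that a square w can be inverted exactly, while a rectangular w, as arises when adjacent layers have different numbers of neurons, permits only approximate recovery; the pseudo-inverse here stands in for what the trained concrete layer must learn to approximate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Square case: w is invertible, so Eq. (2) recovers x exactly.
w = rng.normal(size=(4, 4))
b = rng.normal(size=4)
x = rng.normal(size=4)
y = w @ x + b                                # Eq. (1)
x_rec = np.linalg.inv(w) @ (y - b)           # Eq. (2)
print(np.allclose(x, x_rec))                 # True (up to rounding error)

# Non-square case (4 neurons -> 2 neurons): no exact inverse exists.
w2 = rng.normal(size=(2, 4))
b2 = rng.normal(size=2)
y2 = w2 @ x + b2
x_approx = np.linalg.pinv(w2) @ (y2 - b2)    # pseudo-inverse: best linear guess
print(np.allclose(x, x_approx))              # False: inputs only approximated
```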