This is a short summary of [1], which I wrote for a lecture of the Deep Learning course I am taking this semester.

Recently, [4] showed that shallow networks require exponentially more sum-product hidden units than deep networks to represent certain functions. For several years, smooth activation functions (such as the sigmoid and hyperbolic tangent) were the standard non-linearities in neural networks, but [2] showed that Rectified Linear Units (ReLUs) are much faster to train. In 2013, [3] introduced a new piecewise linear activation called Maxout. This paper [1] presents a theoretical analysis of the advantages of deep networks with such activations over shallow networks. A similar analysis was done in [5], which showed that, when approximating a function, deep networks are able to produce exponentially more linear regions than shallow networks with the same number of hidden units. This paper presents a tighter lower bound than [5] on the maximal number of linear regions of functions computed by deep neural networks. This lends deep networks an advantage over shallow networks in terms of representational power: a higher number of linear pieces means an ability to represent more complex functions.
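To make the notion of a "linear region" concrete, here is a small sketch (a toy example of my own, not from the paper): a rectifier network computes a fixed affine function wherever the set of active ReLU units stays the same, so we can count regions along an input line by counting distinct on/off activation patterns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy network (an illustrative example, not from [1]): one scalar input,
# a single hidden layer of n1 = 8 ReLU units with random weights.
n1 = 8
W1 = rng.normal(size=(n1, 1))
b1 = rng.normal(size=n1)

# Sample the input line densely and record which units are active.
xs = np.linspace(-10.0, 10.0, 20_001)
pre = W1 @ xs[None, :] + b1[:, None]       # pre-activations, shape (n1, len(xs))
patterns = {tuple(col) for col in (pre > 0).T}

# Each distinct on/off pattern corresponds to one linear region of the
# piecewise linear function the network computes. With n0 = 1 input,
# the shallow bound from [5] gives at most C(8,0) + C(8,1) = 9 regions.
print(len(patterns))
```

The count can never exceed 9 here, no matter how the weights are drawn, which is exactly the kind of limitation the paper exploits when comparing shallow and deep networks.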

This paper also describes how intermediate layers of deep neural networks "map several pieces of their inputs into the same output". The authors hypothesize that deep networks re-use and compose the features computed by lower layers, and that this re-use happens exponentially more often as the number of layers increases. This gives deep networks the ability to compute highly complex functions.
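This folding intuition can be illustrated with the classic tent-map construction (a sketch in the spirit of the paper's space-folding argument, not its exact construction): a two-unit ReLU layer can fold the interval $[0,1]$ onto itself, mapping both halves of its input to the same output, so stacking $L$ such layers produces $2^L$ linear pieces from only $2L$ hidden units.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def fold(x):
    # Two ReLU units implement the "tent map" T(x) = 1 - |2x - 1|:
    # T(x) = 2x on [0, 1/2] and T(x) = 2 - 2x on [1/2, 1], so both
    # halves of [0, 1] are mapped onto the same output interval [0, 1].
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def count_linear_pieces(y, xs):
    # A new linear piece starts wherever the slope between samples changes.
    slopes = np.diff(y) / np.diff(xs)
    return 1 + int(np.count_nonzero(np.diff(slopes)))

xs = np.linspace(0.0, 1.0, 1025)  # dyadic grid, aligned with the breakpoints
y = xs
counts = []
for depth in range(1, 5):
    y = fold(y)                   # stack one more folding layer
    counts.append(count_linear_pieces(y, xs))
print(counts)  # [2, 4, 8, 16]: the number of pieces doubles with every layer
```

Each layer re-uses the whole computation of the layers below it on both halves of its input, which is exactly the exponential re-use the authors describe.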

Theoretical bounds on the number of linear regions produced by shallow networks were presented in [5]. The authors show that the maximal number of linear regions of functions computed by shallow rectifier networks with $n_0$ inputs and $n_1$ hidden units is $\sum_{j=0}^{n_0}\binom{n_1}{j}$. They also obtain a lower bound of $\Omega ((\frac{n}{n_0})^{L-1}n^{n_0})$ on the maximal number of linear regions of a function that can be computed by a rectifier neural network with $n_{0}$ inputs and $L$ hidden layers of width $n \geq n_0$.
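The shallow bound is easy to evaluate directly; a minimal sketch (my own, for illustration):

```python
from math import comb

def shallow_max_regions(n0, n1):
    # Maximal number of linear regions of a one-hidden-layer rectifier
    # network with n0 inputs and n1 hidden units, as given in [5]:
    # sum over j = 0 .. n0 of C(n1, j).
    return sum(comb(n1, j) for j in range(n0 + 1))

# With n0 = 2 inputs, the bound grows only polynomially in the width n1:
print([shallow_max_regions(2, n1) for n1 in (4, 8, 16, 32)])
# [11, 37, 137, 529]
```

For fixed input dimension $n_0$, this is a polynomial of degree $n_0$ in the number of hidden units, which is what makes the exponential-in-depth lower bound below so striking.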

The main result presented in this paper is the following (Corollary 6 in [1]):

A rectifier neural network with $n_0$ input units and $L$ hidden layers of width $n \geq n_0$ can compute functions that have $\Omega ((\frac{n}{n_0})^{(L-1)n_0}n^{n_0})$ linear regions.
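To see the gap this implies, one can compare the growth term of Corollary 6 against the shallow maximal bound for the same total number of hidden units (a rough sketch; the $\Omega$ notation hides constants, so these numbers are indicative only):

```python
from math import comb

def shallow_max_regions(n0, n_total):
    # Maximal region count for a single hidden layer holding all
    # n_total units (the bound from [5]).
    return sum(comb(n_total, j) for j in range(n0 + 1))

def deep_growth_term(n0, n, L):
    # The growth term (n/n0)^((L-1)*n0) * n^n0 from Corollary 6. The true
    # region count is Omega of this, so it is an order-of-magnitude guide.
    return (n // n0) ** ((L - 1) * n0) * n ** n0

n0, n, L = 2, 8, 4                     # deep net: 4 hidden layers of width 8
print(deep_growth_term(n0, n, L))      # 262144 (lower-bound growth term)
print(shallow_max_regions(n0, n * L))  # 529 (upper bound for the same 32 units)
```

With the same 32 hidden units, the deep architecture's lower-bound term already exceeds the shallow architecture's upper bound by several orders of magnitude, and the gap widens exponentially as $L$ grows.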

The main contribution of this paper is a tighter bound on the complexity of the functions that deep networks can represent: the authors show theoretically that deep networks are exponentially more efficient than shallow networks. This is a step forward in developing a theoretical understanding of deep networks. However, an important open question is whether we actually need this exponentially higher capacity for the tasks we are currently interested in. Recently, [6] showed that shallow networks with a similar number of parameters can match the accuracies obtained by deep networks on several tasks (TIMIT and CIFAR-10). Although they used deep networks to train their shallow models, this result shows that shallow networks have the capacity to perform as well as deep networks, at least on some tasks. We might just have to figure out ways to train them efficiently so that their capacity is fully exploited.


__Abstract__

*This paper presents theoretical results about the advantages of deep networks over shallow networks. The authors calculate bounds on the number of linear regions that deep networks with piecewise linear activation functions (e.g. ReLU and Maxout) can use to approximate functions. The number of linear regions used for representing a function can be thought of as a measure of the complexity of the representation model. The paper shows that deep networks produce exponentially more linear regions than shallow networks.*

**References:**

[1] Montufar, Guido F., Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. "On the number of linear regions of deep neural networks." In *Advances in Neural Information Processing Systems*, pp. 2924-2932. 2014.

[2] Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. "Deep Sparse Rectifier Neural Networks." In *AISTATS*, vol. 15, no. 106, p. 275. 2011.

[3] Goodfellow, Ian J., David Warde-Farley, Mehdi Mirza, Aaron C. Courville, and Yoshua Bengio. "Maxout networks." *ICML (3)* 28 (2013): 1319-1327.

[4] Delalleau, Olivier, and Yoshua Bengio. "Shallow vs. deep sum-product networks." In *Advances in Neural Information Processing Systems*, pp. 666-674. 2011.

[5] Pascanu, Razvan, Guido Montufar, and Yoshua Bengio. "On the number of response regions of deep feed forward networks with piece-wise linear activations." *arXiv preprint arXiv:1312.6098* (2013).

[6] Ba, Jimmy, and Rich Caruana. "Do deep nets really need to be deep?" In *Advances in Neural Information Processing Systems*, pp. 2654-2662. 2014.