While neural networks used in practice are often very deep, the benefit of depth is not well understood. Interestingly, it is known that increasing depth is often harmful for regression tasks. In this work, we show that, in contrast to regression, very deep networks can be Bayes optimal for classification. In particular, we provide simple and explicit activation functions that can be used with standard neural network architectures to achieve consistency. This work provides a fundamental understanding of classification using deep neural networks, and we envision that it will help guide the design of future neural network architectures.