Scene Parsing with Deep Convolutional Networks

David Grangier	Leon Bottou	Ronan Collobert
NEC Labs America, Princeton, NJ
firstname@grangier.info	firstname@bottou.org	firstname@collobert.com
We propose a deep learning strategy for scene parsing, i.e. assigning a class label to each pixel of an image. We investigate the use of deep convolutional networks for modeling the complex structure of scene labels, relying on a supervised greedy learning strategy. Compared to standard approaches based on CRFs, our strategy does not need hand-crafted features, allows modeling more complex spatial dependencies, and has a lower inference cost. Experiments over the MSRC benchmark and the LabelMe dataset show the effectiveness of our approach.

The following presents results obtained over the LabelMe dataset with 20 classes. We learned our model over the Spanish city data. The model is evaluated over the data from the city of Madrid, which was held out during training.