1 Methodologies
1.1 Basic Model
[1] Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. “A fast learning algorithm for deep belief nets.” Neural computation 18.7 (2006): 1527-1554. [pdf]
[2] Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. “Reducing the dimensionality of data with neural networks.” Science 313.5786 (2006): 504-507. [pdf]
[3] Hinton, Geoffrey E., et al. “Improving neural networks by preventing co-adaptation of feature detectors.” arXiv preprint arXiv:1207.0580 (2012). [pdf]
[4] Srivastava, Nitish, et al. “Dropout: a simple way to prevent neural networks from overfitting.” Journal of Machine Learning Research 15.1 (2014): 1929-1958. [pdf]
[5] Ioffe, Sergey, and Christian Szegedy. “Batch normalization: Accelerating deep network training by reducing internal covariate shift.” ICML 2015. [pdf]
[6] Courbariaux, Matthieu, et al. “Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1.” arXiv preprint arXiv:1602.02830 (2016). [pdf]
[7] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. “Distilling the knowledge in a neural network.” arXiv preprint arXiv:1503.02531 (2015). [pdf]
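As a quick companion to the dropout papers [3][4] above, here is a minimal NumPy sketch of inverted dropout (the now-common variant; the papers instead scale the weights at test time). The shapes, drop rate, and random seed are toy choices, not settings from the papers.

```python
import numpy as np

def dropout_forward(x, p_drop=0.5, train=True, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each unit with probability p_drop during training
    and rescale the survivors so the expected activation is unchanged at test time."""
    if not train or p_drop == 0.0:
        return x
    mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)  # keep-mask, pre-scaled
    return x * mask

h = np.ones((2, 4))
print(dropout_forward(h, p_drop=0.5))  # roughly half the entries zeroed, the rest scaled by 2
```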
1.2 Optimization
[1] Sutskever, Ilya, et al. “On the importance of initialization and momentum in deep learning.” ICML 2013: 1139-1147. [pdf]
[2] Kingma, Diederik, and Jimmy Ba. “Adam: A method for stochastic optimization.” arXiv preprint arXiv:1412.6980 (2014). [pdf]
[3] Han, Song, Huizi Mao, and William J. Dally. “Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding.” arXiv preprint arXiv:1510.00149 (2015). [pdf]
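The Adam update in [2] is compact enough to show in full. The sketch below applies it to a toy quadratic with the paper's default β₁, β₂, ε and an assumed learning rate of 0.1 (the paper's default is 0.001).

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba): moment estimates, bias correction, parameter step."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

theta = np.array([5.0, -3.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 501):
    grad = 2 * theta                            # gradient of f(theta) = ||theta||^2
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)                                    # close to [0, 0]
```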
1.3 Sequence-to-Sequence Model / RNN
[1] Graves, Alex. “Generating sequences with recurrent neural networks.” arXiv preprint arXiv:1308.0850 (2013). [pdf]
[2] Cho, Kyunghyun, et al. “Learning phrase representations using RNN encoder-decoder for statistical machine translation.” arXiv preprint arXiv:1406.1078 (2014). [pdf]
[3] Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. “Sequence to sequence learning with neural networks.” NIPS 2014. [pdf]
[4] Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. “Neural Machine Translation by Jointly Learning to Align and Translate.” arXiv preprint arXiv:1409.0473 (2014). [pdf]
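The additive attention that [4] adds to the encoder-decoder of [2][3] fits in a few lines; below is a NumPy sketch with randomly initialized weights and toy dimensions, not trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_enc, d_dec, d_att = 6, 8, 8, 16               # toy sequence length and layer sizes
H = rng.normal(size=(T, d_enc))                    # encoder hidden states h_1..h_T
s = rng.normal(size=(d_dec,))                      # previous decoder state s_{t-1}
W_a = rng.normal(size=(d_att, d_dec))              # alignment-model parameters
U_a = rng.normal(size=(d_att, d_enc))
v_a = rng.normal(size=(d_att,))

e = np.tanh(H @ U_a.T + W_a @ s) @ v_a             # alignment scores e_j = v_a . tanh(W_a s + U_a h_j)
alpha = np.exp(e - e.max()); alpha /= alpha.sum()  # attention weights (softmax over positions)
context = alpha @ H                                # context vector fed to the decoder
print(alpha.round(3), context.shape)               # weights sum to 1; context has shape (d_enc,)
```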
1.4 Unsupervised Learning / Deep Generative Model
[1] Le, Quoc V. “Building high-level features using large scale unsupervised learning.” ICASSP 2013. [pdf]
[2] Kingma, Diederik P., and Max Welling. “Auto-encoding variational bayes.” arXiv preprint arXiv:1312.6114 (2013). [pdf]
[3] Ian Goodfellow, et al. “Generative adversarial nets.” NIPS 2014. [pdf]
[4] Radford, Alec, Luke Metz, and Soumith Chintala. “Unsupervised representation learning with deep convolutional generative adversarial networks.” arXiv preprint arXiv:1511.06434 (2015). [pdf]
[5] Gregor, Karol, et al. “DRAW: A recurrent neural network for image generation.” arXiv preprint arXiv:1502.04623 (2015). [pdf]
[6] Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. “Pixel recurrent neural networks.” arXiv preprint arXiv:1601.06759 (2016). [pdf]
[7] Oord, Aaron van den, et al. “Conditional image generation with PixelCNN decoders.” NIPS 2016. [pdf]
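For [2], the two ingredients are the reparameterization trick and the ELBO. The sketch below computes a single-sample ELBO estimate for a Gaussian encoder and Bernoulli decoder, with random linear maps standing in for the actual neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)
d_x, d_z = 10, 2
x = rng.integers(0, 2, size=d_x).astype(float)     # a toy binary observation

# placeholder "encoder"/"decoder" parameters; real VAEs use deep networks here
W_mu, W_logvar = rng.normal(size=(d_z, d_x)), 0.1 * rng.normal(size=(d_z, d_x))
W_dec = rng.normal(size=(d_x, d_z))

mu, logvar = W_mu @ x, W_logvar @ x
eps = rng.normal(size=d_z)
z = mu + np.exp(0.5 * logvar) * eps                # reparameterization: z = mu + sigma * eps

p = 1.0 / (1.0 + np.exp(-(W_dec @ z)))             # Bernoulli decoder probabilities
recon = np.sum(x * np.log(p + 1e-9) + (1 - x) * np.log(1 - p + 1e-9))   # log p(x|z)
kl = -0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar))                 # KL(q(z|x) || N(0, I))
print("single-sample ELBO estimate:", recon - kl)
```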
1.5 Deep Reinforcement Learning
[1] Mnih, Volodymyr, et al. “Playing atari with deep reinforcement learning.” arXiv preprint arXiv:1312.5602 (2013). [pdf]
[2] Mnih, Volodymyr, et al. “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529-533. [pdf]
[3] Mnih, Volodymyr, et al. “Asynchronous methods for deep reinforcement learning.” ICML 2016. [pdf]
[4] Lillicrap, Timothy P., et al. “Continuous control with deep reinforcement learning.” arXiv preprint arXiv:1509.02971 (2015). [pdf]
[5] Schulman, John, et al. “Trust region policy optimization.” ICML 2015. [pdf]
[6] Silver, David, et al. “Mastering the game of Go with deep neural networks and tree search.” Nature 529.7587 (2016): 484-489. [pdf]
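The heart of [1][2] is the bootstrapped target y = r + γ·max_a' Q_target(s', a') that the online network regresses toward. The sketch below shows that target and update on a tiny Q-table with one hard-coded transition, rather than a convolutional network and replay buffer.

```python
import numpy as np

n_states, n_actions, gamma, lr = 4, 2, 0.99, 0.1
Q = np.zeros((n_states, n_actions))            # online Q-function (a table here, a conv net in DQN)
Q_target = Q.copy()                            # periodically synced target network

s, a, r, s_next, done = 0, 1, 1.0, 2, False    # one transition, as if sampled from the replay buffer
y = r + (0.0 if done else gamma * Q_target[s_next].max())   # bootstrapped target
Q[s, a] += lr * (y - Q[s, a])                  # move Q(s, a) toward the target
print(Q)
```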
2 Applications
2.1 Image Recognition
[1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “Imagenet classification with deep convolutional neural networks.” NIPS 2012. [pdf]
[2] Simonyan, Karen, and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition.” arXiv preprint arXiv:1409.1556 (2014). [pdf]
[3] Christian Szegedy, et al. “Going deeper with convolutions.” CVPR 2015. [pdf]
[4] Kaiming He, et al. “Deep residual learning for image recognition.” CVPR 2016. [pdf]
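The key idea of [4] is the identity shortcut: a block outputs F(x) + x, so the layers only need to learn a residual. The sketch below uses plain matrices and ReLU in place of the paper's convolution and batch-norm branch.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                        # toy feature dimension
W1, W2 = 0.1 * rng.normal(size=(d, d)), 0.1 * rng.normal(size=(d, d))

def residual_block(x):
    f = np.maximum(0.0, W1 @ x)              # residual branch, first layer + ReLU
    f = W2 @ f                               # residual branch, second layer: F(x)
    return np.maximum(0.0, f + x)            # add the identity shortcut, then ReLU

x = rng.normal(size=d)
print(residual_block(x).shape)               # same shape as the input, as the skip connection requires
```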
2.2 Speech Recognition
[1] Geoffrey Hinton, et al. “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups.” IEEE Signal Processing Magazine 29.6 (2012): 82-97. [pdf]
[2] Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. “Speech recognition with deep recurrent neural networks.” ICASSP 2013. [pdf]
[3] Dario Amodei, et al. “Deep speech 2: End-to-end speech recognition in english and mandarin.” ICML 2016. [pdf]
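The end-to-end systems in [2][3] emit per-frame label distributions that are collapsed by the CTC rule: merge repeated labels, then drop blanks. Below is a greedy-decoding sketch; the per-frame argmax labels are made up.

```python
import numpy as np

BLANK = 0
labels = {1: "c", 2: "a", 3: "t"}
frame_argmax = np.array([1, 1, 0, 2, 2, 2, 0, 0, 3, 3])   # toy per-frame best labels

decoded, prev = [], None
for k in frame_argmax:
    if k != prev and k != BLANK:           # merge repeats, skip blanks
        decoded.append(labels[int(k)])
    prev = k
print("".join(decoded))                    # -> "cat"
```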
2.3 Natural Language Processing
[1] Mikolov, Tomas, et al. “Distributed representations of words and phrases and their compositionality.” NIPS 2013: 3111-3119. [pdf]
[2] Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. “Sequence to sequence learning with neural networks.” NIPS 2014. [pdf]
[3] Ankit Kumar, et al. “Ask Me Anything: Dynamic Memory Networks for Natural Language Processing.” ICML 2016. [pdf]
[4] Yoon Kim, et al. “Character-Aware Neural Language Models.” AAAI 2016. [pdf]
[5] Jason Weston, et al. “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks.” arXiv preprint arXiv:1502.05698 (2015). [pdf]
[6] Karl Moritz Hermann, et al. “Teaching Machines to Read and Comprehend.” NIPS 2015. [pdf]
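The training signal behind the word vectors of [1] is the skip-gram negative-sampling objective: push the dot product of a (center, context) pair up and the dot products with sampled negatives down. The vocabulary, vectors, and negative samples below are toy placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
d = 16
W_in = 0.1 * rng.normal(size=(len(vocab), d))    # center-word ("input") vectors
W_out = 0.1 * rng.normal(size=(len(vocab), d))   # context-word ("output") vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

center, context = vocab.index("cat"), vocab.index("sat")
negatives = [vocab.index("the"), vocab.index("mat")]   # drawn from a unigram^0.75 table in practice

pos = np.log(sigmoid(W_out[context] @ W_in[center]))
neg = sum(np.log(sigmoid(-W_out[n] @ W_in[center])) for n in negatives)
print("objective for this pair:", pos + neg)           # maximized over the corpus during training
```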
2.4 Object Detection
[1] Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. “Deep neural networks for object detection.” NIPS 2013. [pdf]
[2] Girshick, Ross, et al. “Rich feature hierarchies for accurate object detection and semantic segmentation.” CVPR 2014. [pdf]
[3] He, Kaiming, et al. “Spatial pyramid pooling in deep convolutional networks for visual recognition.” ECCV 2014. [pdf]
[4] Girshick, Ross. “Fast r-cnn.” ICCV 2015. [pdf]
[5] Ren, Shaoqing, et al. “Faster R-CNN: Towards real-time object detection with region proposal networks.” NIPS 2015. [pdf]
[6] Redmon, Joseph, et al. “You only look once: Unified, real-time object detection.” CVPR 2016. [pdf]
[7] Liu, Wei, et al. “SSD: Single Shot MultiBox Detector.” ECCV 2016. [pdf]
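A quantity every detector above relies on, for anchor assignment, evaluation, and non-maximum suppression, is the intersection-over-union of two boxes; a small self-contained implementation with boxes given as (x1, y1, x2, y2) corners is sketched below.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25 / 175 ≈ 0.143
```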
2.5 Image Captioning
[1] Farhadi, Ali, et al. “Every picture tells a story: Generating sentences from images.” ECCV 2010. [pdf]
[2] Vinyals, Oriol, et al. “Show and tell: A neural image caption generator.” CVPR 2015. [pdf]
[3] Donahue, Jeff, et al. “Long-term recurrent convolutional networks for visual recognition and description.” CVPR 2015. [pdf]
[4] Karpathy, Andrej, and Li Fei-Fei. “Deep visual-semantic alignments for generating image descriptions.” CVPR 2015. [pdf]
[5] Fang, Hao, et al. “From captions to visual concepts and back.” CVPR 2015. [pdf]
[6] Mao, Junhua, et al. “Deep captioning with multimodal recurrent neural networks (m-RNN).” ICLR 2015. [pdf]
[7] Xu, Kelvin, et al. “Show, attend and tell: Neural image caption generation with visual attention.” ICML 2015. [pdf]
2.6 Machine Translation
[1] Luong, Minh-Thang, et al. “Addressing the rare word problem in neural machine translation.” arXiv preprint arXiv:1410.8206 (2014). [pdf]
[2] Sennrich, et al. “Neural machine translation of rare words with subword units.” arXiv preprint arXiv:1508.07909 (2015). [pdf]
[3] Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. “Effective approaches to attention-based neural machine translation.” arXiv preprint arXiv:1508.04025 (2015). [pdf]
[4] Wu, Schuster, Chen, Le, et al. “Google’s neural machine translation system: bridging the gap between human and machine translation.” arXiv preprint arXiv:1609.08144v2 (2016). [pdf]
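The subword units of [2] come from byte-pair encoding: repeatedly merge the most frequent adjacent symbol pair in the training vocabulary. The sketch below runs three merges on toy word counts, not the paper's corpus.

```python
from collections import Counter

# toy vocabulary: word (as a symbol tuple ending in an end-of-word marker) -> frequency
vocab = {("l", "o", "w", "</w>"): 5,
         ("l", "o", "w", "e", "r", "</w>"): 2,
         ("n", "e", "w", "e", "s", "t", "</w>"): 6}

def most_frequent_pair(vocab):
    """Count adjacent symbol pairs, weighted by word frequency, and return the most common."""
    pairs = Counter()
    for word, freq in vocab.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge(vocab, pair):
    """Replace every occurrence of the chosen pair with a single merged symbol."""
    merged = {}
    for word, freq in vocab.items():
        out, i = [], 0
        while i < len(word):
            if i + 1 < len(word) and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1]); i += 2
            else:
                out.append(word[i]); i += 1
        merged[tuple(out)] = freq
    return merged

for _ in range(3):                    # learn three merge operations
    pair = most_frequent_pair(vocab)
    vocab = merge(vocab, pair)
    print("merged", pair)
```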
2.7 Robotics
[1] Levine, Sergey, et al. “End-to-end training of deep visuomotor policies.” Journal of Machine Learning Research 17.39 (2016): 1-40. [pdf]
[2] Levine, Sergey, et al. “Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection.” The International Journal of Robotics Research, 2016. [pdf]
[3] Zhu, Yuke, et al. “Target-driven visual navigation in indoor scenes using deep reinforcement learning.” ICRA 2017. [pdf]
2.8 Object Segmentation
[1] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation.” CVPR, 2015. [pdf]
[2] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. “Semantic image segmentation with deep convolutional nets and fully connected CRFs.” ICLR 2015. [pdf]
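The last stage of the fully convolutional network in [1] upsamples a coarse per-class score map back to the input resolution and takes a per-pixel argmax; the paper learns a bilinear-initialized deconvolution, for which simple repetition stands in below.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, h, w, stride = 3, 4, 4, 8
scores = rng.normal(size=(n_classes, h, w))     # coarse score map from the convolutional trunk

upsampled = scores.repeat(stride, axis=1).repeat(stride, axis=2)   # crude upsampling to (3, 32, 32)
segmentation = upsampled.argmax(axis=0)         # per-pixel class labels
print(segmentation.shape)                       # (32, 32)
```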