Neural Machine Translation by Jointly Learning to Align and Translate.
Does Multimodality Help Human and Machine for Translation and Image Captioning?, Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, pp. 627-633, 2016.
DOI: 10.18653/v1/W16-2358
URL: https://hal.archives-ouvertes.fr/hal-01433183
Multimodal Attention for Neural Machine Translation. arXiv preprint arXiv:1609.
LIUM-CVC Submissions for WMT17 Multimodal Translation Task, Proceedings of the Second Conference on Machine Translation, 2017.
Microsoft COCO Captions: Data Collection and Evaluation Server. arXiv preprint, 2015.
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014.
Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description, Proceedings of the Second Conference on Machine Translation, 2017.
Conditional Gated Recurrent Unit with Attention Mechanism, 2016.
Factored Neural Machine Translation Architectures, Proceedings of the International Workshop on Spoken Language Translation, IWSLT'16, 2016.
LIUM Machine Translation Systems for WMT17 News Translation Task, Proceedings of the Second Conference on Machine Translation, 2017.
Understanding the Difficulty of Training Deep Feedforward Neural Networks, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, pp. 249-256, 2010.
Maxout Networks, Proceedings of the 30th International Conference on Machine Learning, Proceedings of Machine Learning Research, 2013.
NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems, The Prague Bulletin of Mathematical Linguistics, pp. 15-28, 2017.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1026-1034, 2015.
DOI: 10.1109/ICCV.2015.123
URL: http://arxiv.org/pdf/1502.01852
Neural Monkey: An Open-source Tool for Sequence Learning, The Prague Bulletin of Mathematical Linguistics, pp. 5-17, 2017.
DOI: 10.1515/pralin-2017-0001
URL: https://doi.org/10.1515/pralin-2017-0001
Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling, 2016.
Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.
OpenNMT: Open-Source Toolkit for Neural Machine Translation, Proceedings of ACL 2017, System Demonstrations, 2017.
DOI: 10.18653/v1/P17-4012
URL: http://arxiv.org/pdf/1701.02810
Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments, Proceedings of the Second Workshop on Statistical Machine Translation, StatMT '07, pp. 228-231, 2007.
DOI: 10.3115/1626355.1626389
Adding Gradient Noise Improves Learning for Very Deep Networks.
BLEU: a Method for Automatic Evaluation of Machine Translation, Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, ACL '02, pp. 311-318, 2002.
DOI: 10.3115/1073083.1073135
On the Difficulty of Training Recurrent Neural Networks, Proceedings of the 30th International Conference on Machine Learning, pp. 1310-1318, 2013.
Using the Output Embedding to Improve Language Models. arXiv preprint, 2016.
Exact Solutions to the Nonlinear Dynamics of Learning in Deep Linear Neural Networks. arXiv preprint arXiv:1312, 2013.
A Joint Dependency Model of Morphological and Syntactic Structure for Statistical Machine Translation, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 114-121, 2015.
DOI: 10.18653/v1/D15-1248
Neural Machine Translation of Rare Words with Subword Units, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1715-1725, 2016.
DOI: 10.18653/v1/P16-1162
Nematus: a Toolkit for Neural Machine Translation, Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pp. 65-68, 2017.
DOI: 10.18653/v1/E17-3017
URL: http://arxiv.org/pdf/1703.04357
Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
Lecture 6.5-rmsprop: Divide the Gradient by a Running Average of Its Recent Magnitude, COURSERA: Neural Networks for Machine Learning, 2012.
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pp. 2048-2057, 2015.
ADADELTA: An Adaptive Learning Rate Method. arXiv preprint, 2012.