Nuevas arquitecturas hardware de procesamiento de alto rendimiento para aprendizaje profundo
- A. J. Rivera
- F. D. Charte Luque
- M. Espinilla
- M. D. Pérez-Godoy
ISSN: 2173-8688
Year of publication: 2018
Issue: 8
Pages: 67-84
Type: Article
Published in: Enseñanza y aprendizaje de ingeniería de computadores: Revista de Experiencias Docentes en Ingeniería de Computadores
Abstract
The design and manufacture of hardware are costly, both in time and in economic investment, which is why integrated circuits are always manufactured in large volumes to take advantage of economies of scale. For this reason, the majority of processors manufactured are general purpose, which broadens their scope of application. In recent years, however, more and more processors have been manufactured for specific applications, including processors designed to accelerate work with deep neural networks. This article introduces the need for this type of specialized hardware, describing its purpose, operation and current implementations.
Bibliographic References
- D. López Talavera, C. Rus Casas, F. Charte Ojeda, Estructura y tecnología de computadores, Anaya, ISBN: 84-415-2606-8, 2009.
- E. Martín Cuenca, J. M. Angulo Usategui, I. Angulo Martínez, Microcontroladores PIC. La solución en un chip, ITP-Paraninfo, ISBN: 84-283-2371-2, 1999.
- T. L. Floyd, Digital Fundamentals, 10/e, Pearson Education, 2011.
- Wikipedia, Internet de las cosas. URL https://es.wikipedia.org/wiki/Internet_de_las_cosas
- I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016, http://www.deeplearningbook.org.
- C. J. Hughes, Single-Instruction Multiple-Data Execution, 2015. https://doi.org/10.2200/S00647ED1V01Y201505CAC032
- F. Charte, A. J. Rueda, M. Espinilla, A. J. Rivera, Evolución tecnológica del hardware de vídeo y las GPU en los ordenadores personales, Enseñanza y Aprendizaje de Ingeniería de Computadores (7) (2017) 111-128. URL http://hdl.handle.net/10481/47376
- F. Charte, A. J. Rivera, F. J. Pulgar, M. J. del Jesús, Explotación de la potencia de procesamiento mediante paralelismo: un recorrido histórico hasta la GPGPU, Enseñanza y Aprendizaje de Ingeniería de Computadores (6) (2016) 19-33. URL http://hdl.handle.net/10481/41910
- Nvidia, The New GPU Architecture Designed to Bring AI to Every Industry. URL https://www.nvidia.com/en-us/data-center/volta-gpu-architecture/
- G. E. Hinton, S. Osindero, Y.-W. Teh, A fast learning algorithm for deep belief nets, Neural Computation 18 (7) (2006) 1527-1554.
- W. Rawat, Z. Wang, Deep convolutional neural networks for image classification: A comprehensive review, Neural Computation 29 (9) (2017) 2352-2449.
- D. Charte, F. Charte, S. García, M. J. del Jesús, F. Herrera, A practical tutorial on autoencoders for nonlinear feature fusion: Taxonomy, models, software and guidelines, Information Fusion 44 (2018) 78-96.
- S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (8) (1997) 1735-1780.
- D. E. Rumelhart, G. E. Hinton, R. J. Williams, Learning internal representations by error propagation, Tech. rep., DTIC Document (1985).
- C. Szegedy, S. Ioffe, V. Vanhoucke, A. A. Alemi, Inception-v4, Inception-ResNet and the impact of residual connections on learning, in: AAAI, 2017.
- Y. LeCun, C. Cortes, C. J. Burges, The MNIST database of handwritten digits. URL http://yann.lecun.com/exdb/mnist/
- M. Abadi, et al., TensorFlow: Large-scale machine learning on heterogeneous systems, software available from tensorflow.org (2015). URL https://www.tensorflow.org/
- S. Treil, Linear algebra done wrong, 2016. URL https://www.math.brown.edu/~treil/papers/LADW/LADW.html
- T. G. Kolda, B. W. Bader, Tensor decompositions and applications, SIAM review 51 (3) (2009) 455-500.
- J. Ortega Lopera, M. Anguita López, A. Prieto Espinosa, Arquitectura de computadores, Thomson, 2005.
- K. He, X. Zhang, S. Ren, J. Sun, Identity mappings in deep residual networks, in: European Conference on Computer Vision, Springer, 2016, pp. 630-645.
- N. P. Jouppi, et al., In-datacenter performance analysis of a tensor processing unit, SIGARCH Comput. Archit. News 45 (2) (2017) 1-12. doi:10.1145/3140659.3080246.
- D. Lin, S. Talathi, S. Annapureddy, Fixed point quantization of deep convolutional networks, in: International Conference on Machine Learning, 2016, pp. 2849-2858.
- K. Sato, C. Young, D. Patterson, An in-depth look at Google's first Tensor Processing Unit (TPU) (2017). URL https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu
- U. Köster, T. Webb, X. Wang, M. Nassar, A. K. Bansal, W. Constable, O. Elibol, S. Gray, S. Hall, L. Hornof, et al., Flexpoint: An adaptive numerical format for efficient training of deep neural networks, in: Advances in Neural Information Processing Systems, 2017, pp. 1742-1752.
- S. Markidis, S. W. Der Chien, E. Laure, I. B. Peng, J. S. Vetter, Nvidia tensor core programmability, performance & precision, arXiv preprint arXiv:1803.04014 (2018).
- F. Charte, M. Espinilla, A. J. Rivera, F. J. Pulgar, Uso de dispositivos FPGA como apoyo a la enseñanza de asignaturas de arquitectura de computadores, Enseñanza y Aprendizaje de Ingeniería de Computadores (7) (2017) 37-52. URL http://hdl.handle.net/10481/47371