4.2 Development Tools
A deep learning platform provides an interface for designing deep learning architectures easily using pre-built, optimized libraries and components. Optimized performance, ease of coding, parallelization, reduced computation, and automatic gradient computation are key characteristics of a good deep learning platform. Leading companies such as Google, Microsoft, Nvidia, and Amazon are investing heavily in developing Graphics Processing Unit (GPU)-accelerated deep learning platforms for fast, large-scale computation. Among all the existing platforms, TensorFlow is the most widely used and most popular among users, which is why we use it in this research.
In this section, deep learning platforms are reviewed as follows.
4.2.1 TensorFlow
This platform was introduced by the Google Brain team in late 2015. It supports languages such as Python, C++, R, and Java, which makes the tool popular. Moreover, it allows working with one or more CPUs and GPUs with high data scalability. Hence, anyone from an individual with a tablet to a large-scale distributed system can rely on TensorFlow, although scholars have suggested using TensorFlow with a server-grade multi-threaded implementation [81]. TensorFlow represents any model as a directed acyclic graph (DAG), where the nodes of the graph represent mathematical operations and the edges represent the tensors (multi-dimensional arrays) flowing between them. Video analysis, visualization of distributions, sound recognition, time-series analysis, and object detection are some uses of TensorFlow. Furthermore, TensorFlow supports distributed training, provides low latency for mobile users, and is easy to integrate with SQL tables.
TensorFlow is well suited to most deep learning models because of its extensive built-in support for deep learning and its mathematical functions for neural networks.
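For illustration, the dataflow-graph idea can be sketched in a few lines of plain Python (a hypothetical minimal example; the class and function names here are invented for illustration and do not correspond to TensorFlow's actual API):

```python
# Minimal sketch of a dataflow graph in the TensorFlow style:
# nodes are operations, edges carry tensor values downstream.
# Hypothetical illustration only; TensorFlow's real API differs.

class Node:
    def __init__(self, op, *inputs):
        self.op = op          # callable performing the mathematical operation
        self.inputs = inputs  # upstream nodes (the incoming graph edges)

    def evaluate(self):
        # Evaluate upstream nodes first, then apply this node's operation.
        return self.op(*(n.evaluate() for n in self.inputs))

def constant(value):
    # A source node with no inputs that simply emits a value.
    return Node(lambda: value)

# Build the graph y = (a + b) * c; nothing is computed at build time.
a, b, c = constant(2.0), constant(3.0), constant(4.0)
s = Node(lambda x, y: x + y, a, b)
y = Node(lambda x, z: x * z, s, c)

print(y.evaluate())  # 20.0
```

TensorFlow follows the same separation: the graph is described first, and the numerical work is performed only when the graph is executed.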
4.2.2 Deeplearning4J
Deep Learning for Java (DL4J) is a robust, open-source distributed deep learning framework for the JVM created by Skymind [82], which has since been contributed to the Eclipse Foundation and its Java ecosystem. DL4J is designed to be commercial-grade as well as open source: it supports Java and Scala APIs, operates in distributed environments (integrating with Apache Hadoop and Spark), and can import models from other deep learning frameworks (TensorFlow, Caffe, Theano) [83]. It also includes implementations of restricted Boltzmann machines, deep belief networks, deep stacked autoencoders, recursive neural networks, and more, which in many other platforms would need to be built from the ground up or adapted from example code.
4.2.3 Theano
Theano is a highly popular deep learning platform designed primarily by academics which, unfortunately, is no longer supported after release 1.0.0 (November 2017). Initiated in 2007, Theano is a Python library designed for performing mathematical operations on multi-dimensional arrays and for optimizing code compilation [84], primarily for scientific research applications. More specifically, Theano was designed to surpass other Python libraries, such as NumPy, in execution speed, stability optimizations, and symbolic graph computation. Theano supports tensor operations and GPU computation, runs on Python 2 and 3, and supports parallelism via BLAS and SIMD.
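The symbolic-graph style that Theano pioneered, in which an expression is first built symbolically and only later evaluated or differentiated, can be illustrated with a hypothetical pure-Python sketch (the `Var` and `Mul` classes below are invented for illustration and are not Theano's API):

```python
# Hypothetical sketch of symbolic expression graphs in the Theano
# style: expressions are constructed first, then evaluated and
# differentiated later. Not Theano's actual API.

class Var:
    def __init__(self, name):
        self.name = name
    def __mul__(self, other):
        return Mul(self, other)
    def eval(self, env):
        return env[self.name]
    def grad(self, wrt, env):
        # d(self)/d(wrt) is 1 for the variable itself, 0 otherwise.
        return 1.0 if self is wrt else 0.0

class Mul:
    def __init__(self, a, b):
        self.a, self.b = a, b
    def eval(self, env):
        return self.a.eval(env) * self.b.eval(env)
    def grad(self, wrt, env):
        # Product rule: d(ab) = a'b + ab'
        return (self.a.grad(wrt, env) * self.b.eval(env)
                + self.a.eval(env) * self.b.grad(wrt, env))

x = Var("x")
y = x * x                        # symbolic expression; nothing computed yet
print(y.eval({"x": 3.0}))        # 9.0
print(y.grad(x, {"x": 3.0}))     # 6.0, i.e. dy/dx = 2x at x = 3
```

Deriving gradients mechanically from the expression graph in this way is what makes the automatic gradient computation mentioned above possible.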
4.2.4 Torch
Torch is also a scientific computing framework; however, its focus is primarily on GPU-accelerated computation. It is implemented in C and is scripted in the Lua language via the LuaJIT compiler. In addition, Torch is mainly supported on Mac OS X and Ubuntu 12+, while Windows implementations are not officially supported [85]. Nonetheless, implementations have been developed for the iOS and Android mobile platforms. Much of the Torch documentation, along with implementations of various algorithms, is community-driven and hosted on GitHub. Despite its GPU-centric implementation, a recent benchmarking study [86] demonstrated that Torch does not meaningfully surpass the competition (CNTK, MXNet, Caffe) in single- or multi-GPU computation, but it remains well suited to certain types of networks.
4.2.5 Caffe and Caffe2
Caffe was designed by Berkeley AI Research (BAIR) and the Berkeley Vision and Learning Center (BVLC) at UC Berkeley to provide an expressive architecture and GPU support for deep learning, primarily image classification, originating in 2014 [87] [88]. Caffe is a pure C++ and CUDA library that can also be operated through command-line, Python, and MATLAB interfaces. It runs on bare CUDA devices and mobile platforms, and has additionally been extended for use in the Apache Hadoop ecosystem with Spark, among others. Caffe2, as part of Facebook Research and Facebook Open Source, builds upon the original Caffe project, implementing an additional Python API and supporting Mac OS X, Windows, Linux, iOS, Android, and other build platforms [89].
4.2.6 Keras
Though not a deep learning framework on its own, Keras provides a high-level API that integrates with TensorFlow, Theano, and CNTK. The strength of Keras is the ability to rapidly prototype a deep learning design through a user-friendly, modular, and extensible interface. Keras operates on CPUs and GPUs, supports CNNs and RNNs, is developer-friendly, and can integrate with other common machine learning packages, such as scikit-learn for Python [90]. In addition, it has been widely adopted by researchers and industry groups in recent years.
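The modular, layer-stacking style that makes Keras convenient for rapid prototyping can be sketched in plain Python (a hypothetical toy example with scalar "layers"; the class names mimic Keras conventions but this is not the real Keras API):

```python
# Hypothetical sketch of Keras-style modularity: a model is an
# ordered stack of interchangeable layers, each a simple callable.
# Toy scalar version for illustration; not the real Keras API.

class Dense:
    """A toy fully connected layer: y = activation(weight * x + bias)."""
    def __init__(self, weight, bias, activation=lambda v: v):
        self.weight, self.bias = weight, bias
        self.activation = activation

    def __call__(self, x):
        return self.activation(self.weight * x + self.bias)

class Sequential:
    """Feeds each layer's output into the next layer."""
    def __init__(self, layers):
        self.layers = layers

    def predict(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

relu = lambda v: max(0.0, v)
model = Sequential([Dense(2.0, 1.0, relu), Dense(0.5, 0.0)])
print(model.predict(3.0))  # relu(2*3 + 1) = 7.0, then 0.5*7 = 3.5
```

Because each layer exposes the same call interface, layers can be added, removed, or swapped without touching the rest of the model, which is the essence of Keras's extensibility.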
4.2.7 MXNet
Apache MXNet supports Python, R, Scala, Julia, C++, and Perl APIs, as well as the new Gluon API, and supports both imperative and symbolic programming. The project began around mid-2015, with version 1.0.0 released in December 2017. MXNet was intended to be scalable and was designed from a systems perspective to reduce data loading and I/O complexity [91]. It has proven highly efficient primarily in single- and multi-GPU implementations, while its CPU implementations typically lag behind [92].
4.2.8 Microsoft Cognitive Toolkit (CNTK)
The Microsoft Cognitive Toolkit, otherwise known as CNTK, began development in mid-2015. It can be included as a library in Python, C#, and C++ programs, or used standalone with its own scripting language, BrainScript. It can also run model evaluation functions from Java code, and it supports ONNX, an open-source neural network model format that allows models to be transferred between deep learning frameworks (Caffe2, PyTorch, MXNet) [93]. Conceptually, CNTK is designed to be easy to use and production-ready for large-scale data, and it is supported on Linux and Windows. In CNTK, neural networks are described as a series of computational steps in a directed graph, and both neural network building blocks and deeper libraries are provided. CNTK has emerged as a computationally powerful tool for machine learning, with performance similar to that of platforms that have seen longer development and more widespread use [92].
4.2.9 Performance Evaluation Metrics
To evaluate the effectiveness of our proposed solution, we use the mean absolute error (MAE), root-mean-square error (RMSE), and mean relative error (MRE) as the performance evaluation metrics. In the literature, RMSE and mean absolute percentage error (MAPE) are the most commonly used performance measures. RMSE variants such as normalized RMSE (NRMSE) and RMSE with cost (RMSEC) have also been used. Measures such as mean absolute relative error (MARE), equal coefficient (EC), R-squared, mean square error (MSE), mean relative error (MRE), accuracy, and variance absolute percentage error (VAPE) are adopted in significant numbers, but not nearly as often as the measures mentioned above. Performance measures that are unconventional in the field of traffic forecasting, such as precision, recall, F1 score, false-positive rate, and sensitivity, together with various custom measures proposed by authors, make up a significant share of the total. This heterogeneity makes it harder to compare models across papers. Zilu Liang and Yasushi Wakahara suggested that the symmetric MAPE (SMAPE) be used instead of the widely adopted MAPE, since MAPE yields a biased evaluation when the real value is close to zero [94].
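As a concrete reference, the most common of these measures can be implemented in a few lines (a minimal sketch; the SMAPE definition below follows one common convention, and exact formula conventions vary between papers):

```python
import math

# Common forecasting error measures discussed above. Inputs are
# equal-length sequences of actual and predicted values.

def mae(actual, predicted):
    # Mean absolute error.
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # Root-mean-square error.
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    # Mean absolute percentage error; biased/undefined near zero actuals.
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def smape(actual, predicted):
    # Symmetric MAPE: the denominator averages |actual| and |predicted|,
    # mitigating the near-zero bias of MAPE noted in [94].
    return 100.0 * sum(abs(a - p) / ((abs(a) + abs(p)) / 2.0)
                       for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
print(round(mae(actual, predicted), 2))   # 16.67
print(round(rmse(actual, predicted), 2))  # 19.15
print(round(mape(actual, predicted), 2))  # 8.33
```

Note how RMSE (19.15) exceeds MAE (16.67) on the same errors: squaring weights the single large error (30) more heavily, which is why the two measures can rank models differently.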