Apache SINGA is an Apache top-level project for developing an open source machine learning library. It provides a flexible architecture for scalable distributed training, is extensible to run over a wide range of hardware, and has a focus on health-care applications.
|Developer(s)||Apache Software Foundation|
|Initial release||October 8, 2015|
|Stable release||2.0.0 / April 20, 2019|
|Written in||C++, Python, Java|
|Operating system||Linux, macOS, Windows|
|License||Apache License 2.0|
The SINGA project was initiated by the DB System Group at the National University of Singapore in 2014, in collaboration with the database group of Zhejiang University, in order to support complex analytics at scale and to make database systems more intelligent and autonomic. It focused on distributed deep learning by partitioning the model and data onto nodes in a cluster and parallelizing the training. The prototype was accepted by the Apache Incubator in March 2015 and graduated as a top-level project in October 2019. Seven versions have been released, as shown in the following table. Since V1.0, SINGA has been general enough to support traditional machine learning models such as logistic regression. Companies such as NetEase, yzBigData, Shentilium and others use SINGA for applications including healthcare and finance.
|Version||Original release date||Latest version||Release date|
|Current stable version: 2.0.0||2019-04-20||2.0.0||2019-04-20|
|Older version, yet still supported: 1.2.0||2018-06-06||1.2.0||2018-06-06|
|Older version, yet still supported: 1.1.0||2017-02-12||1.1.0||2017-02-12|
|Older version, yet still supported: 1.0.0||2016-09-08||1.0.0||2016-09-08|
|Old version, no longer supported: 0.3.0||2016-04-20||0.3.0||2016-04-20|
|Old version, no longer supported: 0.2.0||2016-01-14||0.2.0||2016-01-14|
|Old version, no longer supported: 0.1.0||2015-10-08||0.1.0||2015-10-08|
SINGA's software stack includes three major components: core, IO and model. The following figure illustrates these components together with the hardware. The core component provides memory management and tensor operations; the IO component has classes for reading data from, and writing data to, disk and network; the model component provides data structures and algorithms for machine learning models, e.g., layers for neural network models, and optimizers, initializers, metrics and loss functions for general machine learning models.
Benchmark for distributed training
Workload: we use a deep convolutional neural network, ResNet-50, as the application. ResNet-50 has 50 convolution layers for image classification. It requires 3.8 GFLOPs to pass a single image (of size 224x224) through the network.
Hardware: we use p2.8xlarge instances from AWS, each of which has 8 Nvidia Tesla K80 GPUs, 96 GB GPU memory in total, 32 vCPUs, 488 GB main memory, and 10 Gbit/s network bandwidth.
Metric: we measure the time per iteration for different numbers of workers to evaluate the scalability of SINGA. The batch size is fixed at 32 per GPU, and a synchronous training scheme is applied. As a result, the effective batch size is $32N$, where $N$ is the number of GPUs. We compare with a popular open source system which uses the parameter server topology. The first GPU is selected as the server. In the following figure, bars are for the throughput and lines are for the communication cost.
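The arithmetic behind the benchmark setup can be illustrated with a minimal Python sketch. It is not SINGA code and does not use SINGA's API. All function names are hypothetical, and a one-parameter linear model stands in for ResNet-50 as an assumption made purely for illustration. The sketch computes the effective batch size and the forward-pass cost per iteration from the figures quoted above, and checks that averaging the gradients of N equally sized per-GPU batches gives the same gradient as one batch of 32N samples, which is the property that synchronous data-parallel training relies on.

```python
import random

PER_GPU_BATCH = 32        # per-GPU batch size used in the benchmark
GFLOPS_PER_IMAGE = 3.8    # forward-pass cost of ResNet-50 quoted in the workload


def effective_batch_size(num_gpus, per_gpu_batch=PER_GPU_BATCH):
    """Samples consumed per synchronous iteration across all GPUs."""
    return per_gpu_batch * num_gpus


def forward_gflops_per_iteration(num_gpus):
    """Forward-pass GFLOPs per iteration, summed over all GPUs."""
    return GFLOPS_PER_IMAGE * effective_batch_size(num_gpus)


def gradient(w, batch):
    """Gradient of the mean squared error of the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def synchronous_gradient(w, shards):
    """Each simulated worker computes a local gradient; results are averaged."""
    local = [gradient(w, shard) for shard in shards]
    return sum(local) / len(local)


def make_shards(num_gpus, seed=0):
    """Build a toy dataset of 32 * num_gpus samples and split it per worker."""
    rng = random.Random(seed)
    total = effective_batch_size(num_gpus)
    data = [(x, 3.0 * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(total))]
    shards = [data[i * PER_GPU_BATCH:(i + 1) * PER_GPU_BATCH]
              for i in range(num_gpus)]
    return data, shards


if __name__ == "__main__":
    n = 8  # one p2.8xlarge instance
    data, shards = make_shards(n)
    print("effective batch size:", effective_batch_size(n))
    print("forward GFLOPs per iteration:", forward_gflops_per_iteration(n))
    # The averaged per-worker gradient matches the full-batch gradient.
    diff = abs(synchronous_gradient(0.5, shards) - gradient(0.5, data))
    print("gradients agree:", diff < 1e-9)
```

The equivalence holds only because every worker processes the same number of samples. With unequal shards, the average would have to be weighted by shard size.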
- Core classes
- Model classes
- Linear Regression
- Multi-layer Perceptron
- Convolutional Neural Network (CNN)
- Recurrent Neural Networks (RNN)
- Restricted Boltzmann Machine (RBM)
There is also an online course about SINGA.
- List of Apache Software Foundation projects
- Comparison of deep learning software
- Wei, Wang; Meihui, Zhang; Gang, Chen; H.V., Jagadish; Beng Chin, Ooi; Kian-Lee, Tan; Sheng, Wang (June 2016). "Database Meets Deep Learning: Challenges and Opportunities". SIGMOD Record. 45 (2): 17–22. arXiv:1906.08986. doi:10.1145/3003665.3003669.
- Ooi, Beng Chin; Tan, Kian-Lee; Sheng, Wang; Wang, Wei; Cai, Qingchao; Chen, Gang; Gao, Jinyang; Luo, Zhaojing; Tung, Anthony K. H.; Wang, Yuan; Xie, Zhongle; Zhang, Meihui; Zheng, Kaiping (2015). "SINGA: A distributed deep learning platform" (PDF). ACM Multimedia. doi:10.1145/2733373.2807410. Retrieved 8 September 2016.
- Wei, Wang; Chen, Gang; Anh Dinh, Tien Tuan; Gao, Jinyang; Ooi, Beng Chin; Tan, Kian-Lee; Sheng, Wang (2015). "SINGA: putting deep learning in the hands of multimedia users" (PDF). ACM Multimedia. doi:10.1145/2733373.2806232. Retrieved 8 September 2016.
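- NetEase (网易). "网易携手Apache SINGA角逐人工智能新战场_网易科技" [NetEase teams up with Apache SINGA to compete on the new battleground of artificial intelligence, NetEase Tech]. tech.163.com. Retrieved 2017-06-03.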
- "New app allows pre-diabetics to use photos of their meal to check if it is healthy". www.straitstimes.com. Retrieved 6 April 2019.
- Wang, Wei; Gao, Jinyang; Zhang, Meihui; Sheng, Wang; Chen, Gang; Khim Ng, Teck; Ooi, Beng Chin; Shao, Jie; Reyad, Moaz (2018). "Rafiki: Machine Learning as an Analytics Service System" (PDF). PVLDB 12(2). doi:10.14778/3282495.3282499. Retrieved 9 January 2019.