Implementing a Distributed Deep Learning Network over Spark

Deep learning is becoming an important AI paradigm for pattern recognition, image/video processing, and fraud-detection applications in finance. The computational complexity of a deep learning network dictates the need for a distributed realization. The big data lab at Impetus, led by Dr. Vijay Srinivas Agneeswaran, has built a first prototype of a distributed deep learning network over Apache Spark. The team has parallelized the training phase of the network and consequently reduced the training time.

Geoffrey Hinton and his colleagues presented a fast learning algorithm for deep belief networks [1]. With the advent of GPUs and the widespread availability of computing power, this paper proved seminal: a number of applications have since been realized over deep learning in fields such as credit card fraud detection and multi-modal information processing, in addition to speech recognition and image processing, which deep learning has already transformed [2]. Other efforts to realize distributed deep learning networks include Google's work led by Jeffrey Dean [3], Sparkling Water from 0xdata, and DeepLearning4j.

Spark is the next-generation Hadoop framework from the UC Berkeley and Databricks teams; even the Hadoop vendors have started bundling and distributing Spark with their Hadoop releases. The Impetus researchers have implemented a stack of Restricted Boltzmann Machines (RBMs) as a deep belief network, similar to [4]. The architecture of their deep learning network over Spark is illustrated in the original post (linked below). Every node in the cluster runs a copy of the whole deep learning network, and all copies start from the exact same initial network. However, since each node sees only part of the training data, the copies diverge, reflecting what each has learnt from its share of the data. The nodes exchange their results through a publish-subscribe system that the researchers have built over Spark: each node runs the network asynchronously over its local training data and occasionally synchronizes with the other nodes. Eventually, this process ensures the equivalent of every node having learnt from all of the training data.
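The article itself includes no code, but the data-parallel scheme it describes can be sketched roughly as follows. This is a minimal illustration in PySpark and NumPy, not the Impetus implementation: their publish-subscribe synchronization layer is not public, so periodic driver-side parameter averaging stands in for the occasional synchronization between nodes, RBM biases are omitted for brevity, and all names (cd1_epoch, ROUNDS, and so on) are hypothetical.

```python
# Minimal data-parallel RBM training sketch over Spark (illustrative only).
import numpy as np
from pyspark.sql import SparkSession

N_VISIBLE, N_HIDDEN, LR, ROUNDS = 784, 256, 0.05, 10

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_epoch(partition, W):
    """One CD-1 pass over a partition's samples; returns the locally updated weights."""
    W = W.copy()
    for v0 in partition:                      # v0: binary visible vector
        h0 = sigmoid(v0 @ W)                  # up pass
        h0_s = (np.random.rand(*h0.shape) < h0).astype(float)
        v1 = sigmoid(h0_s @ W.T)              # down pass (reconstruction)
        h1 = sigmoid(v1 @ W)
        W += LR * (np.outer(v0, h0) - np.outer(v1, h1))  # CD-1 update (biases omitted)
    return [W]

spark = SparkSession.builder.appName("rbm-sketch").getOrCreate()
sc = spark.sparkContext

# Toy data: random binary vectors standing in for real training samples.
data = sc.parallelize(
    [np.random.randint(0, 2, N_VISIBLE).astype(float) for _ in range(1000)], 4
).cache()

W = np.zeros((N_VISIBLE, N_HIDDEN))
for _ in range(ROUNDS):
    bW = sc.broadcast(W)                      # ship the current weights to every node
    # Each partition trains its own replica of the network on local data ...
    local = data.mapPartitions(lambda p: cd1_epoch(p, bW.value))
    # ... then the diverged replicas are reconciled; averaging approximates
    # the occasional pub-sub synchronization described in the article.
    W = local.reduce(np.add) / data.getNumPartitions()
    bW.unpersist()

spark.stop()
```

Stacking would repeat this loop per layer, feeding each trained RBM's hidden activations to the next layer as input, following the greedy layer-wise scheme of [1].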

This is the first attempt at realizing a distributed deep learning network directly over Spark. The researchers are working on a few applications, including image search and semantic compositionality (to provide a natural language interface to relational queries), to showcase the power of the deep learning platform.

For more details, please visit http://www.datasciencecentral.com/profiles/blogs/implementing-a-distributed-deep-learning-network-over-spark.

References

[1] Hinton, Geoffrey, Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18.7 (2006): 1527-1554.

[2] Deng, Li, and Dong Yu. "Deep Learning: Methods and Applications." Foundations and Trends® in Signal Processing 7.3-4 (2014): 197-387.

[3] Dean, Jeffrey, et al. "Large scale distributed deep networks." Advances in Neural Information Processing Systems. 2012.

[4] Le Roux, Nicolas, and Yoshua Bengio. "Representational power of restricted Boltzmann machines and deep belief networks." Neural Computation 20.6 (2008): 1631-1649.
