Alsaby, Faisal Abdullah, (The George Washington University), “Investigation and Development of a Novel Clustering Algorithm for Big Data Applications” (2016)

Inside Signal Processing Newsletter Home Page

News and Resources for Members of the IEEE Signal Processing Society

Alsaby, Faisal Abdullah, (The George Washington University), “Investigation and Development of a Novel Clustering Algorithm for Big Data Applications” (2016) Advisor: Berkovich, Simon Y.

Advances in digital technology, including sensors, communications, cloud computing and storage, have allowed people to generate large volumes of data, which in turn strain current computational capabilities. This data can take any format, such as text, geometry, images, video, sound, or a combination of these. As many organizations have become data-driven, developing supervised and unsupervised learning algorithms that scale efficiently to big data has attracted researchers from application domains such as computer vision, data mining, computational biology and the social sciences. One feasible way to deal with such massive data is to group it into categories, or clusters, in which objects within a cluster are similar to one another and dissimilar to objects in other clusters. This indispensable technique can often help to find models, discover useful knowledge and insights, and uncover hidden features or characteristics that naturally divide the cases. In this work, the authors extend previous efforts to develop a hash-based clustering algorithm that employs a double Golay encoding technique.

Initially, the authors study the structures and properties of error-correction codes and introduce an improved version of a hash-based clustering algorithm that is scalable, efficient, reliable and much better suited to big data applications. In particular, their approach uses a single encoding technique; it therefore increases the quality of the clustering method and reduces the overall computational cost. This clustering algorithm can be effectively applied to various computational intelligence problems.
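To make the idea of using an error-correction code as a clustering hash more concrete, the following is a minimal Python sketch. It is not the algorithm from the thesis, whose details the abstract does not give. It assumes 23-bit binary feature vectors and the perfect binary Golay (23,12) code with generator polynomial 0xC75, and it simply buckets each vector under its nearest codeword, so vectors that differ from a common codeword in at most three bits share a bucket. All function names (`syndrome`, `encode`, `nearest_codeword`, `cluster`) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Illustrative sketch only; names and vector layout are assumptions,
# not taken from the thesis.
GEN = 0xC75   # x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1
N_BITS = 23   # code length of the perfect binary Golay code


def syndrome(word):
    """Remainder of the 23-bit polynomial `word` modulo GEN (11 bits)."""
    for shift in range(N_BITS - 1, 10, -1):
        if (word >> shift) & 1:
            word ^= GEN << (shift - 11)
    return word


def encode(message):
    """Non-systematic encoding: multiply a 12-bit message by GEN over GF(2)."""
    codeword = 0
    for i in range(12):
        if (message >> i) & 1:
            codeword ^= GEN << i
    return codeword


def build_error_table():
    """Map each syndrome to its unique error pattern of weight 0 to 3."""
    table = {}
    for weight in range(4):
        for positions in combinations(range(N_BITS), weight):
            error = 0
            for p in positions:
                error |= 1 << p
            table[syndrome(error)] = error
    return table


ERROR_TABLE = build_error_table()


def nearest_codeword(word):
    """Return the codeword within Hamming distance 3 of `word`."""
    return word ^ ERROR_TABLE[syndrome(word)]


def cluster(vectors):
    """Group 23-bit vectors into buckets keyed by their nearest codeword."""
    buckets = defaultdict(list)
    for v in vectors:
        buckets[nearest_codeword(v)].append(v)
    return dict(buckets)


if __name__ == "__main__":
    center = encode(0b101100111000)
    noisy = [center, center ^ 0b1, center ^ (1 << 5) ^ (1 << 20)]
    for key, members in cluster(noisy).items():
        print(f"{key:023b}: {len(members)} vectors")
```

Because the code is perfect, every 23-bit vector lies within distance 3 of exactly one codeword, so each lookup costs one syndrome computation and one table access regardless of the dataset size. Vectors that are close to each other but fall on opposite sides of a bucket boundary end up in different buckets, which is one reason a practical scheme needs more than this single-key lookup.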

Furthermore, the authors provide a fast and noise-robust pattern prediction and classification algorithm. This method aims to classify an input pattern into a specific category or class by using characteristics derived from the Golay clustering scheme. Moreover, the authors investigate and discuss the factors that could increase the noise and illustrate how they could be controlled. Empirical evidence is shown to support their theoretical arguments. A number of similar algorithms are also reviewed and evaluated for their relative performance.

In addition, the authors present a clustering ensembles approach that aims to improve clustering quality. This approach generates a set of clustering schemes from the same dataset and combines them into a final clustering. Hence, this technique ensures that the produced clustering is a consensus of multiple clustering schemes.
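One common way to combine several clusterings into a consensus is a co-association (evidence accumulation) scheme, sketched below in Python. The abstract does not say which combination rule the thesis uses, so this is an assumed stand-in: two points are linked when they share a cluster in more than a chosen fraction of the input clusterings, and the connected groups of linked points form the final clusters. The function name `consensus` and the `threshold` parameter are hypothetical.

```python
from itertools import combinations


def consensus(labelings, threshold=0.5):
    """Combine several clusterings of the same points into one.

    labelings: list of label lists, one per clustering, all the same length.
    threshold: fraction of clusterings that must agree for two points
               to be linked (an assumed, illustrative rule).
    Returns consensus labels numbered in order of first appearance.
    """
    n = len(labelings[0])
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        agree = sum(1 for labels in labelings if labels[i] == labels[j])
        if agree / len(labelings) > threshold:
            parent[find(i)] = find(j)  # link the two groups

    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


if __name__ == "__main__":
    runs = [
        [0, 0, 1, 1, 2],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1],
    ]
    print(consensus(runs))
```

Label values are not comparable across the input clusterings, which is why the sketch compares pairs of points within each clustering rather than the labels themselves. The pairwise loop is quadratic in the number of points, so it serves only as an illustration for small datasets.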
