Pourkamali Anaraki, Farhad. University of Colorado at Boulder (2017). "Randomized Algorithms for Large-Scale Data Analysis", advisor: Becker, Stephen
Massive high-dimensional data sets are ubiquitous in all scientific disciplines.
Extracting meaningful information from these data sets will bring future advances in science and engineering. However, the complexity and high dimensionality of modern data sets pose unique computational and statistical challenges. The computational requirements of analyzing large-scale data exceed the capacity of traditional data analysis tools. The challenges surrounding large high-dimensional data are felt not just in processing power, but also in memory access, storage requirements, and communication costs. For example, modern data sets are often too large to fit into the main memory of a single workstation, so data points must be processed sequentially, without the chance to store the full data set. Therefore, there is an urgent need to develop scalable learning tools and efficient optimization algorithms for today's high-dimensional data regimes.

