Data Theory at UCLA
the mathematical and statistical foundations of data science
a collaboration between the Departments of Statistics and Data Science and Mathematics
Promoting research into fundamental ideas of Statistics and Mathematics
Preparing students for leadership roles in data science in industry and academia
Why Data Theory is important
Over the last three decades, the interdisciplinary field of Data Science has emerged. The field employs advanced mathematics, statistics, and computer science, in concert with the remarkable growth in the capacity of computers, to achieve insights, to make predictions, and to make decisions in ways that would have been science fiction in 1985.
Sources and types of data are increasing rapidly, driven by the ubiquitous use of computing in science and commerce, as well as the routine use of automatic measuring and electronic recording devices. This data revolution is transforming the sciences, engineering, and even the humanities.
Examples include gene expression studies in biology; the ethics of data use and storage; speech, image, and other pattern recognition; banking, finance, investments, and insurance of all types, along with their government regulation; digital health records; epidemiological prediction; surveillance; self-driving cars; advertising; Internet searches; logistics; robots; weather and climate; earthquakes; and predictive policing and legal decision-making.
The world is awash in a sea of data, creating an urgent demand for scientists who are able to understand data collection, storage, integration, and analysis. Using tools based in mathematics, especially the theory of probability, Statistics has become the language of data. However, the deluge of new types and magnitudes of data has outstripped traditional statistical methods and has led to the rapid development of methods outside of Statistics and Applied Mathematics, including what has become known as Machine Learning.
Most Data Science programs focus on the methods of data modeling, analysis, and engineering. What is missing is a rigorous understanding of the statistical and mathematical foundational concepts that underlie these methods. Without these, data scientists lack the understanding to deal with the plethora of problems they will face.
How can I get involved?
For undergraduates, the Data Theory Major is a program at UCLA that produces students well equipped to understand current data science and develop the data science of the future.