Computational Statistics

in the

Data Sciences Program

 

Colloquium Series

Seminars in Computational Statistics

Courses in Computational Statistics

Other Courses in Statistics

PhD Program in Computational Statistics

Faculty in Statistics

Department of Computational and Data Sciences

College of Science

Department of Statistics

School of Information Technology

George Mason University

 

Statistics is the science of analyzing data; that is, extracting knowledge from data and making decisions based on that knowledge. Statistics also accommodates randomness within data, and reflects that randomness in statements of confidence levels for decision rules.

Computational Statistics is the area of specialization within statistics that covers statistical visualization and other computationally intensive methods. It is built on the mathematical theory and methods of statistics, and includes visualization, statistical computing, and Monte Carlo methods. The emphasis in computational statistics is often on exploratory methods.

Research in computational statistics involves the development of visualization and computationally intensive methods for mining large, nonhomogeneous, multi-dimensional datasets to discover knowledge in the data. As in all areas of statistics, probability models are important, and results are qualified by statements of confidence or of probability. An important activity in computational statistics is model building and evaluation.

Examples of research areas in computational statistics:

  • Techniques for discovering structure in data. These are usually exploratory or visual, and may involve such things as density estimation, clustering, or classification. In most cases, the emphasis is on high-dimensional datasets.
  • Statistical learning.
  • Methods of analysis of extremely large datasets (large number of observations or large number of dimensions).
  • Computationally intensive methods of analysis (Monte Carlo methods or resampling methods).
  • Simulation methods.
  • Methods for statistical modeling. These may be classical statistical models, models based on differential equations, especially SDEs, or Bayesian hierarchical models.
  • Numerical methods for statistical analysis (statistical computing).
  • Methods for statistical problems that have a major "computer science" aspect (record matching, for example).

There are obvious overlaps among these areas. The methods of computational statistics are built on the traditional areas of statistics, such as mathematical statistics, linear models, survey sampling, time series, and so on.
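
As one small illustration of the resampling methods listed above, the following Python sketch computes a percentile bootstrap confidence interval for a sample mean. It is only a minimal example under stated assumptions: the function name `bootstrap_mean_ci`, its parameters, and the simulated sample are hypothetical choices made for illustration, not material from the program's courses or research.

```python
import random
import statistics


def bootstrap_mean_ci(data, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Take the empirical quantiles that bracket the central `confidence` mass.
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return lower, upper


# Simulated sample (an assumption for the example, not real data).
sample_rng = random.Random(1)
sample = [sample_rng.gauss(10.0, 2.0) for _ in range(200)]

low, high = bootstrap_mean_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.3f}")
print(f"95% bootstrap interval = ({low:.3f}, {high:.3f})")
```

The interval is obtained purely by repeated computation on the observed data rather than from a closed-form formula, which is the sense in which such methods are computationally intensive, and the result is still qualified by a stated confidence level.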

The computational statistics area of the Data Sciences Program also places a strong emphasis on applications in such diverse fields as bioinformatics, climatology, intrusion detection, and finance.

For more information, contact James Gentle.