The current and official versions of the course specifications are available on the web at .
Please consult the web for updates that may occur during the year.

STA6100 Multivariate Analysis for High-Dimensional Data

Semester 1, 2023 Online
Units : 1
School or Department : School of Mathematics, Physics & Computing
Grading basis : Graded
Course fee schedule : /current-students/administration/fees/fee-schedules

Staffing

Course Coordinator:

Requisites

Pre-requisite or Co-requisite: STA8170 or STA6200 or STA2300 or STA1003
Enrolment is not permitted in STA6100 if STA3200 has been previously completed

Overview

Statistics is concerned with the process of making sense out of data. It is the study of uncertainty and of decision making in the face of uncertainty. As our ability to collect, accumulate and access data increases, so do the Volume (amount), Variety (types, sources and resolutions of data), Velocity (speed of data generation and handling) and Veracity (amount of noise and processing errors) of the data sets we wish to analyse and extract valuable information from. Variety creates wide, or high-dimensional, data sets that may require specific analytic approaches in order to distinguish useful patterns or develop predictive models for decision making.

This course covers some of the statistical concepts and methodologies appropriate for the analysis of large and/or high-dimensional data sets. Students will learn the mathematical foundations of a number of statistical methods, the benefits and limitations of each method, how to correctly apply these methods using statistical software and how to assess the effectiveness of given analyses for given data sets. Students will also learn how to perform statistical analyses in the statistical software R, which will require them to master the writing of R code.

Course learning outcomes

On successful completion of this course students should be able to:

  1. Demonstrate advanced and integrated understanding of high-dimensional data sets.
  2. Apply the knowledge of high-dimensional data sets in the evaluation and choice of appropriate statistical methods.
  3. Apply the knowledge of a range of computational methods and diagnostic techniques to test hypotheses and evaluate and interpret the output correctly and in context.
  4. Critically analyse the capabilities of R software and implement it as a statistical package.
  5. Independently develop an appropriate strategy for the analysis of a complex high-dimensional data set and effectively communicate the results with justification of statistical decisions made throughout the process.

Topics

Description Weighting (%)
1. Review of matrix algebra, linear regression and confidence intervals. Introduction to the features of high-dimensional data, graphical summaries and R programming. 20.00
2. Multivariate Normality and Hypothesis Testing 20.00
3. Multidimensional Scaling and Cluster Analysis 20.00
4. Discriminant Function Analysis and Canonical Correlation Analysis 20.00
5. Principal Components Analysis and Factor Analysis 20.00

Text and materials required to be purchased or accessed

Manly, BFJ (2016), Multivariate Statistical Methods: A Primer, 4th edn, Chapman & Hall/CRC, London.

Student workload expectations

To do well in this subject, students are expected to commit approximately 10 hours per week including class contact hours, independent study, and all assessment tasks. If you are undertaking additional activities, which may include placements and residential schools, the weekly workload hours may vary.

Assessment details

Approach     Type     Description        Group Assessment  Weighting (%)  Course learning outcomes
Assignments  Written  Quiz               No                10             1
Assignments  Written  Problem Solving 1  No                20             1,2,5
Assignments  Written  Problem Solving 2  No                30             1,2,4,5
Assignments  Written  Report             No                40             1,2,3,4,5
Date printed 9 February 2024