Semester 1, 2023 Online | |
Units : | 1 |
School or Department : | School of Mathematics, Physics & Computing |
Grading basis : | Graded |
Course fee schedule : | /current-students/administration/fees/fee-schedules |
Staffing
Course Coordinator:
Requisites
Pre-requisite or Co-requisite: STA8170 or STA6200 or STA2300 or STA1003
Enrolment is not permitted in STA6100 if STA3200 has been previously completed
Overview
Statistics is concerned with the process of making sense of data. It is the study of uncertainty and of decision making in the face of uncertainty. As our ability to collect, accumulate and access data increases, so do the Volume (amount), Variety (of types, sources and resolutions of data), Velocity (speed of data generation and handling) and Veracity (amount of noise and processing errors) of the data sets we wish to analyse and extract valuable information from. Variety creates wide or high-dimensional data sets that may require specific analytic approaches in order to distinguish useful patterns or develop predictive models for decision making.
This course covers some of the statistical concepts and methodologies appropriate for the analysis of large and/or high-dimensional data sets. Students will learn the mathematical foundations of a number of statistical methods, the benefits and limitations of each method, how to correctly apply these methods using statistical software, and how to assess the effectiveness of given analyses for given data sets. Students will also learn how to perform statistical analyses in the statistical software R. This will require students to master the writing of R code.
Course learning outcomes
On successful completion of this course students should be able to:
- Demonstrate advanced and integrated understanding of high-dimensional data sets.
- Apply the knowledge of high-dimensional data sets in the evaluation and choice of appropriate statistical methods.
- Apply the knowledge of a range of computational methods and diagnostic techniques to test hypotheses and evaluate and interpret the output correctly and in context.
- Critically analyse the capabilities of R software and implement it as a statistical package.
- Independently develop an appropriate strategy for the analysis of a complex high-dimensional data set and effectively communicate the results with justification of statistical decisions made throughout the process.
Topics
Topic | Description | Weighting (%) |
---|---|---|
1. | Review of matrix algebra, linear regression and confidence intervals. Introduction to the features of high-dimensional data, graphical summaries and R programming. | 20.00 |
2. | Multivariate Normality and Hypothesis Testing | 20.00 |
3. | Multidimensional Scaling and Cluster Analysis | 20.00 |
4. | Discriminant Function Analysis and Canonical Correlation Analysis | 20.00 |
5. | Principal Components Analysis and Factor Analysis | 20.00 |
Text and materials required to be purchased or accessed
Student workload expectations
To do well in this subject, students are expected to commit approximately 10 hours per week including class contact hours, independent study, and all assessment tasks. If you are undertaking additional activities, which may include placements and residential schools, the weekly workload hours may vary.
Assessment details
Description | Group Assessment | Weighting (%) | Course learning outcomes |
---|---|---|---|
Quiz | No | 10 | 1 |
Problem Solving 1 | No | 20 | 1,2,5 |
Problem Solving 2 | No | 30 | 1,2,4,5 |
Report | No | 40 | 1,2,3,4,5 |