Short Courses

Over the course of the year Stats Central teaches short courses aimed at researchers across all disciplines. Offerings in 2024 include:

  • Sample Size and Power Calculations: February 7-8, 9.30am to 2.30pm each day
  • Introductory Statistics for Researchers using SPSS: May 6-8
  • Introduction to R: May 9
  • Introductory Statistics for Researchers using R: May 14-16
  • Basics of Regression in R: May 21-23
  • Mixed Models using R: July 1-2
  • Intermediate R: July 24-25
  • Introduction to R: August 5
  • Introductory Statistics for Researchers using SPSS: August 6-8
  • Multivariate Statistics Using R: August 8
  • Introductory Statistics for Researchers using R: August 13-15

Delivery mode: Hybrid (attending in person is highly recommended)

2024 - Intermediate R

24-25 July

Course Overview

In this course, we teach students techniques for scaling up their analyses. How do you get R to repeat tasks in an efficient way? This general skill is needed in diverse contexts, such as processing lots of files (e.g. from imaging or surveys), working with large datasets, or running simulations (statistical or individual-based). In these circumstances, you'll often find yourself having to repeat the same thing over and over, and it's good to learn how to get R to do this in an organised and efficient manner. We'll introduce a series of powerful tools, all of which take a bit of instruction and practice to master.

Course Outline

This course will cover topics including:

  • Advanced data processing with tidyverse, including dplyr, joins, nests
  • Functional programming with purrr
  • For loops
  • Writing your own functions

Presenters and Expertise: A/Professor Daniel Falster and A/Professor Will Cornwell, School of Biological, Earth & Environmental Sciences (BEES)

Course Requirements: Own laptop with R (4.2 or newer) and RStudio installed.

Course Prerequisites: Intermediate R users (have been using R for at least 6 months, comfortable loading data, using basic ggplot and dplyr)

Dates: Wednesday 24 and Thursday 25 July 2024

Duration: 9.00am - 4.00pm, each day

Location: This course will be delivered in person only!

You will receive a certificate of completion for the course.

2024 - Introduction to R

5 August

Course Overview

R is widely used and extremely powerful statistical software. This course assumes that you have never used R before. You will learn how to obtain and install R, which is open-source software, and RStudio, which is a versatile, user-friendly interface for using R. It is very useful to do this course before our introductory statistics course, Introductory Statistics for Researchers.

This course will cover some basic features of R and lay the groundwork for you to improve your R skills independently. The course is self-paced and focused on developing practical skills.

Course Outline

This course will cover topics including:

  • Basics of interacting with R – calculations, saving variables so you can reuse them, data types and structures, organising R code in scripts
  • Tidyverse – a basic introduction to tidy R code
  • Data – reading in and organising data (from spreadsheets) with dplyr
  • Plotting – make beautiful figures with ggplot

Course Requirements: You will need a computer with administrator access (to install R and RStudio software before attending the course).

Date: Monday 5 August 2024

Duration: 9.00am - 5.00pm

Location: This workshop will be delivered in person and online

You will receive a certificate of completion for the course.

2024 - Introductory Statistics for Researchers Using SPSS

6 - 8 August

Course Overview

This workshop is designed as an introduction to statistical analysis for researchers. There is emphasis on understanding the concepts of statistical procedures (with a minimum of mathematics, although some will be discussed) and on interpreting computer output. It is designed to help you, the researcher. It is helpful if you have done an undergraduate statistics subject, although this workshop can serve as a first introduction or a refresher. The theory behind the statistical procedures will, in general, not be discussed.

A range of statistical analyses will be discussed in the workshop, as described in the outline below. We will talk through examples of all analysis types and will demonstrate how to carry them out in SPSS. Equal emphasis will also be put on interpreting the output of these analyses. There will be plenty of practical work.

You will be expected to watch this seminar on study design and statistical principles (samples and populations, confounding, statistical inference) ahead of the workshop.

Course outline

Revision

  • SPSS basics
  • Descriptive statistics – mean, mode, standard deviation, inter-quartile range, correlation
  • Data visualisation - boxplot, histogram, scatterplot, bar graph

Introduction to statistical inference

  • Uncertainty, confidence intervals, p-values, significance
  • T-test (comparing two groups)
  • Checking model assumptions

Analysis of continuous responses with linear models

  • Simple linear regression
  • ANOVA
  • Multiple regression, ANCOVA

Analysis of categorical responses

  • Relative risk, odds ratios
  • Chi-square test
  • Logistic regression

Accessibility

This is an in person course. If you wish to participate, but in person attendance is not accessible to you (e.g. you have hearing or vision impairment, parenting responsibilities, live too far) please email Gordana (g.popovic@unsw.edu.au) to arrange online attendance. The virtual component will be run using Zoom, and closed captions will be activated on request.

Slides are in PDF format, exercises are in R markdown, and both will be downloadable in advance. If HTML slides and alt text are needed to assist accessibility, we will make every effort to provide these; please let us know well in advance. Lectures will be recorded and uploaded to YouTube, and available for a week following the workshop. Please email Gordana (g.popovic@unsw.edu.au) with any questions or requests.

SPSS skills

We do not require you to have any prior skills in SPSS, and we will spend the first morning getting started with some very basic SPSS skills, just enough to participate in this course. Further resources include SPSS Tutorials and the UCLA Institute for Digital Education and Research SPSS pages.

Prerequisites: SPSS access. If you have a UNSW zID you can access SPSS through myAccess. If you do not have a UNSW zID, we cannot provide access to SPSS.

Course requirements: You will need to bring and use your own computer during the workshop.

Date: Tuesday 6 to Thursday 8 August 2024

Duration: 9.30am - 4.00pm, each day

Location: This course will be delivered mainly in person with limited remote access

You will receive a certificate of completion for the course.

2024 - Multivariate Statistics Using R

8 August

Course Overview

This workshop is designed as an introduction to multivariate statistics in a model-based framework. We aim to move beyond analysing a single response variable to visualising and analysing a collection of correlated response variables. The methods available to you depend on the number of responses relative to your sample size and, as in univariate regression, on the type of response variable (binary, count, etc.). There is emphasis on understanding the concepts of statistical procedures (with a minimum of mathematics, although some will be discussed) and on interpreting computer output. It is designed to help you, the researcher. There will be plenty of practical work.

You will be expected to understand the basics of univariate regression including concepts like choosing a “family” (e.g. Binomial, Negative Binomial), interpreting model coefficients and checking assumptions as well as have basic R skills. Topics to consider refreshing include linear models, generalised linear models, and linear mixed models.

Course outline

Introduction to multivariate data – with fewer response variables

  • What is a multivariate research question
  • Why use multivariate methods
  • Covariance matrices
  • Analysis with manova
  • Checking model assumptions

Multivariate data – with lots of response variables

  • Reducing the rank of the covariance matrix
  • Reduced Rank Analysis with PCA
  • Reduced Rank Analysis with generalised latent variable models (glmmTMB)
  • Visualising high dimensional data

Multivariate data – hypothesis testing with LOTS and LOTS of response variables

  • Design based inference in mvabund
  • Analysing Compositional data – row effects and offsets
  • Correlation types

Accessibility

This is an in person course. If you wish to participate, but in person attendance is not accessible to you (e.g. you have hearing or vision impairment, parenting responsibilities, live too far) please email Eve (e.slavich@unsw.edu.au) to arrange online attendance. The virtual component will be run using Zoom, and closed captions will be activated on request.

Slides are in PDF format, exercises are in R markdown, and both will be downloadable in advance. If HTML slides and alt text are needed to assist accessibility, we will make every effort to provide these; please let us know well in advance. Lectures will be recorded and uploaded to YouTube, and available for a week following the workshop. Please email Eve (e.slavich@unsw.edu.au) with any questions or requests.

Prerequisites: RStudio and Regression (with a single response variable) are assumed knowledge.

Course requirements: You will need to bring and use your own computer during the workshop.

Date: Thursday 8 August 2024

Duration: 9.00am - 4.30pm

Location: This course will be delivered mainly in person with limited remote access

You will receive a certificate of completion for the course.

2024 - Introductory Statistics for Researchers Using R

13 - 15 August

Accessibility

The virtual component will be run using Zoom, and closed captions will be activated on request. Slides are in PDF format, exercises are in R markdown, and both will be downloadable in advance. If HTML slides and alt text are needed, we will make every effort to provide these; please let us know well in advance. Lectures will be recorded and available for a week following the workshop.

Important Notes - please read

1. Participants must have basic R skills prior to workshop

This is NOT an introductory workshop in using the statistical software package, R. Basic R coding will not be taught. To do this workshop successfully, you must have basic proficiency in using the R package. All examples and exercises used in this workshop are done using R. We want to ensure everyone is able to follow the material, and no participant is disappointed.

  • If you have basic R skills, that's excellent, please complete this quick task HERE. Once you have completed this and emailed your results you will be given a code to allow you to register.
  • If you do not have basic R skills, but want to do the Introductory Statistics for Researchers workshop, you can enrol in our Introduction to R course, which runs ahead of this workshop on 5 August (tickets on sale soon). Once you have registered, you will be given a code to allow you to register for Introductory Statistics for Researchers.

2. Own computer

You will need to bring and use your own computer during the workshop with both R and RStudio installed. You will also need administrator rights to install further packages needed throughout the workshop.

Course Overview

This workshop is designed as an introduction to statistical analysis for researchers. There is emphasis on understanding the concepts of statistical procedures (with a minimum of mathematics, although some will be discussed) and on interpreting computer output. It is designed to help you, the researcher. It is helpful if you have done an undergraduate statistics subject, although this workshop can serve as a first introduction or a refresher. The theory behind the statistical procedures will, in general, not be discussed.

A range of statistical analyses will be discussed in the workshop, as described in the outline below. We will talk through examples of all analysis types and will demonstrate how to carry them out in R. Equal emphasis will also be put on interpreting the output of these analyses. There will be plenty of practical work.

Content

You will be expected to watch this seminar on study design and statistical principles (samples and populations, confounding, statistical inference) ahead of the workshop.

Revision

  • Descriptive statistics – mean, mode, standard deviation, inter-quartile range, correlation
  • Data visualisation - boxplot, histogram, scatterplot, bar graph

Introduction to statistical inference

  • Uncertainty, confidence intervals, p-values, significance/evidence
  • T-test (comparing two groups)
  • Checking model assumptions

Analysis of continuous responses with linear models

  • Simple linear regression
  • ANOVA
  • Multiple regression, ANCOVA

Analysis of categorical responses

  • Relative risk, odds ratios
  • Chi-square test
  • Logistic regression

Course Requirements: You will need to bring and use your own computer during the workshop with both R and RStudio installed. You will also need administrator rights to install further packages needed throughout the workshop.

Date: Tuesday 13 to Thursday 15 August 2024

Duration: 9.30am - 4.00pm, each day

Location: This workshop will be delivered in person and online

You will receive a certificate of completion for the course.

2024 - Mixed Models using R

1-2 July

Course Overview

Simple statistical methods (t-tests, chi-square tests, linear models/regression) assume independence of observations and cannot be used when dependence is present in the sample. Mixed models are extensions of linear models to dependent data. Common reasons for dependence are:

  • Clustering e.g. multiple patients per hospital, multiple plants per site, multiple measurements on the same individual
  • Time and space - measurements taken close together in time and space are more similar (dependent)

In this course we will teach mixed models as a straightforward extension of (generalised) linear models. The course will include lectures and practical sessions, and explain how to fit, check, interpret and communicate results from mixed models.

Accessibility

This is an in person course. If you wish to participate, but in person attendance is not accessible to you (e.g. you have hearing or vision impairment, parenting responsibilities, live too far) please email Gordana (g.popovic@unsw.edu.au) to arrange online attendance. The virtual component will be run using Zoom, and closed captions will be activated on request.

Slides are in PDF format, exercises are in R markdown, and both will be downloadable in advance. If HTML slides and alt text are needed to assist accessibility, we will make every effort to provide these; please let us know well in advance. Lectures will be recorded and available for a week following the workshop. Please email Gordana (g.popovic@unsw.edu.au) with any questions or requests.

Course Pre-requisite: Ability to run, check and interpret linear models in R, either through having completed our Fundamentals of Regression course on May 21-23, or by self-assessment using this 3 minute questionnaire.

We assume a thorough understanding of linear and generalised linear models. If you recently took our Fundamentals of Regression course on May 21-23, you will have all the background you need. If not, please use this 3 minute questionnaire to self-assess your readiness for this course.

Course requirements: You will need to bring and use your own computer during the workshop.

Date: Monday 1 to Tuesday 2 July 2024

Duration: 10.00am - 4.00pm, each day

Location: Hybrid, this workshop will be delivered mainly in person with limited remote access. We highly encourage you to attend in person.

You will receive a certificate of completion for the course.

2024 - Fundamentals of Regression in R

21 - 23 May

Course Overview

This course provides a comprehensive hands-on introduction to regression analysis techniques. The course content is designed for researchers with some prior knowledge of basic statistical testing, such as t-tests, p-values, confidence intervals and simple linear regression. The primary focus is on developing a conceptual understanding of regression models through numerous examples. There will be a strong emphasis on practical implementation in R, and interpretation of output. Approximately half the time will be dedicated to practical hands-on sessions.

The core content starts from linear models with more than one variable, enabling research questions like "What is the effect of this treatment/intervention after adjusting for confounding variables?" or "What is the relationship between two variables while controlling for other factors?" 

We then cover interactions between variables in linear models, enabling research questions like: "How does the effect of the treatment depend on some other variable? Is the treatment effect different between groups?" and "How is the relationship between two variables modified by some other variable?" 

Fundamental concepts and skills that arise in regression, such as multicollinearity, multiple testing, model selection, generalizing the linear model to non-normal data (e.g., binary response and count data), and regression with non-linear relationships between variables, are all covered in this course.

By the end of this course, you will have a foundation in regression modelling techniques with the practical experience in R needed for more advanced regression methods like mixed models, longitudinal data analysis, survival analysis, meta-analysis, multivariate analysis, ordinal and multinomial regression, spatial regression and other extensions.

Course Outline

Day 1: Revision, Multiple Regression Introduction and Extensions 

Day 2: Morning/Afternoon: Multiple Comparisons/Model Selection 

Day 3: Morning/Afternoon: Generalized Linear Models (GLMs)/Generalized Additive Models (GAMs) 

Assumed Knowledge 

We assume knowledge of introductory statistics, including principles of study design, the concept of a p-value, the concept of a confidence interval, one-sample and two-sample t-tests, simple linear regression (with a single dependent and a single independent variable), and the equivalence of a t-test to simple linear regression. All of these are covered in our Introductory Statistics for Researchers courses.

We also assume you have some experience with R. To enrol in this course, we ask you to complete a quick exercise. If you are new to R, you should complete our one-day introduction to R course prior to this course.

Course Requirements: You will need to bring and use your own computer during the course.

Presenter and Expertise: Peter Humburg, Biostatistician, UNSW Stats Central

Date: Tuesday 21 to Thursday 23 May 2024

Duration: 9.30am - 4.00pm, each day

Location: K-D26 - BioScience

You will receive a certificate of completion for the course.

2024 - Sample Size and Power Calculations

7-8 February

Course Overview

Power calculations and sample size determination are essential parts of planning a scientific study. In this course (run over two days), we will introduce the basic principles of precision-based and power-based sample size calculations. Using practical examples, we will demonstrate how to perform sample size calculations for common designs in single-sample studies and for comparing groups, as well as discussing practical issues such as multiple comparisons. No knowledge of statistical software will be required.

Course Outline

This course will cover topics including:

  • Introduction and motivation

  • Basic principles of power and sample size calculations

  • Precision-based sample size calculations

  • Power-based sample size calculations

  • Complex power calculations

  • Practical considerations & other issues

Date: Wednesday 7 & Thursday 8 February 2024

Time: 9.30am - 2.30pm, each day

Delivery Mode: In-person (preferred) and online option

Presenter and Expertise: Mark Donoghoe, Biostatistician, UNSW Medicine & Health Clinical Research Unit

Course Requirement: You will need to use a computer during the course.

You will receive a certificate of completion for the course.

2020 - Introduction to Python for Data Science

1 and 2 September

Course Overview

Python is a widely used programming language to manipulate, analyze, and visualize data. It is one of the most popular languages for Data Science, especially when dealing with complex, uncurated or text datasets.

This course assumes that you have never used Python before, but you have some basic programming knowledge. You will learn how to obtain and install Python, which is open-source software, and Jupyter Notebook, which is an interactive computational environment, in which you can combine code execution, rich text, mathematics, plots and rich media.

This introduction to Python, run over two half-days, will cover some useful features of Python for data science. It will also discuss various online resources available to further develop your data science skills using Python.

Course Outline

This course will cover topics including:

• Python overview

• Jupyter Notebook

• Basic Python programming

• Typical process of data science

• Techniques to manipulate and analyze datasets

• Result visualization

• Selected statistical analysis and/or machine learning examples

Presenter and Expertise: A/Professor Raymond Wong (Stats Central and UNSW School of Computer Science and Engineering)

Course Requirements: You will need to use a computer during the course.

Date: Tuesday 1 to Wednesday 2 September 2020 (two morning sessions)

Duration: 10.00am - 1.00pm each day

Location: Online

 

2020 - Text Analytics in Python (Advanced)

8 and 9 September

Course Overview

More than 70% of the data on the internet is unstructured. Of this, text is the most common form, appearing in almost all data sources. For example, text data such as emails, online reviews, tweets, news and reports hold valuable information and insight for most research and applications. Text analytics, usually involving techniques from text mining or natural language processing (NLP), can automatically uncover patterns and extract meaning/context from these unstructured texts.

This course assumes that you have basic Python programming knowledge, or have previously attended "Introduction to Python for Data Science" from Stats Central. This course will provide you with the foundation to process and analyze text.

In this course, we will cover some useful Python features and libraries for text processing and analysis. We will touch on some advanced topics such as sentiment analysis, text classification, and/or topic extraction.

 

Course Outline

This course will cover topics including:

• Jupyter Notebook

• Basic text operations in Python

• Text analytics and NLP

• Tokenization, stopwords, lexicon normalization, POS tagging

• Sentiment analysis and text classification

Presenter and Expertise: A/Professor Raymond Wong (Stats Central and UNSW School of Computer Science and Engineering)

Course Requirements: You will need to use a computer during the course.

Date: Tuesday 8 to Wednesday 9 September 2020 (two morning sessions)

Duration: 10.00am - 1.00pm each day

Location: Online

 

2020 - Study Design

19 May

Course Overview

Good study design is crucial for answering your research questions. No amount of post processing or statistical expertise can compensate for poor or inadequate study design. In this course you will learn the basic concepts of study design, including how to:

  • Randomize, so that your sample can be used to make inferences about the real world (the population).

  • Implement appropriate controls and manipulation to infer causation.

  • Determine adequate sample size.

We will then move on to advanced concepts, focusing on how blocking and stratification can reduce variability and improve power (the ability to answer your research question) using a smaller sample size (and hence fewer resources).

Practical Component

  • How to randomise – Random allocation of subjects to treatments and random sampling

  • Simple power analysis – How much data do we need to answer the question of interest?

  • We will use free online tools and software packages: either Excel and G*Power, or R (optional).

 

Course Requirement: You will need a computer with Excel (or equivalent) and G*Power installed and access to the course.

Duration: 9.00am to 1.00pm

Location: Online