How we can help
Contact us if you are unsure about:
- how to set up your experiment
- how to analyse your data
- which graphics to employ to study your data or to illustrate a point
- which software to use and how
- how to interpret and present your results
- how to revise your manuscript in accordance with reviewers' advice, or how to explain why you will not revise it
- how to perform statistical power/sample size calculations for your grant application
Statistical consulting is provided free of charge to all UC academics and students. It is the responsibility of the student to inform his or her supervisor about seeking statistical help.
When to seek advice
If you have not started your project yet, it is highly recommended to have a consultation before you start collecting the data, as it helps to avoid many problems later on.
If you have a quick and easy question, just send us an e-mail. Otherwise, please request a consultation (1h slot). The consultation days are Tuesday and Thursday, but other arrangements can be made by special request.
We also organise specialised statistical seminars, workshops and tutorials on various topics, tailored to the level and schedule of your choice.
Frequently asked questions
The choice of statistical software depends on your personal preference and on the common usage among your colleagues. Some of the most common programs are R, SPSS, SAS, and Stata. R is free and probably the most flexible/widely applicable statistical software around. WinBUGS is a good statistical package for Bayesian inference. Excel is not statistical software.
It is important to take a good look at your data. Make as many scatterplots, tables, histograms and maps as you can. Study them, get a feel for interactions, and look out for patterns and strange observations. It is also important to have a list of questions, as specific as possible, which can be answered statistically. For example, the question “Is school A different from school B?” is not precise enough, because it is unclear what you are going to compare (students’ heights, teachers’ ages, wall colour?). By contrast, the question “Do students in school A generally have higher grades in math than those in school B?” can be answered using statistical methodology.
Some statistical models, such as linear regression and, in particular, ANOVA, require normality of residuals. If your residuals are not normally distributed, you may consider transforming the response variable. Note that a transformation does not always solve the problem, so you should recheck the diagnostics after fitting the new model.
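The "transform, refit, recheck" loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two groups of measurements are invented, the model is a simple one-way (ANOVA-style) fit of group means, and the helper names `group_residuals` and `skewness` are hypothetical. Skewness is just one crude symptom of non-normality; in practice a Q-Q plot of the residuals is usually more informative.

```python
import math

def group_residuals(groups):
    """Residuals of a one-way model: each value minus its own group mean."""
    residuals = []
    for values in groups:
        mean = sum(values) / len(values)
        residuals.extend(v - mean for v in values)
    return residuals

def skewness(values):
    """Sample skewness (third standardised moment); near 0 for symmetric data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Invented measurements for two groups; within each group values double.
groups = [[1, 2, 4, 8, 16], [3, 6, 12, 24, 48]]

# Candidate fix: log-transform the response, then refit and recheck.
log_groups = [[math.log(v) for v in values] for values in groups]

skew_raw = skewness(group_residuals(groups))
skew_log = skewness(group_residuals(log_groups))

print(f"residual skewness, raw response: {skew_raw:.2f}")
print(f"residual skewness, log response: {skew_log:.2f}")
```

With these made-up numbers the raw residuals are clearly right-skewed, while the residuals on the log scale are symmetric. With real data the transformation may help less, which is why the diagnostics are rechecked rather than assumed.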
You can either use non-parametric methods or opt for newer parametric methods that are more robust to non-normality.
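As one example of a non-parametric approach, the sketch below runs an exact permutation test for a difference in means between two small groups. It makes no normality assumption; it only assumes that group labels are exchangeable if there is no effect. The data and the function name `exact_permutation_p_value` are invented for illustration, and enumerating every relabelling is only practical for small samples.

```python
from itertools import combinations

def exact_permutation_p_value(a, b):
    """Two-sided exact permutation p-value for a difference in group means.

    Every way of relabelling the pooled observations into groups of the
    original sizes is enumerated, and we count how often the absolute
    difference in means is at least as large as the observed one.
    """
    pooled = list(a) + list(b)
    n_a = len(a)
    observed = abs(sum(a) / n_a - sum(b) / len(b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
        if diff >= observed - 1e-12:  # small tolerance for rounding error
            extreme += 1
        total += 1
    return extreme / total

# Invented data; the second group contains one extreme observation.
group_a = [1.1, 2.3, 2.9, 3.4]
group_b = [4.8, 5.2, 6.1, 30.5]

p_value = exact_permutation_p_value(group_a, group_b)
print(f"exact permutation p-value: {p_value:.4f}")
```

Here only the observed split and its mirror image are as extreme as the data, out of 70 possible relabellings, so the p-value is 2/70.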
Firstly, our hypotheses, no matter how well thought out, cannot always be correct, and therefore sometimes there is simply no effect to be found. Secondly, a real effect may go undetected because the study lacks statistical power; collecting additional observations or using more powerful statistical methods can help. Finally, statistical significance only indicates evidence of a non-zero effect or difference. Given a large enough sample, almost any tiny effect will be statistically significant, but non-zero does not always mean ‘practically interesting’.
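The effect of sample size on significance can be illustrated with a small calculation. The sketch below assumes a two-sample z-test with known unit variance and an observed standardised difference in means of 0.02, a value chosen arbitrarily to represent a practically negligible effect. The function name `two_sided_p_value` is made up for this example.

```python
from statistics import NormalDist

def two_sided_p_value(effect_size, n_per_group):
    """Two-sided p-value of a two-sample z-test.

    Assumes the observed standardised mean difference equals effect_size,
    both groups have n_per_group observations, and the variance is known.
    """
    z = effect_size * (n_per_group / 2) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

tiny_effect = 0.02  # arbitrary illustrative value
for n in (100, 10_000, 100_000):
    print(f"n per group = {n:>7}: p = {two_sided_p_value(tiny_effect, n):.2g}")
```

The same tiny difference is nowhere near significant with 100 observations per group, but becomes highly significant with 100,000 per group, even though its practical importance has not changed.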
While causality is still a subject of many philosophical discussions, statistics is primarily the study of associations. Two variables may be highly correlated without a causal relationship. Higher sales of ice cream may be associated with a higher incidence of drowning, but that is not because ice cream causes drowning. It is because on hot days more people go for a swim and more people buy ice cream.
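The ice cream example can be made concrete with a toy data set. All numbers below are invented and deliberately constructed so that both variables depend on temperature and on nothing else in common. The helpers `pearson` and `residuals_after` are hypothetical names for this sketch. Removing the linear effect of temperature from both variables and correlating what is left is one simple way to check whether an association survives once a common cause is accounted for.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals_after(x, y):
    """Residuals of y after a least-squares straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

# Invented daily records: temperature drives both other variables.
temperature = [20, 22, 24, 26, 28, 30]
ice_cream_sales = [133, 135, 146, 154, 159, 173]
drownings = [31, 33, 32, 34, 39, 41]

raw_corr = pearson(ice_cream_sales, drownings)
adjusted_corr = pearson(
    residuals_after(temperature, ice_cream_sales),
    residuals_after(temperature, drownings),
)

print(f"correlation, ignoring temperature:      {raw_corr:.2f}")
print(f"correlation, adjusted for temperature:  {adjusted_corr:.2f}")
```

In this constructed example the strong raw correlation vanishes once temperature is taken into account. Real data are rarely this clean, and adjusting for measured variables cannot rule out confounders that were never recorded.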
R-squared is generally not a good tool for model comparison, because it never decreases when parameters are added, even useless ones. There exist various statistical tests depending on the particular type of models you want to compare. One of the most generally useful measures is Akaike’s Information Criterion (AIC). It can be used to compare models fitted to the same data set with the same response variable. The smaller the AIC, the better the model. As a rough rule of thumb, a difference of 3 is considered ‘suggestive’ and a difference of 7 ‘significant’.
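The contrast between R-squared and AIC can be sketched with a small example. AIC is defined as 2k minus twice the maximised log-likelihood, where k is the number of estimated parameters. For least-squares models with normal errors this reduces, up to an additive constant that is the same for all models on the same data, to n·ln(RSS/n) + 2k, which is the form used below. The data are invented so that the predictor is nearly useless, and the function name `fit_stats` is hypothetical.

```python
import math

def fit_stats(x, y, use_slope):
    """R-squared and Gaussian AIC (up to a constant) for a least-squares fit.

    use_slope=False fits an intercept only; use_slope=True fits a straight
    line. The parameter count k includes the error variance.
    """
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    slope = 0.0
    if use_slope:
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        slope = sxy / sxx
    rss = sum((b - (my + slope * (a - mx))) ** 2 for a, b in zip(x, y))
    tss = sum((b - my) ** 2 for b in y)
    k = 3 if use_slope else 2
    r_squared = 1 - rss / tss
    aic = n * math.log(rss / n) + 2 * k
    return r_squared, aic

# Invented data: y is essentially unrelated to x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 2, 1, 2, 1, 2, 2]

r2_null, aic_null = fit_stats(x, y, use_slope=False)
r2_line, aic_line = fit_stats(x, y, use_slope=True)

print(f"intercept only: R-squared = {r2_null:.3f}, AIC = {aic_null:.2f}")
print(f"with slope:     R-squared = {r2_line:.3f}, AIC = {aic_line:.2f}")
```

Adding the slope raises R-squared slightly, as adding any parameter would, but AIC increases because the small gain in fit does not outweigh the penalty for the extra parameter. On AIC, the simpler model is preferred here.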