Correlation and the issues associated.

by Research Methods & Statistics

A correlation is defined as a relationship between two variables. But is this ‘relationship’ a healthy romance or is there need for a break up? Just like friendship, correlation has many benefits but, like any relationship, it is far from a perfect fairytale. This is what I shall be discussing in this week’s blog: correlations, the positives and the negatives. (See what I did there?)

Correlations are useful as they help indicate the relationship between two variables, which in turn increases understanding and can lead to positive real world applications. For example, biologists such as McNeal and Cimbolic (1986)* have found an association between low synaptic levels of serotonin and depression. This association has been extremely important and almost vital in the treatment of depression. Correlation pointed out the link between low serotonin levels and depression, and from this association researchers introduced antidepressants known as Selective Serotonin Reuptake Inhibitors (SSRIs), such as Prozac, that work by increasing levels of the neurotransmitter serotonin and have been shown to be a very effective treatment.

Another reason to love correlation is that it gives researchers (especially in psychology) the chance to investigate variables that would be unethical, if not impossible, to manipulate using the experimental method. Just like a knight in shining armour there to help a damsel in distress, correlation helps researchers assess many interesting variables, such as alcohol consumption, personality, level of education and colour preference, that cannot be manipulated in an experiment to see how they affect behaviour. Researchers can, however, easily measure and describe these variables using correlational research. One example comes from testing the relationship between personality types, study methods and academic performance (Entwistle 2011)**; this would be extremely difficult to test in an experiment, as well as very time consuming. Correlation also lets researchers stay within ethical guidelines: for example, they could study the link between certain behaviour types and diet deficiencies. Obviously, this could never be tested in a laboratory setting, as it would be extremely unethical to create such deficiencies for the purpose of experimentation. However, it is possible, and within ethical guidelines, to record diet deficiencies if they occur naturally. This brings me nicely on to my next point: a huge advantage of correlation is that researchers record what exists naturally. As researchers do not interfere with, manipulate or control any variables, correlational studies tend to have high external validity.

Correlation is useful and relatively straightforward to display with the use of scatter plots. The advantage of scatter plots is that they allow you to see a visual representation of the characteristics in the relationship between two variables. They clearly show the trends in data. Take a look at the example below****; as the temperature increases, the number of ice-cream sales also increases. The results are pretty much in a straight line with a positive gradient. Therefore, this scatter plot clearly shows a positive correlation as the two variables are changing in the same direction.
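If you want a number to go with the picture, the strength of a straight-line relationship is usually summarised with Pearson's correlation coefficient, r, which runs from -1 (perfect negative) to +1 (perfect positive). Here is a minimal sketch in Python. The temperature and sales figures are made up purely for illustration (they are not the values from the scatter plot), and the function name `pearson_r` is just my own label.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unscaled covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unscaled variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data, invented for this example only.
temperature_c = [14, 16, 18, 20, 22, 24, 26]
ice_cream_sales = [210, 250, 300, 330, 400, 420, 480]

r = pearson_r(temperature_c, ice_cream_sales)
print(f"r = {r:.2f}")  # a value close to +1 means a strong positive correlation
```

A value of r near zero would mean the points show no straight-line trend at all. Note that r only describes how tightly the points follow a line; it says nothing about why they do.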

Now that we’ve discussed the positives of correlation, it’s time to dive into the depths of the problems associated with this research type (I’m starting to feel like a relationship counsellor). Correlation only shows a relationship between an X variable and a Y variable. Granted, this is useful; however, it does not result in a clear, precise and unambiguous explanation for the relationship, and therefore correlation has low internal validity. The point I’m trying to make is that correlation cannot assess causality. X may cause Y, but there are other possibilities: Y causes X (reverse causation), X causes Y and Y also causes X (bidirectional model), or an unknown variable Z is the cause of both (third variable problem).

For example, Gentile and Anderson*** studied patterns of video game use alongside aggressive behaviour. They found that time spent playing violent video games positively correlated with aggressive feelings, arguments with schoolteachers and physical violence. So, a relationship between an X variable (video game use) and a Y variable (aggression) has been established, but let’s discuss the limitations. No causal link has been established between violent video game use and aggressive behaviour. A ‘bidirectional model’ (Gentile and Anderson 2003) suggests violent games may increase aggression, but it is just as likely that those who possess an aggressive personality are inclined to act aggressively and choose violent video games as a result. If it is the aggression that leads to the violent video games, rather than the games that cause the aggression, this is known as reverse causation. In short, correlation does not establish which variable is the cause and which is the effect.

Another limitation of correlation is known as the third variable problem. This can be seen as a third wheel in the relationship, like a gooseberry or maybe even a love triangle (oooh, things are starting to get interesting!). So, you’ve established a relationship between two variables and it’s all happy families, right? Wrong: there is also the possibility that an unknown third variable (that we’ll call Z) is controlling variable X, variable Y or both. One study found a correlation between young children sleeping with the light on and developing myopia in later life (University of Pennsylvania Medical Center, 1999)*****. Therefore, sleeping with the light on causes myopia, yeah? No! Another study, by The Ohio State University, found that infants sleeping with the light on did not cause the development of myopia. However, it did find a strong link between parental myopia and the development of child myopia. It was also noted that myopic parents were more likely to leave the light on in their child’s bedroom. So parental myopia (variable Z) is the likely driver of both the light being left on (variable X) and the child’s myopia (variable Y), and therefore the conclusion that sleeping with the light on causes myopia does not follow.
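To see how a third variable can create a correlation out of thin air, here is a small simulation sketch in Python. Everything in it is an assumption for illustration: the probabilities are invented (they are not taken from either study), and the simulated world is deliberately built so that the night light has no effect on the child at all.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

def simulate_child():
    """Return (parent_myopic, light_on, child_myopic) for one simulated child."""
    parent_myopic = random.random() < 0.5
    # Assumption: myopic parents leave the light on more often.
    light_on = random.random() < (0.7 if parent_myopic else 0.2)
    # Assumption: child myopia depends on the parents only, never on the light.
    child_myopic = random.random() < (0.5 if parent_myopic else 0.1)
    return parent_myopic, light_on, child_myopic

def myopia_rate(children, light_on):
    """Share of myopic children among those with the given light setting."""
    group = [c for c in children if c[1] == light_on]
    return sum(c[2] for c in group) / len(group)

children = [simulate_child() for _ in range(20000)]

# Ignoring the parents: light-on children look much more likely to be myopic.
overall_gap = myopia_rate(children, True) - myopia_rate(children, False)
print(f"gap in myopia rate, all children: {overall_gap:+.3f}")

# Comparing like with like (same parental myopia): the gap all but disappears.
gaps_within = {}
for parent_myopic in (True, False):
    stratum = [c for c in children if c[0] == parent_myopic]
    gaps_within[parent_myopic] = myopia_rate(stratum, True) - myopia_rate(stratum, False)
    print(f"gap when parent_myopic={parent_myopic}: {gaps_within[parent_myopic]:+.3f}")
```

In this made-up world the light and the child's myopia are clearly correlated overall, yet once children are compared within the same parental group the difference shrinks to roughly zero. That is the third variable problem in miniature: Z produces the link between X and Y without X doing anything to Y.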

To conclude, correlations establish relationships between variables and by doing so can indicate where more research needs to be carried out to investigate why the relationship between an X and a Y variable occurs. However, this relationship may be one best described as going through a rough patch. Directionality and the third variable problem are huge flaws that arise in correlation, and it is extremely important to remember that correlation does NOT mean causation. In fact, I’m going to say it again just to stress my point: correlation is NOT causation!

*McNeal, E.T. and Cimbolic, P. (1986). Antidepressants and biochemical theories of depression. Psychological Bulletin, 99, 361-74


***Gentile, D.A. and Anderson, C.A. (2003). Violent video games: the newest media violence hazard. In D. A. Gentile (Ed.) Media violence and children. Westport, CT: Praeger Publishing.