We have a guest article, finally, from Messanger of Death of Imperial Life lack-of-game. I've been asking him to do this article for a while now in relation to taking tournament results as canon in terms of 'this army is good cause it won a tournament herp a derp.' Hopefully there will be a follow-up with some more lay explanations, but this is otherwise a highly recommended read for all players.
Hello all my Imaginary Friends. Today we will take a look at how to determine whether the results of a tournament apply to you. And hopefully by the end of this article you will understand why using tournaments as a means to compare armies is not only flawed, but just plain stoopid. So sit back and be prepared to have your brains melted from extreme levels of boredom.
Knowledge is power: it allows us to build balanced lists and play the game we all love. However, most of our understanding of the game mechanics comes through intuition and reasoning, as there is an absence of supportable or confirmable data. To fill this gap, some players use the results of tournaments as a source of information. And this is a problem. The lists taken to tournaments are mainly produced through trial and error, where someone has thrown several lists together until one works for them. This is a haphazard approach where the results may not be reproduced a second time (think rock-paper-scissors). Some players do this hundreds of times to hone their skills and lists... in a way they are conducting their own research, and we, the player base, analyse the tournament results.
However, it is never good enough to just conduct research. In order to use or apply research, it is necessary to make a judgement about the quality of the research and its relevance to a particular context or purpose. Or in layman's terms, you need to know how good the information is (how practical the tournament results are) before you use it.
To determine the quality of the research we need to know how the researcher (tournament organiser) controlled the study. A researcher controls their study by imposing rules so as to decrease the possibility of error and thus increase the probability that the study's findings are an accurate reflection of reality. In a tabletop setting this is done with the rulebook, codices and tournament design. Through control, the researcher (tournament organiser) can reduce the influence or confounding effect of extraneous variables on the study variables. To do this they need to ensure the missions are balanced, there is a suitable amount and mix of terrain, and that the player scoring and seeding don't allow the system to be gamed. Without a rigorous tournament design the results are neither valid nor reliable, as they are unlikely to be reproduced again.
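To see why that matters, here is a quick and purely illustrative sketch (a toy Python simulation with invented win probabilities, not numbers measured from any real event) of how an uncontrolled variable like terrain can confound a comparison: two identical builds come out with visibly different win rates just because one keeps drawing sparse tables.

```python
import random

random.seed(42)

# Toy model with completely made-up numbers: two identical armies whose
# win chance depends only on the table they end up on.  Sparse terrain
# favours this (hypothetical) shooting-heavy build; dense terrain doesn't.
GAMES = 1000
P_WIN_SPARSE = 0.65  # invented win chance on a sparse-terrain table
P_WIN_DENSE = 0.50   # invented win chance on a dense-terrain table

def observed_win_rate(share_of_sparse_tables):
    """Win rate over GAMES games, given how often the army draws sparse tables."""
    wins = 0
    for _ in range(GAMES):
        on_sparse = random.random() < share_of_sparse_tables
        p_win = P_WIN_SPARSE if on_sparse else P_WIN_DENSE
        wins += random.random() < p_win
    return wins / GAMES

# Same army, same player skill; the only difference is table allocation.
print("Mostly sparse tables:", observed_win_rate(0.7))
print("Mostly dense tables: ", observed_win_rate(0.2))
```

Both builds in that toy model are the same; the gap in the printout is produced entirely by the uneven terrain allocation, which is exactly the sort of extraneous variable a rigorous tournament design is supposed to control.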
To determine the relevance to a particular context or purpose, we need to look at the sample, the sample size and just how generalisable the results are.
A sample is the subset of a population selected to participate in the research. For this discussion, the sample of a tournament is the army builds that are being played. In order for a sample to be representative, it must be like the target population in as many ways as possible. Composition scoring and list tailoring for biased missions can mess around with the sample to the point that it no longer represents the population. The inclusion of comp automatically excludes a tournament from analysis. The sample size is, obviously, the size of the sample. As a general rule, the larger the sample, the more representative it is of the population, while the smaller the sample, the larger the sampling error.
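To put a rough number on that last sentence, here is a minimal sketch (assuming, purely for illustration, a build with a true 50% win rate and modelling each game as a coin flip) of how much an estimated win rate wobbles at different sample sizes:

```python
import random
import statistics

random.seed(7)

TRUE_WIN_RATE = 0.5  # assumed "true" win rate, picked only for illustration

def estimated_win_rate(n_games):
    """Win rate estimated from n_games simulated games."""
    wins = sum(random.random() < TRUE_WIN_RATE for _ in range(n_games))
    return wins / n_games

# Repeat each estimate many times to see how far off a single sample can be.
for n_games in (5, 30, 500):
    estimates = [estimated_win_rate(n_games) for _ in range(2000)]
    print(f"{n_games:>3} games: typical error ~ {statistics.pstdev(estimates):.3f}")
```

The error only shrinks with roughly the square root of the number of games, so halving it takes four times as much data; a handful of games at one event tells you almost nothing.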
Generalisation is the ability to apply study results from the sample to the population. This is where everything comes together to show just how stoopid it is to think that the results from any tournament, of any size, are applicable to how you play with your war dollies. Very few tournaments have missions that don't screw around with the game mechanics. This year's Lords of Terra is one of the extreme examples of this, where the missions *self-edit* mechanised players. But even the most balanced missions with a large enough sample won't help, simply because the larger the sample size, the worse the terrain gets. AdeptiCon 2011 is possibly the best example of this. Even with thousands of hours of volunteer work they still didn't have a suitable mix of terrain for the 128 tables. No knock against them. Even if there is a tournament where everything comes together, the results can't be generalised to the population. Why? I will give you two reasons.
Firstly, it is a single tournament. The hallmark of a good research study is that the results can be reproduced. Unless everybody replays that tournament, we won't know if the tournament results are reliable and valid. Secondly, you may not be the target population. The build you play may not have been a part of that population, or the player using it may not be at your skill level... even with over a hundred players, the sample size is still too small to reflect the Global Gaming Community™.
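As a back-of-the-envelope check (using the standard normal approximation for a proportion, with illustrative player counts rather than figures from any specific event), the uncertainty on a win rate measured from a single tournament looks like this:

```python
import math

def margin_of_error(p, n):
    """Approximate 95% margin of error on an observed win rate p from n games."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

for n in (32, 64, 128):
    print(f"{n:>3} games: +/- {margin_of_error(0.5, n):.1%}")
```

Even a 128-game data set carries close to a nine-point swing either way, and that is before you slice it down to one codex, one build, or one skill bracket.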
I will skip a conclusion and instead help revive your brain with this...