Quantitative data analysis

    What is meant by a “spurious” relationship between two variables?
    One of the conditions under which multivariate analysis is appropriate is when the relationship between two variables might be spurious: the relationship appears to exist but is not a genuine one. A third variable turns out to be responsible for the variation in both sets of values, so the two variables are not really related to each other and their apparent relationship is “spurious”.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 344
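    The idea can be illustrated with a small numerical sketch. The data below are invented for illustration (they are not taken from Bryman), and the helper name pearson_r is an assumption of this example. Overall, x and y are strongly correlated, yet once the third variable z is held constant the correlation disappears, which is the pattern we would expect from a spurious relationship.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r for two equal-length lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented data: each record is (z, x, y), where z is the third variable.
records = [
    (0, 1, 1), (0, 1, 2), (0, 2, 1), (0, 2, 2),
    (1, 5, 5), (1, 5, 6), (1, 6, 5), (1, 6, 6),
]

# Correlation between x and y when z is ignored.
overall_r = pearson_r([x for _, x, _ in records], [y for _, _, y in records])

# Hold z constant: compute r separately within each level of z.
within_r = {}
for level in (0, 1):
    group = [(x, y) for z, x, y in records if z == level]
    within_r[level] = pearson_r([x for x, _ in group], [y for _, y in group])

print(f"overall r = {overall_r:.2f}")      # strong positive correlation
print(f"r within levels of z = {within_r}")  # no correlation in either group
```

    In real data the within-group correlations would rarely be exactly zero; a marked drop once the third variable is controlled is what suggests that the original relationship was spurious.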
    What is an outlier?
    When we calculate a simple average, the ‘arithmetic mean’, we have to remember that a wide range of values can give the same average as a narrow range, and that a few extreme values can make the mean fairly meaningless. These extreme values are called ‘outliers’: extremely high or low values in a distribution that threaten to bias the results. The ‘median’ is useful in this regard because it simply identifies the mid-point of the values when they are arranged in order, so it is far less affected by outliers than the mean. Comparing the two gives an indication of how far extreme values have distorted the mean.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 338
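    A short sketch of the effect, using invented income figures (in thousands) and Python's standard statistics module: adding a single extreme value moves the mean a long way but barely moves the median.

```python
from statistics import mean, median

# Hypothetical incomes in thousands; the values are invented for illustration.
incomes = [21, 23, 24, 25, 27]
with_outlier = incomes + [400]   # one extremely high value added

mean_before = mean(incomes)            # 24
median_before = median(incomes)        # 24
mean_after = mean(with_outlier)        # about 86.7
median_after = median(with_outlier)    # 24.5

print(f"mean:   {mean_before} -> {mean_after:.1f}")
print(f"median: {median_before} -> {median_after}")
```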
    What is the function of a contingency table, in the context of bivariate analysis?
    ‘Bivariate’ analysis means that we are analysing two variables together, usually to see whether they are related. There are various techniques available for this, one of which is a contingency table. This technique is principally used when a nominal variable is compared with a variable of another type: the frequencies (as counts or percentages) of the two variables are set out together in one table so that patterns of association between them can be identified.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 340
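    As a rough sketch of how such a table is built, the code below cross-tabulates two variables and then converts the counts to column percentages. The survey answers are invented, and the function names crosstab and column_percentages are assumptions of this example rather than part of any statistics package.

```python
from collections import Counter

def crosstab(pairs):
    """Count how often each (row_value, column_value) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    cols = list(next(iter(table.values())))
    totals = {c: sum(row[c] for row in table.values()) for c in cols}
    return {r: {c: 100 * row[c] / totals[c] for c in cols}
            for r, row in table.items()}

# Invented answers: (main reason for visiting the gym, gender)
answers = (
    [("fitness", "female")] * 6 + [("fitness", "male")] * 3
    + [("relaxation", "female")] * 3 + [("relaxation", "male")] * 1
    + [("weight loss", "female")] * 1 + [("weight loss", "male")] * 6
)

table = crosstab(answers)
percentages = column_percentages(table)

for reason, row in percentages.items():
    cells = ", ".join(f"{gender}: {pct:.0f}%" for gender, pct in row.items())
    print(f"{reason:12} {cells}")
```

    Reading down each column shows how the distribution of reasons differs between the two groups, which is the kind of pattern a contingency table is meant to reveal.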
    What is the name of the test that is used to assess the relationship between two ordinal variables?
    Pearson’s r is extremely valuable but limited to assessing correlations between interval/ratio variables. Spearman’s rho is a very similar technique which can be used on pairs of variables when either both are ordinal or one is ordinal and the other is interval/ratio. The result will lie between -1 (a perfect negative correlation) and +1 (a perfect positive correlation). The phi coefficient is used for dichotomous variables, and Cramér’s V is a measure of the strength of the relationship between nominal variables. The chi-square test, in brief, assesses how likely it is that an observed relationship arose through mere chance, so it is usually used in conjunction with the measures discussed in this question.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 343
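    A minimal sketch of Spearman’s rho, assuming there are no tied values so that the classic shortcut formula 1 - 6Σd²/(n(n² - 1)) applies, where d is the difference between the two ranks of each case. The data and function names are invented for illustration; with tied ranks (common in ordinal data) statistical software normally applies Pearson’s r to averaged ranks instead.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes no two values are equal."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid only without ties."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical example: a satisfaction ranking (ordinal) and income (interval/ratio)
satisfaction_rank = [1, 2, 3, 4, 5]
income = [18000, 15000, 30000, 26000, 90000]

rho = spearman_rho(satisfaction_rank, income)
print(f"rho = {rho:.2f}")   # 0.80: a strong positive relationship between the rankings
```

    Because only the ranks are used, the very large final income has no more influence than any other value, which is why the measure suits ordinal data.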
    Setting the p level at 0.01 increases the chances of making a:
    The p value represents the level of probability that an apparently significant relationship between variables was really just due to chance. If the p level is set at 0.01, we accept only a 1 in 100 risk of concluding that there is a relationship when the result was in fact produced by chance. This is a very stringent level, and while it means that the researcher can be more confident about a significant result if they find one, it also increases the chance of making a Type II error: failing to reject (in effect, confirming) the null hypothesis when it should be rejected. Bryman shows the connections between Type I and Type II errors and levels of p in Figure 15.12 on page 347.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 347
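    The trade-off can be sketched with a simple hypothetical test that is not taken from Bryman: a one-sided binomial test of whether a proportion is greater than 0.5, based on 20 observations, when the true proportion is assumed to be 0.7. The function names and all of the numbers are assumptions chosen for illustration. Lowering the p level from 0.05 to 0.01 raises the number of "successes" needed to reject the null hypothesis, and so raises the chance of a Type II error.

```python
from math import comb

def upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_value(n, alpha, p_null=0.5):
    """Smallest count k whose upper-tail probability under the null is <= alpha."""
    for k in range(n + 2):
        if upper_tail(n, k, p_null) <= alpha:
            return k

def type_ii_rate(n, alpha, p_true, p_null=0.5):
    """Chance of failing to reject the null when the true proportion is p_true."""
    k = critical_value(n, alpha, p_null)
    return 1 - upper_tail(n, k, p_true)

n, p_true = 20, 0.7   # hypothetical sample size and true proportion

crit_05 = critical_value(n, 0.05)   # 15 or more successes needed to reject
crit_01 = critical_value(n, 0.01)   # 16 or more successes needed to reject
beta_05 = type_ii_rate(n, 0.05, p_true)
beta_01 = type_ii_rate(n, 0.01, p_true)

print(f"p level 0.05: reject at {crit_05}+, Type II rate = {beta_05:.2f}")
print(f"p level 0.01: reject at {crit_01}+, Type II rate = {beta_01:.2f}")
```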
    What is the difference between interval/ratio and ordinal variables?
    The data that we gather varies from person to person. People are of different ages, have different income levels and prefer some activities to others. We call these things variables simply because their values vary from person to person. Analysis of quantitative data starts by trying to understand what kinds of variables we are dealing with. A person’s age is an example of an interval/ratio variable, because ages are measured in years. We can do a lot of statistical analysis on this kind of variable because the interval (one year) is the same size across the whole range of values in our data-set. Some variables are called ‘dichotomous’, meaning all possible answers fall into one of two categories (male/female, for example). Others are called ‘nominal’: their categories can, literally, only be “named”, as with many types of occupation, and cannot be placed in any order. Finally, we refer to some variables as ‘ordinal’, which means we can place the values in an order of first, second, third and so on, but cannot assume that the gap between the first and second is the same as the gap between the second and third. Of these types, only ordinal and interval/ratio variables can be rank-ordered.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 334,335
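    The distinctions above can be written as a short chain of questions. The sketch below is one way of encoding them; the function name, its parameters and the example category counts are all assumptions made for illustration.

```python
def level_of_measurement(n_categories, can_rank_order=False, equal_distances=False):
    """Classify a variable by asking three questions about its categories."""
    if n_categories <= 2:
        return "dichotomous"      # only two possible answers
    if not can_rank_order:
        return "nominal"          # categories can only be named
    if not equal_distances:
        return "ordinal"          # ordered, but gaps between categories are unknown
    return "interval/ratio"       # ordered, with equal distances between values

# Hypothetical variables and category counts
examples = {
    "gender (two categories)": level_of_measurement(2),
    "occupation": level_of_measurement(12),
    "agreement scale": level_of_measurement(5, can_rank_order=True),
    "age in years": level_of_measurement(80, can_rank_order=True, equal_distances=True),
}

for variable, level in examples.items():
    print(f"{variable}: {level}")
```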
    If there were a perfect positive correlation between two interval/ratio variables, the Pearson’s r test would give a correlation coefficient of:
    A correlation coefficient is a measure of the degree to which two sets of numbers co-relate. If the variables always move in ‘lock-step’ with each other, we call that a ‘perfect’ correlation. Sometimes the variables move in the same direction as each other (a ‘positive’ correlation) and sometimes in opposite directions (a ‘negative’ correlation). Pearson’s r gives a value of +1 when there is a perfect positive correlation between interval/ratio variables and -1 when there is a perfect negative correlation between them.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 341
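    A brief sketch of the two extremes, using invented numbers and a hand-written pearson_r helper (the name is an assumption of this example). One variable rises in exact lock-step with another and a second falls in exact lock-step, so the coefficients come out at the two ends of the scale.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r: co-variation of x and y divided by the product of their spreads."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]                  # hypothetical interval/ratio variable
rising = [2 * h + 3 for h in hours]      # moves in the same direction, in lock-step
falling = [20 - 4 * h for h in hours]    # moves in the opposite direction, in lock-step

r_positive = pearson_r(hours, rising)
r_negative = pearson_r(hours, falling)

print(f"perfect positive: r = {r_positive:+.2f}")   # +1.00
print(f"perfect negative: r = {r_negative:+.2f}")   # -1.00
```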
    A test of statistical significance indicates how confident the researcher is about:
    Tests of statistical significance allow the researcher to estimate how confident they can be that there is a real relationship between the variables they are studying and thus that their results can be generalized from the sample to the target population.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 346, Key concept 15.1
    When might it be appropriate to conduct a multivariate analysis test?
    Multivariate analysis involves the analysis of three or more variables, and tends to be used when we have reason to question the nature of the relationship between two variables. Bryman discusses the three main reasons for doing this analysis on pages 345 and 346. Two variables can, indeed, be related to each other, but perhaps in a more complex way than appears at first sight. For example, the relationship between two variables may be strong only when certain other factors are present. Multivariate analysis enables us to test for many types of cross-relationships between a number of variables at once.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 344,345
    What is the difference between a bar chart and a histogram?
    Histograms are used to display interval/ratio variables, which involve a continuous range of values, and so there are no gaps between the bars that represent each range of values. Bar charts, on the other hand, display nominal or ordinal data, which fall into discrete categories, so their bars are drawn with gaps between them.
    Reference: Bryman: Social Research Methods: 5th Edition Page(s) 337 (Figures 15.2 and 15.3)
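    The difference shows up in how the data are prepared before plotting. In the sketch below (invented data, hypothetical function name), ages are grouped into adjacent intervals that cover the range without gaps, as for a histogram, while the nominal answers are simply counted per category, as for a bar chart.

```python
from collections import Counter

def histogram_counts(values, start, width, n_bins):
    """Count values in n_bins adjacent intervals [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts

# Interval/ratio variable: ages grouped into adjacent ten-year bands
ages = [18, 19, 22, 25, 31, 34, 35, 47]
age_counts = histogram_counts(ages, start=10, width=10, n_bins=4)   # [2, 2, 3, 1]

# Nominal variable: one count per separate category
reasons = ["fitness", "relaxation", "fitness", "weight loss", "fitness"]
reason_counts = Counter(reasons)

for i, count in enumerate(age_counts):
    low = 10 + 10 * i
    print(f"{low}-{low + 9}: {'#' * count}")
for reason, count in reason_counts.items():
    print(f"{reason}: {'#' * count}")
```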