DT840 index

de Vaus, D. A. (1996) Surveys in Social Research, London: UCL Press

Ch 1: The nature of surveys

Any research that collects a structured set of data: cases, variables, and each case's attributes on those variables.

Criticisms:

Ch 2: Theory and social research

11 social research involves constant interplay between observation and explanation. Two modes: theory construction (building theory up from observed facts, also known as "grounded theory") and theory testing (developing a theory first, then testing it against the facts).

Process of theory construction:

Theory testing process:

  1. specify theory
  2. derive set of conceptual propositions
  3. restate conceptual propositions as testable propositions
  4. collect data
  5. analyse data
  6. assess theory

29 good descriptive research important as basis for sound theory.

Ch 3: Formulating and clarifying research questions

Variable - characteristic with more than one category or value. Dependent, independent, intervening.

Decide scale - particular but exhaustive or general but partial. Unit of analysis - e.g. individual, region, organisation.

Decide frame of reference - context - data about other groups, same group over time, etc.

Designs - classic experimental, before-and-after panel, quasi-panel, retrospective panel, cross-sectional.

Ch 4: Developing indicators for concepts

Stages:

Concept - abstract summary of a set of behaviours, attitudes, and characteristics seen as having something in common. A concept's definition is not true or false, only more or less useful. So obtain a range of definitions, decide the most useful, delineate the dimensions of the concept.

Then develop indicators for concept:

54 evaluating:

57 problem of meaning - same responses may have different meanings for different people.

Ch 5: Finding a sample

Probability and non-probability sampling - in a probability sample, each person has an equal chance of being selected.
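A minimal sketch (not from de Vaus) of drawing a simple random sample from a hypothetical sampling frame:

```python
import random

# Hypothetical sampling frame: one entry per member of the population.
frame = [f"person_{i}" for i in range(10_000)]

random.seed(42)                        # reproducible draw
sample = random.sample(frame, k=100)   # every member has the same chance
print(len(sample), sample[:3])
```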

Types of probability sample:

Sample size - the larger the sample, the smaller the sampling error: at the 95% confidence level, a sample of 10,000 gives about 1% error; a sample of 100 gives about 10%.
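Those figures follow from the standard margin-of-error formula for a proportion; a minimal sketch, assuming the worst case p = 0.5:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion; p = 0.5 is the worst case."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6,}: +/- {margin_of_error(n):.1%}")
# n =    100: +/- 9.8%
# n =  1,000: +/- 3.1%
# n = 10,000: +/- 1.0%
```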

Non-responses may cause bias.

Secondary analysis - analysing data that someone else has collected.

Non-probability sampling - used when probability sampling would be too expensive or the population is too dispersed or hard to enumerate, e.g. homosexuals, cannabis users.

Purposive sampling - choose respondents typical of some category. Quota sampling - interviewers are given quotas of people with certain characteristics. Availability sample - sampling whoever is willing - to be used cautiously.

Ch 6: Constructing questionnaires

Aspects:

Content (Dillman 1978):

82 question type:

92 layout:

questionnaire development:

Ch 7: Conducting interviews

Basic methods: face-to-face, mail, telephone, and now online.

Issues:

Ch 8: Overview of analysis

Features which affect how data is analysed:

134 list of methods:

| Univariate methods | Bivariate methods | Multivariate methods |
| --- | --- | --- |
| Frequency distributions | Cross tabulations | Conditional tables |
| | Scattergrams | Partial rank order correlation |
| | Regression | Multiple and partial correlation |
| | Rank order correlation | Multiple and partial regression |
| | Comparison of means | Path analysis |

Ch 9: Univariate analysis

Central tendency and dispersion. Dispersion for nominal data - the variation ratio, the proportion of people not in the modal category.

For inferential statistics, the standard error gives us an estimate for the real population. Probability theory tells us that, with 95% confidence, the population mean will fall within about two standard errors of the sample mean. Standard error = standard deviation divided by the square root of the sample size.
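A minimal sketch of these univariate statistics, on invented data:

```python
import math
import statistics as st

ages = [23, 25, 25, 31, 34, 35, 35, 35, 41, 52]   # hypothetical interval-level data

# Central tendency and dispersion
mode = st.mode(ages)
variation_ratio = 1 - ages.count(mode) / len(ages)  # share outside the modal category
mean, sd = st.mean(ages), st.stdev(ages)

# Standard error and 95% interval estimate (mean +/- ~2 standard errors)
se = sd / math.sqrt(len(ages))
print(f"mean {mean:.1f}, 95% CI {mean - 1.96 * se:.1f} to {mean + 1.96 * se:.1f}")
```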

Selecting statistics for univariate analysis:

| Level of measurement | Central tendency (descriptive) | Dispersion (descriptive) | Inferential |
| --- | --- | --- | --- |
| Nominal | Mode | Variation ratio | Interval estimate using standard error of the binomial |
| Ordinal | Median | Decile range | As above |
| Interval | Mean | Standard deviation or variance | Interval estimate using standard error of the mean |

Ch 10: Bivariate analysis: crosstabulations

155 Crosstabulation consists of:

158 when reading a crosstabulation look for:

164 correlation coefficient - description of character of relation between variables.
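A minimal sketch of a crosstabulation with column percentages on invented data, using pandas rather than anything from the book:

```python
import pandas as pd

# Hypothetical data: does attitude vary by sex?
df = pd.DataFrame({
    "sex":      ["m", "m", "f", "f", "f", "m", "f", "m", "f", "m"],
    "attitude": ["agree", "disagree", "agree", "agree", "disagree",
                 "disagree", "agree", "agree", "agree", "disagree"],
})

# Column percentages: percentage down each category of the independent variable.
print(pd.crosstab(df["attitude"], df["sex"], normalize="columns"))
```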

Ch 11: Bivariate analysis - alternative methods

173 scattergrams - relation between two interval level variables with large numbers of categories:

Pearson's correlation provides a single-figure index of the strength and direction of any linear relationship. When squared it becomes a proportional reduction of error (PRE) measure.

Regression analysis - correlation tells us how strongly two variables are associated (e.g. more education goes with more income); regression tells us how much difference a change in one is likely to make to the other. The regression line serves to predict a Y score from a known X score and to estimate the strength of the association.

Inferential statistics. Tests of significance - p < .05 means that, if there were no relationship in the population, fewer than five samples in a hundred would produce an association this strong by chance. As a rule of thumb, use .05 for smaller samples and .01 for larger ones.
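A minimal sketch tying correlation, regression, and significance together on invented education/income data; scipy's linregress returns all three in one call:

```python
import numpy as np
from scipy.stats import linregress

# Hypothetical data: years of education (X) and income in thousands (Y).
educ   = np.array([10, 12, 12, 14, 16, 16, 18, 20])
income = np.array([22, 25, 27, 30, 35, 33, 40, 45])

res = linregress(educ, income)
print(f"Pearson's r = {res.rvalue:.2f}")                      # strength and direction
print(f"r squared   = {res.rvalue ** 2:.2f}")                 # PRE measure
print(f"Y-hat = {res.intercept:.1f} + {res.slope:.1f} * X")   # regression line
print(f"p = {res.pvalue:.4f}")                                # test of significance
```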

Links between correlations and tests of significance:

| Correlation | Significance | N | Interpretation |
| --- | --- | --- | --- |
| 0.35 | 0.27 | 100 | Moderate association in the sample, but too likely to be due to sampling error. Continue to assume a correlation of 0 in the population. |
| 0.15 | 0.001 | 1500 | Weak association, but very likely to hold in the population. |
| 0.64 | 0.01 | 450 | Strong relationship that is likely to hold in the population. |
| 0.04 | 0.77 | 600 | Negligible association. Highly probable that the correlation differs from zero due only to sampling error. Continue to assume a correlation of 0 in the population. |

Ch 12: Elaborating bivariate relationships

201 multiple statistical controls - if the effect disappears when we control for one or more variables, we conclude that those variables account for, i.e. explain, the original relationship.
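A minimal sketch of such a control on invented data: the original crosstabulation repeated within each category of a hypothetical control variable z (a conditional table):

```python
import pandas as pd

# Hypothetical data: does the X-Y relationship survive a control for Z?
df = pd.DataFrame({
    "x": ["low", "high"] * 8,
    "y": ["no", "yes", "no", "yes", "yes", "yes", "no", "no"] * 2,
    "z": ["a"] * 8 + ["b"] * 8,
})

# Conditional tables: the original crosstabulation within each category of z.
for level, group in df.groupby("z"):
    print(f"z = {level}")
    print(pd.crosstab(group["y"], group["x"], normalize="columns"), "\n")
```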

211 elaborative analysis:

Ch 13: Multivariate analysis

Multivariate techniques allow answers to several different questions:

Partial correlation - partial r indicates how strongly two variables are related once the distorting effect of other variables has been removed (partialled out).

215 partial regression - "b" - how much Y will increase when X increases by one unit, independent of X's links with the other variables. 216 standardised - turns "b" into standard-deviation units, e.g. 0.45 means that when X increases by one standard deviation, Y will increase by 0.45 of a standard deviation.
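A minimal sketch of both ideas with invented numbers: the standard first-order partial-correlation formula, then standardising a b coefficient:

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(partial_r(r_xy=0.50, r_xz=0.40, r_yz=0.60))  # x-y link with z partialled out

# Standardising a regression coefficient: beta = b * (sd of X / sd of Y),
# i.e. the change in Y, in standard deviations, per one-SD change in X.
b, sd_x, sd_y = 1.8, 2.5, 10.0   # hypothetical unstandardised slope and SDs
print(b * sd_x / sd_y)           # 0.45
```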

225 path analysis. These two links explain use of path diagrams and give example diagrams:

Walker's whole set of lectures looks good.

Ch 14: Coding

Steps:

Multiple responses - multiple dichotomy method or multiple response method. For multiple responses to open questions, make the matrix as wide as the biggest number of factors given by any respondent; where people give fewer, put a "missing" code. SPSS deals well with this, but to use more sophisticated statistics, e.g. correlations, you need multiple dichotomy coding.
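A minimal sketch of multiple dichotomy coding with invented categories: each possible answer becomes its own yes/no variable:

```python
# Each respondent listed any number of reasons (an open multiple-response question).
responses = [
    ["cost", "quality"],
    ["quality"],
    ["cost", "location", "quality"],
    [],  # no answer
]

# Multiple dichotomy method: one yes/no (1/0) variable per possible answer,
# which ordinary statistics such as correlations can then use.
categories = ["cost", "quality", "location"]
for r in responses:
    print({c: int(c in r) for c in categories})
```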

For missing data you might want different codes:

Checks:

Ch 15: Building scales

More formal and systematic version of getting to know people:

Summated scaling - adding scores on series of questions.
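A minimal sketch of summated scaling on invented five-point items, including the reverse-coding that negatively worded items need:

```python
# Hypothetical five-point items (1 = strongly disagree ... 5 = strongly agree).
answers  = {"q1": 4, "q2": 2, "q3": 5, "q4": 1}
negative = {"q2", "q4"}   # negatively worded items must be reverse-coded

def item_score(item, value, scale_max=5):
    return (scale_max + 1 - value) if item in negative else value

scale_score = sum(item_score(i, v) for i, v in answers.items())
print(scale_score)  # 4 + 4 + 5 + 5 = 18
```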

Likert scale:

Select best items:

Factor analysis reduces large number of variables to smaller set of underlying variables by testing for intercorrelations. Steps:

Be aware of the GIGO problem - factor analysis finds any correlation, whether causal or not, so it is important to exclude variables that may be causally related to one another when selecting variables for analysis. Do the variables covary *because* they have an underlying factor in common?

Then decide how many factors to extract. Eigenvalues are useful here - an eigenvalue totals the variance in all the included variables that the factor accounts for. An eigenvalue over 1 suggests the factor is worth extracting.
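A minimal sketch of the eigenvalue step on synthetic data: take the eigenvalues of the item intercorrelation matrix and count those over 1:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data matrix: 200 respondents by 6 questionnaire items.
data = rng.normal(size=(200, 6))
data[:, 1] += data[:, 0]   # build in some intercorrelation
data[:, 4] += data[:, 3]

corr = np.corrcoef(data, rowvar=False)        # item intercorrelation matrix
eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # largest first
print(np.round(eigenvalues, 2))
print("factors to extract (eigenvalue > 1):", int((eigenvalues > 1).sum()))
```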

The initial extraction does not clarify which variables belong most clearly to which factor, so use rotation. High-loading variables belong to the factor on which they load; it is unusual to use loadings below .3.

Interpretation is then necessary - having found which variables link empirically, must find conceptual commonality.

Then construct scales:

Complicating issues:

Ch 16: Initial analysis

Data must be prepared for analysis.

Recoding:

Missing data - need to check for bias (e.g. are those who refuse to answer a question on ethnic origin disproportionately from minorities?) and to minimise its effects: minimise data loss, avoid distorting variance and correlations, strive for simplicity.
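A minimal sketch of one such bias check on invented data, comparing refusal rates across groups:

```python
import pandas as pd

# Hypothetical bias check: is refusal on the ethnic-origin question
# associated with another characteristic (here, age group)?
df = pd.DataFrame({
    "age_group": ["18-34", "18-34", "35-54", "35-54", "55+", "55+", "55+", "18-34"],
    "ethnicity": ["white", None, "asian", "white", None, None, "white", "black"],
})

df["refused"] = df["ethnicity"].isna()
print(df.groupby("age_group")["refused"].mean())  # refusal rate by age group
```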

Options for dealing with missing cases outlined by Hertel 1976:

Ch 17: Moving beyond initial analysis

Checking relationships:

301 deviant case analysis - paying particular attention to cases which buck the trend within the sample.

302 the role of ex post facto theorising - can be as useful as hypothesis testing; both have limitations, and each has a role.

Ch 19: Ethics

Two basic approaches - set of rules, or use of judgement with regard to consequences.

Obligation to:

Participation - should be voluntary. de Vaus notes the problem of "official" surveys and attempts to solve it by distinguishing research from bureaucratic form-filling. Doesn't convince me. I think there are some circumstances where the value of the results outweighs the ethical issue of coercion, though I can't think of any non-government survey where that would apply. And in any case coercion has to be explicit - deceit is never permissible.

333 power relationships cause difficulties - any captive audience - universities surveying students, or anybody surveying prisoners. Can't always ensure voluntary participation e.g. when asking questions about someone else, such as household members.

333 informed consent - close cousin of voluntary participation.

334 questions e.g. how much information makes a participant "informed"? When should people be informed - it can be acceptable to inform people afterwards when prior information would affect the answers.

335 research has shown little difference in response rates or content when consent is informed, but response rates drop by 7% if written consent is sought (whether before or after).

337ff issues about anonymity and confidentiality (two different things).

340 analysis and reporting - safeguard against the temptation to concentrate on results that confirm the hypothesis. A good safeguard is to publish the dataset for others to work on. General rule of thumb: aim at "learning from" the research rather than "using" it.