SOC 206 W 2011
First assignment:
Take three variables: one dependent variable (Y) with at
least 5 categories and at least an ordinal level of measurement, and two other
variables (X, Z) with at least an ordinal level of
measurement (dichotomies are OK). Formulate a hypothesis that links Y and X and
present a short explanation (theory) of why they are linked.
1. Present a second hypothesis that explains
the relationship between Y and X in terms of Z.
a. Write out the two hypotheses as equations in
mathematical form.
b. Present them in the form of causal
diagrams.
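A minimal sketch of what part (a) is asking for, assuming both hypotheses are linear. The symbols a, b, c and e (intercept, slopes and error term) are placeholders chosen here, not notation taken from the book:

```latex
\begin{align}
  \text{Hypothesis 1:}\quad Y &= a + bX + e \\
  \text{Hypothesis 2:}\quad Y &= a' + b'X + cZ + e'
\end{align}
```

If Z really accounts for the link between Y and X, one would expect b' to be closer to zero than b. Whether Z comes before X or between X and Y is a matter for your theory and your causal diagram, not for the equation.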
2. First, run frequencies on all the variables and make sure that all missing values are properly coded.
Then create a scatter plot of Y and X.
a. Run the regression of Y on X. Interpret
the results (slope, intercept, R-squared, F and t statistics).
b. Create a 3-D scatter plot of Y, X and
Z.
c. Run the regression of Y on X and Z. Save
the residuals (RES) and the predicted values (PRED). Did the effect of X on Y
change? Does Z have an effect? Interpret the results. Looking at the Betas
(standardized coefficients), which has the stronger impact, X or Z?
d. Run the correlations among Y, X, Z,
RES and PRED. What do you see?
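If you work in Python rather than in a statistics package, steps a, c and d could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses NumPy, the data are simulated stand-ins, and the helper name `ols` is made up here. Replace the simulated arrays with your own cleaned Y, X and Z.

```python
import numpy as np

# Simulated stand-in data (hypothetical): replace with your own Y, X, Z.
rng = np.random.default_rng(206)
n = 200
z = rng.integers(1, 6, size=n).astype(float)            # ordinal, values 1..5
x = np.clip(np.round(0.6 * z + rng.normal(0, 1, n)), 1, 5)
y = np.clip(np.round(0.5 * x + 0.8 * z + rng.normal(0, 1, n)), 1, 7)

def ols(y, *predictors):
    """Least-squares fit with an intercept.

    Returns (coefficients, predicted values, residuals, R-squared).
    """
    design = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    pred = design @ coefs
    res = y - pred
    r2 = 1 - res @ res / np.sum((y - y.mean()) ** 2)
    return coefs, pred, res, r2

# Step a: Y on X. Step c: Y on X and Z, keeping PRED and RES.
b_simple, _, _, r2_simple = ols(y, x)
b_multi, pred, res, r2_multi = ols(y, x, z)

print("Y on X:    intercept, slope  =", b_simple.round(3), " R2 =", round(r2_simple, 3))
print("Y on X, Z: intercept, bX, bZ =", b_multi.round(3), " R2 =", round(r2_multi, 3))

# Step d: correlation matrix, rows and columns in the order Y, X, Z, RES, PRED.
corr = np.corrcoef([y, x, z, res, pred])
print(corr.round(3))
```

Compare the slope of X in the two printed lines to see whether the effect of X changed, and read the last two rows of the correlation matrix to answer the question in step d.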
Take your dependent variable and try to build a multiple regression explaining it with additional independent variables. You will need at least four independent variables. Make sure that there is at least one dummy variable among your independent variables.
Draw a causal diagram.
Give me one argument for non-linearity involving your dependent variable and one of your independent variables (of course, neither can be a dummy). Test it. According to your model, what is the inflection point?
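One common way to test for non-linearity is to add a squared term, and the sketch below assumes that approach. The assignment does not prescribe it. Under that assumption the curve Y = b0 + b1*X + b2*X^2 changes direction where its slope is zero, at X = -b1 / (2*b2), which is taken here to be the point the question asks about. The data are simulated and all variable names are placeholders.

```python
import numpy as np

# Simulated stand-in data (hypothetical); the true curve peaks at X = 6.
rng = np.random.default_rng(2011)
n = 300
x = rng.uniform(0, 12, n)
y = 2 + 3 * x - 0.25 * x**2 + rng.normal(0, 1, n)

# Design matrix: intercept, X, X squared.
design = np.column_stack([np.ones(n), x, x**2])
b, *_ = np.linalg.lstsq(design, y, rcond=None)

# Standard errors and t statistics from the usual OLS formulas.
res = y - design @ b
s2 = res @ res / (n - design.shape[1])
se = np.sqrt(np.diag(s2 * np.linalg.inv(design.T @ design)))
t = b / se

# The slope b1 + 2*b2*X equals zero at X = -b1 / (2*b2).
turning_point = -b[1] / (2 * b[2])

print("coefficients (b0, b1, b2):", b.round(3))
print("t statistics:             ", t.round(2))
print("curve changes direction at X =", round(turning_point, 2))
```

The t statistic of the squared term is the test of non-linearity. Check that the computed point falls inside the observed range of X before interpreting it.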
Give me one argument for an interaction involving your dependent variable and two of your independent variables (if possible, one of the two independent variables should be a dummy). Test it. Write out two equations, one for group 0 of the dummy and another for group 1. If you like, you can check your two equations by running the regression for the two subsets separately.
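A sketch of the two-equation exercise, assuming a model with one continuous variable X, one dummy D and their product X*D. With coefficients b0, bX, bD and bXD, the equation for group 0 is Y = b0 + bX*X and the equation for group 1 is Y = (b0 + bD) + (bX + bXD)*X. The data are simulated and the helper `fit` is a made-up name.

```python
import numpy as np

# Simulated stand-in data (hypothetical).
rng = np.random.default_rng(29)
n = 400
d = rng.integers(0, 2, n).astype(float)     # dummy variable: 0 or 1
x = rng.normal(10, 2, n)
y = 1 + 0.5 * x + 2 * d + 0.3 * x * d + rng.normal(0, 1, n)

def fit(y, *cols):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(y)), *cols])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs

# Pooled model with the interaction term X*D.
b0, bx, bd, bxd = fit(y, x, d, x * d)

# Implied equations for the two groups: (intercept, slope).
group0 = (b0, bx)
group1 = (b0 + bd, bx + bxd)
print(f"group 0: Y = {group0[0]:.3f} + {group0[1]:.3f} * X")
print(f"group 1: Y = {group1[0]:.3f} + {group1[1]:.3f} * X")

# Check: run the regression of Y on X separately within each group.
sep0 = fit(y[d == 0], x[d == 0])
sep1 = fit(y[d == 1], x[d == 1])
print("separate fit, group 0:", sep0.round(3))
print("separate fit, group 1:", sep1.round(3))
```

The separate fits reproduce the implied equations exactly only when the pooled model contains nothing but X, D and X*D. With further independent variables the check is still useful, but the numbers will differ unless those variables are interacted with the dummy as well.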
(In either case, don't worry if you don't find non-linearity/interaction. Interactions and non-linearities are difficult to include in causal diagrams, so don't bother.)
Identify a theoretically meaningful subset of variables and test them as a group in a nested model. Make sure that the two regression models you compare have the same case base! (Hint: use the formula on p. 270 of the book.)
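The sketch below assumes the usual incremental F test for nested models, computed from the two R-squared values. Whether this is the exact formula on p. 270 cannot be confirmed here, so check it against the book. The function names, variable names and numbers are all hypothetical. `complete_cases` illustrates the "same case base" rule: drop every case with a missing value on any variable of the full model before fitting either model.

```python
def complete_cases(rows, variables):
    """Keep only the rows with no missing (None) value on any listed variable."""
    return [r for r in rows if all(r.get(v) is not None for v in variables)]

def incremental_f(r2_full, r2_reduced, n, k_full, k_reduced):
    """F test for the variables added to the reduced model.

    n is the number of cases (identical for both models),
    k_full and k_reduced are the numbers of independent variables.
    Returns (F, df1, df2).
    """
    df1 = k_full - k_reduced
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2

# Hypothetical data: the second case is missing on one full-model variable.
rows = [
    {"y": 3, "x1": 1, "x2": 0, "x3": 4},
    {"y": 5, "x1": 2, "x2": None, "x3": 1},
    {"y": 4, "x1": 2, "x2": 1, "x3": 2},
]
usable = complete_cases(rows, ["y", "x1", "x2", "x3"])
print(len(usable), "cases usable for both models")

# Hypothetical results: 4 variables in the reduced model, 6 in the full one.
f, df1, df2 = incremental_f(r2_full=0.30, r2_reduced=0.25, n=500, k_full=6, k_reduced=4)
print(f"F({df1}, {df2}) = {f:.2f}")
```

Compare the resulting F with the critical value for df1 and df2 degrees of freedom.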
Present your best model and interpret it (metric and standardized coefficients, their significance and R-squared).
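To make the distinction between metric and standardized coefficients concrete, here is a sketch with simulated data and placeholder names. It assumes the standardized coefficient (Beta) of a variable is its metric coefficient multiplied by the ratio of its standard deviation to the standard deviation of Y, which is the usual definition.

```python
import numpy as np

# Simulated stand-in data (hypothetical).
rng = np.random.default_rng(32)
n = 250
x1 = rng.normal(50, 10, n)                   # a variable measured in large units
x2 = rng.integers(0, 2, n).astype(float)     # a dummy variable
y = 5 + 0.1 * x1 + 1.5 * x2 + rng.normal(0, 2, n)

predictors = np.column_stack([x1, x2])
design = np.column_stack([np.ones(n), predictors])

# Metric (unstandardized) coefficients, intercept first.
b, *_ = np.linalg.lstsq(design, y, rcond=None)

# Standard errors, t statistics and R-squared.
res = y - design @ b
df_resid = n - design.shape[1]
s2 = res @ res / df_resid
se = np.sqrt(np.diag(s2 * np.linalg.inv(design.T @ design)))
t = b / se
r2 = 1 - res @ res / np.sum((y - y.mean()) ** 2)

# Standardized coefficients: b_j * sd(x_j) / sd(y). The intercept has none.
beta = b[1:] * predictors.std(axis=0, ddof=1) / y.std(ddof=1)

print("metric b:    ", b.round(3))
print("t statistics:", t.round(2))
print("Beta:        ", beta.round(3))
print("R-squared:   ", round(r2, 3))
```

The metric coefficients depend on the units of each variable, so only the Betas can be compared across variables. Many analysts are wary of interpreting the Beta of a dummy variable, so use it with care.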
What does your model say about the substance of your inquiry?
Present your best multiple regression model using as many independent variables as you need.
Interpret the results.
Hand in your output with your assignment.