[Get it solved] Interpret the statistical output of the GLM procedure (va...

Check Out Our Work & Get Yours Done

Submit Work

Download Sample

Enroll in the complete course for only $250 USD*

Order Now

Submit work Offers

Interpret the statistical output of the GLM procedure (variance derived from MSE, F

computer science

Description

60 scored multiple-choice and short-answer questions.

(Must achieve score of 68 percent correct to pass)
In addition to the 60 scored items, there may be up to five unscored items.
Two hours to complete exam.
Use exam ID A00-240;

Percentage of questions by topic:

ANOVA - 10%
Linear Regression - 20%
Logistic Regression - 25%
Prepare Inputs for Predictive Model Performance - 20%
Measure Model Performance - 25%

ANOVA - 10%

Verify the assumptions of ANOVA

 Explain the central limit theorem and when it must be applied

 Examine the distribution of continuous variables (histogram, box -whisker, Q-Q plots)

 Describe the effect of skewness on the normal distribution

 Define H0, H1, Type I/II error, statistical power, p-value

 Describe the effect of sample size on p-value and power

 Interpret the results of hypothesis testing

 Interpret histograms and normal probability charts

 Draw conclusions about your data from histogram, box-whisker, and Q-Q plots

 Identify the kinds of problems may be present in the data: (biased sample, outliers,

extreme values)

 For a given experiment, verify that the observations are independent

 For a given experiment, verify the errors are normally distributed

 Use the UNIVARIATE procedure to examine residuals

 For a given experiment, verify all groups have equal response variance

 Use the HOVTEST option of MEANS statement in PROC GLM to asses response

variance

Analyze differences between population means using the GLM and TTEST

procedures

 Use the GLM Procedure to perform ANOVA

o CLASS statement

o MODEL statement

o MEANS statement

o OUTPUT statement

 Evaluate the null hypothesis using the output of the GLM procedure

 Interpret the statistical output of the GLM procedure (variance derived from MSE, F

value, p-value R**2, Levene's test)

 Interpret the graphical output of the GLM procedure

 Use the TTEST Procedure to compare means

Perform ANOVA post hoc test to evaluate treatment effect

Exam Content Guide

 Use the LSMEANS statement in the GLM or PLM procedure to perform pairwise

comparisons

 Use PDIFF option of LSMEANS statement

 Use ADJUST option of the LSMEANS statement (TUKEY and DUNNETT)

 Interpret diffograms to evaluate pairwise comparisons

 Interpret control plots to evaluate pairwise comparisons

 Compare/Contrast use of pairwise T-Tests, Tukey and Dunnett comparison methods

Detect and analyze interactions between factors

 Use the GLM procedure to produce reports that will help determine the significance

of the interaction between factors. MODEL statement

 LSMEANS with SLICE=option (Also using PROC PLM)

 ODS SELECT

 Interpret the output of the GLM procedure to identify interaction between factors:

p-value

 F Value

 R Squared

 TYPE I SS

 TYPE III SS

Linear Regression - 20%

Fit a multiple linear regression model using the REG and GLM procedures

 Use the REG procedure to fit a multiple linear regression model

 Use the GLM procedure to fit a multiple linear regression model

Analyze the output of the REG, PLM, and GLM procedures for multiple linear

regression models

 Interpret REG or GLM procedure output for a multiple linear regression model:

convert models to algebraic expressions

 Convert models to algebraic expressions

 Identify missing degrees of freedom

 Identify variance due to model/error, and total variance

 Calculate a missing F value

 Identify variable with largest impact to model

 For output from two models, identify which model is better

 Identify how much of the variation in the dependent variable is explained by the

model

 Conclusions that can be drawn from REG, GLM, or PLM output: (about H0, model

quality, graphics)

Use the REG or GLMSELECT procedure to perform model selection

Exam Content Guide

 Use the SELECTION option of the model statement in the GLMSELECT procedure

 Compare the differentmodel selection methods (STEPWISE, FORWARD, BACKWARD)

 Enable ODS graphics to display graphs from the REG or GLMSELECT procedure

 Identify best models by examining the graphical output (fit criterion from the REG or

GLMSELECT procedure)

 Assign names to models in the REG procedure (multiple model statements)

Assess the validity of a given regression model through the use of diagnostic and

residual analysis

 Explain the assumptions for linear regression

 From a set of residuals plots, asses which assumption about the error terms has

been violated

 Use REG procedure MODEL statement options to identify influential observations

(Student Residuals, Cook's D, DFFITS, DFBETAS)

 Explain options for handling influential observations

 Identify collinearity problems by examining REG procedure output

 Use MODEL statement options to diagnose collinearity problems (VIF, COLLIN,

COLLINOINT)

Logistic Regression - 25%

Perform logistic regression with the LOGISTIC procedure

 Identify experiments that require analysis via logistic regression

 Identify logistic regression assumptions

 logistic regression concepts (log odds, logit transformation, sigmoidal relationship

between p and X)

 Use the LOGISTIC procedure to fit a binary logistic regression model (MODEL and

CLASS statements)

Optimize model performance through input selection

 Use the LOGISTIC procedure to fit a multiple logistic regression model

 LOGISTIC procedure SELECTION=SCORE option

 Perform Model Selection (STEPWISE, FORWARD, BACKWARD) within the LOGISTIC

procedure

Interpret the output of the LOGISTIC procedure

 Interpret the output from the LOGISTIC procedure for binary logistic regression

models: Model Convergence section

 Testing Global Null Hypothesis table

 Type 3 Analysis of Effects table

 Analysis of Maximum Likelihood Estimates table

Exam Content Guide

 Association of Predicted Probabilities and Observed Responses

Score new data sets using the LOGISTIC and PLM procedures

 Use the SCORE statement in the PLM procedure to score new cases

 Use the CODE statement in PROC LOGISTIC to score new data

 Describe when you would use the SCORE statement vs the CODE statement in PROC

LOGISTIC

 Use the INMODEL/OUTMODEL options in PROC LOGISTIC

 Explain how to score new data when you have developed a model from a biased

sample

Prepare Inputs for Predictive Model

Performance - 20%

Identify the potential challenges when preparing input data for a model

 Identify problems that missing values can cause in creating predictive models and

scoring new data sets

 Identify limitations of Complete Case Analysis

 Explain problems caused by categorical variables with numerous levels

 Discuss the problem of redundant variables

 Discuss the problem of irrelevant and redundant variables

 Discuss the non-linearities and the problems they create in predictive models

 Discuss outliers and the problems they create in predictive models

 Describe quasi-complete separation

 Discuss the effect of interactions

 Determine when it is necessary to oversample data

Use the DATA step to manipulate data with loops, arrays, conditional

statements and functions

 Use ARRAYs to create missing indicators

 Use ARRAYS, LOOP, IF, and explicit OUTPUT statements

Improve the predictive power of categorical inputs

 Reduce the number of levels of a categorical variable

 Explain thresholding

 Explain Greenacre's method

 Cluster the levels of a categorical variable via Greenacre's method using the CLUSTER

procedure

o METHOD=WARD option

o FREQ, VAR, ID statement

Exam Content Guide

o Use of ODS output to create an output data set

 Convert categorical variables to continuous using smooth weight of evidence

Screen variables for irrelevance and non-linear association using the CORR

procedure

 Explain how Hoeffding's D and Spearman statistics can be used to find irrelevant

variables and non-linear associations

 Produce Spearman and Hoeffding's D statistic using the CORR procedure (VAR, WITH

statement)

 Interpret a scatter plot of Hoeffding's D and Spearman statistic to identify irrelevant

variables and non-linear associations

Screen variables for non-linearity using empirical logit plots

 Use the RANK procedure to bin continuous input variables (GROUPS=, OUT= option;

VAR, RANK statements)

 Interpret RANK procedure output

 Use the MEANS procedure to calculate the sum and means for the target cases and

total events (NWAY option; CLASS, VAR, OUTPUT statements)

 Create empirical logit plots with the SGPLOT procedure

 Interpret empirical logit plots

Measure Model Performance - 25%

Apply the principles of honest ass

Enroll in the complete course for only $250 USD*

Interpret the statistical output of the GLM procedure (variance derived from MSE, F

computer science

Description

Get instant assignment help service

Related Questions in computer science category

Policy

Exploring

Other

Connect With Us

We Provide Services Across The Globe