class: center, middle, inverse, title-slide

# Before we fit a model

###

---
layout: true

<div class="my-footer"> <span> Dr. Lucy D'Agostino McGowan </span> </div>

---

## Before you fit a model

**Understand the content matter**

--

As a statistician, I collaborate frequently with subject matter experts to ensure that I understand the context of the problem at hand.

--

**Understand the objective**

--

It is crucial to understand what the objectives are. Ideally, these are set a priori; if exploratory analyses are being done, that should be made explicit from beginning to end.

--

**Understand where the data came from**

--

Was this observational or experimental data? Is any data missing? What are the units? Are there data entry issues?

--

🧹 **Get the data into a tidy, analyzable form**

--

Often we get data in a form that is not easily analyzable. In this class, we will be focusing _mostly_ on statistical methodology once the data _is_ in an analyzable format, but just because it is analyzable doesn't mean the analysis choice is obvious.

---

## Before you fit a model

**Determine the appropriate model**

--

In this class we are focusing on _Linear Models_. Linear models are not always appropriate. You must examine your data to determine whether a linear model is a good choice.

---

## Before you fit a model

### Understand the content matter

### Understand the objective

### Understand where the data came from

### 🧹 Get the data into a tidy, analyzable form

### Determine the appropriate model

---

## Is a Linear Model appropriate?

* Outcome variable, `\(y\)`, is **continuous**

--

* Explanatory variable(s), `\(\mathbf{X} = \{X_1, \dots, X_p\}\)`, can take any form

--

* Observations are **independent**

--

* The residuals are **homoscedastic** (equal variance)

--

* The residuals are **normally distributed**

--

* The relationship between the explanatory variables `\(\mathbf{X}\)` and `\(y\)` is linear

---

## What are Linear Models used for?

* Prediction of future outcomes using specific predictors

--

* Assessing the relationship between explanatory variables and the response
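
---

## What are Linear Models used for?

A minimal sketch of both uses, under stated assumptions: the data are simulated with a made-up intercept of 2 and slope of 3, the helper `fit_simple_linear_model` and the value `x_new` are hypothetical names chosen for illustration, and plain Python stands in for whatever software the class uses. It fits a one-predictor model by ordinary least squares, reads the slope as the estimated relationship, predicts at a new value, and takes a rough look at the residual spread.

```python
import random
import statistics


def fit_simple_linear_model(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


# Simulated data (an assumption for illustration, not course data):
# continuous outcome, independent normal errors with constant variance.
rng = random.Random(1)
x = [i / 10 for i in range(100)]
y = [2 + 3 * xi + rng.gauss(0, 1) for xi in x]

b0, b1 = fit_simple_linear_model(x, y)

# Use 1: assess the relationship between the predictor and the response.
print(f"estimated change in y per unit of x: {b1:.2f}")

# Use 2: predict the outcome at a new predictor value (hypothetical x_new).
x_new = 12.0
y_pred = b0 + b1 * x_new
print(f"predicted y at x = {x_new}: {y_pred:.2f}")

# Rough homoscedasticity check: compare the residual spread in the lower
# and upper halves of x. Similar values are consistent with equal variance.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
half = len(x) // 2
sd_low = statistics.stdev(residuals[:half])
sd_high = statistics.stdev(residuals[half:])
print(f"residual SD, low x: {sd_low:.2f}; high x: {sd_high:.2f}")
```

In practice, residual plots give a fuller picture of linearity, equal variance, and normality than the two-number comparison used here.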