R Commands for Econometrics#
Basic Operations#
Data Management#
Load a dataset (CSV):
data <- read.csv("filename.csv")
Save a dataset:
write.csv(data, "filename.csv", row.names = FALSE)
View data:
View(data)
Summarize variables:
summary(data$varname)
Display specific variables:
data[c("varname1", "varname2")]
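The commands above can be tried end to end without any external file. The sketch below assumes the built-in mtcars data and a temporary file path standing in for "filename.csv"; both are illustrative choices, not part of the original commands.

```r
# Round trip on the built-in mtcars data, written to a temporary file
# (the path is a stand-in; substitute your own "filename.csv").
path <- tempfile(fileext = ".csv")
write.csv(mtcars, path, row.names = FALSE)

data <- read.csv(path)

str(data)                     # column types and first values
summary(data$mpg)             # min, quartiles, median, mean
head(data[c("mpg", "wt")])    # first rows of two selected columns
```

Because row.names = FALSE is used, the car names stored as row names in mtcars are not written to the file.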
Data Import/Export#
Import CSV:
data <- read.csv("filename.csv")
Export to CSV:
write.csv(data, "filename.csv", row.names = FALSE)
Variable Management#
Create, Drop, and Rename Variables#
Create new variable:
data$newvar <- expression
Replace variable values:
data$varname <- ifelse(condition, new_value, data$varname)
Drop a variable:
data$varname <- NULL
Rename a variable:
names(data)[names(data) == "oldvar"] <- "newvar"
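A minimal sketch of the four operations on a toy data frame. The column names (id, income, income_k, inc) and the rule that negative incomes are invalid are assumptions made for illustration.

```r
data <- data.frame(id = 1:4, income = c(100, 250, -5, 400))

data$income_k <- data$income / 1000                        # create
data$income   <- ifelse(data$income < 0, NA, data$income)  # replace by condition
data$income_k <- NULL                                      # drop
names(data)[names(data) == "income"] <- "inc"              # rename

print(data)
```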
Label Variables and Values#
Rename factor levels:
data$varname <- factor(data$varname, levels = c(1, 2), labels = c("Label1", "Label2"))
Missing Values#
Count missing values:
sum(is.na(data$varname))
Descriptive Statistics and Data Exploration#
Basic Statistics#
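Mean, standard deviation, etc.:
summary(data$varname) # min, quartiles, median, mean
sd(data$varname, na.rm = TRUE) # standard deviation (not reported by summary())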
Frequency of categorical variables:
table(data$varname)
Cross-tabulation#
Two-way table:
table(data$var1, data$var2)
Add row and column proportions:
prop.table(table(data$var1, data$var2), margin = 1) # row proportions
prop.table(table(data$var1, data$var2), margin = 2) # column proportions
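As a quick check of what margin does, the sketch below cross-tabulates two columns of the built-in mtcars data (an illustrative choice) and stores both sets of proportions.

```r
tab <- table(cyl = mtcars$cyl, am = mtcars$am)

row_prop <- prop.table(tab, margin = 1)  # each row sums to 1
col_prop <- prop.table(tab, margin = 2)  # each column sums to 1

addmargins(tab)       # counts with row and column totals
round(row_prop, 2)
round(col_prop, 2)
```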
Correlation Matrix#
Pairwise correlations:
cor(data[c("var1", "var2", "var3")], use = "complete.obs")
Data Transformations#
Recoding Variables#
Recode values of a variable:
data$varname <- ifelse(data$varname == oldval, newval, data$varname)
Generating Categorical Variables#
Creating dummies (binary variables):
data$newvar <- as.numeric(data$varname == value)
Logarithmic and Other Transformations#
Log of a variable:
data$logvar <- log(data$varname)
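The three transformations in this section can be combined as follows. The toy data frame, the treatment of 9 as a missing-data code, and the masking of non-positive wages are assumptions for illustration only.

```r
data <- data.frame(region = c(1, 2, 9, 1), wage = c(12.5, 0, 30, 8))

# Recode the value 9 to NA (assumed here to be a missing-data code)
data$region <- ifelse(data$region == 9, NA, data$region)

# Dummy equal to 1 when region is 1; NA stays NA
data$region1 <- as.numeric(data$region == 1)

# log(0) is -Inf, so mask non-positive values before keeping the log
data$logwage <- ifelse(data$wage > 0, log(data$wage), NA)

print(data)
```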
Time Series and Panel Data#
Create a time series object:
ts_data <- ts(data$varname, start = c(year, month), frequency = 12)
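A small sketch of a monthly series, assuming made-up values that start in January 2020. It shows how start and frequency determine the dates attached to each observation.

```r
x <- c(5, 7, 6, 9, 8, 10, 12, 11, 13, 15, 14, 16, 18, 17)
ts_data <- ts(x, start = c(2020, 1), frequency = 12)

start(ts_data)   # first period as c(year, month)
end(ts_data)     # last period as c(year, month)

# Extract June to December 2020
sub <- window(ts_data, start = c(2020, 6), end = c(2020, 12))
print(sub)
```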
Set up panel data (using plm package):
library(plm)
pdata <- pdata.frame(data, index = c("id", "time"))
Regressions and Statistical Models#
Basic Regression#
Linear regression:
model <- lm(depvar ~ indepvar1 + indepvar2, data = data)
summary(model)
Robust standard errors (using sandwich and lmtest packages):
library(sandwich)
library(lmtest)
coeftest(model, vcov = vcovHC(model, type = "HC1"))
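To see what the HC1 option computes, here is a base-R sketch of the heteroskedasticity-robust sandwich formula with the n / (n - k) small-sample correction. The mtcars regression is an illustrative assumption; in practice vcovHC() is the more convenient route.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(model)
e <- residuals(model)
n <- nrow(X)
k <- ncol(X)

bread <- solve(crossprod(X))   # (X'X)^-1
meat  <- crossprod(X * e)      # X' diag(e^2) X
vcov_hc1 <- (n / (n - k)) * bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_hc1))
cbind(estimate     = coef(model),
      classical_se = sqrt(diag(vcov(model))),
      robust_se    = robust_se)
```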
Instrumental Variables (IV) Regression#
Two-stage least squares (using ivreg() from the AER package):
library(AER)
model_iv <- ivreg(depvar ~ endogvar + indepvars | instrumentvar + indepvars, data = data)
summary(model_iv)
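The logic behind two-stage least squares can be sketched in base R on simulated data. Everything below (the data-generating process, the variable names, the true coefficient of 2) is an assumption for illustration. The second-stage lm() reproduces the 2SLS point estimate, but its standard errors are not valid, which is one reason to use ivreg() in real work.

```r
set.seed(1)
n <- 5000
z <- rnorm(n)                        # instrument
u <- rnorm(n)                        # unobserved confounder
x <- 0.8 * z + u + rnorm(n)          # endogenous regressor (correlated with u)
y <- 1 + 2 * x + 3 * u + rnorm(n)    # true coefficient on x is 2

ols    <- lm(y ~ x)                  # biased because x is correlated with u
stage1 <- lm(x ~ z)                  # first stage: project x on the instrument
stage2 <- lm(y ~ fitted(stage1))     # second stage: use the projected x

b_ols <- unname(coef(ols)[2])
b_iv  <- unname(coef(stage2)[2])
c(ols = b_ols, iv = b_iv)
```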
Probit and Logit#
Probit model:
model_probit <- glm(depvar ~ indepvars, family = binomial(link = "probit"), data = data)
summary(model_probit)
Logit model:
model_logit <- glm(depvar ~ indepvars, family = binomial(link = "logit"), data = data)
summary(model_logit)
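A short sketch comparing the two links, assuming the built-in mtcars data with the binary variable am as the outcome and wt as the only regressor (illustrative choices). The coefficients are on different scales, but the fitted probabilities are usually very close.

```r
data <- mtcars
model_logit  <- glm(am ~ wt, family = binomial(link = "logit"),  data = data)
model_probit <- glm(am ~ wt, family = binomial(link = "probit"), data = data)

# Predicted probabilities (type = "response" applies the inverse link)
p_logit  <- predict(model_logit,  type = "response")
p_probit <- predict(model_probit, type = "response")

exp(coef(model_logit))    # logit coefficients exponentiate to odds ratios
cor(p_logit, p_probit)    # fitted probabilities from the two links
```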
Panel Data Models#
Random effects model (using plm package):
library(plm)
model_re <- plm(depvar ~ indepvars, data = pdata, model = "random")
summary(model_re)
Fixed effects model:
model_fe <- plm(depvar ~ indepvars, data = pdata, model = "within")
summary(model_fe)
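The "within" estimator subtracts each unit's mean from its observations before running least squares. The base-R sketch below, on a simulated balanced panel (all names and parameter values are assumptions), shows that this gives the same slope as a regression with one dummy per unit.

```r
set.seed(42)
id    <- rep(1:50, each = 6)              # 50 units, 6 periods each
alpha <- rep(rnorm(50), each = 6)         # unit-specific effect
x     <- alpha + rnorm(300)               # regressor correlated with alpha
y     <- 1.5 * x + 2 * alpha + rnorm(300) # true slope is 1.5

# Within transformation: subtract unit means
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
b_within <- unname(coef(lm(y_dm ~ x_dm - 1)))

# Equivalent least-squares dummy-variable (LSDV) regression
b_lsdv <- unname(coef(lm(y ~ x + factor(id)))["x"])

c(within = b_within, lsdv = b_lsdv)
```

The point estimates agree, but the standard errors from the demeaned lm() use the wrong degrees of freedom, which plm() corrects.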
Time Series Models#
Autoregressive model (AR):
arima(data$varname, order = c(1, 0, 0))
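A sketch of fitting an AR(1) model to a simulated series, assuming a true autoregressive coefficient of 0.6 and 500 observations (illustrative values). It also shows how to forecast from the fitted object.

```r
set.seed(123)
x <- arima.sim(model = list(ar = 0.6), n = 500)

fit <- arima(x, order = c(1, 0, 0))
phi_hat <- unname(coef(fit)["ar1"])   # estimated AR coefficient

fc <- predict(fit, n.ahead = 5)       # list with $pred and $se
print(fit)
fc$pred
```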
Vector autoregression (VAR) (using vars package):
library(vars)
model_var <- VAR(data[c("var1", "var2")], p = 2)
summary(model_var)
Post-estimation and Diagnostics#
Predictions#
Generate fitted values:
data$yhat <- predict(model)
Generate residuals:
data$residuals <- residuals(model)
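A brief sketch on the built-in mtcars data (an illustrative choice). Besides in-sample fitted values and residuals, predict() accepts a newdata argument for out-of-sample predictions; the new_cars values below are hypothetical.

```r
data <- mtcars
model <- lm(mpg ~ wt + hp, data = data)

data$yhat      <- predict(model)      # in-sample fitted values
data$residuals <- residuals(model)    # observed minus fitted

# Hypothetical new observations
new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
pred_new <- predict(model, newdata = new_cars, interval = "confidence")
pred_new
```

If the estimation sample drops rows with missing values, predict(model) is shorter than the data frame; fitting with na.action = na.exclude pads the result with NA so the assignment still lines up.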
Model Fit Statistics#
Information criteria (AIC, BIC):
AIC(model)
BIC(model)
Heteroskedasticity Tests#
Breusch-Pagan test (using lmtest package):
library(lmtest)
bptest(model)
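For intuition, the studentized (Koenker) form of the Breusch-Pagan statistic, which is the default in bptest(), can be sketched in base R as n times the R-squared from regressing squared residuals on the regressors. The mtcars model is an assumption for illustration.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

# Auxiliary regression of squared residuals on the same regressors
e2  <- residuals(model)^2
aux <- lm(e2 ~ wt + hp, data = mtcars)

n       <- nobs(model)
bp_df   <- length(coef(aux)) - 1                # number of slope terms
bp_stat <- n * summary(aux)$r.squared           # LM statistic
bp_pval <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

c(statistic = bp_stat, df = bp_df, p.value = bp_pval)
```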
Multicollinearity Diagnostics#
Variance inflation factor (VIF) (using car package):
library(car)
vif(model)
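The VIF of a regressor is 1 / (1 - R²), where R² comes from regressing that regressor on all the others. A base-R sketch, assuming an mtcars model with three numeric regressors:

```r
model <- lm(mpg ~ wt + hp + disp, data = mtcars)
X <- model.matrix(model)[, -1]    # regressors without the intercept

vif_manual <- sapply(colnames(X), function(v) {
  others <- X[, colnames(X) != v]
  r2 <- summary(lm(X[, v] ~ others))$r.squared
  1 / (1 - r2)
})

round(vif_manual, 2)
```

Values well above 1 indicate that a regressor is largely explained by the others.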
Marginal Effects#
Calculate marginal effects:
library(margins)
margins(model)
Summary of marginal effects:
summary(margins(model))
Marginal effects at specific values:
margins(model, at = list(varname = c(value1, value2)))
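In a linear model the marginal effect of a regressor is just its coefficient, so these commands matter most for nonlinear models such as logit. As a rough sketch of the idea, the average marginal effect of a continuous regressor in a logit is the coefficient times the average logistic density at the fitted index. The mtcars model below is an assumption for illustration.

```r
model_logit <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

xb      <- predict(model_logit, type = "link")   # fitted linear index
beta_wt <- unname(coef(model_logit)["wt"])

# Average marginal effect: mean of dP/dwt = dlogis(xb) * beta
ame_wt <- mean(dlogis(xb)) * beta_wt
ame_wt
```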
Hypothesis Testing#
F-test for linear restrictions:
library(car)
linearHypothesis(model, "var1 = var2")
Test joint significance:
linearHypothesis(model, c("var1 = 0", "var2 = 0"))
Wald test:
library(aod)
wald.test(Sigma = vcov(model), b = coef(model), Terms = c(2, 3))
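The Wald statistic for the joint hypothesis that a set of coefficients is zero is b' V⁻¹ b, compared against a chi-squared distribution. A base-R sketch, assuming an mtcars model in which coefficients 2 and 3 are the two slopes:

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

idx <- c(2, 3)                     # positions of the tested coefficients
b   <- coef(model)[idx]
V   <- vcov(model)[idx, idx]

wald_stat <- as.numeric(t(b) %*% solve(V) %*% b)
wald_pval <- pchisq(wald_stat, df = length(idx), lower.tail = FALSE)

# For lm, Wald / q equals the F statistic of the same joint test
f_stat <- unname(summary(model)$fstatistic["value"])
c(wald = wald_stat, wald_over_q = wald_stat / length(idx), f = f_stat)
```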
Linear Combinations and Contrasts#
General linear hypothesis testing:
library(multcomp)
glht(model, linfct = "var1 + var2 = 0")
Custom contrasts:
library(multcomp)
# Columns of K follow coef(model): (Intercept), var1, var2
K <- matrix(c(0, 1, 1), 1) # var1 + var2
glht(model, linfct = K)
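A contrast matrix weights the coefficients in the order returned by coef(model), intercept included. The base-R sketch below, assuming an mtcars model, computes the estimate, standard error, and t-test for the sum of two slopes by hand.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)
names(coef(model))                   # "(Intercept)" "wt" "hp"

K <- matrix(c(0, 1, 1), nrow = 1)    # weights: 0 * intercept + wt + hp

est    <- as.numeric(K %*% coef(model))
se     <- sqrt(as.numeric(K %*% vcov(model) %*% t(K)))
t_stat <- est / se
p_val  <- 2 * pt(abs(t_stat), df = df.residual(model), lower.tail = FALSE)

c(estimate = est, se = se, t = t_stat, p = p_val)
```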
Model Comparison#
Compare nested models:
anova(model1, model2)
Likelihood ratio test:
library(lmtest)
lrtest(model1, model2)
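A sketch of both comparisons in base R, assuming two nested mtcars regressions (model1 is the restricted model). The likelihood ratio statistic is twice the difference in log-likelihoods, with degrees of freedom equal to the number of added parameters.

```r
model1 <- lm(mpg ~ wt, data = mtcars)               # restricted
model2 <- lm(mpg ~ wt + hp + qsec, data = mtcars)   # unrestricted

f_test <- anova(model1, model2)                     # F-test for nested models
f_test

lr_stat <- as.numeric(2 * (logLik(model2) - logLik(model1)))
lr_df   <- attr(logLik(model2), "df") - attr(logLik(model1), "df")
lr_pval <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

c(statistic = lr_stat, df = lr_df, p.value = lr_pval)
```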
Graphs and Visualizations#
Basic Graphs#
Histogram:
hist(data$varname, main = "Histogram", xlab = "Variable")
Scatter plot:
plot(data$var1, data$var2)
Regression Plot#
Add a regression line to scatter plot:
fit <- lm(var2 ~ var1, data = data) # abline() needs a one-regressor fit
plot(data$var1, data$var2)
abline(fit, col = "blue")
Box Plots#
Boxplot:
boxplot(varname ~ groupvar, data = data)
Packages#
Installing Packages#
install.packages("packagename")
Loading Packages#
library(packagename)
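Since install.packages() downloads a package every time it runs, a common pattern is to install only what is missing. The sketch below checks a hypothetical list of package names and leaves the install call commented out.

```r
pkgs <- c("stats", "sandwich", "lmtest", "plm")   # example list; adjust as needed

# requireNamespace() returns TRUE when the package is available
installed    <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

print(missing_pkgs)
# install.packages(missing_pkgs)   # uncomment to install what is missing
```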
Recommended Packages#
stargazer: Create publication-quality tables.
install.packages("stargazer")
library(stargazer)
texreg: Export regression tables in LaTeX, HTML, Word.
install.packages("texreg")
library(texreg)
modelsummary: Modern table output with many formats.
install.packages("modelsummary")
library(modelsummary)
AER: Contains ivreg() for instrumental variables regressions.
install.packages("AER")
library(AER)
plm: Panel data regressions.
install.packages("plm")
library(plm)
fixest: Fast fixed effects estimation.
install.packages("fixest")
library(fixest)
sandwich: Robust standard errors.
install.packages("sandwich")
library(sandwich)
car: Regression diagnostics and multicollinearity tests.
install.packages("car")
library(car)
lmtest: Tests for linear models.
install.packages("lmtest")
library(lmtest)
vars: Vector autoregressive models.
install.packages("vars")
library(vars)
margins: Calculate and visualize marginal effects.
install.packages("margins")
library(margins)
broom: Convert models into tidy data frames for easy visualization.
install.packages("broom")
library(broom)
multcomp: Multiple comparisons and general linear hypotheses.
install.packages("multcomp")
library(multcomp)
aod: Analysis of overdispersed data and additional model tests.
install.packages("aod")
library(aod)
ggplot2: Advanced data visualization.
install.packages("ggplot2")
library(ggplot2)