R Commands for Econometrics#

Basic Operations#

Data Management#

  • Load a dataset (CSV):

    data <- read.csv("filename.csv")
    
  • Save a dataset:

    write.csv(data, "filename.csv", row.names = FALSE)
    
  • View data:

    View(data)
    
  • Summarize variables:

    summary(data$varname)
    
  • Display specific variables:

    data[c("varname1", "varname2")]
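
A minimal sketch of the commands above as one round trip. It assumes the built-in `mtcars` data and a temporary file path; neither comes from the original list.

```r
# Write a built-in dataset to a temporary CSV file, then read it back
path <- tempfile(fileext = ".csv")
write.csv(mtcars, path, row.names = FALSE)  # row names (car models) are not written
data <- read.csv(path)

dim(data)                     # number of rows and columns read back
summary(data$mpg)             # min, quartiles, median, mean, max
head(data[c("mpg", "cyl")])   # first rows of two selected columns
```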
    

Variable Management#

Create, Drop, and Rename Variables#

  • Create new variable:

    data$newvar <- expression
    
  • Replace variable values:

    data$varname <- ifelse(condition, new_value, data$varname)
    
  • Drop a variable:

    data$varname <- NULL
    
  • Rename a variable:

    names(data)[names(data) == "oldvar"] <- "newvar"
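
The four operations above can be chained on a small data frame. The column names (`income`, `age`) and values here are hypothetical, chosen only for illustration.

```r
data <- data.frame(income = c(1200, -50, 3000), age = c(25, 40, 31))

data$income_k <- data$income / 1000                            # create
data$income   <- ifelse(data$income < 0, 0, data$income)       # replace negatives with 0
data$age      <- NULL                                          # drop
names(data)[names(data) == "income_k"] <- "income_thousands"   # rename

names(data)
```

Note that `income_k` is computed before the replacement, so it still reflects the original (negative) value in row 2.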
    

Label Variables and Values#

  • Label values by converting to a factor:

    data$varname <- factor(data$varname, levels = c(1, 2), labels = c("Label1", "Label2"))
    

Missing Values#

  • Count missing values:

    sum(is.na(data$varname))
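
A short sketch combining value labels and missing-value counts. The toy data frame and the labels `"Male"`/`"Female"` are assumptions for illustration.

```r
data <- data.frame(sex = c(1, 2, NA, 2), wage = c(10.5, NA, 8, 12))

# Attach labels to the codes 1 and 2; NA stays NA
data$sex <- factor(data$sex, levels = c(1, 2), labels = c("Male", "Female"))
table(data$sex, useNA = "ifany")   # frequencies, including the missing category

colSums(is.na(data))               # missing count for every column at once
mean(data$wage, na.rm = TRUE)      # most summary functions need na.rm = TRUE
complete <- na.omit(data)          # keep only rows with no missing values
```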
    

Descriptive Statistics and Data Exploration#

Basic Statistics#

  • Mean, standard deviation, etc.:

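    summary(data$varname)            # min, quartiles, median, mean, max
    sd(data$varname, na.rm = TRUE)   # standard deviation (not reported by summary())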
    
  • Frequency of categorical variables:

    table(data$varname)
    

Cross-tabulation#

  • Two-way table:

    table(data$var1, data$var2)
    
  • Add row and column proportions:

    prop.table(table(data$var1, data$var2), margin = 1)  # row proportions
    prop.table(table(data$var1, data$var2), margin = 2)  # column proportions
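
For a concrete case, the same calls on the built-in `mtcars` data (an assumed example dataset), cross-tabulating cylinders against transmission type:

```r
tab <- table(cyl = mtcars$cyl, am = mtcars$am)

row_p <- prop.table(tab, margin = 1)   # each row sums to 1
col_p <- prop.table(tab, margin = 2)   # each column sums to 1

addmargins(tab)     # counts with row and column totals
round(row_p, 2)
```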
    

Correlation Matrix#

  • Pairwise correlations:

    cor(data[c("var1", "var2", "var3")], use = "complete.obs")
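
The `use` argument matters when values are missing. A small sketch with made-up numbers: `"complete.obs"` drops every row that has any `NA`, while `"pairwise.complete.obs"` drops rows separately for each pair of variables, so the two can give different correlations.

```r
data <- data.frame(var1 = c(1, 2, 3, 4, 5),
                   var2 = c(2, 1, 4, 3, NA),
                   var3 = c(5, 3, NA, 1, 2))

c_complete <- cor(data, use = "complete.obs")           # rows 1, 2 and 4 only
c_pairwise <- cor(data, use = "pairwise.complete.obs")  # var1-var2 uses rows 1 to 4

c_complete["var1", "var2"]
c_pairwise["var1", "var2"]
```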
    

Data Transformations#

Recoding Variables#

  • Recode values of a variable:

    data$varname <- ifelse(data$varname == oldval, newval, data$varname)
    

Generating Categorical Variables#

  • Creating dummies (binary variables):

    data$newvar <- as.numeric(data$varname == value)
    

Logarithmic and Other Transformations#

  • Log of a variable:

    data$logvar <- log(data$varname)
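
A sketch of the three transformations on toy data (the variable names `educ` and `wage` are hypothetical). It also shows one thing to watch for: `log(0)` returns `-Inf`, which most estimation functions will not handle gracefully.

```r
data <- data.frame(educ = c(8, 12, 16, 12), wage = c(9, 14, 30, 0))

data$educ    <- ifelse(data$educ == 8, 9, data$educ)   # recode 8 to 9
data$college <- as.numeric(data$educ == 16)            # dummy: 1 if educ is 16
data$logwage <- log(data$wage)                         # log(0) gives -Inf

data$logwage[!is.finite(data$logwage)] <- NA           # treat -Inf as missing
data$log1p_wage <- log1p(data$wage)                    # log(1 + x), defined at 0
```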
    

Time Series and Panel Data#

  • Create a time series object:

    ts_data <- ts(data$varname, start = c(year, month), frequency = 12)
    
  • Set up panel data (using plm package):

    library(plm)
    pdata <- pdata.frame(data, index = c("id", "time"))
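
The `ts()` call above can be tried on simulated monthly data. The start date (January 2020) and the random series are assumptions for illustration.

```r
set.seed(1)
ts_data <- ts(rnorm(24), start = c(2020, 1), frequency = 12)  # 24 monthly values

start(ts_data)                                  # first period as c(year, month)
end(ts_data)                                    # last period as c(year, month)
second_year <- window(ts_data, start = c(2021, 1))   # subset by date
d1 <- diff(ts_data)                             # first difference
```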
    

Regressions and Statistical Models#

Basic Regression#

  • Linear regression:

    model <- lm(depvar ~ indepvar1 + indepvar2, data = data)
    summary(model)
    
  • Robust standard errors (using sandwich and lmtest packages):

    library(sandwich)
    library(lmtest)
    coeftest(model, vcov = vcovHC(model, type = "HC1"))
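
To see what the HC1 correction computes, here is a base-R sketch of the sandwich formula, (X'X)^-1 X' diag(e^2) X (X'X)^-1 scaled by n / (n - k). It uses the built-in `mtcars` data as an assumed example and is meant as an illustration of the formula, not a replacement for `vcovHC()`.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(model)   # n x k design matrix, including the intercept
e <- residuals(model)
n <- nrow(X)
k <- ncol(X)

bread <- solve(crossprod(X))   # (X'X)^-1
meat  <- crossprod(X * e)      # X' diag(e^2) X: row i of X is scaled by e_i
vcov_hc1 <- n / (n - k) * bread %*% meat %*% bread

robust_se <- sqrt(diag(vcov_hc1))
cbind(estimate = coef(model),
      classical_se = sqrt(diag(vcov(model))),
      robust_se = robust_se)
```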
    

Instrumental Variables (IV) Regression#

  • Two-stage least squares (using ivreg() from the AER package):

    library(AER)
    model_iv <- ivreg(depvar ~ endogvar + indepvars | instrumentvar + indepvars, data = data)
    summary(model_iv)
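
The two stages can be written out by hand on simulated data to show where the point estimate comes from. The data-generating process below (true slope 2, instrument `z`, shared error component `u`) is an assumption made for this sketch.

```r
set.seed(42)
n <- 5000
z <- rnorm(n)                   # instrument: affects x, unrelated to the error
u <- rnorm(n)                   # unobserved factor that makes x endogenous
x <- z + u + rnorm(n)
y <- 1 + 2 * x + u + rnorm(n)   # true coefficient on x is 2

ols    <- lm(y ~ x)             # biased upward: x and the error share u
stage1 <- lm(x ~ z)             # first stage
x_hat  <- fitted(stage1)
stage2 <- lm(y ~ x_hat)         # second stage

b_ols  <- unname(coef(ols)["x"])
b_tsls <- unname(coef(stage2)["x_hat"])
c(ols = b_ols, tsls = b_tsls)
```

The second-stage standard errors from `lm()` are not valid because they ignore that `x_hat` is estimated; use `ivreg()` for inference.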
    

Probit and Logit#

  • Probit model:

    model_probit <- glm(depvar ~ indepvars, family = binomial(link = "probit"), data = data)
    summary(model_probit)
    
  • Logit model:

    model_logit <- glm(depvar ~ indepvars, family = binomial(link = "logit"), data = data)
    summary(model_logit)
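
A sketch fitting both models to simulated binary data (the data-generating process, a logit with intercept 0.5 and slope 1, is assumed for illustration). It also shows how to get predicted probabilities rather than the linear index.

```r
set.seed(123)
n <- 2000
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 + x))   # true model is a logit
data <- data.frame(y = y, x = x)

model_logit  <- glm(y ~ x, family = binomial(link = "logit"),  data = data)
model_probit <- glm(y ~ x, family = binomial(link = "probit"), data = data)

p_logit  <- predict(model_logit,  type = "response")   # probabilities in (0, 1)
p_probit <- predict(model_probit, type = "response")

# Coefficients are on different scales, but fitted probabilities are close
rbind(logit = coef(model_logit), probit = coef(model_probit))
max(abs(p_logit - p_probit))
```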
    

Panel Data Models#

  • Random effects model (using plm package):

    library(plm)
    model_re <- plm(depvar ~ indepvars, data = pdata, model = "random")
    summary(model_re)
    
  • Fixed effects model:

    model_fe <- plm(depvar ~ indepvars, data = pdata, model = "within")
    summary(model_fe)
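
The "within" estimator is OLS on data demeaned by unit. A base-R sketch on a simulated panel (50 units, 4 periods, true slope 1.5; all assumed for illustration) shows that demeaning and unit dummies give the same slope:

```r
set.seed(7)
id    <- rep(1:50, each = 4)      # 50 units observed 4 times each
alpha <- rnorm(50)[id]            # unit effect, correlated with x below
x     <- alpha + rnorm(200)
y     <- 1.5 * x + alpha + rnorm(200)

# Within transformation: subtract unit means
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)
b_within <- unname(coef(lm(y_dm ~ x_dm - 1)))

# Least-squares dummy variables: one intercept per unit
b_lsdv <- unname(coef(lm(y ~ x + factor(id)))["x"])

c(within = b_within, lsdv = b_lsdv)
```

The standard errors from the demeaned `lm()` use the wrong degrees of freedom; `plm()` adjusts for the absorbed unit means.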
    

Time Series Models#

  • Autoregressive model (AR):

    arima(data$varname, order = c(1, 0, 0))
    
  • Vector autoregression (VAR) (using vars package):

    library(vars)
    model_var <- VAR(data[c("var1", "var2")], p = 2)
    summary(model_var)
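
A quick check of the AR(1) call on a simulated series with a known coefficient (0.6, an assumption for this sketch), followed by a short forecast:

```r
set.seed(99)
y <- arima.sim(model = list(ar = 0.6), n = 1000)   # AR(1) with coefficient 0.6

fit_ar <- arima(y, order = c(1, 0, 0))
coef(fit_ar)       # "ar1" and "intercept" (the intercept is the series mean)

fc <- predict(fit_ar, n.ahead = 3)
fc$pred            # point forecasts
fc$se              # forecast standard errors
```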
    

Post-estimation and Diagnostics#

Predictions#

  • Generate fitted values:

    data$yhat <- predict(model)  # errors if lm() dropped rows with NAs; refit with na.action = na.exclude
    
  • Generate residuals:

    data$residuals <- residuals(model)
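
A sketch of in-sample and out-of-sample prediction with the built-in `mtcars` data (an assumed example). The `new_cars` data frame is hypothetical; its column names must match the regressors in the model.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

yhat <- predict(model)      # fitted values for the estimation sample
res  <- residuals(model)    # observed minus fitted

# Predictions for new observations, with 95% confidence intervals
new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
pred_new <- predict(model, newdata = new_cars, interval = "confidence")
pred_new
```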
    

Model Fit Statistics#

  • Information criteria (AIC, BIC):

    AIC(model)
    BIC(model)
    

Heteroskedasticity Tests#

  • Breusch-Pagan test (using lmtest package):

    library(lmtest)
    bptest(model)
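
The idea behind the test can be sketched in base R: regress the squared residuals on the regressors and compare n times the R-squared of that auxiliary regression to a chi-squared distribution. This corresponds to the studentized (Koenker) form, which is `bptest()`'s default; the `mtcars` model is an assumed example.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)

# Auxiliary regression of squared residuals on the same regressors
aux_data <- transform(mtcars, e2 = residuals(model)^2)
aux <- lm(e2 ~ wt + hp, data = aux_data)

lm_stat <- nobs(model) * summary(aux)$r.squared    # n * R^2
df      <- length(coef(aux)) - 1                   # number of regressors
p_value <- pchisq(lm_stat, df = df, lower.tail = FALSE)

c(statistic = lm_stat, df = df, p_value = p_value)
```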
    

Multicollinearity Diagnostics#

  • Variance inflation factor (VIF) (using car package):

    library(car)
    vif(model)
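
For numeric regressors, the VIF of a variable is 1 / (1 - R²), where R² comes from regressing that variable on the other regressors. A base-R sketch with `mtcars` (an assumed example):

```r
model <- lm(mpg ~ wt + hp + disp, data = mtcars)
X <- model.matrix(model)[, -1]    # regressors without the intercept column

vifs <- sapply(colnames(X), function(v) {
  others <- X[, colnames(X) != v, drop = FALSE]
  r2 <- summary(lm(X[, v] ~ others))$r.squared   # how well the others explain v
  1 / (1 - r2)
})
vifs

# Equivalent shortcut: diagonal of the inverse correlation matrix
diag(solve(cor(X)))
```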
    

Marginal Effects#

  • Calculate marginal effects:

    library(margins)
    margins(model)
    
  • Summary of marginal effects:

    summary(margins(model))
    
  • Marginal effects at specific values:

    margins(model, at = list(varname = c(value1, value2)))
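
Marginal effects matter most for nonlinear models such as logit, where the coefficient is not the effect on the probability. For a continuous regressor in a logit, the effect on the probability is the coefficient times p(1 - p), and the average marginal effect is its sample mean. A base-R sketch on simulated data (assumed for illustration):

```r
set.seed(123)
n <- 2000
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.5 + x))
model_logit <- glm(y ~ x, family = binomial(link = "logit"))

p      <- fitted(model_logit)          # predicted probabilities
beta_x <- coef(model_logit)["x"]

ame_x <- mean(beta_x * p * (1 - p))    # average marginal effect of x
mem_x <- beta_x * dlogis(sum(coef(model_logit) * c(1, mean(x))))  # effect at the mean of x

c(coefficient = unname(beta_x), ame = ame_x, at_mean = unname(mem_x))
```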
    

Hypothesis Testing#

  • F-test for linear restrictions:

    library(car)
    linearHypothesis(model, "var1 = var2")
    
  • Test joint significance:

    linearHypothesis(model, c("var1 = 0", "var2 = 0"))
    
  • Wald test:

    library(aod)
    wald.test(Sigma = vcov(model), b = coef(model), Terms = c(2, 3))
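
For a linear model, the joint-significance F-test can also be computed from the restricted and unrestricted residual sums of squares. A base-R sketch with `mtcars` (an assumed example), testing that the coefficients on `hp` and `qsec` are both zero:

```r
unrestricted <- lm(mpg ~ wt + hp + qsec, data = mtcars)
restricted   <- lm(mpg ~ wt, data = mtcars)      # imposes hp = 0 and qsec = 0

rss_u <- sum(residuals(unrestricted)^2)
rss_r <- sum(residuals(restricted)^2)
q     <- df.residual(restricted) - df.residual(unrestricted)   # number of restrictions
df_u  <- df.residual(unrestricted)

f_stat  <- ((rss_r - rss_u) / q) / (rss_u / df_u)
p_value <- pf(f_stat, q, df_u, lower.tail = FALSE)

tab <- anova(restricted, unrestricted)   # same test from base R
tab
```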
    

Linear Combinations and Contrasts#

  • General linear hypothesis testing:

    library(multcomp)
    glht(model, linfct = "var1 + var2 = 0")
    
  • Custom contrasts:

    library(multcomp)
    K <- matrix(c(0, 1, 1), nrow = 1)  # columns: (Intercept), var1, var2
    glht(model, linfct = K)
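
What such a test computes can be sketched in base R: the estimate is K times the coefficient vector, and its variance is K V K'. The contrast matrix needs one column per coefficient, in the order returned by `coef()`, including the intercept. The `mtcars` model is an assumed example.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)
names(coef(model))                      # "(Intercept)" "wt" "hp"

K <- matrix(c(0, 1, 1), nrow = 1)       # picks out wt + hp, skips the intercept

est    <- drop(K %*% coef(model))                  # estimate of wt + hp
se     <- sqrt(drop(K %*% vcov(model) %*% t(K)))   # its standard error
t_stat <- est / se
p_value <- 2 * pt(abs(t_stat), df = df.residual(model), lower.tail = FALSE)

c(estimate = est, se = se, t = t_stat, p = p_value)
```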
    

Model Comparison#

  • Compare nested models:

    anova(model1, model2)
    
  • Likelihood ratio test:

    library(lmtest)
    lrtest(model1, model2)
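
The likelihood ratio statistic is twice the difference in log-likelihoods, compared to a chi-squared distribution with degrees of freedom equal to the number of extra parameters. A base-R sketch with two nested `mtcars` models (an assumed example):

```r
model1 <- lm(mpg ~ wt, data = mtcars)        # restricted
model2 <- lm(mpg ~ wt + hp, data = mtcars)   # unrestricted

ll1 <- logLik(model1)
ll2 <- logLik(model2)

lr_stat <- as.numeric(2 * (ll2 - ll1))
df      <- attr(ll2, "df") - attr(ll1, "df")   # extra parameters in model2
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)

c(statistic = lr_stat, df = df, p_value = p_value)
AIC(model1, model2)   # information criteria for the same pair
```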
    

Graphs and Visualizations#

Basic Graphs#

  • Histogram:

    hist(data$varname, main = "Histogram", xlab = "Variable")
    
  • Scatter plot:

    plot(data$var1, data$var2)
    

Regression Plot#

  • Add a regression line to scatter plot:

    fit <- lm(var2 ~ var1, data = data)  # abline() needs a single-regressor fit
    plot(data$var1, data$var2)
    abline(fit, col = "blue")
    

Box Plots#

  • Boxplot:

    boxplot(varname ~ groupvar, data = data)
    

Packages#

Installing Packages#

    install.packages("packagename")

Loading Packages#

    library(packagename)