AGE EMPLOY ADDRESS DEBTINC CREDDEBT OTHDEBT DEFAULTER
1 3 17 12 9.3 11.36 5.01 1
2 1 10 6 17.3 1.36 4.00 0
3 2 15 14 5.5 0.86 2.17 0
4 3 15 14 2.9 2.66 0.82 0
5 1 2 0 17.3 1.79 3.06 1
Quick and Easy Model Tables with tab_model() in R
Introduction:
When working with statistical models, one often needs to present results in a clear and visually appealing manner. The tab_model() function from the sjPlot package offers a powerful tool for creating publication-ready tables of various types of regression models, including linear, generalized linear, mixed-effects, and Bayesian models. This blog post will walk you through the function, its capabilities, and how to use it effectively.
A glance at the dataset:
The dataset includes information about customers who have applied for a bank loan, with the following attributes: Age group, Years at current employer, Years at current address, Debt to income ratio, Credit card debts, Other debts and Loan defaulter status.
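To make the column layout concrete, here is a minimal sketch that types the five sample rows shown at the top of this post into a data frame. The name loan_head is only illustrative, and the leading number on each printed row is assumed to be a row index rather than a variable.

```r
# The five sample rows from the top of this post, entered by hand.
# Assumption: the first number on each printed row is a row index.
loan_head <- data.frame(
  AGE       = c(3, 1, 2, 3, 1),                  # age group (1, 2 or 3)
  EMPLOY    = c(17, 10, 15, 15, 2),              # years at current employer
  ADDRESS   = c(12, 6, 14, 14, 0),               # years at current address
  DEBTINC   = c(9.3, 17.3, 5.5, 2.9, 17.3),      # debt to income ratio
  CREDDEBT  = c(11.36, 1.36, 0.86, 2.66, 1.79),  # credit card debt
  OTHDEBT   = c(5.01, 4.00, 2.17, 0.82, 3.06),   # other debt
  DEFAULTER = c(1, 0, 0, 0, 1)                   # 1 = defaulter, 0 = otherwise
)

str(loan_head)               # column names and types
table(loan_head$DEFAULTER)   # count of non-defaulters and defaulters
```

Five rows are far too few to fit a model; the snippet is only meant to show which column holds which attribute.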
Getting started:
We will begin by installing and loading the required package, sjPlot.
install.packages("sjPlot")
library(sjPlot)
For this example, we will fit a Binary Logistic Regression (BLR) model to the dataset. The dependent variable is DEFAULTER, where 1 indicates that the customer is a defaulter and 0 indicates otherwise. The rest of the variables are the independent variables. Since the column AGE denotes three different age groups, we will convert that column into a factor.
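# df is the data frame holding the loan data described above
df$AGE <- factor(df$AGE)
# Fit the binary logistic regression and store it as "model"
model <- glm(DEFAULTER ~ AGE + EMPLOY + ADDRESS + DEBTINC + CREDDEBT + OTHDEBT, family = binomial, data = df)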
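Since the loan data itself is not included in this post, the following sketch runs the same two steps on a simulated stand-in. Everything about the simulation (the distributions, the coefficients, and the names sim and sim_model) is invented for illustration, so the estimates will not match the tables in this post. The point is only to show how factor() turns AGE into dummy terms in the fitted model.

```r
set.seed(42)
n <- 700  # same number of observations as reported in the tables

# Hypothetical stand-in data with the same column names as the loan dataset
sim <- data.frame(
  AGE      = sample(1:3, n, replace = TRUE),
  EMPLOY   = rpois(n, 8),
  ADDRESS  = rpois(n, 8),
  DEBTINC  = round(runif(n, 1, 30), 1),
  CREDDEBT = round(rexp(n, 1 / 1.5), 2),
  OTHDEBT  = round(rexp(n, 1 / 3), 2)
)

# Invented data-generating process for the 0/1 outcome
eta <- -0.8 - 0.25 * sim$EMPLOY + 0.08 * sim$DEBTINC + 0.5 * sim$CREDDEBT
sim$DEFAULTER <- rbinom(n, 1, plogis(eta))

# Same two steps as above: AGE as a factor, then a binomial glm
sim$AGE <- factor(sim$AGE)
sim_model <- glm(DEFAULTER ~ AGE + EMPLOY + ADDRESS + DEBTINC + CREDDEBT + OTHDEBT,
                 family = binomial, data = sim)

levels(sim$AGE)         # "1" "2" "3"; the first level is the reference group
names(coef(sim_model))  # AGE enters as two dummy terms, AGE2 and AGE3
```

The reference level (age group 1) gets no coefficient of its own, which is why the tables list only AGE [2] and AGE [3].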
Using tab_model():
Calling summary(model) prints the regression output to the console. To create a well-formatted table instead, we will use tab_model(model). The tables produced are suitable for inclusion in documents, presentations, and academic papers, thanks to their clean and customizable format.
tab_model(model)
DEFAULTER | |||
Predictors | Odds Ratios | CI | p |
(Intercept) | 0.45 | 0.27 – 0.76 | 0.003 |
AGE [2] | 1.29 | 0.76 – 2.18 | 0.344 |
AGE [3] | 1.87 | 0.92 – 3.80 | 0.082 |
EMPLOY | 0.77 | 0.72 – 0.82 | <0.001 |
ADDRESS | 0.91 | 0.87 – 0.94 | <0.001 |
DEBTINC | 1.09 | 1.04 – 1.14 | <0.001 |
CREDDEBT | 1.76 | 1.49 – 2.11 | <0.001 |
OTHDEBT | 1.02 | 0.91 – 1.14 | 0.685 |
Observations | 700 | ||
R2 Tjur | 0.339 |
This code generates a table displaying the odds ratios, confidence intervals, and p-values for the logistic regression model. Adding transform = NULL to tab_model() displays the coefficients (log-odds estimates) directly, rather than transforming them to odds ratios.
tab_model(model, transform = NULL)
DEFAULTER | |||
Predictors | Log-Odds | CI | p |
(Intercept) | -0.79 | -1.31 – -0.27 | 0.003 |
AGE [2] | 0.25 | -0.27 – 0.78 | 0.344 |
AGE [3] | 0.63 | -0.08 – 1.33 | 0.082 |
EMPLOY | -0.26 | -0.33 – -0.20 | <0.001 |
ADDRESS | -0.10 | -0.14 – -0.06 | <0.001 |
DEBTINC | 0.09 | 0.04 – 0.13 | <0.001 |
CREDDEBT | 0.56 | 0.40 – 0.75 | <0.001 |
OTHDEBT | 0.02 | -0.09 – 0.13 | 0.685 |
Observations | 700 | ||
R2 Tjur | 0.339 |
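The two tables are linked by exponentiation: an odds ratio is exp() of the corresponding log-odds estimate. As a rough check, the sketch below exponentiates the three-decimal estimates reported in the customized table later in this post. The vector name log_odds is illustrative. ADDRESS is left out because its rounded estimate (-0.100) exponentiates to 0.90 rather than the 0.91 shown above, which looks like a rounding artifact.

```r
# Log-odds estimates copied from the three-decimal table later in this post
log_odds <- c(
  "(Intercept)" = -0.788,
  "AGE [2]"     =  0.252,
  "AGE [3]"     =  0.627,
  EMPLOY        = -0.262,
  DEBTINC       =  0.085,
  CREDDEBT      =  0.563,
  OTHDEBT       =  0.023
)

# Exponentiating a log-odds coefficient gives the odds ratio
odds_ratios <- round(exp(log_odds), 2)
odds_ratios
```

For a fitted model object, the same conversion would be exp(coef(model)).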
Customizing the output:
We can further customize the appearance and content of the tables. Here are some of the arguments we can use:
show.ci: BOOLEAN, whether to show confidence intervals
show.se: BOOLEAN, whether to show standard errors
show.stat: BOOLEAN, whether to show the test statistic
show.p: BOOLEAN, whether to show p-values
show.est: BOOLEAN, whether to show estimates
show.r2: BOOLEAN, whether to show the R-squared value
digits: number of decimal places to display
p.style: style for displaying p-values, either "numeric", "scientific" or "stars" (can also be "numeric_stars" or "scientific_stars")
string.est: name to use for the estimates column
We will now see an example using the arguments mentioned above.
tab_model(model, transform = NULL, show.ci = FALSE, show.se = TRUE, show.stat = TRUE, show.p = TRUE, show.est = TRUE, show.r2 = FALSE, digits = 3, p.style = "scientific_stars", string.est = "Estimate")
DEFAULTER | ||||
Predictors | Estimate | std. Error | Statistic | p |
(Intercept) | -0.788 ** | 0.264 | -2.985 | 2.837e-03 |
AGE [2] | 0.252 | 0.267 | 0.946 | 3.443e-01 |
AGE [3] | 0.627 | 0.361 | 1.739 | 8.201e-02 |
EMPLOY | -0.262 *** | 0.032 | -8.211 | 2.194e-16 |
ADDRESS | -0.100 *** | 0.022 | -4.459 | 8.222e-06 |
DEBTINC | 0.085 *** | 0.022 | 3.845 | 1.205e-04 |
CREDDEBT | 0.563 *** | 0.089 | 6.347 | 2.201e-10 |
OTHDEBT | 0.023 | 0.057 | 0.405 | 6.852e-01 |
Observations | 700 | |||
* p<0.05 ** p<0.01 *** p<0.001 |
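The Statistic and p columns can be roughly reproduced from the other two. The sketch below assumes the statistic is the usual Wald z value for a binomial glm (estimate divided by standard error) with a two-sided normal p-value. The inputs are the rounded numbers from the table above, so the results agree only approximately.

```r
# Estimates and standard errors copied from three rows of the table above
est <- c("(Intercept)" = -0.788, "AGE [3]" = 0.627, CREDDEBT = 0.563)
se  <- c("(Intercept)" =  0.264, "AGE [3]" = 0.361, CREDDEBT = 0.089)

# Wald z statistic: estimate divided by its standard error
z <- est / se

# Two-sided p-value from the standard normal distribution
p <- 2 * pnorm(-abs(z))

round(z, 2)
signif(p, 2)
```

Small differences from the printed values are expected, because the table was computed from unrounded estimates and standard errors.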
Conclusion:
The tab_model() function from the sjPlot package is a versatile tool for creating high-quality tables of regression model results in R. Whether you’re working with linear, mixed-effects, or Bayesian models, tab_model() offers a range of customization options to suit your needs. Explore the various arguments for further customization. Happy plotting!