Type: Package
Encoding: UTF-8
Title: Data Visualization for Statistics in Social Science
Version: 2.9.0
Maintainer: Daniel Lüdecke <d.luedecke@uke.de>
Description: Collection of plotting and table output functions for data visualization. Results of various statistical analyses (that are commonly used in social sciences) can be visualized using this package, including simple and cross tabulated frequencies, histograms, box plots, (generalized) linear models, mixed effects models, principal component analysis and correlation matrices, cluster analyses, scatter plots, stacked scales, effects plots of regression models (including interaction terms) and much more. This package supports labelled data.
License: GPL-3
Depends: R (≥ 4.1)
Imports: graphics, grDevices, stats, utils, bayestestR (≥ 0.16.1), datawizard (≥ 1.1.0), dplyr, ggeffects, ggplot2 (≥ 3.2.0), knitr, insight (≥ 1.3.1), parameters (≥ 0.27.0), performance (≥ 0.15.0), purrr, rlang, scales, sjlabelled (≥ 1.1.2), sjmisc (≥ 2.8.2), sjstats (≥ 0.17.8), tidyr (≥ 1.0.0)
Suggests: brms, car, clubSandwich, cluster, cowplot, effects, haven, GPArotation, ggrepel, glmmTMB, gridExtra, ggridges, httr, lme4, MASS, nFactors, pscl, psych, rmarkdown, rstanarm, sandwich, splines, survey, TMB, testthat
URL: https://strengejacke.github.io/sjPlot/
BugReports: https://github.com/strengejacke/sjPlot/issues
RoxygenNote: 7.3.2
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2025-07-10 18:10:34 UTC; DL
Author: Daniel Lüdecke
Repository: CRAN
Date/Publication: 2025-07-10 19:00:05 UTC
Data Visualization for Statistics in Social Science
Description
Collection of plotting and table output functions for data visualization. Results of various statistical analyses (that are commonly used in social sciences) can be visualized using this package, including simple and cross tabulated frequencies, histograms, box plots, (generalized) linear models, mixed effects models, PCA and correlation matrices, cluster analyses, scatter plots, Likert scales, effects plots of interaction terms in regression models, constructing index or score variables and much more.
The package supports labelled data, i.e. value and variable labels from labelled data (like vectors or data frames) are automatically used to label the output. Custom labels can be specified as well.
What does this package do?
In short, the functions in this package mostly do two things:
- compute basic or advanced statistical analyses
- either plot the results as ggplot-figure or print them as html-table
How does this package help me?
One of the more challenging tasks when working with R is to get nicely formatted output of statistical analyses, either in graphical or table format. The sjPlot-package takes over these tasks and makes it easy to create beautiful figures or tables.
There are many examples for each function in the related help files and a comprehensive online documentation at https://strengejacke.github.io/sjPlot/.
A note on the package functions
The main functions follow a naming convention: each starts with a prefix that indicates what kind of task the function performs.
- sjc: cluster analysis functions
- sjp: plotting functions
- sjt: (HTML) table output functions
Author(s)
Daniel Lüdecke d.luedecke@uke.de
Plot chi-squared distributions
Description
This function plots a simple chi-squared distribution or a chi-squared distribution with shaded areas that indicate at which chi-squared value a significant p-level is reached.
Usage
dist_chisq(
chi2 = NULL,
deg.f = NULL,
p = NULL,
xmax = NULL,
geom.colors = NULL,
geom.alpha = 0.7
)
Arguments
chi2 |
Numeric, optional. If specified, a chi-squared distribution with |
deg.f |
Numeric. The degrees of freedom for the chi-squared distribution. Needs to be specified. |
p |
Numeric, optional. If specified, a chi-squared distribution with |
xmax |
Numeric, optional. Specifies the maximum x-axis-value. If not specified, the x-axis ranges to a value where a p-level of 0.00001 is reached. |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.alpha |
Specifies the alpha-level of the shaded area. Default is 0.7, range between 0 to 1. |
Examples
# a simple chi-squared distribution
# for 6 degrees of freedom
dist_chisq(deg.f = 6)
# a chi-squared distribution for 6 degrees of freedom,
# and a shaded area starting at chi-squared value of ten.
# With a df of 6, a chi-squared value of 12.59 would be "significant",
# thus the shaded area from 10 to 12.58 is filled as "non-significant",
# while the area starting from chi-squared value 12.59 is filled as
# "significant"
dist_chisq(chi2 = 10, deg.f = 6)
# a chi-squared distribution for 6 degrees of freedom,
# and a shaded area starting at that chi-squared value, which has
# a p-level of about 0.125 (which equals a chi-squared value of about 10).
# With a df of 6, a chi-squared value of 12.59 would be "significant",
# thus the shaded area from 10 to 12.58 (p-level 0.125 to p-level 0.05)
# is filled as "non-significant", while the area starting from chi-squared
# value 12.59 (p-level < 0.05) is filled as "significant".
dist_chisq(p = 0.125, deg.f = 6)
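The chi-squared values and p-levels quoted in the comments above can be checked with base R. The following is a minimal sketch that only uses stats::qchisq() and stats::pchisq(); it assumes, in line with the numbers in the comments, that "significant" refers to an upper-tail p-level of 0.05. The variable names are illustrative and not part of the package.

```r
# Degrees of freedom used in the examples above
deg_f <- 6

# Chi-squared value at which an upper-tail p-level of 0.05 is reached
crit <- qchisq(0.05, df = deg_f, lower.tail = FALSE)
round(crit, 2)
# 12.59

# Upper-tail p-level that belongs to a chi-squared value of 10
p_at_10 <- pchisq(10, df = deg_f, lower.tail = FALSE)
round(p_at_10, 3)
# 0.125

# Chi-squared value that belongs to a p-level of 0.125 (close to 10)
chi2_at_p <- qchisq(0.125, df = deg_f, lower.tail = FALSE)
```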
Plot F distributions
Description
This function plots a simple F distribution or an F distribution with shaded areas that indicate at which F value a significant p-level is reached.
Usage
dist_f(
f = NULL,
deg.f1 = NULL,
deg.f2 = NULL,
p = NULL,
xmax = NULL,
geom.colors = NULL,
geom.alpha = 0.7
)
Arguments
f |
Numeric, optional. If specified, an F distribution with |
deg.f1 |
Numeric. The first degrees of freedom for the F distribution. Needs to be specified. |
deg.f2 |
Numeric. The second degrees of freedom for the F distribution. Needs to be specified. |
p |
Numeric, optional. If specified, an F distribution with |
xmax |
Numeric, optional. Specifies the maximum x-axis-value. If not specified, the x-axis ranges to a value where a p-level of 0.00001 is reached. |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.alpha |
Specifies the alpha-level of the shaded area. Default is 0.7, range between 0 to 1. |
Examples
# a simple F distribution for 6 and 45 degrees of freedom
dist_f(deg.f1 = 6, deg.f2 = 45)
# F distribution for 6 and 45 degrees of freedom,
# and a shaded area starting at F value of two.
# F-values equal or greater than 2.31 are "significant"
dist_f(f = 2, deg.f1 = 6, deg.f2 = 45)
# F distribution for 6 and 45 degrees of freedom,
# and a shaded area starting at a p-level of 0.2
# (F-Value about 1.5).
dist_f(p = 0.2, deg.f1 = 6, deg.f2 = 45)
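The F values mentioned in the comments above can be reproduced with base R. This is a small sketch using stats::qf() and stats::pf(), assuming that the "significant" threshold is an upper-tail p-level of 0.05; the variable names are illustrative only.

```r
# Degrees of freedom used in the examples above
deg_f1 <- 6
deg_f2 <- 45

# F value at which an upper-tail p-level of 0.05 is reached (about 2.31)
crit <- qf(0.05, df1 = deg_f1, df2 = deg_f2, lower.tail = FALSE)

# Upper-tail p-level of an F value of 2, which lies below the threshold
p_at_2 <- pf(2, df1 = deg_f1, df2 = deg_f2, lower.tail = FALSE)

# F value that belongs to a p-level of 0.2 (about 1.5)
f_at_p <- qf(0.2, df1 = deg_f1, df2 = deg_f2, lower.tail = FALSE)

round(c(critical = crit, p_at_2 = p_at_2, f_at_p = f_at_p), 3)
```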
Plot normal distributions
Description
This function plots a simple normal distribution or a normal distribution with shaded areas that indicate at which value a significant p-level is reached.
Usage
dist_norm(
norm = NULL,
mean = 0,
sd = 1,
p = NULL,
xmax = NULL,
geom.colors = NULL,
geom.alpha = 0.7
)
Arguments
norm |
Numeric, optional. If specified, a normal distribution with |
mean |
Numeric. Mean value for normal distribution. By default 0. |
sd |
Numeric. Standard deviation for normal distribution. By default 1. |
p |
Numeric, optional. If specified, a normal distribution with |
xmax |
Numeric, optional. Specifies the maximum x-axis-value. If not specified, the x-axis ranges to a value where a p-level of 0.00001 is reached. |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.alpha |
Specifies the alpha-level of the shaded area. Default is 0.7, range between 0 to 1. |
Examples
# a simple normal distribution
dist_norm()
# a simple normal distribution with different mean and sd.
# note that curve looks similar to above plot, but axis range
# has changed.
dist_norm(mean = 2, sd = 4)
# a normal distribution with a shaded area
# starting at a value of one
dist_norm(norm = 1)
# a normal distribution with a shaded area
# starting at a p-level of 0.2
dist_norm(p = 0.2)
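The values behind the shaded areas can be looked up with base R. The sketch below uses stats::qnorm() and stats::pnorm() and assumes, by analogy with the other distribution functions in this package, that the threshold is an upper-tail p-level of 0.05. It also illustrates why changing mean and sd only changes the axis range: quantiles are shifted and scaled, while the shape of the curve stays the same. Variable names are illustrative.

```r
# Value at which an upper-tail p-level of 0.05 is reached (standard normal)
crit <- qnorm(0.05, mean = 0, sd = 1, lower.tail = FALSE)
round(crit, 3)
# 1.645

# Upper-tail p-level of the value 1, as in dist_norm(norm = 1)
p_at_1 <- pnorm(1, lower.tail = FALSE)
round(p_at_1, 3)
# 0.159

# Value that belongs to a p-level of 0.2, as in dist_norm(p = 0.2)
x_at_p <- qnorm(0.2, lower.tail = FALSE)
round(x_at_p, 3)
# 0.842

# With mean = 2 and sd = 4, the same quantile is shifted and scaled
crit_shifted <- qnorm(0.05, mean = 2, sd = 4, lower.tail = FALSE)
all.equal(crit_shifted, 2 + 4 * crit)
# TRUE
```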
Plot t-distributions
Description
This function plots a simple t-distribution or a t-distribution with shaded areas that indicate at which t-value a significant p-level is reached.
Usage
dist_t(
t = NULL,
deg.f = NULL,
p = NULL,
xmax = NULL,
geom.colors = NULL,
geom.alpha = 0.7
)
Arguments
t |
Numeric, optional. If specified, a t-distribution with |
deg.f |
Numeric. The degrees of freedom for the t-distribution. Needs to be specified. |
p |
Numeric, optional. If specified, a t-distribution with |
xmax |
Numeric, optional. Specifies the maximum x-axis-value. If not specified, the x-axis ranges to a value where a p-level of 0.00001 is reached. |
geom.colors |
user defined color for geoms. See 'Details' in |
geom.alpha |
Specifies the alpha-level of the shaded area. Default is 0.7, range between 0 to 1. |
Examples
# a simple t-distribution
# for 6 degrees of freedom
dist_t(deg.f = 6)
# a t-distribution for 6 degrees of freedom,
# and a shaded area starting at t-value of one.
# With a df of 6, a t-value of 1.94 would be "significant".
dist_t(t = 1, deg.f = 6)
# a t-distribution for 6 degrees of freedom,
# and a shaded area starting at p-level of 0.4
# (t-value of about 0.26).
dist_t(p = 0.4, deg.f = 6)
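The t-value of 1.94 quoted above is the one-sided critical value for 6 degrees of freedom. A short base R sketch with stats::qt() and stats::pt() makes this explicit and contrasts it with the two-sided value; that the package uses the one-sided threshold is inferred from the numbers in the comments, not stated in the documentation. Variable names are illustrative.

```r
# Degrees of freedom used in the examples above
deg_f <- 6

# One-sided critical t-value for a p-level of 0.05
crit_one_sided <- qt(0.05, df = deg_f, lower.tail = FALSE)
round(crit_one_sided, 2)
# 1.94

# Two-sided critical t-value for comparison (0.025 in each tail)
crit_two_sided <- qt(0.025, df = deg_f, lower.tail = FALSE)
round(crit_two_sided, 2)
# 2.45

# Upper-tail p-level of a t-value of 1, as in dist_t(t = 1, deg.f = 6)
p_at_1 <- pt(1, df = deg_f, lower.tail = FALSE)

# t-value that belongs to a p-level of 0.4 (about 0.26)
t_at_p <- qt(0.4, df = deg_f, lower.tail = FALSE)
```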
Sample dataset from the EUROFAMCARE project
Description
An SPSS sample data set, imported with the read_spss function.
Plot frequencies of variables
Description
Plot frequencies of a variable as bar graph, histogram, box plot etc.
Usage
plot_frq(
data,
...,
title = "",
weight.by = NULL,
title.wtd.suffix = NULL,
sort.frq = c("none", "asc", "desc"),
type = c("bar", "dot", "histogram", "line", "density", "boxplot", "violin"),
geom.size = NULL,
geom.colors = "#336699",
errorbar.color = "darkred",
axis.title = NULL,
axis.labels = NULL,
xlim = NULL,
ylim = NULL,
wrap.title = 50,
wrap.labels = 20,
grid.breaks = NULL,
expand.grid = FALSE,
show.values = TRUE,
show.n = TRUE,
show.prc = TRUE,
show.axis.values = TRUE,
show.ci = FALSE,
show.na = FALSE,
show.mean = FALSE,
show.mean.val = TRUE,
show.sd = TRUE,
drop.empty = TRUE,
mean.line.type = 2,
mean.line.size = 0.5,
inner.box.width = 0.15,
inner.box.dotsize = 3,
normal.curve = FALSE,
normal.curve.color = "red",
normal.curve.size = 0.8,
normal.curve.alpha = 0.4,
auto.group = NULL,
coord.flip = FALSE,
vjust = "bottom",
hjust = "center",
y.offset = NULL
)
Arguments
data |
A data frame, or a grouped data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
title |
Character vector, used as plot title. By default,
|
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
title.wtd.suffix |
Suffix (as string) for the title, if |
sort.frq |
Determines whether categories should be sorted
according to their frequencies or not. Default is |
type |
Specifies the plot type. May be abbreviated.
|
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.colors |
User defined color for geoms, e.g. |
errorbar.color |
Color of confidence interval bars (error bars).
Only applies to |
axis.title |
Character vector of length one or two (depending on
the plot function and type), used as title(s) for the x and y axis.
If not specified, a default labelling is chosen.
Note: Some plot types do not support this argument. In such
cases, use the return value and add axis titles manually with
|
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
xlim |
Numeric vector of length two, defining lower and upper axis limits
of the x scale. By default, this argument is set to |
ylim |
numeric vector of length two, defining lower and upper axis limits
of the y scale. By default, this argument is set to |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
show.values |
Logical, whether values should be plotted or not. |
show.n |
logical, if |
show.prc |
logical, if |
show.axis.values |
logical, whether category, count or percentage values for the axis should be printed or not. |
show.ci |
Logical, if |
show.na |
logical, if |
show.mean |
Logical, if |
show.mean.val |
Logical, if |
show.sd |
Logical, if |
drop.empty |
Logical, if |
mean.line.type |
Numeric value, indicating the linetype of the mean
intercept line. Only applies to histogram-charts and
when |
mean.line.size |
Numeric, size of the mean intercept line. Only
applies to histogram-charts and when |
inner.box.width |
width of the inner box plot that is plotted inside of violin plots. Only applies
if |
inner.box.dotsize |
size of mean dot inside a violin or box plot. Applies only
when |
normal.curve |
Logical, if |
normal.curve.color |
Color of the normal curve line. Only
applies if |
normal.curve.size |
Numeric, size of the normal curve line. Only
applies if |
normal.curve.alpha |
Transparency level (alpha value) of the normal curve. Only
applies if |
auto.group |
numeric value, indicating the minimum amount of unique values
in the count variable, at which automatic grouping into smaller units
is done (see |
coord.flip |
logical, if |
vjust |
character vector, indicating the vertical position of value
labels. Allowed are same values as for |
hjust |
character vector, indicating the horizontal position of value
labels. Allowed are same values as for |
y.offset |
numeric, offset for text labels when their alignment is adjusted
to the top/bottom of the geom (see |
Value
A ggplot-object.
Note
This function only works with variables with integer values (or numeric factor levels). Scaled or centered variables with a fractional part may result in unexpected behaviour.
Examples
library(sjlabelled)
data(efc)
data(iris)
# simple plots, two different notations
plot_frq(iris, Species)
plot_frq(efc$tot_sc_e)
# boxplot
plot_frq(efc$e17age, type = "box")
if (require("dplyr")) {
# histogram, pipe-workflow
efc |>
dplyr::select(e17age, c160age) |>
plot_frq(type = "hist", show.mean = TRUE)
# bar plot(s)
plot_frq(efc, e42dep, c172code)
}
if (require("dplyr") && require("gridExtra")) {
# grouped data frame, all panels in one plot
efc |>
group_by(e42dep) |>
plot_frq(c161sex) |>
plot_grid()
}
library(sjmisc)
# grouped variable
ageGrp <- group_var(efc$e17age)
ageGrpLab <- group_labels(efc$e17age)
plot_frq(ageGrp, title = get_label(efc$e17age), axis.labels = ageGrpLab)
# plotting confidence intervals. expand grid and v/hjust for text labels
plot_frq(
efc$e15relat, type = "dot", show.ci = TRUE, sort.frq = "desc",
coord.flip = TRUE, expand.grid = TRUE, vjust = "bottom", hjust = "left"
)
# histogram with overlaid normal curve
plot_frq(efc$c160age, type = "h", show.mean = TRUE, show.mean.val = TRUE,
normal.curve = TRUE, show.sd = TRUE, normal.curve.color = "blue",
normal.curve.size = 3, ylim = c(0,50))
Plot grouped proportional tables
Description
Plot grouped proportional crosstables, where the proportion of each level of x for the highest category in y is plotted, for each subgroup of grp.
Usage
plot_gpt(
data,
x,
y,
grp,
colors = "metro",
geom.size = 2.5,
shape.fill.color = "#f0f0f0",
shapes = c(15, 16, 17, 18, 21, 22, 23, 24, 25, 7, 8, 9, 10, 12),
title = NULL,
axis.labels = NULL,
axis.titles = NULL,
legend.title = NULL,
legend.labels = NULL,
wrap.title = 50,
wrap.labels = 15,
wrap.legend.title = 20,
wrap.legend.labels = 20,
axis.lim = NULL,
grid.breaks = NULL,
show.total = TRUE,
annotate.total = TRUE,
show.p = TRUE,
show.n = TRUE
)
Arguments
data |
A data frame, or a grouped data frame. |
x |
Categorical variable, where the proportion of each category in
|
y |
Categorical or numeric variable. If not a binary variable, |
grp |
Grouping variable, which will define the y-axis |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
shape.fill.color |
Optional color vector, fill-color for non-filled shapes |
shapes |
Numeric vector with shape styles, used to map the different
categories of |
title |
Character vector, used as plot title. By default,
|
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
legend.title |
Character vector, used as legend title for plots that have a legend. |
legend.labels |
character vector with labels for the guide/legend. |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
axis.lim |
Numeric vector of length 2, defining the range of the plot axis.
Depending on plot type, may affect either x- or y-axis, or both.
For multiple plot outputs (e.g., from |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
show.total |
Logical, if |
annotate.total |
Logical, if |
show.p |
Logical, adds significance levels to values, or value and variable labels. |
show.n |
logical, if |
Details
The p-values are based on chisq.test of x and y for each grp.
Value
A ggplot-object.
Examples
if (requireNamespace("haven")) {
data(efc)
# the proportion of dependency levels in female
# elderly, for each family carer's relationship
# to elderly
plot_gpt(efc, e42dep, e16sex, e15relat)
# proportion of educational levels in highest
# dependency category of elderly, for different
# care levels
plot_gpt(efc, c172code, e42dep, n4pstu)
}
Arrange list of plots as grid
Description
Plot multiple ggplot-objects as a grid-arranged single plot.
Usage
plot_grid(x, margin = c(1, 1, 1, 1), tags = NULL)
Arguments
x |
A list of ggplot-objects. See 'Details'. |
margin |
A numeric vector of length 4, indicating the top, right, bottom and left margin for each plot, in centimetres. |
tags |
Add tags to your subfigures. Can be |
Details
This function takes a list of ggplot-objects as argument. Plotting functions of this package that produce multiple plot objects (e.g., when there is an argument facet.grid) usually return multiple plots as list (the return value is named plot.list). To arrange these plots as grid as a single plot, use plot_grid.
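Any list of ggplot-objects can be passed to plot_grid(), not only lists returned by functions of this package. The following is a minimal sketch under that assumption; the two scatter plots of the built-in mtcars data and the names plot_list and arranged are purely illustrative. The call is guarded because gridExtra is only a suggested package.

```r
if (requireNamespace("sjPlot", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("gridExtra", quietly = TRUE)) {
  library(ggplot2)

  # two independent ggplot-objects (illustrative data)
  p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
  p2 <- ggplot(mtcars, aes(x = hp, y = mpg)) + geom_point()

  # collect the plots in a list, as plot_grid() expects
  plot_list <- list(p1, p2)

  # arrange both plots as a single grid, with 0.5 cm margins
  # (top, right, bottom, left) around each plot
  arranged <- sjPlot::plot_grid(plot_list, margin = c(0.5, 0.5, 0.5, 0.5))
}
```

The margin vector follows the order given in the 'Arguments' section (top, right, bottom, left, in centimetres).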
Value
An object of class gtable.
Examples
if (require("dplyr") && require("gridExtra")) {
library(ggeffects)
data(efc)
# fit model
fit <- glm(
tot_sc_e ~ c12hour + e17age + e42dep + neg_c_7,
data = efc,
family = poisson
)
# plot marginal effects for each predictor, each as single plot
p1 <- ggpredict(fit, "c12hour") |>
plot(show_y_title = FALSE, show_title = FALSE)
p2 <- ggpredict(fit, "e17age") |>
plot(show_y_title = FALSE, show_title = FALSE)
p3 <- ggpredict(fit, "e42dep") |>
plot(show_y_title = FALSE, show_title = FALSE)
p4 <- ggpredict(fit, "neg_c_7") |>
plot(show_y_title = FALSE, show_title = FALSE)
# plot grid
plot_grid(list(p1, p2, p3, p4))
# plot grid, with tags added to the subfigures
plot_grid(list(p1, p2, p3, p4), tags = TRUE)
}
Plot grouped or stacked frequencies
Description
Plot grouped or stacked frequencies of variables as bar/dot, box or violin plots, or line plot.
Usage
plot_grpfrq(
var.cnt,
var.grp,
type = c("bar", "dot", "line", "boxplot", "violin"),
bar.pos = c("dodge", "stack"),
weight.by = NULL,
intr.var = NULL,
title = "",
title.wtd.suffix = NULL,
legend.title = NULL,
axis.titles = NULL,
axis.labels = NULL,
legend.labels = NULL,
intr.var.labels = NULL,
wrap.title = 50,
wrap.labels = 15,
wrap.legend.title = 20,
wrap.legend.labels = 20,
geom.size = NULL,
geom.spacing = 0.15,
geom.colors = "Paired",
show.values = TRUE,
show.n = TRUE,
show.prc = TRUE,
show.axis.values = TRUE,
show.ci = FALSE,
show.grpcnt = FALSE,
show.legend = TRUE,
show.na = FALSE,
show.summary = FALSE,
drop.empty = TRUE,
auto.group = NULL,
ylim = NULL,
grid.breaks = NULL,
expand.grid = FALSE,
inner.box.width = 0.15,
inner.box.dotsize = 3,
smooth.lines = FALSE,
emph.dots = TRUE,
summary.pos = "r",
facet.grid = FALSE,
coord.flip = FALSE,
y.offset = NULL,
vjust = "bottom",
hjust = "center"
)
Arguments
var.cnt |
Vector of counts, for which frequencies or means will be plotted or printed. |
var.grp |
Factor with the cross-classifying variable, where |
type |
Specifies the plot type. May be abbreviated.
|
bar.pos |
Indicates whether bars should be positioned side-by-side (default),
or stacked ( |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
intr.var |
An interaction variable which can be used for box plots. Divides each category indicated
by |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
title.wtd.suffix |
Suffix (as string) for the title, if |
legend.title |
character vector, used as title for the plot legend. |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
legend.labels |
character vector with labels for the guide/legend. |
intr.var.labels |
a character vector with labels for the x-axis breaks
when having interaction variables included.
These labels replace the |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.spacing |
the spacing between geoms (i.e. bar spacing) |
geom.colors |
user defined color for geoms. See 'Details' in |
show.values |
Logical, whether values should be plotted or not. |
show.n |
logical, if |
show.prc |
logical, if |
show.axis.values |
logical, whether category, count or percentage values for the axis should be printed or not. |
show.ci |
Logical, if |
show.grpcnt |
logical, if |
show.legend |
logical, if |
show.na |
logical, if |
show.summary |
logical, if |
drop.empty |
Logical, if |
auto.group |
numeric value, indicating the minimum amount of unique values
in the count variable, at which automatic grouping into smaller units
is done (see |
ylim |
numeric vector of length two, defining lower and upper axis limits
of the y scale. By default, this argument is set to |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
inner.box.width |
width of the inner box plot that is plotted inside of violin plots. Only applies
if |
inner.box.dotsize |
size of mean dot inside a violin or box plot. Applies only
when |
smooth.lines |
prints a smooth line curve. Only applies, when argument |
emph.dots |
logical, if |
summary.pos |
position of the model summary which is printed when |
facet.grid |
|
coord.flip |
logical, if |
y.offset |
numeric, offset for text labels when their alignment is adjusted
to the top/bottom of the geom (see |
vjust |
character vector, indicating the vertical position of value
labels. Allowed are same values as for |
hjust |
character vector, indicating the horizontal position of value
labels. Allowed are same values as for |
Details
geom.colors may be a character vector of color values in hex-format, valid color value names (see demo("colors")) or a name of a color brewer palette.
Following options are valid for the geom.colors argument:
- If not specified, a default color brewer palette will be used, which is suitable for the plot style (i.e. diverging for likert scales, qualitative for grouped bars etc.).
- If "gs", a greyscale will be used.
- If "bw", and plot-type is a line-plot, the plot is black/white and uses different line types to distinguish groups (see this package-vignette).
- If geom.colors is any valid color brewer palette name, the related palette will be used. Use RColorBrewer::display.brewer.all() to view all available palette names.
- Else specify own color values or names as vector (e.g. geom.colors = c("#f00000", "#00ff00")).
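The options for geom.colors can be tried side by side. The following sketch is based only on the description above and on the efc sample data shipped with the package; the palette name "Set2" is just one example of a valid color brewer palette, and the object names are illustrative. Since e16sex is used as grouping variable, two colors are assumed to be sufficient for the custom color vector.

```r
if (requireNamespace("sjPlot", quietly = TRUE)) {
  # load the sample data without attaching the package
  data(efc, package = "sjPlot")

  # greyscale
  p_grey <- sjPlot::plot_grpfrq(efc$e42dep, efc$e16sex, geom.colors = "gs")

  # a color brewer palette, referenced by name
  p_brewer <- sjPlot::plot_grpfrq(efc$e42dep, efc$e16sex, geom.colors = "Set2")

  # own color values, one per group
  p_custom <- sjPlot::plot_grpfrq(
    efc$e42dep, efc$e16sex,
    geom.colors = c("#f00000", "#00ff00")
  )
}
```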
Value
A ggplot-object.
Examples
data(efc)
plot_grpfrq(efc$e17age, efc$e16sex, show.values = FALSE)
# boxplot
plot_grpfrq(efc$e17age, efc$e42dep, type = "box")
# grouped bars
plot_grpfrq(efc$e42dep, efc$e16sex, title = NULL)
# box plots with interaction variable
plot_grpfrq(efc$e17age, efc$e42dep, intr.var = efc$e16sex, type = "box")
# Grouped bar plot
plot_grpfrq(efc$neg_c_7, efc$e42dep, show.values = FALSE)
# same data as line plot
plot_grpfrq(efc$neg_c_7, efc$e42dep, type = "line")
# show only categories where we have data (i.e. drop zero-counts)
library(dplyr)
efc <- dplyr::filter(efc, e42dep %in% c(3,4))
plot_grpfrq(efc$c161sex, efc$e42dep, drop.empty = TRUE)
# show all categories, even if not in data
plot_grpfrq(efc$c161sex, efc$e42dep, drop.empty = FALSE)
Plot likert scales as centered stacked bars
Description
Plot likert scales as centered stacked bars.
Usage
plot_likert(
items,
groups = NULL,
groups.titles = "auto",
title = NULL,
legend.title = NULL,
legend.labels = NULL,
axis.titles = NULL,
axis.labels = NULL,
catcount = NULL,
cat.neutral = NULL,
sort.frq = NULL,
weight.by = NULL,
title.wtd.suffix = NULL,
wrap.title = 50,
wrap.labels = 30,
wrap.legend.title = 30,
wrap.legend.labels = 28,
geom.size = 0.6,
geom.colors = "BrBG",
cat.neutral.color = "grey70",
intercept.line.color = "grey50",
reverse.colors = FALSE,
values = "show",
show.n = TRUE,
show.legend = TRUE,
show.prc.sign = FALSE,
grid.range = 1,
grid.breaks = 0.2,
expand.grid = TRUE,
digits = 1,
reverse.scale = FALSE,
coord.flip = TRUE,
sort.groups = TRUE,
legend.pos = "bottom",
rel_heights = 1,
group.legend.options = list(nrow = NULL, byrow = TRUE),
cowplot.options = list(label_x = 0.01, hjust = 0, align = "v")
)
Arguments
items |
Data frame, or a grouped data frame, with each column representing one item. |
groups |
(optional) Must be a vector of same length as |
groups.titles |
(optional, only used if groups are supplied) Titles for each factor group that will be used as table caption for each
component-table. Must be a character vector of same length as |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
legend.title |
character vector, used as title for the plot legend. |
legend.labels |
character vector with labels for the guide/legend. |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
catcount |
optional, amount of categories of |
cat.neutral |
If there's a neutral category (like "don't know" etc.), specify
the index number (value) for this category. Else, set |
sort.frq |
Indicates whether the items of
|
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
title.wtd.suffix |
Suffix (as string) for the title, if |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.colors |
user defined color for geoms. See 'Details' in |
cat.neutral.color |
Color of the neutral category, if plotted (see |
intercept.line.color |
Color of the vertical intercept line that divides positive and negative values. |
reverse.colors |
logical, if |
values |
Determines style and position of percentage value labels on the bars:
|
show.n |
logical, if |
show.legend |
logical, if |
show.prc.sign |
logical, if |
grid.range |
Numeric, limits of the x-axis-range, as proportion of 100.
Default is 1, so the x-scale ranges from zero to 100% on both sides from the center.
Can alternatively be supplied as a vector of 2 positive numbers (e.g. |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
reverse.scale |
logical, if |
coord.flip |
logical, if |
sort.groups |
(optional, only used if groups are supplied) logical, if groups should be sorted according to the values supplied to |
legend.pos |
(optional, only used if groups are supplied) Defines the legend position. Possible values are |
rel_heights |
(optional, only used if groups are supplied) This option can be used to adjust the height of the subplots. The bars in subplots can have different heights due to a differing number of items or due to legend placement. This can be adjusted here. Takes a vector of numbers, one for each plot. Values are evaluated relative to each other. |
group.legend.options |
(optional, only used if groups are supplied) List of options to be passed to |
cowplot.options |
(optional, only used if groups are supplied) List of label options to be passed to |
Value
A ggplot-object.
Note
Note that only even numbers of categories can be plotted, so that the "positive" and "negative" values can be split into two halves. A neutral category (like "don't know") can be used, but must be indicated by cat.neutral.
The catcount-argument indicates how many item categories are in the Likert scale. Normally, this argument can be ignored because the number of valid categories is retrieved automatically. However, sometimes (for instance, if a certain category is missing in all items), auto-detection of the number of categories fails. In such cases, specify the number of categories with the catcount-argument.
Examples
if (requireNamespace("ggrepel") && requireNamespace("sjmisc")) {
library(sjmisc)
data(efc)
# find all variables from COPE-Index, which all have a "cop" in their
# variable name, and then plot that subset as likert-plot
mydf <- find_var(efc, pattern = "cop", out = "df")
plot_likert(mydf)
plot_likert(
mydf,
grid.range = c(1.2, 1.4),
expand.grid = FALSE,
values = "sum.outside",
show.prc.sign = TRUE
)
# Plot in groups
plot_likert(mydf, c(2,1,1,1,1,2,2,2,1))
if (require("parameters") && require("nFactors")) {
groups <- parameters::principal_components(mydf)
plot_likert(mydf, groups = parameters::closest_component(groups))
}
plot_likert(mydf,
c(rep("B", 4), rep("A", 5)),
sort.groups = FALSE,
grid.range = c(0.9, 1.1),
geom.colors = "RdBu",
rel_heights = c(6, 8),
wrap.labels = 40,
reverse.scale = TRUE)
# control legend items
six_cat_example = data.frame(
matrix(sample(1:6, 600, replace = TRUE),
ncol = 6)
)
## Not run:
six_cat_example <-
six_cat_example |>
dplyr::mutate_all(~ordered(.,labels = c("+++","++","+","-","--","---")))
# Old default
plot_likert(
six_cat_example,
groups = c(1, 1, 1, 2, 2, 2),
group.legend.options = list(nrow = 2, byrow = FALSE)
)
# New default
plot_likert(six_cat_example, groups = c(1, 1, 1, 2, 2, 2))
# Single row
plot_likert(
six_cat_example,
groups = c(1, 1, 1, 2, 2, 2),
group.legend.options = list(nrow = 1)
)
## End(Not run)
}
Plot regression models
Description
plot_model()
creates plots from regression models, either
estimates (as so-called forest or dot whisker plots) or marginal effects.
Usage
plot_model(
model,
type = c("est", "re", "eff", "emm", "pred", "int", "std", "std2", "slope", "resid",
"diag"),
transform,
terms = NULL,
sort.est = NULL,
rm.terms = NULL,
group.terms = NULL,
order.terms = NULL,
pred.type = c("fe", "re"),
mdrt.values = c("minmax", "meansd", "zeromax", "quart", "all"),
ri.nr = NULL,
title = NULL,
axis.title = NULL,
axis.labels = NULL,
legend.title = NULL,
wrap.title = 50,
wrap.labels = 25,
axis.lim = NULL,
grid.breaks = NULL,
ci.lvl = NULL,
se = NULL,
vcov.fun = NULL,
vcov.args = NULL,
colors = "Set1",
show.intercept = FALSE,
show.values = FALSE,
show.p = TRUE,
show.data = FALSE,
show.legend = TRUE,
show.zeroinf = TRUE,
value.offset = NULL,
value.size,
jitter = NULL,
digits = 2,
dot.size = NULL,
line.size = NULL,
vline.color = NULL,
p.threshold = c(0.05, 0.01, 0.001),
p.val = NULL,
p.adjust = NULL,
grid,
case,
auto.label = TRUE,
prefix.labels = c("none", "varname", "label"),
bpe = "median",
bpe.style = "line",
bpe.color = "white",
ci.style = c("whisker", "bar"),
std.response = TRUE,
...
)
get_model_data(
model,
type = c("est", "re", "eff", "pred", "int", "std", "std2", "slope", "resid", "diag"),
transform,
terms = NULL,
sort.est = NULL,
rm.terms = NULL,
group.terms = NULL,
order.terms = NULL,
pred.type = c("fe", "re"),
ri.nr = NULL,
ci.lvl = NULL,
colors = "Set1",
grid,
case = "parsed",
digits = 2,
...
)
Arguments
model |
A regression model object. Depending on the |
type |
Type of plot. There are three groups of plot-types:
Marginal Effects (related vignette)
Model diagnostics
Note: For mixed models, the diagnostic plots, like the check for a linear relationship or for homoscedasticity, do not take the uncertainty of random effects into account, but are based only on the fixed effects part of the model. |
transform |
A character vector, naming a function that will be applied
on estimates and confidence intervals. By default, |
terms |
Character vector with the names of those terms from
|
sort.est |
Determines in which way estimates are sorted in the plot:
|
rm.terms |
Character vector with names that indicate which terms should
be removed from the plot. Counterpart to |
group.terms |
Numeric vector with group indices, to group coefficients. Each group of coefficients gets its own color (see 'Examples'). |
order.terms |
Numeric vector, indicating in which order the coefficients should be plotted. See examples in this package-vignette. |
pred.type |
Character, only applies for Marginal Effects plots
with mixed effects models. Indicates whether predicted values should be
conditioned on random effects ( |
mdrt.values |
Indicates which values of the moderator variable should be
used when plotting interaction terms (i.e.
|
ri.nr |
Numeric vector. If |
title |
Character vector, used as plot title. By default,
|
axis.title |
Character vector of length one or two (depending on the
plot function and type), used as title(s) for the x and y axis. If not
specified, a default labelling is chosen. Note: Some plot types
may not support this argument sufficiently. In such cases, use the returned
ggplot-object and add axis titles manually with
|
axis.labels |
Character vector with labels for the model terms, used as
axis labels. By default, |
legend.title |
Character vector, used as legend title for plots that have a legend. |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
axis.lim |
Numeric vector of length 2, defining the range of the plot
axis. Depending on plot-type, may affect either x- or y-axis. For
Marginal Effects plots, |
grid.breaks |
Numeric value or vector; if |
ci.lvl |
Numeric, the level of the confidence intervals (error bars).
Use |
se |
Logical, if |
vcov.fun |
Variance-covariance matrix used to compute uncertainty
estimates (e.g., for robust standard errors). This argument accepts a
covariance matrix, a function which returns a covariance matrix, or a
string which identifies the function to be used to compute the covariance
matrix. See |
vcov.args |
List of arguments to be passed to the function identified by
the |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
show.intercept |
Logical, if |
show.values |
Logical, whether values should be plotted or not. |
show.p |
Logical, adds asterisks that indicate the significance level of estimates to the value labels. |
show.data |
Logical, for Marginal Effects plots, also plots the raw data points. |
show.legend |
For Marginal Effects plots, shows or hides the legend. |
show.zeroinf |
Logical, if |
value.offset |
Numeric, offset for text labels to adjust their position relative to the dots or lines. |
value.size |
Numeric, indicates the size of value labels. Can be used
for all plot types where the argument |
jitter |
Numeric, between 0 and 1. If |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
dot.size |
Numeric, size of the dots that indicate the point estimates. |
line.size |
Numeric, size of the lines that indicate the error bars. |
vline.color |
Color of the vertical "zero effect" line. Default color is inherited from the current theme. |
p.threshold |
Numeric vector of length 3, indicating the threshold for
annotating p-values with asterisks. Only applies if
|
p.val |
Character specifying method to be used to calculate p-values. Defaults to "profile" for glm/polr models, otherwise "wald". |
p.adjust |
String value, if not |
grid |
Logical, if |
case |
Desired target case. Labels will automatically be converted into the
specified character case. See |
auto.label |
Logical, if |
prefix.labels |
Indicates whether the value labels of categorical variables
should be prefixed, e.g. with the variable name or variable label. See
argument |
bpe |
For Stan-models (fitted with the rstanarm- or
brms-package), the Bayesian point estimate is, by default, the median
of the posterior distribution. Use |
bpe.style |
For Stan-models (fitted with the rstanarm- or
brms-package), the Bayesian point estimate is indicated as a small,
vertical line by default. Use |
bpe.color |
Character vector, indicating the color of the Bayesian
point estimate. Setting |
ci.style |
Character vector, defining whether inner and outer intervals
for Bayesian models are shown in boxplot-style ( |
std.response |
Logical, whether the response variable will also be
standardized if standardized coefficients are requested. Setting both
|
... |
Other arguments, passed down to various functions. Here is a list of supported arguments and their description in detail.
|
Details
Different Plot Types
type = "std"
Plots standardized estimates. See details below.
type = "std2"
Plots standardized estimates, however, standardization follows Gelman's (2008) suggestion, rescaling the estimates by dividing them by two standard deviations instead of just one. Resulting coefficients are then directly comparable for untransformed binary predictors.
type = "pred"
Plots estimated marginal means (or marginal effects). Simply wraps ggpredict. See also this package-vignette.
type = "eff"
Plots estimated marginal means (or marginal effects). Simply wraps ggeffect. See also this package-vignette.
type = "int"
A shortcut for marginal effects plots, where interaction terms are automatically detected and used as terms-argument. Furthermore, if the moderator variable (the second - and third - term in an interaction) is continuous, type = "int" automatically chooses useful values based on the mdrt.values-argument, which are passed to terms. Then, ggpredict is called. type = "int" plots the interaction term that appears first in the formula along the x-axis, while the second (and possibly third) variable in an interaction is used as grouping factor(s) (moderating variable). Use type = "pred" or type = "eff" and specify a certain order in the terms-argument to indicate which variable(s) should be used as moderator. See also this package-vignette.
type = "slope" and type = "resid"
Simple diagnostic-plots, where a linear model for each single predictor is plotted against the response variable, or the model's residuals. Additionally, a loess-smoothed line is added to the plot. The main purpose of these plots is to check whether the relationship between outcome (or residuals) and a predictor is roughly linear or not. Since the plots are based on a simple linear regression with only one model predictor at the moment, the slopes (i.e. coefficients) may differ from the coefficients of the complete model.
type = "diag"
For Stan-models, plots the prior versus posterior samples. For linear (mixed) models, plots for multicollinearity-check (Variance Inflation Factors), QQ-plots, checks for normal distribution of residuals and homoscedasticity (constant variance of residuals) are shown. For generalized linear mixed models, returns the QQ-plot for random effects.
Standardized Estimates
Default standardization is done by completely refitting the model on the
standardized data. Hence, this approach is equal to standardizing the
variables before fitting the model, which is particularly recommended for
complex models that include interactions or transformations (e.g.,
polynomial or spline terms). When type = "std2", standardization of
estimates follows Gelman's (2008) suggestion, rescaling the estimates by
dividing them by two standard deviations instead of just one. Resulting
coefficients are then directly comparable for untransformed binary
predictors.
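The two standardization variants can be sketched in base R. This is an illustration of the arithmetic only, not the package's implementation. The mtcars data and model are arbitrary choices, and leaving the response unscaled in the two-SD variant is an assumption of this sketch.

```r
# Illustrative data; mtcars ships with base R.
d <- mtcars[, c("mpg", "wt", "hp")]

# Raw model on the original scale
m_raw <- lm(mpg ~ wt + hp, data = d)

# "Refit" standardization: z-score all variables, then fit again
z <- as.data.frame(scale(d))
m_std <- lm(mpg ~ wt + hp, data = z)

# Two-SD rescaling of the predictors (Gelman 2008); the response is
# left on its original scale here, which is an assumption of this sketch
d2 <- d
d2$wt <- (d$wt - mean(d$wt)) / (2 * sd(d$wt))
d2$hp <- (d$hp - mean(d$hp)) / (2 * sd(d$hp))
m_std2 <- lm(mpg ~ wt + hp, data = d2)

# For a model without interactions or transformations, both variants
# are plain rescalings of the raw coefficient
b_raw <- coef(m_raw)[["wt"]]
check_std <- all.equal(coef(m_std)[["wt"]], b_raw * sd(d$wt) / sd(d$mpg))
check_std2 <- all.equal(coef(m_std2)[["wt"]], b_raw * 2 * sd(d$wt))
c(refit = isTRUE(check_std), two_sd = isTRUE(check_std2))
```

Once interactions or polynomial terms enter the model, this simple rescaling of coefficients no longer matches a refit on standardized data, which is the case the paragraph above recommends refitting for.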
Value
Depending on the plot-type, plot_model() returns a ggplot-object or a list
of such objects. get_model_data() returns the data associated with the
plot-object as a tidy data frame, or (depending on the plot-type) a list of
such data frames.
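A brief sketch of how the two return values might be used, assuming sjPlot is installed. The model and the plot title are illustrative choices, not taken from the package documentation.

```r
if (requireNamespace("sjPlot", quietly = TRUE)) {
  # Illustrative model on a base R dataset
  m <- lm(mpg ~ wt + hp, data = mtcars)

  # The return value is a regular ggplot object, so ggplot2 layers and
  # labels can be added afterwards
  p <- sjPlot::plot_model(m)
  p <- p + ggplot2::labs(title = "Fuel consumption model (illustrative title)")

  # The data behind the default coefficient plot, as a data frame
  est <- sjPlot::get_model_data(m, type = "est")
  print(head(est))
}
```

Working with the data frame directly can be handy when the estimates should go into a custom plot or a table rather than the default forest plot.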
References
Gelman A (2008) "Scaling regression inputs by dividing by two
standard deviations." Statistics in Medicine 27: 2865-2873.
Aiken and West (1991). Multiple Regression: Testing and Interpreting Interactions.
Examples
# prepare data
if (requireNamespace("haven")) {
library(sjmisc)
data(efc)
efc <- to_factor(efc, c161sex, e42dep, c172code)
m <- lm(neg_c_7 ~ pos_v_4 + c12hour + e42dep + c172code, data = efc)
# simple forest plot
plot_model(m)
# grouped coefficients
plot_model(m, group.terms = c(1, 2, 3, 3, 3, 4, 4))
# keep only selected terms in the model: pos_v_4, the
# levels 3 and 4 of factor e42dep and levels 2 and 3 for c172code
plot_model(m, terms = c("pos_v_4", "e42dep [3,4]", "c172code [2,3]"))
}
# multiple plots, as returned from "diagnostic"-plot type,
# can be arranged with 'plot_grid()'
## Not run:
p <- plot_model(m, type = "diag")
plot_grid(p)
## End(Not run)
# plot random effects
if (require("lme4") && require("glmmTMB")) {
m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
plot_model(m, type = "re")
# plot marginal effects
plot_model(m, type = "pred", terms = "Days")
}
# plot interactions
## Not run:
m <- glm(
tot_sc_e ~ c161sex + c172code * neg_c_7,
data = efc,
family = poisson()
)
# type = "int" automatically selects groups for continuous moderator
# variables - see argument 'mdrt.values'. The following function call is
# identical to:
# plot_model(m, type = "pred", terms = c("c172code", "neg_c_7 [7,28]"))
plot_model(m, type = "int")
# switch moderator
plot_model(m, type = "pred", terms = c("neg_c_7", "c172code"))
# same as
# ggeffects::ggpredict(m, terms = c("neg_c_7", "c172code"))
## End(Not run)
# plot Stan-model
## Not run:
if (require("rstanarm")) {
data(mtcars)
m <- stan_glm(mpg ~ wt + am + cyl + gear, data = mtcars, chains = 1)
plot_model(m, bpe.style = "dot")
}
## End(Not run)
Forest plot of multiple regression models
Description
Plot and compare regression coefficients with confidence intervals of multiple regression models in one plot.
Usage
plot_models(
...,
transform = NULL,
std.est = NULL,
std.response = TRUE,
rm.terms = NULL,
title = NULL,
m.labels = NULL,
legend.title = "Dependent Variables",
legend.pval.title = "p-level",
axis.labels = NULL,
axis.title = NULL,
axis.lim = NULL,
wrap.title = 50,
wrap.labels = 25,
wrap.legend.title = 20,
grid.breaks = NULL,
dot.size = 3,
line.size = NULL,
value.size = NULL,
spacing = 0.4,
colors = "Set1",
show.values = FALSE,
show.legend = TRUE,
show.intercept = FALSE,
show.p = TRUE,
p.shape = FALSE,
p.threshold = c(0.05, 0.01, 0.001),
p.adjust = NULL,
ci.lvl = 0.95,
vcov.fun = NULL,
vcov.args = NULL,
vline.color = NULL,
digits = 2,
grid = FALSE,
auto.label = TRUE,
prefix.labels = c("none", "varname", "label")
)
Arguments
... |
One or more regression models, including glm's or mixed models.
May also be a |
transform |
A character vector, naming a function that will be applied
on estimates and confidence intervals. By default, |
std.est |
Choose whether standardized coefficients should be used
for plotting. Default is no standardization ( |
std.response |
Logical, whether the response variable will also be
standardized if standardized coefficients are requested. Setting both
|
rm.terms |
Character vector with names that indicate which terms should
be removed from the plot. Counterpart to |
title |
Character vector, used as plot title. By default,
|
m.labels |
Character vector, used to indicate the different models in the plot's legend. If not specified, the labels of the dependent variables for each model are used. |
legend.title |
Character vector, used as legend title for plots that have a legend. |
legend.pval.title |
Character vector, used as title of the plot legend that
indicates the p-values. Default is |
axis.labels |
Character vector with labels for the model terms, used as
axis labels. By default, |
axis.title |
Character vector of length one or two (depending on the
plot function and type), used as title(s) for the x and y axis. If not
specified, a default labelling is chosen. Note: Some plot types
may not support this argument sufficiently. In such cases, use the returned
ggplot-object and add axis titles manually with
|
axis.lim |
Numeric vector of length 2, defining the range of the plot
axis. Depending on plot-type, may affect either x- or y-axis. For
Marginal Effects plots, |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
grid.breaks |
Numeric value or vector; if |
dot.size |
Numeric, size of the dots that indicate the point estimates. |
line.size |
Numeric, size of the lines that indicate the error bars. |
value.size |
Numeric, indicates the size of value labels. Can be used
for all plot types where the argument |
spacing |
Numeric, spacing between the dots and error bars of the plotted fitted models. Default is 0.4. |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
show.values |
Logical, whether values should be plotted or not. |
show.legend |
For Marginal Effects plots, shows or hides the legend. |
show.intercept |
Logical, if |
show.p |
Logical, adds asterisks that indicate the significance level of estimates to the value labels. |
p.shape |
Logical, if |
p.threshold |
Numeric vector of length 3, indicating the threshold for
annotating p-values with asterisks. Only applies if
|
p.adjust |
String value, if not |
ci.lvl |
Numeric, the level of the confidence intervals (error bars).
Use |
vcov.fun |
Variance-covariance matrix used to compute uncertainty
estimates (e.g., for robust standard errors). This argument accepts a
covariance matrix, a function which returns a covariance matrix, or a
string which identifies the function to be used to compute the covariance
matrix. See |
vcov.args |
List of arguments to be passed to the function identified by
the |
vline.color |
Color of the vertical "zero effect" line. Default color is inherited from the current theme. |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
grid |
Logical, if |
auto.label |
Logical, if |
prefix.labels |
Indicates whether the value labels of categorical variables
should be prefixed, e.g. with the variable name or variable label. See
argument |
Value
A ggplot-object.
Examples
data(efc)
# fit three models
fit1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc)
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + c172code, data = efc)
fit3 <- lm(tot_sc_e ~ c160age + c12hour + c161sex + c172code, data = efc)
# plot multiple models
plot_models(fit1, fit2, fit3, grid = TRUE)
# plot multiple models with legend labels and
# point shapes instead of value labels
plot_models(
fit1, fit2, fit3,
axis.labels = c(
"Carer's Age", "Hours of Care", "Carer's Sex", "Educational Status"
),
m.labels = c("Barthel Index", "Negative Impact", "Services used"),
show.values = FALSE, show.p = FALSE, p.shape = TRUE
)
## Not run:
# plot multiple models from nested lists argument
all.models <- list()
all.models[[1]] <- fit1
all.models[[2]] <- fit2
all.models[[3]] <- fit3
plot_models(all.models)
# plot multiple models with different predictors (stepwise inclusion),
# standardized estimates
fit1 <- lm(mpg ~ wt + cyl + disp + gear, data = mtcars)
fit2 <- update(fit1, . ~ . + hp)
fit3 <- update(fit2, . ~ . + am)
plot_models(fit1, fit2, fit3, std.est = "std2")
## End(Not run)
Plot predicted values and their residuals
Description
This function plots observed and predicted values of the response of linear (mixed) models for each coefficient and highlights the observed values according to their distance (residuals) to the predicted values. This makes it possible to investigate how well actual and predicted values of the outcome fit across the predictor variables.
Usage
plot_residuals(
fit,
geom.size = 2,
remove.estimates = NULL,
show.lines = TRUE,
show.resid = TRUE,
show.pred = TRUE,
show.ci = FALSE
)
Arguments
fit |
Fitted linear (mixed) regression model (including objects of class
|
geom.size |
Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
remove.estimates |
Numeric vector with indices (order equals to row index of |
show.lines |
Logical, if |
show.resid |
Logical, if |
show.pred |
Logical, if |
show.ci |
Logical, if |
Value
A ggplot-object.
Note
The actual (observed) values have a coloured fill, while the predicted values have a solid outline without filling.
Examples
data(efc)
# fit model
fit <- lm(neg_c_7 ~ c12hour + e17age + e42dep, data = efc)
# plot residuals for all independent variables
plot_residuals(fit)
# remove some independent variables from output
plot_residuals(fit, remove.estimates = c("e17age", "e42dep"))
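The quantities shown in such a plot can be reproduced with base R. The sketch below uses mtcars as stand-in data (an assumption for illustration) and only computes the values; it does not draw the plot.

```r
# Illustrative model on a base R dataset
fit <- lm(mpg ~ wt + hp, data = mtcars)

observed <- mtcars$mpg
predicted <- unname(fitted(fit))
resid_vals <- unname(residuals(fit))

# A residual is the distance between observed and predicted response
stopifnot(isTRUE(all.equal(observed - predicted, resid_vals)))

# One row per observation, here shown against the predictor "wt"
res_df <- data.frame(
  wt = mtcars$wt,
  observed = observed,
  predicted = predicted,
  residual = resid_vals
)
head(res_df[order(-abs(res_df$residual)), ])
```

Sorting by absolute residual lists the observations that would be highlighted most strongly, since they lie furthest from their predicted values.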
Plot (grouped) scatter plots
Description
Display scatter plot of two variables. Adding a grouping variable to the scatter plot is possible. Furthermore, fitted lines can be added for each group as well as for the overall plot.
Usage
plot_scatter(
data,
x,
y,
grp,
title = "",
legend.title = NULL,
legend.labels = NULL,
dot.labels = NULL,
axis.titles = NULL,
dot.size = 1.5,
label.size = 3,
colors = "metro",
fit.line = NULL,
fit.grps = NULL,
show.rug = FALSE,
show.legend = TRUE,
show.ci = FALSE,
wrap.title = 50,
wrap.legend.title = 20,
wrap.legend.labels = 20,
jitter = 0.05,
emph.dots = FALSE,
grid = FALSE
)
Arguments
data |
A data frame, or a grouped data frame. |
x |
Name of the variable for the x-axis. |
y |
Name of the variable for the y-axis. |
grp |
Optional, name of the grouping-variable. If not missing, the scatter plot will be grouped. See 'Examples'. |
title |
Character vector, used as plot title. By default,
|
legend.title |
Character vector, used as legend title for plots that have a legend. |
legend.labels |
character vector with labels for the guide/legend. |
dot.labels |
Character vector with names for each coordinate pair given
by |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
dot.size |
Numeric, size of the dots that indicate the point estimates. |
label.size |
Size of text labels if argument |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
fit.line , fit.grps |
Specifies the method to add a fitted line across
the data points. Possible values are for instance |
show.rug |
Logical, if |
show.legend |
For Marginal Effects plots, shows or hides the legend. |
show.ci |
Logical, if |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
jitter |
Numeric, between 0 and 1. If |
emph.dots |
Logical, if |
grid |
Logical, if |
Value
A ggplot-object. For grouped data frames, a list of ggplot-objects for each group in the data.
Examples
# load sample data
library(sjmisc)
library(sjlabelled)
data(efc)
# simple scatter plot
plot_scatter(efc, e16sex, neg_c_7)
# simple scatter plot, increased jittering
plot_scatter(efc, e16sex, neg_c_7, jitter = .4)
# grouped scatter plot
plot_scatter(efc, c160age, e17age, e42dep)
# grouped scatter plot with marginal rug plot
# and add fitted line for complete data
plot_scatter(
efc, c12hour, c160age, c172code,
show.rug = TRUE, fit.line = "lm"
)
# grouped scatter plot with marginal rug plot
# and add fitted line for each group
plot_scatter(
efc, c12hour, c160age, c172code,
show.rug = TRUE, fit.grps = "loess",
grid = TRUE
)
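The difference between a single fitted line for the complete data and one fitted line per group can be sketched with base R. This only mirrors the idea behind the fit.line and fit.grps arguments using linear fits. The mtcars data and the grouping by cyl are assumptions for illustration, and this is not how the package draws the lines.

```r
d <- mtcars

# One linear fit for the complete data (the idea behind fit.line)
overall <- coef(lm(mpg ~ wt, data = d))

# One linear fit per group (the idea behind fit.grps),
# here grouped by number of cylinders
per_group <- sapply(
  split(d, d$cyl),
  function(g) coef(lm(mpg ~ wt, data = g))
)

overall
per_group
```

Each column of per_group holds the intercept and slope for one group, so diverging slopes across columns indicate that a single overall line would hide group differences.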
Plot stacked proportional bars
Description
Plot items (variables) of a scale as stacked proportional bars. This function is useful when several items with identical scale/categories should be plotted to compare the distribution of answers.
Usage
plot_stackfrq(
items,
title = NULL,
legend.title = NULL,
legend.labels = NULL,
axis.titles = NULL,
axis.labels = NULL,
weight.by = NULL,
sort.frq = NULL,
wrap.title = 50,
wrap.labels = 30,
wrap.legend.title = 30,
wrap.legend.labels = 28,
geom.size = 0.5,
geom.colors = "Blues",
show.prc = TRUE,
show.n = FALSE,
show.total = TRUE,
show.axis.prc = TRUE,
show.legend = TRUE,
grid.breaks = 0.2,
expand.grid = FALSE,
digits = 1,
vjust = "center",
coord.flip = TRUE
)
Arguments
items |
Data frame, or a grouped data frame, with each column representing one item. |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
legend.title |
character vector, used as title for the plot legend. |
legend.labels |
character vector with labels for the guide/legend. |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
sort.frq |
Indicates whether the
|
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
geom.size |
Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.colors |
user defined color for geoms. See 'Details' in |
show.prc |
Logical, whether percentage values should be plotted or not. |
show.n |
Logical, whether count values should be plotted or not. |
show.total |
logical, if |
show.axis.prc |
Logical, if |
show.legend |
logical, if |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
digits |
Numeric, amount of digits after decimal point when rounding estimates or values. |
vjust |
character vector, indicating the vertical position of value
labels. Allowed are same values as for |
coord.flip |
logical, if |
Value
A ggplot-object.
Examples
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
data(efc)
# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c82cop1")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c90cop9")
# auto-detection of labels
plot_stackfrq(efc[, start:end])
# works on grouped data frames as well
library(dplyr)
efc |>
group_by(c161sex) |>
select(start:end) |>
plot_stackfrq()
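The numbers behind stacked proportional bars are per-item category shares. A minimal base R sketch with made-up answers (the three items and their values are hypothetical) shows that each bar's segments add up to 1:

```r
# Hypothetical answers to three items on the same four-point scale
items <- data.frame(
  q1 = c(1, 1, 2, 3, 4, 4, 4, 2),
  q2 = c(2, 2, 2, 3, 3, 4, 1, 1),
  q3 = c(4, 4, 4, 4, 3, 2, 1, 1)
)
cats <- 1:4

# Share of each category per item; fixing the factor levels keeps
# categories that happen to be unobserved in an item
props <- sapply(
  items,
  function(x) prop.table(table(factor(x, levels = cats)))
)

round(props, 3)
colSums(props)
```

Because every column sums to 1, the bars have equal total length and differ only in how that length is divided among the categories.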
Plot contingency tables
Description
Plot proportional crosstables (contingency tables) of two variables as ggplot diagram.
Usage
plot_xtab(
x,
grp,
type = c("bar", "line"),
margin = c("col", "cell", "row"),
bar.pos = c("dodge", "stack"),
title = "",
title.wtd.suffix = NULL,
axis.titles = NULL,
axis.labels = NULL,
legend.title = NULL,
legend.labels = NULL,
weight.by = NULL,
rev.order = FALSE,
show.values = TRUE,
show.n = TRUE,
show.prc = TRUE,
show.total = TRUE,
show.legend = TRUE,
show.summary = FALSE,
summary.pos = "r",
drop.empty = TRUE,
string.total = "Total",
wrap.title = 50,
wrap.labels = 15,
wrap.legend.title = 20,
wrap.legend.labels = 20,
geom.size = 0.7,
geom.spacing = 0.1,
geom.colors = "Paired",
dot.size = 3,
smooth.lines = FALSE,
grid.breaks = 0.2,
expand.grid = FALSE,
ylim = NULL,
vjust = "bottom",
hjust = "center",
y.offset = NULL,
coord.flip = FALSE
)
Arguments
x |
A vector of values (variable) describing the bars which make up the plot. |
grp |
Grouping variable of same length as |
type |
Plot type. May be either |
margin |
Indicates which data of the proportional table should be plotted. Use |
bar.pos |
Indicates whether bars should be positioned side-by-side (default),
or stacked ( |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
title.wtd.suffix |
Suffix (as string) for the title, if |
axis.titles |
character vector of length one or two, defining the title(s) for the x-axis and y-axis. |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
legend.title |
character vector, used as title for the plot legend. |
legend.labels |
character vector with labels for the guide/legend. |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
rev.order |
Logical, if |
show.values |
Logical, whether values should be plotted or not. |
show.n |
logical, if |
show.prc |
logical, if |
show.total |
When |
show.legend |
logical, if |
show.summary |
logical, if |
summary.pos |
position of the model summary which is printed when |
drop.empty |
Logical, if |
string.total |
String for the legend label when a total-column is added. Only applies
if |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
wrap.legend.title |
numeric, determines how many chars of the legend's title are displayed in one line and when a line break is inserted. |
wrap.legend.labels |
numeric, determines how many chars of the legend labels are displayed in one line and when a line break is inserted. |
geom.size |
Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
geom.spacing |
the spacing between geoms (i.e. bar spacing) |
geom.colors |
user defined color for geoms. See 'Details' in |
dot.size |
Dot size, only applies, when argument |
smooth.lines |
prints a smooth line curve. Only applies, when argument |
grid.breaks |
numeric; sets the distance between breaks for the axis,
i.e. at every |
expand.grid |
logical, if |
ylim |
numeric vector of length two, defining lower and upper axis limits
of the y scale. By default, this argument is set to |
vjust |
character vector, indicating the vertical position of value
labels. Allowed are same values as for |
hjust |
character vector, indicating the horizontal position of value
labels. Allowed are same values as for |
y.offset |
numeric, offset for text labels when their alignment is adjusted
to the top/bottom of the geom (see |
coord.flip |
logical, if |
Value
A ggplot-object.
Examples
# create 4-category-items
grp <- sample(1:4, 100, replace = TRUE)
# create 3-category-items
x <- sample(1:3, 100, replace = TRUE)
# plot "cross tabulation" of x and grp
plot_xtab(x, grp)
# plot "cross tabulation" of x and grp, including labels
plot_xtab(x, grp, axis.labels = c("low", "mid", "high"),
legend.labels = c("Grp 1", "Grp 2", "Grp 3", "Grp 4"))
# plot "cross tabulation" of x and grp
# as stacked proportional bars
plot_xtab(x, grp, margin = "row", bar.pos = "stack",
show.summary = TRUE, coord.flip = TRUE)
# example with vertical labels
library(sjmisc)
library(sjlabelled)
data(efc)
sjPlot::set_theme(geom.label.angle = 90)
plot_xtab(efc$e42dep, efc$e16sex, vjust = "center", hjust = "bottom")
# grouped bars with EUROFAMCARE sample dataset
# dataset was imported from an SPSS-file,
# see ?sjmisc::read_spss
data(efc)
efc.val <- get_labels(efc)
efc.var <- get_label(efc)
plot_xtab(efc$e42dep, efc$e16sex, title = efc.var['e42dep'],
axis.labels = efc.val[['e42dep']], legend.title = efc.var['e16sex'],
legend.labels = efc.val[['e16sex']])
plot_xtab(efc$e16sex, efc$e42dep, title = efc.var['e16sex'],
axis.labels = efc.val[['e16sex']], legend.title = efc.var['e42dep'],
legend.labels = efc.val[['e42dep']])
# -------------------------------
# auto-detection of labels works here
# so no need to specify labels. For
# title-auto-detection, use NULL
# -------------------------------
plot_xtab(efc$e16sex, efc$e42dep, title = NULL)
plot_xtab(efc$e16sex, efc$e42dep, margin = "row",
bar.pos = "stack", coord.flip = TRUE)
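The three kinds of proportions a contingency table offers (cell, row and column percentages) can be computed with base R's prop.table(). The vectors below are made up for illustration, and how exactly the margin argument of plot_xtab() maps onto these is not asserted here.

```r
# Hypothetical variable and grouping variable of the same length
x <- c(1, 1, 2, 2, 3, 3, 1, 2)
grp <- c(1, 2, 1, 2, 1, 2, 1, 1)

tab <- table(x, grp)

cell_prop <- prop.table(tab)     # shares of the grand total
row_prop <- prop.table(tab, 1)   # each row sums to 1
col_prop <- prop.table(tab, 2)   # each column sums to 1

round(cell_prop, 3)
round(row_prop, 3)
round(col_prop, 3)
```

Which of the three is appropriate depends on the comparison of interest: row and column proportions condition on one variable, while cell proportions describe the joint distribution.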
Save ggplot-figure for print publication
Description
Convenience function to save the last ggplot-figure in high quality for publication.
Usage
save_plot(
filename,
fig = ggplot2::last_plot(),
width = 12,
height = 9,
dpi = 300,
theme = ggplot2::theme_get(),
label.color = "black",
label.size = 2.4,
axis.textsize = 0.8,
axis.titlesize = 0.75,
legend.textsize = 0.6,
legend.titlesize = 0.65,
legend.itemsize = 0.5
)
Arguments
filename |
Name of the output file; filename must end with one of the following accepted file types: ".png", ".jpg", ".svg" or ".tif". |
fig |
The plot that should be saved. By default, the last plot is saved. |
width |
Width of the figure, in centimetres. |
height |
Height of the figure, in centimetres. |
dpi |
Resolution in dpi (dots per inch). Ignored for vector formats, such as ".svg". |
theme |
The default theme to use when saving the plot. |
label.color |
Color value for labels (axis, plot, etc.). |
label.size |
Fontsize of value labels inside plot area. |
axis.textsize |
Fontsize of axis labels. |
axis.titlesize |
Fontsize of axis titles. |
legend.textsize |
Fontsize of legend labels. |
legend.titlesize |
Fontsize of legend title. |
legend.itemsize |
Size of legend's item (legend key), in centimetres. |
Note
This is a convenience function with some default settings that should
come close to most of the needs for fontsize and scaling in figures
when saving them for printing or publishing. It uses cairographics
anti-aliasing (see png).
For adjusting plot appearance, see also sjPlot-themes.
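Since width and height are given in centimetres and the resolution in dpi, it can help to estimate the pixel size of a raster file before saving. The following is a minimal sketch: cm_to_px() is a hypothetical helper, not part of the package, and the save_plot() call only runs if sjPlot and ggplot2 are installed.

```r
# Hypothetical helper: convert centimetres to pixels at a given resolution
# (1 inch = 2.54 cm).
cm_to_px <- function(cm, dpi) round(cm / 2.54 * dpi)

# Approximate pixel size for the defaults width = 12, height = 9, dpi = 300
px <- c(width = cm_to_px(12, 300), height = cm_to_px(9, 300))
px

# Save a plot with the documented defaults, if the packages are available
if (requireNamespace("sjPlot", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE) &&
    capabilities("png")) {
  p <- ggplot2::ggplot(mtcars, ggplot2::aes(wt, mpg)) + ggplot2::geom_point()
  out <- file.path(tempdir(), "mtcars.png")  # must end with an accepted file type
  sjPlot::save_plot(out, fig = p, width = 12, height = 9, dpi = 300)
}
```

For vector formats such as ".svg" the dpi value is ignored, so the pixel estimate only applies to ".png", ".jpg" and ".tif".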
Set global theme options for sjp-functions
Description
Set global theme options for sjp-functions.
Usage
set_theme(
base = ggplot2::theme_grey(),
theme.font = NULL,
title.color = "black",
title.size = 1.2,
title.align = "left",
title.vjust = NULL,
geom.outline.color = NULL,
geom.outline.size = 0,
geom.boxoutline.size = 0.5,
geom.boxoutline.color = "black",
geom.alpha = 1,
geom.linetype = 1,
geom.errorbar.size = 0.7,
geom.errorbar.linetype = 1,
geom.label.color = NULL,
geom.label.size = 4,
geom.label.alpha = 1,
geom.label.angle = 0,
axis.title.color = "grey30",
axis.title.size = 1.1,
axis.title.x.vjust = NULL,
axis.title.y.vjust = NULL,
axis.angle.x = 0,
axis.angle.y = 0,
axis.angle = NULL,
axis.textcolor.x = "grey30",
axis.textcolor.y = "grey30",
axis.textcolor = NULL,
axis.linecolor.x = NULL,
axis.linecolor.y = NULL,
axis.linecolor = NULL,
axis.line.size = 0.5,
axis.textsize.x = 1,
axis.textsize.y = 1,
axis.textsize = NULL,
axis.tickslen = NULL,
axis.tickscol = NULL,
axis.ticksmar = NULL,
axis.ticksize.x = NULL,
axis.ticksize.y = NULL,
panel.backcol = NULL,
panel.bordercol = NULL,
panel.col = NULL,
panel.major.gridcol = NULL,
panel.minor.gridcol = NULL,
panel.gridcol = NULL,
panel.gridcol.x = NULL,
panel.gridcol.y = NULL,
panel.major.linetype = 1,
panel.minor.linetype = 1,
plot.backcol = NULL,
plot.bordercol = NULL,
plot.col = NULL,
plot.margins = NULL,
legend.pos = "right",
legend.just = NULL,
legend.inside = FALSE,
legend.size = 1,
legend.color = "black",
legend.title.size = 1,
legend.title.color = "black",
legend.title.face = "bold",
legend.backgroundcol = "white",
legend.bordercol = "white",
legend.item.size = NULL,
legend.item.backcol = "grey90",
legend.item.bordercol = "white"
)
Arguments
base |
base theme where theme is built on. By default, all
metrics from |
theme.font |
base font family for the plot. |
title.color |
Color of plot title. Default is |
title.size |
size of plot title. Default is 1.2. |
title.align |
alignment of plot title. Must be one of |
title.vjust |
numeric, vertical adjustment for plot title. |
geom.outline.color |
Color of geom outline. Only applies, if |
geom.outline.size |
size of bar outlines. Default is 0. Use
size of |
geom.boxoutline.size |
size of outlines and median bar especially for boxplots.
Default is 0.5. Use size of |
geom.boxoutline.color |
Color of outlines and median bar especially for boxplots.
Only applies, if |
geom.alpha |
specifies the transparency (alpha value) of geoms |
geom.linetype |
linetype of line geoms. Default is |
geom.errorbar.size |
size (thickness) of error bars. Default is |
geom.errorbar.linetype |
linetype of error bars. Default is |
geom.label.color |
Color of geom's value and annotation labels |
geom.label.size |
size of geom's value and annotation labels |
geom.label.alpha |
alpha level of geom's value and annotation labels |
geom.label.angle |
angle of geom's value and annotation labels |
axis.title.color |
Color of x- and y-axis title labels |
axis.title.size |
size of x- and y-axis title labels |
axis.title.x.vjust |
numeric, vertical adjustment of x-axis-title. |
axis.title.y.vjust |
numeric, vertical adjustment of y-axis-title. |
axis.angle.x |
angle for x-axis labels |
axis.angle.y |
angle for y-axis labels |
axis.angle |
angle for x- and y-axis labels. If set, overrides both |
axis.textcolor.x |
Color for x-axis labels. If not specified, a default dark gray color palette will be used for the labels. |
axis.textcolor.y |
Color for y-axis labels. If not specified, a default dark gray color palette will be used for the labels. |
axis.textcolor |
Color for both x- and y-axis labels.
If set, overrides both |
axis.linecolor.x |
Color of x-axis border |
axis.linecolor.y |
Color of y-axis border |
axis.linecolor |
Color for both x- and y-axis borders.
If set, overrides both |
axis.line.size |
size (thickness) of axis lines. Only affected, if |
axis.textsize.x |
size of x-axis labels |
axis.textsize.y |
size of y-axis labels |
axis.textsize |
size for both x- and y-axis labels.
If set, overrides both |
axis.tickslen |
length of axis tick marks |
axis.tickscol |
Color of axis tick marks |
axis.ticksmar |
margin between axis labels and tick marks |
axis.ticksize.x |
size of tick marks at x-axis. |
axis.ticksize.y |
size of tick marks at y-axis. |
panel.backcol |
Color of the diagram's background |
panel.bordercol |
Color of whole diagram border (panel border) |
panel.col |
Color of both diagram's border and background.
If set, overrides both |
panel.major.gridcol |
Color of the major grid lines of the diagram background |
panel.minor.gridcol |
Color of the minor grid lines of the diagram background |
panel.gridcol |
Color for both minor and major grid lines of the diagram background.
If set, overrides both |
panel.gridcol.x |
See |
panel.gridcol.y |
See |
panel.major.linetype |
line type for major grid lines |
panel.minor.linetype |
line type for minor grid lines |
plot.backcol |
Color of the plot's background |
plot.bordercol |
Color of whole plot's border (panel border) |
plot.col |
Color of both plot's region border and background.
If set, overrides both |
plot.margins |
numeric vector of length 4, indicating the top, right, bottom and left margin of the plot region. |
legend.pos |
position of the legend, if a legend is drawn.
|
legend.just |
justification of legend, relative to its position ( |
legend.inside |
logical, use |
legend.size |
text size of the legend. Default is 1. Relative size, so recommended values are from 0.3 to 2.5 |
legend.color |
Color of the legend labels |
legend.title.size |
text size of the legend title |
legend.title.color |
Color of the legend title |
legend.title.face |
font face of the legend title. By default, |
legend.backgroundcol |
fill color of the legend's background. Default is |
legend.bordercol |
Color of the legend's border. Default is |
legend.item.size |
size of legend's item (legend key), in centimetres. |
legend.item.backcol |
fill color of the legend's item-background. Default is |
legend.item.bordercol |
Color of the legend's item-border. Default is |
Value
The customized theme object, or NULL, if a ggplot-theme was used.
See Also
Examples
## Not run:
library(sjmisc)
data(efc)
# set sjPlot-defaults, a slight modification
# of the ggplot base theme
set_theme()
# legends of all plots inside
set_theme(legend.pos = "top left", legend.inside = TRUE)
plot_xtab(efc$e42dep, efc$e16sex)
# Use classic-theme. You may need to
# load the ggplot2-library.
library(ggplot2)
set_theme(base = theme_classic())
plot_frq(efc$e42dep)
# adjust value labels
set_theme(
geom.label.size = 3.5,
geom.label.color = "#3366cc",
geom.label.angle = 90
)
# hjust-aes needs adjustment for this
update_geom_defaults('text', list(hjust = -0.1))
plot_xtab(efc$e42dep, efc$e16sex, vjust = "center", hjust = "center")
# Create own theme based on classic-theme
set_theme(
base = theme_classic(), axis.linecolor = "grey50",
axis.textcolor = "#6699cc"
)
plot_frq(efc$e42dep)
## End(Not run)
Modify plot appearance
Description
Set default plot themes, use pre-defined color scales or modify plot or table appearance.
Usage
theme_sjplot(base_size = 12, base_family = "")
theme_sjplot2(base_size = 12, base_family = "")
theme_blank(base_size = 12, base_family = "")
theme_538(base_size = 12, base_family = "")
font_size(
title,
axis_title.x,
axis_title.y,
labels.x,
labels.y,
offset.x,
offset.y,
base.theme
)
label_angle(angle.x, angle.y, base.theme)
legend_style(inside, pos, justify, base.theme)
scale_color_sjplot(palette = "metro", discrete = TRUE, reverse = FALSE, ...)
scale_fill_sjplot(palette = "metro", discrete = TRUE, reverse = FALSE, ...)
sjplot_pal(palette = "metro", n = NULL)
show_sjplot_pals()
css_theme(css.theme = "regression")
Arguments
base_size |
Base font size. |
base_family |
Base font family. |
title |
Font size for plot titles. |
axis_title.x |
Font size for x-axis titles. |
axis_title.y |
Font size for y-axis titles. |
labels.x |
Font size for x-axis labels. |
labels.y |
Font size for y-axis labels. |
offset.x |
Offset for x-axis titles. |
offset.y |
Offset for y-axis titles. |
base.theme |
Optional ggplot-theme-object, which is needed in case multiple
functions should be combined, e.g. |
angle.x |
Angle for x-axis labels. |
angle.y |
Angle for y-axis labels. |
inside |
Logical, use |
pos |
Position of the legend, if a legend is drawn.
|
justify |
Justification of legend, relative to its position ( |
palette |
Character name of color palette. |
discrete |
Logical, if |
reverse |
Logical, if |
... |
Further arguments passed down to ggplot's |
n |
Numeric, number of colors to be returned. By default, the complete colour palette is returned. |
css.theme |
Name of the CSS pre-set theme-style. Can be used for table-functions. |
Details
When using the colors argument in function calls (e.g. plot_model())
or when calling one of the predefined scale-functions
(e.g. scale_color_sjplot()), there are pre-defined colour palettes
in this package. Use show_sjplot_pals() to show all available
colour palettes.
Examples
# prepare data
if (requireNamespace("haven")) {
library(sjmisc)
data(efc)
efc <- to_factor(efc, c161sex, e42dep, c172code)
m <- lm(neg_c_7 ~ pos_v_4 + c12hour + e42dep + c172code, data = efc)
# create plot-object
p <- plot_model(m)
# change theme
p + theme_sjplot()
# change font-size
p + font_size(axis_title.x = 30)
# apply color theme
p + scale_color_sjplot()
# show all available colour palettes
show_sjplot_pals()
# get colour values from specific palette
sjplot_pal(palette = "breakfast club")
}
Plot Pearson's Chi2-Test of multiple contingency tables
Description
Plot p-values of Pearson's Chi2-tests for multiple contingency tables as ellipses or tiles. Requires a data frame with dichotomous (dummy) variables. Calculation of Chi2-matrix taken from Tales of R.
Usage
sjp.chi2(
df,
title = "Pearson's Chi2-Test of Independence",
axis.labels = NULL,
wrap.title = 50,
wrap.labels = 20,
show.legend = FALSE,
legend.title = NULL
)
Arguments
df |
A data frame with (dichotomous) factor variables. |
title |
character vector, used as plot title. Depending on plot type and function,
will be set automatically. If |
axis.labels |
character vector with labels used as axis labels. Optional argument, since in most cases, axis labels are set automatically. |
wrap.title |
numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.legend |
logical, if |
legend.title |
character vector, used as title for the plot legend. |
Value
A ggplot-object.
Examples
# create data frame with 5 dichotomous (dummy) variables
mydf <- data.frame(as.factor(sample(1:2, 100, replace=TRUE)),
as.factor(sample(1:2, 100, replace=TRUE)),
as.factor(sample(1:2, 100, replace=TRUE)),
as.factor(sample(1:2, 100, replace=TRUE)),
as.factor(sample(1:2, 100, replace=TRUE)))
# create variable labels
items <- c("Item 1", "Item 2", "Item 3", "Item 4", "Item 5")
# plot Chi2-contingency-table
sjp.chi2(mydf, axis.labels = items)
Plot polynomials for (generalized) linear regression
Description
This function plots a scatter plot of a term poly.term against a
response variable x and adds, depending on the number of numeric
values in poly.degree, multiple polynomial curves. A loess-smoothed
line can be added to see which of the polynomial curves fits best
to the data.
Usage
sjp.poly(
x,
poly.term,
poly.degree,
poly.scale = FALSE,
fun = NULL,
axis.title = NULL,
geom.colors = NULL,
geom.size = 0.8,
show.loess = TRUE,
show.loess.ci = TRUE,
show.p = TRUE,
show.scatter = TRUE,
point.alpha = 0.2,
point.color = "#404040",
loess.color = "#808080"
)
Arguments
x |
A vector, representing the response variable of a linear (mixed) model; or
a linear (mixed) model as returned by |
poly.term |
If |
poly.degree |
Numeric, or numeric vector, indicating the degree of the polynomial.
If |
poly.scale |
Logical, if |
fun |
Linear function when modelling polynomial terms. Use |
axis.title |
Character vector of length one or two (depending on the
plot function and type), used as title(s) for the x and y axis. If not
specified, a default labelling is chosen. Note: Some plot types
may not support this argument sufficiently. In such cases, use the returned
ggplot-object and add axis titles manually with
|
geom.colors |
user defined color for geoms. See 'Details' in |
geom.size |
size resp. width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes. |
show.loess |
Logical, if |
show.loess.ci |
Logical, if |
show.p |
Logical, if |
show.scatter |
Logical, if TRUE (default), adds a scatter plot of data points to the plot. |
point.alpha |
Alpha value of point-geoms in the scatter plots. Only
applies, if |
point.color |
Color of point-geoms in the scatter plots. Only applies,
if |
loess.color |
Color of the loess-smoothed line. Only applies, if |
Details
For each polynomial degree, a simple linear regression on x (resp.
the extracted response, if x is a fitted model) is performed,
where only the polynomial term poly.term is included as independent variable.
Thus, lm(y ~ x + I(x^2) + ... + I(x^i)) is repeatedly computed
for all values in poly.degree, and the predicted values of
the response are plotted against the raw values of poly.term.
If x is a fitted model, other covariates are ignored when
finding the best fitting polynomial.
This function evaluates raw polynomials, not orthogonal polynomials.
Polynomials are computed using the poly function,
with argument raw = TRUE.
To find out which polynomial degree fits best to the data, a loess-smoothed
line (in dark grey) can be added (with show.loess = TRUE). The polynomial curve
that comes closest to the loess-smoothed line should be the best
fit to the data.
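The procedure described above can be sketched in base R. This is only an illustration with simulated data, not the function's internal code: it fits one raw polynomial per degree, checks that poly(..., raw = TRUE) matches the I(x^i) notation, and compares each curve to a loess fit. The numeric distance to the loess line is an assumption made for this sketch; the function itself leaves that comparison to visual inspection.

```r
set.seed(123)

# Simulated data with a cubic relationship (illustrative only)
x <- runif(200, -2, 2)
y <- 1 + 0.5 * x - 0.8 * x^3 + rnorm(200, sd = 0.5)

# One regression per polynomial degree, using raw polynomials
degrees <- 1:4
fits <- lapply(degrees, function(d) lm(y ~ poly(x, degree = d, raw = TRUE)))
names(fits) <- paste0("degree_", degrees)

# A raw polynomial of degree 3 is the same model as y ~ x + I(x^2) + I(x^3)
manual <- lm(y ~ x + I(x^2) + I(x^3))
same_fit <- all.equal(unname(fitted(fits[["degree_3"]])), unname(fitted(manual)))

# Loess-smoothed reference line, evaluated at the observed x values
smooth <- loess(y ~ x)

# Root mean squared distance of each polynomial curve to the loess line
rmse_to_loess <- sapply(fits, function(f) {
  sqrt(mean((fitted(f) - predict(smooth))^2))
})
rmse_to_loess
```

The degree with the smallest distance is the curve that follows the loess line most closely, which mirrors the visual check that show.loess = TRUE is meant to support.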
Value
A ggplot-object.
Examples
library(sjmisc)
data(efc)
# linear fit. loess-smoothed line indicates a more
# or less cubic curve
sjp.poly(efc$c160age, efc$quol_5, 1)
# quadratic fit
sjp.poly(efc$c160age, efc$quol_5, 2)
# linear to quartic fit
sjp.poly(efc$c160age, efc$quol_5, 1:4, show.scatter = FALSE)
# fit sample model
fit <- lm(tot_sc_e ~ c12hour + e17age + e42dep, data = efc)
# inspect relationship between predictors and response
plot_model(fit, type = "slope")
# "e17age" does not seem to be linearly correlated to response
# try to find appropriate polynomial. Grey line (loess smoothed)
# indicates best fit. Looks like x^4 has the best fit,
# however, only x^3 has significant p-values.
sjp.poly(fit, "e17age", 2:4, show.scatter = FALSE)
## Not run:
# fit new model
fit <- lm(tot_sc_e ~ c12hour + e42dep + e17age + I(e17age^2) + I(e17age^3),
data = efc)
# plot marginal effects of polynomial term
plot_model(fit, type = "pred", terms = "e17age")
## End(Not run)
Wrapper to create plots and tables within a pipe-workflow
Description
This function has a pipe-friendly argument-structure, with the
first argument always being the data, followed by variables that
should be plotted or printed as table. The function then transforms
the input and calls the requested sjp.- resp. sjt.-function
to create a plot or table.
Both sjplot() and sjtab() support grouped data frames.
Usage
sjplot(data, ..., fun = c("grpfrq", "xtab", "aov1", "likert"))
sjtab(data, ..., fun = c("xtab", "stackfrq"))
Arguments
data |
A data frame. May also be a grouped data frame (see 'Note' and 'Examples'). |
... |
Names of variables that should be plotted, and also further arguments passed down to the sjPlot-functions. See 'Examples'. |
fun |
Plotting function. Refers to the function name of sjPlot-functions. See 'Details' and 'Examples'. |
Details
The following fun-values are currently supported:
"grpfrq" calls plot_grpfrq. The first two variables in data are used (and required) to create the plot.
"likert" calls plot_likert. data must be a data frame with items to plot.
"stackfrq" calls tab_stackfrq. data must be a data frame with items to create the table.
"xtab" calls plot_xtab or tab_xtab. The first two variables in data are used (and required) to create the plot or table.
Value
See related sjp. and sjt.-functions.
Note
The ...-argument is used, first, to specify the variables from data
that should be plotted, and, second, to name further arguments that are
used in the subsequent plotting functions. Refer to the online-help of
supported plotting-functions to see valid arguments.
data may also be a grouped data frame (see group_by)
with up to two grouping variables. Plots are created for each subgroup then.
Examples
library(dplyr)
data(efc)
# Grouped frequencies
efc |> sjplot(e42dep, c172code, fun = "grpfrq")
# Grouped frequencies, as box plots
efc |> sjplot(e17age, c172code, fun = "grpfrq",
type = "box", geom.colors = "Set1")
## Not run:
# table output of grouped data frame
efc |>
group_by(e16sex, c172code) |>
select(e42dep, n4pstu, e16sex, c172code) |>
sjtab(fun = "xtab", use.viewer = FALSE) # open all tables in browser
## End(Not run)
Summary of correlations as HTML table
Description
Shows the results of a computed correlation as HTML table. Requires either
a data.frame or a matrix with correlation coefficients
as returned by the cor-function.
Usage
tab_corr(
data,
na.deletion = c("listwise", "pairwise"),
corr.method = c("pearson", "spearman", "kendall"),
title = NULL,
var.labels = NULL,
wrap.labels = 40,
show.p = TRUE,
p.numeric = FALSE,
fade.ns = TRUE,
val.rm = NULL,
digits = 3,
triangle = "both",
string.diag = NULL,
CSS = NULL,
encoding = NULL,
file = NULL,
use.viewer = TRUE,
remove.spaces = TRUE
)
Arguments
data |
Matrix with correlation coefficients as returned by the
|
na.deletion |
Indicates how missing values are treated. May be either
|
corr.method |
Indicates the correlation computation method. May be one of
|
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.p |
Logical, if |
p.numeric |
Logical, if |
fade.ns |
Logical, if |
val.rm |
Specify a number between 0 and 1 to suppress the output of correlation values
that are smaller than |
digits |
Amount of decimals for estimates |
triangle |
Indicates whether only the upper right (use |
string.diag |
A vector with string values of the same length as |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Value
Invisibly returns
the web page style sheet (page.style),
the web page content (page.content),
the complete html-output (page.complete) and
the html-table with inline-css for use with knitr (knitr)
for further use.
Note
If data is a matrix with correlation coefficients as returned by
the cor-function, p-values can't be computed.
Thus, show.p, p.numeric and fade.ns only have an effect if data
is a data.frame.
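The difference between the two input types can be seen in base R. The sketch below uses the built-in mtcars data as a stand-in: a correlation matrix only carries coefficients, while the raw data still allows a significance test for each pair.

```r
# A data frame keeps the raw observations
df <- mtcars[, c("mpg", "wt", "hp")]

# cor() returns only the coefficients; the observations are no longer available
r <- cor(df, method = "pearson")

# With the raw data, a test for a single pair of variables is still possible
ct <- cor.test(df$mpg, df$wt, method = "pearson")

r["mpg", "wt"]  # coefficient, available from either input
ct$p.value      # p-value, requires the raw data
```

This is why passing the data frame itself is the safer choice whenever p-values or faded non-significant values are wanted in the table.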
Examples
## Not run:
if (interactive()) {
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
data(efc)
# retrieve variable and value labels
varlabs <- get_label(efc)
# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c83cop2")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c88cop7")
# create data frame with COPE-index scale
mydf <- data.frame(efc[, c(start:end)])
colnames(mydf) <- varlabs[c(start:end)]
# we have high correlations here, because all items
# belong to one factor.
tab_corr(mydf, p.numeric = TRUE)
# auto-detection of labels, only lower triangle
tab_corr(efc[, c(start:end)], triangle = "lower")
# auto-detection of labels, only lower triangle, all correlation
# values smaller than 0.3 are not shown in the table
tab_corr(efc[, c(start:end)], triangle = "lower", val.rm = 0.3)
# auto-detection of labels, only lower triangle, all correlation
# values smaller than 0.3 are printed in blue
tab_corr(efc[, c(start:end)], triangle = "lower", val.rm = 0.3,
CSS = list(css.valueremove = 'color:blue;'))
}
## End(Not run)
Print data frames as HTML table.
Description
These functions print data frames as HTML-table, showing the results in RStudio's viewer pane or in a web browser.
Usage
tab_df(
x,
title = NULL,
footnote = NULL,
col.header = NULL,
show.type = FALSE,
show.rownames = FALSE,
show.footnote = FALSE,
alternate.rows = FALSE,
sort.column = NULL,
digits = 2,
encoding = "UTF-8",
CSS = NULL,
file = NULL,
use.viewer = TRUE,
...
)
tab_dfs(
x,
titles = NULL,
footnotes = NULL,
col.header = NULL,
show.type = FALSE,
show.rownames = FALSE,
show.footnote = FALSE,
alternate.rows = FALSE,
sort.column = NULL,
digits = 2,
encoding = "UTF-8",
CSS = NULL,
file = NULL,
use.viewer = TRUE,
rnames = NULL,
...
)
Arguments
x |
For |
title , titles , footnote , footnotes |
Character vector with table
caption(s) resp. footnote(s). For |
col.header |
Character vector with elements used as column header for
the table. If |
show.type |
Logical, if |
show.rownames |
Logical, if |
show.footnote |
Logical, if |
alternate.rows |
Logical, if |
sort.column |
Numeric vector, indicating the index of the column
that should be sorted. By default, the column is sorted in ascending order.
Use negative index for descending order, for instance,
|
digits |
Numeric, amount of digits after decimal point when rounding values. |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
CSS |
A |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
... |
Currently not used. |
rnames |
Character vector, can be used to set row names when |
Details
How do I use CSS-argument?
With the CSS-argument, the visual appearance of the tables
can be modified. To get an overview of all style-sheet-classnames
that are used in this function, see return value page.style for
details. Arguments for this list have following syntax:
the class-name as argument name and
each style-definition must end with a semicolon
You can add style information to the default styles by using a + (plus-sign) as initial character for the argument attributes. Examples:
table = 'border:2px solid red;' for a solid 2-pixel table border in red.
summary = 'font-weight:bold;' for a bold fontweight in the summary row.
lasttablerow = 'border-bottom: 1px dotted blue;' for a blue dotted border of the last table row.
colnames = '+color:green;' to add green color formatting to column names.
arc = 'color:blue;' for a blue text color in every second row.
caption = '+color:red;' to add red font-color to the default table caption style.
See further examples in this package-vignette.
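As a minimal sketch of the syntax described above, the following builds a CSS list from three of the listed class names and checks the two rules (trailing semicolon, optional leading plus-sign) before passing it on. The two check vectors are illustrative helpers, not part of the package, and the tab_df() call only runs if sjPlot is installed.

```r
# Style definitions: class-name as list name, each value ends with a semicolon.
# A leading "+" adds to the default style instead of replacing it.
css <- list(
  table   = "border:2px solid red;",
  caption = "+color:red;",
  arc     = "color:blue;"
)

# Illustrative checks of the two syntax rules
ends_with_semicolon <- vapply(css, function(s) grepl(";\\s*$", s), logical(1))
adds_to_default <- vapply(css, startsWith, logical(1), prefix = "+")

ends_with_semicolon
adds_to_default

if (requireNamespace("sjPlot", quietly = TRUE)) {
  # keep the returned list for inspection
  tab <- sjPlot::tab_df(iris[1:5, ], CSS = css)
  # the style sheet lists the class names that can be styled
  tab$page.style
}
```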
Value
A list with the following items:
the web page style sheet (page.style),
the HTML content of the data frame (page.content),
the complete HTML page, including header, style sheet and body (page.complete),
the HTML table with inline-css for use with knitr (knitr) and
the file path, if the HTML page should be saved to disk (file)
Note
The HTML tables can either be saved as file and manually opened
(use argument file) or they can be saved as temporary files and
will be displayed in the RStudio Viewer pane (if working with RStudio)
or opened with the default web browser. Displaying resp. opening a
temporary file is the default behaviour.
Examples
## Not run:
data(iris)
data(mtcars)
tab_df(iris[1:5, ])
tab_dfs(list(iris[1:5, ], mtcars[1:5, 1:5]))
# sort 2nd column ascending
tab_df(iris[1:5, ], sort.column = 2)
# sort 2nd column descending
tab_df(iris[1:5, ], sort.column = -2)
## End(Not run)
Summary of factor analysis as HTML table
Description
Performs a factor analysis on a data frame or matrix
and displays the factors as HTML
table, or saves them as file.
In case a data frame is used as
parameter, the Cronbach's Alpha value for each factor scale will be calculated,
i.e. all variables with the highest loading for a factor are taken for the
reliability test. The result is an alpha value for each factor dimension.
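The idea of grouping variables by their highest loading and computing one alpha value per group can be sketched in base R. This is an illustration under stated assumptions, not the function's implementation: it uses stats::factanal() in place of the psych-based factor analysis, simulated items, and a hypothetical cronbach_alpha() helper based on the standard formula k / (k - 1) * (1 - sum of item variances / variance of the sum score).

```r
set.seed(42)
n <- 500

# Two independent latent factors, three simulated items each
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- data.frame(
  a1 = f1 + rnorm(n, sd = 0.6),
  a2 = f1 + rnorm(n, sd = 0.6),
  a3 = f1 + rnorm(n, sd = 0.6),
  b1 = f2 + rnorm(n, sd = 0.6),
  b2 = f2 + rnorm(n, sd = 0.6),
  b3 = f2 + rnorm(n, sd = 0.6)
)

# Factor analysis with promax rotation (factanal() is a stand-in here)
fa <- factanal(items, factors = 2, rotation = "promax")
load_mat <- unclass(loadings(fa))

# For each variable, the factor with the highest absolute loading
factor_index <- apply(abs(load_mat), 1, which.max)

# Hypothetical helper: Cronbach's Alpha from item and sum-score variances
cronbach_alpha <- function(d) {
  k <- ncol(d)
  k / (k - 1) * (1 - sum(apply(d, 2, var)) / var(rowSums(d)))
}

# One alpha value per factor dimension
alphas <- tapply(seq_along(factor_index), factor_index, function(idx) {
  cronbach_alpha(items[, idx, drop = FALSE])
})

factor_index
alphas
```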
Usage
tab_fa(
data,
rotation = "promax",
method = c("ml", "minres", "wls", "gls", "pa", "minchi", "minrank"),
nmbr.fctr = NULL,
fctr.load.tlrn = 0.1,
sort = FALSE,
title = "Factor Analysis",
var.labels = NULL,
wrap.labels = 40,
show.cronb = TRUE,
show.comm = FALSE,
alternate.rows = FALSE,
digits = 2,
CSS = NULL,
encoding = NULL,
file = NULL,
use.viewer = TRUE,
remove.spaces = TRUE
)
Arguments
data |
A data frame that should be used to compute a PCA, or a |
rotation |
Rotation of the factor loadings. May be one of
|
method |
the factoring method to be used. |
nmbr.fctr |
Number of factors used for calculating the rotation. By
default, this value is |
fctr.load.tlrn |
Specifies the minimum difference a variable needs to have between factor loadings (components) in order to indicate a clear loading on just one factor and not diffusing over all factors. For instance, a variable with 0.8, 0.82 and 0.84 factor loading on 3 possible factors can not be clearly assigned to just one factor and thus would be removed from the principal component analysis. By default, the minimum difference of loading values between the highest and 2nd highest factor should be 0.1 |
sort |
logical, if |
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.cronb |
Logical, if |
show.comm |
Logical, if |
alternate.rows |
Logical, if |
digits |
Amount of decimals for estimates |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Value
Invisibly returns
the web page style sheet (page.style),
the web page content (page.content),
the complete html-output (page.complete),
the html-table with inline-css for use with knitr (knitr),
the factor.index, i.e. the column index of each variable with the highest factor loading for each factor and
the removed.items, i.e. which variables have been removed because they were outside of the fctr.load.tlrn's range
for further use.
Note
This method for factor analysis relies on the functions
fa and fa.parallel from the psych package.
Examples
## Not run:
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
library(GPArotation)
data(efc)
# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c82cop1")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c90cop9")
# auto-detection of labels
if (interactive()) {
tab_fa(efc[, start:end])
}
## End(Not run)
Summary of item analysis of an item scale as HTML table
Description
This function performs an item analysis with certain statistics that are useful for scale or index development. The resulting tables are shown in the viewer pane resp. webbrowser or can be saved as file. Following statistics are computed for each item of a data frame:
percentage of missing values
mean value
standard deviation
skew
item difficulty
item discrimination
Cronbach's Alpha if item was removed from scale
mean (or average) inter-item-correlation
Optionally, the following statistics can be computed as well:
kurtosis
Shapiro-Wilk Normality Test
If factor.groups is not NULL, the data frame df will be
split into groups, assuming that factor.groups indicates those columns
of the data frame that belong to a certain factor (see return value of function tab_pca
as example for retrieving factor groups for a scale and see examples for more details).
Usage
tab_itemscale(
df,
factor.groups = NULL,
factor.groups.titles = "auto",
scale = FALSE,
min.valid.rowmean = 2,
alternate.rows = TRUE,
sort.column = NULL,
show.shapiro = FALSE,
show.kurtosis = FALSE,
show.corr.matrix = TRUE,
CSS = NULL,
encoding = NULL,
file = NULL,
use.viewer = TRUE,
remove.spaces = TRUE
)
sjt.itemanalysis(
df,
factor.groups = NULL,
factor.groups.titles = "auto",
scale = FALSE,
min.valid.rowmean = 2,
alternate.rows = TRUE,
sort.column = NULL,
show.shapiro = FALSE,
show.kurtosis = FALSE,
show.corr.matrix = TRUE,
CSS = NULL,
encoding = NULL,
file = NULL,
use.viewer = TRUE,
remove.spaces = TRUE
)
Arguments
df |
A data frame with items. |
factor.groups |
If not |
factor.groups.titles |
Titles for each factor group that will be used as table caption for each
component-table. Must be a character vector of same length as |
scale |
Logical, if |
min.valid.rowmean |
Minimum amount of valid values to compute row means for index scores.
Default is 2, i.e. the return values |
alternate.rows |
Logical, if |
sort.column |
Numeric vector, indicating the index of the column
that should be sorted. By default, the column is sorted in ascending order.
Use negative index for descending order, for instance,
|
show.shapiro |
Logical, if |
show.kurtosis |
Logical, if |
show.corr.matrix |
Logical, if |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Value
Invisibly returns
df.list: List of data frames with the item analysis for each sub-group (or complete, if factor.groups was NULL)
index.scores: A data frame of standardized scale / index scores for each case (mean value of all scale items for each case) for each sub-group.
ideal.item.diff: List of vectors that indicate the ideal item difficulty for each item in each sub-group. Item difficulty only differs when items have different levels.
cronbach.values: List of Cronbach's Alpha values for the overall item scale for each sub-group.
knitr.list: List of html-tables with inline-css for use with knitr for each table (sub-group)
knitr: html-table of the complete output with inline-css for use with knitr
complete.page: Complete html-output.
If factor.groups = NULL, each list contains only one element, since just one
table is printed for the complete scale indicated by df. If factor.groups
is a vector of group-index-values, the lists contain elements for each sub-group.
Note
The Shapiro-Wilk Normality Test (see column W(p)) tests if an item has a distribution that is significantly different from normal.
Item difficulty should range between 0.2 and 0.8. Ideal value is p+(1-p)/2 (which mostly is between 0.5 and 0.8). For item discrimination, acceptable values are 0.20 or higher; the closer to 1.00 the better. See item_reliability for more details.
In case the total Cronbach's Alpha value is below the acceptable cut-off of 0.7 (mostly if an index has few items), the mean inter-item-correlation is an alternative measure to indicate acceptability. Satisfactory range lies between 0.2 and 0.4. See also item_intercor.
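As a rough illustration of the two reliability measures mentioned in this note, the sketch below computes Cronbach's Alpha and the mean inter-item correlation with base R only. The data are simulated and the textbook formulas are used; this is not necessarily how tab_itemscale() or the functions it relies on compute these values.

```r
# Simulated items that share one latent trait (illustrative data only)
set.seed(42)
latent <- rnorm(300)
items <- data.frame(
  i1 = latent + rnorm(300),
  i2 = latent + rnorm(300),
  i3 = latent + rnorm(300),
  i4 = latent + rnorm(300)
)

k <- ncol(items)

# Cronbach's Alpha (textbook formula):
# k / (k - 1) * (1 - sum of item variances / variance of the sum score)
alpha <- k / (k - 1) * (1 - sum(sapply(items, var)) / var(rowSums(items)))

# Mean inter-item correlation: average of the off-diagonal correlations
r <- cor(items)
mean_iic <- mean(r[lower.tri(r)])

c(alpha = alpha, mean_inter_item_cor = mean_iic)
```

Both values can then be compared against the cut-offs given above (0.7 for Alpha, 0.2 to 0.4 for the mean inter-item correlation).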
References
Jorion N, Self B, James K, Schroeder L, DiBello L, Pellegrino J (2013) Classical Test Theory Analysis of the Dynamics Concept Inventory.
Briggs SR, Cheek JM (1986) The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54(1), 106-148. doi: 10.1111/j.1467-6494.1986.tb00391.x
McLean S et al. (2013) Stigmatizing attitudes and beliefs about bulimia nervosa: Gender, age, education and income variability in a community sample. International Journal of Eating Disorders. doi: 10.1002/eat.22227
Trochim WMK (2008) Types of Reliability.
Examples
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
library(sjlabelled)
data(efc)
# retrieve variable and value labels
varlabs <- get_label(efc)
# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c82cop1")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c90cop9")
# create data frame with COPE-index scale
mydf <- data.frame(efc[, start:end])
colnames(mydf) <- varlabs[start:end]
## Not run:
if (interactive()) {
  tab_itemscale(mydf)
  # auto-detection of labels
  tab_itemscale(efc[, start:end])
  # Compute PCA on COPE-index, and perform an
  # item analysis for each extracted factor.
  indices <- tab_pca(mydf)$factor.index
  tab_itemscale(mydf, factor.groups = indices)
  # or, equivalently
  tab_itemscale(mydf, factor.groups = "auto")
}
## End(Not run)
Print regression models as HTML table
Description
tab_model() creates HTML tables from regression models.
Usage
tab_model(
...,
transform,
show.intercept = TRUE,
show.est = TRUE,
show.ci = 0.95,
show.ci50 = FALSE,
show.se = NULL,
show.std = NULL,
std.response = TRUE,
show.p = TRUE,
show.stat = FALSE,
show.df = FALSE,
show.zeroinf = TRUE,
show.r2 = TRUE,
show.icc = TRUE,
show.re.var = TRUE,
show.ngroups = TRUE,
show.fstat = FALSE,
show.aic = FALSE,
show.aicc = FALSE,
show.dev = FALSE,
show.loglik = FALSE,
show.obs = TRUE,
show.reflvl = FALSE,
terms = NULL,
rm.terms = NULL,
order.terms = NULL,
keep = NULL,
drop = NULL,
title = NULL,
pred.labels = NULL,
dv.labels = NULL,
wrap.labels = 25,
bootstrap = FALSE,
iterations = 1000,
seed = NULL,
vcov.fun = NULL,
vcov.args = NULL,
string.pred = "Predictors",
string.est = "Estimate",
string.std = "std. Beta",
string.ci = "CI",
string.se = "std. Error",
string.std_se = "standardized std. Error",
string.std_ci = "standardized CI",
string.p = "p",
string.std.p = "std. p",
string.df = "df",
string.stat = "Statistic",
string.std.stat = "std. Statistic",
string.resp = "Response",
string.intercept = "(Intercept)",
strings = NULL,
ci.hyphen = " – ",
minus.sign = "-",
collapse.ci = FALSE,
collapse.se = FALSE,
linebreak = TRUE,
col.order = c("est", "se", "std.est", "std.se", "ci", "std.ci", "ci.inner", "ci.outer",
"stat", "std.stat", "p", "std.p", "df.error", "response.level"),
digits = 2,
digits.p = 3,
digits.rsq = 3,
digits.re = 2,
emph.p = TRUE,
p.val = NULL,
df.method = NULL,
p.style = c("numeric", "stars", "numeric_stars", "scientific", "scientific_stars"),
p.threshold = c(0.05, 0.01, 0.001),
p.adjust = NULL,
case = "parsed",
auto.label = TRUE,
prefix.labels = c("none", "varname", "label"),
bpe = "median",
CSS = css_theme("regression"),
file = NULL,
use.viewer = TRUE,
encoding = "UTF-8"
)
Arguments
... |
One or more regression models, including glm's or mixed models.
May also be a |
transform |
A character vector, naming a function that will be applied
on estimates and confidence intervals. By default, |
show.intercept |
Logical, if |
show.est |
Logical, if |
show.ci |
Either logical, and if |
show.ci50 |
Logical, if |
show.se |
Logical, if |
show.std |
Indicates whether standardized beta-coefficients should also be printed, and if yes, which type of standardization is done. See 'Details'. |
std.response |
Logical, whether the response variable will also be
standardized if standardized coefficients are requested. Setting both
|
show.p |
Logical, if |
show.stat |
Logical, if |
show.df |
Logical, if |
show.zeroinf |
Logical, if |
show.r2 |
Logical, if |
show.icc |
Logical, if |
show.re.var |
Logical, if |
show.ngroups |
Logical, if |
show.fstat |
Logical, if |
show.aic |
Logical, if |
show.aicc |
Logical, if |
show.dev |
Logical, if |
show.loglik |
Logical, if |
show.obs |
Logical, if |
show.reflvl |
Logical, if |
terms |
Character vector with names of those terms (variables) that should
be printed in the table. All other terms are removed from the output. If
|
rm.terms |
Character vector with names that indicate which terms should
be removed from the output. Counterpart to |
order.terms |
Numeric vector, indicating in which order the coefficients should be plotted. See examples in this package-vignette. |
keep , drop |
Character containing a regular expression pattern that
describes the parameters that should be included (for |
title |
String, will be used as table caption. |
pred.labels |
Character vector with labels of predictor variables.
If not |
dv.labels |
Character vector with labels of dependent variables of all
fitted models. If |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
bootstrap |
Logical, if |
iterations |
Numeric, number of bootstrap iterations (default is 1000). |
seed |
Numeric, the number of the seed to replicate bootstrapped estimates. If |
vcov.fun |
Variance-covariance matrix used to compute uncertainty
estimates (e.g., for robust standard errors). This argument accepts a
covariance matrix, a function which returns a covariance matrix, or a
string which identifies the function to be used to compute the covariance
matrix. See |
vcov.args |
List of arguments to be passed to the function identified by
the |
string.pred |
Character vector, used as headline for the predictor column.
Default is |
string.est |
Character vector, used for the column heading of coefficients.
Default is based on the response scale, e.g. for logistic regression models,
|
string.std |
Character vector, used for the column heading of standardized beta coefficients. Default is |
string.ci |
Character vector, used for the column heading of confidence interval values. Default is |
string.se |
Character vector, used for the column heading of standard error values. Default is |
string.std_se |
Character vector, used for the column heading of standard error of standardized coefficients. Default is |
string.std_ci |
Character vector, used for the column heading of confidence intervals of standardized coefficients. Default is |
string.p |
String value, used for the column heading of p values. Default is |
string.std.p |
Character vector, used for the column heading of p values. Default is |
string.df |
Character vector, used for the column heading of degrees of freedom. Default is |
string.stat |
Character vector, used for the test statistic. Default is |
string.std.stat |
Character vector, used for the test statistic. Default is |
string.resp |
Character vector, used for the column heading of the response level for multinomial or categorical models. Default is |
string.intercept |
Character vector, used as name for the intercept parameter. Default is |
strings |
Named character vector, as alternative to arguments like |
ci.hyphen |
Character vector, indicating the hyphen for confidence interval range. May be an HTML entity. See 'Examples'. |
minus.sign |
String, indicating the minus sign for negative numbers. May be an HTML entity. See 'Examples'. |
collapse.ci |
Logical, if |
collapse.se |
Logical, if |
linebreak |
Logical, if |
col.order |
Character vector, indicating which columns should be printed
and in which order. Column names that are excluded from |
digits |
Amount of decimals for estimates |
digits.p |
Amount of decimals for p-values |
digits.rsq |
Amount of decimals for r-squared values |
digits.re |
Amount of decimals for random effects part of the summary table. |
emph.p |
Logical, if |
df.method , p.val |
Method for computing degrees of freedom for p-values,
standard errors and confidence intervals (CI). Only applies to mixed models.
Use |
p.style |
Character, indicating if p-values should be printed as
numeric value ( |
p.threshold |
Numeric vector of length 3, indicating the threshold for
annotating p-values with asterisks. Only applies if
|
p.adjust |
String value, if not |
case |
Desired target case. Labels will automatically be converted into the
specified character case. See |
auto.label |
Logical, if |
prefix.labels |
Indicates whether the value labels of categorical variables
should be prefixed, e.g. with the variable name or variable label. See
argument |
bpe |
For Stan-models (fitted with the rstanarm- or
brms-package), the Bayesian point estimate is, by default, the median
of the posterior distribution. Use |
CSS |
A |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
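To illustrate how the three values in p.threshold relate to the asterisk annotation of p-values, here is a minimal base R sketch. The helper p_stars() is hypothetical and not part of sjPlot, and whether the package uses a strict comparison at the thresholds is an assumption.

```r
# Hypothetical helper (not an sjPlot function): one asterisk for each
# threshold that the p-value falls below
p_stars <- function(p, threshold = c(0.05, 0.01, 0.001)) {
  # logical matrix: rows = p-values, columns = thresholds
  below <- outer(p, threshold, "<")
  # number of thresholds undercut by each p-value
  n_stars <- rowSums(below)
  strrep("*", n_stars)
}

p <- c(0.2, 0.03, 0.004, 0.0001)
data.frame(p = p, stars = p_stars(p))
```

Passing a different vector of length 3 to threshold shifts the cut-offs in the same way that p.threshold is described to do.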
Details
Standardized Estimates
Default standardization is done by completely refitting the model on the
standardized data. Hence, this approach is equal to standardizing the
variables before fitting the model, which is particularly recommended for
complex models that include interactions or transformations (e.g.,
polynomial or spline terms). When show.std = "std2", standardization
of estimates follows Gelman's (2008) suggestion, rescaling the estimates by
dividing them by two standard deviations instead of just one. Resulting
coefficients are then directly comparable for untransformed binary
predictors. For backward compatibility reasons, show.std also may be
a logical value; if TRUE, normal standardized estimates are printed
(same effect as show.std = "std"). Use show.std = NULL (default) or
show.std = FALSE, if no standardization is required.
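The two standardization variants can be sketched with base R. This is only an illustration on the built-in mtcars data, not the package's internal code. In particular, the two-SD variant is written here as centering each predictor and dividing it by two standard deviations while leaving the response unchanged, which is an assumption about how that rescaling is applied.

```r
# Illustrative data: built-in mtcars (not taken from the package examples)
dat <- mtcars[, c("mpg", "wt", "hp")]

# Unstandardized fit
m_raw <- lm(mpg ~ wt + hp, data = dat)

# "Refit" approach: standardize all variables first, then fit the same model
dat_z <- as.data.frame(scale(dat))
m_std <- lm(mpg ~ wt + hp, data = dat_z)

# For a model without interactions, refitting agrees with the classical
# post-hoc formula b * sd(x) / sd(y)
b_posthoc <- coef(m_raw)[c("wt", "hp")] *
  sapply(dat[c("wt", "hp")], sd) / sd(dat$mpg)

# Two-SD rescaling of the predictors (assumption: response left as is)
dat_2sd <- dat
dat_2sd[c("wt", "hp")] <- lapply(
  dat[c("wt", "hp")],
  function(x) (x - mean(x)) / (2 * sd(x))
)
m_2sd <- lm(mpg ~ wt + hp, data = dat_2sd)

round(cbind(refit = coef(m_std)[-1], posthoc = b_posthoc,
            two_sd = coef(m_2sd)[-1]), 3)
```

With interactions or polynomial terms the post-hoc formula and the refit approach no longer agree, which is why refitting is the recommended default for such models.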
How do I use the CSS-argument?
With the CSS-argument, the visual appearance of the tables
can be modified. To get an overview of all style-sheet-classnames
that are used in this function, see return value page.style for details.
Arguments for this list have following syntax:
- the class-names with "css."-prefix as argument name and
- each style-definition must end with a semicolon
You can add style information to the default styles by using a + (plus-sign) as initial character for the argument attributes. Examples:
- css.table = 'border:2px solid red;' for a solid 2-pixel table border in red.
- css.summary = 'font-weight:bold;' for a bold fontweight in the summary row.
- css.lasttablerow = 'border-bottom: 1px dotted blue;' for a blue dotted border of the last table row.
- css.colnames = '+color:green;' to add green color formatting to column names.
- css.arc = 'color:blue;' for a blue text color in every second row.
- css.caption = '+color:red;' to add red font-color to the default table caption style.
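Put together, the syntax rules above lead to a named list like the one in the following sketch. The model and the object name my_css are illustrative assumptions. The call to tab_model() is guarded so that the sketch also runs where sjPlot is not installed or where no viewer is available.

```r
# Style list: names carry the "css." prefix, every definition ends with a
# semicolon, and a leading "+" adds to the default style instead of replacing it
my_css <- list(
  css.table = "border:2px solid red;",
  css.summary = "font-weight:bold;",
  css.caption = "+color:red;"
)

# Quick self-check of the two syntax rules
stopifnot(
  all(startsWith(names(my_css), "css.")),
  all(endsWith(unlist(my_css), ";"))
)

if (interactive() && requireNamespace("sjPlot", quietly = TRUE)) {
  # illustrative model on a built-in dataset
  m <- lm(mpg ~ wt + hp, data = mtcars)
  sjPlot::tab_model(m, CSS = my_css)
}
```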
Value
Invisibly returns
- the web page style sheet (page.style),
- the web page content (page.content),
- the complete html-output (page.complete) and
- the html-table with inline-css for use with knitr (knitr)
for further use.
Note
The HTML tables can either be saved as file and manually opened (use argument file) or
they can be saved as temporary files and will be displayed in the RStudio Viewer pane (if working with RStudio)
or opened with the default web browser. Displaying or opening a temporary file is the
default behaviour (i.e. file = NULL).
Examples are shown in these three vignettes:
Summary of Regression Models as HTML Table,
Summary of Mixed Models as HTML Table and
Summary of Bayesian Models as HTML Table.
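As a small complement to the vignettes, the following sketch shows the two output modes described in the note: the default temporary file and an explicit destination file. The models, labels and file name are illustrative assumptions, and only arguments listed under 'Usage' are used.

```r
# Two illustrative models on a built-in dataset (not from the manual)
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

# Guarded so the sketch is harmless where sjPlot is missing
# or where no viewer or browser is available
if (interactive() && requireNamespace("sjPlot", quietly = TRUE)) {
  # default (file = NULL): temporary file, shown in viewer or browser
  sjPlot::tab_model(m1, m2, show.se = TRUE,
                    dv.labels = c("Model 1", "Model 2"))

  # save the table to a file instead
  sjPlot::tab_model(m1, m2, file = file.path(tempdir(), "models.html"))
}
```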
Summary of principal component analysis as HTML table
Description
Performs a principal component analysis on a data frame or matrix
(with varimax or oblimin rotation) and displays the factor solution as HTML
table, or saves it as file.
In case a data frame is used as
parameter, the Cronbach's Alpha value for each factor scale will be calculated,
i.e. all variables with the highest loading for a factor are taken for the
reliability test. The result is an alpha value for each factor dimension.
Usage
tab_pca(
data,
rotation = c("varimax", "quartimax", "promax", "oblimin", "simplimax", "cluster",
"none"),
nmbr.fctr = NULL,
fctr.load.tlrn = 0.1,
title = "Principal Component Analysis",
var.labels = NULL,
wrap.labels = 40,
show.cronb = TRUE,
show.msa = FALSE,
show.var = FALSE,
alternate.rows = FALSE,
digits = 2,
string.pov = "Proportion of Variance",
string.cpov = "Cumulative Proportion",
CSS = NULL,
encoding = NULL,
file = NULL,
use.viewer = TRUE,
remove.spaces = TRUE
)
Arguments
data |
A data frame that should be used to compute a PCA, or a |
rotation |
Rotation of the factor loadings. May be one of
|
nmbr.fctr |
Number of factors used for calculating the rotation. By
default, this value is |
fctr.load.tlrn |
Specifies the minimum difference a variable needs to have between factor loadings (components) in order to indicate a clear loading on just one factor and not diffusing over all factors. For instance, a variable with 0.8, 0.82 and 0.84 factor loading on 3 possible factors can not be clearly assigned to just one factor and thus would be removed from the principal component analysis. By default, the minimum difference of loading values between the highest and 2nd highest factor should be 0.1 |
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.cronb |
Logical, if |
show.msa |
Logical, if |
show.var |
Logical, if |
alternate.rows |
Logical, if |
digits |
Amount of decimals for estimates |
string.pov |
String for the table row that contains the proportions of variances. By default, "Proportion of Variance" will be used. |
string.cpov |
String for the table row that contains the cumulative variances. By default, "Cumulative Proportion" will be used. |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Value
Invisibly returns
- the web page style sheet (page.style),
- the web page content (page.content),
- the complete html-output (page.complete),
- the html-table with inline-css for use with knitr (knitr),
- the factor.index, i.e. the column index of each variable with the highest factor loading for each factor and
- the removed.items, i.e. which variables have been removed because they were outside of the fctr.load.tlrn's range.
for further use.
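The meaning of factor.index and of the fctr.load.tlrn tolerance can be sketched with base R. This is an illustration under assumptions, not the code used by tab_pca(): the data are the built-in mtcars, two components are kept, and absolute loadings are compared.

```r
# Illustrative data (built-in), standardized before the PCA
X <- scale(mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")])

# PCA, keep two components and rotate their loadings with varimax
pca <- prcomp(X)
n_fctr <- 2
raw_load <- pca$rotation[, 1:n_fctr] %*% diag(pca$sdev[1:n_fctr])
rot_load <- unclass(varimax(raw_load)$loadings)
abs_load <- abs(rot_load)

# For each variable: the factor with the highest (absolute) loading
factor_index <- apply(abs_load, 1, which.max)

# Difference between the highest and the second-highest loading
gap <- apply(abs_load, 1, function(l) {
  s <- sort(l, decreasing = TRUE)
  s[1] - s[2]
})

# Variables whose gap is below the tolerance do not load clearly
# on a single factor
tolerance <- 0.1
removed <- names(gap)[gap < tolerance]

list(factor_index = factor_index, removed = removed)
```

Raising the tolerance makes the criterion stricter, so more variables end up in removed.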
Examples
## Not run:
# Data from the EUROFAMCARE sample dataset
library(sjmisc)
data(efc)
# retrieve first item of COPE-index scale
start <- which(colnames(efc) == "c82cop1")
# retrieve last item of COPE-index scale
end <- which(colnames(efc) == "c90cop9")
# auto-detection of labels
if (interactive()) {
  tab_pca(efc[, start:end])
}
## End(Not run)
Summary of stacked frequencies as HTML table
Description
Shows the results of stacked frequencies (such as Likert scales) as HTML table. This function is useful when several items with identical scale/categories should be printed as table to compare their distributions (e.g. when plotting scales like SF, Barthel-Index, Quality-of-Life-scales etc.).
Usage
tab_stackfrq(
items,
weight.by = NULL,
title = NULL,
var.labels = NULL,
value.labels = NULL,
wrap.labels = 20,
sort.frq = NULL,
alternate.rows = FALSE,
digits = 2,
string.total = "N",
string.na = "NA",
show.n = FALSE,
show.total = FALSE,
show.na = FALSE,
show.skew = FALSE,
show.kurtosis = FALSE,
digits.stats = 2,
file = NULL,
encoding = NULL,
CSS = NULL,
use.viewer = TRUE,
remove.spaces = TRUE
)
Arguments
items |
Data frame, or a grouped data frame, with each column representing one item. |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
value.labels |
Character vector (or |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
sort.frq |
logical, indicates whether the
|
alternate.rows |
Logical, if |
digits |
Numeric, amount of digits after decimal point when rounding values. |
string.total |
label for the total N column. |
string.na |
label for the missing column/row. |
show.n |
logical, if |
show.total |
logical, if |
show.na |
logical, if |
show.skew |
logical, if |
show.kurtosis |
Logical, if |
digits.stats |
Amount of digits for rounding the skewness and kurtosis values. Default is 2, i.e. skewness and kurtosis values have 2 digits after decimal point. |
file |
Destination file, if the output should be saved as file.
If |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
CSS |
A |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Value
Invisibly returns
- the web page style sheet (page.style),
- the web page content (page.content),
- the complete html-output (page.complete) and
- the html-table with inline-css for use with knitr (knitr)
for further use.
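What the table of stacked frequencies contains can be approximated with a few lines of base R: one row per item and one column per category, with row percentages. The data, seed and object names below are illustrative assumptions, and the sketch does not reproduce the function's HTML output.

```r
# Simulated 4-category items (illustrative; seed chosen arbitrarily)
set.seed(123)
items <- data.frame(
  Q1 = factor(sample(1:4, 200, replace = TRUE, prob = c(0.2, 0.3, 0.1, 0.4)),
              levels = 1:4),
  Q2 = factor(sample(1:4, 200, replace = TRUE, prob = c(0.5, 0.25, 0.15, 0.1)),
              levels = 1:4),
  Q3 = factor(sample(1:4, 200, replace = TRUE, prob = c(0.25, 0.1, 0.4, 0.25)),
              levels = 1:4)
)

# Percentage of each category within each item:
# rows = items, columns = categories
stacked <- t(sapply(items, function(x) prop.table(table(x)) * 100))

round(stacked, 2)
```

Each row sums to 100, which is what makes the items directly comparable across identical categories.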
Examples
# -------------------------------
# random sample
# -------------------------------
# prepare data for 4-category likert scale, 5 items
likert_4 <- data.frame(
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.2, 0.3, 0.1, 0.4))),
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.5, 0.25, 0.15, 0.1))),
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.25, 0.1, 0.4, 0.25))),
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.1, 0.4, 0.4, 0.1))),
  as.factor(sample(1:4, 500, replace = TRUE, prob = c(0.35, 0.25, 0.15, 0.25)))
)
# create labels
levels_4 <- c("Independent", "Slightly dependent",
              "Dependent", "Severely dependent")
# create item labels
items <- c("Q1", "Q2", "Q3", "Q4", "Q5")
# plot stacked frequencies of 5 (ordered) item-scales
## Not run:
if (interactive()) {
  tab_stackfrq(likert_4, value.labels = levels_4, var.labels = items)
  # -------------------------------
  # Data from the EUROFAMCARE sample dataset
  # Auto-detection of labels
  # -------------------------------
  data(efc)
  # retrieve first item of COPE-index scale
  start <- which(colnames(efc) == "c82cop1")
  # retrieve last item of COPE-index scale
  end <- which(colnames(efc) == "c90cop9")
  tab_stackfrq(efc[, c(start:end)], alternate.rows = TRUE)
  tab_stackfrq(efc[, c(start:end)], alternate.rows = TRUE,
               show.n = TRUE, show.na = TRUE)
  # --------------------------------
  # User defined style sheet
  # --------------------------------
  tab_stackfrq(efc[, c(start:end)], alternate.rows = TRUE,
               show.total = TRUE, show.skew = TRUE, show.kurtosis = TRUE,
               CSS = list(css.ncol = "border-left:1px dotted black;",
                          css.summary = "font-style:italic;"))
}
## End(Not run)
Summary of contingency tables as HTML table
Description
Shows contingency tables as HTML file in browser or viewer pane, or saves them as file.
Usage
tab_xtab(
var.row,
var.col,
weight.by = NULL,
title = NULL,
var.labels = NULL,
value.labels = NULL,
wrap.labels = 20,
show.obs = TRUE,
show.cell.prc = FALSE,
show.row.prc = FALSE,
show.col.prc = FALSE,
show.exp = FALSE,
show.legend = FALSE,
show.na = FALSE,
show.summary = TRUE,
drop.empty = TRUE,
statistics = c("auto", "cramer", "phi", "spearman", "kendall", "pearson", "fisher"),
string.total = "Total",
digits = 1,
tdcol.n = "black",
tdcol.expected = "#339999",
tdcol.cell = "#993333",
tdcol.row = "#333399",
tdcol.col = "#339933",
emph.total = FALSE,
emph.color = "#f8f8f8",
prc.sign = " %",
hundret = "100.0",
CSS = NULL,
encoding = NULL,
file = NULL,
use.viewer = TRUE,
remove.spaces = TRUE,
...
)
sjt.xtab(
var.row,
var.col,
weight.by = NULL,
title = NULL,
var.labels = NULL,
value.labels = NULL,
wrap.labels = 20,
show.obs = TRUE,
show.cell.prc = FALSE,
show.row.prc = FALSE,
show.col.prc = FALSE,
show.exp = FALSE,
show.legend = FALSE,
show.na = FALSE,
show.summary = TRUE,
drop.empty = TRUE,
statistics = c("auto", "cramer", "phi", "spearman", "kendall", "pearson", "fisher"),
string.total = "Total",
digits = 1,
tdcol.n = "black",
tdcol.expected = "#339999",
tdcol.cell = "#993333",
tdcol.row = "#333399",
tdcol.col = "#339933",
emph.total = FALSE,
emph.color = "#f8f8f8",
prc.sign = " %",
hundret = "100.0",
CSS = NULL,
encoding = NULL,
file = NULL,
use.viewer = TRUE,
remove.spaces = TRUE,
...
)
Arguments
var.row |
Variable that should be displayed in the table rows. |
var.col |
Variable that should be displayed in the table columns. |
weight.by |
Vector of weights that will be applied to weight all cases.
Must be a vector of same length as the input vector. Default is
|
title |
String, will be used as table caption. |
var.labels |
Character vector with variable names, which will be used to label variables in the output. |
value.labels |
Character vector (or |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
show.obs |
Logical, if |
show.cell.prc |
Logical, if |
show.row.prc |
Logical, if |
show.col.prc |
Logical, if |
show.exp |
Logical, if |
show.legend |
logical, if |
show.na |
logical, if |
show.summary |
Logical, if |
drop.empty |
Logical, if |
statistics |
Name of measure of association that should be computed. May
be one of |
string.total |
Character label for the total column / row header |
digits |
Amount of decimals for estimates |
tdcol.n |
Color for highlighting count (observed) values in table cells. Default is black. |
tdcol.expected |
Color for highlighting expected values in table cells. Default is cyan. |
tdcol.cell |
Color for highlighting cell percentage values in table cells. Default is red. |
tdcol.row |
Color for highlighting row percentage values in table cells. Default is blue. |
tdcol.col |
Color for highlighting column percentage values in table cells. Default is green. |
emph.total |
Logical, if |
emph.color |
Logical, if |
prc.sign |
The percentage sign that is printed in the table cells, in HTML-format.
Default is |
hundret |
Default value that indicates the 100-percent column-sums (since rounding values
may lead to non-exact results). Default is |
CSS |
A |
encoding |
String, indicating the charset encoding used for variable and
value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
... |
Other arguments, currently passed down to the test statistics functions
|
Value
Invisibly returns
- the web page style sheet (page.style),
- the web page content (page.content),
- the complete html-output (page.complete) and
- the html-table with inline-css for use with knitr (knitr)
for further use.
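For orientation, the expected values (see show.exp) and one of the association measures (statistics = "cramer") can be reproduced with base R. This is a sketch using the textbook formula for Cramér's V on built-in data; it is an assumption that the function's summary matches it in every detail.

```r
# Contingency table from a built-in dataset (illustrative)
tab <- table(cylinders = mtcars$cyl, gears = mtcars$gear)

# Chi-squared test; warnings about small expected counts are suppressed
# because only the statistic and the expected values are needed here
chi <- suppressWarnings(chisq.test(tab, correct = FALSE))

# Expected cell counts under independence
expected <- chi$expected

# Cramér's V: sqrt(chi-squared / (n * (min(rows, cols) - 1)))
n <- sum(tab)
cramers_v <- sqrt(unname(chi$statistic) / (n * (min(dim(tab)) - 1)))

round(expected, 1)
cramers_v
```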
Examples
# prepare sample data set
data(efc)
# print simple cross table with labels
## Not run:
if (interactive()) {
  tab_xtab(efc$e16sex, efc$e42dep)
  # print cross table with manually set
  # labels and expected values
  tab_xtab(
    efc$e16sex,
    efc$e42dep,
    var.labels = c("Elder's gender", "Elder's dependency"),
    show.exp = TRUE
  )
  # print minimal cross table with labels, total col/row highlighted
  tab_xtab(efc$e16sex, efc$e42dep, show.cell.prc = FALSE, emph.total = TRUE)
  # User defined style sheet
  tab_xtab(efc$e16sex, efc$e42dep,
           CSS = list(css.table = "border: 2px solid;",
                      css.tdata = "border: 1px solid;",
                      css.horline = "border-bottom: double blue;"))
  # ordinal data, use Kendall's tau
  tab_xtab(efc$e42dep, efc$quol_5, statistics = "kendall")
  # calculate Spearman's rho, with continuity correction
  tab_xtab(
    efc$e42dep,
    efc$quol_5,
    statistics = "spearman",
    exact = FALSE,
    continuity = TRUE
  )
}
## End(Not run)
View structure of labelled data frames
Description
Save (or show) content of an imported SPSS, SAS or Stata data file,
or any similar labelled data.frame, as HTML table.
This quick overview shows variable ID number, name, label,
type and associated value labels. The result can be
considered as "codeplan" of the data frame.
Usage
view_df(
x,
weight.by = NULL,
alternate.rows = TRUE,
show.id = TRUE,
show.type = FALSE,
show.values = TRUE,
show.string.values = FALSE,
show.labels = TRUE,
show.frq = FALSE,
show.prc = FALSE,
show.wtd.frq = FALSE,
show.wtd.prc = FALSE,
show.na = FALSE,
max.len = 15,
sort.by.name = FALSE,
wrap.labels = 50,
verbose = FALSE,
CSS = NULL,
encoding = NULL,
file = NULL,
use.viewer = TRUE,
remove.spaces = TRUE
)
Arguments
x |
A (labelled) data frame, imported by |
weight.by |
Name of variable in |
alternate.rows |
Logical, if |
show.id |
Logical, if |
show.type |
Logical, if |
show.values |
Logical, if |
show.string.values |
Logical, if |
show.labels |
Logical, if |
show.frq |
Logical, if |
show.prc |
Logical, if |
show.wtd.frq |
Logical, if |
show.wtd.prc |
Logical, if |
show.na |
logical, if |
max.len |
Numeric, indicates how many values and value labels per variable are shown. Useful for variables with many different values, where the output can be truncated. |
sort.by.name |
Logical, if |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
verbose |
Logical, if |
CSS |
A |
encoding |
Character vector, indicating the charset encoding used
for variable and value labels. Default is |
file |
Destination file, if the output should be saved as file.
If |
use.viewer |
Logical, if |
remove.spaces |
Logical, if |
Value
Invisibly returns
- the web page style sheet (page.style),
- the web page content (page.content),
- the complete html-output (page.complete) and
- the html-table with inline-css for use with knitr (knitr)
for further use.
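The kind of "codeplan" that this function displays can be sketched in base R by reading the label attributes of each column. The sketch assumes that variable labels are stored in the "label" attribute and value labels in the "labels" attribute, as is common for data imported with haven or labelled with sjlabelled. The data frame and the object name codeplan are hypothetical.

```r
# Small labelled data frame (illustrative)
df <- data.frame(sex = c(1, 2, 2, 1), age = c(34, 51, 47, 29))
attr(df$sex, "label") <- "Gender"
attr(df$sex, "labels") <- c(male = 1, female = 2)
attr(df$age, "label") <- "Age in years"

# Collect ID, name, variable label and value labels per column
codeplan <- data.frame(
  ID = seq_along(df),
  Name = names(df),
  Label = vapply(df, function(x) {
    l <- attr(x, "label")
    if (is.null(l)) "" else l
  }, character(1), USE.NAMES = FALSE),
  Values = vapply(df, function(x) {
    v <- attr(x, "labels")
    if (is.null(v)) "" else paste0(v, " = ", names(v), collapse = "; ")
  }, character(1), USE.NAMES = FALSE),
  row.names = NULL
)

codeplan
```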
Examples
## Not run:
# init dataset
data(efc)
# view variables
view_df(efc)
# view variables w/o values and value labels
view_df(efc, show.values = FALSE, show.labels = FALSE)
# view variables including variable type, ordered by name
view_df(efc, sort.by.name = TRUE, show.type = TRUE)
# User defined style sheet
view_df(efc,
        CSS = list(css.table = "border: 2px solid;",
                   css.tdata = "border: 1px solid;",
                   css.arc = "color:blue;"))
## End(Not run)