1 Introduction

This document accompanies the book Partial Least Squares Structural Equation Modeling (PLS-SEM) Using R by Joseph Hair, Tomas Hult, Christian M. Ringle, Marko Sarstedt, Nicholas Danks, and Soumya Ray.

It provides a concise version of the book's R code and output for the example corporate reputation model.

2 Introduction to SEMinR (Chapter 3)

2.1 Installing and loading the package

To download and install the SEMinR package, call install.packages("seminr"). (You only need to do this once to equip RStudio on your computer with SEMinR.)

To load the SEMinR library, use library(seminr). (You must do this every time you restart RStudio and wish to use SEMinR.)

library(seminr)

2.2 Load and inspect the data

The data set accompanying the book (“Corporate Reputation Data.csv”) is included in the SEMinR package.

corp_rep_data <- seminr::corp_rep_data


Alternatively, we can load the data by importing it from another file such as "Corporate Reputation Data.csv".

corp_rep_data <- read.csv(file = "Corporate Reputation Data.csv", 
  header = TRUE, sep = ";")


Take a quick look at the data with head().

head(corp_rep_data)

2.3 Model setup

2.3.1 Model and measurement details

Here, we work with a simple corporate reputation model as displayed below in Fig. 1. Tab. 1 shows the model’s measurement details, i.e. the constructs, variable names and items.

Fig. 1 Simple corporate reputation model.


Tab. 1 Measurement details for the simple corporate reputation model.
Construct Variable name Item
Competence (COMP) comp_1 [The company] is a top competitor in its market.
Competence (COMP) comp_2 As far as I know, [the company] is recognized worldwide.
Competence (COMP) comp_3 I believe that [the company] performs at a premium level.
Likeability (LIKE) like_1 [The company] is a company that I can better identify with than other companies.
Likeability (LIKE) like_2 [The company] is a company that I would regret more not having if it no longer existed than I would other companies.
Likeability (LIKE) like_3 I regard [the company] as a likeable company.
Customer Satisfaction (CUSA) cusa I am satisfied with [the company].
Customer Loyalty (CUSL) cusl_1 I would recommend [company] to friends and relatives.
Customer Loyalty (CUSL) cusl_2 If I had to choose again, I would choose [company] as my mobile phone services provider.
Customer Loyalty (CUSL) cusl_3 I will remain a customer of [company] in the future.

2.3.2 Create a measurement model

The constructs() function specifies the list of all construct measurement models. Within this list you can define various constructs:

  • composite() specifies the measurement of individual constructs.
  • interaction_term() specifies interaction terms.
  • higher_composite() specifies hierarchical component models, i.e. higher-order constructs (Sarstedt et al., 2019).

In turn, composite() relies on two functions to specify a construct's items:

  • multi_items() creates a vector of multiple measurement items with similar names.
  • single_item() describes a single measurement item.

For example, the composite COMP incorporates the items comp_1 to comp_3.

corp_rep_mm <- constructs(
  composite("COMP", multi_items("comp_", 1:3)),
  composite("LIKE", multi_items("like_", 1:3)),
  composite("CUSA", single_item("cusa")),
  composite("CUSL", multi_items("cusl_", 1:3)))
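For intuition, multi_items() simply expands a prefix and a range of suffixes into a vector of indicator names, much like base R's paste0() (a base-R analogy, not the seminr source code):

```r
# Base-R analogy for multi_items("comp_", 1:3): paste a prefix onto a
# sequence of suffixes to obtain the indicator names.
paste0("comp_", 1:3)
# [1] "comp_1" "comp_2" "comp_3"
```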

2.3.3 Create a structural model

The structural model indicates the sequence of the constructs and the relationships between them.

  • relationships() specifies all the structural relationships between all constructs.
  • paths() specifies relationships between a specific set of antecedents and outcomes.

For example, to specify the relationships from COMP and LIKE to CUSA and CUSL, we use the from = and to = arguments in the paths() function: paths(from = c("COMP", "LIKE"), to = c("CUSA", "CUSL")).

corp_rep_sm <- relationships(
  paths(from = c("COMP", "LIKE"), to = c("CUSA", "CUSL")),
  paths(from = c("CUSA"), to = c("CUSL")))

2.3.4 Estimating the model

To estimate a PLS path model, several algorithmic options and argument settings must be selected. These can be reviewed by calling the function’s documentation with ?estimate_pls.

Here, we specify the data (data = corp_rep_data), the measurement model (measurement_model = corp_rep_mm), and the structural model (structural_model = corp_rep_sm), as well as the weighting scheme (inner_weights = path_weighting). In addition, we handle missing data: missing values are indicated by -99 (missing_value = "-99") and replaced by the mean (missing = mean_replacement).

corp_rep_pls_model <- estimate_pls(data = corp_rep_data,
  measurement_model = corp_rep_mm,
  structural_model  = corp_rep_sm,
  inner_weights = path_weighting,
  missing = mean_replacement,
  missing_value = "-99")
## Generating the seminr model
## All 344 observations are valid.

2.3.5 Summarizing the model

Once the model has been estimated, a summarized report of the results can be generated by using the summary() function.

summary_corp_rep <- summary(corp_rep_pls_model)


The summary() function applied to a SEMinR model object produces a summary.seminr_model class object. Its sub-objects (see Tab. 2) serve as basis for the assessment of the measurement and structural model (Hair et al., 2019).

Tab. 2 Elements of the summary.seminr_model object.
Sub-object Content
meta The estimation function and version information.
iterations The number of iterations for the PLS algorithm to converge.
paths The model’s path coefficients and (adjusted) R2 values.
total_effects The model’s total effects.
total_indirect_effects The model’s total indirect effects.
loadings The outer loadings for all constructs.
weights The outer weights for all constructs.
validity The metrics necessary to evaluate the construct measures’ validity.
reliability The metrics necessary to evaluate the construct measures’ reliability.
composite_scores The estimated scores for constructs.
vif_antecedents The metrics used to evaluate structural model collinearity.
fSquare The f2 metric for all structural model relationships.
descriptives The descriptive statistics of the indicator data.
it_criteria The Information Theoretic model selection criteria for the estimated model.


For example, calling summary_corp_rep$paths lets us inspect the model’s path coefficients and (adjusted) R2 values. Similarly, calling summary_corp_rep$reliability returns the construct reliability metrics, which we can plot with plot(summary_corp_rep$reliability).

summary_corp_rep$paths
##         CUSA  CUSL
## R^2    0.295 0.562
## AdjR^2 0.290 0.558
## COMP   0.162 0.009
## LIKE   0.424 0.342
## CUSA       . 0.504
summary_corp_rep$reliability
##      alpha  rhoC   AVE  rhoA
## COMP 0.776 0.865 0.681 0.832
## LIKE 0.831 0.899 0.747 0.836
## CUSA 1.000 1.000 1.000 1.000
## CUSL 0.831 0.899 0.748 0.839
## 
## Alpha, rhoC, and rhoA should exceed 0.7 while AVE should exceed 0.5
plot(summary_corp_rep$reliability)


To check if and when the algorithm converged, we can inspect the number of iterations in summary_corp_rep$iterations.

summary_corp_rep$iterations
## [1] 4


We can access summary statistics such as mean, standard deviation and number of missing values for the model’s items and constructs by inspecting the summary_corp_rep$descriptives$statistics object.

Specifically, we call summary_corp_rep$descriptives$statistics$items to get the item statistics and summary_corp_rep$descriptives$statistics$constructs for the construct statistics.

summary_corp_rep$descriptives$statistics$items
##                    No. Missing  Mean Median   Min   Max Std.Dev. Kurtosis Skewness
## serviceprovider  1.000   0.000 2.000  2.000 1.000 4.000    1.004    2.477    0.744
## servicetype      2.000   0.000 1.637  2.000 1.000 2.000    0.482    1.323   -0.568
## comp_1           3.000   0.000 4.648  5.000 1.000 7.000    1.435    2.664   -0.263
## comp_2           4.000   0.000 5.424  6.000 1.000 7.000    1.377    2.375   -0.564
## comp_3           5.000   0.000 5.221  5.500 1.000 7.000    1.460    2.797   -0.674
## like_1           6.000   0.000 4.584  5.000 1.000 7.000    1.550    2.589   -0.403
## like_2           7.000   0.000 4.250  4.000 1.000 7.000    1.850    2.095   -0.311
## like_3           8.000   0.000 4.480  5.000 1.000 7.000    1.873    2.055   -0.324
## cusl_1           9.000   3.000 5.129  5.000 1.000 7.000    1.515    3.246   -0.789
## cusl_2          10.000   4.000 5.276  6.000 1.000 7.000    1.746    3.022   -0.947
## cusl_3          11.000   3.000 5.651  6.000 1.000 7.000    1.657    3.899   -1.296
## cusa            12.000   1.000 5.440  6.000 1.000 7.000    1.175    3.748   -0.765
## csor_1          13.000   0.000 4.235  4.000 1.000 7.000    1.471    2.605   -0.042
## csor_2          14.000   0.000 3.076  3.000 1.000 7.000    1.654    2.427    0.495
## csor_3          15.000   0.000 3.988  4.000 1.000 7.000    1.481    2.521   -0.061
## csor_4          16.000   0.000 3.125  3.000 1.000 7.000    1.464    2.527    0.196
## csor_5          17.000   0.000 3.983  4.000 1.000 7.000    1.585    2.297   -0.042
## csor_global     18.000   0.000 4.988  5.000 1.000 7.000    1.291    2.363   -0.141
## attr_1          19.000   0.000 4.991  5.000 1.000 7.000    1.460    2.906   -0.565
## attr_2          20.000   0.000 2.945  2.000 1.000 7.000    2.101    1.868    0.572
## attr_3          21.000   0.000 4.811  5.000 1.000 7.000    1.454    2.423   -0.274
## attr_global     22.000   0.000 5.587  6.000 2.000 7.000    1.216    2.884   -0.652
## perf_1          23.000   0.000 4.619  5.000 1.000 7.000    1.393    2.633   -0.201
## perf_2          24.000   0.000 5.070  5.000 1.000 7.000    1.334    2.790   -0.438
## perf_3          25.000   0.000 4.721  5.000 1.000 7.000    1.507    2.644   -0.420
## perf_4          26.000   0.000 4.919  5.000 1.000 7.000    1.436    2.803   -0.460
## perf_5          27.000   0.000 4.971  5.000 1.000 7.000    1.442    2.688   -0.457
## perf_global     28.000   0.000 5.977  6.000 3.000 7.000    0.981    2.904   -0.752
## qual_1          29.000   0.000 5.052  5.000 1.000 7.000    1.399    3.223   -0.644
## qual_2          30.000   0.000 4.372  5.000 1.000 7.000    1.497    2.451   -0.290
## qual_3          31.000   0.000 5.081  5.000 1.000 7.000    1.473    2.969   -0.678
## qual_4          32.000   0.000 4.413  4.000 1.000 7.000    1.490    2.600   -0.215
## qual_5          33.000   0.000 5.012  5.000 1.000 7.000    1.424    2.642   -0.530
## qual_6          34.000   0.000 4.924  5.000 1.000 7.000    1.537    2.445   -0.438
## qual_7          35.000   0.000 4.398  4.000 1.000 7.000    1.556    2.511   -0.224
## qual_8          36.000   0.000 4.837  5.000 1.000 7.000    1.417    2.415   -0.197
## qual_global     37.000   0.000 6.026  6.000 2.000 7.000    1.020    3.222   -0.862
## switch_1        38.000   0.000 3.765  4.000 1.000 5.000    1.287    2.351   -0.723
## switch_2        39.000   0.000 3.352  3.000 1.000 5.000    1.314    1.947   -0.282
## switch_3        40.000   0.000 3.881  4.000 1.000 5.000    1.227    2.549   -0.807
## switch_4        41.000   0.000 2.837  3.000 1.000 4.000    1.149    1.745   -0.453
summary_corp_rep$descriptives$statistics$constructs
##        No. Missing   Mean Median    Min   Max Std.Dev. Kurtosis Skewness
## COMP 1.000   0.000 -0.000  0.075 -2.911 1.668    1.000    2.746   -0.449
## LIKE 2.000   0.000 -0.000  0.032 -2.300 1.698    1.000    2.351   -0.265
## CUSA 3.000   0.000  0.000  0.477 -3.783 1.329    1.000    3.759   -0.766
## CUSL 4.000   0.000  0.000  0.221 -3.073 1.173    1.000    3.570   -1.000

2.3.6 Bootstrapping the model

In PLS-SEM, we need to perform bootstrapping to estimate standard errors and compute confidence intervals.

We run the bootstrap using the bootstrap_model() function with 1,000 subsamples (nboot = 1000) and set a seed (seed = 123) to obtain reproducible results.

Next, we summarize the bootstrap model with sum_boot_corp_rep <- summary(boot_corp_rep) and obtain results on model estimates such as the path coefficients with sum_boot_corp_rep$bootstrapped_paths.

boot_corp_rep <- bootstrap_model(seminr_model = corp_rep_pls_model,
  nboot = 1000,
  cores = NULL,
  seed = 123)
## Bootstrapping model using seminr...
## SEMinR Model successfully bootstrapped

sum_boot_corp_rep <- summary(boot_corp_rep)

sum_boot_corp_rep$bootstrapped_paths
##                Original Est. Bootstrap Mean Bootstrap SD T Stat. 2.5% CI 97.5% CI
## COMP  ->  CUSA         0.162          0.166        0.068   2.374   0.038    0.298
## COMP  ->  CUSL         0.009          0.011        0.056   0.165  -0.098    0.126
## LIKE  ->  CUSA         0.424          0.422        0.062   6.858   0.299    0.542
## LIKE  ->  CUSL         0.342          0.340        0.056   6.059   0.227    0.450
## CUSA  ->  CUSL         0.504          0.504        0.042  11.978   0.419    0.585
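For intuition on where these standard errors and percentile intervals come from, the same resampling logic can be sketched in base R for a simple statistic (a generic illustration on simulated data, not the internals of bootstrap_model(); the variable names are hypothetical):

```r
# Generic percentile bootstrap for the mean of a simulated sample.
set.seed(123)
x <- rnorm(200, mean = 5, sd = 1.2)             # hypothetical data
boot_means <- replicate(1000, mean(sample(x, replace = TRUE)))
sd(boot_means)                                  # bootstrap standard error
quantile(boot_means, c(0.025, 0.975))           # 95% percentile CI
```

SEMinR applies this idea to every model parameter: each subsample re-estimates the full PLS path model, and the resulting distributions yield the standard errors and CIs reported above.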


The summary.boot_seminr_model object, i.e. sum_boot_corp_rep, contains the following sub-objects (Tab. 3):

Tab. 3 Elements of the summary.boot_seminr_model object.
Sub-object Content
nboot The number of bootstrap subsamples generated during bootstrapping.
bootstrapped_paths The bootstrap estimated standard error, T statistic, and confidence intervals for the path coefficients.
bootstrapped_weights The bootstrap estimated standard error, T statistic, and confidence intervals for the indicator weights.
bootstrapped_loadings The bootstrap estimated standard error, T statistic, and confidence intervals for the indicator loadings.
bootstrapped_HTMT The bootstrap estimated standard error, T statistic, and confidence intervals for the HTMT values.
bootstrapped_total_paths The bootstrap estimated standard error, T statistic, and confidence intervals for the model’s total effects.

3 Evaluation of reflective measurement models (Chapter 4)

3.1 Indicator reliability

For the reflective measurement model, we need to estimate the relationships between the reflectively measured constructs and their indicators (i.e., loadings). Indicator reliability can be calculated by squaring the loadings.

Low indicator reliability may result in biased construct results. Therefore, we evaluate indicator loadings as follows:

  • Indicator loadings above 0.708 are recommended, since they correspond to an explained variance (indicator reliability) of at least 50%.
  • Indicators with loadings between 0.40 and 0.708 should be considered for removal.
  • Indicators with very low loadings (below 0.40) should be removed.

We can get the loadings by inspecting the summary.seminr_model object’s loadings element (summary_corp_rep$loadings).

summary_corp_rep$loadings
##         COMP  LIKE  CUSA  CUSL
## comp_1 0.858 0.000 0.000 0.000
## comp_2 0.798 0.000 0.000 0.000
## comp_3 0.818 0.000 0.000 0.000
## like_1 0.000 0.879 0.000 0.000
## like_2 0.000 0.870 0.000 0.000
## like_3 0.000 0.843 0.000 0.000
## cusa   0.000 0.000 1.000 0.000
## cusl_1 0.000 0.000 0.000 0.833
## cusl_2 0.000 0.000 0.000 0.917
## cusl_3 0.000 0.000 0.000 0.843


We can get the indicator reliability by squaring the loadings.

summary_corp_rep$loadings^2
##         COMP  LIKE  CUSA  CUSL
## comp_1 0.736 0.000 0.000 0.000
## comp_2 0.638 0.000 0.000 0.000
## comp_3 0.669 0.000 0.000 0.000
## like_1 0.000 0.773 0.000 0.000
## like_2 0.000 0.757 0.000 0.000
## like_3 0.000 0.711 0.000 0.000
## cusa   0.000 0.000 1.000 0.000
## cusl_1 0.000 0.000 0.000 0.694
## cusl_2 0.000 0.000 0.000 0.841
## cusl_3 0.000 0.000 0.000 0.710
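As a quick check against the guidelines above, we can collect the reported loadings (values copied from the output, not re-estimated) and compare them to the 0.708 threshold:

```r
# Loadings as reported above; all exceed the 0.708 guideline, so all
# indicator reliabilities (squared loadings) exceed 0.50.
loadings <- c(comp_1 = 0.858, comp_2 = 0.798, comp_3 = 0.818,
              like_1 = 0.879, like_2 = 0.870, like_3 = 0.843,
              cusa = 1.000,
              cusl_1 = 0.833, cusl_2 = 0.917, cusl_3 = 0.843)
all(loadings > 0.708)
# [1] TRUE
round(range(loadings^2), 3)
# [1] 0.637 1.000
```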

3.2 Internal consistency reliability

Internal consistency reliability is the extent to which indicators measuring the same construct are associated with each other.

Of the various indicators of internal consistency reliability, Cronbach’s alpha represents the lower bound (Trizano-Hermosilla & Alvarado, 2016), while the composite reliability \(\rho_C\) (Jöreskog, 1971) represents the upper bound. The exact (or consistent) reliability coefficient \(\rho_A\) usually lies between these bounds and may serve as a good representation of a construct’s internal consistency reliability (Dijkstra, 2010, 2014; Dijkstra & Henseler, 2015).

A construct’s internal consistency reliability is considered acceptable within the following ranges:

  • Recommended value of 0.80 to 0.90.
  • Minimum value of 0.70 (or 0.60 in exploratory research).
  • Maximum value of 0.95 to avoid indicator redundancy, which would compromise content validity (Diamantopoulos et al., 2012).

The reliability indicators can be found in summary_corp_rep$reliability and plotted with plot(summary_corp_rep$reliability).

summary_corp_rep$reliability
##      alpha  rhoC   AVE  rhoA
## COMP 0.776 0.865 0.681 0.832
## LIKE 0.831 0.899 0.747 0.836
## CUSA 1.000 1.000 1.000 1.000
## CUSL 0.831 0.899 0.748 0.839
## 
## Alpha, rhoC, and rhoA should exceed 0.7 while AVE should exceed 0.5
plot(summary_corp_rep$reliability)

3.3 Convergent validity

Convergent validity is the extent to which the construct converges to explain the variance of its indicators. The average variance extracted (AVE) is the mean of the squared loadings of a construct’s indicators. An AVE of 0.50 or higher is considered acceptable (Hair et al., 2021).
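As a small arithmetic check, the AVE reported for COMP can be reproduced from the loadings shown earlier (values copied from the output, not re-estimated):

```r
# AVE for COMP: mean of the squared indicator loadings.
comp_loadings <- c(0.858, 0.798, 0.818)
round(mean(comp_loadings^2), 3)
# [1] 0.681
```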

AVE values can also be accessed at summary_corp_rep$reliability.

summary_corp_rep$reliability
##      alpha  rhoC   AVE  rhoA
## COMP 0.776 0.865 0.681 0.832
## LIKE 0.831 0.899 0.747 0.836
## CUSA 1.000 1.000 1.000 1.000
## CUSL 0.831 0.899 0.748 0.839
## 
## Alpha, rhoC, and rhoA should exceed 0.7 while AVE should exceed 0.5

3.4 Discriminant validity

According to the traditional Fornell-Larcker criterion (Fornell & Larcker, 1981), the square root of each construct’s AVE should be higher than the construct’s highest correlation with any other construct in the model. These results can be obtained via summary_corp_rep$validity$fl_criteria.

However, this metric is not suitable for discriminant validity assessment due to its poor performance in detecting discriminant validity problems (Henseler et al., 2015; Radomir & Moisescu, 2019).

summary_corp_rep$validity$fl_criteria
##       COMP  LIKE  CUSA  CUSL
## COMP 0.825     .     .     .
## LIKE 0.645 0.864     .     .
## CUSA 0.436 0.528 1.000     .
## CUSL 0.450 0.615 0.689 0.865
## 
## FL Criteria table reports square root of AVE on the diagonal and construct correlations on the lower triangle.
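As a quick sanity check, the diagonal of the table above is simply the square root of each construct’s AVE (values copied from the reliability output):

```r
# Square roots of the AVEs reproduce the FL table's diagonal.
round(sqrt(c(COMP = 0.681, LIKE = 0.747, CUSA = 1.000, CUSL = 0.748)), 3)
#  COMP  LIKE  CUSA  CUSL
# 0.825 0.864 1.000 0.865
```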


We recommend the heterotrait-monotrait ratio (HTMT) of the correlations to assess discriminant validity (Henseler et al., 2015).

The HTMT is the mean value of the indicator correlations across constructs (i.e., the heterotrait-heteromethod correlations) relative to the (geometric) mean of the average correlations for the indicators measuring the same construct (i.e., the monotrait-heteromethod correlations).
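To make this definition concrete, the HTMT for two constructs can be sketched in base R on simulated data (a didactic sketch with hypothetical constructs and item names, not SEMinR's implementation):

```r
# Simulate two correlated factors, each measured by three indicators.
set.seed(123)
n  <- 300
f1 <- rnorm(n)
f2 <- 0.6 * f1 + 0.8 * rnorm(n)
X  <- sapply(1:3, function(i) 0.8 * f1 + 0.6 * rnorm(n))  # items of construct 1
Y  <- sapply(1:3, function(i) 0.8 * f2 + 0.6 * rnorm(n))  # items of construct 2
R  <- cor(cbind(X, Y))
hetero <- mean(R[1:3, 4:6])                       # heterotrait-heteromethod
mono_x <- mean(R[1:3, 1:3][lower.tri(diag(3))])   # monotrait, construct 1
mono_y <- mean(R[4:6, 4:6][lower.tri(diag(3))])   # monotrait, construct 2
hetero / sqrt(mono_x * mono_y)                    # HTMT
```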

Discriminant validity problems are present when HTMT values

  • exceed 0.90 for constructs that are conceptually very similar.
  • exceed 0.85 for constructs that are conceptually more distinct.

We can get the HTMT matrix by calling summary_corp_rep$validity$htmt.

summary_corp_rep$validity$htmt
##       COMP  LIKE  CUSA CUSL
## COMP     .     .     .    .
## LIKE 0.780     .     .    .
## CUSA 0.465 0.577     .    .
## CUSL 0.532 0.737 0.755    .


We can use bootstrap confidence intervals to test if the HTMT is significantly different from 1.00 (Henseler et al., 2015) or a lower threshold value such as 0.90 or 0.85, which should be defined based on the study context (Franke & Sarstedt, 2019).

We obtain the 90% bootstrap CIs for the HTMT by calling sum_boot_corp_rep <- summary(boot_corp_rep, alpha = 0.10) and inspecting sum_boot_corp_rep$bootstrapped_HTMT.

sum_boot_corp_rep <- summary(boot_corp_rep, alpha = 0.10)

sum_boot_corp_rep$bootstrapped_HTMT
##                Original Est. Bootstrap Mean Bootstrap SD T Stat. 5% CI 95% CI
## COMP  ->  LIKE         0.780          0.782        0.041  19.009 0.716  0.849
## COMP  ->  CUSA         0.465          0.467        0.060   7.806 0.368  0.563
## COMP  ->  CUSL         0.532          0.534        0.059   8.961 0.438  0.631
## LIKE  ->  CUSA         0.577          0.577        0.044  13.153 0.502  0.647
## LIKE  ->  CUSL         0.737          0.736        0.041  17.872 0.669  0.802
## CUSA  ->  CUSL         0.755          0.755        0.034  22.232 0.699  0.809
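Reading the output above against the 0.90 threshold: discriminant validity is supported when the upper bound of each 90% CI remains below the threshold (upper bounds copied from the output):

```r
# Upper bounds of the 90% HTMT CIs reported above.
upper_90ci <- c(0.849, 0.563, 0.631, 0.647, 0.802, 0.809)
all(upper_90ci < 0.90)
# [1] TRUE
```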

4 Evaluation of formative measurement models (Chapter 5)

Relevant criteria for evaluating formative measurement models include the assessment of:

  1. Convergent validity.
  2. Indicator collinearity.
  3. Statistical significance and relevance of the indicator weights.

4.1 Model setup

4.1.1 Model and measurement details

Here, we work with an extended corporate reputation model as displayed below in Fig. 2.

Tab. 4 shows the model’s measurement details, i.e. the constructs, variable names and items for the formative constructs:

  • QUAL: The quality of a company’s products and services as well as its quality of customer orientation.
  • PERF: The company’s economic and managerial performance.
  • CSOR: The company’s corporate social responsibility.
  • ATTR: The company’s attractiveness.