How do I create and run an XLSTAT-PLSPM project?

Partial Least Squares Path Modeling (PLS-PM) is a statistical approach for modeling complex multivariable relationships (structural equation models) among observed and latent variables. In recent years, this approach has enjoyed increasing popularity in several sciences (Esposito Vinzi et al., 2007). Structural Equation Models include a number of statistical methodologies that allow the estimation of a causal theoretical network of relationships linking complex latent concepts, each measured by means of a number of observable indicators.

The first presentation of the finalized PLS approach to path models with latent variables was published by Wold in 1979; the main references on the PLS algorithm are Wold (1982 and 1985).
Herman Wold contrasted LISREL (Jöreskog, 1970) "hard modeling" (heavy distribution assumptions, several hundred cases necessary) with PLS "soft modeling" (very few distribution assumptions, few cases can suffice). These two approaches to Structural Equation Modeling have been compared in Jöreskog and Wold (1982).

From the standpoint of structural equation modeling, PLS-PM is a component-based approach where the concept of causality is formulated in terms of linear conditional expectation. PLS-PM seeks optimal linear predictive relationships rather than causal mechanisms, thus privileging a prediction-oriented discovery process over the statistical testing of causal hypotheses. Two very important review papers on the PLS approach to Structural Equation Modeling are Chin (1998, more application oriented) and Tenenhaus et al. (2005, more theory oriented).

Furthermore, PLS Path Modeling can be used for analyzing multiple tables, and it is directly related to more classical data analysis methods used in this field. In fact, PLS-PM may also be viewed as a very flexible approach to multi-block (or multiple table) analysis by means of both the hierarchical PLS path model and the confirmatory PLS path model (Tenenhaus and Hanafi, 2007). This approach shows how the "data-driven" tradition of multiple table analysis can be merged with the "theory-driven" tradition of structural equation modeling, so that multi-block data can be analyzed in light of current knowledge on the conceptual relationships between tables.

In this tutorial we show you, step by step, how to create a project, define a model, estimate the parameters and analyze the results. This tutorial is based on the following paper: [Tenenhaus M., Esposito Vinzi V., Chatelin Y.-M. and Lauro C. (2005). PLS Path Modeling. Computational Statistics & Data Analysis, 48(1), 159-205].

The application is based on real-life data: 250 customers of mobile phone operators were asked several questions in order to model their loyalty. The PLSPM model is based on the European Customer Satisfaction Index (ECSI). In the ECSI model, the latent variables (concepts that cannot be directly measured) are interrelated as displayed below.

plspm1.gif

Each latent variable is related to one or more manifest variables that are measured. In this application case, the manifest variables are questions answered on a 1-10 scale. For example, for the Image latent variable the five manifest variables are:
- It can be trusted in what it says and does
- It is stable and firmly established
- It has a social contribution for the society
- It is concerned with customers
- It is innovative and forward looking
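
To make the idea of a block of manifest variables concrete, the sketch below stores the Image block as a mapping from a latent variable to its manifest variables and checks that every answer lies on the 1-10 scale. The column codes (IMAG1 to IMAG5) and the two sample respondents are hypothetical placeholders, not values taken from the tutorial data set.

```python
# Hypothetical column codes for the five Image questions (not from the tutorial data).
BLOCKS = {
    "Image": ["IMAG1", "IMAG2", "IMAG3", "IMAG4", "IMAG5"],
}


def out_of_scale(rows, columns, low=1, high=10):
    """Return (row index, column, value) for every answer outside [low, high]."""
    problems = []
    for i, row in enumerate(rows):
        for col in columns:
            value = row[col]
            if not (low <= value <= high):
                problems.append((i, col, value))
    return problems


# Two made-up respondents; the second one has an invalid answer for IMAG3.
respondents = [
    {"IMAG1": 7, "IMAG2": 8, "IMAG3": 6, "IMAG4": 7, "IMAG5": 9},
    {"IMAG1": 5, "IMAG2": 4, "IMAG3": 11, "IMAG4": 6, "IMAG5": 5},
]

print(out_of_scale(respondents, BLOCKS["Image"]))  # [(1, 'IMAG3', 11)]
```

A check of this kind is useful before pasting data into a project, since an out-of-range answer usually signals a coding or import error rather than a genuine response.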

An XLSTAT-PLSPM project sheet containing both the data and the results for use in this tutorial can be downloaded by clicking here. XLSTAT-PLSPM projects are special Excel workbook templates. When you create a new project, its default name starts with PLSPMBook. You can then save it under the name you want, but make sure you use the "Save" or "Save as" command of the XLSTAT-PLSPM toolbar so that it is saved, with the *.ppm extension, in the folder dedicated to PLSPM projects.

Note: when you open the PLSPathModeling_ECSI.ppm file, the graphical representation might not display correctly, because it depends on your screen settings. To improve the display, click the "Optimize the display" button of the "PLS Path modeling" toolbar (see below).

A new XLSTAT-PLSPM project contains two sheets that cannot be removed:
- D1: This sheet is initially empty. Add to it all the input data that you want to use.
- PLSPMGraph: This sheet is initially blank and is used to design the model. When you select this sheet, the "PLS Path modeling" toolbar is displayed. The toolbar is hidden when you leave that sheet.

To create the project used in this tutorial, we first generated a new project using the XLSTAT-PLSPM toolbar:

plspm0.gif

plspm2.gif

We then saved it as PLSPathModeling_ECSI.ppm using the "Save as" command of the same toolbar.

Then, we copied the data that were available in an Excel file, and pasted them into the D1 sheet of the Project. Once this is done, you are ready to start creating the model. Move to the PLSPMGraph sheet. The "PLS Path modeling" toolbar is displayed only on that sheet. You can find details on the function of each button in the help.

plspm3.gif

To create several latent variables in a row, double click on the circle button so that it stays pressed while you add variables:

plspm4.gif

You can then add the arrows that indicate how the latent variables are related. To create several arrows in a row, double click on the arrow button so that it stays pressed while you add the arrows.

plspm5.gif

To add an arrow, click on the latent variable from which it should start, hold down the left mouse button, and drag until the mouse cursor is over the latent variable where the arrow should end. Once an arrow is displayed, you can still invert its direction or make it bidirectional by using the contextual menu that is displayed when you click the right mouse button:

plspm6.gif
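
The arrows drawn on the PLSPMGraph sheet amount to a directed graph among the latent variables. The sketch below is one possible way to represent that graph outside of Excel; it is not how XLSTAT stores the model. The arrows used here are illustrative only and are not meant to reproduce the ECSI diagram, and modeling a double-direction arrow as two opposite arrows is an assumption.

```python
def invert(arrows, source, target):
    """Reverse the arrow source -> target (mimics the 'invert direction' action)."""
    arrows = set(arrows)
    arrows.remove((source, target))
    arrows.add((target, source))
    return arrows


def make_double(arrows, source, target):
    """Keep source -> target and add target -> source (assumed meaning of 'double direction')."""
    return set(arrows) | {(source, target), (target, source)}


def predecessors(arrows, node):
    """Latent variables that have an arrow pointing at `node`."""
    return sorted(s for s, t in arrows if t == node)


# Illustrative arrows between three latent variables.
arrows = {("Image", "Expectation"), ("Image", "Loyalty"), ("Expectation", "Loyalty")}

print(predecessors(arrows, "Loyalty"))   # ['Expectation', 'Image']
flipped = invert(arrows, "Image", "Expectation")
print(predecessors(flipped, "Image"))    # ['Expectation']
```

The predecessors of a latent variable are the explanatory variables of its structural regression, which is why the direction of each arrow matters.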

Once all the arrows have been added, you can define the manifest variables that relate to each latent variable (this can also be done after adding the latent variables). To add manifest variables to a latent variable, the fastest way is to double-click the latent variable. This activates the D1 sheet and displays a dialog box where you give a proper name to the latent variable, select the manifest variables on D1 and define a few settings.

plspm7.gif

The options available in that dialog box are described in the XLSTAT help. The most important one is the mode. In Mode A (reflective mode), the latent variable is responsible for what is measured by the manifest variables, and in Mode B (formative mode), the manifest variables construct the latent variable.
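
In the PLS-PM literature, the mode also determines how the outer weights of a block are updated: Mode A uses the covariance between each manifest variable and the inner estimate of the latent variable (one simple regression per manifest variable), whereas Mode B uses the coefficients of a multiple regression of the inner estimate on all the manifest variables of the block. The following is a minimal sketch of that single step on simulated data; the function names and the data are assumptions made for illustration, and this is not XLSTAT's implementation.

```python
import numpy as np


def standardize(a):
    """Center and scale to unit variance (column-wise for 2-D input)."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0)


def outer_weights(X, z, mode="A"):
    """Outer weights of one block X (n x p) given an inner estimate z (n,)."""
    X, z = standardize(X), standardize(z)
    n = X.shape[0]
    if mode == "A":
        # Mode A: covariance of each manifest variable with z (simple regressions).
        return X.T @ z / n
    if mode == "B":
        # Mode B: multiple regression of z on all manifest variables of the block.
        return np.linalg.lstsq(X, z, rcond=None)[0]
    raise ValueError("mode must be 'A' or 'B'")


# Simulated data: three noisy indicators of the same underlying variable.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
X = np.column_stack([z + rng.normal(scale=0.5, size=200) for _ in range(3)])

w_a = outer_weights(X, z, "A")
w_b = outer_weights(X, z, "B")
print(np.round(w_a, 2), np.round(w_b, 2))
```

Because the variables are standardized, the Mode A weights in this sketch are simply correlations, while the Mode B weights account for the correlations among the manifest variables of the block.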

For example, this is how the dialog box looked once filled in for the latent variable Expectation:

plspm8.gif

Once the manifest variables have been defined for each latent variable, you can start computing the model. To run the model, click the run button of the "PLS Path modeling" toolbar.

plspm9.gif

This displays the "Run" dialog box, where many options are available. For this tutorial the following options have been used:

plspm10.gif

plspm11.gif

plspm12.gif

plspm13.gif
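
To give an idea of what estimating a PLS path model involves, here is a minimal sketch of the iterative algorithm described in the PLS-PM literature, assuming Mode A for every block and the centroid inner weighting scheme. It is not XLSTAT's implementation, it does not reflect the options selected in the "Run" dialog box above, and the function name, the two-variable model and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def standardize(a):
    """Center and scale to unit variance (column-wise for 2-D input)."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0)


def plspm_mode_a(blocks, links, max_iter=100, tol=1e-6):
    """Tiny PLS-PM sketch: Mode A outer weights, centroid inner scheme.

    blocks: dict name -> (n x p) array of manifest variables
    links:  list of (source, target) arrows; every latent variable needs at least one
    Returns (scores, paths): standardized latent scores and OLS path coefficients.
    """
    names = list(blocks)
    X = {k: standardize(v) for k, v in blocks.items()}
    n = next(iter(X.values())).shape[0]
    w = {k: np.ones(X[k].shape[1]) for k in names}
    y = {k: standardize(X[k] @ w[k]) for k in names}
    # Latent variables joined to k by an arrow, in either direction.
    neighbours = {k: [s if t == k else t for s, t in links if k in (s, t)]
                  for k in names}
    for _ in range(max_iter):
        # Inner estimate (centroid scheme): signed sum of the neighbouring scores.
        z = {k: standardize(sum(np.sign(y[k] @ y[j]) * y[j] for j in neighbours[k]))
             for k in names}
        # Outer update (Mode A): covariance of each manifest variable with z.
        new_w = {k: X[k].T @ z[k] / n for k in names}
        change = max(np.max(np.abs(new_w[k] - w[k])) for k in names)
        w = new_w
        y = {k: standardize(X[k] @ w[k]) for k in names}
        if change < tol:
            break
    # Path coefficients: regress each endogenous score on its predecessors.
    paths = {}
    for target in names:
        sources = [s for s, t in links if t == target]
        if sources:
            A = np.column_stack([y[s] for s in sources])
            coef = np.linalg.lstsq(A, y[target], rcond=None)[0]
            paths.update({(s, target): c for s, c in zip(sources, coef)})
    return y, paths


def noisy_block(latent, p, rng):
    """Simulate p manifest variables as noisy copies of a latent variable."""
    return np.column_stack(
        [latent + rng.normal(scale=0.6, size=latent.size) for _ in range(p)]
    )


# Simulated data for 250 respondents and a single arrow Image -> Loyalty.
rng = np.random.default_rng(1)
image = rng.normal(size=250)
loyalty = 0.7 * image + rng.normal(scale=0.7, size=250)
blocks = {"Image": noisy_block(image, 5, rng), "Loyalty": noisy_block(loyalty, 3, rng)}

scores, paths = plspm_mode_a(blocks, [("Image", "Loyalty")])
print(paths)
```

The sketch alternates between an inner step (each latent variable is approximated from its neighbours in the graph) and an outer step (the weights of the manifest variables are updated) until the weights stabilize, and only then estimates the path coefficients by ordinary least squares.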


Copyright © 2008 Kovach Computing Services, Anglesey, Wales. All Rights Reserved. Portions copyright Addinsoft, Provalis Research, and Data Description Inc.

Last modified 25 January, 2008