________________________________________________________________________

Outline

0. Preparation

1. Enumerate time specific LCAs independently

2. Estimate LTA models

3. Evaluate the invariance assumption

4. Examine transition estimates

Data Source: The data used to illustrate these analyses include elementary school student Science Attitude survey items collected during 7th and 10th grades from the Longitudinal Study of American Youth (LSAY; Miller, 2015).

________________________________________________________________________

Preparation

Download the R-Project

Link to Github repository here: https://github.com/MM4DBER/Intro-to-LTA

Project folder organization

The following sub-folders will be used to contain files:

data; 2. enum_LCA_time1; 3. enum_LCA_time2; 4. LTA_models; 5. figures

Note regarding project location: If the main project folder is located within too many nested folders it may result in a file-path error when estimating models with MplusAutomation.

Notation guide

In the following script, three types of comments are included in code blocks in which models are estimated using MplusAutomation.

Annotate in R: The hash symbol # identifies comments written in R-language form.
Annotate in Mplus input: Within the mplusObject() function all text used to generate Mplus input files is enclosed within quotation marks (green text). To add comments within quotations the Mplus language convention is used (e.g., !!! annotate Mplus input !!!).
Annotate context-specific syntax: To signal to the user areas of the syntax which vary based on the particular modeling context the text, NOTE CHANGE: is used. This syntax will change based on your applied example (e.g., class #).

To install package {rhdf5}

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("rhdf5")

Load packages

library(MplusAutomation)
library(rhdf5)
library(tidyverse)       
library(here)            
library(glue)            
library(gt) 
library(reshape2)
library(cowplot)
library(patchwork)
library(PNWColors)
library(ggrepel)

Read in the LSAY data file named lsay_lta_faq_2021.csv.

lsay_data <- read_csv(here("data","lsay_lta_faq_2021.csv"),
                      na=c("9999","9999.00"))

________________________________________________________________________

Enumeration

Enumeration for each time point is done independently first, then the results are compared.

Enumerate Time Point 1 (7th grade)

# NOTE CHANGE: '1:6' indicates the number of k-class models to estimate
# User can change this number to fit research context
# In this example, the code loops or iterates over values 1 through 6 ( '{k}' )

t1_enum_k_16  <- lapply(1:6, function(k) { 
  enum_t1  <- mplusObject(                 
    
# The 'glue' function inserts R code within a string or "quoted green text" using the syntax {---}

  TITLE = glue("Class-{k}_Time1"), 
  
  VARIABLE = glue( 
     "!!! NOTE CHANGE: List of the five 7th grade science attitude indicators !!!
      categorical = ab39m ab39t ab39u ab39w ab39x;  ! 7th grade indicators
           usevar = ab39m ab39t ab39u ab39w ab39x;  
     
      classes = c({k});"),
  
  ANALYSIS = 
     "estimator = mlr; 
      type = mixture;
      !!! NOTE CHANGE: The intial and final start values. Reduce to speed up estimation time. !!!
      starts = 500 100;           
      processors = 10;",
  
  OUTPUT = "sampstat residual tech11 tech14;",
  
  PLOT = 
     "type = plot3; 
      series = ab39m-ab39x(*);",
  
  usevariables = colnames(lsay_data),
  rdata = lsay_data)

# NOTE CHANGE: Fix to match appropriate sub-folder name (e.g., "enum_LCA_time1")
enum_t1_fit <- mplusModeler(enum_t1,
                 dataout = here("enum_LCA_time1", "t1.dat"), 
                 modelout = glue(here("enum_LCA_time1", "c{k}_lca_enum_time1.inp")),
                 check = TRUE, run = TRUE, hashfilename = FALSE)
})

NOTE: It is highly recommended that you check the Mplus output files (.out) to check for convergence warnings or syntax errors. Mplus files may be viewed in the RStudio window (bottom right pane).

Enumerate Time Point 2 (10th grade)

t2_enum_k_16  <- lapply(1:6, function(k) { 
  enum_t2  <- mplusObject(                 
      
  TITLE = glue("Class-{k}_Time2"), 
  
  VARIABLE = glue( 
     "!!! NOTE CHANGE: List of the five 10th grade science attitude indicators !!!
      categorical = ga33a ga33h ga33i ga33k ga33l;
           usevar = ga33a ga33h ga33i ga33k ga33l;
    
      classes = c({k});"),
  
  ANALYSIS = 
     "estimator = mlr; 
      type = mixture;
      starts = 500 100;
      processors = 10;",
  
  OUTPUT = "sampstat residual tech11 tech14;",
  
  PLOT = 
     "type = plot3; 
      series = ga33a-ga33l(*);",
  
  usevariables = colnames(lsay_data),
  rdata = lsay_data)

enum_t2_fit <- mplusModeler(enum_t2, 
                 dataout = here("enum_LCA_time2", "t2.dat"),
                 modelout = glue(here("enum_LCA_time2", "c{k}_lca_enum_time2.inp")),
                 check = TRUE, run = TRUE, hashfilename = FALSE)
})

Create Model Fit Summary Table

Once you decice on the number of classes for each time point, presenting them in one table is useful for the publicatoin. The following syntax for producing publication ready fit tables can be cited using the following citation:

Garber, A. C. (2021). Creating Summary Fit Tables for LCA and LTA Analyses Using MplusAutomation. \(\color{blue}{\text{Retrieved from psyarxiv.com/uq2fh}}\)

Read all models for enumeration table

output_enum_t1 <- readModels(here("enum_LCA_time1"), quiet = TRUE)
output_enum_t2 <- readModels(here("enum_LCA_time2"), quiet = TRUE)

Extract model fit data

enum_extract1 <- LatexSummaryTable(output_enum_t1,                                 
                keepCols = c("Title", "Parameters", "LL", "BIC", "aBIC",
                           "BLRT_PValue", "T11_VLMR_PValue","Observations"))   

enum_extract2 <- LatexSummaryTable(output_enum_t2,                                 
                keepCols = c("Title", "Parameters", "LL", "BIC", "aBIC",
                           "BLRT_PValue", "T11_VLMR_PValue","Observations"))

Calculate Indices Derived from the Log Likelihood (LL)

allFit <- rbind(enum_extract1, enum_extract2) %>% 
  mutate(aBIC = -2*LL+Parameters*log((Observations+2)/24)) %>% 
  mutate(CAIC = -2*LL+Parameters*(log(Observations)+1)) %>% 
  mutate(AWE = -2*LL+2*Parameters*(log(Observations)+1.5)) %>%
  mutate(SIC = -.5*BIC) %>% 
  mutate(expSIC = exp(SIC - max(SIC))) %>% 
  mutate(BF = exp(SIC-lead(SIC))) %>% 
  mutate(cmPk = expSIC/sum(expSIC)) %>% 
  select(1:5,9:10,6:7,13,14)

Format fit table

allFit %>% 
  mutate(Title = str_remove(Title, "_Time.")) %>% 
  gt() %>%
  tab_header(
    title = md("**Model Fit Summary Table**"), subtitle = md("&nbsp;")) %>% 
  cols_label(
    Title = "Classes",
    Parameters = md("Par"),
    LL = md("*LL*"),
    T11_VLMR_PValue = "VLMR",
    BLRT_PValue = "BLRT",
    BF = md("BF"),
    cmPk = md("*cmPk*")) %>%
  tab_footnote(
    footnote = md(
    "*Note.* Par = Parameters; *LL* = model log likelihood;
      BIC = Bayesian information criterion;
      aBIC = sample size adjusted BIC; CAIC = consistent Akaike information criterion;
      AWE = approximate weight of evidence criterion;
      BLRT = bootstrapped likelihood ratio test p-value;
      VLMR = Vuong-Lo-Mendell-Rubin adjusted likelihood ratio test p-value;
      *cmPk* = approximate correct model probability."), 
    locations = cells_title()) %>% 
  tab_options(column_labels.font.weight = "bold",
              row_group.font.weight = "bold") %>% 
  fmt_number(10,decimals = 2, drop_trailing_zeros=TRUE, suffixing = TRUE) %>% 
  fmt_number(c(3:9,11), decimals = 2) %>% 
  fmt_missing(1:11, missing_text = "--") %>% 
  fmt(c(8:9,11), fns = function(x) 
    ifelse(x<0.001, "<.001", scales::number(x, accuracy = 0.01))) %>%
  fmt(10, fns = function(x) 
    ifelse(x>100, ">100", scales::number(x, accuracy = .1))) %>%
  tab_row_group(group = "Time-1",rows = 1:6) %>%
  tab_row_group(group = "Time-2",rows = 7:12) %>% 
  row_group_order(groups = c("Time-1","Time-2"))

Classes	Par	LL	BIC	aBIC	CAIC	AWE	BLRT	VLMR	BF	cmPk
Model Fit Summary Table¹
Time-1
Class-1	5	−10,250.60	20,541.34	20,525.45	20,546.34	20,596.47	–	–	0.0	<.001
Class-2	11	−8,785.32	17,658.92	17,623.97	17,669.93	17,780.22	<.001	<.001	0.0	<.001
Class-3	17	−8,693.57	17,523.59	17,469.57	17,540.59	17,711.04	<.001	<.001	0.0	<.001
Class-4	23	−8,664.09	17,512.79	17,439.71	17,535.79	17,766.40	<.001	<.001	>100	<.001
Class-5	29	−8,662.39	17,557.54	17,465.40	17,586.54	17,877.31	1.00	0.66	>100	<.001
Class-6	35	−8,661.54	17,604.01	17,492.80	17,639.01	17,989.94	0.67	0.94	0.0	<.001
Time-2
Class-1	5	−7,553.22	15,144.99	15,129.10	15,149.99	15,198.53	–	–	0.0	<.001
Class-2	11	−5,988.12	12,061.04	12,026.09	12,072.04	12,178.83	<.001	<.001	0.0	<.001
Class-3	17	−5,903.07	11,937.17	11,883.16	11,954.17	12,119.22	<.001	<.001	1.4	0.58
Class-4	23	−5,880.25	11,937.79	11,864.71	11,960.79	12,184.08	<.001	0.00	>100	0.42
Class-5	29	−5,877.60	11,978.74	11,886.60	12,007.74	12,289.28	0.67	0.70	>100	<.001
Class-6	35	−5,877.07	12,023.94	11,912.74	12,058.94	12,398.73	1.00	0.17	–	<.001
¹ Note. Par = Parameters; LL = model log likelihood; BIC = Bayesian information criterion; aBIC = sample size adjusted BIC; CAIC = consistent Akaike information criterion; AWE = approximate weight of evidence criterion; BLRT = bootstrapped likelihood ratio test p-value; VLMR = Vuong-Lo-Mendell-Rubin adjusted likelihood ratio test p-value; cmPk = approximate correct model probability.

Compare Time 1 & Time 2 Condition Item Probability Plots

For a publication, you will want to present the conditional item probability plot for each time point. The code below will create the plots for each time point.

Read models for plotting (4-class models)

model_t1_c4 <- readModels(here("enum_LCA_time1", "c4_lca_enum_time1.out"), quiet = TRUE)
model_t2_c4 <- readModels(here("enum_LCA_time2", "c4_lca_enum_time2.out"), quiet = TRUE)

Create a function called plot_lca_function that requires 7 arguments (inputs):

model_name: Name of Mplus model object (e.g., model_step1)
item_num: The number of items in LCA measurement model (e.g., 5)
class_num: The number of classes (K) in LCA model (e.g., 4)
item_labels: The item labels for x-axis (e.g., c("Enjoy","Useful","Logical","Job","Adult"))
class_labels: The class label names (e.g., c("Pro-Science","Amb. w/Minimal","Amb. w/Elevated","Anti-Science"))
class_legend_order = Change the order that class names are listed in the plot legend (e.g., c(1,3,2,4))
plot_title: Include the title of the plot here (e.g., "Conditional Item Probability Plot")

plot_lca_function <- function(model_name,item_num,class_num,item_labels,
                              class_labels,class_legend_order,plot_title){

mplus_model <- as.data.frame(model_name$gh5$means_and_variances_data$estimated_probs$values)
plot_data <- mplus_model[seq(2, 2*item_num, 2),]

c_size <- as.data.frame(model_name$class_counts$modelEstimated$proportion)
colnames(c_size) <- paste0("cs")
c_size <- c_size %>% mutate(cs = round(cs*100, 2))
colnames(plot_data) <- paste0(class_labels, glue(" ({c_size[1:class_num,]}%)"))
plot_data <- plot_data %>% relocate(class_legend_order)

plot_data <- cbind(Var = paste0("U", 1:item_num), plot_data)
plot_data$Var <- factor(plot_data$Var,
               labels = item_labels)
plot_data$Var <- fct_inorder(plot_data$Var)

pd_long_data <- melt(plot_data, id.vars = "Var") 

# This syntax uses the data.frame created above to produce the plot with `ggplot()`

p <- pd_long_data %>%
  ggplot(aes(x = as.integer(Var), y = value,
  shape = variable, colour = variable, lty = variable)) +
  geom_point(size = 4) + geom_line() + 
  scale_x_continuous("", breaks = 1:item_num, labels = plot_data$Var) + 
  scale_colour_grey() + 
  labs(title = plot_title, y = "Probability") +
  theme_cowplot() +
  theme(legend.title = element_blank(), 
        legend.position = "top") +
  coord_cartesian(xlim = c(.9, 5.4), ylim = c(-.05, 1.05), expand = FALSE) 
p
return(p)
}

Time 1 LCA - Conditional Item Probability Plot

plot_lca_function(
  model_name = model_t1_c4, 
  item_num = 5,
  class_num = 4,
  item_labels = c("Enjoy","Useful","Logical","Job","Adult"),
  class_labels = c("Pro-Science","Amb. w/ Minimal","Amb. w/ Elevated","Anti-Science"),
  class_legend_order = c(1,3,2,4),
  plot_title = "Time 1: Conditional Item Probability Plot"
  )

ggsave(here("figures", "T1_C4_LCA_plot.png"), dpi=300, height=5, width=7, units="in")

Time 2 LCA - Conditional Item Probability Plot

plot_lca_function(
  model_name = model_t2_c4,
  item_num = 5,         
  class_num = 4,
  item_labels = c("Enjoy","Useful","Logical","Job","Adult"),
  class_labels = c("Pro-Science","Anti-Science","Amb. w/ Elevated","Amb. w/ Minimal"),
  class_legend_order = c(1,3,4,2),
  plot_title = "Time 2: Conditional Item Probability Plot"
  )

ggsave(here("figures", "T2_C4_LCA_plot.png"), dpi=300, height=5, width=7, units="in")

________________________________________________________________________

Estimate Latent Transition Analysis Models (LTA)

When fitting the LTA model with two time points, it is possible to test if the latent classes at each time point are the same. If the same number and type of classes emerge at each time point, it may be meaningful to test if the measurement model can be the same at each time point. So this is asking the question, “Are the latent classes at time 1 the same at time 2?”. Note that the class sizes do not need to be equal.

Non-invariant LTA model: The classes are NOT held (or constrained) equal at each time point.

Invariant LA model: Classes ARE held equal at each time point. Note that this model is more parsimonious and thus would be preferred if there is statistical support for it.

These nested model tests are highly sensitive- so even if the test below indicates that the invariant model (holding classes the same across time points) significantly increases model misfit, it still may be reasonable to use the invariant model if the classes look the same.

Estimate Non-Invariant LTA Model

lta_non_inv <- mplusObject(
  
  TITLE = 
    "LTA (Non-Invariant)", 
  
  VARIABLE = 
     "usevar = ab39m ab39t ab39u ab39w ab39x  ! 7th grade indicators
             ga33a ga33h ga33i ga33k ga33l;   ! 10th grade indicators
      
      categorical = ab39m-ab39x ga33a-ga33l;

      classes = c1(4) c2(4);",
    
  ANALYSIS = 
     "estimator = mlr;
      type = mixture;
      starts = 500 100;
      processors=10;",

  MODEL = 
     "%overall%
      c2 on c1; !!! estimate all multinomial logistic regressions !!!

      MODEL c1: !!! the following syntax will allow item thresholds !!!
                !!! to be estimated for each class (e.g. noninvariance) !!!
      
      %c1#1%
      [AB39M$1-AB39X$1]; 
      %c1#2%
      [AB39M$1-AB39X$1];
      %c1#3%
      [AB39M$1-AB39X$1];
      %c1#4%
      [AB39M$1-AB39X$1];

      MODEL c2:
      %c2#1%
      [GA33A$1-GA33L$1];
      %c2#2%
      [GA33A$1-GA33L$1];
      %c2#3%
      [GA33A$1-GA33L$1];
      %c2#4%
      [GA33A$1-GA33L$1];",

  OUTPUT = "tech1 tech15 svalues;",
  
  usevariables = colnames(lsay_data),
  rdata = lsay_data)

lta_non_inv_fit <- mplusModeler(lta_non_inv,
                     dataout=here("LTA_models", "lta.dat"),
                     modelout=here("LTA_models", "lta-non-invariant.inp"),
                     check=TRUE, run = TRUE, hashfilename = FALSE)

Estimate Invariant LTA Model

lta_inv <- mplusObject(
  
  TITLE = 
     "LTA (Invariant)", 
  
  VARIABLE = 
     "usevar = ab39m ab39t ab39u ab39w ab39x  ! 7th grade indicators
             ga33a ga33h ga33i ga33k ga33l; ! 10th grade indicators
      
      categorical = ab39m-ab39x ga33a-ga33l;

      classes = c1(4) c2(4);",
    
  ANALYSIS = 
     "estimator = mlr;
      type = mixture;
      starts = 500 100;
      processors=10;",

  MODEL = 
     "%overall%
      c2 on c1;

      MODEL c1: 
      %c1#1%
      [AB39M$1-AB39X$1] (1-5);  !!! labels that are repeated will constrain parameters to equality !!!
      %c1#2%
      [AB39M$1-AB39X$1] (6-10);
      %c1#3%
      [AB39M$1-AB39X$1] (11-15);
      %c1#4%
      [AB39M$1-AB39X$1] (16-20);

      MODEL c2:
      %c2#1%
      [GA33A$1-GA33L$1] (1-5);
      %c2#2%
      [GA33A$1-GA33L$1] (6-10);
      %c2#3%
      [GA33A$1-GA33L$1] (11-15);
      %c2#4%
      [GA33A$1-GA33L$1] (16-20);",
   
  SAVEDATA = 
     "file = lta-inv-cprobs.dat;
      save = cprob;
      missflag = 9999;",

  OUTPUT = "tech1 tech15 svalues;",
  
  usevariables = colnames(lsay_data),
  rdata = lsay_data)

lta_inv_fit <- mplusModeler(lta_inv,
                 dataout=here("LTA_models", "lta.dat"),
                 modelout=here("LTA_models", "lta-invariant.inp"),
                 check=TRUE, run = TRUE, hashfilename = FALSE)

Test for significant difference between non-invariance & invariance LTA models

Conduct the Sattorra-Bentler adjusted Log Likelihood Ratio (LRT) difference test

Non-invariant (comparison): This model has more parameters (i.e., un-constrained model).
Invariant (nested): This model has less parameters (i.e., “constrained model”).

To test if holding the measurement of the classes the same across time points in the invariant model, we can do a likelihood ratio test (LRT). When we use ML with robust standard errors (MLR), we need to use the Sattorra-Bentler adjustment to the LRT.

lta_models <- readModels(here("LTA_models"), quiet = TRUE)

# *0 = null or nested model & *1 = comparison or parent model

# Log Likelihood Values
L0 <- lta_models[["lta.invariant.out"]][["summaries"]][["LL"]]
L1 <- lta_models[["lta.non.invariant.out"]][["summaries"]][["LL"]] 

# LRT equation
lr <- -2*(L0-L1) 

# Parameters
p0 <- lta_models[["lta.invariant.out"]][["summaries"]][["Parameters"]] 
p1 <- lta_models[["lta.non.invariant.out"]][["summaries"]][["Parameters"]]

# Scaling Correction Factors
c0 <- lta_models[["lta.invariant.out"]][["summaries"]][["LLCorrectionFactor"]]
c1 <- lta_models[["lta.non.invariant.out"]][["summaries"]][["LLCorrectionFactor"]]

# Difference Test Scaling correction (Sattorra-Bentler adjustment)
cd <- ((p0*c0)-(p1*c1))/(p0-p1)

# Chi-square difference test(TRd)
TRd <- (lr)/(cd)

# Degrees of freedom
df <- abs(p0 - p1)


# Significance test
(p_diff <- pchisq(TRd, df, lower.tail=FALSE))

RESULT: The Log Likelihood \(\chi^2\) difference test comparing the invariant and non-invariant LTA models was, \(\chi^2 (20) = 18.23, p = .572\).

Compare model fit summary statistics: Invariant & Non-Invariant LTA Models

Read & extract model fit data for comparison

enum_extract1 <- LatexSummaryTable(lta_models$lta.non.invariant.out,                                 
                 keepCols=c("Title", "Parameters", "LL", "BIC", "aBIC", "Observations")) 
enum_extract2 <- LatexSummaryTable(lta_models$lta.invariant.out,                                 
                 keepCols=c("Title", "Parameters", "LL", "BIC", "aBIC", "Observations"))

Calculate indices derived from the Log Likelihood (LL)

allFit <- rbind(enum_extract1, enum_extract2) %>% 
  mutate(aBIC = -2*LL+Parameters*log((Observations+2)/24)) %>% 
  select(1:5)

Format fit table

allFit %>% 
  gt() %>%
  tab_header(
    title = md("**Model Fit Comparision Table**"), subtitle = md("&nbsp;")) %>% 
  cols_label(
    Title = "Model",
    Parameters = md("Par"),
    LL = md("*LL*"),
    BIC = md("BIC"),
    aBIC = md("aBIC")) %>% 
  tab_footnote(
    footnote = md(
    "*Note.* Par = Parameters; *LL* = model log likelihood;
     BIC = Bayesian information criterion; aBIC = sample size adjusted BIC."), 
    locations = cells_title()) %>% 
  tab_options(column_labels.font.weight = "bold")

Model	Par	LL	BIC	aBIC
Model Fit Comparision Table¹
LTA (Non-Invariant)	55	-14447.71	29336.88	29162.13
LTA (Invariant)	35	-14458.24	29197.40	29086.19
¹ Note. Par = Parameters; LL = model log likelihood; BIC = Bayesian information criterion; aBIC = sample size adjusted BIC.

Plot Invariant LTA Conditional Item Probability Plot

Create a function for plotting the conditional item probabilities estimated from an LTA model.

The plot_lta_function requires one additional argument called timepoint used to specify the time point to extract probabilities (e.g., 1).

plot_lta_function <- function(model_name,item_num,class_num,timepoint,item_labels,plot_title){

# Extract Item Probabilities
mplus_model <- as_tibble(model_name$parameters$probability.scale) %>% 
  filter(category=="2", str_detect(LatentClass, glue("C{timepoint}"))) %>% 
  select(LatentClass,est, param) %>%
  pivot_wider(names_from = LatentClass, values_from = est) %>% 
  select(-param)

# Create class size in percentages (%)
c_size <- as.data.frame(model_name$class_counts$modelEstimated) %>%
  filter(str_detect(variable, glue("C{timepoint}"))) %>%
  select(proportion)
colnames(c_size) <- paste0("cs")
c_size <- c_size %>% mutate(cs = round(cs * 100, 2))
colnames(mplus_model) <- paste0("C", 1:class_num, glue(" ({c_size[1:class_num,]}%)"))

# Variable names
plot_t1 <- cbind(Var = paste0("U", 1:item_num), mplus_model)
plot_t1$Var <- factor(plot_t1$Var,
               labels = item_labels)
plot_t1$Var <- fct_inorder(plot_t1$Var)
pd_long_t1 <- melt(plot_t1, id.vars = "Var") 

p <- pd_long_t1 %>%
  ggplot(aes(x = as.integer(Var), y = value,
  shape = variable, colour = variable, lty = variable)) +
  geom_point(size = 4) + geom_line() + 
  scale_x_continuous("", breaks = 1:item_num, labels = plot_t1$Var) + 
  scale_colour_grey() + scale_y_continuous(limits = c(0,1)) + 
  labs(title = plot_title, y = "Probability") +
  theme_cowplot() +
  theme(legend.title = element_blank(), 
        legend.position = "top")

p
return(p)
}

Invariant LTA Model - Conditional Item Probability Plot

For the invariant LTA model, conditional item probabilities are the same across time-points.

plot_lta_function(
  model_name = lta_models$lta.invariant.out, 
  item_num = 5,
  class_num = 4,
  timepoint = 1, 
  item_labels = c("Enjoy","Useful","Logical","Job","Adult"),
  plot_title = ""
  )

ggsave(here("figures", "InvariantLTA_LCAplot.png"), dpi=300, height=5, width=7, units="in")

________________________________________________________________________

Examine transition estimates

Create Table: LTA Transition Estimates

Extract Transitions (Invariant LTA Model)

lta_out <- lta_models[["lta.invariant.out"]][["class_counts"]][["transitionProbs"]][["probability"]] %>% 
  as.data.frame(as.numeric())

t_matrix <- tibble(
  "Time1" = c("Anti-Science","Amb. w/ Elevated","Pro-Science","Amb. w/ Minimal"),
  "Anti-Science" = c(lta_out[1,1],lta_out[2,1],lta_out[3,1],lta_out[4,1]),
  "Amb. w/ Elevated" = c(lta_out[5,1],lta_out[6,1],lta_out[7,1],lta_out[8,1]),
  "Pro-Science" = c(lta_out[9,1],lta_out[10,1],lta_out[11,1],lta_out[12,1]),
  "Amb. w/ Minimal" = c(lta_out[13,1],lta_out[14,1],lta_out[15,1],lta_out[16,1]))

Format table

t_matrix %>% 
  gt(rowname_col = "Time1") %>%
  tab_stubhead(label = "7th grade") %>% 
  tab_header(
    title = md("**Student transitions from 7th grade (rows) to 10th grade (columns)**"),
    subtitle = md("&nbsp;")) %>% 
  fmt_number(2:5,decimals = 2) %>% 
  tab_spanner(label = "10th grade",columns = 2:5) %>% 
  tab_footnote(
    footnote = md("*Note.* Transition estimates for the invariant LTA model."), 
    locations = cells_title())

7th grade	10th grade
Student transitions from 7th grade (rows) to 10th grade (columns)¹
7th grade	Anti-Science	Amb. w/ Elevated	Pro-Science	Amb. w/ Minimal
Anti-Science	0.56	0.09	0.16	0.19
Amb. w/ Elevated	0.27	0.26	0.15	0.32
Pro-Science	0.35	0.08	0.30	0.26
Amb. w/ Minimal	0.21	0.14	0.12	0.52
¹ Note. Transition estimates for the invariant LTA model.

Plot LTA transitions

This code is adapted from the source code for the plotLTA function found in the

NOTE: The function found in plot_transitions_function.R is specific to a model with 2 time-points and 4-classes & must be updated to accommodate other models.

source("plot_transitions_function.R") # Script is located in the project repository

plot_transitions_function(
  model_name = lta_models$lta.invariant.out,
  color_pallete = pnw_palette("Bay", n=4, type = "discrete"),
  facet_labels =c(
    `1` = "Transitions to 10th Grade from the Pro-Science Class",
    `2` = "Transitions to 10th Grade from the Ambivalent w/ Elevated Utility Class",
    `3` = "Transitions to 10th Grade from the Ambivalent w/ Minimal Utility Class",
    `4` = "Transitions to 10th Grade from the Anti-Science Class"),
  timepoint_labels = c('1' = "7th Grade", '2' = "10th Grade"),
  class_labels = c(
    "Pro-Science",
    "Amb. / Elev. Utility",
    "Amb. / Min. Utility",
    "Anti-Science"))

ggsave(here("figures","LTA_transition_plot.png"), dpi=500, height=7, width=8, units="in")

________________________________________________________________________

References

Hallquist, Michael N., and Joshua F. Wiley. 2018. “MplusAutomation: An R Package for FacilitatingLarge-Scale Latent Variable Analyses in Mplus.” Structural Equation Modeling, 1–18. https://doi.org/10.1080/10705511.2017.1402334.

Miller, Jon D. Longitudinal Study of American Youth (LSAY), Seventh Grade Data, 1987-1988; 2015-2016. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2019-04-23. https://doi.org/10.3886/ICPSR37287.v1

Müller, Kirill. 2017.Here: A Simpler Way to Find Your Files. https://CRAN.R-project.org/package=here.

Muthén, B., & Asparouhov, T. (2020). Latent transition analysis with random intercepts (RI-LTA). Psychological Methods. Advance online publication. https://doi.org/10.1037/met0000370

Muthén L.K., & Muthen B.O. (1998-2017) Mplus User’s Guide. Eight Edition. Los Angelos, CA: Muthen & Muthen.

R Core Team. 2019.R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi: 10.21105/joss.01686.

Introduction to Latent Transition Analysis (LTA) with MplusAutomation

Adam Garber: MM4DBER Training Team

Updated: December 01, 2023

________________________________________________________________________

Outline

________________________________________________________________________

Preparation

________________________________________________________________________

Enumeration

Enumerate Time Point 1 (7th grade)

Enumerate Time Point 2 (10th grade)

Create Model Fit Summary Table

Compare Time 1 & Time 2 Condition Item Probability Plots

Time 1 LCA - Conditional Item Probability Plot

Time 2 LCA - Conditional Item Probability Plot

________________________________________________________________________

Estimate Latent Transition Analysis Models (LTA)

Estimate Non-Invariant LTA Model

Estimate Invariant LTA Model

Test for significant difference between non-invariance & invariance LTA models

Compare model fit summary statistics: Invariant & Non-Invariant LTA Models

Plot Invariant LTA Conditional Item Probability Plot

________________________________________________________________________

Examine transition estimates

Create Table: LTA Transition Estimates

Plot LTA transitions

________________________________________________________________________

References