Functional Approaches for Modeling Unfolding Data

Abstract

The purpose of this study is to introduce a functional approach for modeling unfolding response data. Functional data analysis (FDA) has been used for examining cumulative item response data, but a functional approach has not been systematically used with unfolding response processes. A brief overview of FDA is presented and illustrated within the context of unfolding data. Seven decision parameters are described that can provide a guide to conducting FDA in this context. These decision parameters are illustrated with real data using two scales that are designed to measure attitude toward capital punishment and attitude toward censorship. The analyses suggest that FDA offers a useful set of tools for examining unfolding response processes.

Keywords

functional clustering, functional data analysis, unfolding models

Functional data analysis (FDA) is an approach to statistics that focuses on families of functions or smooth curves (Kokoszka & Reimherr, 2017; Levitin et al., 2007; Ramsay & Silverman, 2005). Some recent applications of FDA include modeling COVID-19 prevalence (Oshinubi et al., 2022), forecasting demand for electricity (Shah et al., 2022), visualizing weather patterns (Suhaila, 2021), analysis of neural networks related to Alzheimer’s disease (S. Wang et al., 2021), and identification of hospitalization rates related to dialysis (Li et al., 2021). Applications to data in education and psychology are less well developed, although there are some exceptions including the modeling of longitudinal data (Marcoulides & Khojasteh, 2018), nonlinear growth curve modeling (Suk et al., 2019), and cumulative item response models (Woods, 2006).

In psychometric applications, the data are typically discrete and consist of zeros and ones indicating incorrect/correct or disagree/agree answers to test items. These discrete responses are assumed to arise from underlying smooth functions that are reflected in various item response models (Baker & Kim, 2004; De Ayala, 2013; Van der Linden, 2019). Examples of these underlying smooth functions include item and person response functions. FDA can be used to model these smooth functions from the discrete response patterns obtained from educational and psychological tests. In psychometrics, functional analyses have been used to analyze item response functions (Ramsay, 2016; Woods, 2006).

Unfolding models are also referred to as ideal point models (Engelhard & Yuan, in press). These models involve the estimation of smooth curves designed to represent item and person responses that are noncumulative in form. These curves were called maximum probability responses by Thurstone (Thurstone & Chave, 1929), and he distinguished them from increasing or cumulative probability responses. Figure 1 shows an increasing probability item in Panel A and two maximum probability items in Panel B. In dichotomous unfolding models, the probability of endorsing an item (x = 1) is a function of the distance between the person and item locations on the underlying latent variable (Coombs, 1964; Engelhard & Yuan, in press).

Figure 1

Increasing Probability (One Item) and Maximum Probability (Two Items) Scales

The purpose of this study is to describe and illustrate functional approaches for modeling unfolding data. The study is structured in the following way. First, FDA is briefly presented including functional clustering. Next, the unfolding response process is described with a short illustration of a functional approach. This is followed by two applications of FDA to attitude scales designed to measure attitudes toward capital punishment and censorship that reflect unfolding response processes. In essence, the goal is to provide evidence to support future research and applications of FDA with unfolding data.

Functional Data Analysis

Functional data occur when the observations are collected from a smooth curve or function (Kokoszka & Reimherr, 2017; Ramsay & Silverman, 2005). These observations can be discrete measurements that reflect smooth variation in the underlying variable. Functional data analysis extends the ideas and aims of traditional statistical analyses to samples of curves represented by functions. The smooth curves are assumed to have similar shapes and, therefore, can be approximated as a linear combination of K independent basis functions expressed as

$$x(t) = \sum_{k=1}^{K} c_{k} ϕ_{k}(t), \tag{1}$$

where $c_{k}$ represents the coefficients, $ϕ_{k}$ represents the basis functions, and $ϕ_{k} (t)$ represents the value of the kth basis function at argument value t. The basis functions can be splines, wavelets, or sine and cosine functions (Ramsay & Silverman, 2005). B-splines are frequently used as the basis functions for nonperiodic functional data (Abraham et al., 2003; Ramsay & Silverman, 2005).
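The basis expansion in Equation 1 can be sketched in a few lines of plain Python. This is a minimal illustration rather than the software used in the study: it evaluates B-spline basis functions with the standard Cox-de Boor recursion and then forms the weighted sum. The interior knots at −2, 0, and 2 follow the example described for Figure 2 (Panel A); the boundary values of −3 and 3, the cubic degree, and the coefficient values are assumptions chosen only for illustration.

```python
def bspline_basis(k, degree, knots, t):
    """Value of the k-th B-spline basis function of a given degree at t
    (Cox-de Boor recursion on half-open knot intervals)."""
    if degree == 0:
        return 1.0 if knots[k] <= t < knots[k + 1] else 0.0
    left_den = knots[k + degree] - knots[k]
    right_den = knots[k + degree + 1] - knots[k + 1]
    left = 0.0
    right = 0.0
    if left_den > 0:
        left = (t - knots[k]) / left_den * bspline_basis(k, degree - 1, knots, t)
    if right_den > 0:
        right = ((knots[k + degree + 1] - t) / right_den
                 * bspline_basis(k + 1, degree - 1, knots, t))
    return left + right


def make_knots(interior, lower, upper, degree):
    """Clamped knot vector: each boundary knot is repeated degree + 1 times."""
    return [lower] * (degree + 1) + list(interior) + [upper] * (degree + 1)


def evaluate_curve(coefs, degree, knots, t):
    """x(t) = sum over k of c_k * phi_k(t), as in Equation 1."""
    return sum(c * bspline_basis(k, degree, knots, t) for k, c in enumerate(coefs))


degree = 3                                    # assumed cubic pieces
knots = make_knots([-2.0, 0.0, 2.0], -3.0, 3.0, degree)
n_basis = len(knots) - degree - 1             # K, the number of basis functions
coefs = [0.0, 0.1, 0.6, 1.0, 0.6, 0.1, 0.0]   # hypothetical c_k (single-peaked)

grid = [-2.5, -1.0, 0.0, 1.0, 2.5]
curve = [evaluate_curve(coefs, degree, knots, t) for t in grid]
for t, value in zip(grid, curve):
    print(f"x({t:+.1f}) = {value:.3f}")
```

Because the basis functions sum to one at every point inside the knot range, the curve is a local weighted average of the coefficients, which is why the coefficient vector is a convenient summary of each curve's shape.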

In the case of B-splines, the functions are expressed as piecewise polynomial functions joined together at knots that separate the x-axis into subintervals. Figure 2 illustrates the structure of B-splines that are used in this study. B-splines are based on piecewise polynomials that can be combined to represent a flexible set of functions. The B-splines are shown in the last panel with equally spaced knots (vertical gray lines). The functions shown in Panels A to E are weighted by spline coefficients ( $c_{k}$ ) that are estimated to create a combined representation of the data. The coefficients can be estimated with a variety of methods including least squares (Kokoszka & Reimherr, 2017). The spline coefficients vary for each response pattern, and these coefficients can be used in the cluster analyses to group the response patterns.

Figure 2

Basis Function Based on B-Splines With Different Locations of the Knots

The polynomials can be of different degrees, and the researcher can control the shape and flexibility of the estimated function by changing the degree of the polynomials in the B-spline basis expansion. Figure 3 illustrates examples of several choices of degrees for the polynomials that can be used in the B-splines. Displays A, B, and C illustrate linear, quadratic, and cubic polynomials, respectively, that have been fit to eight random data points. It should be noted that increasing the degree of a polynomial leads to a closer fit to the observed data points, which is not necessarily a better representation of the underlying smooth function.

Figure 3

Illustration of Linear, Quadratic, and Cubic Polynomials

The locations of the knots along the x-axis also affect the shape and flexibility of the estimated functions. The knots mark the change points for the estimated functions because the piecewise polynomials are joined together at the knots. Figure 2 (Panel A) shows the B-splines with knots placed at −2, 0, and 2 (vertical gray lines). This can be contrasted with Figure 2 (Panel B) where the knots are at −1, 0, and 3.

Finally, a common approach to smoothing includes the use of a penalty that can tune the roughness of displays that may exhibit too much variability. Roughness is particularly problematic when smoothing discrete data points and may result in curves that are too responsive to variation in the discrete observations. The roughness penalty ( $L_{d}$ ) is defined in terms of the derivative of order d as

$$L_{d} = \int [D^{d} x(t)]^{2} \, dt. \tag{2}$$

This roughness penalty and a fit criterion, such as the sum of squares error (SSE), can be combined to yield a penalized sum of squares. The penalized sum of squares is defined as

$$PSS_{λ} = SSE + λ L_{d}, \tag{3}$$

where $λ$ is a weight that can take real values from zero to infinity. If $λ = 0$ , then the penalty term does not affect the sum of squares. As $λ$ increases, the roughness penalty becomes more influential and forces the curves to become smoother in order to minimize $PSS_{λ}$ . It is important to note that a functional approach views the functions as continuous even though the original data are discrete. The roughness penalty is treated as being applied to a continuous variable.
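The trade-off between fit and smoothness in the penalized sum of squares can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not the estimation routine used in the study: the roughness integral is approximated with repeated forward differences and a Riemann sum, the response pattern is that of Person 4 in Table 1, and the two candidate curves (a flat line and a Gaussian-shaped bump) are hypothetical.

```python
import math


def roughness(values, step, order):
    """Approximate the integral of the squared derivative of a given order
    using repeated forward differences and a Riemann sum."""
    diffs = list(values)
    for _ in range(order):
        diffs = [(b - a) / step for a, b in zip(diffs, diffs[1:])]
    return sum(d * d for d in diffs) * step


def penalized_ss(curve, items, responses, grid, step, order, lam):
    """SSE at the item locations plus lam times the roughness of the curve."""
    sse = sum((y - curve(t)) ** 2 for t, y in zip(items, responses))
    return sse + lam * roughness([curve(t) for t in grid], step, order)


items = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]   # item locations in Table 1
responses = [0, 0, 1, 1, 1, 0, 0]                # Person 4 in Table 1
step = 0.01
grid = [-3.0 + i * step for i in range(601)]


def flat(t):
    """Hypothetical candidate: constant at the person's mean response."""
    return 3.0 / 7.0


def bump(t):
    """Hypothetical candidate: single-peaked curve centered at zero."""
    return math.exp(-t * t / 2.0)


pss = {}
for lam in (0.0, 5.0):
    pss[lam] = {
        "flat": penalized_ss(flat, items, responses, grid, step, 3, lam),
        "bump": penalized_ss(bump, items, responses, grid, step, 3, lam),
    }
    print(lam, pss[lam])
```

With no penalty, the single-peaked curve has the smaller criterion because it tracks the responses more closely. With a weight of 5 on the third-derivative penalty, the flat line has the smaller criterion, which shows how a larger weight pushes the solution toward smoother curves.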

Figure 4 shows the effects of two different penalty derivatives: third order (Panels A, B, and C) and fourth order (Panels D, E, and F). As might be expected, a higher-order penalty derivative results in a smoother curve. Figure 4 also illustrates the effects of varying the smoothing parameter with $λ$ = .5 (Panels A and D), $λ$ = 1 (Panels B and E), and $λ$ = 5 (Panels C and F).

Figure 4

Examples of Smoothing With Different Penalized Derivatives and Smoothing Parameters

In summary, researchers can control the smoothness of curves used in FDA by choosing the degree of the polynomial, the location of the knots, the degree of the derivative to penalize the fit criterion, and the value of the smoothing weight (Kokoszka & Reimherr, 2017; Ramsay & Silverman, 2005). Overall, this process is exploratory and judgmental in nature. Cuevas (2014), Kokoszka and Reimherr (2017), and Levitin et al. (2007) are useful resources for a more detailed discussion of these topics.

Functional Clustering

Functional clustering is a process in which functional curves are grouped based on having similar attributes and form. The goal of functional clustering is to produce clusters of curves that maximize within-cluster homogeneity and between-cluster heterogeneity. Functional clustering is based on the spline coefficients ( $c_{k}$ ) shown in Equation 1. For example, k-means clustering (Steinley, 2006) can be applied to these spline coefficients.

Several methods can be used to determine the number of clusters (Charrad et al., 2014). A simple approach is to plot the within-cluster sum of squares against the number of clusters. This plot can be interpreted in a way analogous to scree plots that are used in exploratory factor analysis. The number of clusters can be identified by the location of the elbow in this plot. Identification of an elbow is somewhat ambiguous, but it can be useful for guiding exploratory analyses.
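The elbow approach can be sketched with a small k-means routine that records the within-cluster sum of squares for each candidate number of clusters. Two simplifications are assumptions of this sketch and not part of the study: the rows of the illustrative response matrix in Table 1 stand in for the spline coefficient vectors, and the starting centers are chosen by a deterministic farthest-first rule so that the run is reproducible. It is not the funFEM procedure.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-first start."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center (first center wins ties).
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wss


# Rows of Table 1 (Persons 1 to 7, Items A to G), used here as a stand-in
# for the matrix of spline coefficients.
table1 = [
    [1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1],
]

results = {k: kmeans(table1, k) for k in range(1, 8)}
for k, (labels, wss) in results.items():
    print(k, round(wss, 3), labels)
```

Plotting the printed within-cluster sums of squares against k gives the scree-like display described above. For k = 3, this sketch groups Persons 1 and 2, Persons 3 to 5, and Persons 6 and 7.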

Once a preliminary number of clusters is identified, the functional Fisher expectation-maximization (funFEM) approach is used. This approach is based on Gaussian mixture distributions and uses a combination of Fisher’s method and an expectation-maximization (EM) algorithm to identify clusters. It should be noted that k-means clustering is used as an initial step of the funFEM algorithm. Further details about the funFEM clustering algorithm are beyond the scope of this study and can be found in Bouveyron et al. (2015).

One of the key features of the funFEM algorithm is that it produces a visual representation of the clustered curves. In this study, the smooth person response functions (PRFs) are placed into clusters based on their shape using the funFEM algorithm. There have been many recent developments in the area of functional clustering (Abraham et al., 2003; Bouveyron et al., 2015; Jacques & Preda, 2013; James & Sugar, 2003; Tarpey & Kinateder, 2003). Jacques and Preda (2013) provide a comprehensive overview of common functional clustering techniques.

In addition to the clustered functional curves, it is helpful to visualize the data as hierarchical clusters using a dendrogram or tree diagram. Technically, the clusters are not conceptualized as having a hierarchical structure; however, it may be useful to examine hierarchical clusters of persons in order to consider hypotheses about underlying explanatory structure that can be explored in future research.

In summary, there are seven key decision parameters that underlie FDA and functional clustering as used in this study. These are as follows:

Basis Function: Set of known functions that can be combined to approximate any function (e.g., B-splines).

Degree of polynomial: Linear, quadratic, cubic, etc.

Knots: Sequence of values that separate the function into polynomial segments on the x-axis (e.g., item locations).

Penalized Derivative: The derivative of the function used for smoothing.

Smoothing Parameter: A weight applied to the penalized derivative.

Number of Clusters: Number of subgroups identified in the data (k).

Functional Clustering: Clustering method applied to the matrix of spline coefficients (c) obtained from FDA.

These seven key decision parameters guide the judgments made in analyzing unfolding data for the purposes of this study.

Unfolding Models

The basic structure of an unfolding process can be illustrated through displays created by Coombs (1964). Figure 5 shows his conception of unfolding. Figure 5 (Panel A) shows the rank order locations of Items A to G, while Figure 5 (Panel B) shows the unfolding order of items for Person 1 (Ideal point). Person 1 has the item ordering of C, D, B, E, A, and F on the underlying unfolding scale. It is important to note that the underlying joint scale (J scale) is common for all persons.

Figure 5

Views of Unfolding (Coombs, 1964)

There are a variety of unfolding models that can be categorized in different ways. In this study, the focus is on modeling dichotomous judgments (e.g., disagree and agree) with FDA. Unfolding models can be grouped in terms of deterministic and probabilistic characteristics (Engelhard & Yuan, in press). Important deterministic unfolding models were proposed by Coombs (1964) that include parallelogram analysis (Chapter 4) and the unfolding technique (Chapter 5). Thurstone and Chave (1929), Andrich (1988, 2016), Hoijtink (1993), and Roberts (1995) proposed unfolding models that can be classified as probabilistic and parametric in structure. The Mudfold model proposed by van Schuur (1984) is an example of a probabilistic unfolding model that is nonparametric. It was extended by Post (1992) and Post and Snijders (1993). Mudfold is an R package that can be used to estimate the dichotomous Mudfold model (Balafas et al., 2020). Liu and Chalmers (2018) and Luo (2001) provide discussions of other general Item Response Theory (IRT) unfolding models including methods of parameter estimation.

A probabilistic view of the unfolding process is presented in Figure 6. These three response functions fit an unfolding process that can be described as monotonically increasing (Figure 6, Panel A), monotonically decreasing (Figure 6, Panel B), and monotonically symmetric (Figure 6, Panel C). Misfitting response patterns will not have a good match to any of these three general unfolding patterns. These patterns play an important role in guiding the exploratory analyses described in this study.

Figure 6

Probabilistic View of the Unfolding Process

Illustration

Table 1 is an example of a data set with unfolding response patterns. Seven persons responded dichotomously to Items A, B, C, D, E, F, and G (disagree = 0 or agree = 1). The items are ordered from low to high (−3.00 to 3.00) on the underlying scale. The distinctive parallelogram pattern described by Coombs (1964) for unfolding data is evident in Table 1.

Table 1

Illustrative Data With An Unfolding Pattern (Parallelogram)

	Items
Persons	A	B	C	D	E	F	G
1	1	1	0	0	0	0	0
2	1	1	1	0	0	0	0
3	0	1	1	1	0	0	0
4	0	0	1	1	1	0	0
5	0	0	0	1	1	1	0
6	0	0	0	0	1	1	1
7	0	0	0	0	0	1	1
FDA items	−3.00	−2.00	−1.00	0.00	1.00	2.00	3.00

Note. Items A to G are ordered on the unfolding scale. FDA = functional data analysis.
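The parallelogram in Table 1 is what a deterministic distance rule produces: a person endorses an item only when the item lies close enough to the person's ideal point. The sketch below reproduces the table under assumptions that are not stated in the text, namely hypothetical ideal points at −3 to 3 in unit steps and a hypothetical endorsement threshold of one scale unit.

```python
item_locations = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]   # Items A to G (Table 1)
ideal_points = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]     # hypothetical Persons 1 to 7
threshold = 1.0                                           # hypothetical endorsement range


def respond(theta, delta, tau):
    """Agree (1) when the person-item distance is within tau, else disagree (0)."""
    return 1 if abs(theta - delta) <= tau else 0


responses = [
    [respond(theta, delta, threshold) for delta in item_locations]
    for theta in ideal_points
]

for person, row in enumerate(responses, start=1):
    print(person, row)
```

Each printed row matches the corresponding person in Table 1, with the band of ones shifting to the right as the ideal point moves up the scale.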

FDA was applied to the illustrative data in Table 1. The following decision parameters were used for the FDA:

Basis Function: B-splines

Degree of polynomial: fourth-degree polynomial

Knots: Item locations (−3 to 3)

Penalized Derivative: third derivative

Smoothing Parameter: λ = 5 (weight)

Number of Clusters: k = 3

Functional Clustering: k-means and funFEM.

Figure 7 shows the plots for the illustrative data that emerge from a functional approach to these data. Panel A illustrates the relationship between the observed p values and the order of the items. Next, the seven PRFs are displayed for these data. This plot with all of the PRFs can be busy and hard to interpret when the number of persons is large. Finally, Panel A also shows the scree-like plot for the within-cluster sum of squares. The elbow appears at k = 3, and this suggests three clusters for these data. Panel B displays the three mean PRFs that reflect the three clusters (monotonically decreasing, monotonically symmetric, and monotonically increasing). The histogram for the three clusters is also shown in Panel B.

Figure 7

Illustrative Data With Three Clusters

The shape of these curves is reminiscent of Figure 2 (Panel A) from Coombs (1964). Figure 7 (Panel C) shows a dendrogram for these data. The three clusters identify persons who tended to disagree from below (Persons 1 and 2), agree (Persons 3, 4, and 5), and disagree from above (Persons 6 and 7).

This illustration gives an analysis of a simple unfolding data set. It shows how clusters can be created with FDA to identify groups of persons for further interpretation and analysis. The analysis of these data with a known unfolding structure can be useful in interpreting the empirical data analyzed in the next two sections.

In the next two sections, these ideas are applied to real data using scales designed to measure attitude toward capital punishment (Andrich, 1988; Peterson, 1931) and attitude toward censorship (Roberts, 1995; Rosander & Thurstone, 1931).

Study 1: Attitude Toward Capital Punishment

In the first study, attitudes toward capital punishment (Andrich, 1988) are analyzed. There are eight items in this scale. These data were collected from 54 graduate students taking an introductory course in educational measurement and statistics (Andrich, 1988). The responses were scored dichotomously: 0 = disagree and 1 = agree.

Analyses

Table 2 lists the items included in the attitude toward capital punishment scale. The Thurstone values for the items are also shown in Table 2. These Thurstone values were obtained from Peterson (1931), and they were also reported by Andrich (1988) and Wohlwill (1963). The x-axis for the FDA was defined by standardized Thurstone values (FDA-T) with M = 0 and SD = 1. The researcher can select different ways to define the x-axis, and the FDA-T values are used in Study 1. The substantive results are expected to vary depending on how the x-axis and scale are defined. The data set is the same one used in Andrich (1988).

Table 2

Attitude Toward Capital Punishment

		Item locations
Items	Statements	Thurstone values	FDA-T
1	Capital punishment is one of the most hideous practices of our time.	0.60	−1.47
2	The state cannot teach the sacredness of human life by destroying it.	2.00	−1.03
3	Capital punishment is not an effective deterrent to crime.	2.80	−0.77
4	I do not believe in capital punishment but I am not sure it is not necessary.	5.40	0.05
5	I think capital punishment is necessary but I wish it were not.	6.20	0.30
6	Until we find a more civilized way to prevent crime, we must have capital punishment.	7.20	0.62
7	Capital punishment is justified because it does act as a deterrent to crime.	8.40	1.00
8	Capital punishment gives the criminal what he deserves.	9.40	1.31

Note. Thurstone values are from Peterson (1931). The FDA-T values are standardized Thurstone values (M = 0, SD = 1). FDA = functional data analysis.
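The standardization that produces the FDA-T column can be checked with a short calculation. The sketch below takes the Thurstone values from Table 2 and converts them to z scores. The use of the sample standard deviation (n − 1 in the denominator) is an assumption of this sketch; it is the choice that agrees with the tabled FDA-T values to two decimal places.

```python
import statistics

# Thurstone scale values for Items 1 to 8 (Table 2).
thurstone = [0.60, 2.00, 2.80, 5.40, 6.20, 7.20, 8.40, 9.40]

mean = statistics.mean(thurstone)
sd = statistics.stdev(thurstone)   # sample SD (n - 1), assumed for this sketch

fda_t = [round((value - mean) / sd, 2) for value in thurstone]

for item, (raw, z) in enumerate(zip(thurstone, fda_t), start=1):
    print(item, raw, z)
```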

The following decision parameters were used for the FDA:

Basis Function: B-splines

Degree of polynomial: fourth-degree polynomial

Knots: Item locations based on standardized Thurstone Scale Values (FDA-T)

Penalized Derivative: third derivative

Smoothing Parameter: λ = 5 (weight)

Number of Clusters: k = 3

Functional Clustering: k-means and funFEM

These decision parameters are the same as those used in the illustration.

Results

Figure 8 shows the results of the FDA. Figure 8 (Panel A) shows the plot of the item averages (p values) on the items. Overall, the suitability of the data for unfolding analyses looks reasonable. Next, the 54 PRFs (N = 54) are shown. As noted earlier, this plot is quite busy, and it is difficult to interpret. Finally, the scree-like plot of the within-cluster sum of squares is displayed. As was found with the illustrative data, an elbow appears at k = 3, suggesting three clusters. Panel B in Figure 8 illustrates the average FDA curves for each cluster based on k = 3. Next, a histogram shows the grouping of the 54 persons into three clusters. Panel C presents a dendrogram that displays clustering for the 54 persons. As was found with the illustrative data, there seem to be three clusters reflecting disagree from below, agree, and disagree from above.

Figure 8

Attitudes Toward Capital Punishment (Andrich, 1988): Three Clusters

In summary, the analyses of these data reflect unfolding patterns that approximate the three expected patterns as shown in the illustrative data. The PRFs are clustered into three clusters that appear to match the expectations for an unfolding response process.

Study 2: Attitude Toward Censorship

In Study 2, a scale designed to measure attitude toward censorship is analyzed. These items were first published by Rosander and Thurstone (1931), and reprinted in Shaw and Wright (1967). Table 3 lists the items included in the attitude toward censorship scale.

Table 3

Attitude Toward Censorship

		Item locations
Items	Statements	Ranks	Mudfold quantiles	FDA-M
1	I doubt if censorship is wise.	2	0.10	−1.45
2	A truly free people must be allowed to choose their own reading and entertainment.	5	0.25	−0.55
3	We must have censorship to protect the morals of young people.	8	0.40	−0.08
4	The theory of censorship is sound, but censors make a mess of it.	7	0.35	−0.21
5	Only narrow-minded Puritans want censorship.	6	0.30	−0.36
6	The whole theory of censorship is utterly unreasonable.	9	0.45	0.04
7	Until public taste has been educated, we must continue to have censorship.	1	0.05	−2.12
8	Many of our greatest literary classics would be suppressed if the censors thought they could get away with it.	3	0.15	−1.05
9	Everything that is printed for publication should first be examined by government censors.	10	0.50	0.14
10	Plays and movies should be censored but the press should be free.	4	0.20	−0.77
11	Censorship has practically no effect on people's morals.	11	0.55	0.24
12	Censorship is a gross violation of our constitutional right.	20	1.00	1.52
13	Censorship protects those who lack judgment or experience to choose for themselves.	13	0.65	0.41
14	Censorship is a very difficult problem, and I am not sure how far I think it should go.	12	0.60	0.33
15	Censorship is a good thing on the whole although it is often abused.	15	0.75	0.55
16	Education of the public taste is preferable to censorship.	17	0.85	0.68
17	Human progress demands free speech and a free press.	14	0.70	0.48
18	Censorship is effective in raising moral and aesthetic standards.	18	0.90	0.75
19	Censorship might be warranted if we could get reasonable censors.	16	0.80	0.62
20	Morality is produced by self-control not by censorship.	19	0.95	0.82

Note. Item rank and quantile values are obtained from Mudfold R Package (Balafas et al., 2020). The FDA-M values are based on a logistic transformation of quantiles described in the text. FDA = functional data analysis.

There were 223 students who indicated the extent to which they agreed with each of the 20 statements (Roberts, 1995). Responses were on a 6-point rating scale where 1 = Strongly Disagree, 2 = Disagree, 3 = Slightly Disagree, 4 = Slightly Agree, 5 = Agree, and 6 = Strongly Agree. As this study focuses on dichotomous data, these ratings were dichotomized: ratings of 1, 2, or 3 were scored as 0, and ratings of 4, 5, or 6 were scored as 1.

Analyses

To define item values for the x-axis, the data were analyzed with the Mudfold R Package (Balafas et al., 2020). The ranks and quantiles obtained from fitting a Mudfold model are shown in Table 3, as well as the transformed values used in the FDA. The FDA-M values were based on a transformation of item ranks to quantiles (Johnson, 2006). These quantiles (κ) are transformed into FDA-Mudfold (FDA-M) values as follows:

$$\text{FDA-M} = \ln \left[ \frac{κ + 1 / (2 \times N)}{(1 - κ) + 1 / (2 \times N)} \right], \tag{4}$$

where N is the number of persons and κ is the item quantile. Next, these FDA-M values are standardized (M = 0, SD = 1) for the purposes of this study. This transformation is designed to reduce bias in the case of small N and also provides a reasonable adjustment for perfect scores (Anscombe, 1956). It should be stressed that the definition of the x-axis influences the substantive results because the interpretation is expected to vary depending on how the x-axis and scale are defined. This provides another example of how to define the x-axis for a functional approach to unfolding data. The data set is the same one used in Roberts (1995).
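The role of the 1/(2N) adjustment in Equation 4 can be seen in a short sketch. The grouping of terms used here, with the adjusted quantile in the numerator and the adjusted complement in the denominator, is an assumed reading of Equation 4 in the spirit of the empirical logit, and the sample standard deviation is assumed for the standardization step. The quantiles and N = 223 come from the text. The sketch illustrates the form of the transformation only and should not be expected to reproduce the FDA-M column of Table 3 exactly.

```python
import math
import statistics


def empirical_logit(kappa, n):
    """Log odds of a quantile with a 1/(2n) adjustment in both terms,
    which keeps the value finite when kappa is 0 or 1."""
    adjustment = 1.0 / (2.0 * n)
    return math.log((kappa + adjustment) / ((1.0 - kappa) + adjustment))


n_persons = 223                                   # sample size reported for Study 2
quantiles = [rank / 20 for rank in range(1, 21)]  # 0.05, 0.10, ..., 1.00

raw = [empirical_logit(kappa, n_persons) for kappa in quantiles]

mean = statistics.mean(raw)
sd = statistics.stdev(raw)                        # sample SD, assumed for this sketch
standardized = [(value - mean) / sd for value in raw]

for kappa, value, z in zip(quantiles, raw, standardized):
    print(f"{kappa:.2f}  {value:7.3f}  {z:6.2f}")
```

Without the adjustment, the quantile of 1.00 would require the logarithm of a division by zero. With it, the largest quantile maps to a large but finite value.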

The following decision parameters were used for the FDA:

Basis Function: B-splines

Degree of polynomial: fourth-degree polynomial

Knots: Item locations based on transformed Mudfold values (FDA-M)

Penalized Derivative: third derivative

Smoothing Parameter: λ = 5 (weight)

Number of Clusters: k = 3, 4

Functional Clustering: k-means and funFEM.

These decision parameters are the same as those used in the illustration with the exception of the number of clusters (three and four clusters were examined).

Results

Figure 9 shows the results of the FDA. Figure 9 (Panel A) shows the plot of the item averages (p values) on the items. Two items appear to be outliers on this curve: Item 7 (Until public taste has been educated, we must continue to have censorship) and Item 12 (Censorship is a gross violation of our constitutional right). These items may be outliers because they may be confusing to respondents. An alternative explanation is that the Mudfold analyses used to define the x-axis are “content” free, and so these items are not constrained to match the substantive ordering that may be expected. Next, the individual PRFs (N = 223) are plotted. As expected, this plot is busy and difficult to interpret. Finally, the scree-like plot of the within-cluster sum of squares is shown. In this case, there are two potential elbows (k = 3 and k = 4). The functional cluster model was fit to three and four clusters. The Bayesian Information Criterion (BIC) was 1741.289 for three clusters and 355.631 for four clusters. This suggests that using four clusters is a better fit for these data.

Figure 9

Attitude Toward Censorship (Roberts, 1995): Four Clusters

Panel B illustrates average FDA curves for each of the four clusters. It is interesting to note that the use of four clusters yields a curve that does not fit the expected pattern for unfolding data. Three clusters appear to have expected shapes for unfolding data, while the fourth curve (concave up—red) suggests persons in this cluster were not responding as expected. Additional analyses are needed to explore why the fourth cluster appears to not reflect an unfolding process for these respondents. The histogram for the four clusters is shown next. Panel C presents the dendrogram for these data. The person numbers are not shown in this display because of the large sample size.

In summary, four clusters offer a potentially useful interpretation of person response functions with the censorship data. The fourth cluster may indicate aberrant PRFs that require additional analyses.

Discussion

The purpose of this study was to introduce FDA as a promising approach for modeling unfolding response data. Functional data analysis was discussed with seven decision parameters identified that can be used to guide FDA within the context of unfolding data. The use of functional clustering as a strategy for summarizing the smoothed curves obtained from FDA was also described. The use of k-means clustering in combination with a funFEM algorithm (Bouveyron, 2015) shows promise as a strategy for grouping person response functions obtained from the functional analysis.

It is important to recognize that FDA serves the same purposes as other statistical methods. The goals of FDA (Ramsay & Silverman, 2005) include the following:

• Representation of the data in ways to suggest further analyses,

• Visualization of data to highlight various characteristics,

• Identification of patterns and variation within the data,

• Exploration of relationships between dependent and independent variables, and

• Comparison of two or more data sets.

This study primarily focused on illustrating the representation of unfolding response data with visual displays of the data.

There are several areas for future research on the use of FDA in the context of unfolding response processes. It is important to consider the application of FDA to polytomous data within the context of unfolding models. Another area for future research focuses on the interpretation of clusters. This study started with the strong idea of response patterns that exhibit good fit, and this may guide interpretation of clusters. Another set of topics for future research is to examine item and person fit from the perspective of FDA with unfolding items.

Future research should explore the consequences of specifying the x-axis in different ways. The substantive interpretations of the analyses are expected to vary depending on how the x-axis is defined. The x-axis can be defined based on a specific unfolding model, such as the hyperbolic cosine model (HCM; Andrich, 2016), the generalized graded unfolding model (GGUM; Roberts, 2016), or multiple unidimensional unfolding (MUDFOLD; van Schuur, 1984). Study 1 uses previously defined scale values to define the x-axis, while Study 2 uses modified item values from MUDFOLD.

It is also important to recognize that FDA and FDA clustering in general are rapidly growing areas of research. This study illustrated several methods from FDA, and future research should examine other tools from FDA to examine response data. Unfolding models offer another general area that is open for additional conceptual and statistical work. Unfolding models are underutilized for exploring item responses. The theoretical underpinnings of unfolding models may be less familiar for some researchers. The development of new software (Balafas et al., 2020; J. Wang et al., in press) should provide additional opportunities for exploring new application areas. A combination of new ideas regarding unfolding in conjunction with new methods within FDA offers an exciting space for future research.

Seven key decision parameters were identified in this study with various options. It is important to conduct future research using simulated data to examine the effects of varying these parameters. As these choices are based on judgment, it is inevitable that, as Tukey (1977) said, “different users are always likely to prefer different choices” (p. 215). Choice is a double-edged sword. It can be viewed as an advantage of a functional approach to unfolding data because of the flexibility it offers. However, it can also be viewed as a disadvantage because judgments are difficult to make when there are no definitive or correct answers. The researcher is primarily guided by the usefulness and meaning that may accrue from using FDA to model the data. FDA as used in this study is essentially an exploratory data analysis. Exploratory analyses depend in a fundamental way on choices and judgments by researchers who have substantive knowledge regarding the data and research problem. Of course, these characteristics play an important role in all statistical analyses of data.

In summary, the use of FDA with unfolding data offers a promising set of tools for examining unfolding response processes. The major goal of this study was to offer proof-of-concept support for a combination of FDA and unfolding analyses. This study is intended as an invitation to other researchers to investigate the utility of FDA for unfolding data.

Footnotes

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

George Engelhard

References

Abraham, Cornillon, P. A., Matzner-Lober, & Molinari (2003). Unsupervised curve clustering using B-splines. Scandinavian Journal of Statistics, 30(3), 581–595. https://doi.org/10.1111/1467-9469.00350

Andrich (1988). The application of an unfolding model of the PIRT type to the measurement of attitude. Applied Psychological Measurement, 12(1), 33–51.

Andrich (2016). Hyperbolic cosine model for unfolding responses. In W. J. van der Linden (Ed.), Handbook of item response theory. Volume one: Models (pp. 353–367). CRC Press.

Anscombe, F. J. (1956). On estimating binomial response relations. Biometrika, 42, 461–464.

Baker, F. B., & Kim (2004). Item response theory: Parameter estimation techniques. Marcel Dekker.

Balafas, S. E., Krijnen, W. P., Post, W. J., & Wit, E. C. (2020). Mudfold: An R package for nonparametric IRT modelling of unfolding processes. The R Journal, 12(1), 49–75.

Bouveyron (2015). funFEM: Clustering in the discriminative functional subspace (R package version 1.1). https://CRAN.R-project.org/package=FunFEM

Bouveyron, Côme, & Jacques (2015). The discriminative functional mixture model for a comparative analysis of bike sharing systems. The Annals of Applied Statistics, 9(4), 1726–1760.

Charrad, Ghazzali, Boiteau, & Niknafs (2014). NbClust: An R package for determining the relevant number of clusters in a data set. Journal of Statistical Software, 61(6), 1–36.

Coombs, C. H. (1964). A theory of data. Wiley.

Cuevas (2014). A partial overview of the theory of statistics with functional data. Journal of Statistical Planning and Inference, 147, 1–23. https://doi.org/10.1016/j.jspi.2013.04.002

De Ayala, R. J. (2013). The theory and practice of item response theory. Guilford Publications.

Engelhard & Yuan (in press). A history of unfolding models. In Engelhard, Yuan, & Wang (Eds.), Special issue on unfolding models. Journal of Applied Measurement.

Hoijtink (1993). PARELLA: A parametric approach to parallelogram analysis. Kwantitatieve Methoden, 42, 93–107.

Jacques & Preda (2013). Functional data clustering: A survey. Advances in Data Analysis and Classification, 8(3), 231–255. https://doi.org/10.1007/s11634-013-0158-y

James, G. M., & Sugar, C. A. (2003). Clustering for sparsely sampled functional data. Journal of the American Statistical Association, 98(462), 397–408. https://doi.org/10.1198/016214503000189

Johnson, M. S. (2006). Nonparametric estimation of item and respondent locations from unfolding-type items. Psychometrika, 71(2), 257–279.

Kokoszka & Reimherr (2017). Introduction to functional data analysis. CRC Press.

Levitin, D. J., Nuzzo, R. L., Vines, B. W., & Ramsay, J. O. (2007). Introduction to functional data analysis. Canadian Psychology, 48(3), 135–155. https://doi.org/10.1037/cp2007014

Li, Nguyen, D. V., Banerjee, Rhee, C. M., Kalantar-Zadeh, Kürüm, & Şentürk (2021). Multilevel modeling of spatially nested functional data: Spatiotemporal patterns of hospitalization rates in the US dialysis population. Statistics in Medicine, 40(17), 3937–3952.

Liu, C.-W., & Chalmers, R. P. (2018). Fitting item response unfolding models to Likert-scale data using mirt in R. PLOS ONE, 13(5), Article e0196292.

Luo (2001). A class of probabilistic unfolding models for polytomous responses. Journal of Mathematical Psychology, 45(2), 224–248.

Marcoulides, K. M., & Khojasteh (2018). Analyzing longitudinal data using natural cubic smoothing splines. Structural Equation Modeling: A Multidisciplinary Journal, 25(6), 965–971.

Oshinubi, Ibrahim, Rachdi, & Demongeot (2022). Functional data analysis: Application to daily observation of COVID-19 prevalence in France. AIMS Mathematics, 7(4), 5347–5385.

Peterson, R. C. (1931). A scale for measuring attitude towards capital punishment. University of Chicago Press.

Post, W. J. (1992). Nonparametric unfolding models: A latent structure approach (M&T Series). DSWO Press.

Post, W. J., & Snijders, T. A. (1993). Nonparametric unfolding models for dichotomous data. Methodika, VII, 130–156.

Ramsay, J. O. (2016). Functional approaches to modeling response data. In W. J. van der Linden (Ed.), Handbook of item response theory. Volume one: Models (pp. 337–350). CRC Press.

Ramsay, J. O., & Silverman, B. W. (2005). Functional data analysis. Springer.

Roberts, J. A. (2016). Generalized graded unfolding model. In W. J. van der Linden (Ed.), Handbook of item response theory. Volume one: Models (pp. 369–390). CRC Press.

Roberts, J. S. (1995). Item response theory approaches to attitude measurement (UMI No. 9611247) [Doctoral dissertation, University of South Carolina]. Dissertation Abstracts International, 56, 7089B. ProQuest Dissertations and Theses Database.

Rosander, A. C., & Thurstone, L. L. (1931). Scale of attitude toward censorship: Scale No. 28. In L. L. Thurstone (Ed.), The measurement of social attitudes. University of Chicago Press.

Rupp (2013). A systematic review of the methodology for person fit research in item response theory: Lessons about generalizability of inferences from the design of simulation studies. Psychological Test and Assessment Modeling, 55(1), 3–38.

Shah, Jan, & Ali (2022). Functional data approach for short-term electricity demand forecasting. Mathematical Problems in Engineering, 2022, Article 6709779.

Shaw, M. E., & Wright, J. M. (1967). Scales for the measurement of attitudes. McGraw-Hill.

Steinley (2006). K-means clustering: A half-century synthesis. British Journal of Mathematical and Statistical Psychology, 59, 1–34.

Suhaila (2021). Functional data visualization and outlier detection on the anomaly of El Niño southern oscillation. Climate, 9(7), 118.

Suk, H. W., West, S. G., Fine, K. L., & Grimm, K. J. (2019). Nonlinear growth curve modeling using penalized spline models: A gentle introduction. Psychological Methods, 24, 269–290.

Tarpey & Kinateder, K. K. (2003). Clustering functional data. Journal of Classification, 20(1), 93–114.

Thurstone, L. L., & Chave, E. J. (1929). The measurement of attitude: A psychophysical method and some experiments with a scale for measuring attitude toward the church. The University of Chicago Press.

Tukey, J. W. (1977). Exploratory data analysis. Addison-Wesley Publishing Company.

Van der Linden, W. J. (Ed.). (2019). Handbook of item response theory: Volume 1: Models. CRC Press.

van Schuur, W. H. (1984). Structure of political beliefs. CT Press.

Wang, J., Combs, & Zopluoglu (in press). Fitting hyperbolic cosine unfolding models using the Stan Language. Journal of Applied Measurement.

Wang, S., Cao, Shang, & Alzheimer’s Disease Neuroimaging Initiative. (2021). Estimation of the mean function of functional data via deep neural networks. Stat, 10(1), e393.

Wohlwill, J. F. (1963). The measurement of scalability for non-cumulative items. Educational and Psychological Measurement, 23(3), 543–555.

Woods, C. M. (2006). Ramsay-curve item response theory (RC-IRT) to detect and correct for nonnormal latent variables. Psychological Methods, 11(3), 253–270.