The FuzzySpec package implements the FVIBES (Fuzzy Variable-Importance Based Eigenspace Separation) algorithm, a fuzzy spectral clustering procedure that incorporates variable-weighted distance metrics and adaptive adjacency matrix constructions. This package accompanies the paper Variable-Weighted Adjacency Constructions for Fuzzy Spectral Clustering by Ghashti, Hare, and Thompson (2025).
The key features of this package include:
a variable-weighted distance metric that automatically determines variable importance using nonparametric kernel density estimation,
an adaptive adjacency construction framework with multiple options for building similarity graphs including locally-adaptive scaling (Zelnik-Manor and Perona, 2004),
clustering outputs that return fuzzy membership matrices rather than just hard cluster assignments, and
built-in synthetic dataset generators for benchmarking fuzzy clustering algorithms.
FVIBES clustering proceeds in three primary steps:
Build an adjacency matrix from the data using make.adjacency().
Perform fuzzy spectral clustering using fuzzy.spectral.clustering().
Optionally, examine the results with the 2D visualization function plot.fuzzy(), or compare them to true class labels using clustering.accuracy().
Install the latest release version of FuzzySpec from GitHub or with the following:
The basic steps using built-in functions are provided below. We use the spirals dataset; see the help file for gen.fuzzy() for more options and information.
set.seed(1)
data <- FuzzySpec::gen.fuzzy(n = 300, dataset = "spirals", noise = 0.15) # data generation
FuzzySpec::plot.fuzzy(data, plotFuzzy = TRUE, colorCluster = TRUE) # plot data generating process
W <- FuzzySpec::make.adjacency(
data = data$X,
method = "vw", # variable-weighted distances
isLocWeighted = TRUE, # Locally-adaptive scaling
scale = FALSE # scaling not required for kernel methods
)
#> Multistart 1 of 3 ... Multistart 3 of 3
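To make the locally-adaptive scaling option more concrete, the following is a minimal base R sketch of the self-tuning idea from Zelnik-Manor and Perona (2004) applied to plain Euclidean distances. It is not FuzzySpec's implementation: the function name local_scaling_adjacency(), its argument K, and the toy data are assumptions made for illustration, and the variable-weighted distance used by method = "vw" is not reproduced here.

```r
# Sketch of locally-adaptive (self-tuning) scaling on Euclidean distances.
# local_scaling_adjacency() and K are hypothetical names, not FuzzySpec's API.
local_scaling_adjacency <- function(X, K = 7) {
  D <- as.matrix(dist(X))  # pairwise Euclidean distances
  # sigma_i: distance from point i to its K-th nearest neighbour
  # (position K + 1 after sorting, because position 1 is the point itself)
  sigma <- apply(D, 1, function(d) sort(d)[K + 1])
  # W_ij = exp(-d_ij^2 / (sigma_i * sigma_j))
  W <- exp(-D^2 / outer(sigma, sigma))
  diag(W) <- 0  # no self-loops
  W
}

# Toy data (an assumption for illustration): two well-separated groups
set.seed(1)
X_toy <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2)
)
W_toy <- local_scaling_adjacency(X_toy, K = 7)
dim(W_toy)
```

Because each pair of points is scaled by the product of its own neighbourhood radii, dense and sparse regions of the data receive comparable similarity values without a single global bandwidth.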
We then run fuzzy spectral clustering with k = 3 clusters and the commonly chosen fuzzy parameter m = 1.5. We display the first 5 rows of the membership matrix \(\mathbf{U}\):
res <- FuzzySpec::fuzzy.spectral.clustering(
W = W, k = 3, m = 1.5, method = "CM"
)
res$u[1:5,]
#> Clus 1 Clus 2 Clus 3
#> Obj 1 0.9048457 0.05660900 0.03854527
#> Obj 2 0.9549308 0.02402001 0.02104920
#> Obj 3 0.9092418 0.05372531 0.03703293
#> Obj 4 0.9764813 0.01192352 0.01159519
#> Obj 5 0.9590105 0.02179147 0.01919800
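The role of the fuzzy parameter m can be illustrated with the standard fuzzy c-means membership formula (Bezdek, 1981), in which the membership of object i in cluster k is proportional to its distance to centre k raised to the power -2/(m - 1). The sketch below is an illustration under assumptions: fcm_membership() and the toy distance matrix are hypothetical, and whether fuzzy.spectral.clustering() computes res$u or res$cluster in exactly this way is not stated here.

```r
# fcm_membership() is a hypothetical helper, not part of FuzzySpec.
fcm_membership <- function(Dc, m = 1.5) {
  # Dc: n x k matrix of strictly positive distances from objects to cluster centres
  p <- 2 / (m - 1)
  inv <- (1 / Dc)^p    # closer centres receive larger weight
  inv / rowSums(inv)   # normalise so that each row sums to 1
}

# Toy distances (an assumption): object 1 is close to centre 1,
# object 2 is equidistant from all three centres
Dc <- rbind(c(0.2, 1.0, 1.5),
            c(1.0, 1.0, 1.0))

U_m15 <- fcm_membership(Dc, m = 1.5)  # sharper memberships
U_m30 <- fcm_membership(Dc, m = 3.0)  # softer memberships

# A hard assignment can be recovered as the column of largest membership
hard <- apply(U_m15, 1, which.max)
```

Values of m closer to 1 push memberships toward 0 or 1, while larger values spread membership more evenly across clusters.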
acc <- FuzzySpec::clustering.accuracy(data$y, res$cluster)
cat("Clustering accuracy:", round(acc, 3), "\n")
#> Clustering accuracy: 0.99
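Cluster labels are arbitrary, so an accuracy score has to align predicted labels with the true classes before counting matches. How clustering.accuracy() performs this alignment is not described here; the base R sketch below shows one common approach, taking the best accuracy over all label permutations. The helper best_match_accuracy() and the toy labels are hypothetical, and the exhaustive search is only practical for a small number of clusters.

```r
# best_match_accuracy() is a hypothetical helper, not FuzzySpec's implementation.
best_match_accuracy <- function(truth, pred) {
  tab <- table(truth, pred)  # contingency table of true vs predicted labels
  k <- max(dim(tab))
  # pad to a square k x k matrix so that every permutation is valid
  M <- matrix(0, k, k)
  M[seq_len(nrow(tab)), seq_len(ncol(tab))] <- tab
  # enumerate all permutations of a vector (fine for small k)
  perms <- function(v) {
    if (length(v) <= 1) return(list(v))
    out <- list()
    for (i in seq_along(v)) {
      for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
    }
    out
  }
  # matched count for each one-to-one mapping of true class to predicted label
  matched <- vapply(perms(seq_len(k)),
                    function(p) sum(M[cbind(seq_len(k), p)]),
                    numeric(1))
  max(matched) / length(truth)
}

# Toy labels (an assumption): same partition up to relabelling, one error
truth_toy <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
pred_toy  <- c(2, 2, 2, 3, 3, 1, 1, 1, 1)
acc_toy <- best_match_accuracy(truth_toy, pred_toy)
```

For larger numbers of clusters, an assignment solver such as the Hungarian algorithm would replace the exhaustive permutation search.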
Fuzzy partitions can also be compared with fari, which computes fuzzy generalizations of the Adjusted Rand Index (FARI) based on Frobenius inner products of membership matrices (Andrews, Browne and Hvingelby, 2022).
resDF <- list(
X = data$X, U = res$u, y = factor(res$cluster), k = 3
)
FuzzySpec::plot.fuzzy(resDF, plotFuzzy = TRUE, colorCluster = TRUE)
See respective help files for each function when needed; here we provide a basic overview of function arguments for make.adjacency(). This function allows for flexible adjacency matrix constructions based on Ghashti et al. (2025). The parameters are as follows:
method: distance metric
    "eu": squared Euclidean distance
    "vw": variable-weighted distance using kernel density bandwidth estimation
isLocWeighted: scaling approach
    TRUE: locally-adaptive scaling (Zelnik-Manor & Perona, 2004)
    FALSE: global scaling with parameter sig
isModWeighted: apply similarity weightings
    ModMethod = "snn": shared nearest neighbors (Jarvis & Patrick, 1973)
    ModMethod = "sim": similarity-based weighting
    ModMethod = "both": combined SNN and SIM
isSparse: returns a sparse matrix when using weightings
References
Andrews, J.L., Browne, R. and C.D. Hvingelby (2022). On Assessments of Agreement Between Fuzzy Partitions. Journal of Classification, 39, 326–342.
J.C. Bezdek (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.
K. R. Coombes (2025). Thresher: Threshing and Reaping for Principal Components. R package version 1.1.5.
Ferraro, M.B., Giordani, P., and A. Serafini (2019). fclust: An R Package for Fuzzy Clustering. The R Journal, 11.
Ghashti, J. S., Hare, W., and J. R. J. Thompson (2025). Variable-weighted adjacency constructions for fuzzy spectral clustering. Submitted.
Jarvis, R. A., and A. E. Patrick (1973). Clustering using a similarity measure based on shared near neighbors. IEEE Transactions on Computers, 22(11), 1025-1034.
Hayfield, T., and J. S. Racine (2008). Nonparametric Econometrics: The np Package. Journal of Statistical Software 27(5).
McLachlan, G. and T. Krishnan (2008). The EM algorithm and extensions, Second Edition. John Wiley & Sons.
Ng, A., Jordan, M., and Y. Weiss (2001). On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems, 14.
Scrucca, L., Fraley, C., Murphy, T.B., and A. E. Raftery (2023). Model-Based Clustering, Classification, and Density Estimation Using mclust in R. Chapman & Hall.
H. Wickham (2016). ggplot2: Elegant Graphics for Data Analysis. Springer–Verlag New York.
Zelnik-Manor, L., and P. Perona (2004). Self-tuning spectral clustering. Advances in Neural Information Processing Systems, 17.
Zhu, Q., Feng, J., and J. Huang (2016). Natural neighbor: A self-adaptive neighborhood method without parameter K. Pattern Recognition Letters, 80, 30-36.