Note: This is a working paper which will be expanded/updated frequently. The directory has a pdf copy of this article, the complete Rmd file with all code chunks, the bib file, the LaTeX style file, and the R source code.

1 Problem

We use the abbreviation MDS for the minimization of Kruskal’s stress \[\begin{equation} \sigma(X):=\mathop{\sum\sum}_{1\leq i<j\leq n} w_{ij}(\delta_{ij}-d_{ij}(X))^2\label{E:stress} \end{equation}\]

over the \(n\times p\) configurations \(X\) (Kruskal (1964a), Kruskal (1964b)). Symmetric matrix \(W=\{w_{ij}\}\) has known non-negative weights, symmetric matrix \(\Delta=\{\delta_{ij}\}\) has non-negative dissimilarities, and the symmetric matrix-valued function \(D(X)=\{d_{ij}(X)\}\) has Euclidean distances between the points, i.e. the rows of \(X\).

In the usual applications of MDS the user chooses dimensionality \(p\) to be much smaller than \(n\), because the name of the game is dimensionality reduction. In this paper we give a detailed analysis of full dimensional scaling or FDS, which is MDS with \(p=n\).

2 Notation

We introduce some notation which is useful for all Euclidean MDS problems, not just FDS problems. This notation has been more or less standard since De Leeuw (1977). Suppose the \(e_i\) are unit vectors of length \(n\), i.e. \(e_i\) has all its elements equal to zero except element \(i\), which is equal to one. Also define the matrices \(A_{ij}:=(e_i-e_j)(e_i-e_j)'\). Thus \(A_{ij}\) has four nonzero elements: elements \((i,i)\) and \((j,j)\) are \(+1\), elements \((i,j)\) and \((j,i)\) are \(-1\). With these definitions \[\begin{equation} d_{ij}^2(X)=(e_i-e_j)'XX'(e_i-e_j)=\mathbf{tr}\ X'A_{ij}X=\mathbf{tr}\ A_{ij}XX'.\label{E:matrixd} \end{equation}\]

Now expand stress. If we assume without loss of generality that \[ \mathop{\sum\sum}_{1\leq i<j\leq n} w_{ij}\delta_{ij}^2=1, \] and we define

\[\begin{align} \rho(X)&:=\mathop{\sum\sum}_{1\leq i<j\leq n} w_{ij}\delta_{ij}d_{ij}(X),\label{E:rho}\\ \eta(X)&:=\mathop{\sum\sum}_{1\leq i<j\leq n} w_{ij}d_{ij}^2.\label{E:eta} \end{align}\] Now \[\begin{equation} \sigma(X)=1-\rho(X)+\frac12\eta(X). \end{equation}\] To develop some convenient matrix notation we define \[\begin{align} V&:=\mathop{\sum\sum}_{1\leq i<j\leq n} w_{ij}A_{ij},\\ B(X)&:=\mathop{\sum\sum}_{1\leq i<j\leq n} \begin{cases}w_{ij}\frac{\delta_{ij}}{d_{ij}(X)}A_{ij}&\text{ if }d_{ij}(X)>0,\\ 0&\text{ if }d_{ij}(X)=0,\end{cases} \end{align}\] so that \[\begin{align} \rho(X)&=\mathbf{tr}\ X'B(X)X,\\ \eta(X)&=\mathbf{tr}\ X'VX. \end{align}\]

Note that \(V\) has off-diagonal elements equal to \(-w_{ij}\) and \(B(X)\) has off-diagonal elements \[b_{ij}(X)=-w_{ij}\frac{\delta_{ij}}{d_{ij}(X)}\] if \(d_{ij}(X)>0\) and zero otherwise. Diagonal elements are filled to make rows and columns sum to zero. Both \(V\) and \(B(X)\) are doubly-centered and positive semi-definite. If we assume without loss of generality that the MDS problem is irredicible (De Leeuw (1977)) then \(\mathbf{rank}(V)=n-1\) and the only vectors in the null space are multiples of \(e:=(1,1,\cdots,1)\). If all weights are eual to one then \(V=nI-ee'\).

3 Local minima of stress

De Leeuw (1984) proves a key necessary condition for a local minimum in MDS. Also see De Leeuw, Groenen, and Mair (2016) for a more recent proof. We also review some additional results on maxima and minima of stress.

Theorem 1: [De Leeuw 84, De Leeuw 93, De Leeuw and Groenen 97]

  1. If \(\sigma\) has a local minimum at \(X\) then \(d_{ij}(X)>0\) for all \(i,j\) such that \(w_{ij}\delta_{ij}>0\).
  2. If \(\sigma\) has a local minimum at \(X\) then \(\sigma\) is differentiable in a neighborhood of \(X\). Thus \(B(X)X=VX\).
  3. If \(\sigma\) has a local minimum at a column-centered \(X\) and \(\mathbf{rank}(X)=n-1\) then \(\sigma(X)=0\).
  4. \(\sigma\) has a local maximum at \(X\) if and only if \(X=0\).

Let’s look at a small example with four points, all dissimilarities equal, and all weights equal to one. There is a local minimum with four points in the corners of a square, with stress equal to 0.0285955. And there is another local minimum with three points forming an equilateral triangle, and the fourth point in the center. This has stress 0.0669873. We can compute stress for all points of the form \(\alpha X+\beta Y\), where \(X\) and \(Y\) are the two local minima. Figure 1 has a contour plot of \(\sigma(\alpha,\beta)\), showing the local minima at \((1,0)\) and \((0,1)\), and the local maximum at \((0,0)\).