Exact sparse \(\ell_0\) nonnegative least squares#

Reference

[Nadisic et al., 2020] N. Nadisic, J. E. Cohen, A. Vandaele, N. Gillis, “Exact Sparse Nonnegative Least Squares”, ICASSP2020, pdf

Throughout this manuscript, the reader encounters many instances of NNLS. Therefore, efficient solvers for NNLS, in particular with matrix unknowns and warm-start, are instrumental to designing efficient algorithms for nonnegative LRA. Recall a generic formulation of (matrix) NNLS as follows,

\[ \underset{V\in\mathbb{R}_+^{r\times n}}{\min} \|Y - UV^T\|_F^2 \]

with \(Y\in\mathbb{R}^{m\times n}\) the input data matrix, and \(U\in\mathbb{R}^{m\times r}\) the mixing matrix, which may both in principle have positive and negative entries. The reader may refer to the dedicated section regarding NNLS in the first part of this manuscript.

An important variant of NNLS is the sparse NNLS problem (sNNLS), in which the coefficients are constrained to have only a few nonzero entries. In the following, we assume this sparsity is enforced columnwise on matrix \(V^T\), although we have also worked on a sparsity constraint applied on all entries of \(V^T\) [Nadisic et al., 2022], not discussed in this manuscript. Columnwise sparsity on the coefficient matrix \(V^T\) is frequently encountered in applications of rLRA related to source separation. For instance, the matrix \(V^T\) may denote source activations in a mixture; it is often reasonable to assume that not all sources are activated at all times or positions. An example is provided below in spectral unmixing. The (matrix) sNNLS problem can be formulated as

\[ \underset{V\in\mathbb{R}_+^{n\times r}}{\min} \|Y - UV^T\|_2^2 \;\; \text{such that } \;\; \forall q\leq n,\; \|V[q,:]\|_0\leq k \]

for a known fixed sparsity level \(k<r\). Geometrically, sNNLS is the projection of a data vector \(y\) on the facets of the cone spanned by columns of \(U\). Data points can lie on exterior facets of the cone \(\cp(U)\), or on interior facets.

Note that, unlike matrix NNLS, for which we can design algorithms such as HALS using matrix-matrix products which leverage the native parallelism of these operations, we have not found a better solution for sNNLS than applying a vector sNNLS solver looping over all columns. Therefore, we may simply study the following (vector) sNNLS problem,

(10)#\[ \underset{v\in\mathbb{R}_+^{r}}{\min} \|y - Uv\|_2^2 \;\; \text{such that } \;\; \|v\|_0\leq k \]

with \(y\) a column of input matrix \(Y\) and \(v\) a column of matrix \(V^T\).

sNNLS is NP-hard, but the rank is typically small#

Looking at sNNLS (10), the reader may recognize a particular case of the sparse coding problem with nonnegativity constraints. In fact if we could solve sNNLS in polynomial time of \(r\), we could solve sparse coding with the algorithm since a sparse matrix-vector product \(Ax\) can be rewritten as a sparse nonnegative matrix-vector product \([A,-A][x_{+}, x_{-}]\) splitting the positive and negative parts of sparse vector \(x\). This proves that sNNLS is NP-hard. Computing global solutions to sparse coding is therefore bound to be expensive when the dimension \(r\) grows.

Nevertheless, in rLRA problems, dimension \(r\) refers to the rank of the rLRA model, which is typically reasonably small. This opens the door to considering algorithms that explore the entire set of solutions. Indeed, the sNNLS problem reduces to finding an optimal support for the solution \(v^\ast\), since

(11)#\[ \underset{v\in\mathbb{R}_+^{r}}{\min} \|y - Uv\|_2^2 \;\; \text{such that} \;\; \|v\|_0\leq k \; = \underset{\#S\leq k,\; S\subset[1,r]}{\min} ~\underset{v_S\in\mathbb{R}_+^{k}}{\min} \|y - U[:,S]v_S\|_2^2\]

where the inner problem wrt. \(v_S\) is a simple NNLS of reduced dimension.

Branch and Bound strategy#

Considering equation (11), we could solve the (vector) sNNLS problem by computing \(r \choose k\) NNLS problems of size \(k\). This brute force approach is tractable for small \(k\) and \(r\) and has the advantage of computing the global solution to sNNLS. Nevertheless, let us consider a more subtle approach that still solves sNNLS globally.

The key observation is that the cost function \(\|y - Uv\|_2^2\) at optimality decreases with the sparisty constraint \(k\). Formally, if \(S_1 \subset S_2 \subset [1,r]\) are two supports with \(S_1\) strictly included in \(S_2\) (corresponding to a sparser solution \(v^\ast_1\)), then

\[ \underset{v\geq 0}{\min} \|y - U[:, S_1]v\|_2^2 \geq \underset{v\geq 0}{\min} \|y - U[:, S_2]v\|_2^2. \]

In other words, the optimal cost necessarily increases as the constraint set grows. This observation suggests a Branch-and-Bound (BaB) algorithm for searching the combinatorial space of supports. We may first compute a guess for the optimal sNNLS solution using a heuristic algorithm, such as nonnegative OMP [Nguyen et al., 2017, Pati et al., 1993]. The cost function value at this solution \(v_{\text{UB}}\) is an upper-bound UB of the actual minimal cost of \(k\)-sparse sNNLS. Second, we may compute all NNLS problems with one zero element. For illustration purposes, consider the problem where the first element in \(v\) is set to zero, and denote the optimal cost for this problem \(s^\ast\). Because of the previous observation regarding monotonicity of the costs with increasing constraints, cost \(s^\ast\) is a lower bound LB on the optimal costs of all subsequent sNNLS problems where the first entry in the unknown \(v\) is zero. This fact applies to any set of constraints (any node in the tree of possible supports, see illustration below). The BaB algorithm allows us to save computation time when we encounter the case \(UB<LB\), in which case we know that the current proposed solution \(v_{\text{UB}}\) will always be better than the solutions at the current node and its children. This allows for safely pruning the set of supports to explore. Importantly, when a new solution for an \(r\)-sparse NNLS problem is computed, if its loss is lower than the current UB, we have obtained a new tighter UB, and update \(v_{\text{UB}}\) and UB accordingly. This procedure is perhaps better understood using a tree visualisation.

../../_images/BaB.png

Fig. 29 A tree representation of the Branch and Bound algorithm for solving sNNLS with \(r=2\) and \(k=4\). The deeper the node the sparser the solution, and the higher the cost. The leafs are the \(k\) choose \(n\) solutions. OMP is a heuristic that selects one particular leaf, which loss is larger than the loss at the optimal sNNLS solution, while any node in the tree has a loss lower than all its children.#

We named the proposed BaB algorithm for solving sNNLS arborescent.

Note

Despite the BaB algorithm pruning nodes, thus avoiding some computations, the total number of nodes in the graph – and therefore the total number of NNLS problems to solve if no pruning is achieved – is \(\sum_{p=k}^{r} {r\choose p}\). Therefore in some problem instances, the brute force algorithm can be faster if well implemented.

Note

The proposed arborescent algorithm, in fact, solves all \(p\)-sparse NNLS problems for \(p\geq k\). Indeed, one can show that the tree obtained after running arborescent can be cut at any floor optimally. This is useful if the sparsity level is not known in advance, in which case arborescent returns all \(p\)-sparse solutions.

A few details on the actual computations#

Upper-bound. We either use OMP, or compute a full NNLS at the root node, sort the coefficients, prune the smallest ones, and then recompute the resulting \(r\)-sparse sNNLS. This is equivalent to a depth‑first, left‑first exploration of the search tree. The nodes are then explored left to right (with the largest initial nonnegative coefficients first).

NNLS solver. To avoid spending too much time in the NNLS solver at each node, we use a warm start with the AS algorithm run to global convergence. In this setting, warm starts are extremely effective, drastically reducing computation time while keeping an exact solver and avoiding errors from early stopping.

Implementation. The method is implemented in Julia because loops and recursion are efficiently compiled.

Example of matrix sparse Nonnegative Least Squares#

import tensorly as tl
from tensorly.datasets import data_imports
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from tensorly_hdr.sep_nmf import snpa
from tensorly_hdr.giantly.arborescent import ksparse_nnls
from tensorly.solvers.nnls import hals_nnls

# Import an hyperspectral image
image_bunch = data_imports.load_indian_pines()
image_t = image_bunch.tensor # x by y by wavelength
image_t = image_t/tl.max(image_t) # maximum value set to 1
# Vectorize the tensor into a n_1\times n_2 matrix (wavelength \times pixels, y vary first)
image = tl.unfold(image_t, 2)

# Indian Pines is 145 by 145 (pixels) by 200 (bands)
n1, n2, n3 = tl.shape(image_t)
rank = 5

# select pure pixels and compute dense abundances
_, U, VTdense = snpa(image, rank)

# compute k sparse coefficients with sNNLS
VTsparse = tl.copy(VTdense)
imagetimage = tl.norm(image, axis=0)**2
k = 2

VTsparse = ksparse_nnls(U.T@image, U.T@U, imagetimage, VTsparse, k)

Hide code cell output

[juliapkg] Found dependencies: /opt/hostedtoolcache/Python/3.11.15/x64/lib/python3.11/site-packages/juliapkg/juliapkg.json
[juliapkg] Found dependencies: /opt/hostedtoolcache/Python/3.11.15/x64/lib/python3.11/site-packages/juliacall/juliapkg.json
[juliapkg] Locating Julia 1.10.3 - 1.11
[juliapkg] WARNING: You have Julia 1.12.6 installed but 1.10.3 - 1.11 is required.
[juliapkg]   It is recommended that you upgrade Julia or install JuliaUp.
[juliapkg] Querying Julia versions from https://julialang-s3.julialang.org/bin/versions.json
[juliapkg] WARNING: About to install Julia 1.11.9 to /home/runner/.julia/environments/pyjuliapkg/pyjuliapkg/install.
[juliapkg]   If you use juliapkg in more than one environment, you are likely to
[juliapkg]   have Julia installed in multiple locations. It is recommended to
[juliapkg]   install JuliaUp (https://github.com/JuliaLang/juliaup) or Julia
[juliapkg]   (https://julialang.org/downloads) yourself.
[juliapkg] Downloading Julia from https://julialang-s3.julialang.org/bin/linux/x64/1.11/julia-1.11.9-linux-x86_64.tar.gz
             download complete
[juliapkg] Verifying download
[juliapkg] Installing Julia 1.11.9 to /home/runner/.julia/environments/pyjuliapkg/pyjuliapkg/install
[juliapkg] Using Julia 1.11.9 at /home/runner/.julia/environments/pyjuliapkg/pyjuliapkg/install/bin/julia
[juliapkg] Using Julia project at /home/runner/.julia/environments/pyjuliapkg
[juliapkg] Writing Project.toml:
           | [deps]
           | PythonCall = "6099a3de-0909-46bc-b1f4-468b9a2dfc0d"
           | OpenSSL_jll = "458c3c95-2e84-50aa-8efc-19380b2a3a95"
           | 
           | [compat]
           | PythonCall = "=0.9.31"
           | OpenSSL_jll = "~3.0"
[juliapkg] Installing packages:
           | import Pkg
           | Pkg.Registry.update()
           | Pkg.add([
           |   Pkg.PackageSpec(name="PythonCall", uuid="6099a3de-0909-46bc-b1f4-468b9a2dfc0d"),
           |   Pkg.PackageSpec(name="OpenSSL_jll", uuid="458c3c95-2e84-50aa-8efc-19380b2a3a95"),
           | ])
           | Pkg.resolve()
           | Pkg.precompile()
  Installing known registries into `~/.julia`
       Added `General` registry to ~/.julia/registries
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
   Installed TableTraits ───────────────── v1.0.1
   Installed Parsers ───────────────────── v2.8.4
   Installed OpenSSL_jll ───────────────── v3.0.20+0
   Installed Scratch ───────────────────── v1.3.0
   Installed MicroMamba ────────────────── v0.1.15
   Installed JSON ──────────────────────── v1.5.0
   Installed Preferences ───────────────── v1.5.2
   Installed PythonCall ────────────────── v0.9.31
   Installed IteratorInterfaceExtensions ─ v1.0.0
   Installed JLLWrappers ───────────────── v1.7.1
   Installed Pidfile ───────────────────── v1.3.0
   Installed micromamba_jll ────────────── v2.3.1+0
   Installed DataValueInterfaces ───────── v1.0.0
   Installed Tables ────────────────────── v1.12.1
   Installed PrecompileTools ───────────── v1.2.1
   Installed DataAPI ───────────────────── v1.16.0
   Installed UnsafePointers ────────────── v1.0.0
   Installed OrderedCollections ────────── v1.8.1
   Installed pixi_jll ──────────────────── v0.41.3+0
   Installed MacroTools ────────────────── v0.5.16
   Installed StructUtils ───────────────── v2.8.0
   Installed CondaPkg ──────────────────── v0.2.34
    Updating `~/.julia/environments/pyjuliapkg/Project.toml`
  [6099a3de] + PythonCall v0.9.31
⌅ [458c3c95] + OpenSSL_jll v3.0.20+0
    Updating `~/.julia/environments/pyjuliapkg/Manifest.toml`
  [992eb4ea] + CondaPkg v0.2.34
  [9a962f9c] + DataAPI v1.16.0
  [e2d170a0] + DataValueInterfaces v1.0.0
  [82899510] + IteratorInterfaceExtensions v1.0.0
  [692b3bcd] + JLLWrappers v1.7.1
  [682c06a0] + JSON v1.5.0
  [1914dd2f] + MacroTools v0.5.16
  [0b3b1443] + MicroMamba v0.1.15
  [bac558e1] + OrderedCollections v1.8.1
  [69de0a69] + Parsers v2.8.4
  [fa939f87] + Pidfile v1.3.0
⌅ [aea7be01] + PrecompileTools v1.2.1
  [21216c6a] + Preferences v1.5.2
  [6099a3de] + PythonCall v0.9.31
  [6c6a2e73] + Scratch v1.3.0
  [ec057cc2] + StructUtils v2.8.0
  [3783bdb8] + TableTraits v1.0.1
  [bd369af6] + Tables v1.12.1
  [e17b2a0c] + UnsafePointers v1.0.0
⌅ [458c3c95] + OpenSSL_jll v3.0.20+0
  [f8abcde7] + micromamba_jll v2.3.1+0
  [4d7b5844] + pixi_jll v0.41.3+0
  [0dad84c5] + ArgTools v1.1.2
  [56f22d72] + Artifacts v1.11.0
  [2a0f44e3] + Base64 v1.11.0
  [ade2ca70] + Dates v1.11.0
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching v1.11.0
  [b77e0a4c] + InteractiveUtils v1.11.0
  [4af54fe1] + LazyArtifacts v1.11.0
  [b27032c2] + LibCURL v0.6.4
  [76f85450] + LibGit2 v1.11.0
  [8f399da3] + Libdl v1.11.0
  [56ddb016] + Logging v1.11.0
  [d6f4376e] + Markdown v1.11.0
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.11.0
  [de0858da] + Printf v1.11.0
  [9a3f8284] + Random v1.11.0
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization v1.11.0
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.10.0
  [8dfed614] + Test v1.11.0
  [cf7118a7] + UUIDs v1.11.0
  [4ec0a83e] + Unicode v1.11.0
  [deac9b47] + LibCURL_jll v8.6.0+0
  [e37daf67] + LibGit2_jll v1.7.2+0
  [29816b5a] + LibSSH2_jll v1.11.0+1
  [c8ffd9c3] + MbedTLS_jll v2.28.6+0
  [14a3606d] + MozillaCACerts_jll v2023.12.12
  [83775a58] + Zlib_jll v1.2.13+1
  [8e850ede] + nghttp2_jll v1.59.0+0
  [3f19e933] + p7zip_jll v17.4.0+2
        Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated -m`
Precompiling project...
    528.7 ms  ✓ IteratorInterfaceExtensions
    534.9 ms  ✓ DataValueInterfaces
    574.6 ms  ✓ DataAPI
    610.2 ms  ✓ UnsafePointers
    901.4 ms  ✓ OrderedCollections
    597.6 ms  ✓ Scratch
    796.5 ms  ✓ Pidfile
    457.8 ms  ✓ TableTraits
   1266.3 ms  ✓ StructUtils
   1068.9 ms  ✓ Preferences
    484.1 ms  ✓ PrecompileTools
    553.0 ms  ✓ JLLWrappers
   1144.2 ms  ✓ Tables
   1125.6 ms  ✓ OpenSSL_jll
    965.2 ms  ✓ StructUtils → StructUtilsTablesExt
   4045.2 ms  ✓ MacroTools
   2113.1 ms  ✓ micromamba_jll
   1978.0 ms  ✓ pixi_jll
   2270.5 ms  ✓ MicroMamba
  13464.3 ms  ✓ Parsers
   3434.0 ms  ✓ JSON
   4288.8 ms  ✓ CondaPkg
  13885.9 ms  ✓ PythonCall
  23 dependencies successfully precompiled in 38 seconds. 27 already precompiled.
  2 dependencies had output during precompilation:
┌ CondaPkg
│  Downloading artifact: pixi
└  
┌ MicroMamba
│  Downloading artifact: micromamba
└  
  No Changes to `~/.julia/environments/pyjuliapkg/Project.toml`
  No Changes to `~/.julia/environments/pyjuliapkg/Manifest.toml`
Detected IPython. Loading juliacall extension. See https://juliapy.github.io/PythonCall.jl/stable/compat/#IPython
/home/runner/work/HDR/HDR/tensorly_hdr/tensorly_hdr/giantly

We can also implement a bare-bones alternating algorithm for NMF with k-sparse coefficients as follows. This algorithm is guaranteed to converge since it is an exact alternating algorithm with unique and attained subproblem solutions (supposing full-rank factors at each iteration), see Alternating optimization.

def barebone_aNNLS_sNMF(Y, U, VT, rank, k, itermax=10, tol=1e-8):
    # inplace modification of initial U and V
    YtY = tl.norm(image, axis=0)**2
    for i in range(itermax):
        print(f"Running iteration {i}")
        VT = ksparse_nnls(U.T@image, U.T@U, YtY, VT, k) # r x n1n2
        U = hals_nnls(VT@image.T, VT@VT.T, U.T).T
        rmse = tl.norm(Y - U@VT)/tl.sqrt(n1*n2*n3)
        print(f"RMSE is {rmse}")
        if rmse<tol:
            break
    return U, VT

Usnmf, VTsnmf = barebone_aNNLS_sNMF(image, tl.copy(U), tl.copy(VTdense), rank, k)
# Clustering with k=1
Usnmf1, VTsnmf1 = barebone_aNNLS_sNMF(image, tl.copy(U), tl.copy(VTdense), rank, 1)

Hide code cell output

Running iteration 0
RMSE is 0.01359225820636328
Running iteration 1
RMSE is 0.012696440458471936
Running iteration 2
RMSE is 0.012601905548613422
Running iteration 3
RMSE is 0.012567772923926697
Running iteration 4
RMSE is 0.012540759561513085
Running iteration 5
RMSE is 0.012514251533131548
Running iteration 6
RMSE is 0.01248105020001841
Running iteration 7
RMSE is 0.01243316542574228
Running iteration 8
RMSE is 0.01238388079875542
Running iteration 9
RMSE is 0.01236016569736541
Running iteration 0
RMSE is 0.023704230429043564
Running iteration 1
RMSE is 0.02192505747529786
Running iteration 2
RMSE is 0.021437248302857356
Running iteration 3
RMSE is 0.021338320710414364
Running iteration 4
RMSE is 0.021307224402818143
Running iteration 5
RMSE is 0.021277418334322656
Running iteration 6
RMSE is 0.021226576969234023
Running iteration 7
RMSE is 0.02111729906887571
Running iteration 8
RMSE is 0.020658511637742103
Running iteration 9
RMSE is 0.019657966506385138

Hide code cell source

fig, axs = plt.subplots(4, rank, figsize=(16, 8.5))
row_names = ["Dense", "SNPA + sNNLS", "sNMF (k=2)", "sNMF (k=1)"]

for i in range(rank):
    # Column title for each latent component
    axs[0, i].set_title(f"Component {i+1}", fontsize=12)

    axs[0, i].imshow(tl.reshape(VTdense[i, :], [n1, n2]))
    axs[0, i].set_axis_off()

    axs[1, i].imshow(tl.reshape(VTsparse[i, :], [n1, n2]))
    axs[1, i].set_axis_off()

    axs[2, i].imshow(tl.reshape(VTsnmf[i, :], [n1, n2]))
    axs[2, i].set_axis_off()

    axs[3, i].imshow(tl.reshape(VTsnmf1[i, :], [n1, n2]))
    axs[3, i].set_axis_off()

fig.suptitle("Abundance maps for each algorithm", fontsize=17)
fig.subplots_adjust(left=0.105, right=0.995, top=0.90, bottom=0.03, wspace=0.03, hspace=0.06)

# Row labels aligned to each row center and placed close to first-column images
for r, name in enumerate(row_names):
    bbox = axs[r, 0].get_position()
    x = bbox.x0 - 0.004
    y = bbox.y0 + bbox.height / 2
    fig.text(x, y, name, va="center", ha="right", fontsize=12)

fig2, ax2 = plt.subplots(3, 1, figsize=(10, 9), sharex=True)
component_labels = [f"Component {j+1}" for j in range(rank)]

# Plot dense spectra first to define the component colors
lines0 = ax2[0].plot(U)
colors = [line.get_color() for line in lines0]

# Reuse the same colors in all coefficient plots so each component is traceable
for j in range(rank):
    ax2[1].plot(Usnmf[:, j], color=colors[j])
    ax2[2].plot(Usnmf1[:, j], color=colors[j])

ax2[0].set_title("SNPA")
ax2[1].set_title("sNMF k=2")
ax2[2].set_title("sNMF k=1")

fig2.legend(
    lines0,
    component_labels,
    title="Estimated spectra",
    loc="upper center",
    ncol=min(rank, 5),
    frameon=False,
    bbox_to_anchor=(0.5, 1.01),
)
fig2.tight_layout(rect=[0, 0, 1, 0.95])

plt.show()
../../_images/74e297e9762d4bc89441a1d578137811d47c3bd788e45c3eda77dbdf02a7c561.png ../../_images/d9a40c3b0aae100834d00d0fcebc12ab9e1380580d61ffdc7b1ee4625a5004ad.png