Claim 4.3 For any j ∈ J we have j ∈ hσ,b(S). Therefore, |J| = |P| ≤ |S|.
Proof Consider any j ∉ hσ,b(S). From Equation (A.1) in the proof of Lemma 4.2 it follows that |ûj| ≤ δnL < 1/2.
Lemma 4.3 Consider an execution of NOISELESSSPARSEFFTINNER, and let S = supp(x̂ − ẑ).If |S| ≤ k′, then
Proof Let e denote the number of coordinates i ∈ S for which either Ecoll(i) or Eoff (i) holds. Each such coordinate might not appear in P with the correct value, leading to an incorrect value of ŵi. In fact, it might result in an arbitrary pair (i′, v′) being added to P, which in turn could lead to an incorrect value of
Since E[e] ≤ (4|S|/B + α)|S| ≤ (4β + α)|S|, the lemma follows.
Analysis of NOISELESSSPARSEFFT
Consider the tth iteration of the procedure, and define St = supp(x̂ − ẑ) where ẑ denotes the value of the variable at the beginning of loop. Note that |S0| = |supp(x̂)| ≤ k.
We also define an indicator variable It which is equal to 0 iff |St|/|St−1| ≤ 1/8. If It = 1 we say the tth iteration was not successful. Let γ = 8 · 8(β + α). From Lemma 4.3 it follows that
For any t ≤ 1, define an event E(t) that occurs iff
Lemma 4.4 Let E = E(1) ∪ … ∪ E (λ) for λ = 1 + log k. Assume that (4γ)1\2 < 1/4. Then Pr[E] ≤ 1/3.
Proof Let t′ = [t/2]. We have
Therefore,
Theorem 4.1 Suppose x̂ is k-sparse with entries from {−L, …, L} for some known L = nO(1). Then the algorithm NOISELESSSPARSEFFT runs in expected O(k log n) time and returns the correct vector x̂ with probability at least 2/3.
Proof The correctness follows from Lemma 4.4. The running time is dominated by O(log k) executions of HASHTOBINS.
Assuming a correct run, in every round t we have
Therefore,
so the expected running time of each execution of HASHTOBINS is
4.3 Algorithm for the General Case
For the general case, we only achieve Equation (4.1) for
Pseudocode for the general algorithm SFT 4.0 is shown in Algorithms 4.2 and 4.3.
4.3.1 Intuition
Let S denote the “heavy” O(k/∊) coordinates of x̂. The overarching algorithm SPARSEFFT (SFT 4.0) works by first “locating” a set L containing most of S, then “estimating” x̂L to get ẑ. It then repeats on
Algorithm 4.2 SFT 4.0: General Sparse Fourier Transform for k = o(n), part 1 of 2
In the rest of this intuition, we will discuss the first iteration of SPARSEFFT with simplified constants. In this iteration, hashes are to B = O(k/∈) bins and, with 3/4 probability, we get ẑ so
Location
As in the noiseless case, to locate the “heavy” coordinates we consider the “bins” computed by HASHTOBINS with Pσ,a,b. This roughly corresponds to first permuting the coordinates according to the (almost) pairwise independent permutation Pσ,a,b, partitioning the coordinates into B = O(k/∊) “bins” of n/B consecutive indices, and observing the sum of values in each bin. We get that each heavy coordinate i has a large constant probability