Knapsack Cryptosystems (Part 1)

When people think of public key cryptosystems today, the classic algorithms of RSA and Diffie-Hellman jump to mind. However, an oft-forgotten public key cryptosystem developed around the same time as RSA and Diffie-Hellman was the knapsack-based cryptosystem by Ralph Merkel and Martin Hellman. While knapsack cryptosystems have been broken by lattice-based techniques (to be discussed in a future post), their construction is still quite elegant and noteworthy. In this two-part sequence, I will outline the construction of the knapsack cryptosystem.

Knapsack cryptosystems represent one of the first examples of using an NP-complete problem to develop a secure public key scheme. In this case, we use the subset-set problem. In the subset-sum problem, we are given a list of positive integers X=(x_1,x_2,\ldots,x_n) and a target value y. The problem is to find a subset X'=(x'_1,x'_2,\ldots,x'_k)\subseteq X such that \sum_{i=1}^k x'_i=y. As a small example, let X=(3,5,6,10). If y=9, then there exists a subset of X whose sum is y=9, namely the subset X'=(3,6). If instead however, y=12, then there is no subset of X that sums to y. Note that in this case, we are not allowed to use the 6 twice. (As a side note, the subset-sum problem can be viewed as a special case of the 0-1 knapsack problem where the weights and values of each item are equal. This is where the cryptosystem derives its name.)

While the subset-sum problem is NP-complete in general, for certain sets X, the problem can be solved efficiently. One particular case is when the set of numbers X is superincreasing.

Definition. Let X=(x_1,x_2,\ldots,x_n) be a sequence of positive numbers where x_1 \le x_2 \le \cdots \le x_n. Then, we say that X is superincreasing if for all 1\le i\le n-1, we have 2\cdot x_i< x_{i+1}.

For instance, the set (1,3,7,20,50,120) is superincreasing (each number is more than twice the previous number).

A useful property of superincreasing sets is that \sum_{i=1}^k x_i< x_{k+1} for all k=1,\ldots, n-1.

Lemma. Let X=(x_1,x_2,\ldots,x_n) be a superincreasing sequence of positive numbers where x_1 \le x_2 \le \cdots \le x_n. Then, for all 1\le k<n, \sum_{i=1}^k x_i< x_{k+1}.

Proof. This is easily seen via induction on k. For k=1, we have that x_1 < 2\cdot x_1 < \cdot x_2 using the fact that x_1 is positive and the sequence is superincreasing. Now, for the inductive step, we have

\sum_{i=1}^{k+1}x_i=\sum_{i=1}^k x_i+x_{k+1}<x_{k+1}+x_{k+1}=2\cdot x_{k+1}<x_{k+2}

using the inductive hypothesis and the fact that the sequence of x_i is superincreasing.

It is not difficult to see that if X is a superincreasing set, the subset-sum problem becomes easy. Specifically, given a superincreasing set X and a target value y, we proceed as follows:

  1. Without loss of generality, let X=(x_1,x_2,\ldots,x_n) where x_1 < x_2 < \cdots < x_n. Initialize X'=().
  2. If x_n \le y, then add x_n to the set X' and subtract x_n from y. If y=0, then we have found our subset X'. Repeat this process for each of x_{n-1},x_{n-2},\ldots, x_1.

At the conclusion of the algorithm, either we will have a subset X', or we will not. If we do not, then we conclude that no such subset exists. Observe that the above algorithm is linear in the cardinality of X, so this is quite efficient.

In words, at each step, the algorithm selects the largest element in the set X that does not exceed the target y and adds it to the subset. The value of the target y is then updated. To see that this works, suppose that we have x_1<\cdots<x_k<y<x_{k+1}<\cdots < x_n. We claim that x_k must be in any subset that sums to y. First, since x_{k+1},\ldots,x_n are greater than y, they cannot be in the subset. If x_k is not in the subset, then the maximum sum of the subset is given by

\sum_{i=1}^{k-1} x_i<x_k<y

using the above lemma. Thus, any subset that does not contain x_k cannot sum to y. Thus, we see that if a subset summing to y exists, then it must contain the maximum element in the set that is less than y. To summarize, we have demonstrated that the subset problem is easy over superincreasing sets.

In the next post, I will show how we can exploit this property to construct a public key cryptosystem based on the hardness of the general subset-sum problem. The idea here is that we can choose the set X such that we can solve the subset-set problem over X by converting it to an equivalent subset-sum problem over a superincreasing set X' using some secret information (e.g., the secret key).


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s