1 Introduction
The Boolean Satisfiability problem is NP-complete [Karp1972]. This implies that every other NP-complete problem can be reduced to the Boolean Satisfiability problem in polynomial time. So, if there exists an algorithm that solves the Boolean Satisfiability problem in polynomial time, then every other NP-complete problem can be solved in polynomial time. However, no algorithm has yet been proved to solve the problem in polynomial time. This has led to the formulation of the P versus NP problem, defined by Stephen Cook in [Cook2006]. The history and importance of the P versus NP problem are discussed in detail in [Cook2006].
Various attempts have been made to create efficient algorithms and systems to solve the Boolean Satisfiability problem.
In 1971, [Cook1971] showed that any boolean formula in CNF can be converted, in polynomial time, into a formula with at most three literals per clause, based on the assumption that the number of clauses in the given formula is of polynomial length.
In 1992, [Selman1992] introduced a greedy local search procedure called GSAT and showed that it can solve structural satisfiability problems quickly. However, the inputs used for testing the algorithm contained considerably fewer clauses than are possible for the given number of variables; for example, formulas with 50 variables and only 215 clauses were used, whereas a satisfiable boolean formula with 50 variables can have far more clauses.
In 2001, [Moskewicz2001] described the development of a new complete solver, Chaff, and showed that it obtains a one to two orders of magnitude performance improvement on difficult SAT benchmarks in comparison with other solvers. The experiments in [Moskewicz2001] used benchmark problems, but these problems again contained considerably fewer clauses for the given number of variables.
In 2018, [Yin2018] used problems with 50 variables and 212 clauses for testing, which again contain considerably fewer clauses for the given number of variables.
In 1999, [Friedgut1999] presented sharp thresholds for graph properties and the $k$-SAT problem. $k$-SAT is a special case of the SATISFIABILITY problem defined in [Karp1972], and any boolean formula in $k$-SAT is a special case of a general boolean formula in CNF.
Most of the results presented in the above works used far fewer clauses than are possible for a given number of variables. Further, no study has established a relationship between the number of variables, the number of clauses, and the satisfiability of a general boolean formula in CNF.
In this paper, properties of clauses are studied, novel relationships among clauses are defined, and a necessary and sufficient condition is established that determines the satisfiability of any boolean formula in CNF. Further, it is found that any algorithm that solves the Boolean Satisfiability problem can be divided into two parts: one part generates possible solutions, which has exponential complexity, and the other part is similar to linear search. Thus the combined complexity of any such algorithm is of exponential order, which implies that satisfiability cannot be solved in polynomial time, which in turn implies $P \neq NP$ [Karp1972].
However, the necessary and sufficient condition for satisfiability can be used to optimise existing algorithms, such as DPLL [dpll], by improving the complexity of best-case scenarios.
2 Boolean Satisfiability Problem
As defined in [Karp1972], for given clauses $C_1, C_2, \ldots, C_p$, we need to find whether the conjunction of the given clauses is satisfiable or not.
3 Terminology used
The terms literal, boolean variable, and clause are used with the same meaning as defined in [Heule2015]. A boolean formula in CNF is a conjunction of clauses. It can be represented as a finite set of clauses [Heule2015]. A set of boolean variables is called a variable set.
4 Notations used
Notations representing basic relations between sets are used as defined in [jech2013set].
4.1 Variable cases
If a variable, $X$, can be assigned the values $a$, $b$, and $c$ independently, then it is written as:
$X = \begin{cases} a \\ b \\ c \end{cases}$
5 Tautology Clause
A clause which evaluates to true for every valuation is called a tautology clause. If a clause contains a complemented pair of literals, it is a tautology [Heule2015]. In other words, if $x \in C$ and $\neg x \in C$ for some variable $x$, then $C$ is a tautology clause.
5.1 Significance of tautology clause in satisfiability problem
A tautology clause always evaluates to true, which is represented by 1 in boolean algebra. Let $F$ be a boolean formula in CNF which contains a tautology clause. We can write
(1) $F = C_1 \wedge C_2 \wedge \cdots \wedge C_k \wedge T$
where $T$ is a tautology clause.
Since $T = 1$, by using the identity property of boolean algebra,
(2) $F = C_1 \wedge C_2 \wedge \cdots \wedge C_k \wedge 1 = C_1 \wedge C_2 \wedge \cdots \wedge C_k$
Hence, the tautology clause has no effect on the satisfiability of a boolean formula, so it can be ignored while solving the satisfiability problem.
5.2 NonTautology Clause
A clause which is not a tautology is called a nontautology clause.
Lemma 1
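If $N$ is a nontautology clause, then $x \in N \implies \neg x \notin N$.
Proof
Given that $N$ is not a tautology clause. Let, for the sake of contradiction, $x \in N$ and $\neg x \in N$ for some variable $x$.
Then $N$ is a tautology clause (from the definition), which is not true. So, our assumption is wrong. Hence, $x \in N \implies \neg x \notin N$.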
Lemma 2
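If $C$ is a clause with $n$ literals, such that $l_i = 1$ for some literal $l_i \in C$, then $C = 1$.
Proof
Given that $C$ is a clause, by definition $C$ is a disjunction of literals, so we can write
$C = l_1 \vee l_2 \vee \cdots \vee l_n$
Also, given that $l_i = 1$ for some $i$, we can write
$C = l_1 \vee \cdots \vee l_{i-1} \vee 1 \vee l_{i+1} \vee \cdots \vee l_n$
By using the dominance law of boolean algebra, $C = 1$.
Hence proved.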
Lemma 3
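If $C$ is a clause with $n$ literals, such that $C = 0$, then $l_i = 0$ for every literal $l_i \in C$.
Proof
Given that $C$ is a clause, by definition $C$ is a disjunction of literals, so we can write
$C = l_1 \vee l_2 \vee \cdots \vee l_n$
Also, given that $C = 0$.
Suppose, for the sake of contradiction, that $l_i = 1$ for some $i$. Using Lemma 2, we get $C = 1$,
which is a contradiction. Therefore, the assumption is not true, hence
(3) $l_i = 0, \quad \forall i \in \{1, 2, \ldots, n\}$
Hence proved.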
Theorem 1
If $A$ and $B$ are clauses, such that
$A \subseteq B$
and
$B = 0$
then
$A = 0$
6 Clause over a variable set
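A nontautology clause, $C$, is called a clause over a variable set, $V$, if every variable that occurs in $C$ belongs to $V$.
For example, the clauses $\{x_1\}$ and $\{x_1, \neg x_2\}$ are clauses over the variable set $V = \{x_1, x_2, x_3\}$.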
6.1 Fully Populated Clause over a variable set
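A clause, $C$, is called a fully populated clause over a variable set, $V$, if $C$ is a clause over $V$ and every variable of $V$ occurs in $C$.
For example, the clause $\{x_1, \neg x_2, x_3\}$ is a fully populated clause over the variable set $V = \{x_1, x_2, x_3\}$.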
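The two definitions of this section can be checked mechanically. The following is a minimal sketch, reading "clause over $V$" as "every variable of the clause belongs to $V$" and "fully populated" as "every variable of $V$ occurs in the clause". The encoding is an assumption made for illustration and is not part of the original text: variable $x_i$ is the integer `i`, the literal $\neg x_i$ is `-i`, and a clause is a set of such integers. All function names are hypothetical.

```python
def variables(clause):
    """Return the set of variable indices occurring in a clause."""
    return {abs(lit) for lit in clause}


def is_tautology(clause):
    """A clause is a tautology if it contains a complemented pair of literals."""
    return any(-lit in clause for lit in clause)


def is_clause_over(clause, var_set):
    """Nontautology clause whose variables all belong to var_set."""
    return not is_tautology(clause) and variables(clause) <= set(var_set)


def is_fully_populated(clause, var_set):
    """Nontautology clause in which every variable of var_set occurs."""
    return not is_tautology(clause) and variables(clause) == set(var_set)
```

For instance, with `V = {1, 2, 3}`, the clause `{1, -2}` is a clause over `V` but is not fully populated, while `{1, -2, 3}` is fully populated over `V`.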
Lemma 4
If is a clause over a variable set, , then, , such that, is a fully populated clause over .
Proof
Lemma 5
If is a fully populated clause over a variable set, , then , , such that, is a fully populated clause over variable set
Proof
Given that, is a fully populated clause over a variable set,
Suppose,
is a clause over .
From Lemma 4, , such that, is a fully populated clause over
Hence, , , such that, is a fully populated clause over
Lemma 6
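For any given valuation of a variable set, $V$, there exists a fully populated clause, say $C$, over $V$, such that $C$ evaluates to false.
Proof
Let the variable set, $V$, be given by
$V = \{x_1, x_2, \ldots, x_n\}$
where each $x_i$ takes the value true or false.
Let each $x_i$ be assigned either of the values given above.
Now, we define a clause, $C$, depending upon the valuation assigned above:
$C = l_1 \vee l_2 \vee \cdots \vee l_n$, where $l_i = \neg x_i$ if $x_i$ is true, and $l_i = x_i$ if $x_i$ is false.
$C$ contains exactly one literal for every variable of $V$, so $C$ is a fully populated clause.
By putting the values assigned to the variables of $V$ into $C$, every literal $l_i$ evaluates to false, so we get $C = 0$.
Hence, for any given valuation of the variable set, $V$, there exists a fully populated clause, $C$, such that $C$ evaluates to false.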
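The construction in this proof can be sketched in code. The sketch assumes a signed-integer encoding that is not part of the original text (variable $x_i$ is `i`, literal $\neg x_i$ is `-i`) and represents a valuation as a dictionary from variable index to a boolean. The function names are illustrative only. For each variable, the clause takes the literal that the valuation makes false.

```python
def falsified_clause(valuation):
    """Build the fully populated clause that evaluates to false under valuation.

    valuation: dict mapping variable index -> bool.
    """
    # Pick the negative literal for true variables, the positive one otherwise.
    return frozenset(-v if value else v for v, value in valuation.items())


def eval_clause(clause, valuation):
    """Evaluate a disjunction of literals under a valuation."""
    return any(valuation[abs(lit)] == (lit > 0) for lit in clause)


valuation = {1: True, 2: False, 3: True}
clause = falsified_clause(valuation)  # the clause {-1, 2, -3}
```

Every literal of the resulting clause is false under the valuation, so `eval_clause(clause, valuation)` is `False`, in line with Lemma 3.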
Theorem 2
For a given set of variables, $V$, with $n$ variables, there exist $2^n$ fully populated clauses.
Proof
For a given set of variables, $V$, with $n$ variables, we can write a fully populated clause in the general form
$C = \{l_1, l_2, \ldots, l_n\}$
where $l_i \in \{x_i, \neg x_i\}$,
i.e. each $l_i$ can be assigned a value in two ways, independently. There are $n$ variables in a clause. So, by using the basic principle of counting, there are $2^n$ ways in which a clause can be selected. Hence, for a given variable set, $V$, with $n$ variables, there exist $2^n$ fully populated clauses.
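The counting argument can be illustrated by enumerating the fully populated clauses directly. This is a minimal sketch under an assumed encoding (variable $x_i$ as the integer `i`, literal $\neg x_i$ as `-i`); the function name is hypothetical.

```python
from itertools import product


def fully_populated_clauses(var_set):
    """Enumerate every fully populated clause over var_set.

    Each variable independently contributes its positive or negative literal.
    """
    ordered = sorted(var_set)
    return [
        frozenset(sign * var for sign, var in zip(signs, ordered))
        for signs in product((1, -1), repeat=len(ordered))
    ]


clauses = fully_populated_clauses({1, 2, 3})
```

For three variables the list holds $2^3 = 8$ distinct clauses, and its length doubles with every additional variable.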
7 Sibling Clause
Two unequal fully populated clauses over a common variable set, $V$, are called sibling clauses. In other words, if $A$ and $B$ are two nontautology clauses such that both are fully populated over $V$ and $A \neq B$,
then $A$ is a sibling clause of $B$ and vice versa. For example, $\{x_1, x_2\}$ and $\{x_1, \neg x_2\}$ are sibling clauses over the variable set $V = \{x_1, x_2\}$.
Lemma 7
If $A$ and $B$ are two sibling clauses, and $A = 0$, then $B = 1$.
Proof
Given that and are sibling clauses.
(10) 
Lemma 8
If and are two sibling clauses, over a variable set, , and and are power sets of and , respectively, then
such that, and are sibling clauses.
Proof
Given that, and are sibling clauses over a variable set, . It implies, by definition of sibling clauses, and are fully populated clauses over .
From Lemma 5,
(11) 
such that, is a fully populated clause over . Or we can write,
(12) 
such that, is a fully populated clause over .
Now, We define a set, ,
(13) 
(14) 
as is a fully populated clause over , and ,
Thus, is a fully populated clause over .
(15) 
such that, and are fully populated clauses over a common variable set, .
(16) 
Now, there can be two cases, either or ,
Suppose,
from Eq. 14,
But, given that,
Thus, from Eq. 16, and are two unequal fully populated clauses over a common variable set,
and are sibling clauses.
Hence,
such that, and are sibling clauses.
8 Cardinality of a Boolean formula in CNF
As we know, a clause is a set of literals. For a variable $x$, there are two literals, $x$ and $\neg x$. Let us represent each literal as $l_j$. So, for each variable $x_i$, there are two literals, $l_{2i-1} = x_i$ and $l_{2i} = \neg x_i$. For a variable set, $V$, of $n$ variables, there are $2n$ literals. So, a general clause in a boolean formula can be written in the form
$C \subseteq \{l_1, l_2, \ldots, l_{2n}\}$
where each $l_j$ is either present in $C$ or absent from it.
As each $l_j$ can be selected in two ways, independently, and there are $2n$ literals over $V$, using the fundamental counting principle, the total number of possible clauses is $2^{2n}$.
Hence, the maximum possible cardinality of a boolean formula in CNF is $2^{2n}$, including a null clause, $\phi$.
9 Boolean formula in effective CNF
A boolean formula in CNF, given by $F = C_1 \wedge C_2 \wedge \cdots \wedge C_p$, is called a boolean formula in effective CNF if it does not contain a tautology clause. We can write
(17) $\forall C_i \in F: \; x \in C_i \implies \neg x \notin C_i$
Or, every $C_i$ is a nontautology clause. As discussed in Section 5.1, a tautology clause has no effect on the satisfiability of a CNF. So, for any given boolean formula, if we can identify the tautology clauses and ignore them, we get an effective CNF.
10 A complete boolean formula
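A boolean formula, $F$, containing every possible nontautology clause over a set of variables, $V$, including the null clause, is called a complete boolean formula. For example, for the variable set $V = \{x_1, x_2\}$, the complete boolean formula is given by
$F = \phi \wedge (x_1) \wedge (\neg x_1) \wedge (x_2) \wedge (\neg x_2) \wedge (x_1 \vee x_2) \wedge (x_1 \vee \neg x_2) \wedge (\neg x_1 \vee x_2) \wedge (\neg x_1 \vee \neg x_2)$
where $\phi$ is a null clause.
In set notation, it can be written as:
$F = \{\phi, \{x_1\}, \{\neg x_1\}, \{x_2\}, \{\neg x_2\}, \{x_1, x_2\}, \{x_1, \neg x_2\}, \{\neg x_1, x_2\}, \{\neg x_1, \neg x_2\}\}$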
Theorem 3
If $F$ is a complete boolean formula over a variable set, $V$, of $n$ variables, then $F$ contains $3^n$ clauses, including a null clause.
Proof
Given that $F$ is a complete boolean formula over $V$, and $V$ contains $n$ variables.
From the definition of a complete boolean formula, we know that $F$ contains every possible nontautology clause over $V$.
We can write a general clause in $F$ as
(18) $C = \{l_1, l_2, \ldots, l_n\}$
where
$l_i$ is a variable which can be assigned the values $x_i$, $\neg x_i$, or $\phi$ independently. A value $\phi$ for $l_i$ means that the clause $C$ contains neither $x_i$ nor $\neg x_i$.
Now, each $l_i$ can be assigned a value in 3 different ways. By using the fundamental counting principle, the total number of possible clauses is $3^n$.
Also, there will be a clause in which every $l_i$ is assigned the value $\phi$. It results in the null clause, $\phi$. Hence, $F$ contains $3^n$ clauses, including a null clause.
Corollary 1
If $F$ is a complete boolean formula over a variable set, $V$, with $n$ variables, then, for any given variable $x_i \in V$,
$N(x_i) = N(\neg x_i) = N(\phi_i) = 3^{n-1}$
where $N(x_i)$ is the number of clauses containing $x_i$,
$N(\neg x_i)$ is the number of clauses containing $\neg x_i$, and
$N(\phi_i)$ is the number of clauses containing neither $x_i$ nor $\neg x_i$.
Proof
As explained in Theorem 3, for the given complete boolean formula $F$, with $n$ variables, we can write a clause in the form
(19) $C = \{l_1, l_2, \ldots, l_n\}$
where $l_i \in \{x_i, \neg x_i, \phi\}$.
Now, suppose we put $l_i = x_i$ for some $i$ in Eq. 19.
We have assigned a value to one of the variables. There are $n - 1$ variables remaining, to which we can assign values independently. Each variable can be assigned 3 values independently. Thus, by using the fundamental counting principle, the total number of clauses with $l_i = x_i$ is given by
$N(x_i) = 3^{n-1}$
Similarly, by assigning $l_i = \neg x_i$ and $l_i = \phi$, we find
$N(\neg x_i) = 3^{n-1}$ and $N(\phi_i) = 3^{n-1}$
Hence, we get
$N(x_i) = N(\neg x_i) = N(\phi_i) = 3^{n-1}$
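The counts in Theorem 3 and Corollary 1 can be checked by brute-force enumeration for small $n$. The sketch below assumes a signed-integer encoding that is not part of the original text (variable $x_i$ as `i`, literal $\neg x_i$ as `-i`, the null clause as the empty set); the function names are hypothetical.

```python
from itertools import product


def complete_formula(var_set):
    """Enumerate every nontautology clause over var_set, including the null clause.

    Each variable is independently positive (1), negated (-1), or absent (0).
    """
    ordered = sorted(var_set)
    return {
        frozenset(choice * var for choice, var in zip(choices, ordered) if choice != 0)
        for choices in product((1, -1, 0), repeat=len(ordered))
    }


def literal_counts(formula, var):
    """Count clauses containing var, containing -var, and containing neither."""
    positive = sum(1 for clause in formula if var in clause)
    negative = sum(1 for clause in formula if -var in clause)
    neither = sum(1 for clause in formula if var not in clause and -var not in clause)
    return positive, negative, neither


formula = complete_formula({1, 2, 3})
counts = literal_counts(formula, 1)
```

For three variables the formula has $3^3 = 27$ clauses, and each of the three counts for any variable is $3^2 = 9$.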
Corollary 2
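A complete boolean formula, $F$, can be written as:
$F = \bigcup_{C \in S} P(C)$
where $S$ is the set of all possible fully populated clauses over a set of variables, $V$, and $P(C)$ is the power set of $C$.
Proof
As explained in Theorem 3, for the given complete boolean formula $F$, with $n$ variables, we can write a clause in the form
(20) $C = \{l_1, l_2, \ldots, l_n\}$
where $l_i \in \{x_i, \neg x_i, \phi\}$.
First, if we assign $l_i \in \{x_i, \neg x_i\}$ for every $i$,
we get the set of all fully populated clauses over $V$, say $S$, given by
$S = \{C_1, C_2, \ldots, C_{2^n}\}$
Then, for a clause $C \in S$, we assign $l_i = \phi$ for any subset of the indices $i$.
We get the power set of the clause $C$. By assigning values as above for every $C \in S$, we get all possible clauses over $V$. Thus, we can write:
(21) $F = \bigcup_{C \in S} P(C)$
Hence proved.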
Theorem 4
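If $P(C)$ is the power set of $C$, where $C$ is a fully populated clause over a variable set, $V$, with $n$ variables, then, for any literal $l_i \in C$,
$N(l_i) = \overline{N}(l_i) = 2^{n-1}$
where
$N(l_i)$ is the number of clauses containing $l_i$ in $P(C)$, and
$\overline{N}(l_i)$ is the number of clauses not containing $l_i$ in $P(C)$.
Proof
Given that $C$ is a fully populated clause over a variable set, $V$, with $n$ variables, and $P(C)$ is the power set of $C$. Let $c \in P(C)$. We can write $c$ in the general form
$c = \{m_1, m_2, \ldots, m_n\}$
where $m_i \in \{l_i, \phi\}$.
Now, if we put $m_i = l_i$ for some $i$,
we have assigned a value to one of the variables. There are $n - 1$ variables remaining, to which we can assign values independently. Each of them can be assigned 2 values independently. Thus, by using the fundamental counting principle, the total number of clauses with $m_i = l_i$ is given by
$N(l_i) = 2^{n-1}$
Similarly, by assigning $m_i = \phi$, we find
$\overline{N}(l_i) = 2^{n-1}$
Hence, we get
$N(l_i) = \overline{N}(l_i) = 2^{n-1}$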
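The count in Theorem 4 can be confirmed for a small clause by building its power set explicitly. This is an illustrative sketch under an assumed signed-integer encoding (variable $x_i$ as `i`, literal $\neg x_i$ as `-i`); the function name is hypothetical.

```python
from itertools import combinations


def power_set(clause):
    """Return every subset of a clause, including the empty (null) clause."""
    literals = sorted(clause)
    return {
        frozenset(subset)
        for size in range(len(literals) + 1)
        for subset in combinations(literals, size)
    }


clause = frozenset({1, -2, 3})          # fully populated over {1, 2, 3}
subsets = power_set(clause)
containing = sum(1 for s in subsets if 1 in s)
not_containing = sum(1 for s in subsets if 1 not in s)
```

Here the power set has $2^3 = 8$ members, of which $2^2 = 4$ contain the literal `1` and 4 do not.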
Theorem 5
If there exists a fully populated clause, $C$, over $V$, such that
$F = F_c \setminus P(C)$
where $F_c$ is a complete boolean formula over $V$, then $F$ is satisfiable.
Proof
Given that,
Suppose is any clause in , i.e.
Let is a set of all fully populated clauses over , then from Corollary 2
As, are unequal fully populated clauses over , which implies, from the definition of sibling clauses, are sibling clauses, including .
As is any clause in
such that, and are sibling clauses.
As, is a fully populated clause, so for a valuation, given by,
from Theorem 1
As and are sibling clauses, from Lemma 7
for valuation
is satisfiable.
Theorem 6
If $F$ is satisfiable, and $F' \subseteq F$, then $F'$ is satisfiable.
Proof
Given that,
As is satisfiable. It implies, there exists a valuation, such that,
Hence, is satisfiable.
Theorem 7
If $F$ is satisfiable, then there exists a fully populated clause, $C$, such that $P(C) \cap F = \emptyset$.
Proof
Given that, is satisfiable.
Now, suppose, for the sake of contradiction, that, there does not exist a fully populated clause, , such that,
From Lemma 6, for any valuation, to the variable set, ,
for any valuation, to the variable set, , in which,
, from Theorem 1, for any valuation, to the variable set, , in which,
for any valuation, to the variable set, ,
is unsatisfiable. Which is a contradiction, so our assumption was wrong. Hence, there exist a fully populated clause, , such that,
Hence proved.
Theorem 8
$F$ is satisfiable if and only if there exists a fully populated clause, $C$, such that $P(C) \cap F = \emptyset$, i.e. neither $C$ nor any of its subsets occurs in $F$.
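The condition of Theorem 8 can be turned into a small search procedure and compared against plain truth-table evaluation. The sketch below is an illustration only, assuming a signed-integer encoding that is not part of the original text (variable $x_i$ as `i`, literal $\neg x_i$ as `-i`). It uses the observation that $P(C) \cap F = \emptyset$ holds exactly when no clause of $F$ is a subset of $C$, so the power set is never materialised. The function names are hypothetical.

```python
from itertools import product


def find_witness_clause(formula, var_set):
    """Return a fully populated clause C with no clause of formula inside P(C).

    Returns None if no such clause exists.
    """
    ordered = sorted(var_set)
    clauses = [frozenset(c) for c in formula]
    for signs in product((1, -1), repeat=len(ordered)):
        candidate = frozenset(s * v for s, v in zip(signs, ordered))
        # A clause lies in P(candidate) exactly when it is a subset of candidate.
        if not any(c <= candidate for c in clauses):
            return candidate
    return None


def brute_force_sat(formula, var_set):
    """Reference check: try every valuation of var_set."""
    ordered = sorted(var_set)
    for bits in product((False, True), repeat=len(ordered)):
        valuation = dict(zip(ordered, bits))
        if all(any(valuation[abs(l)] == (l > 0) for l in c) for c in formula):
            return True
    return False
```

For the formula `[{1, 2}, {-1}]` the search returns the clause `{1, -2}`; making every literal of that clause false (so $x_1$ false, $x_2$ true) satisfies the formula. Note that the loop visits up to $2^n$ candidate clauses.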
11 Time complexity of an algorithm to solve boolean satisfiability problem
Theorem 9
The time complexity of an algorithm to solve the boolean satisfiability problem is $O(2^n)$.
Proof
Theorem 8 provides the necessary and sufficient condition for the satisfiability of a boolean formula. It states that a boolean formula is satisfiable iff there exists a fully populated clause, $C$, such that the clause and all its subsets are absent from the given boolean formula.
Moreover, a valuation under which $C$ evaluates to false makes the formula satisfiable. So, an algorithm is not required to check for the absence of each subset of $C$ individually, as all subsets are related to each other by the result of Theorem 1. However, if an algorithm processes the individual clauses of a given boolean formula, it would need to process an exponential number of clauses in worst-case scenarios. But the existence of each fully populated clause is independent of the existence of the other clauses. So, any algorithm has to search for a fully populated clause, $C$, in the list of all possible fully populated clauses for the variable set, $V$, over which the boolean formula is defined. This implies that the boolean satisfiability problem is basically a searching problem.
Further, the input of the satisfiability problem is not stated to be in sorted form [Karp1972]. The algorithm may attempt to sort the input, but this leads to an additional complexity of $O(N \log N)$, where $N$ is the number of elements in the list. From Theorem 2, we know that the number of fully populated clauses for a variable set, $V$, of $n$ variables is $2^n$. This implies an additional complexity of $O(2^n \log 2^n)$, or $O(n 2^n)$. So, the problem can be solved using linear search only. The time complexity of linear search is $O(N)$, where $N$ is the number of items in the list. Since the number of fully populated clauses is $2^n$, the time complexity of an algorithm to solve the boolean satisfiability problem is $O(2^n)$. Hence proved.
12 Implications on P vs NP Problem
In [Karp1972] it has been established that SATISFIABILITY is an NP-complete problem. In Theorem 9, it has been proved that the time complexity of an algorithm to solve the boolean satisfiability problem is $O(2^n)$, which is not polynomial. This implies $\text{SATISFIABILITY} \notin P$.
From Corollary 1 in [Karp1972], it follows that $P \neq NP$.
13 Cardinality Function
The cardinality function can be used to determine the cardinality of a given boolean formula. It can be used to optimise existing algorithms. For a given boolean formula, $F$, in CNF, we define a function, $f$, in which every literal is treated as an integer variable, by replacing disjunction with multiplication and conjunction with addition. For example, let the boolean formula, $F$, be given by
(22) 
We define the function, , given by:
(23) 
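In general, $f$ can be defined as:
(24) $f = \sum_{i=1}^{p} t_i$
where $p$ is the number of clauses in $F$ and
(25) $t_i = \prod_{l \in C_i} l$
where $C_i$ is the $i$-th clause in the given boolean formula, $F$. It is to be noted that $f$ is a function on integers, i.e. every literal takes an integer value and $f$ returns an integer.
The algorithm for $f$ for the boolean formula given in Eq. 22 is given below: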
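As an illustration of the definition above, the following minimal sketch evaluates a sum of products over the clauses. It is not the original listing. The formula used, the signed-integer encoding (variable $x_i$ as `i`, literal $\neg x_i$ as `-i`), and the function name are all assumptions made for the example. Each literal is given its own integer value, so $x_i$ and $\neg x_i$ are independent inputs.

```python
from math import prod


def cardinality_function(formula, value):
    """Sum over clauses of the product of the integer values of their literals.

    formula: iterable of clauses, each a collection of signed integers.
    value: dict mapping every literal to an integer.
    """
    return sum(prod(value[lit] for lit in clause) for clause in formula)


# Hypothetical example formula: (x1 v x2) ^ (~x1 v x3) ^ (x2 v ~x3)
formula = [{1, 2}, {-1, 3}, {2, -3}]
all_ones = {lit: 1 for var in (1, 2, 3) for lit in (var, -var)}
total_clauses = cardinality_function(formula, all_ones)
```

With every literal set to 1, each product is 1, so the function returns the number of clauses (3 for this example).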
13.1 Total number of clauses
The following algorithm can be used to find the total number of clauses in $F$.
13.2 Total number of clauses containing $x_i$
The following algorithm can be used to find the total number of clauses containing $x_i$ in $F$.
13.3 Total number of clauses containing $\neg x_i$
The following algorithm can be used to find the total number of clauses containing $\neg x_i$ in $F$.
13.4 Total number of clauses containing $x_i$ or $\neg x_i$
The following algorithm can be used to find the total number of clauses containing $x_i$ or $\neg x_i$ in $F$.
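One way such counts could be obtained from a sum-of-products function is sketched below. It is an assumption about the approach, not the original algorithms: setting a literal's value to 0 removes every clause that contains it from the sum, so the difference from the all-ones total is the number of clauses containing that literal. The encoding (variable $x_i$ as `i`, literal $\neg x_i$ as `-i`) and all names are hypothetical.

```python
from math import prod


def f(formula, value):
    """Sum over clauses of the product of literal values."""
    return sum(prod(value[lit] for lit in clause) for clause in formula)


def count_containing(formula, variables, zeroed):
    """Count the clauses that contain at least one literal from zeroed."""
    ones = {lit: 1 for var in variables for lit in (var, -var)}
    total = f(formula, ones)
    # Clauses containing a zeroed literal contribute 0 to the second sum.
    remaining = f(formula, {**ones, **{lit: 0 for lit in zeroed}})
    return total - remaining


formula = [{1, 2}, {-1, 3}, {2, -3}]
variables = [1, 2, 3]
with_x1 = count_containing(formula, variables, [1])
with_not_x1 = count_containing(formula, variables, [-1])
with_either = count_containing(formula, variables, [1, -1])
```

Each call evaluates the function twice, so its cost is linear in the total number of literals in the formula.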
13.5 Checking tautology clauses in $F$
The following algorithm can be used to check the existence of tautology clauses in a given formula, $F$, in polynomial time.
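A tautology check of this kind can be sketched as a single pass over the literals of each clause. This is an illustrative sketch, not the original listing, and it does not use the cardinality function. It assumes a signed-integer encoding (variable $x_i$ as `i`, literal $\neg x_i$ as `-i`); the function name is hypothetical.

```python
def has_tautology_clause(formula):
    """Return True if some clause contains a complemented pair of literals."""
    for clause in formula:
        seen = set()
        for lit in clause:
            # A complement already seen in this clause makes it a tautology.
            if -lit in seen:
                return True
            seen.add(lit)
    return False
```

The check visits every literal once and uses hash-set lookups, so its expected running time is linear in the total number of literals in the formula.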
14 Optimisations
The results of the above theorems, together with the algorithms based on the novel cardinality function, can be used to optimise any algorithm that solves the satisfiability problem.
From Section 8, we know that the maximum possible cardinality of a boolean formula in CNF is $2^{2n}$, including a null clause, $\phi$.
We know that the cardinality of a power set, $P(C)$, is $2^{|C|}$,
where $|C| = n$, as $C$ is a fully populated clause.
By Theorem 8, a satisfiable formula contains none of the $2^n$ clauses in $P(C)$ for some fully populated clause $C$. So, we can use the following constraint for optimisation:
$|F| \le 2^{2n} - 2^n$
If $|F| > 2^{2n} - 2^n$ then $F$ is unsatisfiable.
If $F$ is in effective CNF, which can be checked using Algorithm 6, then by Theorem 3 we have:
$|F| \le 3^n - 2^n$
If $|F| > 3^n - 2^n$ then $F$ is unsatisfiable.

If then is unsatisfiable.

If then , belong to the solution

If then , belong to the solution
15 Conclusion
A necessary and sufficient condition has been established to determine the satisfiability of a boolean formula in CNF. It has been found that a satisfiable boolean formula with $n$ variables can have a number of clauses that is exponential in $n$. This property can be used to improve encryption algorithms [Cook2006]. The same condition can also be used to optimise existing algorithms for solving the Boolean Satisfiability problem, which has applications in automatic theorem proving procedures [dpll].