1. State Key Laboratory of Complex and Critical Software Environment, Beihang University, Beijing 100191, China
2. Department of Mathematics and Statistics, Beijing Technology and Business University, Beijing 100048, China
kexu@buaa.edu.cn
Show less
History+
Received
Accepted
Published
2025-03-07
2025-05-18
Issue Date
Revised Date
2025-07-01
PDF
(483KB)
Abstract
In this paper, we identify the distinction between non-brute-force computation and brute-force computation as the most fundamental problem in computer science. Subsequently, we prove, by the diagonalization method, that constructed self-referential CSPs cannot be solved by non-brute-force computation, which is stronger than P NP. This constructive method for proving impossibility results is very different (and missing) from existing approaches in computational complexity theory, but aligns with Gödel’s technique for proving logical impossibility. Just as Gödel showed that proving formal unprovability is feasible in mathematics, our results show that proving computational hardness is not hard in mathematics. Specifically, proving lower bounds for many problems, such as 3-SAT, can be challenging because these problems have various effective strategies available to avoid exhaustive search. However, for self-referential examples that are extremely hard, exhaustive search becomes unavoidable, making its necessity easier to prove. Consequently, it renders the separation between non-brute-force computation and brute-force computation much simpler than that between P and NP. Finally, our results are akin to Gödel’s incompleteness theorem, as they reveal the limits of reasoning and highlight the intrinsic distinction between syntax and semantics.
Model RB is a random constraint satisfaction problem (CSP) model that was proposed by Xu and Li [1] in 2000, which could also be encoded to well-known NP-complete problems like SAT and CLIQUE. The purpose of this model was to address the issue of trivial unsatisfiability that was prevalent in previous random CSP models. One of the key features of Model RB is that its domain size grows with the number of variables . Additionally, Model RB has been proved to exhibit exact phase transitions from satisfiability to unsatisfiability, making it a useful tool for analyzing and evaluating the performance of algorithms. Over the last two decades, Model RB has been extensively researched from multiple perspectives, as evidenced by various studies (e.g., [2–21]). Moreover, this model has gained significant popularity and widespread use in renowned international algorithm competitions. A random instance of Model RB with a planted solution named frb100-40, where and , has remained elusive since it was made available online in 2005 as a 20-year challenge for algorithms. Despite numerous attempts, no one has been able to solve it thus far. In summary, the results suggest that Model RB possesses nice mathematical properties that can be easily derived. In contrast to its mathematical tractability, the random instances of this model, particularly those generated in the phase transition region, present a significant challenge for various algorithms, proving to be extremely difficult to solve.
As shown in the proof of Gödel’s incompleteness theorem [22], the constructive approach plays an indispensable role in revealing the fundamental limitations of finite formal systems. An algorithm is essentially a mechanized finite formal system. In this paper, we will study the limitations of algorithms based on the constructive approach. Specifically, we will explore whether non-exhaustive (non-brute-force) algorithms can always replace exhaustive (brute-force) ones, or if some computable problems inherently lack such alternatives. The Strong Exponential Time Hypothesis [23] conjectures that there is no non-brute-force algorithm for SAT. This problem is very similar to the foundational problem resolved by Gödel’s incompleteness theorem, originally proposed by David Hilbert nearly a century ago: whether a finite formal system can always replace a branch of mathematics (e.g., arithmetic) that contains infinitely many true mathematical statements. These two problems raise essentially the same deep philosophical question: can the part always replace the whole within the limits of reasoning? This inquiry concerns the limits of human knowledge—a subject extensively explored by many great philosophers (e.g., Laozi, Zeno, Socrates, Descartes, Kant, and Wittgenstein). Therefore, the most fundamental problem in computer science is non-brute-force computation vs. brute-force computation, rather than P vs. NP. From a mathematical perspective, the distinction between non-brute-force computation, which takes time (where is a constant), and brute-force computation, which takes time, seems more natural and intuitive compared to that between P and NP. From a practical standpoint, the framework of non-brute-force computation vs. brute-force computation is also broader and more universally applicable than that of P vs. NP. For instance, the P vs. NP paradigm does not apply to problems where brute-force algorithms run in polynomial time or those that lie outside NP. However, even in such cases, we can still explore the possibility of developing non-brute-force algorithms. Finally, from a historical perspective, non-brute-force computation vs. brute-force computation extends Gödel’s framework, which unveils the limits of mathematics. Similarly, P vs. NP builds on Turing’s framework, shedding light on the limits of machines. The limits of mathematics are more fundamental than those of machines, as anything mathematics cannot achieve is also beyond the reach of machines.
The advantages of Model RB enable us to choose specific threshold points at which instances with a symmetry requirement are on the edge of being satisfiable and unsatisfiable. In fact, we will show that there exist instances at exactly the same point which are either satisfiable with exactly one solution or unsatisfiable but only fail on one constraint. The satisfiability of such instances can be flipped under a special symmetry mapping. As a result, they form a fixed point set under this mapping, allowing us to create the most indistinguishable examples (self-referential examples) which are a source of computational hardness. Based on the symmetry mapping and driven by the famous method of diagonalization and self-reference, we show that unless exhaustive search is executed, the satisfiability of a certain constraint (thus the whole instance) is possible to be changed, while the subproblems of the whole instance remain unchanged. Therefore, whether the whole instance is satisfiable or unsatisfiable cannot be distinguished without exhaustive search. In summary, if we can construct the most indistinguishable examples with exactly the same method and the same parameter values (a task that is both rare and challenging), then it is not hard to understand and prove why they are extremely hard to solve.
2 Model RB
A random instance of Model RB consists of the following:
● A set of variables : Each variable takes values from its domain , and the domain size is , where for , and is a constant.
● A set of constraints (, where is a constant): for each , constraint . ( is a constant) is a sequence of distinct variables chosen uniformly at random without repetition from . is the permitted set of tuples of values which are selected uniformly without repetition from the subsets of , and where is a constant.
In this paper, we have a symmetry requirement of the permitted set of each constraint, and the permitted sets will be generated in the following way. Initially, we generate a symmetry set which contains tuples of values, then generate each permitted set of the constraint () by running random permutations of domains of variables in based on . For example, if and the domains are , then is a symmetry set. If we run a random permutation of , e.g., , then we get a permitted set . Through this method all are isomorphic and every domain value of the variables shares the same properties.
A constraint is said to be satisfied by an assignment if the values assigned to are in the set . An assignment is called a solution if it satisfies all the constraints. is called satisfiable if there exists a solution, and called unsatisfiable if there is no solution. It has been proved that Model RB not only avoids the trivial asymptotic behavior but also has exact phase transitions from satisfiability to unsatisfiability. Indeed, denote the probability that a random instance of Model RB is satisfiable, then
Theorem 2.1 ([1]). Let . If , are two constants and satisfy the inequality , then
In the following we will present some properties of Model RB which are important to prove our main theorems in the next section. From here on we tacitly take , where , and take
where . In fact note that and a simple calculation yields that for all under the condition that . Thus it is possible to take large enough such that .
First, we bound the probability that a random RB instance is satisfiable.
Lemma 2.2 Let be a random CSP instance of Model RB with variables and constraints. Then
where , and is the number of variables for which an assignment pair take the same values. Note that , using an argument similar to that in [1,7], we obtain that only the terms near and are not negligible, and
Indeed, asymptotic calculations show that
where is an integer. Note that and are constants, thus the upper bound of (2.4) comes from and .
Using the Cauchy inequality, we get . □
As an immediate consequence of Lemma 2.2 we obtain a lower bound of the probability that has exactly one solution.
Corollary 2.3 Let be a random CSP instance of Model RB with variables and constraints. Then the probability that has exactly one solution is at least .
Proof Let be the probability that has exactly one solution, and be the probability that has at least two solutions. Then from Lemma 2.2, we have
Therefore . □
Next we show that if a random instance is unsatisfiable, then w.h.p. it fails at only one constraint. We introduce the following definitions.
Definition 2.1 Let be a CSP instance. A constraint is called a self-unsatisfiable constraint if there exists an assignment under which is the only unsatisfied constraint in . If variable is contained in , then is called a self-unsatisfiable variable. If is unsatisfiable and every variable is a self-unsatisfiable variable, then is called a self-unsatisfiable formula.
Lemma 2.4 Let be a random CSP instance of Model RB with variables and constraints. If is unsatisfiable, then w.h.p. is a self-unsatisfiable formula.
Proof First we show that for any constraint of , with positive probability there exists an assignment which satisfies all constraints except . In fact let be the number of such assignments, then
Then . Therefore the probability that is a self-unsatisfiable constraint is at least .
Next, since the number of constraints is , the average degree of each variable is . By the Chernoff Bound,
where Deg denotes the degree of variable . From the requirement (2.1) we know that , thus
Therefore, almost surely all variables have degree at least .
Furthermore, note that each variable appears in at least constraints and the probability that each constraint appears to be a self-unsatisfiable constraint is at least , thus the probability that is not a self-unsatisfiable variable is at most .
Note that (2.1) entails that , therefore the probability that there exists a variable which is not self-unsatisfiable is at most
Thus w.h.p. all variables are self-unsatisfiable variables. □
Remark 2.1 We claim that Lemmas 2.2 and 2.4 hold for any domain size greater than polynomial. Take exponential domain size (where is a constant) for example, similarly, the dominant contributions of come from and , and asymptotic calculations show that and
are negligible for small integer , since . Moreover, probability analysis holds more easily in the proof of Lemma 2.4 if ((2.6) and (2.7)).
Next we define a symmetry mapping of a constraint which changes its permitted set slightly.
Definition 2.2 Consider a random instance of Model RB with . Assume that is a constraint of and , then a symmetry mapping of is to change by choosing (), (), where and , and then exchanging with (see Fig.1).
With the above properties of Model RB, we obtain the following interesting results.
Theorem 2.5 There exists an infinite set of satisfiable and unsatisfiable instances of Model RB such that this set is a fixed point under the symmetry mapping of changing satisfiability.
Proof Let be the set of RB instances with variables and constraints, where each instance either has a unique solution or has no solution.
Assume that has exactly one solution , which happens with probability at least from Corollary 2.3. For an arbitrary constraint , assume without loss that and are the domains of and , respectively. By the symmetry requirement, there exist such that , then we will exchange with . It is easy to see that this symmetry mapping will convert into an unpermitted tuple and convert into permitted tuples, thus is no longer a solution. However, it is possible that at most pairs can be expanded to new solutions (this is because by the symmetry requirement, the degree of each domain value of each variable is ). Specifically, the probability that can be expanded to a new solution is at most , thus a simple calculation yields that the probability that is still satisfiable after the symmetry mapping is at most .
Assume that is an unsatisfiable instance, then w.h.p. all variables in are self-unsatisfiable variables from Lemma 2.4. This implies that there exist a constraint and an assignment such that is the only unsatisfied one under . Assume without loss that and are the domains of and , respectively. It is apparent that . By our symmetry requirement, there exist , where and , such that a symmetry mapping of exchanging with will convert into a permitted tuple, thus becomes satisfiable under . Moreover, using a similar argument as above, we can see that the probability that the new pairs could expand to solutions is at most . Thus w.h.p. has only one solution after this symmetry mapping.
From the above two cases we can see that for any , the symmetry mapping changes its satisfiability, however, still belongs to after the mapping, thus can be considered as a fixed point under the symmetry mapping. □
In this section, it has been shown that we can construct satisfiable and unsatisfiable instances using exactly the same method and the same parameter values. Moreover, these satisfiable and unsatisfiable instances can be transformed into each other by performing the same mapping. This property is very similar to that of the self-referential proposition introduced by Gödel [22] in order to prove that such a proposition can be neither proved nor disproved (i.e., whether this proposition is true or false cannot be distinguished in finite formal systems). Gödel’s results reveal the fundamental difference between the syntax defined by rules and the semantics defined by models (or assignments). An algorithm is essentially a finite sequence of rules, which can also be viewed as a finite formal system. In contrast, the exhaustive search method determines, using the semantic definition of a property, whether this property can be satisfied through trying every possible model (or assignment) one by one. Inspired by Gödel’s idea, we refer to these satisfiable and unsatisfiable instances as the most indistinguishable examples or self-referential examples. Their self-referential property makes them extremely hard to differentiate syntactically. Simply put, in this context, syntax cannot replace semantics.
3 Main results
Proving complexity lower bounds (algorithmic impossibility results) for a given problem is essentially reasoning and making conclusions about an infinite set of algorithms. In mathematics, any such conclusion should be based on assumptions about the nature of the infinite set. These assumptions must be consistent with the reality and usually appear as axioms. Similar to many combinatorial problems, the general CSP has no global structure that can be exploited to design algorithms. The only exact algorithm currently available for solving CSPs is a divide-and-conquer method that systematically explores the solution space while employing various pruning strategies to enhance efficiency.
In this paper, we view an algorithm as a finite formal system which is defined by a finite set of symbols and rules. It is easy to see that finite formal systems possess greater expressive power than algorithms (Turing machines). This is because finite formal systems are essentially finite sets of rules while algorithms are essentially finite sequences of rules. Here we use finite formal systems to solve CSPs (i.e., to determine if a CSP instance is satisfiable). We only need to assume that this task is finished by dividing the original problem into subproblems. Under this assumption, we have the following lemma.
Lemma 3.1 If a CSP problem with variables and domain size can be solved in time ( is a constant), then at most subproblems with variables are needed to solve the original problem.
Proof It is easy to see that subproblems with variables are sufficient to solve the original problem. By condition, a CSP problem with variables can be solved in time. Note that
thus at most subproblems with variables are needed to solve the original problem. □
The above lemma is a key to proving the main results of this paper. For a better understanding, we take the classical sorting problem as an example. Assume that our goal is to prove that sorting elements cannot be done in time. By condition, we have . This means that we need a subproblem of elements with additional operations to solve the original problem. We can show by contradiction that this is impossible and so finish the proof.
Theorem 3.2 Model RB cannot be solved in time for any constant .
Proof For the sake of simplicity, we will prove that Theorem 3.2 holds for Model RB with by contradiction. Let be a RB instance with variables and constraints. Suppose there exists some constant such that can be solved in time, then Lemma 3.1 implies that can also be solved by assigning at most values to an arbitrary variable, say , and then solving the resulting subproblems (with variables) which require time. We will show that there exist instances where the subproblems produced by assigning values to are impossible to determine its satisfiability. For convenience, let be the domain of , be the set of values which have been assigned to (), and be the set of constraints containing .
Follow the strategy of the proof of Theorem 2.5, if is an instance having exactly one solution , then the constraint will be arbitrarily selected from . Note that is compared with the domain size , thus the probability that belongs to is . Therefore the symmetry mapping in Theorem 2.5 will be performed by exchanging with , where is chosen from , then becomes unsatisfiable. Similarly, if is an unsatisfiable instance, then the symmetry mapping of exchanging with , where is chosen from , will convert into a permitted tuple, thus becomes satisfiable.
Note that in either of the above cases, the subproblems with variables remain unchanged, while the satisfiability of the original problem with variables has been changed after performing the symmetry mapping. We can conclude that the subproblems are insufficient thus impossible to determine if is satisfiable or unsatisfiable. This completes the proof of Theorem 3.2.
□
As mentioned before, finite formal systems possess greater expressive power than algorithms (Turing machines). The above theorem indicates that no finite formal system for solving Model RB can significantly outperform or replace the exhaustive search method, which evaluates possibilities one by one. That is to say, non-brute-force computation cannot replace brute-force computation for Model RB. More interestingly, the above proof follows the same method (i.e., diagonalization and self-reference) used by Kurt Gödel [22] and Alan Turing [24], respectively, in their epoch-making papers of proving logical and computational impossibility results. This classical method, pioneered by Cantor, stands out as not only simple, elegant, and powerful, but also as the only approach that has successfully provided lower bounds on complexity classes. Moreover, the distinctive mathematical properties and the inherent extreme hardness of Model RB make it both feasible and effective in establishing algorithmic impossibility results. Importantly, applying this method to constructed instances, rather than directly to complexity classes, is crucial for achieving a successful proof. This study highlights the significance of creatively utilizing classical methods, particularly when addressing long-standing open problems. We believe that the underlying idea of this paper (i.e., Constructing the Most Indistinguishable Examples) will unlock the door to proving complexity lower bounds, which has always been a challenging task, even for many polynomial-time solvable problems.
It is worth mentioning that Xu and Li [3] established a direct link between proved phase transitions and the abundance with hard examples by proving that Model RB has both of these two properties. They also made a detailed comparison between Model RB and some other well-studied models with phase transitions, and stated that “such mathematical tractability should be another advantage of RB/RD, making it possible to obtain some interesting results which do not hold or cannot be easily obtained for random 3-SAT”. This paper confirms that remarkable results can indeed be achieved through a simple and elegant approach, as expected. More than two thousand years ago, Laozi, a great Chinese thinker and philosopher, once said: "The greatest truths are the simplest". We believe that the truth of computational hardness should also adhere to this fundamental and universal principle advocated by Laozi. In this sense, both the problem and the results of this paper are simple yet intuitively accessible. Additionally, if a problem is so difficult that there are no nontrivial results, then the most likely scenario is that the problem under study is not inherently foundational. In such a case, what should be done is to redefine the problem in alignment with Laozi’s principle. Foundational problems must be grounded in fundamental concepts. In this paper, we argue that non-brute-force computation vs. brute-force computation is more foundational than P vs. NP. This is because the concept of brute-force computation is inherently more fundamental than that of polynomial-time computation.
CSP can be encoded into SAT by use of the log-encoding [25] which introduces a new Boolean variable for each bit in the domain value of each CSP variable and thus uses a logarithmic number of Boolean variables to encode domains. It must be emphasized that each clause of these encoded SAT instances could be very long with Boolean variables if the domain size grows with the number of CSP variables . This is in sharp contrast to 3-SAT that is very short in clause length and has received the most attention in the past half-century. For encoded SAT instances of Model RB using the log-encoding, we have totally Boolean variables, clauses and literals. Note that , so it is easy to derive the following corollary from Theorem 3.2.
Corollary 3.3 SAT with Boolean variables cannot be solved in time for any constant .
The above corollary holds for SAT with no restriction on the clause length, and so is not directly applicable to -SAT with constant whose lower bounds can be obtained by reduction from encoded SAT instances of Model RB. Other CSP instances of domain size can also be encoded from Model RB using variables, thus cannot be solved in time for any constant .
It is well-known that SAT is NP-complete [26], and so it follows that P NP.
Xu K, Li W. Exact phase transitions in random constraint satisfaction problems. Journal of Artificial Intelligence Research, 2000, 12( 1): 93–103
[2]
Xu K, Boussemart F, Hemery F, Lecoutre C. A simple model to generate hard satisfiable instances. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence. 2005, 337−342
[3]
Xu K, Li W. Many hard examples in exact phase transitions. Theoretical Computer Science, 2006, 355( 3): 291–302
[4]
Xu K, Boussemart F, Hemery F, Lecoutre C. Random constraint satisfaction: easy generation of hard (satisfiable) instances. Artificial Intelligence, 2007, 171(8−9): 514−534
[5]
Liu T, Lin X, Wang C, Su K, Xu K. Large hinge width on sparse random hypergraphs. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence. 2011, 611−616
[6]
Cai S, Su K, Sattar A. Local search with edge weighting and configuration checking heuristics for minimum vertex cover. Artificial Intelligence, 2011, 175(9−10): 1672−1696
[7]
Zhao C, Zheng Z. Threshold behaviors of a random constraint satisfaction problem with exact phase transitions. Information Processing Letters, 2011, 111( 20): 985–988
[8]
Saitta L, Giordana A, Cornuejols A. Phase Transitions in Machine Learning. Cambridge: Cambridge University Press, 2011
[9]
Zhao C, Zhang P, Zheng Z, Xu K. Analytical and belief-propagation studies of random constraint satisfaction problems with growing domains. Physical Review E, 2012, 85( 1): 016106
[10]
Fan Y, Shen J, Xu K. A general model and thresholds for random constraint satisfaction problems. Artificial Intelligence, 2012, 193: 1–17
[11]
Lecoutre C. Constraint Networks: Targeting Simplicity for Techniques and Algorithms. John Wiley & Sons, 2013
[12]
Huang P, Yin M. An upper (lower) bound for Max (Min) CSP. Science China Information Sciences, 2014, 57( 7): 1–9
[13]
Xu W, Zhang P, Liu T, Gong F. The solution space structure of random constraint satisfaction problems with growing domains. Journal of Statistical Mechanics: Theory and Experiment, 2015, 2015: P12006
[14]
Liu T, Wang C, Xu K. Large hypertree width for sparse random hypergraphs. Journal of Combinatorial Optimization, 2015, 29( 3): 531–540
[15]
Knuth D E. The Art of Computer Programming. Addison-Wesley Professional, 2015
[16]
Xu W, Gong F. Performances of pure random walk algorithms on constraint satisfaction problems with growing domains. Journal of Combinatorial Optimization, 2016, 32( 1): 51–66
[17]
Fang Z, Li C M, Xu K. An exact algorithm based on MaxSAT reasoning for the maximum weight clique problem. Journal of Artificial Intelligence Research, 2016, 55( 1): 799–833
[18]
Wang X F, Xu D Y. Convergence of the belief propagation algorithm for RB model instances. Journal of Software, 2016, 27(11): 2712−2724
[19]
Li C M, Fang Z, Jiang H, Xu K. Incremental upper bound for the maximum clique problem. INFORMS Journal on Computing, 2018, 30( 1): 137–153
[20]
Karalias N, Loukas A. Erdős goes neural: an unsupervised learning framework for combinatorial optimization on graphs. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020, 6659−6672
[21]
Zhou G, Xu W. Super solutions of the model RB. Frontiers of Computer Science, 2022, 16( 6): 166406
[22]
Gödel K. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 1931, 38( 1): 173–198
[23]
Calabro C, Impagliazzo R, Paturi R. The complexity of satisfiability of small depth circuits. In: Proceedings of the 4th International Workshop on Parameterized and Exact Computation. 2009, 75−85
[24]
Turing A M. On computable numbers, with an application to the entscheidungsproblem. Proceedings of the London Mathematical Society, 1937, S2−42(1): 230−265
[25]
Walsh T. SAT v CSP. In: Proceedings of the 6th International Conference on Principles and Practice of Constraint Programming. 2000, 441−456
[26]
Cook S A. The complexity of theorem-proving procedures. In: Proceedings of the 3rd Annual ACM Symposium on Theory of Computing. 1971, 151−158
RIGHTS & PERMISSIONS
The Author(s) 2025. This article is published with open access at link.springer.com and journal.hep.com.cn
AI Summary 中Eng×
Note: Please be aware that the following content is generated by artificial intelligence. This website is not responsible for any consequences arising from the use of this content.