Improving local search algorithms for clique relaxation problems via group driven initialization

Rui SUN , Yiyuan WANG , Minghao YIN

Front. Comput. Sci., 2025, Vol. 19, Issue (6): 196403
DOI: 10.1007/s11704-024-40238-8
Theoretical Computer Science | RESEARCH ARTICLE
Abstract

Clique relaxation problems are important extensions of the maximum clique problem with extensive real-world applications. Although many studies focus on designing local search algorithms for these problems, almost all state-of-the-art local search algorithms adopt a common general initialization method. This paper develops a general group driven initialization method for clique relaxation problems. The proposed method uses two ways to divide vertices into subgroups, exploiting useful information from the search procedure and the structure information of a given instance, and then constructs a good initial solution by considering the generated group information. We apply the proposed initialization method to two clique relaxation problems. Experimental results demonstrate that the proposed initialization method clearly improves the performance of state-of-the-art algorithms for the clique relaxation problems.


Keywords

combinatorial optimization / clique relaxation problems / group driven initialization / local search / group information



1 Introduction

Given a graph G=(V,E), a clique C is a subset of V such that every two distinct vertices in C are adjacent. As a basic concept of graph theory, one of the major concerns for the clique model is the identification of cohesive subgroups, which has many valuable applications in fields such as biological computing [1], chemoinformatics [2], and social network analysis [3,4]. In practice, many real-world applications cannot be directly modeled as cliques because the clique model is too strict. To overcome the impracticalities stemming from this ideal model, a series of clique relaxation models based on different kinds of relaxation constraints have been proposed to deal with real-world applications, such as k-club [5], k-plex [6], k-quasi-clique [7], and k-defective clique [8].

Since the above clique relaxation problems involve relaxing certain constraints, each of them has distinct real-world applications. For example, the k-plex is a degree relaxation model of clique, and it aims to define a certain degree of local non-connectivity within a subgraph. Thus, it has been applied to the criminal network analysis field, whose major challenge lies in dealing with intelligence that often contains errors or incomplete information [9]. The k-club model restricts the shortest distance between vertices in the subgraph, which reflects the “spread” in real-world scenarios. As a result, the k-club model has been utilized in modeling epidemic control [10] and internet research [11]. The k-quasi-clique and k-defective models have similar types of constraints. The k-quasi-clique model directly employs a density constraint, while the k-defective-clique model can be regarded as an adapted version of the density constraint. Both of these models can be applied to protein detection [12,13].

As is known, the above-mentioned clique relaxation problems are NP-hard [14–18], which means that, unless P = NP, there is no polynomial-time algorithm for them. Recently, state-of-the-art exact algorithms for k-plex [19], k-defective clique [8], k-quasi-clique [20], and k-club [21] have been proposed. Because of the NP-hardness of these problems, many researchers turn to designing heuristic algorithms for clique relaxation problems so as to obtain good solutions within an acceptable time limit. During the past decades, many representative heuristic algorithms have been designed for handling different clique relaxation problems, including k-club [5,22–24], k-plex [25–28], and k-quasi-clique [7,29–31]. Among the heuristic algorithms for the maximum k-quasi-clique problem, NuQClq [7] has demonstrated state-of-the-art performance; it utilizes a novel variant of the configuration checking strategy called BoundedCC and employs a cumulative saturation-based scoring function. As for the maximum k-plex problem, DCCplex [26] and KLS [27] have been shown to be the top-performing algorithms. Furthermore, for the k-club problem, NukCP [24] clearly outperforms other heuristic algorithms; it incorporates a dynamic reduction method and a stratified threshold configuration checking strategy.

The general local search framework for clique relaxation problems usually consists of an initialization procedure and a search procedure. It works as follows: iteratively generate an initial solution and then search for a good candidate solution until the termination condition is reached. Nevertheless, the improvements of previous local search algorithms for clique relaxation problems mainly concentrate on designing strategies for the search procedure, such as cycling strategies [24,26], scoring functions [7], reduction strategies [24], and perturbation strategies [25,27]. In contrast to the many novel strategies proposed for the search procedure, most of these algorithms adopt a common frequency-based initialization method. Although a good initial solution has a great impact on the results of the subsequent search procedure, there are few works studying a specialized initialization process for clique relaxation problems.

In this study, to improve the initialization procedure of local search algorithms for clique relaxation problems, we propose a group driven initialization method that consists of a group-based initialization procedure Initgdi as well as two group generation and maintenance procedures, GPsea and GPstr. Both GPsea and GPstr consider, to a certain extent, useful information from the search procedure and the structure information of a given instance. In detail, vertices are divided into subgroups based on a novel local optimal matrix and the definition of modularity. Based on the group information of vertices, an initial solution is then obtained by a roulette wheel method. We remark that the proposed group driven initialization method does not affect the search procedure, and thus any local search algorithm based on the general local search framework for clique relaxation problems can directly apply it. In our work, we apply the proposed initialization method to three local search algorithms that achieve state-of-the-art performance for the k-quasi-clique and k-plex problems. Extensive experiments are carried out to evaluate the improved algorithms on the benchmarks used in the literature. The results show the superiority of our proposed initialization method over the common general initialization method.

This paper is organized as follows. Section 2 introduces some preliminary definitions. Section 3 presents our group driven initialization method. Section 4 reports the experimental results, and Section 5 concludes the paper.

2 Preliminary

An undirected graph G=(V,E) consists of a vertex set V={v1,v2,…,vn} and an edge set E={e1,e2,…,em}. The density of graph G is defined as dens(G) = 2×|E| / (|V|×(|V|−1)). For an edge e={vi,vj}, vertices vi and vj are the endpoints of the edge. Two vertices are neighbors if and only if they belong to the same edge. Given a vertex v, its neighborhood is NG(v) = {u ∈ V | {v,u} ∈ E} and its closed neighborhood is NG[v] = NG(v) ∪ {v}. The degree of a vertex v is the number of its neighbors, denoted as degG(v) = |NG(v)|. Given a vertex set S ⊆ V, its neighborhood is NG(S) = (∪_{v∈S} NG(v)) \ S and its closed neighborhood is NG[S] = ∪_{v∈S} NG[v]. Given a vertex set S ⊆ V, the induced subgraph G[S] = (VS, ES) is the subgraph of G with VS = S and with the edge set ES containing all the edges in E that have both endpoints in S. Given a vertex set S ⊆ V, a group set is defined as Vgro = {g1, g2, …, gd}, where each vertex in S is partitioned into a specific subgroup (e.g., gi ∈ Vgro).

Given a graph G=(V,E) and a fixed constant k ∈ (0,1], a k-quasi-clique is a subset S ⊆ V such that dens(G[S]) ≥ k. The maximum quasi-clique problem (MQCP) aims to find a k-quasi-clique of a given graph with the most vertices.

Given a graph G=(V,E) and a positive integer k, a k-plex is a subset S ⊆ V such that the degree of each vertex in G[S] is at least |S| − k. The maximum k-plex problem (MKPP) aims to find a k-plex of a given graph with the most vertices.
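
To make these two definitions concrete, the following C++ sketch checks whether a vertex set S satisfies the k-quasi-clique or the k-plex constraint; the adjacency-list representation and the helper names are our own assumptions for illustration, not part of the paper.

    #include <unordered_set>
    #include <vector>

    using Graph = std::vector<std::vector<int>>;   // adj[v] = neighbors of vertex v

    // Number of edges of the subgraph induced by S.
    long long inducedEdges(const Graph& adj, const std::vector<int>& S) {
        std::unordered_set<int> inS(S.begin(), S.end());
        long long cnt = 0;
        for (int v : S)
            for (int u : adj[v])
                if (inS.count(u)) ++cnt;
        return cnt / 2;                             // every edge is counted twice
    }

    // k-quasi-clique test: dens(G[S]) >= k, with k in (0,1].
    bool isKQuasiClique(const Graph& adj, const std::vector<int>& S, double k) {
        long long n = S.size();
        if (n < 2) return true;
        double dens = 2.0 * inducedEdges(adj, S) / (double)(n * (n - 1));
        return dens >= k;
    }

    // k-plex test: every vertex of G[S] has degree at least |S| - k, with k a positive integer.
    bool isKPlex(const Graph& adj, const std::vector<int>& S, int k) {
        std::unordered_set<int> inS(S.begin(), S.end());
        for (int v : S) {
            int deg = 0;
            for (int u : adj[v]) if (inS.count(u)) ++deg;
            if (deg < (int)S.size() - k) return false;
        }
        return true;
    }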

2.1 A general local search framework for clique relaxation problems

A general local search framework for clique relaxation problems is shown in Algorithm 1, which includes two procedures: the initialization procedure (Line 2) and the search procedure (Line 3). At the beginning, an initial feasible solution Dini is generated by calling the initialization method, and then the algorithm uses the search procedure to improve the solution D until some stop criterion is reached. The local best solution obtained by the search procedure is stored in Dlbest. At the end of this search trajectory, if the local best solution Dlbest is better than the global best solution D*, D* is updated by Dlbest (Line 4). Finally, the algorithm returns D* (Line 5).
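
A minimal C++ sketch of this loop follows (not the authors' code): the initialization and search procedures are passed in as callables, and comparing solutions by size is an assumption made only for illustration.

    #include <ctime>
    #include <functional>
    #include <vector>

    using Solution = std::vector<int>;

    // General local search framework (Algorithm 1): repeatedly initialize and search
    // until the time budget is exhausted, keeping the best solution found so far.
    Solution GeneralLocalSearch(const std::function<Solution()>& initialization,
                                const std::function<Solution(const Solution&)>& search,
                                double cutoffSeconds) {
        Solution Dstar;                                            // global best solution D*
        std::clock_t start = std::clock();
        while ((std::clock() - start) / (double)CLOCKS_PER_SEC < cutoffSeconds) {
            Solution Dini   = initialization();                    // Line 2
            Solution Dlbest = search(Dini);                        // Line 3
            if (Dlbest.size() > Dstar.size()) Dstar = Dlbest;      // Line 4: keep the larger set
        }
        return Dstar;                                              // Line 5
    }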

In the following, we introduce this common general initialization method.

A frequency-based initialization method Initfreq(G) [7,25–27]. Each vertex v ∈ V has a property called frequency, denoted as freq[v]. It works as follows: 1) at the beginning, freq[v] = 0 for each v ∈ V; 2) whenever vertex v is moved (i.e., added or removed), its freq value is increased by 1. The Initfreq(G) method iteratively picks a vertex with the smallest freq value, with ties broken randomly, until the initial solution cannot be extended. In each iteration, the selected vertex must keep the constraint of the clique relaxation problem satisfied.
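
A hedged C++ sketch of this construction is shown below; the feasibility predicate canExtend and the freq array are assumed interfaces, not the authors' implementation.

    #include <cstdlib>
    #include <functional>
    #include <vector>

    // Frequency-based initialization: repeatedly add a feasible vertex with the
    // smallest freq value (ties broken uniformly at random) until no vertex can be added.
    std::vector<int> InitFreq(int n,
                              const std::vector<long long>& freq,
                              const std::function<bool(const std::vector<int>&, int)>& canExtend) {
        std::vector<int> Dini;
        std::vector<bool> used(n, false);
        while (true) {
            int best = -1; long long bestFreq = 0; int ties = 0;
            for (int v = 0; v < n; ++v) {
                if (used[v] || !canExtend(Dini, v)) continue;      // keep the relaxation constraint
                if (best == -1 || freq[v] < bestFreq) { best = v; bestFreq = freq[v]; ties = 1; }
                else if (freq[v] == bestFreq && std::rand() % ++ties == 0) best = v;  // reservoir tie-break
            }
            if (best == -1) break;                                 // the solution cannot be extended
            Dini.push_back(best);
            used[best] = true;
        }
        return Dini;
    }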

2.2 The roulette wheel selection method

In our work, we mainly use a roulette wheel selection method to decide which vertex becomes a candidate vertex [32]. The roulette wheel selection method is a stochastic selection method, where the probability of selecting an individual is proportional to its fitness. Consider N individuals, each characterized by a fitness wi > 0, where i ∈ [1, N]. The selection probability of the ith individual is defined as pi = wi / Σ_{j=1}^{N} wj.
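
For illustration, a small C++ sketch of this selection scheme (assuming strictly positive fitness values):

    #include <numeric>
    #include <random>
    #include <vector>

    // Returns index i with probability w[i] / sum(w); all fitness values must be positive.
    int RouletteWheelSelect(const std::vector<double>& w, std::mt19937& rng) {
        double total = std::accumulate(w.begin(), w.end(), 0.0);
        double r = std::uniform_real_distribution<double>(0.0, total)(rng);
        double acc = 0.0;
        for (std::size_t i = 0; i < w.size(); ++i) {
            acc += w[i];
            if (r <= acc) return (int)i;
        }
        return (int)w.size() - 1;   // guard against floating-point round-off
    }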

3 A group driven initialization method

In this section, we design a group driven initialization method for clique relaxation problems, which includes a novel initialization procedure and two group generation and maintenance procedures. Various types of graphs are employed to evaluate the performance of our proposed algorithms in the experimental section. We observe that most sparse graphs have special structural features, so we can take full advantage of the structural information to group all vertices and then construct a high-quality initial solution, which can be a good starting point for the subsequent search procedure. However, for the remaining graphs, especially the dense ones, we can hardly learn useful structural information, and thus we instead refer to the search information gathered between vertices during the search procedure. In that case, we group all vertices mainly based on the search information and then generate a good initial solution.

Based on the above considerations, Section 3.1 proposes a group-based initialization procedure Initgdi based on the group information. In Sections 3.2 and 3.3, to obtain the group information, we present two kinds of group generation and maintenance procedures, i.e., search-based group procedure GPsea and structure-based group procedure GPstr.

3.1 A novel group-based initialization procedure

For the proposed group-based initialization procedure Initgdi, we apply two group procedures, GPsea and GPstr, to generate and maintain the group information, denoted as Vgro = {g1, g2, …, gq}, where the input parameter q denotes the maximum number of subgroups. In our work, we decide whether to use GPsea or GPstr based on the density value of a given graph, which will be further discussed in the experimental section.

First, we denote the initial solution built by Initgdi as Dini and give a selection rule used in Initgdi.

Roulette wheel selection rule Given a group set GoodGroup ⊆ Vgro, which will be defined in Line 6 of Algorithm 2, we choose a subgroup gt from GoodGroup by roulette wheel selection, where the fitness value of each subgroup g ∈ GoodGroup is defined as |Dini ∩ g| + 1. Then, for the search-based method GPsea, a vertex v is randomly selected from the chosen subgroup gt, whereas for the structure-based method GPstr, the algorithm selects a vertex v with the biggest scorecoh(v, Dini) value, which will be defined in Section 3.2.

Based on a given group set Vgro and the selection rule, we present the group-based initialization procedure Initgdi in Algorithm 2. We use CandSet to denote the set of all candidate vertices that satisfy the constraint of the clique relaxation problem and can be added into Dini. In the initialization procedure, if the vertices of the given graph have not been grouped yet, or with probability 1−p, the algorithm calls the frequency-based initialization method (Line 11). Otherwise, the algorithm chooses a random subgroup gt from Vgro and the first added vertex v1 is randomly selected from gt (Line 3). In the loop of adding vertices, if some vertices in CandSet belong to subgroups in Vgro, the algorithm chooses a candidate vertex v2 according to the roulette wheel selection rule (Lines 6–8). Otherwise, a random vertex v2 is picked (Line 9). Afterward, the selected vertex is added into Dini and the algorithm updates CandSet accordingly (Line 10). If CandSet becomes empty, the loop terminates. Finally, the algorithm returns the initial solution Dini (Line 12).
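
The following C++ sketch mirrors this textual description of Algorithm 2; the helper interfaces (InitFreqFallback, CandidateSet, GoodGroups, IntersectionSize, PickFromGroup, RouletteWheelSelect) are assumptions about how the host algorithm exposes its data, not the authors' code.

    #include <random>
    #include <vector>

    // Assumed helper interfaces supplied by the host local search algorithm.
    std::vector<int> InitFreqFallback();                                         // frequency-based method
    std::vector<int> CandidateSet(const std::vector<int>& Dini);                 // feasible extensions of Dini
    std::vector<int> GoodGroups(const std::vector<int>& cand,
                                const std::vector<std::vector<int>>& Vgro);      // subgroups intersecting CandSet
    int IntersectionSize(const std::vector<int>& Dini, const std::vector<int>& g);
    int PickFromGroup(const std::vector<int>& cand, const std::vector<int>& g);  // random (GPsea) or best scorecoh (GPstr)
    int RouletteWheelSelect(const std::vector<double>& w, std::mt19937& rng);

    std::vector<int> InitGDI(const std::vector<std::vector<int>>& Vgro, double p, std::mt19937& rng) {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        if (Vgro.empty() || coin(rng) > p)                 // no grouping yet, or with probability 1 - p
            return InitFreqFallback();                     // Line 11

        std::vector<int> Dini;
        const auto& g0 = Vgro[std::uniform_int_distribution<int>(0, (int)Vgro.size() - 1)(rng)];
        Dini.push_back(g0[std::uniform_int_distribution<int>(0, (int)g0.size() - 1)(rng)]);   // Line 3

        while (true) {
            std::vector<int> cand = CandidateSet(Dini);    // vertices that keep Dini feasible
            if (cand.empty()) break;
            std::vector<int> good = GoodGroups(cand, Vgro);
            int v;
            if (!good.empty()) {                           // Lines 6-8: roulette wheel over GoodGroup
                std::vector<double> fitness;
                for (int t : good) fitness.push_back(IntersectionSize(Dini, Vgro[t]) + 1.0);
                int gt = good[RouletteWheelSelect(fitness, rng)];
                v = PickFromGroup(cand, Vgro[gt]);
            } else {                                       // Line 9: random candidate
                v = cand[std::uniform_int_distribution<int>(0, (int)cand.size() - 1)(rng)];
            }
            Dini.push_back(v);                             // Line 10: add v and (implicitly) update CandSet
        }
        return Dini;                                       // Line 12
    }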

3.2 A search-based group procedure GPsea

In some dense graphs, every vertex is adjacent to most vertices, so it is hard to exploit the structure information to divide vertices into some subgroups. Thus, in this subsection, the search information is mainly exploited. We first introduce the intuition of GPsea and the definition and updating rules of a local optimal matrix that can reflect the search information to a certain extent. Then, the group generation and maintenance procedures of GPsea are displayed.

3.2.1 The intuition of GPsea

The intuition behind GPsea is inspired by the backbone structure observed in combinatorial optimization problems [33], which indicates that high-quality solutions often share common components. According to this observation, we consider a scenario where two vertices frequently appear together in some local best solutions. Interestingly, we also find that such pairs of vertices tend to appear in different high-quality local best solutions. As a result, our approach aims to classify these vertices into different subgroups, where each subgroup is associated with different high-quality solutions to some extent. By utilizing GPsea, the objective of Initgdi is to expedite the search process by constructing initial solutions from promising subgroups (i.e., search regions).

3.2.2 A local optimal matrix

In the general search framework for clique relaxation problems, a local best solution Dlbest is obtained at the end of each search procedure. It indicates which vertices are included in the local best solution during the current search trajectory. If two vertices often appear together in local best solutions, it not only indicates the importance of these two vertices but also implies that they may have high similarity. According to this search information, we define a local optimal matrix M of size n×n, where n=|V|. An element mij records the frequency with which the pair of vertices vi and vj appears together in the obtained local best solutions. The matrix is maintained as follows.

Matrix Rule 1 At the beginning, all the elements in the local optimal matrix are set to 0.

Matrix Rule 2 When Dlbest is obtained, for each pair of vertices vi and vj where vi,vj are two distinct vertices in Dlbest, mij and mji are increased by 1.

To make readers clearly understand our local optimal matrix, we present an example in Fig.1.
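
A compact C++ sketch of the two matrix rules is given below (the dense n×n array is used here only for clarity; a real implementation might prefer a sparse structure):

    #include <vector>

    std::vector<std::vector<int>> M;                 // local optimal matrix, M[i][j] = mij

    // Matrix Rule 1: all elements start at 0.
    void InitLocalOptimalMatrix(int n) {
        M.assign(n, std::vector<int>(n, 0));
    }

    // Matrix Rule 2: every pair of distinct vertices of Dlbest increases mij and mji by 1.
    void UpdateLocalOptimalMatrix(const std::vector<int>& Dlbest) {
        for (std::size_t a = 0; a < Dlbest.size(); ++a)
            for (std::size_t b = a + 1; b < Dlbest.size(); ++b) {
                ++M[Dlbest[a]][Dlbest[b]];
                ++M[Dlbest[b]][Dlbest[a]];
            }
    }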

Given a graph G=(V,E), we use the information in matrix M to divide vertices into at most q subgroups, i.e., Vgro = {g1, g2, …, gq}, where the size of each subgroup is |V|/q. During the search procedure, we consider that two adjacent vertices that often appear in the same local best solution should be partitioned into the same subgroup. Thus, to group all vertices of a given graph well, we give an objective function to measure the quality of the grouping results.

objsea(Vgro) = Σ_{t=1}^{q} ( Σ_{vi,vj ∈ gt, {vi,vj} ∈ E} mij ).

For a subgroup gt, its weight is defined as the sum of mij over all vertex pairs (vi, vj) within the subgroup that are connected by an edge. The objective function objsea is designed to maximize the total weight of all subgroups.

3.2.3 A Group Generation Procedure for GPsea

To make a good balance between the accuracy and complexity of calculating objsea(Vgro), we adopt a greedy group generation procedure to divide vertices into some subgroups. In this procedure, we use a two-level scoring function that combines two kinds of information (i.e., search information and structure information). The primary scoring function considers the search information, whereas the secondary scoring function refers to the structure information.

The primary scoring function based on the local optimal matrix is defined as follows.

scorem(vi, gt) = Σ_{vj ∈ NG(vi) ∩ gt} mij.

To maximize the value of objective function objsea(Vgro), we apply scorem to decide which vertex vi is selected and then added into a subgroup gt. To address the issue of tie-breaking in the primary scoring function, we employ the secondary scoring function to further select a vertex among these vertices with the same best scorem. The proposed secondary scoring function considers the cohesive relationship of a vertex vi and a subgroup gt, which is defined as follows.

scorecoh(vi, gt) = |NG(vi) ∩ gt|.

It is easy to observe that the more vertices in gt are adjacent to vi, the bigger the scorecoh value is. Based on the above scoring functions, we present a selection rule for the adding process of the proposed greedy group generation procedure.

Selection Rule Select a vertex vi from a candidate set with the biggest scorem(vi, gt) value, breaking ties by preferring the one with the biggest scorecoh(vi, gt) value; further ties are broken randomly.

The proposed group generation procedure is outlined in Algorithm 3. Vrem stores the vertices that have not been grouped and is initialized to V. The group set Vgro is initialized to an empty set (Line 1). There is an outer loop (Lines 2–9) and an inner loop (Lines 6–8): the outer loop generates a series of subgroups, and the inner loop builds one particular subgroup. At the beginning of each outer loop, a vertex set gt is initialized to an empty set. The algorithm selects a vertex vi ∈ Vrem with the biggest freqbest(vi) value, where freqbest(vi) denotes the number of times vertex vi appears in all obtained local best solutions (Line 4). vi is added into gt and deleted from Vrem (Line 5). In each inner loop, the algorithm iteratively adds a vertex into gt until |gt| = |V|/q or NG(gt) ∩ Vrem = ∅ (Lines 6–8). During the vertex selection process, the algorithm employs the BMS strategy [34], i.e., randomly sampling t1 vertices to compose a candidate set and then selecting a vertex from the candidate set according to the selection rule. After each inner loop, if the size of gt equals |V|/q, gt is stored into Vgro (Line 9). At last, the algorithm returns Vgro (Line 10).
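
A C++ sketch following this description is given below; the globals M and freqBest, the candidate sampling, and the tie-breaking details are simplifications of Algorithm 3 under our own assumptions.

    #include <algorithm>
    #include <random>
    #include <unordered_set>
    #include <vector>

    extern std::vector<std::vector<int>> M;          // local optimal matrix (Section 3.2.2)
    extern std::vector<long long> freqBest;          // freqbest(v): occurrences of v in local best solutions

    static long long ScoreM(int v, const std::unordered_set<int>& gt,
                            const std::vector<std::vector<int>>& adj) {
        long long s = 0;
        for (int u : adj[v]) if (gt.count(u)) s += M[v][u];
        return s;
    }
    static int ScoreCoh(int v, const std::unordered_set<int>& gt,
                        const std::vector<std::vector<int>>& adj) {
        int s = 0;
        for (int u : adj[v]) if (gt.count(u)) ++s;
        return s;
    }

    std::vector<std::vector<int>> GreedyGroupSea(const std::vector<std::vector<int>>& adj,
                                                 int q, int t1, std::mt19937& rng) {
        int n = (int)adj.size(), target = n / q;
        std::vector<int> Vrem(n);
        for (int i = 0; i < n; ++i) Vrem[i] = i;
        std::vector<std::vector<int>> Vgro;
        while (!Vrem.empty() && (int)Vgro.size() < q) {
            // Seed the subgroup with the ungrouped vertex having the largest freqbest value (Line 4).
            auto it = std::max_element(Vrem.begin(), Vrem.end(),
                        [](int a, int b) { return freqBest[a] < freqBest[b]; });
            int seed = *it; Vrem.erase(it);
            std::unordered_set<int> gt{seed};
            while ((int)gt.size() < target) {
                // BMS sampling: take up to t1 ungrouped vertices adjacent to gt as candidates.
                std::vector<int> cand;
                std::shuffle(Vrem.begin(), Vrem.end(), rng);
                for (int v : Vrem) {
                    if (ScoreCoh(v, gt, adj) > 0) cand.push_back(v);
                    if ((int)cand.size() == t1) break;
                }
                if (cand.empty()) break;                 // NG(gt) ∩ Vrem is empty
                // Selection rule: biggest scorem, ties by scorecoh (shuffled order stands in for random ties).
                int best = cand[0];
                for (int v : cand) {
                    long long sv = ScoreM(v, gt, adj), sb = ScoreM(best, gt, adj);
                    if (sv > sb || (sv == sb && ScoreCoh(v, gt, adj) > ScoreCoh(best, gt, adj)))
                        best = v;
                }
                gt.insert(best);
                Vrem.erase(std::find(Vrem.begin(), Vrem.end(), best));
            }
            if ((int)gt.size() == target)                // only full subgroups are kept (Line 9)
                Vgro.emplace_back(gt.begin(), gt.end());
        }
        return Vgro;
    }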

3.2.4 A novel group-based search framework with Initgdi and GPsea

In this subsection, we modify the general local search framework by using our proposed group-based initialization procedure and the search-based group procedure, resulting in a novel search framework (Algorithm 4). In addition, Algorithm 4 shows how to maintain the group information for GPsea within the general local search framework. We remark that Algorithm 4 does not modify the search procedure of Algorithm 1 (i.e., the boxed parts of Algorithm 4 can be skipped). This means that if a local search algorithm for clique relaxation problems is based on the general local search framework, our proposed group-based initialization procedure and the corresponding group generation and maintenance procedures can directly replace its original initialization method.

Algorithm 4 uses freqinc to record the sum of the increments of the freq values of all vertices since the last time the greedy group generation procedure GreedyGroupsea was called. To implement it, we define an auxiliary variable freqold. At first, the matrix M is initialized according to Matrix Rule 1 (Line 1), and Vgro and freqold are initialized accordingly (Line 2). Whenever a local best solution Dlbest is obtained, freqinc and M are updated (Lines 6–7). If freqinc reaches a specific threshold α×|V|, where α is a parameter, GreedyGroupsea is called to group the vertices and freqold is recalculated (Lines 8–10). The matrix M contains little useful information at the beginning of the search and needs to be constantly updated after each search procedure; thus, GreedyGroupsea should be called periodically.
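
The maintenance step at the end of each search trajectory can be summarized by the sketch below; it reuses UpdateLocalOptimalMatrix from the earlier sketch, and the regrouping callback stands in for GreedyGroupsea.

    #include <functional>
    #include <numeric>
    #include <vector>

    void UpdateLocalOptimalMatrix(const std::vector<int>& Dlbest);   // Matrix Rule 2 (see above)

    // Called once per search trajectory after Dlbest is obtained (Lines 6-10 of Algorithm 4).
    void MaintainGroupsSea(const std::vector<int>& Dlbest,
                           const std::vector<long long>& freq,       // per-vertex move counters
                           long long& freqOld,                       // total freq at the last regrouping
                           std::vector<std::vector<int>>& Vgro,
                           double alpha,
                           const std::function<std::vector<std::vector<int>>()>& regroup) {
        UpdateLocalOptimalMatrix(Dlbest);                            // Line 7
        long long freqSum = std::accumulate(freq.begin(), freq.end(), 0LL);
        if (freqSum - freqOld >= (long long)(alpha * freq.size())) { // freqinc >= alpha * |V| (Line 8)
            Vgro = regroup();                                        // call GreedyGroupsea (Line 9)
            freqOld = freqSum;                                       // reset the baseline (Line 10)
        }
    }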

3.3 A structure-based group procedure GPstr

In this section, we present a structure-based group procedure GPstr for the graphs with obvious structural features. First, we give the intuition of a greedy group generation procedure for GPstr. Moreover, we introduce a group updating procedure based on the definition of modularity. Finally, a group-based search framework with Initgdi and GPstr is presented.

3.3.1 The intuition of GPstr

Massive sparse graphs often exhibit distinctive community structures, characterized by subgroups of vertices with dense connections internally and relatively fewer connections to vertices outside their respective subgroups. To harness these structural characteristics, we utilize GPstr, which involves two key steps: GreedyGroupstr and GroupUpdatestr. In detail, GreedyGroupstr aims to identify groups of vertices with strong internal connections, while GroupUpdatestr iteratively updates these groups to further differentiate the distinct community structures. During the initialization procedure, our focus lies in selecting a series of vertices within a specific subgroup, with the aim of leveraging the inherent community structure.

3.3.2 A group generation procedure for GPstr

To obtain several initial cohesive subgroups, we design a greedy group generation procedure in Algorithm 5. The variable try represents the number of consecutive times an unsatisfied subgroup, whose size is less than |V|/q, is obtained. Initially, Vrem, try, and Vgro are initialized accordingly (Line 1). In the process of generating subgroups, a new subgroup gt is initialized with a vertex in Vrem that has the biggest degG value, and this selected vertex is deleted from Vrem (Lines 3–5). Afterward, the algorithm repeatedly adds a vertex with the biggest scorecoh value into gt, which aims to enhance its cohesive structure (Lines 6–8). After the subgroup generation, if |gt| equals |V|/q, gt is stored in Vgro and try is reset to 0 (Lines 9–10). Otherwise, try is increased by 1 (Line 11). At last, if try reaches a predefined threshold max_tries, Vgro is returned (Line 12).
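
A C++ sketch of this generation procedure, under the same assumptions as before (adjacency lists; undersized subgroups are simply discarded):

    #include <algorithm>
    #include <unordered_set>
    #include <vector>

    // GreedyGroupstr sketch (Algorithm 5): grow each subgroup from a highest-degree
    // ungrouped vertex by repeatedly adding the vertex with the largest scorecoh,
    // and stop after maxTries consecutive undersized subgroups.
    std::vector<std::vector<int>> GreedyGroupStr(const std::vector<std::vector<int>>& adj,
                                                 int q, int maxTries) {
        int n = (int)adj.size(), target = n / q, tries = 0;
        std::vector<int> Vrem(n);
        for (int i = 0; i < n; ++i) Vrem[i] = i;
        std::vector<std::vector<int>> Vgro;
        while (tries < maxTries && !Vrem.empty()) {
            auto it = std::max_element(Vrem.begin(), Vrem.end(),
                        [&](int a, int b) { return adj[a].size() < adj[b].size(); });   // biggest degG
            std::unordered_set<int> gt{*it};
            Vrem.erase(it);
            while ((int)gt.size() < target) {
                int best = -1, bestCoh = 0;
                for (int v : Vrem) {
                    int coh = 0;
                    for (int u : adj[v]) if (gt.count(u)) ++coh;     // scorecoh(v, gt)
                    if (coh > bestCoh) { bestCoh = coh; best = v; }
                }
                if (best == -1) break;                               // no adjacent ungrouped vertex left
                gt.insert(best);
                Vrem.erase(std::find(Vrem.begin(), Vrem.end(), best));
            }
            if ((int)gt.size() == target) { Vgro.emplace_back(gt.begin(), gt.end()); tries = 0; }
            else ++tries;
        }
        return Vgro;
    }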

3.3.3 A group updating procedure for GPstr

To further optimize an initial group set Vgro = {g1, …, gq}, our goal is that vertices belonging to the same subgroup are densely connected, whereas vertices belonging to different subgroups are sparsely connected. The concept of modularity, as defined by Newman [35], is commonly utilized to identify community structures. Modularity values range between 0 and 1, with higher values indicating denser internal connections within the same groups and sparser connections between different groups. In our context, since we focus solely on the grouped vertices, our objective is to maximize the modularity over these grouped vertices. The objective function is defined as follows:

objstr(Vgro) = Σ_{t=1}^{q} [ |Egt| / (2|E|) − ( Σ_{v∈gt} degG(v) / (2|E|) )^2 ],

where the induced subgraph of a subgroup gt is G[gt] = (Vgt, Egt).

Note that there may exist many ungrouped vertices in V, denoted by Vungr; that is, V = Vgro ∪ Vungr, where Vgro is viewed here as the set of all grouped vertices. In the group updating procedure, we only care about the vertices in the group set Vgro and aim to maximize its objective value objstr.

In the following, we propose a novel way to measure the change of modularity during the process of adding or removing a vertex.

Adding process When adding a vertex v into a subgroup gt, we compute the value of modularity on a new induced graph G′ = (V′, E′), where V′ = Vgro ∪ {v}. In detail, before selecting a vertex v ∈ Vungr to add into a subgroup gt in Vgro, the set {v} is treated as an extra subgroup (i.e., gq+1).

Removing process When deleting a vertex v from a subgroup gt, the value of modularity is recalculated on a new induced graph G′ = (V′, E′), where V′ = Vgro. Specifically, after deleting a vertex v from a subgroup gt in Vgro, the set {v} is treated as an extra subgroup (i.e., gq+1).

Based on the above assumption, the scoring function scoreadd(v,gt) of adding a vertex v is defined as follows [36].

scoreadd(v, gt) = [ (|Egt| + Lv,gt) / (2|E|) − ( Σ_{u∈gt∪{v}} degG(u) / (2|E|) )^2 ] − [ |Egt| / (2|E|) − ( Σ_{u∈gt} degG(u) / (2|E|) )^2 − ( degG(v) / (2|E|) )^2 ] = (1/(2|E|)) × ( Lv,gt − Σ_{u∈gt} degG(u) × degG(v) / |E| ),

where Lv,gt = |NG(v) ∩ gt|. Similarly, we propose a novel deleting scoring function scoredel(v, gt) as below.

scoredel(v, gt) = [ (|Egt| − Lv,gt) / (2|E|) − ( Σ_{u∈gt\{v}} degG(u) / (2|E|) )^2 − ( degG(v) / (2|E|) )^2 ] − [ |Egt| / (2|E|) − ( Σ_{u∈gt} degG(u) / (2|E|) )^2 ] = (1/(2|E|)) × ( ( Σ_{u∈gt} degG(u) − degG(v) ) × degG(v) / |E| − Lv,gt ),

where Lv,gt = |NG(v) ∩ gt|.
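
Both closed forms are cheap to evaluate if the degree sum of every subgroup is maintained incrementally; a small C++ sketch follows (the incremental bookkeeping itself is assumed):

    // Lvgt     : |NG(v) ∩ gt|, the number of neighbors of v inside gt
    // degSumGt : sum of degG(u) over all u in gt
    // degV     : degG(v);  m : |E|
    double ScoreAdd(long long Lvgt, long long degSumGt, long long degV, long long m) {
        // (1/(2|E|)) * ( Lv,gt - Σ_{u∈gt} degG(u) * degG(v) / |E| )
        return (Lvgt - (double)degSumGt * degV / m) / (2.0 * m);
    }
    double ScoreDel(long long Lvgt, long long degSumGt, long long degV, long long m) {
        // (1/(2|E|)) * ( (Σ_{u∈gt} degG(u) - degG(v)) * degG(v) / |E| - Lv,gt )
        return ((double)(degSumGt - degV) * degV / m - Lvgt) / (2.0 * m);
    }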

Based on the definitions of scoreadd and scoredel, we propose the following vertex selection rules.

Add Rule Select a vertex v and a subgroup gt from a candidate set with the biggest scoreadd(v,gt) value, breaking ties by preferring the one with the highest freq value.

Delete Rule Select a vertex v and a subgroup gt from a candidate set with the biggest scoredel(v,gt) value, breaking ties by preferring the one with the lowest freq value.

In the adding operation, when multiple vertex-subgroup pairs have the same maximum scoreadd value, our preference is to select a vertex with the highest frequency. This preference is based on the assumption that vertices with higher frequency are more significant in the search procedure, and thus they should be added to subgroups with higher priority. Conversely, when removing a vertex from its subgroup, we prioritize vertices with lower frequency, as they are considered less important. GroupUpdatestr is presented in Algorithm 6, where the parameter max_count controls the number of updating rounds of Vgro. The algorithm adds some vertices and deletes some others to update Vgro. Specifically, in each iteration, one vertex within the groups and one vertex outside the groups are swapped; these two vertices are selected using the Delete Rule and the Add Rule, respectively. GroupUpdatestr aims to enhance the community structure within the groups.
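
A schematic C++ loop for this swap-based updating is given below; the evaluation of the Add and Delete Rules (including the freq-based tie-breaking) is hidden behind two callbacks, which is an assumption made for brevity.

    #include <algorithm>
    #include <functional>
    #include <vector>

    struct Move { int v; int gt; };                    // a vertex and the index of its (target) subgroup

    // GroupUpdatestr sketch: repeat maxCount times; in each round, remove the best
    // vertex under the Delete Rule and insert the best ungrouped vertex under the Add Rule.
    void GroupUpdateStr(std::vector<std::vector<int>>& Vgro,
                        std::vector<int>& Vungr,
                        int maxCount,
                        const std::function<Move()>& bestDelete,   // argmax of scoredel over grouped vertices
                        const std::function<Move()>& bestAdd) {    // argmax of scoreadd over ungrouped vertices
        for (int count = 0; count < maxCount; ++count) {
            Move del = bestDelete();
            Move add = bestAdd();                      // evaluated before the deletion is applied
            auto& gDel = Vgro[del.gt];
            gDel.erase(std::find(gDel.begin(), gDel.end(), del.v));
            Vungr.push_back(del.v);
            auto& gAdd = Vgro[add.gt];
            gAdd.push_back(add.v);
            Vungr.erase(std::find(Vungr.begin(), Vungr.end(), add.v));
        }
    }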

3.3.4 A novel group-based search framework with Initgdi and GPstr

A novel search framework based on our proposed group-based initialization procedure and the structure-based group procedure GPstr is shown in Algorithm 7. We stress again that our proposed methods do not change the search procedure of the general search framework (i.e., the boxed parts) for clique relaxation problems. The variable period records the number of executions of the search procedure, and max_period is a parameter representing the frequency of updating Vgro. Before the search procedure, Vgro is initialized through GreedyGroupstr. Every max_period rounds, the algorithm calls GroupUpdatestr to update Vgro (Lines 6–7).

4 Experimental results

We conducted experiments to evaluate the proposed initialization method on a broad range of classic and sparse benchmarks.

Three state-of-the-art local search algorithms, NuQClq [7] for the MQCP as well as DCCplex [26] and FD-TS [25] for the MKPP, all adopt the general search framework (i.e., Algorithm 1). Thus, as described in Sections 3.2.4 and 3.3.4, our proposed group-based initialization procedure Initgdi and the corresponding group procedures (i.e., GPsea and GPstr) can be easily applied to these algorithms, resulting in three novel algorithms: NuQClq-GDI, DCCplex-GDI, and FD-TS-GDI. For a given instance, if its density value is larger than 0.35, the corresponding algorithm uses GPsea; otherwise, it applies GPstr. The source codes of the three original algorithms were provided by their authors. For all competitors, we set the same parameters as described in the corresponding literature. All the algorithms are implemented in C++ and compiled by g++ with the -O3 option. All experiments are run on an Intel Xeon Gold 6238 CPU @ 2.10 GHz with 512 GB RAM under CentOS 7.9. The parameters of our proposed initialization method are tuned by irace [37]. Specifically, for the GPstr method, max_count, t1, max_period, max_tries, q, and p are set to 100, 50, 20, 10, 50, and 60%, respectively, while for the GPsea method, t1, p, q, and α are set to 50, 60%, 50, and 850, respectively.

For all instances, all the algorithms are performed with 10 independent runs using random seeds from 1 to 10, and the cutoff time is set to 1000 seconds. For each instance, max denotes the best size found and avg denotes the average size over the 10 runs. When max = avg, we do not report avg. The bold values in the tables indicate the best solution among all the algorithms. We do not report the results when all the algorithms obtain the same solutions. Note that, for the MQCP and MKPP, the same graph with different k values is considered as different instances.

4.1 Results for the MQCP

We collect all used instances from [7], including 187 classic instances from the DIMACS benchmark [38] and the BHOSLIB benchmark [39], as well as 102 sparse instances from the Florida Sparse Matrix Collection [40] and the Stanford Large Network Dataset Collection. Besides, we consider 65 massive sparse graphs from the Network Data Repository [41] with more than 10^5 vertices and more than 10^6 edges, which have been used to test the performance of algorithms on clique-related problems [24,42]. For the 65 massive sparse graphs, k is set to 0.9, 0.8, 0.7, and 0.6, so the total number of newly selected instances is 260.

The results for NuQClq-GDI and NuQClq on classic instances and sparse instances are reported in Tab.1 and Tab.2, respectively. For classic instances, NuQClq-GDI obtains a better solution than NuQClq on 7 instances and only fails on one instance, C2000.9 with k=0.999. For the instances where both algorithms obtain the same best solutions, NuQClq-GDI obtains better average solutions than NuQClq on 10 instances and worse ones on 5 instances. For the two sparse benchmarks, NuQClq-GDI achieves better solutions on 13 instances, whereas NuQClq finds better solutions on 3 instances. In terms of average solutions, NuQClq-GDI outperforms NuQClq on 23 instances while it is defeated on only 5 instances.

4.2 Results for the MKPP

For the MKPP, we use the same k values as described in the corresponding literature (i.e., k=2,3,4). We adopt several popular benchmarks from [26], including 363 classic instances from the DIMACS benchmark [38] and the BHOSLIB benchmark [39], and 108 sparse instances from the Network Data Repository [41]. According to our results, most of these sparse instances are so easy that all algorithms obtain the same quality values. Thus, we select the 65 massive sparse graphs described in the previous subsection as an additional benchmark. In total, 195 (65×3) additional instances are picked.

Tab.3 summarizes the results on the classic and sparse benchmarks for the MKPP. #inst denotes the number of instances in each benchmark. #better and #worse denote the numbers of instances where FD-TS-GDI and DCCplex-GDI achieve better and worse maximal (or average) results, respectively. For the classic benchmarks, in terms of both the best and average solutions, DCCplex-GDI and FD-TS-GDI perform better than the corresponding original algorithms. As for the sparse benchmarks, all the algorithms achieve the same solutions on most instances. Nevertheless, on the instances where our proposed algorithm and the corresponding competitor perform differently, our proposed algorithm steadily outperforms its competitor.

We provide the detailed results for FD-TS-GDI and FD-TS on classic and sparse instances in Tab.4 and Tab.5, respectively. Additionally, the results of DCCplex-GDI and DCCplex on classic and sparse instances are shown in Tab.6 and Tab.7, respectively. In the case of classic instances, FD-TS-GDI and DCCplex-GDI outperform their counterparts in finding better maximal solutions in 34 and 20 instances, respectively. They are only defeated in 5 and 1 instances, respectively. When considering the average solutions for classic instances, FD-TS-GDI and DCCplex-GDI outperform FD-TS and DCCplex in 40 and 58 instances, respectively. Regarding the sparse instances, all four algorithms demonstrate similar performance by achieving the same maximal solutions in 300 out of 303 instances. However, FD-TS-GDI and DCCplex-GDI exhibit a slight advantage over their competitors in terms of both maximal solutions and average solutions.

Besides, we report the time consumption of all the algorithms on the instances where our algorithms and the competitors achieve the same maximal and average solution values. In detail, the average run times of NuQClq, FD-TS, and DCCplex are 26.06, 13.27, and 11.46 seconds, while those of NuQClq-GDI, FD-TS-GDI, and DCCplex-GDI are 22.78, 11.69, and 8.98 seconds, respectively. Obviously, the proposed method clearly reduces the time consumption of the local search algorithms for clique relaxation problems.

Moreover, we compare DCCplex-GDI to KLS [27] and CFS-RLS [28] on the benchmarks selected in the corresponding literature, using the same experimental settings (e.g., the number of runs). The results are presented in Tab.8 and Tab.9. For Tab.8, both DCCplex-GDI and KLS are executed with 100 independent runs, whereas for Tab.9, both DCCplex-GDI and CFS-RLS are executed with 20 independent runs, following the settings of the corresponding literature. Note that we do not apply Initgdi to the KLS algorithm and compare it with the original KLS algorithm, because KLS does not follow the typical "initialize + search" framework. DCCplex-GDI finds better solutions than KLS and CFS-RLS on 18 and 19 instances, respectively, while it fails to match the solutions obtained by KLS and CFS-RLS on 14 and 0 instances, respectively. Note that DCCplex, which does not use our initialization method, only finds 17 better and 27 worse solutions than KLS. Once again, the excellent results of DCCplex-GDI can be attributed to the power of our proposed initialization method.

4.3 Effectiveness of search and structure information

To verify the necessity of the structure information and the search information, we compare GPsea with two alternative versions: (1) GPsea1 employs a random selection strategy when the candidate vertices have the same best scorem value, instead of our selection rule; (2) GPsea2 selects a vertex vi with the biggest scorecoh, breaking ties randomly, instead of our selection rule. The former version only utilizes the search information while the latter only applies the structure information. GPstr is compared with two versions: (1) GPstr1 selects a vertex and a subgroup with the biggest scoreadd or scoredel values, breaking ties randomly, instead of our Add and Delete Rules; (2) GPstr2 ignores the GroupUpdatestr procedure. GPstr1 only considers the structure information and ignores the search information, whereas GPstr2 ignores the structure and search information obtained from GroupUpdatestr. #better and #worse denote the numbers of instances where our proposed methods find better and worse maximal results, respectively. As shown in Tab.10, the results demonstrate that both the search and the structure information play an important role in GPsea and GPstr.

5 Conclusion

In this work, we propose a group driven initialization method for clique relaxation problems. The proposed method divides the vertices into subgroups using the search and structure information and then employs the obtained group information to help the algorithm generate a good initial solution. We conduct extensive experiments to evaluate the performance of the proposed initialization method, and the results show that it outperforms the common general initialization method for clique relaxation problems.

In the future, to enhance the effectiveness of the proposed method, we intend to incorporate new ideas [43,44]. One such idea is the use of hash functions to record local optimal solutions so as to improve the exploitation of the search procedure. Additionally, we aim to identify the cohesive structure of the given graph through a simpler and more efficient approach. Furthermore, we intend to enhance the proposed method with a forgetting mechanism, which will enable the algorithm to concentrate more on recent search information.

References

[1]

Malod-Dognin N, Andonov R, Yanev N. Maximum cliques in protein structure comparison. In: Proceedings of the 9th International Symposium on Experimental Algorithms. 2010, 106−117

[2]

Hao J K. Memetic algorithms in discrete optimization. In: Neri F, Cotta C, Moscato P, eds. Handbook of Memetic Algorithms. Berlin: Springer, 2012, 73−94

[3]

Alduaiji N, Datta A, Li J . Influence propagation model for clique-based community detection in social networks. IEEE Transactions on Computational Social Systems, 2018, 5( 2): 563–575

[4]

Ouyang G, Dey D K, Zhang P . Clique-based method for social network clustering. Journal of Classification, 2020, 37( 1): 254–274

[5]

Shahinpour S, Butenko S . Algorithms for the maximum k-club problem in graphs. Journal of Combinatorial Optimization, 2013, 26( 3): 520–554

[6]

Jiang H, Xu F, Zheng Z, Wang B, Zhou W. A refined upper bound and inprocessing for the maximum k-plex problem. In: Proceedings of the 32nd International Joint Conference on Artificial Intelligence. 2023, 5613−5621

[7]

Chen J, Cai S, Pan S, Wang Y, Lin Q, Zhao M, Yin M. NuQClq: An effective local search algorithm for maximum quasi-clique problem. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence. 2021, 12258−12266

[8]

Gao J, Xu Z, Li R, Yin M. An exact algorithm with new upper bounds for the maximum k-defective clique problem in massive sparse graphs. In: Proceedings of the 36th AAAI Conference on Artificial Intelligence. 2022, 10174−10183

[9]

Balasundaram B. Graph theoretic generalizations of clique: optimization and extensions. Texas A&M University, Dissertation, 2007

[10]

Kempe D, Kleinberg J, Tardos É. Maximizing the spread of influence through a social network. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2003, 137−146

[11]

Terveen L, Hill W, Amento B . Constructing, organizing, and visualizing collections of topically related web resources. ACM Transactions on Computer-Human Interaction, 1999, 6( 1): 67–94

[12]

Yu H, Paccanaro A, Trifonov V, Gerstein M . Predicting interactions in protein networks by completing defective cliques. Bioinformatics, 2006, 22( 7): 823–829

[13]

Bandyopadhyay S, Bhattacharyya M. Mining the largest quasi-clique in human protein interactome. Nature Precedings, 2012

[14]

Yannakakis M. Node- and edge-deletion NP-complete problems. In: Proceedings of the 10th Annual ACM Symposium on Theory of Computing. 1978, 253−264

[15]

Bourjolly J M, Laporte G, Pesant G . An exact algorithm for the maximum k-club problem in an undirected graph. European Journal of Operational Research, 2002, 138( 1): 21–28

[16]

Balasundaram B, Butenko S, Hicks I V . Clique relaxations in social network analysis: the maximum k-plex problem. Operations Research, 2011, 59( 1): 133–142

[17]

Pattillo J, Veremyev A, Butenko S, Boginski V . On the maximum quasi-clique problem. Discrete Applied Mathematics, 2013, 161( 1−2): 244–257

[18]

Peng Y, Xu Y, Zhao H, Zhou Z, Han H . Most similar maximal clique query on large graphs. Frontiers of Computer Science, 2020, 14( 3): 143601

[19]

Wang Z, Zhou Y, Luo C, Xiao M. A fast maximum k-plex algorithm parameterized by the degeneracy gap. In: Proceedings of the 32nd International Joint Conference on Artificial Intelligence. 2023, 5648−5656

[20]

Miao Z, Balasundaram B . An ellipsoidal bounding scheme for the quasi-clique number of a graph. INFORMS Journal on Computing, 2020, 32( 3): 763–778

[21]

Daemi N, Borrero J S, Balasundaram B . Interdicting low-diameter cohesive subgroups in large-scale social networks. INFORMS Journal on Optimization, 2022, 4( 3): 304–325

[22]

Almeida M T, Carvalho F D . Two-phase heuristics for the k-club problem. Computers & Operations Research, 2014, 52: 94–104

[23]

Moradi E, Balasundaram B . Finding a maximum k-club using the k-clique formulation and canonical hypercube cuts. Optimization Letters, 2018, 12( 8): 1947–1957

[24]

Chen J, Wang Y, Cai S, Yin M, Zhou Y, Wu J. NukCP: an improved local search algorithm for maximum k-club problem. In: Proceedings of the 36th AAAI Conference on Artificial Intelligence. 2022, 10146−10155

[25]

Zhou Y, Hao J K . Frequency-driven tabu search for the maximum s-plex problem. Computers & Operations Research, 2017, 86: 65–78

[26]

Chen P, Wan H, Cai S, Li J, Chen H. Local search with dynamic-threshold configuration checking and incremental neighborhood updating for maximum k-plex problem. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. 2020, 2343−2350

[27]

Pullan W . Local search for the maximum k-plex problem. Journal of Heuristics, 2021, 27( 3): 303–324

[28]

Jin Y, Drake J H, He K, Benlic U . Reinforcement learning based coarse-to-fine search for the maximum k-plex problem. Applied Soft Computing, 2022, 131: 109758

[29]

Djeddi Y, Haddadene H A, Belacel N . An extension of adaptive multi-start tabu search for the maximum quasi-clique problem. Computers & Industrial Engineering, 2019, 132: 280–292

[30]

Zhou Q, Benlic U, Wu Q . An opposition-based memetic algorithm for the maximum quasi-clique problem. European Journal of Operational Research, 2020, 286( 1): 63–83

[31]

Pinto B Q, Ribeiro C C, Riveaux J A, Rosseti I . A BRKGA-based matheuristic for the maximum quasi-clique problem with an exact local search strategy. RAIRO-Operations Research, 2021, 55: S741–S763

[32]

Lipowski A, Lipowska D . Roulette-wheel selection via stochastic acceptance. Physica A: Statistical Mechanics and Its Applications, 2012, 391( 6): 2193–2196

[33]

Wu Q, Hao J K . A review on algorithms for maximum clique problems. European Journal of Operational Research, 2015, 242( 3): 693–709

[34]

Cai S, Lin J, Luo C . Finding a small vertex cover in massive sparse graphs: construct, local search, and preprocess. Journal of Artificial Intelligence Research, 2017, 59: 463–494

[35]

Newman M E J, Girvan M . Finding and evaluating community structure in networks. Physical Review E, 2004, 69( 2): 026113

[36]

Blondel V D, Guillaume J L, Lambiotte R, Lefebvre E . Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008, 2008( 10): P10008

[37]

López-Ibáñez M, Dubois-Lacoste J, Cáceres L P, Birattari M, Stützle T . The irace package: iterated racing for automatic algorithm configuration. Operations Research Perspectives, 2016, 3: 43–58

[38]

Johnson D S, Trick M A. Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge, Workshop, October 11−13, 1993. Boston: American Mathematical Society, 1996

[39]

Xu K, Boussemart F, Hemery F, Lecoutre C . Random constraint satisfaction: Easy generation of hard (satisfiable) instances. Artificial Intelligence, 2007, 171( 8-9): 514–534

[40]

Davis T A, Hu Y. The University of Florida sparse matrix collection. ACM Transactions on Mathematical Software, 2011, 38(1): 1

[41]

Rossi R A, Ahmed N K. The network data repository with interactive graph analytics and visualization. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence. 2015, 4292−4293

[42]

Wang Y, Cai S, Chen J, Yin M . SCCWalk: an efficient local search algorithm and its improvements for maximum weight clique problem. Artificial Intelligence, 2020, 280: 103230

[43]

Jiang L, Ouyang D, Zhang Q, Zhang L . DeciLS-PBO: an effective local search method for pseudo-Boolean optimization. Frontiers of Computer Science, 2024, 18( 2): 182326

[44]

Pan S, Ma Y, Wang Y, Zhou Z, Ji J, Yin M, Hu S . An improved master-apprentice evolutionary algorithm for minimum independent dominating set problem. Frontiers of Computer Science, 2023, 17( 4): 174326

RIGHTS & PERMISSIONS

Higher Education Press
