2025, Volume 6, Issue 2

  • research-article
    Shusen Yang, Fangyuan Zhao, Zihao Zhou, Liang Shi, Xuebin Ren, Zongben Xu

    Federated learning (FL) has become a popular interdisciplinary research area in both applied mathematics and information sciences. Mathematically, FL aims to collaboratively optimize aggregate objective functions over distributed datasets while satisfying a variety of privacy and system constraints. Unlike conventional distributed optimization methods, FL must address several specific issues (e.g., non-i.i.d. data and differentially private noise), which pose a set of new challenges for problem formulation, algorithm design, and convergence analysis. In this paper, we systematically review existing FL optimization research, including its assumptions, formulations, methods, and theoretical results. Potential future directions are also discussed.
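
    To make the optimization setting concrete, the following is a minimal sketch of FedAvg-style aggregation, the baseline scheme for collaboratively minimizing a weighted sum of client objectives. The least-squares local objective and all names (`fedavg`, `client_data`) are illustrative assumptions, not a method from the surveyed paper.

```python
import numpy as np

def fedavg(client_data, rounds=50, local_steps=5, lr=0.1):
    """FedAvg sketch: each client runs local gradient steps on its own
    least-squares objective; the server averages the resulting models,
    weighting each client by its dataset size."""
    d = client_data[0][0].shape[1]
    w = np.zeros(d)                                   # global model
    sizes = np.array([len(y) for _, y in client_data], dtype=float)
    weights = sizes / sizes.sum()                     # n_k / n
    for _ in range(rounds):
        local_models = []
        for X, y in client_data:                      # local training
            w_k = w.copy()
            for _ in range(local_steps):
                w_k -= lr * X.T @ (X @ w_k - y) / len(y)
            local_models.append(w_k)
        w = np.average(local_models, axis=0, weights=weights)  # aggregation
    return w
```

    Non-i.i.d. data enters this picture as heterogeneity across the per-client objectives, which is precisely what complicates the convergence analyses the survey reviews.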

  • research-article
    Yumeng Ren, Yiming Gao, Xue-Cheng Tai, Chunlin Wu

    $\ell_1$-based sparse regularization plays a central role in compressive sensing and image processing. In this paper, we propose $\ell_1$DecNet, an unfolded network derived from a variational decomposition model that incorporates $\ell_1$-related sparse regularizations and is solved by a non-standard scaled alternating direction method of multipliers. $\ell_1$DecNet effectively separates a spatially sparse feature and a learned spatially dense feature from an input image, and thus facilitates subsequent operations on the spatially sparse feature. Based on this, we develop $\ell_1$DecNet+, a learnable architecture framework consisting of our $\ell_1$DecNet and a segmentation module that operates on the extracted sparse features instead of the original images. This architecture combines the benefits of mathematical modeling and data-driven approaches. To the best of our knowledge, this is the first study to incorporate a mathematical image prior into feature extraction in segmentation network structures. Moreover, our $\ell_1$DecNet+ framework can be easily extended to the 3D case. We evaluate the effectiveness of $\ell_1$DecNet+ on two commonly encountered sparse segmentation tasks: retinal vessel segmentation in medical image processing and pavement crack detection in industrial abnormality identification. Experimental results on different datasets demonstrate that our $\ell_1$DecNet+ architecture with various lightweight segmentation modules achieves performance equal to or better than their enlarged counterparts. This offers practical advantages, especially on resource-limited devices.
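
    The paper's decomposition model is solved by a non-standard scaled ADMM with learned parameters; as a rough hand-written analogue, here is a plain ADMM for the simplest $\ell_1$ split of an image $f$ into a sparse part and a dense residual. The model, the function names, and the parameter values are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 -- the core l1 update in ADMM."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def l1_decompose(f, lam=0.5, rho=1.0, iters=100):
    """Split f into a sparse part z and a dense residual f - z via ADMM on
    min_u 0.5*||f - u||^2 + lam*||u||_1, with the splitting u = z."""
    u = np.zeros_like(f); z = np.zeros_like(f); w = np.zeros_like(f)
    for _ in range(iters):
        u = (f + rho * (z - w)) / (1.0 + rho)   # quadratic subproblem
        z = soft_threshold(u + w, lam / rho)    # l1 proximal step
        w += u - z                              # dual ascent
    return z, f - z                             # sparse feature, dense residual
```

    Unfolding replaces a fixed number of such iterations with network layers whose thresholds and weights are learned from data, which is the general idea behind the $\ell_1$DecNet construction.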

  • research-article
    Zhangchen Zhou, Hanxu Zhou, Yuqing Li, Zhi-Qin John Xu

    Previous research has shown that fully-connected neural networks with small initialization and gradient-based training methods exhibit a phenomenon known as condensation [T. Luo et al., J. Mach. Learn. Res., 22(1), 2021]. Condensation means that the weight vectors of a neural network concentrate on isolated orientations during training; it is a feature of the non-linear learning process that helps neural networks attain better generalization. However, the impact of network architecture on this phenomenon remains an open question. In this study, we turn our focus to convolutional neural networks (CNNs) to investigate how their structural characteristics, in contrast to fully-connected networks, influence the condensation phenomenon. We first prove that, under gradient descent and the small initialization scheme, the convolutional kernels of a two-layer CNN condense toward a specific direction determined by the training samples within a given time period. Subsequently, we conduct systematic empirical investigations to substantiate our theory. Moreover, our empirical study shows that condensation persists under broader conditions than those imposed in our theory. These insights collectively advance our understanding of the non-linear training dynamics of CNNs.
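
    Condensation is straightforward to check empirically: with small initialization, the flattened kernels of a layer become nearly parallel, so their pairwise cosine similarities approach $\pm 1$. A minimal diagnostic sketch (the function name and the array layout are assumptions, not the paper's code):

```python
import numpy as np

def kernel_cosine_matrix(kernels):
    """Pairwise cosine similarity between flattened conv kernels.
    kernels: array of shape (out_channels, in_channels, k, k).
    Under condensation, most off-diagonal entries approach +1 or -1."""
    K = kernels.reshape(kernels.shape[0], -1)
    K = K / (np.linalg.norm(K, axis=1, keepdims=True) + 1e-12)
    return K @ K.T
```

    Tracking this matrix over training epochs, starting from weights scaled by a small factor, visualizes the kernels collapsing onto a few isolated directions.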

  • research-article
    Ruxu Lian, Jieqiong Ma, Jiangbo Jin, Qingcun Zeng

    In this work, we consider the dynamic framework of the ocean-atmosphere (O-A) coupled model with physical boundary conditions at the ocean-atmosphere interface; this coupled model can be viewed as an atmospheric general circulation model coupled with an ocean general circulation model. Under suitable assumptions on the initial data and boundary conditions, and by means of energy estimates and compactness arguments, we establish the existence and stability of global weak solutions, as well as the existence and uniqueness of the global strong solution to the O-A coupled model.

  • research-article
    Lingyi Chen, Shitong Wu, Wenhao Ye, Huihui Wu, Wenyi Zhang, Hao Wu, Bo Bai

    The Blahut-Arimoto (BA) algorithm has played a fundamental role in the numerical computation of rate-distortion (RD) functions. The algorithm possesses a desirable monotonic convergence property obtained by alternately minimizing its Lagrangian with a fixed multiplier. In this paper, we propose a novel modification of the BA algorithm in which the multiplier is updated in each iteration through a one-dimensional root-finding step on a monotonic univariate function, efficiently implemented by Newton's method. Consequently, the modified algorithm directly computes the RD function for a given target distortion, without traversing the entire RD curve as the original BA algorithm does. Moreover, this modification provides a versatile framework applicable to a wide range of problems, including the computation of distortion-rate (DR) functions. Theoretical analysis shows that the outputs of the modified algorithms still converge to the solutions of the RD and DR functions with rate $\mathcal{O}(1/n)$, where $n$ is the number of iterations. Additionally, these algorithms provide $\epsilon$-approximate solutions within $\mathcal{O}\big((MN\log N/\epsilon)(1+\log|\log\epsilon|)\big)$ arithmetic operations, where $M$ and $N$ are the sizes of the source and reproduction alphabets, respectively. Numerical experiments demonstrate that the modified algorithms achieve significant acceleration compared with the original BA algorithms and perform well on classical source distributions such as discretized Gaussian, Laplacian, and uniform sources.
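
    For reference, here is the classical fixed-multiplier BA inner loop that the paper modifies; the paper's contribution wraps such a loop with a Newton root-finding update of the multiplier to hit a target distortion directly. This sketch (function name, iteration count, and numerical guards are assumptions) computes one $(D, R)$ point per multiplier value, with rates in nats.

```python
import numpy as np

def blahut_arimoto(p, dist, beta, iters=200):
    """Classical BA iteration at a fixed multiplier beta.
    p:    source distribution, shape (M,)
    dist: distortion matrix d(x, xhat), shape (M, N)
    Returns one point (D, R) on the rate-distortion curve."""
    M, N = dist.shape
    q = np.full(N, 1.0 / N)                     # reproduction marginal
    for _ in range(iters):
        A = q[None, :] * np.exp(-beta * dist)   # unnormalized Q(xhat|x)
        Q = A / A.sum(axis=1, keepdims=True)    # conditional distribution
        q = p @ Q                               # update marginal
    D = np.sum(p[:, None] * Q * dist)           # expected distortion
    R = np.sum(p[:, None] * Q *
               np.log((Q + 1e-300) / (q[None, :] + 1e-300)))  # mutual info
    return D, R
```

    Since the resulting distortion $D(\beta)$ is monotonic in $\beta$, a one-dimensional root-finding step on $D(\beta) - D_{\text{target}}$, as the paper proposes, can locate the desired multiplier without sweeping the whole curve.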

  • research-article
    Wenli Yang, Zhongyi Huang, Wei Zhu

    We propose a novel two-stage model for image denoising. The first stage, group sparse representation over local singular value decomposition (a local step), removes the noise effectively while preserving texture. The second stage, denoising by a first-order variational model (a global step), removes artifacts, maintains image contrast, and suppresses the staircase effect while preserving sharp edges. The existence and uniqueness of global minimizers of the low-rank problem based on group sparse representations are analyzed and proved. The alternating direction method of multipliers is utilized to minimize the associated functional, and the convergence analysis of the proposed optimization algorithm is established. Numerical experiments showcase the distinctive features of our method and provide a comparison with other image denoising techniques.
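
    The local stage rests on the low-rank structure of groups of similar patches. A minimal sketch of singular value soft-thresholding, the standard proximal step for such low-rank group models (the function name and the plain soft-threshold rule are assumptions; the paper's group sparse formulation may differ):

```python
import numpy as np

def svt(patch_group, tau):
    """Singular value soft-thresholding of a matrix whose columns are
    similar image patches: shrink the spectrum to suppress noise while
    keeping the dominant (texture-carrying) components."""
    U, s, Vt = np.linalg.svd(patch_group, full_matrices=False)
    s_hat = np.maximum(s - tau, 0.0)            # shrink singular values
    return (U * s_hat) @ Vt
```

    In a full pipeline, one would stack similar patches into the columns of `patch_group`, shrink, and aggregate the denoised patches back into the image before the global variational stage.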

  • research-article
    Ge Xu, Huajie Chen, Xingyu Gao

    In this paper, we study numerical approximations of the ground states in finite-temperature density functional theory. We formulate the problem in terms of density matrices and justify the convergence of the finite-dimensional approximations. Moreover, we provide an optimal a priori error estimate under mild assumptions and present numerical experiments that support the theory.
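
    To fix ideas, the density-matrix formulation at finite temperature assigns Fermi-Dirac occupations to the eigenstates of a Hamiltonian, with the chemical potential chosen to match the particle number. A minimal non-self-consistent sketch (the linear model, function names, and unit conventions are assumptions; the paper treats the full nonlinear DFT problem):

```python
import numpy as np

def density_matrix(H, n_elec, T, k_B=1.0, tol=1e-10):
    """D = f((H - mu I)/(k_B T)) with Fermi-Dirac f, where mu is fixed
    by bisection so that trace(D) = n_elec.  H is real symmetric."""
    eps, V = np.linalg.eigh(H)
    def occ(mu):
        x = np.clip((eps - mu) / (k_B * T), -60.0, 60.0)  # avoid overflow
        return 1.0 / (1.0 + np.exp(x))
    lo, hi = eps.min() - 10.0, eps.max() + 10.0
    while hi - lo > tol:                   # trace(D) is increasing in mu
        mu = 0.5 * (lo + hi)
        lo, hi = (mu, hi) if occ(mu).sum() < n_elec else (lo, mu)
    f = occ(0.5 * (lo + hi))
    return (V * f) @ V.T                   # D = V diag(f) V^T
```

    Working with the density matrix rather than individual orbitals is what makes the finite-temperature problem well posed even when occupations are fractional, which is the setting the paper's convergence analysis addresses.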