Oct 2024, Volume 6 Issue 2
    

  • Andrea Bertozzi, Ron Fedkiw, Frederic Gibou, Chiu-Yen Kao, Chi-Wang Shu, Richard Tsai, Wotao Yin, Hong-Kai Zhao
  • Yongqiang Cai, Qianxiao Li, Zuowei Shen

    We present the viewpoint that optimization problems encountered in machine learning can often be interpreted as minimizing a convex functional over a function space, but with a non-convex constraint set introduced by model parameterization. This observation allows us to recast such problems, via a suitable relaxation, as convex optimization problems in the space of distributions over the training parameters. We derive some simple relationships between the distribution-space problem and the original problem, e.g., a distribution-space solution is at least as good as a solution in the original space. Moreover, we develop a numerical algorithm based on mixture distributions to perform approximate optimization directly in the distribution space. Consistency of this approximation is established and the numerical efficacy of the proposed algorithm is illustrated in simple examples. In both theory and practice, this formulation provides an alternative approach to large-scale optimization in machine learning.
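    A minimal sketch of the mixture idea (illustrative, not the paper's algorithm): if the atoms of a finite mixture over parameters are held fixed, optimizing the mixture weights of the resulting random features is a convex least-squares problem, even though jointly training all parameters would be non-convex. All names and the tanh feature choice below are assumptions for the example.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Target: a 1-D function to fit with a "mixture" of parameterized neurons.
    x = np.linspace(-1, 1, 200)
    y = np.sin(np.pi * x)

    # Draw N atoms (w_i, b_i) from a base distribution; each atom contributes
    # one feature tanh(w x + b). With atoms fixed, fitting the mixture
    # weights c is a convex least-squares problem.
    N = 50
    w = rng.normal(0, 3, N)
    b = rng.uniform(-1, 1, N)
    features = np.tanh(np.outer(x, w) + b)            # (200, N) design matrix

    c, *_ = np.linalg.lstsq(features, y, rcond=None)  # convex subproblem
    err = np.linalg.norm(features @ c - y) / np.linalg.norm(y)
    print(f"relative fit error: {err:.3e}")
    ```

    Joint optimization over (w, b, c) would reintroduce the non-convexity that the distribution-space relaxation removes.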

  • Björn Engquist, Kui Ren, Yunan Yang

    This paper develops and analyzes a stochastic derivative-free optimization strategy. A key feature is the state-dependent adaptive variance. We prove global convergence in probability with an algebraic rate and give quantitative results in numerical examples. A striking fact is that convergence is achieved without explicit gradient information, and even without comparing different objective function values as in established methods such as the simplex method and simulated annealing. The method can instead be compared to annealing with a state-dependent temperature.
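    To illustrate what "state-dependent variance" can mean, here is a toy derivative-free search whose proposal variance shrinks with the current objective value. Note this sketch does compare objective values in its accept step, which the paper's method avoids; it only illustrates the adaptive-variance idea, and all parameter choices are assumptions.

    ```python
    import numpy as np

    def adaptive_random_search(f, x0, steps=2000, seed=0):
        """Derivative-free minimization toy: Gaussian proposals whose
        standard deviation depends on the current state through f(x),
        so steps shrink near the minimizer."""
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        for _ in range(steps):
            sigma = np.sqrt(max(f(x), 1e-12))   # state-dependent variance
            prop = x + sigma * rng.standard_normal(x.shape)
            if f(prop) < f(x):                  # greedy accept (toy only)
                x = prop
        return x

    # Minimize a shifted quadratic in 2-D.
    f = lambda x: np.sum((x - 1.0) ** 2)
    x_star = adaptive_random_search(f, np.zeros(2))
    print(x_star, f(x_star))
    ```

    Because sigma is proportional to the distance from the minimizer here, the search converges geometrically instead of stalling at a fixed step size.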

  • Kevin S. Miller, Andrea L. Bertozzi

    Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier. A challenge is to identify which points to label to best improve performance while limiting the number of new labels. “Model Change” active learning quantifies the change incurred in the classifier by introducing the additional label(s). We pair this idea with graph-based semi-supervised learning (SSL) methods that use the spectrum of the graph Laplacian matrix, which can be truncated to avoid prohibitively large computational and storage costs. We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution. We show a variety of multiclass examples that illustrate improved performance over the prior state of the art.
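    The spectral-truncation step can be sketched as follows: build a similarity graph, keep only the k smallest Laplacian eigenpairs, and classify in that reduced basis. This is a generic graph-SSL illustration, not the paper's acquisition or model-change machinery; the graph construction, ridge fit, and all parameters are assumptions.

    ```python
    import numpy as np

    # Two Gaussian clusters in the plane; a Gaussian-kernel similarity graph.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, .3, (30, 2)), rng.normal(2, .3, (30, 2))])
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / 0.5)
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W                 # unnormalized graph Laplacian

    vals, vecs = np.linalg.eigh(L)
    k = 10                                    # truncate: keep k smallest modes
    Vk, lk = vecs[:, :k], vals[:k]

    # Fit a classifier u = Vk @ a to a few known labels (ridge-regularized
    # least squares in the truncated eigenbasis).
    idx = np.r_[0, 1, 2, 30, 31, 32]          # labeled points
    labels = np.r_[np.zeros(3), np.ones(3)]
    A = Vk[idx]
    tau = 1e-2
    a = np.linalg.solve(A.T @ A + tau * np.diag(lk + 1e-8),
                        A.T @ (2 * labels - 1))
    pred = (Vk @ a > 0).astype(float)
    acc = (pred == np.r_[np.zeros(30), np.ones(30)]).mean()
    print(f"accuracy: {acc:.2f}")
    ```

    Storing only the k eigenpairs replaces an n-by-n solve with a k-dimensional one, which is what makes large graphs tractable.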

  • Antonio Baeza, Rosa Donat, Anna Martínez-Gavara

    Cost-effective multilevel techniques for homogeneous hyperbolic conservation laws are very successful in reducing the computational cost associated with high-resolution shock-capturing numerical schemes. Because they involve no special data structures, they are easily implemented in existing codes, although they yield no savings in memory requirements; they are recommended for 1D and 2D simulations when intensive testing is required. The multilevel technique can also be applied to balance laws, but in this case it may induce numerical errors. We present a series of numerical tests showing that the use of monotonicity-preserving interpolatory techniques eliminates the numerical errors observed with the usual 4-point centered Lagrange interpolation and leads to a more robust multilevel code for balance laws, while maintaining the efficiency rates observed for hyperbolic conservation laws.
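    The difference between the two interpolation choices is easy to see at a discontinuity: 4-point centered Lagrange interpolation overshoots a step, while a monotonicity-preserving reconstruction does not. The minmod-limited linear reconstruction below is one common monotone choice, used here as an example rather than the paper's specific scheme.

    ```python
    import numpy as np

    def lagrange4(u, j):
        """Midpoint value between cells j and j+1 from the 4-point centered
        Lagrange (cubic) interpolant on u[j-1..j+2]: (-1, 9, 9, -1)/16."""
        return (-u[j - 1] + 9 * u[j] + 9 * u[j + 1] - u[j + 2]) / 16.0

    def minmod(a, b):
        return np.where(a * b > 0, np.sign(a) * np.minimum(abs(a), abs(b)), 0.0)

    def limited(u, j):
        """Monotonicity-preserving midpoint value: linear reconstruction in
        cell j with a minmod-limited slope (illustrative choice)."""
        s = minmod(u[j] - u[j - 1], u[j + 1] - u[j])
        return u[j] + 0.5 * s

    u = np.array([0., 0., 0., 1., 1., 1.])    # discrete step
    v_lag = lagrange4(u, 3)                   # interpolate beside the jump
    v_lim = limited(u, 3)
    print(v_lag, v_lim)                       # Lagrange overshoots above 1
    ```

    For balance laws, such overshoots feed spurious contributions into the source-term balance, which is why the monotone variant yields a more robust multilevel code.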

  • Samira Kabri, Alexander Auras, Danilo Riccio, Hartmut Bauermeister, Martin Benning, Michael Moeller, Martin Burger

    The reconstruction of images from their corresponding noisy Radon transform is a typical example of an ill-posed linear inverse problem as arising in the application of computerized tomography (CT). As the (naïve) solution does not depend on the measured data continuously, regularization is needed to reestablish a continuous dependence. In this work, we investigate simple, but yet still provably convergent approaches to learning linear regularization methods from data. More specifically, we analyze two approaches: one generic linear regularization that learns how to manipulate the singular values of the linear operator in an extension of our previous work, and one tailored approach in the Fourier domain that is specific to CT-reconstruction. We prove that such approaches become convergent regularization methods as well as the fact that the reconstructions they provide are typically much smoother than the training data they were trained on. Finally, we compare the spectral as well as the Fourier-based approaches for CT-reconstruction numerically, discuss their advantages and disadvantages and investigate the effect of discretization errors at different resolutions.
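    The singular-value-manipulation idea can be sketched on a toy ill-posed problem: a linear spectral regularization method replaces the naive inversion 1/s of each singular value by a filter g(s). Below, the classical Tikhonov filter g(s) = s/(s² + λ) stands in for a learned one; the operator, signal, and λ are all assumptions made for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Ill-posed toy problem: operator with rapidly decaying singular values,
    # a ground truth that is smooth with respect to those modes, small noise.
    n = 50
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    s = 0.9 ** np.arange(n)                       # decaying spectrum
    A = U @ np.diag(s) @ V.T
    x = V @ (1.0 / (1.0 + np.arange(n)) ** 2)     # decaying mode coefficients
    y = A @ x + 1e-3 * rng.standard_normal(n)

    # Spectral regularization: reconstruct with filtered singular values
    # g(s) = s / (s^2 + lam) instead of the naive 1/s.
    lam = 1e-4
    coef = U.T @ y
    x_filt = V @ (s / (s ** 2 + lam) * coef)
    x_pinv = V @ (coef / s)                       # unregularized inverse

    err_filt = np.linalg.norm(x_filt - x)
    err_pinv = np.linalg.norm(x_pinv - x)
    print(err_filt, err_pinv)
    ```

    Learning the method amounts to choosing the per-mode filter values from training data rather than fixing λ by hand, which is what produces the smoothing behavior analyzed in the paper.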

  • Kevin Bui, Yifei Lou, Fredrick Park, Jack Xin

    In this paper, we design an efficient, multi-stage image segmentation framework that incorporates a weighted difference of anisotropic and isotropic total variation (AITV). The segmentation framework generally consists of two stages: smoothing and thresholding, thus referred to as smoothing-and-thresholding (SaT). In the first stage, a smoothed image is obtained by an AITV-regularized Mumford-Shah (MS) model, which can be solved efficiently by the alternating direction method of multipliers (ADMM) with a closed-form solution of a proximal operator of the ℓ1 − αℓ2 regularizer. The convergence of the ADMM algorithm is analyzed. In the second stage, we threshold the smoothed image by K-means clustering to obtain the final segmentation result. Numerical experiments demonstrate that the proposed segmentation framework is versatile for both grayscale and color images, efficient in producing high-quality segmentation results within a few seconds, and robust to input images that are corrupted with noise, blur, or both. We compare the AITV method with its original convex TV and nonconvex TV^p (0 < p < 1) counterparts, showcasing the qualitative and quantitative advantages of our proposed method.
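    The second, thresholding stage of an SaT pipeline is simple to sketch: cluster the intensities of the smoothed image with K-means and assign each pixel its cluster label. This generic sketch replaces the first (AITV-smoothing) stage with a synthetic pre-smoothed image; all details are assumptions for illustration.

    ```python
    import numpy as np

    def kmeans_threshold(img, K=2, iters=50):
        """SaT stage two: K-means on pixel intensities, returning a label
        image and the cluster centers (a generic thresholding sketch)."""
        v = img.ravel().astype(float)
        centers = np.linspace(v.min(), v.max(), K)
        for _ in range(iters):
            labels = np.argmin(np.abs(v[:, None] - centers[None, :]), axis=1)
            for k in range(K):
                if np.any(labels == k):
                    centers[k] = v[labels == k].mean()
        return labels.reshape(img.shape), centers

    # A noisy two-phase "image": dark left half, bright right half.
    rng = np.random.default_rng(0)
    img = np.hstack([np.zeros((32, 16)), np.ones((32, 16))])
    img += 0.05 * rng.standard_normal(img.shape)
    seg, centers = kmeans_threshold(img, K=2)
    print(centers, seg[:, :16].mean(), seg[:, 16:].mean())
    ```

    In the full framework the quality of this step hinges on stage one: the better the AITV-regularized smoothing separates the intensity modes, the cleaner the K-means partition.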

  • Wei Zhu

    In this work, we propose a second-order model for image denoising by employing a novel potential function recently developed in Zhu (J Sci Comput 88: 46, 2021) for the design of a regularization term. Owing to this new regularizer based on second-order derivatives, the model is able to alleviate the staircase effect and preserve image contrast. The augmented Lagrangian method (ALM) is utilized to minimize the associated functional, and convergence analysis is established for the proposed algorithm. Numerical experiments are presented to demonstrate the features of the proposed model.

  • Paula Chen, Jérôme Darbon, Tingwei Meng

    Two of the main challenges in optimal control are solving problems with state-dependent running costs and developing efficient numerical solvers that are computationally tractable in high dimensions. In this paper, we provide analytical solutions to certain optimal control problems whose running cost depends on the state variable and with constraints on the control. We also provide Lax-Oleinik-type representation formulas for the corresponding Hamilton-Jacobi partial differential equations with state-dependent Hamiltonians. Additionally, we present an efficient, grid-free numerical solver based on our representation formulas, which is shown to scale linearly with the state dimension and thus to overcome the curse of dimensionality. Using existing optimization methods and the min-plus technique, we extend our numerical solvers to address more general classes of convex and nonconvex initial costs. We demonstrate the capabilities of our numerical solvers using implementations on a central processing unit (CPU) and a field-programmable gate array (FPGA). In several cases, our FPGA implementation achieves over a 10x speedup compared to the CPU, which demonstrates the promising performance boosts FPGAs can achieve. Our numerical results show that our solvers have the potential to serve as a building block for solving broader classes of high-dimensional optimal control problems in real time.
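    Why representation formulas sidestep grids: they express u(x, t) pointwise as an optimization over candidate trajectories, so no mesh in the state variable is needed. The sketch below evaluates the classical Lax-Oleinik formula for the state-independent Hamiltonian H(p) = |p|²/2 by crude sampling; it is a stand-in for the paper's analytical and optimization-based solvers, and all parameters are assumptions.

    ```python
    import numpy as np

    def lax_oleinik(g, x, t, n_samples=20000, radius=2.0, seed=0):
        """Evaluate u(x, t) = min_y [ g(y) + |x - y|^2 / (2 t) ], the
        Lax-Oleinik formula for H(p) = |p|^2 / 2, pointwise and grid-free
        by sampling candidate minimizers y near x."""
        rng = np.random.default_rng(seed)
        y = x + radius * rng.uniform(-1, 1, (n_samples, len(x)))
        vals = g(y) + np.sum((y - x) ** 2, axis=1) / (2 * t)
        return vals.min()

    # Quadratic initial cost g(y) = |y|^2 / 2; the exact viscosity solution
    # of u_t + |Du|^2 / 2 = 0 is u(x, t) = |x|^2 / (2 (1 + t)).
    g = lambda y: 0.5 * np.sum(y ** 2, axis=1)
    x, t = np.array([1.0, -0.5, 2.0]), 1.0    # works in any state dimension
    u_num = lax_oleinik(g, x, t)
    u_exact = 0.5 * np.sum(x ** 2) / (1 + t)
    print(u_num, u_exact)
    ```

    The per-point cost grows with the sample budget but not with a grid in the state dimension, which is the sense in which such solvers escape the curse of dimensionality.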

  • Daniil Bochkov, Frederic Gibou

    We consider the inverse problem of finding guiding pattern shapes that result in desired self-assembly morphologies of block copolymer melts. Specifically, we model polymer self-assembly using the self-consistent field theory and derive, in a non-parametric setting, the sensitivity of the dissimilarity between the desired and the actual morphologies to arbitrary perturbations in the guiding pattern shape. The sensitivity is then used for the optimization of the confining pattern shapes such that the dissimilarity between the desired and the actual morphologies is minimized. The efficiency and robustness of the proposed gradient-based algorithm are demonstrated in a number of examples related to templating vertical interconnect accesses (VIA).

  • Jingrun Chen, Weinan E, Yifei Sun

    Machine learning has been widely used for solving partial differential equations (PDEs) in recent years, among which the random feature method (RFM) exhibits spectral accuracy and can compete with traditional solvers in terms of both accuracy and efficiency. However, the optimization problem arising in the RFM is potentially more difficult to solve than those in traditional methods. Unlike the broader machine-learning research, which frequently targets tasks within the low-precision regime, our study focuses on the high-precision regime crucial for solving PDEs. In this work, we study this problem from the following aspects: (i) we analyze the coefficient matrix that arises in the RFM by studying the distribution of its singular values; (ii) we investigate whether continued training causes overfitting; (iii) we test direct and iterative methods as well as randomized methods for solving the optimization problem. Based on these results, we find that direct methods are superior to other methods if memory is not an issue, while iterative methods typically have low accuracy and can be improved by preconditioning to some extent.
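    A minimal RFM sketch makes the least-squares structure and its conditioning concrete: random tanh features with closed-form second derivatives are collocated on a two-point boundary value problem and solved directly with an SVD-based least-squares routine. The feature distribution, boundary weighting, and problem choice are assumptions for this example, not the paper's setup.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Random feature method for -u'' = f on [0, 1], u(0) = u(1) = 0, with
    # f(x) = pi^2 sin(pi x), so the exact solution is u(x) = sin(pi x).
    M, N = 400, 100                           # collocation points, features
    x = np.linspace(0, 1, M)
    w = rng.uniform(1, 20, N)
    b = rng.uniform(-20, 1, N)

    z = np.outer(x, w) + b
    phi = np.tanh(z)
    # d^2/dx^2 tanh(w x + b) = -2 tanh(z) (1 - tanh(z)^2) w^2
    phi_xx = -2 * np.tanh(z) * (1 - np.tanh(z) ** 2) * w ** 2

    f = np.pi ** 2 * np.sin(np.pi * x)
    # Stack PDE-residual rows and weighted boundary rows into one system.
    A = np.vstack([-phi_xx, 100.0 * phi[[0, -1]]])
    rhs = np.concatenate([f, [0.0, 0.0]])

    # Direct solve (SVD-based lstsq): the route the paper recommends when
    # memory allows, since the system is badly conditioned.
    c, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    u = phi @ c
    err = np.max(np.abs(u - np.sin(np.pi * x)))
    s = np.linalg.svd(A, compute_uv=False)
    print(f"max error {err:.2e}, condition number {s[0]/s[-1]:.2e}")
    ```

    Printing the singular values shows the wide spread that makes plain iterative solvers stall at low accuracy unless preconditioned, consistent with finding (iii).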