Compressed sensing (CS) enables the direct acquisition of compressed measurements and the faithful recovery of sparse or compressible signals even when the sampling rate is far below the Nyquist rate. However, purely random sensing matrices usually require huge memory for storage and incur high computational cost during signal reconstruction. Many structured sensing matrices have therefore been proposed recently to simplify the sensing scheme and its hardware implementation in practice. Based on the restricted isometry property and coherence, this paper reviews a number of existing structured sensing matrices that combine special structure with high recovery performance and offer advantages such as simple construction, fast computation, and easy hardware implementation. The number of measurements and the universality of the different structured matrices are also compared.
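As a concrete illustration (not taken from the survey itself), the sketch below builds one common structured sensing matrix: a randomly subsampled partial circulant matrix. A single random generator row defines the whole matrix, so storage drops from O(mn) to O(n), and matrix-vector products could use the FFT; the dimensions and the 3-sparse test signal are arbitrary choices.

```python
import numpy as np

def partial_circulant(m, n, rng=np.random.default_rng(0)):
    """Subsampled circulant sensing matrix: one generator row, m kept rows."""
    g = rng.standard_normal(n)                        # single generator row
    C = np.stack([np.roll(g, k) for k in range(n)])   # full circulant matrix
    rows = rng.choice(n, size=m, replace=False)       # keep m random rows
    return C[rows] / np.sqrt(m)                       # column-normalized

A = partial_circulant(m=64, n=256)
x = np.zeros(256); x[[3, 70, 200]] = [1.0, -2.0, 0.5]  # 3-sparse signal
y = A @ x                                              # compressed measurements
print(A.shape, y.shape)  # (64, 256) (64,)
```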
In this paper, we study a method for isolated handwritten or hand-printed character recognition that uses dynamic programming to match the non-linear multi-projection profiles produced by the Radon transform. The idea is to use the dynamic time warping (DTW) algorithm to match corresponding pairs of Radon features over all possible projections. Using DTW avoids compressing the feature matrix into a single vector, which may lose information. The method can handle character images of the different shapes and sizes that commonly occur in natural handwriting, as well as difficulties such as multi-class similarities, deformations, and possible defects. In addition, a comprehensive study is made by applying a major set of state-of-the-art shape descriptors to several character and numeral datasets from different scripts such as Roman, Devanagari, Oriya, Bangla, and Japanese-Katakana, including symbols. For all scripts, the method shows generic behaviour by providing optimal recognition rates, but at high computational cost.
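A minimal sketch of the matching step, assuming each Radon projection has already been reduced to a 1-D profile: classic DTW aligns two profiles of different lengths, and a character-level cost sums the per-angle DTW costs, as the abstract describes. The helper names and the toy profiles are illustrative.

```python
import numpy as np

def dtw(p, q):
    """Dynamic time warping cost between two 1-D profiles."""
    n, m = len(p), len(q)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(p[i - 1] - q[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def match_cost(profiles_a, profiles_b):
    """Total cost over corresponding projection angles (assumed pairing)."""
    return sum(dtw(a, b) for a, b in zip(profiles_a, profiles_b))

# Two characters sampled at different lengths, two projection angles each.
a = [np.sin(np.linspace(0, 3, 40)), np.cos(np.linspace(0, 3, 40))]
b = [np.sin(np.linspace(0, 3, 55)), np.cos(np.linspace(0, 3, 55))]
print(match_cost(a, b))
```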
Light field cameras are becoming popular in computer vision and graphics, and many research and commercial applications have already been proposed. Various types of cameras have been developed, the camera array being one way of acquiring a 4D light field image using multiple cameras. Camera calibration is essential, since each application requires the correct projection and ray geometry of the light field. The calibrated parameters are used to rectify a light field image from the images captured by the multiple cameras. Various camera calibration approaches have been proposed for a single camera, multiple cameras, and a moving camera. Although these approaches can be applied to calibrating camera arrays, they are not effective in terms of accuracy and computational cost. Moreover, little attention has been paid to the calibration of light field cameras. In this paper, we propose a calibration method for a camera array and a rectification method for generating a light field image from the captured images. We propose a two-step algorithm consisting of closed-form initialization and nonlinear refinement, which extends Zhang's well-known method to the camera array. More importantly, we introduce a rigid camera constraint, whereby the cameras are rigidly aligned in the array, and utilize this constraint in our calibration. Using this constraint, we obtained much faster and more accurate calibration results in our experiments.
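A minimal sketch of the two-step structure only, assuming OpenCV and a planar checkerboard target: step 1 runs Zhang-style closed-form initialization per camera; step 2, the joint nonlinear refinement under the paper's rigid-array constraint, is indicated only in comments, since it requires a custom bundle adjustment not shown here.

```python
import cv2

def init_camera(object_pts, image_pts, image_size):
    """Step 1: Zhang-style initialization for one camera in the array.

    object_pts / image_pts: per-view lists of checkerboard correspondences.
    """
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        object_pts, image_pts, image_size, None, None)
    return K, dist, rvecs, tvecs

# Step 2 (refinement under the rigid-array constraint) would jointly optimize
# one array pose per shot plus a fixed camera-to-array offset per camera,
# e.g. with scipy.optimize.least_squares over the total reprojection error.
```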
A critical issue in image interpolation is preserving edge detail and texture information when zooming. In this paper, we propose a novel adaptive image zooming algorithm using weighted least-square estimation that can achieve arbitrary integer-ratio zoom (WLS-AIZ). For a given zooming ratio n, every pixel in a low-resolution (LR) image is associated with an n × n block of high-resolution (HR) pixels in the HR image. In WLS-AIZ, the LR image is first interpolated using the bilinear method. The model parameters of every n × n block are then estimated through weighted least-square estimation. Subsequently, each pixel in the n × n block is replaced by a combination of its eight neighboring HR pixels using the estimated parameters. Finally, a refinement strategy is adopted to obtain the final HR pixel values. The proposed algorithm adapts strongly to local image structure. Extensive experiments comparing WLS-AIZ with other state-of-the-art image zooming methods demonstrate its superiority. In terms of peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and feature similarity index (FSIM), WLS-AIZ produces better results than all other integer-ratio image zoom algorithms.
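A minimal sketch of the core numerical step only: a weighted least-squares solve, of the kind used per n × n block to estimate combination weights for the eight neighboring HR pixels. The design matrix X, targets y, and sample weights w are assumed to come from the bilinear pre-interpolation described above; the toy data here is random.

```python
import numpy as np

def weighted_lstsq(X, y, w):
    """Solve argmin_b || W^(1/2) (X b - y) ||_2 via a reweighted lstsq."""
    sw = np.sqrt(w)
    b, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
    return b

# Example: 9 observations, 8 neighbor weights, distance-based sample weights.
rng = np.random.default_rng(0)
X = rng.standard_normal((9, 8)); y = rng.standard_normal(9)
w = np.linspace(1.0, 0.2, 9)       # nearer samples weighted more (assumed)
print(weighted_lstsq(X, y, w))
```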
Image denoising is a basic but important problem in image processing. Most existing methods perform well only when the image complies with their underlying assumptions; in particular, when more than one kind of noise is present, most existing methods may fail. To address this problem, we propose a two-step image denoising method motivated by statistical learning theory. Under the proposed framework, the type and variance of the noise are first estimated with a support vector machine (SVM), and this information is then used by the proposed denoising algorithm to further improve its performance. Finally, a comparative study is conducted to demonstrate the advantages and effectiveness of the proposed method.
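A minimal sketch, assuming scikit-learn and SciPy, of the first step only: classifying the noise type from simple residual statistics with an SVM. The feature choice here (residual standard deviation and kurtosis) and the two-sample toy training set are illustrative assumptions, not the paper's design.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.stats import kurtosis
from sklearn.svm import SVC

def noise_features(img):
    resid = img - median_filter(img, size=3)   # high-frequency residual
    return [resid.std(), kurtosis(resid.ravel())]

rng = np.random.default_rng(0)
clean = rng.random((32, 32))
gauss = clean + rng.normal(0, 0.1, clean.shape)           # Gaussian noise
sp = clean.copy(); mask = rng.random(clean.shape) < 0.1
sp[mask] = rng.integers(0, 2, mask.sum())                 # salt-and-pepper

X = [noise_features(gauss), noise_features(sp)]
clf = SVC().fit(X, ["gaussian", "salt_pepper"])           # toy training set
print(clf.predict([noise_features(gauss)]))               # -> ['gaussian']
```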
Segmentation accuracy of dermoscopy images is important in the computer-aided diagnosis of skin cancer, and a wide variety of segmentation methods for dermoscopy images have been developed. Considering that each method has its strengths and weaknesses, a novel adaptive segmentation framework based on a multi-classification model is proposed for dermoscopy images. First, five patterns of images are identified according to the factors influencing segmentation. A matching relation is then established between each image pattern and its optimal segmentation method. Next, a given image is classified into one of the five patterns by the multi-classification model, which is based on a BP neural network. Finally, the optimal segmentation method for the image is selected according to the matching relation, and the image is segmented accordingly. Experiments show that the proposed method delivers more accurate and more robust segmentation results than seven other state-of-the-art methods.
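A minimal sketch of the dispatch idea, assuming scikit-learn: a small feedforward (backpropagation-trained) network assigns an image-feature vector to one of five patterns, and a lookup table routes the image to that pattern's segmenter. The feature extraction, labels, and the five segmenters are placeholders, not the paper's.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Placeholder segmenters, one per image pattern 0..4.
SEGMENTERS = {p: (lambda img, p=p: f"segmented with method {p}") for p in range(5)}

rng = np.random.default_rng(0)
X_train = rng.random((100, 16))            # placeholder image features
y_train = rng.integers(0, 5, 100)          # pattern labels 0..4
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X_train, y_train)

features = rng.random((1, 16))             # features of the given image
pattern = int(clf.predict(features)[0])
print(SEGMENTERS[pattern]("image"))        # run the matched segmenter
```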
Scalable video quality enhancement refers to the process of enhancing low-quality frames using high-quality ones in scalable video bitstreams with time-varying quality. A key problem in this enhancement is how to find correspondences between high-quality and low-quality frames. Previous algorithms usually use block-based motion estimation to search for correspondences. Such an approach can hardly estimate scale and rotation transforms and always introduces outliers into the motion estimation results. In this paper, we propose a pixel-based, outlier-free motion estimation algorithm to solve this problem. In our algorithm, the motion vector for each pixel is calculated so as to estimate translation, scale, and rotation transforms. The motion relationships between neighboring pixels are modeled via a Markov random field to improve motion estimation accuracy. Outliers are detected and avoided by taking both blocking effects and the matching percentage in the scale-invariant feature transform field into consideration. Experiments are conducted in two scenarios exhibiting spatial scalability and quality scalability, respectively. The results demonstrate that, compared with previous algorithms, the proposed algorithm achieves better correspondences and avoids introducing outliers, especially for videos with scale and rotation transforms.
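A minimal sketch of the Markov-random-field ingredient only, not the paper's full algorithm: per-pixel motion candidates with data costs are smoothed by a pairwise MRF, optimized here with iterated conditional modes (ICM). The cost tensor and the Potts-style smoothness weight lam are assumptions.

```python
import numpy as np

def icm(data_cost, lam=1.0, iters=5):
    """data_cost[i, j, l]: matching cost of motion candidate l at pixel (i, j)."""
    H, W, L = data_cost.shape
    labels = data_cost.argmin(axis=2)          # initialize from data term only
    for _ in range(iters):
        for i in range(H):
            for j in range(W):
                cost = data_cost[i, j].copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        # Potts penalty for disagreeing with each neighbor.
                        cost += lam * (np.arange(L) != labels[ni, nj])
                labels[i, j] = cost.argmin()
    return labels

print(icm(np.random.default_rng(0).random((8, 8, 4))))
```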
Although the distance between binary codes can be computed quickly in Hamming space, linear search is not practical for large-scale datasets. Attention has therefore been paid to the efficiency of approximate nearest neighbor search, in which hierarchical clustering trees (HCT) are widely used. However, HCT select cluster centers randomly and build indexes with the entire binary code, which degrades search performance. In this paper, we first propose a new clustering algorithm that chooses cluster centers on the basis of relative distances and builds the hierarchical clustering trees with a more homogeneous partition of the dataset than HCT. We then present an algorithm that compresses binary codes by extracting distinctive bits according to the standard deviation of each bit. Consequently, a new index is proposed that uses compressed binary codes based on a hierarchical decomposition of binary spaces. Experiments conducted on reference datasets and on a dataset of one billion binary codes demonstrate the effectiveness and efficiency of our method.
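A minimal sketch of the bit-selection idea: keep the bit positions whose values vary most across the dataset (highest standard deviation), since near-constant bits carry little discriminative information. The number of kept bits k and the array layout are assumptions, not the paper's exact scheme.

```python
import numpy as np

def compress_codes(codes, k):
    """codes: (N, B) array of 0/1 bits; returns (N, k) codes + kept indices."""
    std = codes.std(axis=0)                 # per-bit standard deviation
    keep = np.argsort(std)[::-1][:k]        # k most distinctive bit positions
    return codes[:, keep], keep

rng = np.random.default_rng(0)
codes = (rng.random((1000, 64)) < 0.5).astype(np.uint8)
codes[:, :8] = 0                            # constant bits: zero std, dropped
short, keep = compress_codes(codes, k=32)
print(short.shape, keep[:5])                # (1000, 32) and no index < 8
```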
As smartphones become increasingly popular, users are provided with various interface styles featuring differently designed icons. The icon, as an important component of the user interface, is expected to make interaction more efficient and pleasurable. However, compared with desktop computers, few design principles have been proposed for smartphone icons. This paper investigates the effects of icon background shape and the figure/background area ratio on visual search performance and user preference. Icon figures combined with six geometric background shapes and five figure/background area ratios were studied on three different screens in experiments with 40 subjects. An analysis of variance (ANOVA) showed that both independent variables (background shape and figure/background area ratio) significantly affected visual search performance and user preference. On 3.5-in (1 in = 0.0254 m) and 4.0-in displays, a unified background is optimal; shapes such as the square, the circle, and transitions between them (e.g., rounded square, squircle) are recommended because backgrounds of these shapes yield better search times and higher subjective satisfaction with respect to ease of use, search, and visual preference. A 60% figure/background area ratio is most appropriate for smartphone icon design on the 3.5-in screen, while a 50% ratio offers relatively good search performance and user preference on the 4.0-in screen. On the 4.7-in screen, using the icon figure directly is recommended, as it gives better performance and preference than icons with backgrounds.
In this paper, we propose a novel framework for encrypting surveillance videos. Although a few encryption schemes have been proposed in the literature, they are not sufficiently efficient because they do not fully exploit a key characteristic of surveillance videos, namely their intensive global redundancy. By taking advantage of this redundancy, we design a novel method for encrypting such videos. We first train a background dictionary from several frame observations. Every single frame is then parsed into background and foreground components. This separation is the key to the efficiency of the proposed technique, since encryption is carried out only on the foreground, while the background is compactly recorded by the corresponding background recovery coefficients. Experimental results demonstrate that, compared with the state of the art, the proposed method is robust to known cryptanalytic attacks and enhances overall security thanks to the foreground/background separation. Additionally, our encryption method is faster than competing methods, which do not perform foreground extraction.
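A minimal sketch of the separate-then-encrypt idea, with a stand-in for every component: the "dictionary" here is just the top SVD atoms of a few background frames, the foreground is the reconstruction residual, and the cipher is a toy XOR keystream (a real system would use an established cipher such as AES-CTR). None of this is the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)
frames = rng.random((10, 64 * 64))                  # background observations
mean = frames.mean(axis=0)
_, _, Vt = np.linalg.svd(frames - mean, full_matrices=False)
D = Vt[:5]                                          # background "dictionary"

frame = frames[0] + (rng.random(64 * 64) < 0.01)    # frame + sparse foreground
coeffs = D @ (frame - mean)                         # background recovery coeffs
background = mean + D.T @ coeffs
foreground = frame - background                     # only this gets encrypted

fg_bytes = np.round((foreground - foreground.min()) * 255 /
                    (np.ptp(foreground) + 1e-9)).astype(np.uint8)
key = rng.integers(0, 256, fg_bytes.size, dtype=np.uint8)
cipher = fg_bytes ^ key                             # toy stream cipher
plain = cipher ^ key                                # decrypt: XOR again
print(np.array_equal(plain, fg_bytes))              # True
```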
The information rate is an important metric of the performance of a secret-sharing scheme. In this paper we consider the 272 non-isomorphic connected graph access structures with nine vertices and eight or nine edges, and either determine or bound the optimal information rate in each case. We obtain exact values of the optimal information rate in 231 cases and present a method for deriving information-theoretic upper bounds on the optimal information rate. Moreover, we apply several constructions to determine lower bounds on the information rate. We conclude with a full listing of the known optimal information rates (or bounds on them) for all 272 graph access structures on nine participants.
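For reference, the standard definition of the quantity being bounded (textbook material, not specific to this paper): for a scheme over participant set $P$ with secret $S$ and share $S_p$ given to participant $p$, the information rate compares the secret size with the largest share,

```latex
\[
  \rho \;=\; \frac{H(S)}{\max_{p \in P} H(S_p)},
\]
```

so $\rho \le 1$, a scheme attaining $\rho = 1$ is called ideal, and the optimal information rate of an access structure is the supremum of $\rho$ over all schemes realizing it.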
This paper proposes a new method for analyzing Ethernet performance. Most existing studies of Ethernet performance assume that the channel is divided into time slots or that the network load is saturated, paying little attention to the non-slotted channel and non-saturated conditions, even though this latter situation is more consistent with the practical operation of Ethernet. This paper first calculates the original collision probability and the retransmission collision probability under the original load, then obtains the retransmission load of the network from these two collision probabilities, and finally derives the actual load of the network by an iterative method. The accuracy of the analysis is verified against simulation results.
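A minimal fixed-point sketch of the iteration's shape only, not the paper's formulas: the collision model below (p = 1 - exp(-2G), as in unslotted ALOHA) is a placeholder for the paper's original and retransmission collision probabilities, and the total load is modeled as the offered load plus the load regenerated by collisions.

```python
import math

def actual_load(G0, collision_prob, tol=1e-9, max_iter=1000):
    """Iterate G = G0 + G * p(G) until the network load converges."""
    G = G0
    for _ in range(max_iter):
        G_next = G0 + G * collision_prob(G)   # offered + retransmission load
        if abs(G_next - G) < tol:
            return G_next
        G = G_next
    return G

p = lambda G: 1 - math.exp(-2 * G)            # placeholder collision model
print(actual_load(0.1, p))                    # converges to about 0.13
```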