In this study, we detect several common types of faults in power transmission lines using an object detection algorithm. Two problems with this algorithm must be solved: a single object can receive multiple labels, and the detection capability for small objects is low. Several methods address the first problem: the traditional non-maximum suppression (NMS) algorithm handles general objects [10], the polygonal non-maximum suppression algorithm is used for curved text detection [16], and the mask non-maximum suppression algorithm is used for oriented scene text detection based on segmentation [17]. Several methods likewise improve the detection of small objects, whose length and width are less than 5% of the original scale: feature pyramid networks predict objects by fusing different feature layers [9], the single shot detector (SSD) generates anchors on multiple feature maps [13], and Cascade region-based CNN (Cascade R-CNN) provides a multi-regression architecture to train high-quality detectors [14]. The structure of the detection network is presented in Fig. 1.
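
To make the multiple-label problem concrete, the traditional NMS step mentioned above can be sketched as follows. This is a minimal illustration of generic greedy NMS, not the implementation used in this study. It assumes axis-aligned boxes in (x1, y1, x2, y2) form, and the function names `iou` and `nms`, the threshold of 0.5, and the sample boxes are all hypothetical choices made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    iou_threshold=0.5 is an assumed value chosen for illustration.
    """
    # Candidate indices sorted by descending confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard remaining boxes that overlap the kept box too strongly.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Hypothetical detections: two overlapping boxes on one object, plus one
# separate box elsewhere in the image.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

In this example the second box overlaps the first with an IoU of roughly 0.82, so it is suppressed and only the highest-scoring box for that object is retained, together with the separate third box.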