Dual channel and multi-scale adaptive morphological methods for infrared small targets

Infrared small target detection is a challenging task. Morphological operators that use a single structural element size are easily affected by complex background noise, and their detection performance degrades when the background noise is multi-scale. To improve the detection of infrared small targets, we propose a dual channel and multi-scale adaptive morphological method (DMAM) consisting of three stages. Stages 1 and 2 mainly suppress background noise, while stage 3 mainly enhances the small target area. The multi-scale adaptive morphological operators improve the algorithm's adaptability to complex backgrounds, and a dual channel module further eliminates background noise. Experimental results indicate that the method outperforms the comparison methods both quantitatively and qualitatively, and ablation experiments demonstrate the effectiveness of each stage and module. The code and data of the paper are available at https://pan.baidu.com/s/19psdwJoh-0MpPD41g6N_rw


Introduction
With the development of computer vision and infrared imaging technology, infrared search and tracking (IRST) systems have found wide application in guidance, early warning, and traffic safety. Infrared small target detection plays a vital role in the performance of IRST systems, so detection algorithms have attracted wide attention from researchers [1][2][3]. In practical applications, however, several problems usually arise. First, because of the long imaging distance, weak infrared targets are small, have weak signals, and show few texture features, which makes them difficult to detect directly [4,5]. Second, the near-Earth background is complex, and the target is often submerged in clutter and noise, so false alarms and false detections are common [6,7]. Finally, limits on data volume and transmission speed make deployment difficult, while practical applications place high demands on the real-time performance of the detection algorithm [8][9][10].
Infrared weak and small target detection algorithms are mainly divided into sequence-based and single-frame-based methods. Sequence-based detection uses the continuity and correlation of moving targets across multiple frames. Single-frame detection extracts characteristics such as the gradient, grayscale, and contrast of small targets from a single image and detects them through target enhancement or background suppression. Compared with multi-frame detection, it has the advantages of low complexity, high execution efficiency, and easy hardware implementation. Single-frame weak target detection algorithms fall mainly into three categories: filter-based [11][12][13], human visual system based [14][15][16][17][18][19][20][21][22][23], and low-rank sparse matrix recovery [24][25][26][27].
In addition, morphological operations perform well in small target detection, but the results of the top-hat algorithm depend heavily on the structural elements, and traditional structural elements are fixed in size and shape, which makes it difficult to adapt to targets of different shapes and sizes. To solve this problem, Wang et al. reconstructed the structural elements in four directions based on the ring top-hat transformation and the relationship between target and circularity, which improved the top-hat algorithm for small target detection [28]; Li et al. proposed a multidirectional improved top-hat filter (MITHF) to highlight potential small targets and suppress strong structural edge clutter [29]; Bai et al. used dilation and erosion to enhance the target area and suppress the surrounding background [30].
However, the studies above show that single-frame small target detection algorithms still struggle to balance real-time performance and detection rate. For example, NTHF [12] and ADMD [19] are fast but suppress the background poorly and have a low detection rate, while WSLCM [23] and NRAM [26] are robust to background noise and have a high detection rate but run too slowly, which limits their use in real scenes. Inspired by these algorithms, we use the information differences between images to build a multi-scale adaptive weighted morphological operator, which addresses the problem of choosing the structural element size for morphological operations in small target detection. We then design three detection stages and integrate a dual channel module, improving both the detection quality and the speed of infrared small target detection.
Our contributions are as follows: (1) We propose an efficient and robust infrared small target detection algorithm with three stages that makes full use of image operations to improve the detection of infrared small targets. The effectiveness of each stage is demonstrated through ablation experiments. (2) Previous studies have shown morphological operators to be effective for detecting small infrared targets, but choosing the size of the structural elements is often a challenge. We therefore define two structural elements whose weights are derived from the information differences between images: adaptive multi-scale weighted square structural elements and adaptive multi-scale weighted ring structural elements. In our experiments, the former is better suited to suppressing background noise, while the latter is better suited to accurately detecting small targets. (3) Because single channel infrared small target detection is prone to residual background noise, we design a dual channel module that eliminates background noise and enhances the infrared small target area.
This article is organized as follows. The "Related work" section discusses related work. The "Method" section summarizes some previous methods and presents the proposed algorithm. The "Experimental preparation" section describes the experimental setup and evaluation metrics. The "Experimental results and analysis" section reports a large number of experiments that verify the effectiveness of the proposed method. The "Conclusion" section concludes the paper.

Related work
We divide the related work into three categories: filter-based, human visual system based, and low-rank sparse matrix recovery.
Filter-based algorithms use pixel grayscale differences to highlight small targets and remove interference from the surrounding background noise. Deshpande et al. used a maximum median filter to detect small targets [11]; Bai et al. subsequently improved the top-hat algorithm for small target detection using a ring structure (NTHF) [12]; Bae et al. proposed an infrared weak target detection method based on bilateral filtering (BF) and edge direction analysis [13]. Although these methods can suppress simple backgrounds, they struggle when small targets lie in complex environments where the background grayscale changes strongly and noise is heavy.
Algorithms based on the human visual system (HVS) follow the principle that the human eye distinguishes a target from the background by contrast to obtain visually salient regions, and they highlight small infrared targets by constructing a saliency difference map between the target and the background. Chen et al. proposed the local contrast measure (LCM), which uses a nested window with eight directions to capture the maximum gray value of the target block and the average value of the surrounding background blocks [14]; Han et al. proposed an improved LCM that uses the average grayscale of the target block to suppress the background [15]; Qin et al. considered only the largest pixels when calculating the average grayscale and proposed NLCM [16]; Wei et al. combined the local gray differences of two corresponding directions to improve performance and proposed MPCM [17]; Han et al. proposed RLCM using a relative local contrast measure [18]; Moradi et al. used the absolute directional mean difference for small target detection and proposed ADMD [19]; Han et al. proposed TLLCM, which uses a three-layer filtering window to compute window center pixels between the enhancement core and the surrounding local background [20]. In addition, weighting functions have been used to improve performance, as in WLCM [21], RIL [22], and WSLCM [23], which combine weighting functions with local contrast to achieve more accurate detection results.
Algorithms based on low-rank sparse matrix recovery divide an infrared image into three components: the target, represented by a sparse matrix f_T; the background, represented by a low-rank matrix f_B; and a noise matrix f_N [24]. Qin et al. separated the original image matrix into a sparse matrix and a low-rank matrix, converting small target detection into a low-rank sparse matrix decomposition, but the processed image tends to retain background residuals [25]; Zhang et al. adopted the structural norm to eliminate strong residuals and proposed NRAM [26]; Zhang et al. proposed PSTNN by introducing non-convex low-rank constraints into detection [27].

Morphological operators
Mathematical morphology is widely used in the field of computer vision. It is based on two basic operations: dilation and erosion. If f(x, y) represents a grayscale image and b(m, n) represents the structural element, dilation and erosion are denoted f ⊕ b and f ⊖ b, respectively, where (x, y) are the coordinates of a pixel in the image and (m, n) is the offset of a pixel in the structural element relative to (x, y).
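For reference, the standard textbook definitions of grayscale dilation and erosion can be written in the notation above as follows. The sign convention of the offsets differs between references, so the exact form used by the authors is an assumption here.

```latex
\begin{aligned}
(f \oplus b)(x, y)  &= \max_{(m, n)} \bigl\{ f(x - m,\, y - n) + b(m, n) \bigr\}, \\
(f \ominus b)(x, y) &= \min_{(m, n)} \bigl\{ f(x + m,\, y + n) - b(m, n) \bigr\}.
\end{aligned}
```

In both expressions, (m, n) ranges over the support of the structural element b.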
If the structural element is flat, i.e. b(m, n) = 0, dilation and erosion reduce to the maximum and the minimum of f over the neighborhood covered by the structural element.
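The flat case can be illustrated with a minimal NumPy sketch. The helper names `flat_dilate` and `flat_erode` and the edge-replicated border handling are assumptions made for this illustration, and the footprint is assumed to be symmetric, so reflecting it (as the formal definition of dilation requires) makes no difference.

```python
import numpy as np


def _windows(img, footprint):
    """Return every footprint-shaped neighborhood of img (edge-replicated borders)."""
    kh, kw = footprint.shape
    top, left = kh // 2, kw // 2
    padded = np.pad(img, ((top, kh - 1 - top), (left, kw - 1 - left)), mode="edge")
    return np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))


def flat_dilate(img, footprint):
    """Flat dilation: maximum of img over the True cells of the footprint."""
    footprint = np.asarray(footprint, dtype=bool)
    img = np.asarray(img, dtype=float)
    return _windows(img, footprint)[..., footprint].max(axis=-1)


def flat_erode(img, footprint):
    """Flat erosion: minimum of img over the True cells of the footprint."""
    footprint = np.asarray(footprint, dtype=bool)
    img = np.asarray(img, dtype=float)
    return _windows(img, footprint)[..., footprint].min(axis=-1)


image = np.zeros((5, 5))
image[2, 2] = 1.0                      # a single bright pixel
square = np.ones((3, 3), dtype=bool)   # flat 3 x 3 square structural element

dilated = flat_dilate(image, square)   # the bright pixel grows into a 3 x 3 block
eroded = flat_erode(image, square)     # the bright pixel is removed
```

Dilation spreads bright details over the footprint, while erosion removes bright details smaller than the footprint, which is the property the later stages rely on.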
In this paper, a new morphological operator is proposed, and dilation and erosion are applied several times in the algorithm to weaken background noise while enhancing infrared small targets.

Multiscale adaptive morphological operators
Morphological operators perform strongly in infrared small target detection, but most current methods require a single structural element size to be chosen for the situation at hand, so their performance suffers as the background and the size of the target area change. We use the differences in background information between images to derive weights and construct multi-scale adaptive weighted morphological operators.
Among them, the adaptive square structure morphological operator is used in stage 2 to suppress multi-scale background noise, and the adaptive ring structure morphological operator is used in stage 3 to further enhance the small target area.

Adaptive square structure morphological operator
Because infrared small target images contain multi-scale background noise, a single structural element often gives poor results. In the adaptive square structure, we therefore apply four square structural operators of sizes 2-5 to the image and subtract the image before processing from each processed image to obtain the difference between them. The larger the difference, the more significant the processing effect. Finally, we weight the processed images according to these difference values to obtain the output image.
Specifically, we dilate the image b_0 separately with the four square structural operators of sizes 2-5, subtract the image before processing from each dilated image to measure the degree of difference between them, sum each difference image matrix, and obtain the output image by weighting based on these sums.
Here I_1 denotes the output image of this step, b_n denotes the image dilated by a square structural operator of size n, and w_n denotes the weight of that dilated image, which is derived from the difference between b_n and b_0.
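The weighting step can be sketched as follows. The weight formula is not reproduced in the text above, so this sketch assumes that each weight w_n is the summed difference between b_n and b_0, normalized over the four scales, and that the output is the weighted sum of the dilated images. The function names, the border handling, and the anchor of the even-sized windows are likewise assumptions.

```python
import numpy as np


def square_dilate(img, size):
    """Flat dilation with a size x size square (edge-replicated borders)."""
    before = size // 2
    after = size - 1 - before
    padded = np.pad(img, ((before, after), (before, after)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return windows.max(axis=(2, 3))


def adaptive_square_dilate(b0, sizes=(2, 3, 4, 5)):
    """Weighted multi-scale dilation; the weights follow the assumed normalization."""
    b0 = np.asarray(b0, dtype=float)
    dilated = [square_dilate(b0, n) for n in sizes]
    # Total change introduced by each scale (dilation never lowers a pixel value).
    totals = np.array([(b_n - b0).sum() for b_n in dilated])
    if totals.sum() == 0:
        # Perfectly flat image: fall back to equal weights (assumption).
        weights = np.full(len(sizes), 1.0 / len(sizes))
    else:
        weights = totals / totals.sum()
    output = sum(w * b_n for w, b_n in zip(weights, dilated))
    return output, weights


example = np.zeros((9, 9))
example[4, 4] = 1.0
output, weights = adaptive_square_dilate(example)
```

With this normalization, scales that change the image more contribute more to the output, which matches the statement that a larger difference indicates a more significant processing effect.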

Adaptive ring structure morphological operator
The ring structure proved effective for small target detection in NTHF [12], but its fixed size tends to weaken the detection of small targets, so we use an adaptively weighted ring structure in the third stage. Because the target area in small target detection usually measures 1 × 1, 3 × 3, 5 × 5, 7 × 7, or 9 × 9 pixels, we use five ring sizes: Bi and Bo, the radii of the inner and outer structural elements, take the five combinations (1, 2), (1, 4), (2, 7), (3, 10), and (4, 13) (hereinafter referred to as combinations 1-5, in that order). The purpose of the adaptive ring structure is to accurately identify small targets while eliminating the background residuals present in the image I_2(x, y) obtained in stage 2, so the structural element sizes must be chosen to suit infrared small targets.
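One way to build such ring structural elements is sketched below. The text does not specify the exact geometry, so this sketch assumes a square ring: a (2·Bo + 1) × (2·Bo + 1) square with the central (2·Bi + 1) × (2·Bi + 1) square removed. The names `ring_footprint` and `RING_COMBINATIONS` are hypothetical.

```python
import numpy as np

# (Bi, Bo) pairs for combinations 1-5, as listed in the text.
RING_COMBINATIONS = [(1, 2), (1, 4), (2, 7), (3, 10), (4, 13)]


def ring_footprint(inner_radius, outer_radius):
    """Boolean square ring: outer square minus the inner square (assumed geometry)."""
    side = 2 * outer_radius + 1
    footprint = np.ones((side, side), dtype=bool)
    gap = outer_radius - inner_radius
    footprint[gap:side - gap, gap:side - gap] = False  # hollow out the inner square
    return footprint


footprints = [ring_footprint(bi, bo) for bi, bo in RING_COMBINATIONS]
smallest = footprints[0]  # combination 1: 5 x 5 square with a 3 x 3 hole
```

A boolean footprint of this kind can be passed to any flat dilation or erosion routine that accepts an arbitrary neighborhood mask.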
Compared with the adaptive square structure, the adaptive ring structure adds a step that selects the range of structural element combinations: combination range 1-3 or combination range 3-5. The former is better suited to smaller targets, while the latter is suitable only for larger targets. The procedure for selecting the combination range is shown in Fig. 1.
First, we process I_2(x, y) with the three ring structural elements of combinations 1, 3, and 5, and subtract each processed image from I_2(x, y) to obtain the difference values c1, c3, and c5. Here i takes the values 1, 3, and 5, and I_3(x, y) represents the image processed with ring structural elements.
Then we calculate the absolute values of the differences between c1 and c3 and between c5 and c3. Because most of the background noise has been suppressed in the second stage, a larger absolute value means a greater change in the enhancement of small infrared targets within that range, and hence a higher likelihood that the range contains a structural element size suited to the small infrared targets in the image. After the appropriate combination range is selected, adaptive weighting is performed in the same way as for the adaptive square structure.
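The range selection described above amounts to a single comparison, sketched here. The sketch assumes that c1, c3, and c5 are scalar summaries of the difference images (for example, their sums) and that a tie selects the smaller range; both choices, and the function name, are assumptions.

```python
def select_combination_range(c1, c3, c5):
    """Pick the combination range whose end points differ most from c3."""
    small_range_change = abs(c1 - c3)   # change across combinations 1-3
    large_range_change = abs(c5 - c3)   # change across combinations 3-5
    if small_range_change >= large_range_change:
        return (1, 2, 3)
    return (3, 4, 5)


small_target_case = select_combination_range(c1=10.0, c3=4.0, c5=5.0)
large_target_case = select_combination_range(c1=5.0, c3=4.0, c5=10.0)
```

The returned tuple lists the combinations that would then enter the adaptive weighting step.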

Optimized Fourier bilateral filtering
The bilateral filter is a typical nonlinear filter. It preserves the edge information of an image by accounting for changes in pixel intensity, which effectively addresses the edge blurring caused by filtering [31]. In the bilateral filtering of an image f, w is a spatial domain kernel that reduces the influence of distant pixels on the pixel being updated, and φ is a pixel domain (range) kernel that reduces the influence of pixels whose gray values differ strongly from that of the pixel being updated. θ and σ are the Gaussian distance standard deviation and the Gaussian grayscale standard deviation, respectively. As Eq. (7) shows, the bilateral filter considers not only the Euclidean distance between the target pixel and the surrounding pixels but also the gray-level distance between them, so it represents and retains image edge information well.
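A brute-force version of the bilateral filter makes the roles of the two kernels concrete. This is a minimal sketch of the classical filter, not the optimized Fourier variant used in stage 1. It assumes Gaussian forms for both w and φ, and the window `radius` is a hypothetical parameter introduced for the illustration.

```python
import numpy as np


def bilateral_filter(img, theta, sigma, radius):
    """Classical bilateral filter with Gaussian spatial and range kernels."""
    img = np.asarray(img, dtype=float)
    offsets = np.arange(-radius, radius + 1)
    dy, dx = np.meshgrid(offsets, offsets, indexing="ij")
    spatial = np.exp(-(dx**2 + dy**2) / (2.0 * theta**2))        # kernel w
    padded = np.pad(img, radius, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, spatial.shape)
    diff = windows - img[:, :, None, None]                       # gray-level distance
    range_weight = np.exp(-(diff**2) / (2.0 * sigma**2))         # kernel phi
    weights = spatial * range_weight
    return (weights * windows).sum(axis=(2, 3)) / weights.sum(axis=(2, 3))


step = np.zeros((8, 8))
step[:, 4:] = 100.0                      # a sharp vertical edge
smoothed = bilateral_filter(step, theta=5.0, sigma=5.0, radius=2)
```

Because pixels across the edge differ by far more than σ, their range weights are negligible and the edge survives the smoothing, which is the edge-preserving behavior described above.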
Fig. 1 Adaptive multi-scale ring structure size interval selection
Sanjay et al. improved on the traditional bilateral filter by approximating the truncated Gaussian range kernel with a Fourier series, which allows the filter to be computed with fast convolutions, and proposed a model that optimizes the series coefficients by least squares fitting [32]. In the Fourier series approximation, the period is [−T, T] and K is the number of terms. We compared several improved bilateral filters, such as Fast Adaptive Bilateral Filtering [33] and Gaussian Adaptive Bilateral Filtering [34]. The optimized Fourier bilateral filter [32], however, uses least squares optimization for accurate computation, controls the filtering quality by adjusting the kernel error, and approximates the Gaussian kernel with a Fourier series. Among these algorithms, it gave the best image processing results for infrared small target detection. In our experiments, we used this model [32] as the preprocessing stage of our algorithm, which significantly suppresses the background area while preserving the target area and image edges.
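The idea of fitting a Fourier series to the Gaussian range kernel by least squares can be illustrated in isolation. This is a simplified sketch, not the algorithm of [32]: it fits a cosine series (sufficient because the kernel is even) on a uniform grid over [−T, T]. The function name, the sample count, and the values σ = 30 and T = 255 are illustrative assumptions.

```python
import numpy as np


def fit_fourier_gaussian(sigma, T, K, samples=513):
    """Least-squares cosine-series fit of exp(-t^2 / (2 sigma^2)) on [-T, T]."""
    t = np.linspace(-T, T, samples)
    target = np.exp(-t**2 / (2.0 * sigma**2))
    k = np.arange(K + 1)
    basis = np.cos(np.outer(t, k) * np.pi / T)        # columns: cos(k * pi * t / T)
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)
    max_error = np.abs(basis @ coeffs - target).max()  # kernel error on the grid
    return coeffs, max_error


coeffs_5, err_5 = fit_fourier_gaussian(sigma=30.0, T=255.0, K=5)
coeffs_20, err_20 = fit_fourier_gaussian(sigma=30.0, T=255.0, K=20)
```

Increasing K lowers the kernel error at the cost of more terms, and each term corresponds to additional convolutions in a Fourier-based bilateral filter, so K trades accuracy against speed.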

DMAM
DMAM consists of three stages, as shown in Fig. 2. Stage 1 applies optimized Fourier bilateral filtering to initially remove noise while preserving edges. In stage 2, we design an adaptively weighted square structure morphological operator to suppress the background and preliminarily identify small targets. In stage 3, we design an adaptively weighted ring structure morphological operator to identify small targets more accurately and obtain the final target image.
Stage 1 The optimized Fourier bilateral filter is used at this stage to smooth the image background while preserving the edges of the target area. Its two standard deviation parameters are set to 5 and 30, respectively.
Stage 2 Our main goal in this stage is to suppress background noise and enhance small targets in preparation for final recognition. We use a square structure operator here because, in our experiments, it suppresses background noise better.
First, because a fixed-size morphological operator adapts poorly to the image and its result is easily affected by small target size and complex background, we apply the adaptive weighted square structure dilation to the image output by the first stage, which effectively enhances the image contrast and suppresses the background noise.
Then we process the image with a 9 × 9 mean filter to further remove noise and edges.
Here O_1 represents the output image, h(x, y) represents the neighborhood operator, and f(x, y) represents the original image.
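The mean filtering step can be sketched as follows, assuming that O_1 is the convolution of f with a uniform 9 × 9 kernel h. The edge-replicated border handling and the function name are assumptions.

```python
import numpy as np


def mean_filter(img, size=9):
    """Uniform size x size averaging filter (edge-replicated borders)."""
    pad = size // 2
    padded = np.pad(np.asarray(img, dtype=float), pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return windows.mean(axis=(2, 3))


impulse = np.zeros((21, 21))
impulse[10, 10] = 81.0            # isolated bright pixel
blurred = mean_filter(impulse)    # spread evenly over a 9 x 9 block
```

An isolated bright pixel is flattened into a low plateau, so the mean-filtered image mostly retains the slowly varying background, which is what makes the subsequent subtraction useful.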
To obtain the difference between the images produced by these two steps, we subtract them, and finally we enhance the image contrast by multiplication.
Stage 3 We first apply adaptive ring structure morphological dilation to the image. The dilated image then undergoes an adaptive weighted ring structure erosion, which eliminates or weakens bright details. These two steps remove background residuals smaller than the structural elements. Finally, we apply two subtraction operations that exploit the differences between the images to enhance the area where the small target is located.
Dual channel Because the image is small, some background noise cannot be handled effectively during processing. We therefore add a dual channel module. The module first enlarges the image to twice its original size and then processes it through stages 2 and 3, which yields background noise processing results for the original image at a different scale. The processed image is then reduced to half its size, which restores the original dimensions, and multiplied by the output image of the original channel, which further suppresses background noise and enhances small target areas.
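The data flow of the dual channel module can be sketched as below. Stages 2 and 3 are represented by a caller-supplied function `process`. The interpolation methods are not stated in the text, so nearest-neighbor enlargement and 2 × 2 block averaging are assumptions, as is the function name.

```python
import numpy as np


def dual_channel(img, process):
    """Multiply the original-channel output by the output of a 2x-enlarged channel."""
    img = np.asarray(img, dtype=float)
    original_out = process(img)

    enlarged = np.kron(img, np.ones((2, 2)))      # each pixel becomes a 2 x 2 block
    enlarged_out = process(enlarged)

    h, w = img.shape
    # Average each 2 x 2 block to return to the original image size.
    reduced = enlarged_out.reshape(h, 2, w, 2).mean(axis=(1, 3))
    return original_out * reduced


sample = np.arange(12, dtype=float).reshape(3, 4)
combined = dual_channel(sample, process=lambda x: x)   # identity stands in for stages 2 and 3
```

Because the two channel outputs are multiplied, a residual must survive in both channels to remain in the result, which is why the module suppresses scale-dependent background noise.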

Data preparation and parameter setting
In our experiment, the test dataset consists of nine infrared image sequences, labeled Seq.1-9, each representing a typical scene in infrared small target detection, including ground, river, sky, and clouds. The datasets are described in Table 1, and Fig. 3 shows images from these test sequences, with the dim small targets marked by red rectangles.
To verify the performance of the proposed algorithm, we tested it on real infrared images and compared the results with several well-known algorithms: PSTNN [27], ADMD [19], NTHF [12], TLLCM [20], WSLCM [23], and NRAM [26]. All experiments were run in MATLAB 2016. The test machine runs 64-bit Windows 11 with an Nvidia GeForce RTX3060 graphics card, 16 GB of memory, and a 3.20 GHz processor.

Evaluation metrics
To quantitatively describe how much the algorithm improves the image signal-to-clutter ratio and how well it suppresses background clutter, we use the signal-to-clutter ratio (SCR) [35] and the background suppression factor (BSF). In their definitions, G_mt represents the grayscale mean of the target area, and G_mb and σ_b represent the grayscale mean and the grayscale standard deviation of the background area near the target area, respectively. Generally, the scale of the background area near the target area is three times that of the target area [19]. (σ_c)_in is the grayscale standard deviation of the original image, and (σ_c)_out is the grayscale standard deviation of the image processed by the algorithm. A larger SCR value means a higher signal-to-noise ratio and a more obvious target, and a larger BSF value means better suppression of background noise. In addition, we compared the running times of these algorithms in the experiment.
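Both metrics can be sketched in a few lines. The sketch assumes the commonly used forms SCR = |G_mt − G_mb| / σ_b and BSF = (σ_c)_in / (σ_c)_out, which match the symbols described above but are not reproduced from the paper. The function names, the box-style description of the target, and the handling of a zero standard deviation (returning infinity) are also assumptions.

```python
import numpy as np


def scr(img, top, left, height, width):
    """Signal-to-clutter ratio of a rectangular target (assumed standard form)."""
    img = np.asarray(img, dtype=float)
    # Background window: three times the target size, centered on the target,
    # clipped to the image, with the target itself excluded.
    r0, r1 = max(top - height, 0), min(top + 2 * height, img.shape[0])
    c0, c1 = max(left - width, 0), min(left + 2 * width, img.shape[1])
    background_mask = np.zeros(img.shape, dtype=bool)
    background_mask[r0:r1, c0:c1] = True
    background_mask[top:top + height, left:left + width] = False

    target = img[top:top + height, left:left + width]
    background = img[background_mask]
    signal = abs(target.mean() - background.mean())
    sigma_b = background.std()
    if sigma_b == 0:
        return float("inf") if signal > 0 else 0.0
    return float(signal / sigma_b)


def bsf(original, processed):
    """Background suppression factor: input std divided by output std."""
    sigma_in = np.asarray(original, dtype=float).std()
    sigma_out = np.asarray(processed, dtype=float).std()
    if sigma_out == 0:
        return float("inf")
    return float(sigma_in / sigma_out)


frame = np.zeros((9, 9))
frame[3:6, 3:6] = [[0, 2, 0], [2, 10, 2], [0, 2, 0]]   # target of value 10 at (4, 4)
scr_value = scr(frame, top=4, left=4, height=1, width=1)
bsf_value = bsf(frame, frame / 2.0)
```

Under these definitions, a perfectly flat background or a perfectly flat output image yields an infinite value.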

Target detection
In the experiment, we compared the proposed method with six other methods on Seq.1-9.
Figure 4 shows the 3D gray distribution of the detection results of these methods on Seq.1-9. In Seq.1, the background is simple, so most algorithms suppress the background while enhancing the target. In Seq.2, the background noise is complex and includes ground, cloud, and sky. Only the proposed method and WSLCM suppress the background while enhancing the target, and the proposed method performs comparatively better. In Seq.3, the background noise is more complex and includes river, ground, cloud, and sky. Only the proposed method successfully suppresses the background. In Seq.4, the target is located in a river with a complex background, and the proposed method performs strongly. In Seq.5, all methods except the proposed method and NRAM retain a large amount of clutter, and their results are unsatisfactory. In Seq.6-9, the background is simple, with only clouds and sky, and the target is not obstructed. Most algorithms therefore suppress the background well, but the proposed method is comparatively more robust: it suppresses the background strongly in these images and enhances the target significantly.
The results indicate that ADMD performs weakly on these small target detection images and cannot effectively suppress the background. PSTNN, NTHF, and TLLCM suppress the background well in some images with simple backgrounds, but they are sensitive to noise and cannot completely suppress noise in complex backgrounds. WSLCM effectively enhances targets, but it also cannot effectively suppress noise in complex backgrounds. NRAM performs comparatively better and suppresses most of the background noise, but clutter remains in its results for some small target images. Compared with the other methods, the proposed method performs satisfactorily: the target area is significantly enhanced, while non-target areas are suppressed to low, almost flat values. This supports the effectiveness of the adaptive morphological operators and the background subtraction process in the proposed method, which yield better background and clutter suppression and stronger target enhancement.

Quantitative analysis
To obtain more convincing results, we quantitatively compared SCR, BSF, and running time. Tables 2 and 3 list the average SCR and BSF values obtained by the different methods on Seq.1-9, where Inf denotes an infinite value. For both metrics, higher is better, and bold marks the best result for each sequence. The tables show that the proposed method achieved the highest values, indicating that it effectively suppresses the background while enhancing small targets.
To show that the proposed method is also competitive in efficiency, we compare the running times of these methods on Seq.1-9; the results are shown in Table 4. Among the comparison methods, ADMD and NTHF are the fastest, but neither processes Seq.1-9 satisfactorily, and both suppress background noise only weakly. TLLCM, WSLCM, and NRAM take much longer to run, which makes them hard to apply in practical scenarios. PSTNN and the proposed method have running times of the same order of magnitude, but the proposed method suppresses the background and enhances the target much better than PSTNN. The proposed method therefore offers better overall performance, which favors practical small target detection.

Ablation experiment
To verify the effectiveness of each stage and module, we conducted ablation experiments on stage 1, stage 2, stage 3, the dual channel module, and the multi-scale morphological operators.

Ablation experiments for each stage and dual channel module
Based on the results in Tables 5 and 6, the effectiveness of each stage and of the dual channel module is easy to see. After the image passes through stage 1, the infrared small target area is enhanced, which increases the SCR value. After stages 2 and 3, the SCR and BSF values increase significantly, indicating that the infrared small target is effectively detected and the background noise is greatly suppressed. The images output by the dual channel module also show good infrared small target detection performance, which further supports the effectiveness of this module. Table 7 shows the processing time of each stage and module.

Ablation experiment of multi-scale morphological operators
To demonstrate the effectiveness of the multi-scale adaptive morphological operators, we conducted ablation experiments comparing square structures of sizes 2-5 with the adaptive square structure. Only the square structure used in stage 2 of DMAM is changed in this experiment. In Tables 8 and 9, sizes 2-5 denote square structural elements of sizes 2-5, and adaptive denotes the adaptive square structure. The results show that the adaptive square structure has advantages in both SCR and BSF and gives the best result in most cases, which further supports the effectiveness of the multi-scale adaptive square structure.
To demonstrate the effectiveness of the adaptive ring structure, we tested different fixed combinations of structural elements: (3,4), (3,5), (4,5), and (5,5). Here (3,4) means that ring structural elements of size combination 3 are used in the original channel and ring structural elements of size combination 4 in the dual channel, and so on. In Tables 10 and 11, adaptive denotes the adaptive ring structure, and its effectiveness is further evaluated from the experimental results.

Conclusion
In this paper, we propose an adaptive weighted composite morphological operator to address the problems that traditional morphological algorithms are easily affected by complex background noise in small target detection, detect poorly, and depend strongly on the choice of structural element size. The algorithm uses the difference in background information between images to obtain weights, constructs adaptive weighted morphological operators, and further suppresses background noise through a dual channel module. In qualitative analysis, DMAM outperformed the other methods in suppressing complex background noise and enhancing small infrared target areas. In quantitative analysis, DMAM obtained Inf SCR and BSF values on all sequences except Seq.4, and on the complex Seq.4 it obtained an SCR of 1282.2 and a BSF of 8847.9. In summary, the algorithm takes both the detection rate and the speed of small target detection into account, which helps deployment and application in practical scenarios. The effectiveness of each stage and module has been demonstrated through ablation experiments.
Due to the particularity of infrared small targets, preprocessing them can usually improve detection performance.However, preprocessing methods often have the problem of slow running speed, which reduces algorithm efficiency.Therefore, we will further study preprocessing methods for infrared small targets in the future, such as improving filtering performance.

Fig. 2 The overall framework of DMAM (the target is marked with a red rectangle; the circled numbers indicate the steps of the algorithm)

Fig. 3 Test sequence images and their 3D grayscale distribution maps

Fig. 4 3D grayscale distribution of the results of different methods on Seq.1-9 (the first to ninth columns are Seq.1-9 in sequence)

Table 1
Details of real image sequences

Table 2
SCR of different IR sequences processed by different methods. Bold data indicates the best performance for this indicator.

Table 3
BSF of different IR sequences processed by different methods. Bold data indicates the best performance for this indicator.

Table 4
Running time (in s) of different methods. Bold data indicates the best performance for this indicator.

Table 5
SCR of each stage and module processing result. Bold data indicates the best performance for this indicator. In Tables 5 and 6, Original represents the original image; Stage.1, Stage.1&2, and Stage.1&2&3 are all processed in the original channel (without the dual channel module); and Dual channel represents the result of the dual channel output image alone (not multiplied with the original channel image).

Table 6
BSF of each stage and module processing result. Bold data indicates the best performance for this indicator.

Table 7
Running time (in s) of each stage and module processing

Table 8
SCR of stage 2 structural element ablation experiment. Bold data indicates the best performance for this indicator.

Table 9
BSF of stage 2 structural element ablation experiment. Bold data indicates the best performance for this indicator.

Table 11
BSF of stage 3 structural element ablation experiment. Bold data indicates the best performance for this indicator.