Recent Advances of Deep Learning within X-ray Security Imaging

Wednesday, 24 August, 2022
Dr. Samet Akcay

Contributed by Dr. Samet Akcay based on the IEEEXplore® article "Using Deep Convolutional Neural Network Architectures for Object Classification and Detection Within X-Ray Baggage Security Imagery," IEEE Transactions on Information Forensics and Security, 2018 and the SPS webinar of the same title, February 2022, available in the SPS Resource Center.

X-ray security imaging

The necessity of X-ray security screening in aviation and transportation has generated interest in automated screening systems. This blog post seeks to investigate computerized X-ray security imaging techniques by categorizing them as traditional machine learning applications and contemporary deep learning applications. Following a brief overview of the classic machine learning techniques employed in X-ray security imaging, we delve into the applications of recent deep learning-based methods. With an emphasis on object classification, detection, segmentation, and anomaly detection, the taxonomy splits deep learning applications into supervised, unsupervised, and meta learning categories. The blog then examines renowned X-ray datasets and gives a performance benchmark. We finish with a discussion and recommendations for the future of security imaging, based on recent developments and anticipated trends in deep learning.

Figure 1. A taxonomy of the X-ray security imaging approaches.

1. Classical Image Analysis

The first part of the taxonomy is classical image analysis, which proceeds in four stages: the preprocessing stage improves the quality of the input image; the segmentation stage extracts the region of interest (RoI) from the entire image; the feature extraction stage computes essential characteristics of the object such as edges, texture, and shape; and the classification stage predicts the class label based on the extracted features.

This section examines classical image analysis approaches for image enhancement and threat image projection.

Figure 2. Pseudo-coloring adds color to grey-scale images to improve readability.

1.1 Image Enhancement

Preprocessing is essential for making images easier for both human operators and algorithms to interpret. A common method in the literature is to combine low- and high-energy X-ray images, subtract the background to reduce noise, and then choose a threshold manually [1] or automatically [2]. Pseudo-coloring [3] is another way to enhance X-ray images. As shown in Figure 2, it adds color to grey-scale images, which makes objects easier to find and improves operator awareness.
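The enhancement steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the method of [1], [2], or [3]: the averaging fusion, the median background estimate, the mean-plus-one-standard-deviation threshold, and the blue-to-red color ramp are all assumptions chosen for simplicity, and the function names are hypothetical.

```python
import numpy as np

def enhance(low, high):
    """Fuse dual-energy images, subtract the background, and threshold.

    Assumptions (not from the cited papers): fusion is a plain average,
    the background is the median intensity, and the automatic threshold
    is mean + one standard deviation of the background-subtracted image.
    """
    fused = 0.5 * (low.astype(float) + high.astype(float))
    background = np.median(fused)
    cleaned = np.abs(fused - background)
    threshold = cleaned.mean() + cleaned.std()
    mask = cleaned > threshold
    return cleaned, mask

def pseudo_color(gray):
    """Map a grey-scale image in [0, 1] to RGB with a simple blue-to-red ramp."""
    g = np.clip(gray.astype(float), 0.0, 1.0)
    red = g
    green = 1.0 - np.abs(2.0 * g - 1.0)  # peaks at mid-grey
    blue = 1.0 - g
    return np.stack([red, green, blue], axis=-1)
```

In practice the threshold would be chosen manually or by a dedicated automatic method, and the colormap would be designed around material or density cues rather than a generic ramp.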

1.2 Threat Image Projection

Threat image projection (TIP) [4] is another image analysis tool. TIP generates synthetic data for human screeners or machine/deep learning algorithms. A popular TIP strategy is to multiply a binary threat mask onto a benign input X-ray image to produce an image containing the threat item. Affine or logarithmic transformations can be applied when projecting threats onto benign images. Studies suggest that TIP improves the detection performance of models [5].
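A minimal sketch of multiplicative threat projection follows. It assumes that both images hold transmission values in [0, 1], so that overlapping materials combine by multiplication; the function name `project_threat` and its parameters are hypothetical, and real TIP pipelines add further transformations that are not shown here.

```python
import numpy as np

def project_threat(benign, threat, mask, top, left):
    """Insert a threat patch into a benign X-ray image by multiplication.

    benign: 2D array of transmission values in [0, 1] (assumption).
    threat: 2D patch of transmission values for the threat item.
    mask:   boolean array, True where the patch belongs to the threat.
    top, left: position of the patch inside the benign image.
    """
    out = benign.astype(float).copy()
    h, w = mask.shape
    # Outside the mask the factor is 1 (pixel unchanged);
    # inside it is the threat's own transmission.
    factor = np.where(mask, threat, 1.0)
    out[top:top + h, left:left + w] *= factor
    return out
```

Varying `top` and `left`, or transforming the patch and mask before insertion, yields many synthetic threat images from a single benign scan.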

2. Machine Learning Approaches in X-ray Security Imaging

This section discusses machine learning in X-ray security imaging with a focus on classification, detection, and segmentation tasks.

2.1 Object Classification

Prior to the rise of deep learning, the bag of visual words (BoVW) technique was common. As illustrated in Figure 3, a popular strategy is to first extract features using detectors/descriptors, then cluster the features using k-means, and finally classify using a Random Forest (RF), a Support Vector Machine (SVM), or sparse representations.

Figure 3. Bag of visual words (BoVW) approach, where features are collected from positive and negative samples via detectors and descriptors, followed by a classifier such as SVM or RF.
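The clustering and encoding steps of this pipeline can be sketched as follows. This is a simplified illustration under stated assumptions: descriptors are taken as already extracted (one row per local feature), the k-means below uses a deterministic farthest-first initialization rather than any particular library's scheme, and the final SVM or RF classifier is not shown. The function names are hypothetical.

```python
import numpy as np

def build_codebook(descriptors, k, iters=10):
    """Cluster local descriptors into k visual words with a small k-means."""
    descriptors = descriptors.astype(float)
    # Farthest-first initialization (an assumption made for determinism).
    centers = [descriptors[0]]
    while len(centers) < k:
        d = np.linalg.norm(
            descriptors[:, None, :] - np.array(centers)[None, :, :], axis=2
        ).min(axis=1)
        centers.append(descriptors[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        dists = np.linalg.norm(
            descriptors[:, None, :] - centers[None, :, :], axis=2
        )
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def encode(descriptors, centers):
    """Represent one image as a normalized histogram of visual words."""
    dists = np.linalg.norm(
        descriptors.astype(float)[:, None, :] - centers[None, :, :], axis=2
    )
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

Each image's histogram would then serve as the fixed-length feature vector passed to the classifier.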

These algorithms are, however, evaluated on relatively small datasets, restricting real-time scalability.

3. Deep Learning in X-ray Security Imagery

This section explores how deep learning algorithms are used in X-ray security. As shown in Figure 1, we divide the algorithms into supervised (classification, detection, and segmentation), unsupervised (anomaly detection) and meta-learning approaches.

3.1 Supervised Approaches

Supervised techniques are used for classification, detection, and segmentation, using global, bounding-box, and pixel-wise labels.

3.2 Classification

In 2016, Akcay et al. [6] conducted one of the first studies to use CNNs to classify X-ray security images. The authors investigated how well a CNN combined with transfer learning classifies X-ray objects in this problem domain, where the availability of datasets is limited (see Figure 4). Freezing AlexNet weights layer by layer on a two-class ("gun vs. no-gun") X-ray classification problem shows that the CNN performs much better than the classical BoVW approach (SIFT+SURF) trained with SVM or RF, even when all of the network's layers are frozen. In another set of tests, the CNN is applied to a harder problem involving six different classes. The results show that CNNs could be very useful in the field.

Figure 4. Transfer-learning approach used in [6] and [7], where an ImageNet-pretrained model is fine-tuned on an X-ray dataset for the classification problem.

3.3 Detection

As shown in the preceding section, CNNs have demonstrated success in classification. Nevertheless, the cluttered nature of X-ray datasets restricts the application of classification methods in this domain. To this end, the work of [7] investigated object detection techniques. The authors train sliding-window CNN, Faster R-CNN, R-FCN, and YOLO models on the DBF2/6 dataset for firearm and multi-class detection problems. Experiments show that Faster R-CNN [8] with VGG16 [9] reaches 88.3% mAP on the 6-class DBF6 dataset, whereas R-FCN with ResNet-101 obtains the best performance, 96.3% mAP, on the 2-class (gun vs. no-gun) problem.

Figure 5. Object detection models used in [7] to evaluate their applicability in the X-ray security domain.
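Since the detection results above are reported as mean average precision (mAP), a short sketch of how such a score can be computed may help. It is a simplified version for a single image and a single class, using non-interpolated average precision and a 0.5 IoU threshold. These choices are assumptions for illustration and not necessarily the exact evaluation protocol of [7].

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def average_precision(detections, ground_truth, iou_thr=0.5):
    """Non-interpolated AP for one class.

    detections:   list of (score, box) pairs.
    ground_truth: list of boxes; each may be matched at most once.
    """
    matched, flags = set(), []
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx not in matched:
                value = iou(box, gt)
                if value > best_iou:
                    best_iou, best_idx = value, idx
        is_tp = best_idx is not None and best_iou >= iou_thr
        if is_tp:
            matched.add(best_idx)
        flags.append(is_tp)
    total, true_positives = 0.0, 0
    for rank, is_tp in enumerate(flags, start=1):
        if is_tp:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / len(ground_truth) if ground_truth else 0.0

def mean_average_precision(ap_per_class):
    """mAP is the mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```

For example, with two ground-truth boxes and three detections ranked true positive, false positive, true positive, the AP is (1/1 + 2/3) / 2, roughly 0.83.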

To further address the occlusion issue, Hassan et al. [10] propose an object detection technique in which the RoI is built by cascading multi-scale structure tensors that extract object orientations. The extracted RoI is fed into a CNN that overall performs better than RetinaNet, YOLOv2, and Faster R-CNN on GDXray and SIXray, both in detection performance and in computational cost. Some follow-up works [11], [12] use a similar strategy to create contour-based threat detection that achieves 96% mAP on the SIXray10 dataset.

3.4 Segmentation

Segmentation remains understudied because few datasets provide pixel-level ground truth.

Among the few, Hassan et al. [13] propose a single-stage instance segmentation technique. As depicted in Figure 6, trainable structure tensors are used to detect transitional patterns, which are subsequently used to construct binary segmentation masks. The model achieves the highest benchmark segmentation performance on the GDXray (96.7), SIXray (96.16), OPIXray (75.32), and COMPASS XP (58.4) datasets.

Figure 6. Segmentation approach via trainable structure tensors.

3.5 Unsupervised Approaches

This section examines unsupervised deep learning models, focusing on anomaly detection.

In GANomaly [14], the generator is made up of encoder-decoder-encoder sub-networks. The model aims to learn what a normal (threat-free) image looks like so that it can identify threats during inference by subtracting its predictions from a test image. The model achieves statistically and computationally superior results to the prior state-of-the-art techniques (AUC: UBA: 64.3%, FFOB: 88.2%). Skip-GANomaly [15] improves on this work by 1) employing skip connections in the generator network to accommodate higher-quality images and 2) learning latent representations in the discriminator network (AUC: UBA: 94%, FFOB: 90%).
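The AUC figures quoted above summarize how well an anomaly score separates threat images from normal ones. The sketch below shows one way to compute such a score and its AUC. It assumes the score is the mean absolute difference between an image and the model's reconstruction, which is a simplification and not the exact scoring function of [14] or [15]; the function names are hypothetical.

```python
def anomaly_score(image, reconstruction):
    """Mean absolute difference between flattened image and reconstruction."""
    diffs = [abs(x - y) for x, y in zip(image, reconstruction)]
    return sum(diffs) / len(diffs)

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise ranking.

    labels: 1 for anomalous (threat), 0 for normal.
    Returns the fraction of (anomalous, normal) pairs in which the
    anomalous sample receives the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

With scores [0.1, 0.4, 0.35, 0.8] and labels [0, 0, 1, 1], three of the four anomalous/normal pairs are ranked correctly, so the AUC is 0.75.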

Another anomaly detection model [16], shown in Figure 7, is trained only once and detects and removes contraband items regardless of the scanner parameters. The one-stage solution reconstructs normal baggage images using an encoder-decoder network and a stylization loss function. The model identifies anomalous regions by comparing the original and generated scans. Clustering and post-processing of the anomalous regions localize the threats via bounding boxes. A classifier can optionally be added to the framework to classify the extracted anomalies. A thorough evaluation of the proposed system on four public baggage X-ray datasets without any re-training shows that it achieves competitive performance compared to traditional fully supervised methods while outperforming state-of-the-art semi-supervised and unsupervised baggage threat detection frameworks by 67.37%.

Figure 7. An unsupervised anomaly detection pipeline where the model is trained only on normal images and finds threats during inference. (This figure is taken from [16].)
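The idea of comparing the original and generated scans to localize a threat can be illustrated with a short sketch. It is heavily simplified: the reconstruction is assumed to be given, a fixed residual threshold is used, and all anomalous pixels are enclosed in a single bounding box instead of the clustering and post-processing used in [16]. The function name and the threshold value are hypothetical.

```python
import numpy as np

def localize_anomaly(scan, reconstruction, threshold=0.2):
    """Return one bounding box (x1, y1, x2, y2) around anomalous pixels.

    A pixel is treated as anomalous when the absolute difference between
    the scan and its reconstruction exceeds `threshold` (an assumption).
    Returns None when no pixel exceeds the threshold.
    """
    residual = np.abs(scan.astype(float) - reconstruction.astype(float))
    mask = residual > threshold
    if not mask.any():
        return None
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])
```

A scan containing several separate threats would need connected-component labeling or clustering on `mask` to produce one box per item.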

4. Meta Transfer Learning

The previous models were limited in their ability to recognize threats in baggage across different scanner types without a specific retraining process. To get around this, a meta-transfer learning-driven tensor-shot detector [17] decomposes the input scan into dual-energy tensors and uses a meta-one-shot classification backbone to recognize and localize cluttered baggage threats (see Figure 8). The proposed detection framework can also be used with many different types of scanners, since it classifies objects from unified tensor maps instead of different raw scans. The model is tested on the SIXray and GDXray datasets. On the SIXray dataset, the proposed framework yields a mAP score of 64.57, and on the GDXray dataset, it achieves a precision score of 94.41 and an F1 score of 95.98. It also outperforms the best existing frameworks on the SIXray and GDXray datasets by 8.03% in terms of mAP, 1.49% in terms of precision, and 0.573% in terms of F1.

Figure 8. Meta transfer learning approach proposed in [17]. (This figure is taken from [17].)

5. Discussion and Future Directions

Despite the good performance of the presented techniques, several limitations may be identified. This section analyzes the problems and potential directions based on the current methodologies described in this study and the larger literature, including contemporaneous work to that presented here.


Although transfer learning helps when training on small X-ray datasets, the lack of larger datasets restricts deep model training. Existing large datasets in the field, such as SIXray, are biased towards specific classes, which restricts the training of supervised methods. Hence, it is still important to build large, realistic, publicly available datasets.

Use of the Material Information

Dual-energy X-ray systems use the difference in attenuation between high and low energies to classify/detect objects more accurately [18]. Some efforts have studied the use of this material information, but future work could investigate it further.

Improving Unsupervised Anomaly Detection Approaches

The anomaly detection algorithms described in this blog post do not always work well enough for real-world use cases. More research is therefore needed to develop better reconstruction techniques that learn the characteristics of normal images and use them to find abnormal ones.

Continual Learning Approaches

Another limitation of the existing datasets is that the number of available classes is limited, which does not reflect real-world use cases. In addition, the definition of a threat evolves over time, and models are expected to identify new threats. With the existing (supervised) approaches, however, this is usually not possible. To be deployable in a real-world setting, we need to develop a lifelong threat detection system capable of recognizing new types of threats. This could potentially be achieved via incremental few-shot training.

3D CT Imagery

A growing number of airports across the globe are evaluating 3D CT machines to enhance their screening process. This creates an exciting new research topic.

6. Conclusion

This blog post classifies the traditional machine learning and newer deep learning techniques used in X-ray security imaging. Traditional techniques are subdivided according to computer vision tasks such as image enhancement, threat image projection, object segmentation, feature extraction, object classification, and detection. The review of deep learning methodologies covers classification, detection, segmentation, and unsupervised anomaly detection algorithms.

Overall, this study examines the strengths and drawbacks of present methodologies, discusses the open difficulties, and outlines likely future directions for the field.


[1] B. R. Abidi, J. Liang, M. Mitckes and M. A. Abidi, "Improving the detection of low-density weapons in x-ray luggage scans using image enhancement and novel scene-decluttering techniques," Journal of Electronic Imaging 13(3), (1 July 2004).

[2] M. Singh and S. Singh, "Optimizing image enhancement for screening luggage at airports," Proceedings of the 2005 IEEE International Conference on Computational Intelligence for Homeland Security and Personal Safety (CIHSPS), 2005, pp. 131-136.

[3] J. Chan, P. Evans and X. Wang, "Enhanced color coding scheme for kinetic depth effect X-ray (KDEX) imaging," 44th Annual 2010 IEEE International Carnahan Conference on Security Technology, 2010, pp. 155-160.

[4] M. Mitckes, “Threat Image Projection – An Overview,” 2003.

[5] T. W. Rogers, N. Jaccard, E. D. Protonotarios, J. Ollier, E. J. Morton and L. D. Griffin, "Threat Image Projection (TIP) into X-ray images of cargo containers for training humans and machines," 2016 IEEE International Carnahan Conference on Security Technology (ICCST), 2016, pp. 1-7.

[6] S. Akçay, M. E. Kundegorski, M. Devereux and T. P. Breckon, "Transfer learning using convolutional neural networks for object classification within X-ray baggage security imagery," 2016 IEEE International Conference on Image Processing (ICIP), 2016, pp. 1057-1061.

[7] S. Akcay, M. E. Kundegorski, C. G. Willcocks and T. P. Breckon, "Using Deep Convolutional Neural Network Architectures for Object Classification and Detection Within X-Ray Baggage Security Imagery," in IEEE Transactions on Information Forensics and Security, vol. 13, no. 9, pp. 2203-2215, Sept. 2018.

[8] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137-1149, 1 June 2017.

[9] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” in International Conference on Learning Representations (ICLR), 2015. 

[10] T. Hassan, S. H. Khan, S. Akcay, M. Bennamoun and N. Werghi, “Deep CMST Framework for the Autonomous Recognition of Heavily Occluded and Cluttered Baggage Items from Multivendor Security Radiographs,” in CoRR, 2019.

[11] T. Hassan, M. Bettayeb, S. Akçay, S. Khan, M. Bennamoun and N. Werghi, "Detecting Prohibited Items in X-Ray Images: a Contour Proposal Learning Approach," 2020 IEEE International Conference on Image Processing (ICIP), 2020, pp. 2016-2020.

[12] T. Hassan, S. Akcay, M. Bennamoun, S. Khan and N. Werghi, “Cascaded Structure Tensor Framework for Robust Identification of Heavily Occluded Baggage Items from X-ray Scans,” in arXiv, 2020. 

[13] T. Hassan, S. Akcay, M. Bennamoun, S. Khan and N. Werghi, “Trainable Structure Tensors for Autonomous Baggage Threat Detection Under Extreme Occlusion,” in Asian Conference on Computer Vision - ACCV, 2020. 

[14] S. Akcay, A. Atapour-Abarghouei and T. P. Breckon, “GANomaly: Semi-supervised Anomaly Detection via Adversarial Training,” in Asian Conference on Computer Vision - ACCV, 2018. 

[15] S. Akcay, A. Atapour-Abarghouei and T. P. Breckon, “Skip-GANomaly: Skip Connected and Adversarially Trained Encoder-Decoder Anomaly Detection,” in International Joint Conference on Neural Networks (IJCNN), 2019. 

[16] T. Hassan, S. Akçay, M. Bennamoun, S. Khan and N. Werghi, “Unsupervised anomaly instance segmentation for baggage threat recognition,” Journal of Ambient Intelligence and Humanized Computing, 2021. 

[17] T. Hassan, M. Shafay, S. Akçay, S. Khan, M. Bennamoun, E. Damiani and N. Werghi, “Meta-Transfer Learning Driven Tensor-Shot Detector for the Autonomous Localization and Recognition of Concealed Baggage Threats,” Sensors, 2020. 

[18] K. Fu, D. Ranta, P. Das and C. Guest, “Layer Separation For Material Discrimination Cargo Imaging System,” Image Processing: Machine Vision Applications III, 2010.


