A network intrusion detection system based on convolutional neural network

Abstract

Intrusion detection systems (IDSs) play an important role in resisting hacker intrusion. With the rapid development of network technology, network security has received increasing attention from researchers in different fields, and traditional network security systems based on fixed intrusion detection rules cannot meet the growing demand for adaptable and timely intrusion prevention. The development of efficient IDSs therefore remains an open challenge. Firstly, a novel intrusion detection method based on the Convolutional Neural Network (CNN) is proposed in this paper. Secondly, based on the proposed method, an efficient, real-time and automated intrusion detection system named IDS-CNN is designed. The system is built from several open source tools, such as the packet capture tool Tcpdump, the traffic analyzer Bro and the machine learning library Tensorflow. Running on the Linux platform, the system is composed of data preprocessing, neural network training, network testing and intrusion response. Finally, simulation experiments with the NSL-KDD data set and tests on actual network flows indicate that the proposed IDS-CNN system not only performs intrusion detection on network data streams efficiently, but also achieves better detection precision than the state-of-the-art method.

Keywords

Convolutional Neural Network; intrusion detection; NSL-KDD; data flow analysis

1 Introduction

With the rapid development of network technology, network security has received increasing attention from researchers in different fields [2, 14]. The main threat to network security is that hackers can use various methods to invade network systems. Intrusion detection is a proactive security defense technology that provides real-time protection against internal and external attacks. Because it can intercept intrusions before network systems are compromised, it has become the preferred protection technology for maintaining network security. However, traditional intrusion detection systems (IDSs) have many unavoidable problems, such as poor detection capability for unknown network attacks, a high false positive rate and high resource consumption [7, 22]. There are many intrusion detection products, such as the network intrusion detection system of NSFOCUS, the cloud shield of Ali cloud, and Snort. These systems share the disadvantage of a high false positive rate. Therefore, traditional intrusion detection systems no longer meet the needs of network development [6, 30].

With the development of artificial intelligence and machine learning technology, intelligent technology has penetrated into various application fields. The use of artificial intelligence and machine learning in the field of network security has inspired a new wave of research. In particular, the application of neural networks to intrusion detection has become a research hotspot in current network security technology [15, 16]. In this paper, a new intrusion detection method based on the Convolutional Neural Network (CNN) is proposed, and an effective intrusion detection system based on this method is designed. The proposed IDS combines intrusion detection with a neural network through data acquisition, data processing, neural network operation and intrusion response functions. It automates intrusion detection and monitors network data in real time. With the CNN-based detection method, the intrusion detection rate can be significantly enhanced compared with other state-of-the-art intrusion detection methods. Besides, the IDS can also defend against malicious or abnormal network traffic.

The rest of this paper is organized as follows. Section 2 introduces the background and related work. Section 3 proposes a new intrusion detection method, Section 4 describes the design of the IDS-CNN system, and Section 5 reports the experiments. The paper then discusses the model and the system, and finally summarizes the work.

2 Research background and related works

2.1 Intrusion detection system

An IDS combines software and hardware devices in accordance with a specific security policy to perform real-time monitoring of the network and system [29]. If any attack or attempted attack is discovered, the administrator is alerted so that the network system resources and the network environment can be used safely. An IDS analyzes network data based on the network characteristics and matching rules set by the users [12, 25]. If malicious network data is found, it alerts the users. The general methods of intrusion detection include pattern matching, statistical analysis and integrity analysis.

Pattern matching compares the information obtained from the network data with the intrusion signatures stored in the system's misuse pattern database, and thereby detects network behavior that violates the security policies. The matching can be as simple as checking the User Agent of a data packet or matching a few characters typical of SQL injection, or as complicated as using elaborate regular expressions to describe changes in a specific security situation [9–11]. This kind of intrusion detection only needs to collect the relevant data, which reduces the burden on the operating system. However, users must keep collecting and upgrading the feature rules to keep up with the attacks of hackers.
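As a minimal sketch of the two simple cases mentioned above (a User Agent check and an SQL injection string match), the following Python fragment applies a tiny signature table to one request. The signature names, the patterns and the function `match_signatures` are illustrative assumptions, not rules taken from any real IDS.

```python
import re

# Hypothetical signature rules for illustration only; real rule sets are
# far larger and are maintained by the users of the IDS.
SIGNATURES = {
    "scanner_user_agent": re.compile(r"sqlmap|nikto|nmap", re.IGNORECASE),
    "sql_injection": re.compile(
        r"\bunion\b.+\bselect\b|\bor\b\s+1\s*=\s*1", re.IGNORECASE
    ),
}


def match_signatures(user_agent, payload):
    """Return the sorted names of all signatures found in one request."""
    hits = set()
    if SIGNATURES["scanner_user_agent"].search(user_agent):
        hits.add("scanner_user_agent")
    if SIGNATURES["sql_injection"].search(payload):
        hits.add("sql_injection")
    return sorted(hits)
```

The sketch also shows the maintenance burden noted above: an attack that is not covered by an entry in the table passes unnoticed until someone adds a rule for it.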

Statistical analysis methods create a statistical model of system users, files, directories and devices from attributes such as the number of accesses to a target, the number of failed user operations and the network delay. The average value of these attributes is compared with the observed behavior of the network and the system. If the observed value falls outside the normal range, the behavior is judged to be intrusive. The advantage of this method is that it can detect unknown and more complex intrusions [18, 27]. Its disadvantages are a high false negative rate and difficulty in adapting to sudden changes in the normal network behavior of users. Three statistical analysis methods are common: analysis based on an expert system, analysis based on model reasoning, and analysis based on a neural network.
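The comparison against a normal range can be sketched as follows. This is only an illustration under stated assumptions: the normal range is taken to be the mean plus or minus `k` standard deviations of a history of one attribute (for example, failed operations per minute), and the names `build_profile` and `is_anomalous` are hypothetical.

```python
from statistics import mean, stdev


def build_profile(history):
    """Summarise normal behaviour of one attribute as (mean, std)."""
    return mean(history), stdev(history)


def is_anomalous(window, profile, k=3.0):
    """Flag a window whose average leaves the band mean +/- k * std.

    The band width k is an assumed tuning parameter.
    """
    mu, sigma = profile
    return abs(mean(window) - mu) > k * sigma
```

A profile built from quiet traffic makes the weakness mentioned above visible: a legitimate but sudden change in user behaviour moves the window average out of the band just as an attack would.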

Integrity analysis mainly checks whether a single file or object on the computer system, such as the content of a file or the directory attributes of the file system, has been maliciously modified. It is useful against malicious applications such as remote-control Trojans, and it is also applied to personal computers. As long as files, the registry or other objects in the operating system have changed even slightly, the intrusion can be detected by integrity analysis, even when pattern matching and statistical analysis cannot detect it [24, 28].

The proposed IDS-CNN method belongs to the second kind of intrusion detection methods. As far as we know, this is the first time that a CNN has been used as the statistical analysis model to detect intrusion behavior. An IDS combined with a CNN can obtain the expected feature rules automatically, which greatly reduces the tedious process of entering feature rules by hand.

2.2 Convolutional Neural Network

CNN is a feedforward neural network. It was inspired by research on the locally sensitive and direction-selective neurons in the cat’s cerebral cortex. Like an ordinary neural network, it consists of neurons with weights and biases. The difference is that a CNN contains a feature extractor that consists of a convolutional layer and a subsampling layer. In the convolutional layer of a CNN, one neuron is connected only to some of its neighboring neurons [1, 17]. A convolutional layer consists of several feature planes, and each feature plane consists of a number of neurons arranged in a matrix. The neurons of the same feature plane share weights, which form the convolution kernel. Convolution kernels are generally initialized as matrices of small random values, and during the training of the network each kernel learns reasonable weights. The direct benefit of convolution kernels is that they reduce the connections between layers of the network and the risk of over-fitting. Subsampling is also called pooling and can be seen as a special convolution process. Convolution and subsampling simplify the model and reduce its number of parameters.

The basic structure of the CNN consists of two kinds of layers. The first is the feature extraction layer. The input of each neuron is connected to the local receptive field of the previous layer, from which it extracts local features. Once a local feature is extracted, its positional relationship to other features is also determined. The second is the feature mapping layer. Each computing layer of the neural network is composed of multiple feature maps. Each feature map is a plane, and the weights of all neurons on the same plane are equal. The feature mapping structure uses an activation function so that the feature maps are shift invariant. Because the neurons on the same map share weights, the number of free parameters in the network is reduced.

The complete structure of the CNN includes the input layer, the convolution layer, the Rectified Linear Units (ReLU) layer, the pooling layer and the fully connected (FC) layer. Among them, the main function of the convolution layer is to perform convolution, which mimics the way the visual cortex of mammals handles visual input. In the visual cortex, specific neurons fire only in response to specific visual stimuli. The first level distinguishes basic attributes such as lines and curves. At higher levels, the brain identifies configurations of edges and colors. In a CNN, this corresponds to processing the image with the weight matrix of a filter. A filter detects specific properties such as diagonal edges and vertical edges, and more complex properties are recognized as the image passes through each layer. The ReLU layer enables the neural network to model non-linear relationships. The FC layer is a special structure in the CNN and is the link between the convolutional layers and the ordinary layers. The pooling layer is a special operation of the CNN that summarizes local information. There are two types of pooling. One is max pooling, which outputs the maximum of all received values, and the other is average pooling, which outputs their average. The process of pooling is similar to the process of convolution. The whole structure of the CNN is shown in Fig. 1.
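The two pooling types can be illustrated with a short pure-Python sketch. It assumes non-overlapping square windows (stride equal to the window size), which is one common configuration; the function name `pool2d` is hypothetical.

```python
def pool2d(matrix, size, mode="max"):
    """Pool a 2-D list over non-overlapping size x size windows.

    mode="max" keeps the largest value of each window, any other mode
    returns the window average.
    """
    if mode == "max":
        reduce = max
    else:
        reduce = lambda values: sum(values) / len(values)
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [
                matrix[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(reduce(window))
        pooled.append(pooled_row)
    return pooled
```

Either way a 4×4 feature map shrinks to 2×2, which is the summary of local information described above.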

Fig. 1

Structural diagram of a CNN.

2.3 Training process of CNN

The training process of a CNN includes five steps, given as follows.

Convolution. Each convolution filter represents a feature. The strength of the filter response does not indicate where the feature is located; it only serves as a basis for judging whether the feature is present.

Subsampling. The incoming signals from the convolutional layer are “smoothed” to reduce the sensitivity of the filters to noise and variations. This smoothing process is called subsampling.

Activation. The activation layer controls how signals flow from one layer to the next and simulates how neurons fire in our brains. Output signals that are strongly associated with past references activate more neurons, which lets the signals propagate more effectively for identification. A CNN is compatible with a wide variety of complex activation functions for modeling signal propagation.

Full Connection. The last layer of the CNN architecture is a fully connected layer, which means that every neuron in the preceding layer is connected to every neuron in the subsequent layer. This simulates all possible paths from input to output.

Loss. This layer provides feedback to the neural network on whether the input was recognized correctly. If not, it measures the difference between the prediction and the true label, which helps the neural network correct its identification of the input during training.
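The convolution, activation and loss steps above can be sketched in pure Python for a single filter. This is an illustration under assumptions, not the training code of the paper: the convolution uses no padding, the activation is ReLU, and the loss is softmax cross-entropy, a common choice that the text does not name explicitly. All function names are hypothetical.

```python
import math


def conv2d_valid(image, kernel):
    """Slide one kernel over the image without padding.

    As in most CNN libraries, this is a cross-correlation: the kernel
    is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            row.append(sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            ))
        out.append(row)
    return out


def relu(feature_map):
    """Activation step: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def softmax(logits):
    """Turn raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index):
    """Loss step: penalty for the probability given to the true class."""
    return -math.log(probs[true_index])
```

During training, the gradient of this loss with respect to the kernel weights is what drives the weight updates; that backward pass is omitted here.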

2.4 Data flow analysis

In this paper, the open source tool Bro is used for data flow analysis. Bro originates from the International Computer Science Institute (ICSI) in Berkeley, California. It provides a free, open source framework that can handle large volumes of network traffic and can also be used to build a network IDS. Most importantly, it can produce threat logs tailored to users’ needs [13, 21]. Bro is mostly used for network flow analysis, often to detect behavioral anomalies on the network for security purposes. Although its functions are similar to those of a network IDS, Bro itself only provides functions such as event-driven response, forensics and file extraction. When Bro monitors network flows, it generates logs that record all network activity. These logs include connection records, the number of packets sent and received, the attributes of TCP sessions, and other data used to analyze and understand network behavior. Bro can produce analyses of the network data that are tailored to the specific needs of users. It also provides a way to perform checks based on flow attributes, which means that users can match on numerical statistics and regular expressions.

Bro can be divided into two layers: the Bro event engine and the Bro policy scripts. The event engine analyzes live network flows or recorded trace files and generates events. The policy scripts define the actions to be taken in response to these events. It should be noted that our system uses a Bro script to analyze the network packets and generate a specific log file.

3 A novel intrusion detection method based on CNN

Based on the analysis in Section 2, the proposed intrusion detection method is shown in Fig. 2. To obtain the trained model, the CNN is trained offline on the NSL-KDD dataset [3, 19]. This training stage is separate from online intrusion detection. During detection, the network data is first captured in real time and then processed to obtain feature data corresponding to the features of the NSL-KDD dataset. After that, the intrusion detection model obtained from the convolutional neural network is used to judge whether the feature data represents an intrusion. If there is no intrusion, the system continues to capture network data. If there is an intrusion, the IP address corresponding to the network feature data is passed to the Intrusion Prevention System (IPS), and an attack log is generated to prevent the network intrusion.
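The control flow of this method can be summarised in a few lines. The sketch below is only a skeleton: the four callables are hypothetical stand-ins for the capture, preprocessing, CNN and IPS stages, and none of them is the actual code of the system.

```python
def run_detection_cycle(capture, extract_features, classify, respond):
    """One pass of capture -> preprocess -> classify -> respond.

    capture()              returns raw captured traffic
    extract_features(raw)  yields (source_ip, feature_vector) pairs
    classify(features)     returns a class name such as "normal"
    respond(source_ip)     hands the address to the prevention stage
    """
    blocked = []
    for src_ip, features in extract_features(capture()):
        if classify(features) != "normal":
            respond(src_ip)
            blocked.append(src_ip)
    return blocked
```

Passing the stages in as arguments keeps the offline training step separate from detection, in line with the description above: only an already trained classifier is needed at this point.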

Fig. 2

Diagram of IDS-CNN.

3.1 Network packet capture

Network data acquisition is the most basic module of the whole system. Because the system runs on Linux, the simplest method of data capture is used: Tcpdump is called from Python to copy and capture all network packets. The requirement of this module is to capture the network data in real time.

3.2 Preprocess of network packet

The data preprocessing module is a difficult point of the whole system. To be usable, the network data captured from the network must be split into formatted CSV data with the same features as the NSL-KDD dataset. Bro, which is mainly used to analyze and monitor all traffic on a link, is used for this processing. A Bro rule script performs the preliminary segmentation and analysis of the data packets, and a C program then segments the result again. The main requirements are familiarity with Bro rules and with the basic use of the Bro application.

3.3 Construction of intrusion detection model

Training the CNN on the NSL-KDD dataset is the core process of the whole system. Firstly, a Python 3 environment with working Python neural network libraries, such as Numpy and Tensorflow, is required, because the model is produced by training the CNN on the NSL-KDD dataset within this environment. Through the multi-layer computation of the CNN, the feature data in the NSL-KDD dataset is transformed into a trained model, which is saved to the local computer [30, 31] so that it can conveniently be evaluated on the test set. The construction of the CNN training network is the most critical technology in this module. Secondly, the input format of the CNN, the parameters of the convolution layers, the dimensions, the weights and a series of other parameters are set by using the Tensorflow library. Finally, the constructed CNN can be used for detection.

3.4 Detection of network feature data using intrusion detection model

Once the intrusion detection model has been constructed, the next key step is to use the trained model to identify malicious network behavior in the captured network feature data. In the proposed IDS-CNN, the Tensorflow library of Python is used to load and invoke the model trained by the CNN in the previous step. The Numpy library of Python is then used to perform a series of calculations on the captured network feature data to determine the result of the intrusion detection. If the captured network feature data indicates malicious network behavior, the network data flow from the visitor’s IP is intercepted or blocked.

4 The design of IDS-CNN

The IDS-CNN system mainly consists of four modules, namely data acquisition, data preprocessing, neural network and intrusion response. The neural network module and the intrusion response module are the key to the whole system. The trained model is produced by training the CNN on the dataset and is used in the intrusion response module, which greatly improves the efficiency of intrusion detection and the accuracy of the detection results. In the data acquisition module, Tcpdump is used as a flexible and compact tool to capture network flows in real time, which reduces system overhead and improves the efficiency of the system. The data preprocessing module is a difficult point of the whole system. Bro filters and segments the captured data by using rules, and the extraction of the feature data is then completed. If the trained model recognizes the feature data as a malicious flow with high probability, the intrusion response is triggered and iptables is used to block the malicious flow for the purpose of defense. The structure of the system function modules is shown in Fig. 3. The implementation of the four functional modules is introduced as follows.

Fig. 3

Diagram of IDS-CNN.

4.1 Data acquisition module

The data acquisition module captures the network flow. In this system, Tcpdump on Ubuntu Linux is used to grab network packets in real time. It is worth noting that Tcpdump uses libpcap to grab packets from the network layer. The packet capture process of Tcpdump, which runs in the background, is shown in Fig. 4.

Fig. 4

Process of capturing network packets using Tcpdump.

4.2 Data preprocessing module

The data preprocessing module is the most difficult module in this system. The difficulty lies mainly in the fact that the data produced from the output of the data acquisition module must be consistent with the CSV data format of the NSL-KDD dataset. The module uses Bro to analyze and segment the data packets with a series of Bro rule scripts. A C program then processes the preprocessed data to obtain the matching feature data. The analysis process of captured data packets is shown in Fig. 5.

Fig. 5

Process of data preprocessing.

4.3 Neural network module

The neural network module is the core module of the IDS system. Its task is to learn from the NSL-KDD dataset to obtain an approximate trained model. A Python program reads the feature data of the NSL-KDD dataset in CSV format, one-hot encodes it and normalizes it. The processed data is used as the input of the CNN, and the final trained model is computed through the layers of the CNN. This is a very complicated process that requires a large number of mathematical calculations.
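The one-hot encoding and normalization mentioned above can be sketched as follows. The sketch assumes min-max scaling for the normalization, which the text does not specify, and uses the `protocol_type` values of NSL-KDD (tcp, udp, icmp) as the categorical example; the function names are hypothetical.

```python
PROTOCOLS = ["tcp", "udp", "icmp"]  # values of the protocol_type feature


def one_hot(value, categories):
    """Encode one categorical value as a 0/1 vector over its categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max(column):
    """Scale one numeric column to [0, 1]; a constant column maps to 0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]
```

For example, `one_hot("udp", PROTOCOLS)` gives `[0.0, 1.0, 0.0]`. The scaling bounds should be computed on the training data and reused for captured traffic, so that both pass through the same transformation before they reach the CNN.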

4.4 Intrusion response module

The intrusion response module is the final module of the system. The feature data extracted by the preprocessing module is classified by the trained model, and the result is then processed to carry out the intrusion response. The trained model generated by the neural network module is loaded and invoked through the Tensorflow library.

5 Experimental results

5.1 Experimental environment and dataset

5.1.1 Experimental environment

This system is mainly developed in the Python language. It is operated from the shell command line and is mainly intended for Linux servers. The system uses the convolutional neural network framework to train a model on the NSL-KDD dataset and uses the trained model to classify the network feature data in real time. Because Python provides the Tensorflow module to build the CNN architecture, there is no need to implement complex mathematical calculations in the program, which reduces the development time. On Ubuntu, the Tcpdump tool copies and grabs the network data in real time, and the Bro traffic analysis tool processes the network data to obtain the feature data [27, 29]. The system can only be used with administrator privileges, which protects the system and prevents low-privilege third-party intruders from using the intercepted network flow. Furthermore, the CNN is combined with IDS and IPS technology: the model trained by the CNN is used to identify abnormal data and block its sources, which realizes intrusion prevention.

5.1.2 NSL-KDD dataset

The NSL-KDD dataset is a benchmark for assessing the performance of intelligent intrusion detection methods. Each of its records has forty-one feature variables. It is worth noting that the NSL-KDD dataset is an enhanced version of the KDD99 dataset, which contains approximately 4.9 million data records assembled from raw network data captured over seven weeks. As in the KDD99 dataset, each record in the NSL-KDD dataset has a label that indicates whether the record is normal or belongs to one of four main types of attacks, i.e. DoS, R2L, U2R and Probing. The NSL-KDD dataset has two advantages. On the one hand, there are no redundant or duplicate records in the train set, so the classifier is not biased towards frequent records. On the other hand, the number of records chosen from each difficulty level group is inversely proportional to the percentage of records in the original KDD data set. An instance of the feature data is shown in Fig. 6.

Fig. 6

Diagram of features in NSL-KDD dataset.

5.2 Experimental Process

5.2.1 Implementation of network data acquisition

The network data acquisition module captures the network data on the Linux server in real time. The -w parameter specifies the file in which the network data is saved (it must be in the pcap file format). The -i parameter selects the network interface of the Linux machine. With these two parameters, real-time network data acquisition on the test machine can be achieved. Pressing Ctrl+C stops the capture. The process of capturing packets in real time is shown in Fig. 7.

Fig. 7

Data acquisition using Tcpdump.

After the capture has been executed, the test.pcap file is generated in the specified directory. The third-party software Wireshark can be used to analyze the test.pcap file and verify that it contains data packets, as shown in Fig. 8. The captured pcap file is later segmented and counted in the data preprocessing module. The -c parameter of Tcpdump specifies the number of packets to grab, and the -G parameter specifies the interval (in seconds) at which the captured packets are saved to a new file. Both methods can split the captured network data into segments, as shown in Fig. 9.
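As a minimal sketch of how Python might assemble the Tcpdump invocation described above, the helper below builds the argument list from the -i, -w, -c and -G parameters. The function name `build_tcpdump_command` and its argument names are hypothetical; only the four Tcpdump flags come from the text.

```python
def build_tcpdump_command(interface, output_path,
                          packet_count=None, rotate_seconds=None):
    """Return the tcpdump argument list without executing it.

    -i selects the interface, -w writes raw packets to a pcap file,
    -c stops after a number of packets, -G rotates the output file
    every rotate_seconds seconds.
    """
    cmd = ["tcpdump", "-i", interface, "-w", output_path]
    if packet_count is not None:
        cmd += ["-c", str(packet_count)]
    if rotate_seconds is not None:
        # With -G the file name should contain a strftime pattern such
        # as %H%M%S, otherwise each rotation overwrites the same file.
        cmd += ["-G", str(rotate_seconds)]
    return cmd
```

The list can be passed to `subprocess.run(cmd, check=True)` by a process with administrator privileges; keeping it as a list rather than a shell string avoids quoting problems in file names.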

Fig. 8

Data flow analysis of network packets.

Fig. 9

Data flow analysis of network packets.

This IDS-CNN system uses the -c parameter in Tcpdump to set the number of captured data packets. The description of the program parameters is shown in Fig. 10.

Fig. 10

Data flow analysis of network packets.

The -h parameter prints a description of all the parameters of the program, and the -c parameter sets the number of packets to be captured at a time. The -i parameter sets the network interface, and the -o parameter sets the path where the .pcap file and the file with the initially extracted feature values are saved. The -f parameter sets the path of the .csv file with the complete feature values, and the --ip parameter sets the IP address of the local machine.

5.2.2 Implementation of data preprocessing

Data preprocessing determines whether the entire IDS can operate normally, so this functional module is very important. The data packets obtained from the network data acquisition module must be segmented into records with the same 41 features as the NSL-KDD dataset before they can be used as input to the neural network module, so this processing affects all the following steps.

The network data acquisition module collects data packets in 5-minute batches. Subsequently, the network flow analysis tool Bro segments the data packets and collects statistics by using the prepared Bro script, as shown in Fig. 11. In Fig. 11, the segmented statistics files are not ordered by their leftmost number. The sort command that comes with Linux is therefore used to put the files in order, as shown in Fig. 12.

Fig. 11

Segmenting the data packets using Bro.

Fig. 12

Sorting the data.

The above step is a preliminary processing of the data. The C program then reprocesses the resulting file to obtain the 41 feature values. The obtained feature values are shown in Fig. 13.

Fig. 13

Obtained feature values by using C program.

5.2.3 Training of CNN

The CNN module is the core of the whole system. It constructs the CNN framework and trains it on the NSL-KDD dataset to obtain the trained model. The training process on the NSL-KDD dataset is shown in Fig. 14.

Fig. 14

Training of CNN to NSL-KDD dataset.

In Fig. 14, Training_acc and Training_loss represent the training accuracy and the training loss, respectively. Validation_acc and Validation_loss represent the validation accuracy and the validation loss, respectively. The number of epochs is set to 2, so the model is obtained after two passes over the training data. The result of the training is saved in the /output/models/ directory, as shown in Fig. 15.

Fig. 15

Results of the model saved by TensorFlow.

In Fig. 15, preprocess.pkl is a file that is saved when the features are processed, while the model.data-00000-of-00001, model.index, model.meta and checkpoint files are model files saved by using Python and the TensorFlow library. The description of the model files is given in Table 1.

The .meta file saves the graph structure of the entire CNN model in protocol buffer format, including information such as the operations that the model defines.

The .data-00000-of-00001 file and the .index file together save the values of the weights and biases in the CNN structure. The .data file saves the variable values, and the .index file saves the correspondence between the data in the .data file and the graph structure in the .meta file.

The checkpoint file is a text file that records the names of the models saved at all time nodes during CNN training. Its first row records the name of the most recently saved model.

Table 1

The generated model file

File format	File function
checkpoint	Provide file index paths
.data	Save model parameter values
.index	Save model parameter names
.meta	Save model graph structure

This IDS-CNN system uses Tensorflow to construct a CNN framework that defines the input layer, two convolution layers, two activation function layers, the pooling layer and two fully connected layers, as well as the loss layer and the output layer. The input is defined as 36 neuron nodes arranged as a 6×6 matrix and reshaped into a 4-dimensional tensor. The four numbers in [-1,6,6,1] represent the number of input pictures, the picture height, the picture width and the number of picture channels; “-1” means that the number of pictures is inferred by the program. The convolution kernel of the first convolutional layer is [3,3,1,32], also with 4 dimensions, which represent the height of the convolution kernel, the width of the convolution kernel, the number of image channels and the number of convolution kernels. The tf.nn.conv2d function returns the feature map (the data features generated by convolving the input with the kernels), and the tf.nn.relu function then sets the negative values in the feature map to 0. The result enters the second convolutional layer, whose convolution kernel is [3,3,32,64]: the height of the kernel is 3, the width is 3, the number of image channels is 32 and the number of kernels is 64. Its output is processed by the activation function layer and then by the tf.nn.max_pool function, which performs max pooling. The pooled result enters the two fully connected layers, and finally the training accuracy and the training loss are calculated in the loss layer. The general structure of CNN in this system is shown in Fig. 16.

Fig. 16

Network structure of CNN in this system.

The training process of CNN in this system is shown in Fig. 17. In Fig. 17, the input image is 6×6, the feature map of the first convolution is 6×6, and the number of convolution kernels is set to 32. In the second convolution, the number of convolution kernels is set to 64. After both the first and the second convolution, the size of the output images is 6×6.
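The shapes quoted above can be checked with a little arithmetic. The sketch below uses the output-size rules of TensorFlow's "SAME" and "VALID" padding modes. It assumes stride 1 and "SAME" padding for both convolutions (which is what keeps the maps at 6×6), a 3×3 kernel and one input channel for the first layer, and a 2×2 max pooling with stride 2; the pooling size is not stated in the text and is purely an assumption. The helper names are hypothetical.

```python
import math


def conv_output_size(size, kernel, stride=1, padding="SAME"):
    """Spatial output size of a convolution or pooling window."""
    if padding == "SAME":
        return math.ceil(size / stride)
    return math.ceil((size - kernel + 1) / stride)  # "VALID"


def conv_params(kh, kw, in_ch, out_ch):
    """Number of trainable weights plus one bias per output channel."""
    return kh * kw * in_ch * out_ch + out_ch


side = conv_output_size(6, 3)               # first convolution
side = conv_output_size(side, 3)            # second convolution
pooled = conv_output_size(side, 2, stride=2)  # assumed 2x2 max pooling
flattened = pooled * pooled * 64            # input of the first FC layer
```

Under these assumptions the maps stay 6×6 through both convolutions, the pooled map is 3×3, and the first fully connected layer receives 3 · 3 · 64 = 576 values. With "VALID" padding the first convolution alone would already shrink the map to 4×4, which does not match the sizes reported in Fig. 17.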

Fig. 17

Training chart of CNN used in IDS-CNN system.

On the whole, the tf.train.Saver function in the Tensorflow library is used to save the trained model in the custom train function. The test set is used to test the saved model, and the test process also calls tf.train.Saver to restore the trained model. The CNN acts as a classifier. The feature values of the test set are extracted, and the predict function then determines whether each record is normal access or one of the types of network attack, that is, U2R, Normal, R2L, Probe or DoS. The prediction yields the y value of the model, and the np.argmax function is applied to it to obtain the index of the largest value. The true label of each test record is extracted with np.argmax in the same way. The accuracy is then calculated with the custom evaluation function based on np.average, and the recall is calculated from the confusion matrix.
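The evaluation step can be sketched without any library. The fragment below builds a confusion matrix from true and predicted class indices and derives accuracy and per-class recall from it. The function names are hypothetical, and this is an illustration of the metrics rather than the evaluation code of the system.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts records of true class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm):
    """Share of records on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def recall_per_class(cm):
    """Per class: correctly detected records / all records of that class."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(cm)
    ]
```

Reporting recall per class matters for this dataset because rare attack classes contribute little to the overall accuracy, so a high accuracy alone does not show that they are detected.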

5.2.4 The intrusion response module

The intrusion response module examines the feature data obtained in preprocessing. The model trained by the neural network module is used to calculate the test accuracy, detect malicious feature data and record the corresponding IP address. If malicious data is detected, iptables is used to block access from the specific IP address, which links the IDS with iptables. The accuracy on the test set is shown in Fig. 18.

Fig. 18

Detection accuracy in the test set.
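
The linkage with iptables can be sketched as a small helper that turns a recorded IP address into a blocking rule. The text does not give the exact rule used by the system, so appending a DROP rule to the INPUT chain is an assumption, and the function name is hypothetical. The sketch only builds the command; running it requires root privileges on the Linux host.

```python
import ipaddress

def build_block_command(source_ip):
    """Return the iptables argument list that drops inbound packets from source_ip."""
    # ip_address raises ValueError for malformed input, so untrusted strings
    # are never passed on to the command line.
    address = ipaddress.ip_address(source_ip)
    if address.version != 4:
        raise ValueError("iptables handles IPv4 only; IPv6 needs ip6tables")
    return ["iptables", "-A", "INPUT", "-s", str(address), "-j", "DROP"]

if __name__ == "__main__":
    # 192.0.2.10 belongs to a range reserved for documentation.
    print(" ".join(build_block_command("192.0.2.10")))
```

The resulting list can be passed to subprocess.run(command, check=True). Validating the address first keeps arbitrary text taken from captured traffic out of a privileged command.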

In this paper, 10% of the test set is taken for testing, and the test accuracy is about 97.7%. New feature data captured by the data preprocessing module consists of 41 feature values, and the trained model judges whether it is intrusive feature data. The y values produced by the model are shown in Table 2.

Table 2

Y value in the intrusion detection model

Characteristic results and types	Y value in training model
U2R	[1,0,0,0,0]
Normal	[0,1,0,0,0]
R2L	[0,0,1,0,0]
Probe	[0,0,0,1,0]
DoS	[0,0,0,0,1]
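
The encoding in Table 2 is a one-hot encoding of the five classes. A minimal sketch of the mapping in both directions is given below; the class order follows Table 2, the spellings U2R, R2L and DoS follow the usual NSL-KDD category names, and the function names are hypothetical.

```python
CLASSES = ["U2R", "Normal", "R2L", "Probe", "DoS"]   # order taken from Table 2

def one_hot(label):
    """Return the Table 2 vector for a class name."""
    vector = [0] * len(CLASSES)
    vector[CLASSES.index(label)] = 1   # raises ValueError for an unknown label
    return vector

def decode(vector):
    """Return the class name at the position of the largest entry."""
    index = max(range(len(vector)), key=vector.__getitem__)
    return CLASSES[index]

if __name__ == "__main__":
    for name in CLASSES:
        print(name, one_hot(name))
    print(decode([0.05, 0.10, 0.05, 0.70, 0.10]))
```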

Each captured feature record is first given the placeholder label “normal”. Because “normal” is appended to the captured feature data, every record is nominally of the normal type when it enters the trained model. The label therefore marks all feature data as normal, but the trained model judges the 41 feature values themselves and classifies each record into one of the five categories in Table 2. The np.argmax function returns the index of the maximum along an axis; for example, the returned value for DoS is 4 and the returned value for Normal is 1. This is compared with the value decided by the trained model: if the decision value is not 1, the feature data is malicious network data, and the IP address is recorded.
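
The decision rule in this paragraph amounts to comparing the index of the largest model output with the index of the Normal class. The sketch below shows that rule in plain Python. The record layout (a source IP paired with a five-element model output), the sample scores and the function names are assumptions made for illustration.

```python
NORMAL_INDEX = 1   # position of "Normal" in Table 2

def predicted_index(model_output):
    """Index of the largest score; the first one wins on ties, as with np.argmax."""
    return max(range(len(model_output)), key=lambda i: model_output[i])

def malicious_sources(records):
    """Return the source IPs whose predicted class is not Normal.

    records is an iterable of (source_ip, model_output) pairs. Each IP is
    reported once, in the order in which it was first flagged.
    """
    flagged = []
    for source_ip, model_output in records:
        is_malicious = predicted_index(model_output) != NORMAL_INDEX
        if is_malicious and source_ip not in flagged:
            flagged.append(source_ip)
    return flagged

if __name__ == "__main__":
    captured = [
        ("192.0.2.10", [0.01, 0.02, 0.02, 0.05, 0.90]),   # highest score at index 4
        ("192.0.2.20", [0.02, 0.94, 0.02, 0.01, 0.01]),   # highest score at index 1
        ("192.0.2.10", [0.03, 0.10, 0.02, 0.80, 0.05]),   # highest score at index 3
    ]
    print(malicious_sources(captured))
```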

The complete process of the IDS-CNN system is shown in Fig. 19, which includes capturing the data packet, processing the data packet to obtain the feature value, and judging the feature value to complete the intrusion response.

Fig. 19

Whole process of system execution.

5.3 Discussions

Experimental results show that the CNN-based intrusion detection method can protect the network environment with a high detection rate. This method belongs to the family of statistical analysis methods. However, it also has some drawbacks. The first concerns the preprocessing of the raw network data and the timeliness of the system. The data must be extracted from the data packets, but the packets cannot be segmented and analyzed in real time, so the system can only analyze the packets collected over a period of time. Secondly, it is difficult to preprocess the network data into the desired feature values, which is a minor drawback of any intrusion detection approach that relies on a neural network for analysis. Thirdly, the NSL-KDD dataset dates from 2009 and contains very few U2R samples, which makes the dataset imbalanced and lowers the accuracy on the final verification test set.

6 Conclusions

A new network intrusion detection method based on CNN is proposed in this paper. It can detect malicious intrusions in network data flows and implement real-time, active defense. The construction of the CNN-based intrusion detection model is systematically studied, and the way to capture network flows and process them into the desired feature data is discussed. The CNN is then trained on the NSL-KDD dataset, and the resulting model is used to predict intrusions in the test data. In addition, the main intrusion prevention functions of the proposed intrusion detection system are implemented, and their combination shows good performance. On the basis of intrusion detection, specific defensive measures against network intrusion are given, and the experimental results demonstrate the feasibility of the defense method. Experiments show that the proposed model can protect the network environment to a certain degree: it can monitor the network in real time, detect abnormal network behavior and defend against it.

Acknowledgments

This research work was partially supported by the key research and development plan project of Shaanxi Science and Technology Department (Grant No. 2017ZDXM-GY-016), the Project of Shaanxi Natural Science Basic Research (Grant No. 2020JM-565), the projects of the students’ innovation and entrepreneurship training program (Grant Nos. S201910702022 and X201910702141), the project of Department of Education Science Research of Shaanxi Province (Grant No. 17JK0371), the Industrial research project of Science and Technology Department of Shaanxi Province (Grant No. 2016KTZDGY4-09), the Pre-research Project of 13th Five-year Equipment Development (41402020202), the Research project on teaching reform of education in Shaanxi province (Grant Nos. 17JZ004 and 17JY015), the fund of National Laboratory of Network and Detection Control (Grant No. GSYSJ2017007), the Characteristic disciplines in Education department of Shaanxi province (Grant No. 080901), the Key project of educational reform at the school level of Xi’an Technological University (Grant No. 18JGZ01), and the project of Innovation and Entrepreneurship Training Program for College Students at the national level (Grant No. 1070214030).

References

1. Aburomman A.A. and Reaz M.B.I., A survey of intrusion detection systems based on ensemble and hybrid classifiers, Computers & Security 65 (2017), 135–152.

2. Ghosh A.K., A Study in Using Neural Networks for Anomaly and Misuse Detection, USENIX Security Symposium, 1999.

3. Aziz A.S.A., Hanafi E.O. and Hassanien A.E., Comparison of classification techniques applied for network intrusion detection and classification, Journal of Applied Logic 24 (2016), 109–118.

4. Callegari, Giordano and Pagano, An information-theoretic method for the detection of anomalies in network traffic, Computers & Security 70 (2017).

5. Khammassi and Krichen, A GA-LR wrapper approach for feature selection in network intrusion detection, ScienceDirect 70 (2017), 255–277.

6. Kolias, Kambourakis and Maragoudakis, Swarm intelligence in intrusion detection: A survey, Computers & Security 30(8) (2011), 625–642.

7. Tsai C.F., Hsu Y.F. and Lin C.Y., Intrusion detection by machine learning: A review, Expert Systems with Applications 36(10) (2009), 11994–12000.

8. Kabir, Hu and Wang, A novel statistical technique for intrusion detection systems, Future Generation Computer Systems 79 (2018), 303–318.

9. Kuang, Xu and Zhang, A novel hybrid KPCA and SVM with GA model for intrusion detection, Applied Soft Computing Journal 18(C) (2014), 178–184.

10. Kim, Lee and Kim, A novel hybrid intrusion detection method integrating anomaly detection with misuse detection, Expert Systems with Applications 41(4) (2014), 1690–1700.

11. Nadiammai G.V. and Hemalatha, Effective approach toward Intrusion Detection System using data mining techniques, Egyptian Informatics Journal 15(1) (2014), 37–50.

12. Liao H.J., Lin C.H. and Lin Y.C., Intrusion detection system: A comprehensive review, Journal of Network and Computer Applications 36(1) (2013), 16–24.

13. Wang H.W., Gua and Wang S.S., An effective intrusion detection framework based on SVM with feature augmentation, Knowledge-Based Systems 136 (2017), 136–139.

14. Fox, Henning and Reed, A Neural Network Approach towards Intrusion Detection, National Computer Security Conference, 1990, 125–134.

15. Koc, Mazzuchi T.A. and Sarkani, A network intrusion detection system based on a Hidden Naïve Bayes multiclass classifier, Expert Systems with Applications 39(18) (2012), 13492–13500.

16. Tavallaee, Bagheri and Lu, A Detailed Analysis of the KDD CUP 99 Data Set, IEEE International Conference on Computational Intelligence for Security & Defense Applications. IEEE, 2009.

17. Akashdeep M.I. and Kumar, A feature reduced intrusion detection system using ANN classifier, Expert Systems with Applications 88 (2017), 249–257.

18. Singh, Kumar and Singla R.K., An intrusion detection system using network traffic profiling and online sequential extreme learning machine, Expert Systems with Applications 42(22) (2015), S0957417415004753.

19. Ashfaq R.A.R., Wang X.Z. and Huang J.Z., Fuzziness based semi-supervised learning approach for Intrusion Detection System, Information Sciences 378(C) (2016), 484–497.

20. Gauthama R.M.R., Somu and Kirthivasan, An efficient intrusion detection system based on Hypergraph, Knowledge-Based Systems 2017, S0950705117303209.

21. Aljawarneh, Aldwairi and Yassein M.B., Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model, Journal of Computational Science 25 (2018), 152–160.

22. Xiaonan, Banzhaf W.W. and Xiaonan, The use of computational intelligence in intrusion detection systems: a review, Applied Soft Computing 10(1) (2010), 1–35.

23. Horng S.J., Su M.Y. and Chen Y.H., A novel intrusion detection system based on hierarchical clustering and support vector machines, Expert Systems with Applications 38(1) (2011), 306–313.

24. Bamakan S.M.H., Wang and Yingjie, An effective intrusion detection framework based on MCLP/SVM optimized by time-varying chaos particle swarm optimization, Neurocomputing 199(C) (2016), 90–102.

25. Lin S.W., Ying K.C. and Lee C.Y., An intelligent algorithm with feature selection and decision rules applied to anomaly intrusion detection, Applied Soft Computing 12(10) (2012), 3285–3290.

26. Dash, A study on intrusion detection using neural networks trained with evolutionary algorithms, Soft Computing 21(10) (2017), 2687–2700.

27. Lokesha, Deepika, Ranjini P.-S. and Cangul I.-N., Operations of Nanostructures via SDD, ABC4 and GA5 indices, Applied Mathematics & Nonlinear Sciences 2(1) (2017), 173–180.

28. Feng, Zhang and Hu, Mining network data for intrusion detection through combining SVMs with ant colony networks, Future Generation Computer Systems 37(7) (2014), 127–140.

29. Gao and Wang, New isolated toughness condition for fractional (g, f, n) – critical graph, Colloquium Mathematicum 147(1) (2017), 55–65.

30. Gao and Wang W.-F., The fifth geometric-arithmetic index of bridge graph and carbon nanocones, Journal of Difference Equations and Applications 23(1-2SI) (2017), 100–109.

31. Yan and Jing-Li, Noether’s theorems of variable mass systems on time scales, Applied Mathematics & Nonlinear Sciences 3(1) (2018), 229–240.

32. Lin W.C., Ke S.W. and Tsai C.F., CANN: An intrusion detection system based on combining cluster centers and nearest neighbors, Knowledge-Based Systems 78 (2015), 13–21.

33. , Xia and Zhang, An efficient intrusion detection system based on support vector machines and gradually feature removal method, Expert Systems with Applications 39(1) (2012), 424–430.

34. Chung Y.Y. and Wahid, A hybrid network intrusion detection system using simplified swarm optimization (SSO), Applied Soft Computing 12(9) (2012).