Evolving fuzzy neural networks to aid in the construction of systems specialists in cyber attacks

Abstract

The growing computerization of processes and services has changed human relations and, as a consequence, has created new forms of attack and fraud against users of digital equipment. Because many people use computers, smartphones, and e-mail to perform day-to-day tasks, a wide variety of data traffic is susceptible to attack. This can undermine the competitiveness of a company whose strategic information is breached. Therefore, security and information management are fundamental for companies to keep due control of their business knowledge. Cyber attacks are breaching the secrecy of relevant information on a growing worldwide scale and are one of the significant challenges of the contemporary world. This article proposes a computational system based on intelligent hybrid models which, through fuzzy rules, allows the construction of expert systems for cyber attacks of diverse natures. The tests were carried out with real databases of attacks on governmental computerized devices.

The model proposed in this paper uses fuzzy evolving data grouping concepts. The extreme learning machine performs the training, and logical neurons of the unineuron type are responsible for creating fuzzy rules capable of transforming the knowledge acquired by the model into a knowledge base for employee training in companies, the construction of other computer systems, and awareness of elements that may harm the integrity of the data of individuals and companies. The novelty of the intelligent technique presented in the paper is that the nature of the cyber attacks defines the structure of the model, because the fuzzification and regularization techniques are based entirely on the complexity of the cybernetic invasions. The binary pattern classification tests, compared with traditional models from the literature, indicate that the proposal of this paper can maintain the accuracy of cyber attack detection while constructing a set of rules that serve as knowledge for companies that wish to protect their information from attacks on their devices.

Keywords

Evolving fuzzy neural network; cyber attack; cyber protection; knowledge management

1 Introduction

The evolution of the times creates situations that must be controlled and monitored by employees and companies. As new computing resources become available to society, new forms of fraudulent data collection become recurrent. The spread of computing resources makes personal computers, mobile devices [59], and tablets, among others, more susceptible to attack. This can undermine the competitiveness of a company whose strategic information is breached. Therefore, security and information management are fundamental for companies to keep due control of their business knowledge. This is a recent concern, and many people do not know or care about protecting their information. Such oversights can create significant problems for individuals or corporations, whose personal or strategic data may be stolen and used for malicious purposes [106].

Knowledge management cuts across the different disciplines related to the life of companies and people, above all strategic management, organization theory, information systems, and technology management, as well as more traditional areas such as economics, sociology, psychology, and marketing, among others. It can therefore interfere in diverse contexts of modern society. Knowledge requires, in addition to media, management, a storage process, care in the custody of its information, and channels for its dissemination. Knowledge management is necessary because knowledge exists in the company, in the minds of people, and in the processes performed. This type of management models corporate processes through the knowledge generated and is a way to structure organizational activities in the internal and external environment [1].

As cyber attacks are a new field of science and knowledge for preventing this type of attack is not always disseminated, many companies and individuals are susceptible to this kind of malicious approach. Good practices therefore need to be disseminated so that everyone has the knowledge, resources, and devices to prevent data from being shared inappropriately [107].

Expert systems are a resource widely used in artificial intelligence. They emerge from intelligent techniques applied to a set of data capable of explaining the real characteristics of a problem, and they transform those characteristics into relational knowledge through IF/THEN rules [77]. These rules allow a complex problem to be interpreted in a more straightforward language that people who are not experts in the field can understand.

This type of approach requires that the data be verified and validated by people who have knowledge about the analyzed problem. Systems that incorporate artificial intelligence techniques are then used to extract similarities and patterns from a dataset that is representative of the problem. The generated rules can serve as the basis for building systems in programming languages (such as Java) or for writing documents that help users protect themselves from this type of problem [106]. The knowledge can thus be used to disseminate corporate strategies that assist in the management, control, and storage of information that is critical to the company’s business.

Cyber attacks grow as data becomes critical to business decision-making. However, many employees who are involved in the process cannot keep pace with the evolution of these threats. Therefore, a technique that can identify and extract knowledge about the nature of cyber attacks becomes essential for spreading technical knowledge in a way that is understandable to most people. Computing terms are complex, and a technique capable of transforming knowledge into an interpretable language can facilitate the training of people involved in the company’s processes.

Techniques that use the concepts of artificial neural networks and fuzzy systems can abstract information from the database and build fuzzy rules that represent the domain of the analyzed problem [77]. Fuzzy neural network models combine the interpretability of fuzzy systems with the training efficiency that artificial neural networks can provide [79]. These models use fuzzy neurons instead of artificial neurons, and the connections between their layers make possible the generalization of the results and a better understanding of a problem. Several expert systems have been produced in many areas of science through fuzzy neural networks, such as health [66, 92], engineering [70, 82], economics and financial problems [67, 84], time series forecasting [73, 28], nature elements [18, 71], general contexts [68, 26], regression problems [23, 65], software effort estimation [96], and even cyber attacks [54, 12].

This paper proposes an evolving model, capable of updating its parameters with each new piece of information, to build expert systems for the prevention of cyber attacks. The proposed model has three layers, like [100]. The first two form a fuzzy inference system capable of generating fuzzy rules based on the problem data, and the third layer is an aggregation neural network capable of transforming and grouping the fuzzy rules into responses about a malicious attack on the information of people or companies. In the first layer, unlike [100] and [94], evolving data grouping concepts are used, so that clusters can change with each new batch of data coming from devices prepared to detect non-standard behavior. In the second layer, fuzzy logical neurons use the concept of the uninorm to aggregate the neurons produced in the first layer. These fuzzy neurons are called unineurons and allow the generated rules to move smoothly between AND and OR rules. Since the generation of rules can happen uncontrollably, producing redundant and unnecessary information, a regularization technique based on Bayesian theory, capable of automatically determining the relevance of each element to the problem, is applied to the hidden layer of the model, thus avoiding overfitting. Finally, these rules are aggregated by an artificial neuron whose synaptic weights are calculated by the Extreme Learning Machine technique, which generates the weights connecting the hidden layer to the output neuron analytically and in a single step. The novelty of the hybrid technique is to use the complexity of the cybernetic invasions to define the neurons through an evolving technique and to select the most relevant fuzzy rules according to Bayesian techniques.

To perform the classification test of the model, databases of cybernetic invasions commonly used in problems of this nature were used. They were provided in data mining and KDD (Knowledge Discovery in Databases) competitions and contain simulations of various kinds of attacks on army computers. The proposed model is compared with classical pattern recognition techniques to assess whether it can maintain the ability to identify intrusion patterns and at the same time generate fuzzy rules that help people and companies disseminate knowledge about attack patterns.

The paper is organized as follows. Section 2 presents the general references that guide this research. Section 3 presents the concepts that make up the proposed model. Section 4 presents the new model proposed in this paper. Section 5 presents the tests and the database, together with the fuzzy rules generated by the approach. Finally, Section 6 presents the conclusions about the model and the approach to disseminating knowledge for the prevention of cyber attacks.

2 Literature review

2.1 Cyberspace

The term cyberspace was coined in 1984 by the American writer William Gibson and appears in his book Neuromancer. The term has different interpretations portrayed by different authors. By this statement we can understand that cyberspace is an environment that has as ... According to [39], there are two variants of cyberspace: Barlovian cyberspace and virtual reality (VR). Barlovian cyberspace consists of an international computer network where the user interacts but does not immerse entirely in the environment, while the term virtual reality refers to the user’s deep interaction, using most of his or her senses, to create a simulated or real environment [39].

Computer theorists use the term cyberspace to refer to the notional social arena we enter when using computers to communicate. We can consider it as a metaphor that describes non-physical territory created by computational means, especially the internet, where individuals and corporations, alone or in groups, members of companies, public agencies or governments, can communicate, conduct research and traffic data in general, using Information and Communication Technologies (ICT) as support for its operation [45]. Actions in cyberspace are classified as offensive, exploitative or protective, and offensives can even impact national security. Cyberspace may be used more generally to refer to the potential "lifeway" or general type of culture being created via Advanced Information Technology (AIT) [45].

Cyberspace relates individuals by creating networks connected to an increasingly large number of points, especially now that computers and electronic devices are more accessible to the population, which makes sources of information increasingly accessible. It includes not only individuals but also institutions that interconnect with people, machines, and documents.

Cyberspace is not just about who connects via the internet. It is an environment where humans interact with technologies such as cell phones, pagers, and walkie-talkies, among many others [74]. Figure 1 presents concepts related to cyberspace.

Fig.1

Cyberspace concepts. Available at http://www.brasilcn.com/article/article_3840.html

2.2 Cyber attacks

Cyber attack is a malicious action commonly known as hacking, which involves the transmission of viruses (malicious files) that infect, damage and steal information from computers and other online databases of companies and individuals [55].

Nations and large and medium-sized companies, as well as all other segments of society, have begun preparing to avoid or minimize cyber attacks on the networks and information systems they use [11]. The attacks can happen by physical means, where the devices containing the information, such as modems, cables, and physical storage media, are easily accessible [52]; by human means, where social engineering is used to obtain information from people who are unprepared for or unaware of virtual attacks; and by logical means, where techniques such as Distributed Denial of Service (DDoS) are employed and the attacker works to overload the target system.

Techniques that exploit vulnerabilities in access ports, the spread of viruses and malware, and even password crackers are examples of ways to attack the integrity of cyberspace. Some approaches work with scripts that try to decipher important access passwords [52]. Another vital aspect to be considered is cybercrime, mainly because of the damaging effects that can result from the misuse of information and communication systems by malicious people. Despite the efforts of some sectors of the Public Administration and of private companies’ computing departments, there are still gaps in physical and logical structures, and countries such as Brazil have weak legislation to typify attacks on computer networks [107]. Figure 2 shows the origin and destination of cyber attacks.

Fig.2

Cyber attack in real time. Available at 2

2.3 Malware

Malware is a term derived from the English expression malicious software. It is software intended to invade a computer system illegally, with information theft as its main objective [62].

In addition to computer viruses, which are developed to perform harmful actions on a computer, legitimate applications with programming flaws can also be considered malware. According to [62], the human factor can contribute as much to the success as to the failure of malware attacks: the degree of awareness that an individual has about this type of attack directly influences the result expected from an anti-virus defense. Figure 3 shows examples of malware.

Fig.3

Malware. Available at http://www.net-security.org/malwarenews.php?id=2636

2.4 Packed viruses

A computer virus package is a program or piece of code designed to damage the computer device by corrupting system files, using resources, destroying data, or otherwise being a hassle. They may contain lines of code capable of damaging the processing of the computer [98].

Viruses are unique among other forms of malware because they can self-replicate, that is, they can copy to other files and computers without the user’s consent [98].

2.5 SQL injection

Structured Query Language, or just SQL, is the standard language for interacting with relational databases. It is used to perform the main tasks related to data manipulation in database structures [48].

SQL injection is a type of cyber attack that takes advantage of flaws in systems where web pages communicate with the database through SQL commands using poor programming practices or weak security criteria; for that reason it is considered an attack that requires little effort. In this invasion process, the attacker inserts a custom, improper SQL statement into a query through the data inputs of a program, such as forms or the URL of an application. The commands typed into the fields intended for user information should be treated as plain data, but because of this flaw the application executes them, which can cause changes in the database, loss of information, and improper disclosure of company knowledge [46].

Through SQL injection attacks, a cracker can obtain any sensitive data kept in the database of a server computer. Depending on the database version, it is also possible to insert malicious commands, obtain full permission on the host machine, and manipulate the structure of the database [46]. Figure 4 reports the processes and steps involved in SQL injection.
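The mechanism described above can be illustrated with a minimal sketch. It assumes a hypothetical `users` table in an in-memory SQLite database; the table, the function names, and the payload are illustrative assumptions and are not taken from the cited works. The first function concatenates user input into the SQL text, while the second passes it through placeholders so that it is handled as data.

```python
import sqlite3


def build_demo_db():
    # Hypothetical in-memory database used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "s3cret"), ("bob", "hunter2")],
    )
    return conn


def login_vulnerable(conn, name, password):
    # User input is pasted into the SQL text, so it can change the query logic.
    query = (
        "SELECT name FROM users WHERE name = '" + name
        + "' AND password = '" + password + "'"
    )
    return [row[0] for row in conn.execute(query)]


def login_safe(conn, name, password):
    # Placeholders keep the input as data; it is never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ? AND password = ?"
    return [row[0] for row in conn.execute(query, (name, password))]


conn = build_demo_db()
payload = "' OR '1'='1"  # classic injection string typed into a password field
leaked = login_vulnerable(conn, "alice", payload)   # condition becomes always true
blocked = login_safe(conn, "alice", payload)        # payload compared as a literal
```

In this sketch the vulnerable query returns every user because the injected `OR '1'='1'` makes the condition true for all rows, whereas the parameterized query returns no rows for the same input.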

Fig.4

SQL Injection. Available at https://www.veracode.com/security/sql-injection

2.6 Main techniques of artificial intelligence

Data mining is a technology that combines traditional methods of data analysis with sophisticated algorithms to process large volumes of data [13]. These algorithms can receive as input a set of facts and return a behavior pattern that can be expressed as an association rule, a mapping function, or the modeling of a profile [13]. Data mining is a step in the process of knowledge discovery in databases (KDD). All steps are important in the process of knowledge discovery. The following is a description of each step [13]:

- Data collection: a collection of data related to the business object to be analyzed;
- Pre-processing: data treatment to reduce repetitions or discrepant values; procedures for selection of attributes and normalizations;
- Data mining: application of data mining tasks;
- Post-processing: validation and formatting of analyses.

The data mining tasks, in turn, are divided as follows [13]:

- Predictive tasks: the goal is to predict the value of a given attribute based on the values of other attributes. The attribute to be discovered is called the target attribute. These tasks are specialized into classification and regression, where the first uses discrete target attributes (e.g., binary values) and the second uses continuous target attributes (e.g., price, length, weight).
- Descriptive tasks: the goal is to derive patterns, correlations, and groups of data. These tasks are specialized into association analysis, grouping, and detection of deviations and/or anomalies. They are exploratory and therefore require post-processing techniques to validate their results.

Figure 5 summarizes the process of extracting knowledge from a database [40].

Fig.5

Process of applying data mining techniques to data.

2.7 Intelligent models for detecting cyber attacks

The paper on network anomaly detection based on evolving neural networks by Konstantinos Demertzis and Lazaros Iliadis [32] describes an intelligent machine learning system in which one part looks for known threats and another part tries to detect probable threats from abnormal activities that take place in the system. The detection principle is simple: the system builds a state that is treated as usual, and all signals outside the boundary of that state are treated as anomalies. Because the detection algorithm learns continuously while the system is active in the network, it becomes more and more precise. The methodology used in the article was Artificial Neural Networks (SANN) [32], which uses an Evolving Connectionist System (eCOS) classification approach and a Multi-Layer Feed Forward network (ANN) to classify the exact type of invasion or abnormality in the network with minimum computational cost. SANN is a set of modular systems based on node connections. The system continuously organizes itself in online mode, adapting to the input data, and can operate in a supervised or unsupervised way. SANN has also been applied to several other complex real-world problems, proving to be quite capable. The developed model is called BIOPSSQLI (A Bio-Inspired Hybrid Artificial Intelligence Framework for Cyber Security). It works on the peaks that occur in the system, while the neurons are used to monitor the algorithm using one-pass learning. Traffic-oriented data are imported together with their classes and use the variable Population Encoding (a control variable for converting the sample data to actual values in the time peaks). The data were classified into two types: Class 0 corresponds to typical results and Class 1 to abnormal results. When a check returns Class 0, the eSNN classification process is repeated with appropriate data vectors. If the result is still 0, the process is terminated. When the result is Class 1, a two-layer neural network with 33 neurons in the hidden layer is used to recognize the pattern of the type of attack, using all the features of the KDD database. The results of the process are presented to the network administrator in the form of an alert, and the BIOPSSQLI graphical model can be seen in Fig. 6 [32].

Fig.6

BIOPSSQLI [32].

Another work, inspired by Greek mythology, addresses the security of information systems. Digital Ladon is an advanced information systems security mechanism that uses artificial intelligence to protect, control, and offer early warning in cases of deviations from or mistakes in digital security measures. It is an effective network supervision system that enriches the lower layers of the system (transport, network, and data) and intelligently augments the top layers (session, presentation, and application) with automated control capabilities. This is done to increase the energy security and the reaction mechanisms of the overall system without special requirements for computational resources. The Ladon algorithm has advanced techniques for detecting cyber attacks, acting very fast and at low computational cost [33].

The use and generation of content, entertainment, and services through mobile devices create an extraordinary demand for the protection of these services. More and more people are using cell phones to share essential data, including their payments. This attracts powerful gangs of cybercriminals, who use sophisticated, highly intelligent types of malware to amplify their attacks. Malicious software is designed to run silently and remain undetected for a long time. The work of [34] proposes the computational intelligence anti-malware framework (CIantiMF), which is innovative, ultra-fast, and has low requirements, to run under the Android operating system (OS). Its rationale is based on advanced computational intelligence approaches such as the extreme learning machine. CIantiMF uses two advanced technology extensions for the ART Java virtual machine. The first is the intelligent anti-malware extension, which can recognize whether the Java classes of an Android application are benign or malicious using an optimized multilayer perceptron. The second is the online Tor traffic identification extension, which is capable of locating malware, identifying Tor traffic, and blocking botnets using the online sequential extreme learning machine algorithm [34].

More recently, intelligent systems were also designed to protect the data traffic of power distribution. An intelligent network is a power transmission and distribution network improved through digital control, monitoring, and telecommunications capabilities. It provides a two-way, real-time flow of energy and information to all stakeholders in the chain, from the generation plant to the commercial, industrial, and residential end-user. Information and communication infrastructures will play an essential role in linking and optimizing the available grid layers. Grid operation relies on control systems called Supervisory Control and Data Acquisition (SCADA), which monitor and control the physical infrastructure. Because SCADA is a sophisticated computer system of great relevance to the power distribution system, its devices can be targeted by cyber attacks. At the heart of these SCADA systems are specialized computers known as Programmable Logic Controllers (PLCs). Destructive cyber attacks against SCADA systems are carried out on such devices, destroying many of them and degrading the average speed of operation in order to deceive the operators of the equipment. To address the problem, the work of [35] proposes a computational intelligence system to identify cyber attacks on intelligent energy networks (SICASEG). It is a big-time forensics tool that can capture, log, and analyze the events of the intelligent power network to find the source of an attack, both to prevent future attacks and possibly to support lawsuits.

Also noteworthy are the works proposed in [37, 36] to aid cybersecurity. These works use artificial intelligence approaches that recognize attack patterns and connect to other systems that can act to avoid further damage to computational systems. Figure 7 presents a flowchart of the actions performed by intelligent models incorporated into data traffic systems to identify cybernetic intrusions.

Fig.7

Computational intelligence system for malware detection [32].

2.8 Fuzzy neural networks, neurofuzzy network and expert systems

Fuzzy neural networks (FNN) are neural networks of fuzzy neurons. These networks have as main characteristic the synergic collaboration between the fuzzy and neural networks theory generating models that integrate the treatment of the uncertainty and interpretability provided by fuzzy systems and the learning ability provided by neural networks [77].

Thus a Neuro-Fuzzy network (NFN) can be defined as a fuzzy system that is trained by an algorithm provided by a neural network. Given this analogy, the union of neural networks with fuzzy logic aims to mitigate the deficiencies of each of these systems, yielding a more efficient, robust, and easy-to-understand system [50]. Figure 8 presents examples of various combinations of artificial neural networks and fuzzy systems. These intelligent models have an architecture based on multi-layered networks, in which each layer has a different function in the model. In the works of [15, 85, 72], FNNs have three layers. In the models in [89] and [104, 16], the structure is composed of four layers. The function of each of these layers draws on the concepts of fuzzy systems and artificial neural networks. In most models, the first layer partitions the input data, transforming them into fuzzy logical neurons. Versions of fuzzy c-means [14], ANFIS [53], and clustering by clouds [4] are commonly applied.

Fig.8

Fuzzy Neural Networks examples [32].

Recent models of fuzzy neural networks based on the extreme learning machine were proposed in [49, 20]. Other models use backpropagation [17], genetic optimization [78], a Hebbian approach [99], stable learning [64], and a multi-objective algorithm [63]. Another model, proposed in [26], uses data density concepts for the fuzzification process and a fast extreme learning machine procedure.

It should also be noted that recent fuzzy neural network models have been used to solve cyber attack problems [12, 11]. However, unlike the approach proposed in this paper, those works used resampling techniques and a non-evolving approach, which generated excellent results but with long processing times.

2.9 Self-organised direction aware data partitioning algorithm- SODA

The way in which fuzzy models treat data can determine how closely the interpretability of hybrid models matches the real world. Models that are fully data-driven are the target of recent research and have achieved satisfactory results in data cloud clustering. This data-centered clustering concept is called Empirical Data Analytics (EDA) [7]. It groups the data without statistical or traditional probabilistic approaches, based entirely on the empirical observation of the input data of the model, without the need for any prior assumptions or parameters [42]. SODA is a data partitioning algorithm capable of identifying peaks/modes of the data distribution and using them as focal points to associate other points with data clouds that resemble a Voronoi tessellation. Data clouds can be understood as a particular type of cluster, but with important differences. They are non-parametric, and their shape is not predefined or predetermined by the type of distance metric used. Data clouds directly represent the properties of the local set of observed data samples [42]. The approach employs a magnitude component based on a traditional distance metric and a directional/angular component based on the cosine similarity. The main EDA operators, which are also suitable for streaming data processing, are described in [7]. They include the Cumulative Proximity, Local Density, and Global Density. See more in [42].

The SODA approach rests on two central concepts: a magnitude component based on a traditional distance metric and a directional/angular component based on the cosine similarity.

The most widely used Euclidean distance metric was used in SODA as the magnitude component, and thus the magnitude component can be expressed by [42]:

$\begin{matrix} d_{M} (x_{i}, x_{j}) = \lVert x_{i} - x_{j} \rVert = \sqrt{\sum_{k = 1}^{m} (x_{ik} - x_{jk})^{2}}, \\ i, j = 1, 2, \ldots, N \end{matrix}$ (1)

The angular component uses the concept of cosine similarity and is expressed in SODA as [42]:

$d_{A} (x_{i}, x_{j}) = \sqrt{1 - \cos (\Theta_{x_{i}, x_{j}})}, \quad i, j = 1, 2, \ldots, N$ (2)

where $\cos (\Theta_{x_{i}, x_{j}}) = \frac{\langle x_{i}, x_{j} \rangle}{\lVert x_{i} \rVert \lVert x_{j} \rVert}$ and $\Theta_{x_{i}, x_{j}}$ is the value of the angle between x_i and x_j [42].

When the magnitude and angular component values are used together, the problem can be projected onto a 2D plane. This plane is called the direction-aware (DA) plane [42].
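As a minimal sketch of the two components, the following code computes the magnitude and angular values for a pair of samples and returns them as a point on the DA plane. The function names are hypothetical, both functions are assumed to take the pair (x_i, x_j), and the vectors are assumed to be nonzero so that the cosine is defined.

```python
import math


def magnitude_component(x, y):
    # Euclidean distance between two samples (magnitude component).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def angular_component(x, y):
    # sqrt(1 - cos(theta)), where theta is the angle between x and y.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    cos_theta = dot / (norm_x * norm_y)  # assumes nonzero vectors
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.sqrt(1.0 - cos_theta)


def da_point(x, y):
    # Coordinates of the pair on the direction-aware (DA) plane.
    return (magnitude_component(x, y), angular_component(x, y))


x_i = [3.0, 0.0]
x_j = [0.0, 4.0]
point = da_point(x_i, x_j)  # orthogonal vectors: distance 5, angular value 1
```

Two vectors that point in the same direction have an angular component close to zero regardless of how far apart they are, which is why the two coordinates carry complementary information.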

The Empirical Data Analytics (EDA) operators [7] include the Cumulative Proximity, Local Density, and Global Density. Understanding the concept of density is relevant to the SODA approach. This theme has been extensively presented in [7] and [5]. The local density D_n is defined as the inverse of the normalized cumulative proximity π_n and directly indicates the main pattern of the observed data [5]. For the training samples x_i (i = 1, 2, ..., n; n > 1), it is defined as follows [42]:

$D_{n} (x_{i}) = \frac{\sum_{j = 1}^{n} π_{n} (x_{j})}{2 n π_{n} (x_{i})}$ (3)

Global density is defined for unique data samples together with their corresponding numbers of repeats in the dataset/stream. For a particular unique data sample u_i (i = 1, 2, ..., n_u; n_u ≥ 1), it is expressed as the product of its local density and its number of repeats f_i, considered as a weighting factor [7], as follows:

$D_{n}^{G} (u_{i}) = f_{i} D_{n} (u_{i})$ (4)
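The density operators can be sketched in a few lines. This is an illustration under stated assumptions rather than the reference implementation of [42]: the cumulative proximity of a sample is assumed to be the sum of its squared Euclidean distances to all samples, the denominator of the local density is assumed to refer to the sample being evaluated, and the data are assumed to contain at least two distinct samples. All function names are hypothetical.

```python
from collections import Counter


def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def cumulative_proximity(samples):
    # Assumption: pi_n(x_i) = sum over j of the squared distance d(x_i, x_j)^2.
    return [sum(squared_distance(x, y) for y in samples) for x in samples]


def local_density(samples):
    # D_n(x_i) = sum_j pi_n(x_j) / (2 * n * pi_n(x_i)).
    n = len(samples)
    pi = cumulative_proximity(samples)
    total = sum(pi)
    return [total / (2 * n * p) for p in pi]


def global_density(samples):
    # D^G(u_i) = f_i * D_n(u_i), with f_i the number of repeats of u_i.
    counts = Counter(samples)
    density = dict(zip(samples, local_density(samples)))
    return {u: counts[u] * density[u] for u in counts}


# Samples are tuples so that repeated values can be counted.
samples = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
dens = global_density(samples)
densest = max(dens, key=dens.get)  # candidate for the first DA plane origin
```

In this toy dataset the repeated sample (1.0, 0.0) has the highest global density, so it would be the starting reference of the DA plane projection described in Stage 2.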

Because the main EDA operators, cumulative proximity, local density (D), and global density (D^G), can be updated recursively, the SODA algorithm is suitable for online processing of streaming data, and the density groups of the data are updated as the data evolve.

The algorithm is performed in the following steps [42]:

Stage 1: Preparation: calculate the average values between every pair of data samples x₁, x₂, . . . , x_n for both the squared Euclidean components d_M and the squared angular components d_A.

Stage 2: DA Plane Projection: the DA projection operation begins with the unique data sample that has the highest global density, namely u₁. It is initially set to be the first reference, μ₁ ← u₁, which is also the origin point of the first DA plane, denoted by P₁ (L ← 1, where L is the number of existing DA planes in the data space).

Stage 3: Identifying the Focal Points: for each DA plane, denoted as P_e, find the neighboring DA planes.

Stage 4: Forming Data Clouds: After all the DA planes standing for the modes/peaks of the data density are identified, we consider their origin points, denoted by μ_o, as the focal points and use them to form data clouds according to as a Voronoi tessellation [76]. It is worth to stress that the concept of data clouds is quite similar to the concept of clusters, but differs in the following aspects: i) data clouds are nonparametric; ii) data clouds do not have a specific shape; iii) data clouds represent the real data distribution.
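The cloud-forming stage can be sketched as a nearest-focal-point assignment, which is what a Voronoi tessellation amounts to in practice. The example below is a simplified illustration: it assumes that the focal points have already been identified and uses only the Euclidean distance, whereas SODA also takes the angular component into account. The function name `form_data_clouds` is hypothetical.

```python
import numpy as np


def form_data_clouds(X, focal_points):
    """Assign each sample to its nearest focal point (its Voronoi cell)."""
    dists = np.linalg.norm(X[:, None, :] - focal_points[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Center of each cloud: mean of its members. This assumes every focal
    # point attracts at least one sample, which holds when the focal points
    # are distinct data samples.
    centers = np.vstack(
        [X[labels == k].mean(axis=0) for k in range(len(focal_points))]
    )
    return labels, centers


X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.2, 5.0]])
focal_points = np.array([[0.0, 0.0], [5.0, 5.0]])
labels, centers = form_data_clouds(X, focal_points)
```

No shape or distribution is imposed on the resulting groups, which reflects the nonparametric character of data clouds.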

Figure 9 shows an example of the SODA definition and the center of cloud grouping defined by the algorithm.

Fig.9

SODA [42].

3 EFNN-evolving fuzzy neural network

This section will present the main concepts involved in the architecture of the proposed algorithm for the identification of cyber attacks.

3.1 Model architecture

The fuzzy neural network described in this section is composed of three layers. In the first layer, fuzzification is performed through the concept of data density: the centers of the clusters are used to create the Gaussian fuzzy neurons, whose weights and bias are randomly defined in the range of zero to one. The second layer contains the logical neurons of the unineuron type [60]. These neurons have weights and activation functions determined at random, and they use t-norms and s-norms to aggregate the neurons of the first layer. The weights that connect the second layer with the output layer are defined using the concept of the extreme learning machine [51], acting on a neuron with a linear activation function.

Unineurons are used in the second layer to solve pattern recognition problems and bring interpretability to the model. Figure 10 illustrates the feedforward topology of the fuzzy neural networks considered in this paper.

Fig.10

FNN architecture.

The second layer is composed of L fuzzy logical neurons. Each neuron performs a weighted aggregation of some of the first-layer outputs, using the weights w_il (for i = 1, ..., N and l = 1, ..., L). For each input variable j, only one first-layer output a_jl is defined as input of the l-th neuron, so that w is sparse; each neuron of the second layer is associated with an input variable.

Finally, the output layer is composed of one neuron whose activation function is linear.

3.1.1 First layer - Evolving fuzzification

The first layer is composed of neurons whose activation functions are membership functions of fuzzy sets defined for the input variables. For each input variable x_ij, L clouds A_lj, l = 1, ..., L, are defined, whose membership functions are the activation functions of the corresponding neurons. Thus, the outputs of the first layer are the membership degrees associated with the input values, i.e., $a_{jl} = μ_{l}^{A}$ for j = 1, ..., N and l = 1, ..., L, where N is the number of inputs and L is the number of fuzzy sets for each input, as determined by SODA.
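A minimal sketch of this fuzzification step is given below. It assumes the usual Gaussian membership function exp(-(x - c)² / (2σ²)), with the cloud centers c supplied externally (for instance by SODA) and the widths σ drawn at random. The lower bound of 0.1 on σ is an assumption made here only to avoid division by very small values.

```python
import numpy as np


def gaussian_fuzzification(X, centers, sigma):
    """Return a[k, l, j]: membership degree of feature j of sample k in cloud l."""
    z = (X[:, None, :] - centers[None, :, :]) / sigma[None, :, :]
    return np.exp(-0.5 * z ** 2)


rng = np.random.default_rng(0)
centers = np.array([[0.0, 1.0], [1.0, 1.0]])        # L = 2 cloud centers, N = 2 inputs
sigma = rng.uniform(0.1, 1.0, size=centers.shape)   # random widths (assumed range)
X = np.array([[0.0, 1.0]])                          # one sample placed on the first center
a = gaussian_fuzzification(X, centers, sigma)
```

A sample that coincides with a cloud center receives membership 1 in that cloud for every feature, and the membership decays as the sample moves away from the center.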

For the evolving SODA algorithm, 75% of the training samples are used in the evolving form at the first moment, and the remaining 25% is used for the recursive updating of the parameters that define the data density groups.

3.1.2 Second layer - Fuzzy logical neurons and fuzzy rules

The logical neurons used in the second layer of the model are of the unineuron type [60]. They use the concept of the uninorm [103] to perform more simplified operations according to the activation functions of the fuzzy neurons. This formulation allows the unineuron to behave either as an and-neuron or as an or-neuron; important concepts about the unineuron are explained in [61]. The processing in these neurons occurs at two levels. At the first, local level L₁, the input signals are combined individually with the weights. At the second, global level L₂, a global aggregation operation is performed on the results of all first-level combinations. Traditional logical neurons use t-norms and s-norms to perform these operations. In the unineuron: 1- each pair (a_i, w_i) is transformed into a single value b_i = p (w_i, a_i); 2- the unified aggregation of the transformed values, U (b₁, b₂, ..., b_n), is calculated, where n is the number of inputs.

The function p is responsible for transforming the inputs and corresponding weights into individual transformed values. A formulation for the p function can be described as [60]:

$p (w, a) = w a + \bar{w} o$ (5)

where $\bar{w} = 1 - w$ is the complement of w and o is the identity element of the uninorm. Using the weighted aggregation reported above, the unineuron can be written as:

$z = UNI (w; a) = U_{i = 1}^{n} p (w_{i}, a_{i})$ (6)

where T is a t-norm (product) and S is an s-norm (probabilistic sum), both used in the construction of the uninorm U. Fuzzy rules can be extracted from unineurons according to the following example (Fig. 11):

$\begin{matrix} {Rule}_{1} : If x_{i 1} is A^{1} with certainty w_{1} . . . \\ and x_{i 2} is A^{2} with certainty w_{1} . . . \\ and x_{i L} is A^{L} with certainty w_{1} . . . \\ Then y_{1} is v_{1} \\ {Rule}_{2} : If x_{i 1} is A^{1} with certainty w_{2} . . . \\ and x_{i 2} is A^{2} with certainty w_{2} . . . \\ and x_{i L} is A^{L} with certainty w_{2} . . . \\ Then y_{2} is v_{2} \\ {Rule}_{3} : If x_{i 1} is A^{1} with certainty w_{3} . . . \\ and x_{i 2} is A^{2} with certainty w_{3} . . . \\ and x_{i L} is A^{L} with certainty w_{3} . . . \\ Then y_{3} is v_{3} \end{matrix}$ (7)

Fig.11

Evolving Systems Concepts [90].

These rules allow the creation of a knowledge base for building expert systems [15].
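The two-level processing of the unineuron can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the uninorm is built from the product t-norm and the probabilistic-sum s-norm rescaled around an identity element o in (0, 1); the minimum is used when the two arguments fall on opposite sides of o; and the weighting function is taken as p(w, a) = w·a + (1 - w)·o.

```python
from functools import reduce


def uninorm(x, y, o):
    """Uninorm with identity element o in (0, 1).

    Below o it acts as a rescaled product t-norm, above o as a rescaled
    probabilistic-sum s-norm; min is used in the mixed region (an assumption).
    """
    if not 0.0 < o < 1.0:
        raise ValueError("identity element o must lie strictly between 0 and 1")
    if x <= o and y <= o:
        return o * (x / o) * (y / o)
    if x >= o and y >= o:
        xs, ys = (x - o) / (1.0 - o), (y - o) / (1.0 - o)
        return o + (1.0 - o) * (xs + ys - xs * ys)
    return min(x, y)


def unineuron(a, w, o):
    """Level 1: b_i = p(w_i, a_i). Level 2: aggregate all b_i with the uninorm."""
    b = [wi * ai + (1.0 - wi) * o for ai, wi in zip(a, w)]
    return reduce(lambda u, v: uninorm(u, v, o), b)


z_full = unineuron([0.8, 0.6], [1.0, 1.0], o=0.5)     # both inputs fully weighted
z_ignored = unineuron([0.8, 0.6], [1.0, 0.0], o=0.5)  # second input has zero weight
```

With this weighting, an input whose weight is zero is mapped to the identity element and therefore does not affect the aggregation. An identity element close to 1 makes the neuron behave like an and-neuron, and one close to 0 like an or-neuron.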

3.1.3 Third layer - neural network of aggregation

The aggregation neural network uses the simple concept of a network with a bias and weights, which in this case are defined analytically by the extreme learning machine proposed by Huang. The output of the model is:

$y = sign \left( \sum_{j = 0}^{l} f_{linear} (z_{j}, v_{j}) \right)$ (8)

where z₀ = 1, v₀ is the bias, z_j and v_j, j = 1, ..., l, are the output of each fuzzy neuron of the second layer and its corresponding weight, respectively, and sign is the signum function.

3.1.4 Evolving fuzzy neural network training

Intelligent evolving systems are based on online machine learning methods for intelligent hybrid models. These systems are characterized by their ability to extract knowledge from data and to adapt their structure and parameters to changes in the environment [57, 25]. In general, they are formed by an evolving set of locally valid subsystems that represent different situations or points of operation [3]. Evolving connectionist systems (ECOS) were further developed in [58], not only to learn in an adaptive, incremental way from data that measure evolving processes, but also to extract rules and knowledge from the trained systems. Evolving Fuzzy Systems (eFS) [6] are adaptive systems that modify both their structure and their parameters as the data flow is processed. That is, the structure of the evolving fuzzy system can be reduced or expanded to fit each new input. Evolving fuzzy systems can be seen as a combination of fuzzy models, an evolving mechanism for representing and compacting input data, and recursive machine learning methods.

Fig. 11 exemplifies the processes involved in the concept of evolving systems.

The membership functions in the first layer of the FNN are Gaussian, constructed from the centers obtained by the evolving SODA granulation of the input space and from a randomly defined sigma (in the interval between zero and one). The number of neurons L in the first layer is defined according to the input data and the number of partitions (ρ), which is set as a parameter. This approach partitions the input space following the logic of creating data clouds. The centers of the clouds formed make up the Gaussian activation functions of the fuzzy neurons. This allows the model to adapt to the dataset submitted to it, enabling a more independent and data-centered approach. The second layer performs the aggregation of the L neurons from the first layer through the unineurons.

After the construction of the L unineurons, the bolasso algorithm [8] is executed, using LARS, to select the most significant neurons (called L_s). The final network architecture is defined through a feature extraction technique based on regularization and resampling. The learning algorithm assumes that the output of the hidden layer composed of the candidate neurons can be written as [94, 24]:

$f (x_{i}) = \sum_{l = 0}^{L_{ρ}} v_{l} z_{l} (x_{i}) = z (x_{i}) v$ (9)

where $v = [v_{0}, v_{1}, v_{2}, . . . , v_{L_{ρ}}]^{T}$ is the weight vector of the output layer and $z (x_{i}) = [z_{0}, z_{1} (x_{i}), z_{2} (x_{i}), . . . , z_{L_{ρ}} (x_{i})]$ is the output vector of the second layer, with z₀ = 1. In this context, z (x_i) is considered the non-linear mapping of the input space to a space of fuzzy characteristics of dimension L_ρ [24].

When the number of neurons is too high for the problem at hand, pruning techniques can be used to improve the architecture of the network. These techniques use mathematical concepts to determine the neurons most relevant to the model. Some feature selection techniques use probabilistic concepts based on Bayes' theory. A useful penalty arises from a dual-space representation of sparse Bayesian learning (SBL), which is based on automatic relevance determination (ARD): the solution space is regularized by a parameterized, data-dependent prior distribution that effectively eliminates unnecessary or superfluous features [75]. The canonical form of this problem is given in [102]:

$min_{x} ‖ x ‖_{0}, s . t . y = Φ x$ (10)

where $Φ \in ℝ^{n \times m}$ is a matrix whose columns φ_i form an overcomplete set of basis vectors (i.e., rank (Φ) = n and m > n), $x \in ℝ^{m}$ is a vector of unknown coefficients to be determined, and y is the signal vector. The cost function being minimized is the ℓ₀ norm of x, which counts the nonzero components of x [102]. If noise or modeling inaccuracies are present, the following alternative problem is solved instead:

$min_{x} ‖ y - Φ x ‖_{2}^{2} + λ ‖ x ‖_{0}, λ > 0$ (11)

SBL assumes a Gaussian likelihood function $p (y | x) = N (y; Φ x, λ I)$, consistent with the data-fit term in Equation (11). The basic ARD prior incorporated by SBL is $p (x; γ) = N (x; 0, diag [γ])$, where $γ \in ℝ_{+}^{m}$ is a vector of m non-negative hyperparameters governing the prior variance of each unknown coefficient. These hyperparameters are estimated from the data by first marginalizing over the coefficients x and then performing what is commonly referred to as evidence maximization or type-II maximum likelihood, which is equivalent to minimizing [102]:

$\begin{matrix} L (γ) ≜ - log \int p (y | x) p (x; γ) d x = \\ - log p (y; γ) \equiv log | Σ_{y} | + y^{T} Σ_{y}^{- 1} y \end{matrix}$ (12)

where Σ_y ≜ λI + ΦΓΦ^T and Γ ≜ diag [γ]. Once some $γ_{*} = arg min_{γ} L (γ)$ is computed, an estimate of the unknown coefficients can be obtained by setting x_SBL to the posterior mean computed using γ_∗ [102]:

$x_{SBL} = E [x | y; γ_{*}] = Γ_{*} Φ^{T} Σ_{y *}^{- 1} y .$ (13)

Note that if some γ_∗,i = 0, as often happens during the learning process, then x_SBL,i = 0 and the corresponding column of Φ is effectively pruned from the model. The resulting x_SBL is therefore sparse, with nonzero elements corresponding to the relevant basis vectors [102]. This probability-based approach to pruning second-layer neurons is efficient because it is a data-centric, non-parametric technique, so the relevance of each neuron is determined by the problem data.
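For a given hyperparameter vector γ, the cost in Eq. (12) and the posterior mean in Eq. (13) reduce to a few lines of linear algebra. The sketch below only evaluates these two expressions; it does not implement the optimization that searches for the minimizing γ. The function names and the toy values of Φ, y, γ and λ are illustrative assumptions.

```python
import numpy as np


def sbl_cost(Phi, y, gamma, lam):
    """Eq. (12): log|Sigma_y| + y^T Sigma_y^{-1} y, with Sigma_y = lam*I + Phi Gamma Phi^T."""
    Sigma_y = lam * np.eye(Phi.shape[0]) + Phi @ np.diag(gamma) @ Phi.T
    _, logdet = np.linalg.slogdet(Sigma_y)
    return logdet + y @ np.linalg.solve(Sigma_y, y)


def sbl_posterior_mean(Phi, y, gamma, lam):
    """Eq. (13): x_SBL = Gamma Phi^T Sigma_y^{-1} y."""
    Sigma_y = lam * np.eye(Phi.shape[0]) + Phi @ np.diag(gamma) @ Phi.T
    # Multiplying by the diagonal matrix Gamma is an elementwise product with gamma.
    return gamma * (Phi.T @ np.linalg.solve(Sigma_y, y))


Phi = np.array([[1.0, 0.0, 1.0],
                [0.0, 1.0, 1.0]])      # n = 2 observations, m = 3 candidate columns
y = np.array([1.0, 2.0])
gamma = np.array([1.0, 0.0, 2.0])      # second hyperparameter driven to zero
x_sbl = sbl_posterior_mean(Phi, y, gamma, lam=0.1)
kept = np.flatnonzero(x_sbl)           # indices of the columns (neurons) that survive
```

In this toy case the second coefficient is exactly zero because its hyperparameter is zero, so only the first and third columns remain in the model.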

Once the network topology is determined, the weight vector of the output layer is estimated using the Moore-Penrose pseudo-inverse [41]:

$v = Z^{+} y$ (14)

where Z⁺ is the Moore-Penrose pseudo-inverse of Z, which yields the minimum-norm least-squares solution for the output weights.
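The analytical estimation of the output weights, followed by the signum output of the third layer, can be sketched with NumPy as shown below. The matrix Z is assumed to hold one row per training sample, with a leading column of ones for the bias term z₀ = 1; the toy values are illustrative only.

```python
import numpy as np


def fit_output_weights(Z, y):
    """Eq. (14): v = Z^+ y, the minimum-norm least-squares solution."""
    return np.linalg.pinv(Z) @ y


def predict(Z, v):
    """Signum of the linear output neuron; labels are expected in {-1, 1}."""
    return np.sign(Z @ v)


# Toy second-layer outputs: a bias column z_0 = 1 and one unineuron output.
Z = np.array([[1.0, 0.9],
              [1.0, 0.8],
              [1.0, 0.2],
              [1.0, 0.1]])
y = np.array([1.0, 1.0, -1.0, -1.0])
v = fit_output_weights(Z, y)
y_hat = predict(Z, v)
```

Because the weights are obtained in a single linear-algebra step, no iterative weight update such as backpropagation is needed for this layer.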

4 Construction of expert systems for cyber attack prevention

The proposed model is summarized in Algorithm 1. It has one parameter: the grid size, ρ.

Algorithm 1
EFNN training

(1) Define the grid size, ρ.

(2) Calculate L clusters in the first layer using SODA and ρ.

(2.1) The evolving approach uses 75% of the training samples in the offline training and 25% for online updating of the clusters.

(3) Construct L fuzzy neurons with Gaussian membership functions constructed with center values derived from SODA and sigma defined at random.

(4) Define the weights and bias of the fuzzy neurons randomly.

(5) Construct L logic neurons with random weights and bias on the second layer of the network by aggregating the L fuzzy neurons of the first layer.

(6) For all K entries do

(6.1) Calculate the mapping z (x_i)

end for

(7) Select the L_s significant neurons using ARD (Eq. 13).

(8) Estimate the weights of the output layer (Eq. 14).

(9) Calculate the output of the model using an artificial neuron with a linear activation function.

The main advantages of the model proposed in this paper over other models are that the nature of the problem data defines the construction of the fuzzy rules and that the rules most representative of the problem are selected using Bayesian techniques.

5 Simulation of binary cyber attacks

5.1 Databases used in the tests

The data originating from MIT's Lincoln Lab, the most popular free dataset used in IDS assessment [97], were used to test the approach proposed in this paper. The dataset contains recordings of the total network flow of a network that was installed at Lincoln Labs to simulate the military network of the US Air Force. The academic community commonly uses this database because it was the subject of the 1999 KDD Cup challenge.

The network event analysis method considers the connection between a source IP address and a destination IP address, during which a sequence of TCP packets is exchanged using a specific protocol and a strictly defined operating time. Each connection is described by 41 features organized into the following four basic categories: Content Features, Traffic Features, Time-Based Traffic Features, and Host-Based Traffic Features. In addition, the attacks are divided into four categories, namely: DoS, r2l, u2r, and probe.

The following files and configurations were used to perform the tests for the network traffic. TrafRedFull.data (TRFD): in the first classification case, all 41 features were used. The data were labeled as normal or abnormal. The TrafRedFull.data dataset has 145,738 records, of which 70% (102,016 records) were allocated to training and the remaining 30% (43,722 records) were used in the validation test of the model.

In the second classification case, normalFull.data (NFD) was used, which contains the standard features relevant to the problem (11 features). The data were also labeled as normal or abnormal. The normalFull.data dataset has the same number of records as TrafRedFull.data, so the division into training and test data was the same.

FullVirusDataset.data (FVD) has a total of 5,498 records consisting of 2,598 packed viruses from the Malfease Project dataset (footnote 3), 2,231 non-packed benign executables collected from a Windows XP Home installation plus several common user applications, and 669 packed benign executables. The dataset was randomly divided into two parts: a training set (70%) containing 3,849 patterns and a test set (30%) containing 1,649 patterns. These datasets [80] are available at the address given in footnote 4.

VirusDataset.data (VD), containing 2,598 malware samples and 669 benign executables, is divided into two parts: a training set (70%) containing 1,834 malware-related patterns and 453 patterns related to benign executables, and a test set (30%) containing 762 malware-related patterns and 218 patterns related to benign executables. To translate each executable into a pattern vector, Perdisci et al. [80] used binary static analysis to extract information such as the names of the code and data sections, the number of writable and executable sections, and the code and data entropy.

SQLDataset.data (SQLD), used to evaluate SQL injection attack patterns, includes a list of 13,884 SQL statements that were collected from various sources through computational means. Of these, 12,881 are malicious (SQL injections) and 1,003 are legitimate commands. With the help of the sqlparse module in Python (footnote 5), the statements were validated as malicious SQL code by examining the syntax path and the use of certain SQL symbols in the construction of SQL injection commands. The correlation patterns of the SQL statements with SQL injection attacks were also obtained. These patterns therefore represent the main characteristics collected [32].

5.2 Configurations and models used in the intrusion detection and attack test

In this section, the assumptions of the classification tests for the model proposed in this paper are presented. To perform the tests, real and synthetic bases were chosen, seeking to verify whether the accuracy of the proposed model surpasses that of traditional pattern classification techniques. All tests with the algorithms involved were randomized, avoiding tendencies that could interfere with the evaluation of the results. The model proposed in this paper, called EFNN, was compared to fuzzy neural network classifiers using fuzzy c-means [14] (FFNN) [61] and genfis1 [53] (GFNN) [100]. In the evolving SODA algorithm, 75% of the training samples are used for the offline training and the remaining 25% for the evolving training. In all models, the weights and bias in the first and second layers were defined randomly. The number of primary neurons of each model is defined according to the number of centers (FFNN), membership functions (GFNN) and grid size (EFNN). For uniformity of the tests, the values that end up defining the number of neurons L in the first layers of the models were varied in the range [3–10], and the best values were selected using cross-validation. In the two comparison models (GFNN and FFNN), the unineuron [60] was adopted for the composition of the logical neurons, and a single neuron with a linear activation function was used in the third layer. For the model proposed in this paper, both the type of the L neurons and the activation function of the output neuron of the artificial neural network are defined according to the database and the combination established stochastically.

To verify the ability of the model to classify binary patterns, the results obtained were compared with those of traditional machine learning models available in the WEKA tool [47]: Naive Bayes [56], Multilayer Perceptron [86] and C4.5 [83]. In the artificial neural network models that use ELM for training, the same number of neurons as in the fuzzy neural network tests was used. For the WEKA models, 10-fold cross-validation was used to obtain the results, and the default configurations of the algorithms in WEKA were maintained. A total of 30 experiments were performed with all models on all test databases. In all tests and for all models, the samples were shuffled in each run to demonstrate the actual capacity of the models. Percentage values for the classification tests are presented in the results tables, accompanied by the standard deviation over the 30 replicates. Accuracy was obtained by comparing the expected outputs with the outputs produced by the models. Finally, the AUC and the execution time of the tests are also reported. The outputs expected in the tests were set to -1 and 1, so all bases whose outputs were coded as zero and one were converted accordingly. Accuracy is the primary test result. In the neural network models, the activation functions are of the hyperbolic tangent type. The performance of the algorithms is evaluated through the following equations:

$accuracy = \frac{TP + TN}{TP + FN + TN + FP}$ (15)

$AUC = \frac{1}{2} (sensitivity + specificity)$ (16)

where the sensitivity and specificity of these models are calculated by:

$sensitivity = \frac{TP}{TP + FN}$ (17)

$specificity = \frac{TN}{TN + FP}$ (18)

In all cases, TP = true positive, TN = true negative, FN = false negative and FP = false positive.
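These metrics can be computed from the expected and predicted labels as in the following sketch. It assumes labels coded as -1 and 1, with 1 as the positive (attack) class, and computes specificity as the true-negative rate TN / (TN + FP). The function name is illustrative, and no guard is included for test sets that lack one of the two classes.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and AUC (Eqs. 15-18) from label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "auc": 0.5 * (sensitivity + specificity),
    }


y_true = [1, 1, 1, 1, -1, -1]
y_pred = [1, 1, 1, -1, -1, 1]
metrics = binary_metrics(y_true, y_pred)
```

In this toy case there are three true positives, one false negative, one true negative and one false positive, so the sensitivity is 0.75, the specificity is 0.5 and the AUC is 0.625.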

5.3 Results

Table 1 presents the accuracy results of the models over the 30 replicates for each of the bases evaluated.

Table 1
Accuracies of the Fuzzy Neural Network in the tests performed

Dataset	EFNN	FFNN	GFNN	NB	MLP	C4.5
TRFD	97.84 (0.07)	97.24 (0.09)	97.12 (0.09)	96.89 (0.03)	96.18 (0.07)	97.02 (0.15)
NFD	99.16 (0.76)	99.12 (0.42)	99.24 (0.01)	98.37 (0.76)	99.03 (0.41)	98.78 (0.44)
FVD	97.24 (0.09)	89.62 (1.15)	95.18 (0.98)	94.17 (0.62)	98.17 (0.64)	91.33 (0.86)
VD	95.46 (0.86)	93.30 (1.64)	92.48 (3.42)	84.35 (28.97)	91.42 (2.36)	91.06 (26.23)
SQLD	98.49 (0.66)	97.63 (1.14)	98.02 (0.07)	92.74 (3.22)	96.73 (1.16)	95.44 (2.06)

Table 2

AUC of the Fuzzy Neural Network in the tests performed

Dataset	EFNN	FFNN	GFNN	NB	MLP	C4.5
TRFD	0.9700 (0.07)	0.9700 (0.02)	0.9700 (0.09)	0.9700 (0.03)	0.8426 (0.23)	0.9698 (0.07)
NFD	0.9899 (0.01)	0.9902 (0.02)	0.9901 (0.01)	0.9814 (0.16)	0.9900 (0.01)	0.9866 (0.14)
FVD	0.9700 (0.03)	0.8854 (0.11)	0.9418 (0.10)	0.9514 (0.02)	0.9841 (0.01)	0.9186 (0.12)
VD	0.9441 (0.06)	0.9258 (0.04)	0.9183 (0.12)	0.8366 (0.58)	0.9083 (0.06)	0.8871 (0.23)
SQLD	0.9841 (0.16)	0.9687 (0.09)	0.9802 (0.02)	0.9198 (0.15)	0.9603 (0.05)	0.9504 (0.03)

Table 2 shows the AUC results of the models.

Table 3

Execution time (in seconds) of the Fuzzy Neural Network in the tests performed

Dataset	EFNN	FFNN	GFNN	NB	MLP	C4.5
TRFD	2396.52 (541.07)	7345.84 (1845.62)	6850.69 (2412.52)	3214.54 (587.15)	6214.52 (1054.33)	6358.78 (448.98)
NFD	2287.54 (125.69)	5894.87 (336.02)	7369.58 (1500.22)	2079.65 (233.18)	2265.98 (314.01)	3665.51 (450.14)
FVD	168.54 (12.93)	198.25 (40.64)	242.87 (30.29)	2079.65 (233.18)	2265.98 (314.01)	3665.51 (450.14)
VD	74.56 (5.42)	116.87 (14.27)	238.98 (19.26)	98.74 (21.06)	69.87 (10.18)	116.32 (15.23)
SQLD	584.17 (12.21)	774.16 (21.18)	1158.47 (100.22)	441.46 (16.48)	325.10 (21.54)	233.47 (14.16)

Table 3 presents the execution time (in seconds) of the algorithms for the proposed tests.

5.4 Fuzzy Rules

The fuzzy rules generated by the system express a logical and interpretable relation among the possible contexts of invasion. The technical terms involved in artificial intelligence concepts can hinder the learning and understanding of the people involved in the processes. The set of generated fuzzy rules provides a language that is closer to the contexts of daily life. A group formed may, for example, relate the malice level and the trust level of a packet or request. When new data are identified as belonging to this group, more coherent decisions can be made to prevent computerized systems from suffering cyber attacks. The following rule example shows how the rules can assist in training and in the dissemination of technical knowledge:

"If the length is group 1 AND / OR the entropy is low AND / OR the level of malice is group 3, AND the confidence level is group 2, AND the level difference is group 1, then there is a SQL injection invasion." Here, according to the expert's knowledge, group 1 can be identified as a moderate characteristic, group 2 as a high characteristic and group 3 as a simple characteristic.

6 Discussion - conclusions

Tables 1 and 2 show that the proposed algorithm was able to identify possible attacks from several sources of threats, achieving the highest accuracy on three of the five datasets (TRFD, VD and SQLD). This indicates that the evolving approach is important for day-to-day operations, where systems need to learn about new situations and new attacks.

In this context, it can be verified that the proposed model acts in a similar way or even with superior performance to the algorithms of classification of binary patterns commonly used in the literature.

Among the hybrid approaches, the proposed model stands out with superior performance and a much shorter execution time than the others. This is because the other approaches update equally spaced clusters and membership functions, which can make them unfeasible for large flows of information. The model performed its tasks in acceptable times, although still inferior to those of the traditional classifier algorithms. It has the advantage of interpretable results, a factor that is not present in neural network models, which can be seen as black boxes.

The generated fuzzy rules can serve as business rules for the construction of other computer systems or for employee training and dissemination of knowledge, so that, according to the discovered patterns, preventive and corrective actions can be taken in companies or in the daily routine of people.

The technique proposed in this paper has advantages and disadvantages concerning the intelligent models commonly used to solve problems of cyber attacks as follows:

Advantages: Compared to traditional neural network models that use backpropagation to update network parameters, the model works with randomly defined internal parameters and computes only the third-layer weights, which makes it simpler. Because it is a hybrid model, it has advantages over artificial neural networks by exploring aspects of interpretability of the problem, transforming the data into linguistic variables.

Compared to other hybrid models that are state of the art in identifying data attacks in cyberspace, the techniques used in the fuzzification of the model are based entirely on the nature of the data. This allows the formation of neurons that are more representative of the nature of the problem and thus a more compact network.

Disadvantages: Neural network models can be much faster in the training phase, because the model proposed in this paper performs a more elaborate training in which it extracts the characteristics of the data submitted to it. Another disadvantage is that, depending on the parameters used in the creation of the data clouds, the model may have its accuracy impaired when compared to FNN models that use grid-based fuzzification techniques. This is because, in the grid technique, all possible combinations are defined by the membership functions formed.

Therefore we can conclude that the evolving fuzzy neural network proposed in this paper meets the requirements of a binary cyber attack classifier and still provides a set of knowledge that may be of value to individuals or companies. The density of the data allows us to find patterns of a grouping of characteristics allowing such situations to be taken care of by IT managers, ordinary people or even entire corporations.

The technique proposed in this paper may also work on classification problems related to other binary problems with numerical features, such as in health and industry.

As future work, other testing procedures, comparisons with other models, and other ways of evaluating results may be addressed.

Footnotes

The authors thank CEFET-MG and UNA for supporting the creation of this work.

For the execution of the tests, the configuration adopted in the work of [] was taken as the reference, differing only in the percentages allocated to training and testing for each of the models.

References

Maryam

Alavi and Dorothy

E. Leidner

, Knowledge management and knowledge management systems: Conceptual foundations and research issues, MIS quarterly (2001), 107–136.

Ammar

Almomani ,

Tat-Chee

Wan ,

Altyeb

Altaher ,

Ahmad

Manasrah ,

Eman

ALmomani ,

Mohammed

Anbar ,

Esraa

ALomari and

Sureswaran

Ramadass , Evolving fuzzy neural network for phishing emails detection, Journal of Computer Science 8(7) (2012), 1099.

Plamen

Angelov and

Edwin

Lughofer , Data-driven evolving fuzzy systems using ets and flexfis: Comparative analysis, International Journal of General Systems 37(1) (2008), 45–67. doi: 10.1080-03081070701500059.

Plamen

Angelov and

Ronald

Yager , A new type of simplified fuzzy rule-based system, International Journal of General Systems 41(2) (2012), 163–185.

Plamen

Angelov ,

Xiaowei

Gu and

Dmitry

Kangin , Empirical data analytics, International Journal of Intelligent Systems 32(12) (2017), 1261–1284.

Plamen

P. Angelov

, Evolving rule-based models: A tool for design of flexible adaptive systems, volume 92. Physica, (2013).

Plamen

P. Angelov

Xiaowei

Gu and José

C. Príncipe

, A generalized methodology for data analysis, IEEE transactions on cybernetics, (2017).

Francis

R. Bach

, Bolasso: Model consistent lasso estimation through the bootstrap, In Proceedings of the 25th international conference on Machine learning, pages 33–40, ACM, (2008).

Rosangela

Ballini and

Fernando

Gomide , Heuristic learning in recurrent neural fuzzy networks, Journal of Intelligent & Fuzzy Systems 13(2–4) (2002), 63–74.

10.

Rosangela

Ballini and

Fernando

Gomide , Learning in recurrent, hybrid neurofuzzy networks, In Fuzzy Systems, 2002. FUZZ-IEEE'02. Proceedings of the 2002 IEEE International Conference on, volume 1, pages 785–790. IEEE, (2002).

11.

Lucas

Oliveira Batista ,

Gabriel

Adriano de Silva ,

Vanessa

Souza Araújo ,

Vinícius

Jonathan Silva Araújo ,

Thiago

Silva Rezende ,

Augusto

Junio and

Paulo

Vitor de Campos Souza Guimarães , Utilização de redes neurais nebulosas para criaçáo de um sistema especialista em invasões cibernéticas. In The Tenth International Conference on FORENSIC COMPUTER SCIENCE and CYBER LAWICOFCS 2018, number 10 in 1, pages 12–22. BRASíLIA CHAPTER OF THE HIGH TECHNOLOGY CRIME INVESTIGATION ASSOCIATION (HTCIA), (2018).

12.

Lucas

Oliveira Batista ,

Gabriel

Adriano de Silva ,

Vanessa

Souza Araújo ,

Vinícius

Jonathan Silva Araújo ,

Thiago

Silva Rezende ,

Augusto

Junio Guimarães and

Paulo

Vitor de Campos Souza , Fuzzy neural networks to create an expert system for detecting attacks by sql injection, International Journal of Forensic Computer Science 13(1) (2019), 8–21.

13.

Michael

J. Berry

and

Gordon

Linoff , Data mining techniques: For marketing, sales, and customer support. John Wiley & Sons, Inc., (1997).

14.

James

C. Bezdek

Robert

Ehrlich and

William

Full , Fcm: The fuzzy c-means clustering algorithm, Computers & Geosciences 10(2–3) (1984), 191–203.

15.

Walmir

M. Caminhas

Hermano

Tavares ,

Fernando AC

Gomide and

Witold

Pedrycz , Fuzzy set based neural networks: Structure, learning and application, JACIII 3(3) (1999), 151–157.

16.

Shan

Cao , YuhuanWang and

Jun

Li , Approximation of fuzzy neural networks based on choquet integral, Journal of Intelligent & Fuzzy Systems 31(2) (2016), 691–698.

17.

Chu

Kwong Chak and

Gang

Feng , A new fuzzy neural network system, Journal of Intelligent & Fuzzy Systems 3(2) (1995), 131–144.

18.

Fi-John

Chang and

Yen-Chang

Chen , A counterpropagation fuzzy-neural network modeling approach to real time streamflow prediction, Journal of hydrology 245(1–4) (2001), 153–164.

19.

NEURAL NETWORK OVER NSL DATASET, Hybrid of fuzzy clustering neural network over nsl dataset for intrusion detection system, Journal of Computer Science 9(3) (2013), 391– 403.

20. Paulo Vitor de Campos Souza and Pedro Felipe Alves de Oliveira, Regularized fuzzy neural networks based on nullneurons for problems of classification of patterns, In 2018 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE), pages 25–30. IEEE, (2018).

21. Paulo Vitor de Campos Souza and Augusto Junio Guimaraes, Using fuzzy neural networks for improving the prediction of children with autism through mobile devices, In 2018 IEEE Symposium on Computers and Communications (ISCC), pages 1086–1089, (2018).

22. Paulo Vitor de Campos Souza and Luiz Carlos Bambirra Torres, Regularized fuzzy neural network based on or neuron for time series forecasting, In North American Fuzzy Information Processing Society Annual Conference, pages 13–23. Springer, (2018).

23. Paulo Vitor de Campos Souza, Augusto Junio Guimaraes, Vanessa Souza Araújo, Thiago Silva Rezende and Vinicius Jonathan Silva Araújo, Fuzzy neural networks based on fuzzy logic neurons regularized by resampling techniques and regularization theory for regression problems, Inteligencia Artificial 21(62) (2018), 114–133.

24. Paulo Vitor de Campos Souza, Gustavo Rodrigues Lacerda Silva and Luiz Carlos Bambirra Torres, Uninorm based regularized fuzzy neural networks, In 2018 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), pages 1–8. IEEE, (2018).

25. Paulo Vitor de Campos Souza, Luiz Carlos Bambirra Torres, Augusto Junio Guimarães and Vanessa Souza Araujo, Pulsar detection for wavelets soda and regularized fuzzy neural networks based on andneuron and robust activation function, International Journal on Artificial Intelligence Tools 28(01) (2019), 1950003.

26. Paulo Vitor de Campos Souza, Luiz Carlos Bambirra Torres, Augusto Junio Guimaraes, Vanessa Souza Araujo, Vinicius Jonathan Silva Araujo and Thiago Silva Rezende, Data density-based clustering for regularized fuzzy neural networks based on nullneurons and robust activation function, Soft Computing (2019), pages –15.

27. José de Jesús Rubio, Sofmls: Online self-organizing fuzzy modified least-squares network, IEEE Transactions on Fuzzy Systems 17(6) (2009), 1296–1309.

28. José de Jesús Rubio, Error convergence analysis of the sufin and csufin, Applied Soft Computing 72 (2018), 587–595.

29. José de Jesús Rubio, Edwin Lughofer, Jesús A. Meda-Campaña, Luis Alberto Páramo, Juan Francisco Novoa and Jaime Pacheco, Neural network updating via argument kalman filter for modeling of takagi-sugeno fuzzy models, Journal of Intelligent & Fuzzy Systems 35(2) (2018), 1–12.

30. Konstantinos Demertzis and Lazaros Iliadis, A hybrid network anomaly and intrusion detection approach based on evolving spiking neural network classification, In International Conference on e-Democracy, pages 11–23. Springer, (2013).

31. Konstantinos Demertzis and Lazaros Iliadis, Evolving computational intelligence system for malware detection, In International Conference on Advanced Information Systems Engineering, pages 322–334. Springer, (2014).

32. Konstantinos Demertzis and Lazaros Iliadis, A bio-inspired hybrid artificial intelligence framework for cyber security, In Computation, Cryptography and Network Security, pages 161–193. Springer, (2015).

33. Konstantinos Demertzis and Lazaros Iliadis, Ladon: A cyberthreat bio-inspired intelligence management system, Journal of Applied Mathematics and Bioinformatics 6(3) (2016), 45.

34. Konstantinos Demertzis and Lazaros Iliadis, Computational intelligence anti-malware framework for android os, Vietnam Journal of Computer Science 4(4) (2017), 245–259.

35. Konstantinos Demertzis and Lazaros Iliadis, A computational intelligence system identifying cyber-attacks on smart energy grids, In Modern Discrete Mathematics and Analysis, pages 97–116. Springer, (2018).

36. Konstantinos Demertzis, Lazaros Iliadis and Stefanos Spartalis, A spiking one-class anomaly detection framework for cyber-security on industrial control systems. In International Conference on Engineering Applications of Neural Networks, pages 122–134. Springer, (2017).

37. Konstantinos Demertzis, Lazaros Iliadis and Vardis-Dimitris Anezakis, A dynamic ensemble learning framework for data stream analysis and real-time threat detection. In International Conference on Artificial Neural Networks, pages 669–681. Springer, (2018).

38. Lixin Fan, Revisit fuzzy neural network: Bridging the gap between fuzzy logic and deep learning, Technical report, Nokia Technologies, (2017).

39. Mike Featherstone and Roger Burrows, Cyberspace/cyberbodies/cyberpunk: Cultures of technological embodiment, volume 43. Sage, (1996).

40. Mihaela Gheorghe and Ruxandra Petre, Integrating data mining techniques into telemedicine systems, Informatica Economica 18(1) (2014), 120.

41. Gene Golub and William Kahan, Calculating the singular values and pseudo-inverse of a matrix, Journal of the Society for Industrial and Applied Mathematics, Series B: Numerical Analysis 2(2) (1965), 205–224.

42. Xiaowei Gu, Plamen Angelov, Dmitry Kangin and Jose Principe, Self-organised direction aware data partitioning algorithm, Information Sciences 423 (2018), 80–95.

43. Augusto Junio Guimaraes, Vinicius Jonathan Araujo, Lucas Batista, Paulo Vitor Campos Souza, Vanessa de Souza Araujo and Thiago Silva Rezende, Using fuzzy neural networks to improve the prediction of expert systems for predicting breast cancer. In ENIAC 2018, Sao Paulo, Oct (2018).

44. Augusto Junio Guimarães, Vinicius Jonathan Silva Araujo, Paulo Vitor de Campos Souza, Vanessa Souza Araujo and Thiago Silva Rezende, Using fuzzy neural networks to the prediction of improvement in expert systems for treatment of immunotherapy. In Ibero-American Conference on Artificial Intelligence, pages 229–240. Springer, Cham, (2018).

45. David Hakken, Cyborgs@cyberspace?: An ethnographer looks to the future. Routledge, (2002).

46. William G. Halfond, Jeremy Viegas and Alessandro Orso, A classification of sql-injection attacks and countermeasures, In Proceedings of the IEEE International Symposium on Secure Software Engineering, volume 1, pages 13–15. IEEE, (2006).

47. Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann and Ian H. Witten, The weka data mining software: An update, ACM SIGKDD Explorations Newsletter 11(1) (2009), 10–18.

48. Susan Sales Harkins and Martin WP Reid, Structured query language. In SQL: Access to SQL Server, pages 1–5. Springer, (2002).

49. Chunmei He, Yaqi Liu, Tong Yao, Fanhua Xu, Yanyun Hu and Jinhua Zheng, A fast learning algorithm based on extreme learning machine for regular fuzzy neural network, Journal of Intelligent & Fuzzy Systems, Pre-press(Preprint) (2018), 1–7.

50. Michel Hell, Pyramo Costa and Fernando Gomide, Recurrent neurofuzzy network in thermal modeling of power transformers, IEEE Transactions on Power Delivery 22(2) (2007), 904–910.

51. Guang-Bin Huang, Hongming Zhou, Xiaojian Ding and Rui Zhang, Extreme learning machine for regression and multiclass classification, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 42(2) (2012), 513–529.

52. Gabriela Hug and Joseph Andrew Giampapa, Vulnerability assessment of ac state estimation with respect to false data injection cyber-attacks, IEEE Transactions on Smart Grid 3(3) (2012), 1362–1370.

53. J-SR Jang, Anfis: Adaptive-network-based fuzzy inference system, IEEE Transactions on Systems, Man, and Cybernetics 23(3) (1993), 665–685.

54. Muna Mhammad T. Jawhar and Monica Mehrotra, Design network intrusion detection system using hybrid fuzzy-neural network, International Journal of Computer Science and Security 4(3) (2010), 285–294.

55. Ming Jin, Javad Lavaei and Karl H. Johansson, Power grid ac-based state estimation: Vulnerability analysis against cyber attacks, IEEE Transactions on Automatic Control (2018).

56. George H. John and Pat Langley, Estimating continuous distributions in bayesian classifiers. In Proceedings of the Eleventh conference on Uncertainty in artificial intelligence, pages 338–345. Morgan Kaufmann Publishers Inc., (1995).

57. Nikola Kasabov and Dimitar Filev, Evolving intelligent systems: Methods, learning, & applications. In Evolving Fuzzy Systems, 2006 International Symposium on, pages 8–18. IEEE, (2006).

58. Nikola K. Kasabov, Artificial neural networks. Evolving connectionist systems. In Time-Space, Spiking Neural Networks and Brain-Inspired Artificial Intelligence, pages 39–83. Springer, (2019).

59. Nishtha Kesswani, Hongbo Lyu and Zuopeng Zhang, Analyzing android app privacy with gp-pp model, IEEE Access 6 (2018), 39541–39546.

60. Andre Lemos, Walmir Caminhas and Fernando Gomide, New uninorm-based neuron model and fuzzy neural networks. In Fuzzy Information Processing Society (NAFIPS), 2010 Annual Meeting of the North American, pages 1–6. IEEE, (2010).

61. Andre Paim Lemos, Walmir Caminhas and Fernando Gomide, A fast learning algorithm for uninorm-based fuzzy neural networks. In Fuzzy Information Processing Society (NAFIPS), 2012 Annual Meeting of the North American, pages 1–6. IEEE, (2012).

62. Fanny Lalonde Lévesque, Sonia Chiasson, Anil Somayaji and José M. Fernandez, Technological and human factors of malware attacks: A computer security clinical trial approach, ACM Transactions on Privacy and Security (TOPS) 21(4) (2018), 18.

63. Chengdong, Zixiang Ding, Dianwei Qian and Yisheng, Data-driven design of the extended fuzzy neural network having linguistic outputs, Journal of Intelligent & Fuzzy Systems 34(1) (2018), 349–360.

64. Xiao-li, Xiao-fei Zhang, Chao Jia and De-xin Liu, Multimodel adaptive control based on fuzzy neural networks, Journal of Intelligent & Fuzzy Systems 27(2) (2014), 965–975.

65. Xiaotong, Hua, Bingzhen Sun and Fang Wang, Assessing information security risk for an evolving smart city based on fuzzy and grey fmea, Journal of Intelligent & Fuzzy Systems 34(4) (2018), 2491–2501.

66. Joon S. Lim, Finding features for real-time premature ventricular contraction detection using a fuzzy neural network system, IEEE Transactions on Neural Networks 20(3) (2009), 522–527.

67. Jerry W. Lin, Mark I. Hwang and Jack D. Becker, A fuzzy neural network for assessing the risk of fraudulent financial reporting, Managerial Auditing Journal 18(8) (2003), 657–665.

68. Liu, Lin, Wu, Chuang and Lin, Brain dynamics in predicting driving fatigue using a recurrent self-evolving fuzzy neural network, IEEE Transactions on Neural Networks and Learning Systems 27(2) (2016), 347–360. ISSN 2162-237X. doi:10.1109/TNNLS.2015.2496330.

69. Yurong Liu, Zidong Wang, Yuan Yuan and Fuad E. Alsaadi, Partial-nodes-based state estimation for complex networks with unbounded distributed delays, IEEE Transactions on Neural Networks and Learning Systems 29(8) (2018), 3906–3912.

70. Zhi-Qiang Liu and Yan, Fuzzy neural network in case-based diagnostic system, IEEE Transactions on Fuzzy Systems 5(2) (1997), 209–222. ISSN 1063-6706. doi:10.1109/91.580796.

71. Jin Long, Jin Jian and Yao Cai, A short-term climate prediction model based on a modular fuzzy neural network, Advances in Atmospheric Sciences 22(3) (2005), 428–435.

72. Leandro Maciel, Andre Lemos, Fernando Gomide and Rosangela Ballini, Evolving fuzzy systems for pricing fixed income options, Evolving Systems 3(1) (2012), 5–18.

73. Leandro Maciel, Rosangela Ballini and Fernando Gomide, Evolving granular analytics for interval time series forecasting, Granular Computing 1(4) (2016), 213–224.

74. Ananda Mitra and Rae Lynn Schwartz, From cyber space to cybernetic space: Rethinking the relationship between real and virtual spaces, Journal of Computer-Mediated Communication 7(1) (2001), JCMC713.

75. Radford M. Neal, Bayesian learning for neural networks, volume 118. Springer Science & Business Media, (2012).

76. Atsuyuki Okabe, Barry Boots, Kokichi Sugihara and Sung Nok Chiu, Spatial tessellations: Concepts and applications of Voronoi diagrams, volume 501. John Wiley & Sons, (2009).

77. Witold Pedrycz, Neurocomputations in relational systems, IEEE Transactions on Pattern Analysis & Machine Intelligence 13(3) (1991), 289–297.

78. Witold Pedrycz, Logic-oriented fuzzy neural networks, International Journal of Hybrid Intelligent Systems 1(1–2) (2004), 3–11.

79. Witold Pedrycz and Fernando Gomide, Fuzzy systems engineering: Toward human-centric computing. John Wiley & Sons, (2007).

80. Roberto Perdisci, Andrea Lanzi and Wenke Lee, Mcboost: Boosting scalability in malware collection and analysis using statistical classification of executables. In Computer Security Applications Conference, 2008. ACSAC 2008. Annual, pages 301–310. IEEE, (2008).

81. Mahardhika Pratama, Witold Pedrycz and Geoffrey I. Webb, An incremental construction of deep neuro fuzzy system for continual learning of non-stationary data streams. arXiv preprint arXiv:1808.08517, (2018).

82. Cuiping, Jie Ren and Bin Xue, Study of predictive control model of fuzzy neural network, Journal of Computational Methods in Sciences and Engineering Preprint(Preprint) (2018), 1–9.

83. J. Ross Quinlan, C4.5: Programs for machine learning. Elsevier, (2014).

84. Raul Rosa, Fernando Gomide and Rosangela Ballini, Evolving hybrid neural fuzzy network for system modeling and time series forecasting. In Machine Learning and Applications (ICMLA), 2013 12th International Conference on, volume 2, pages 378–383. IEEE, (2013).

85. Raul Rosa, Leandro Maciel, Fernando Gomide and Rosangela Ballini, Evolving hybrid neural fuzzy network for realized volatility forecasting with jumps. In Computational Intelligence for Financial Engineering & Economics (CIFEr), 2014 IEEE Conference on, pages 481–488. IEEE, (2014).

86. David E. Rumelhart and James L. McClelland, Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations. Computational Models of Cognition and Perception, (1986).

87. Rafath Samrin and Vasumathi, Review on anomaly based network intrusion detection system. In Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT), 2017 International Conference on, pages 141–147. IEEE, (2017).

88. Tomonobu Senjyu, Satoru Yokoda and Katsumi Uezato, Speed control of ultrasonic motors using fuzzy neural network, Journal of Intelligent & Fuzzy Systems 8(2) (2000), 135–146.

89. Ying Shi and Shaoyong Jian, Permeability estimation of rock reservoir based on pca and elman neural networks. In IOP Conference Series: Earth and Environmental Science, volume 128, page 012001. IOP Publishing, (2018).

90. Takashi Shimada, A universal transition in the robustness of evolving open systems, Scientific Reports 4 (2014), 4082.

91. Alisson Marques Silva, Walmir Caminhas, Andre Lemos and Fernando Gomide, A fast learning algorithm for evolving neo-fuzzy neuron, Applied Soft Computing 14 (2014), 194–209.

92. Vinícius Jonathan Silva Araújo, Augusto Junio Guimarães, Paulo Vitor de Campos Souza, Thiago Silva Rezende and Vanessa Souza Araújo, Using resistin, glucose, age and bmi and pruning fuzzy neural network for the construction of expert systems in the prediction of breast cancer, Machine Learning and Knowledge Extraction 1(1) (2019), 466–482.

93. Alessandra M. Soares, Bruno J.T. Fernandes and Carmelo J.A. Bastos-Filho, Pyramidal neural networks with evolved variable receptive fields, Neural Computing and Applications 29(12) (2018), 1443–1453.

94. Paulo Vitor C. Souza, Regularized fuzzy neural networks for pattern classification problems, International Journal of Applied Engineering Research 13(5) (2018), 2985–2991.

95. Paulo Vitor de Campos Souza, Augusto Junio Guimaraes, Vanessa Souza Araujo, Thiago Silva Rezende and Vinicius Jonathan Silva Araujo, Regularized fuzzy neural networks to aid effort forecasting in the construction and software development, International Journal of Artificial Intelligence & Applications 9(6) (2018), 13–26.

96. Paulo Vitor de Campos Souza, Augusto Junio Guimaraes, Vanessa Souza Araujo, Thiago Silva Rezende and Vinicius Jonathan Silva Araujo, Regularized fuzzy neural networks to aid effort forecasting in the construction and software development, International Journal of Artificial Intelligence & Applications 9(6) (2018), 13–26.

97. Salvatore J. Stolfo, Wei Fan, Wenke Lee, Andreas Prodromidis and Philip K. Chan, Cost-based modeling for fraud and intrusion detection: Results from the jam project. Technical report, COLUMBIA UNIV NEW YORK DEPT OF COMPUTER SCIENCE, (2000).

98. Robert W. Taylor, Eric J. Fritsch and John Liederbach, Digital crime and digital terrorism. Prentice Hall Press, (2014).

99. Daxin Tian, Yanheng Liu and Jian Wang, Fuzzy neural network structure identification based on soft competitive learning, International Journal of Hybrid Intelligent Systems 4(4) (2007), 231–242.

100. Paulo Vitor de Campos Souza, Pruning fuzzy neural networks based on unineuron for problems of classification of patterns, Journal of Intelligent & Fuzzy Systems 35(2) (2018), 1–9.

101. Wai, Yao and Lee, Backstepping fuzzy-neural-network control design for hybrid maglev transportation system, IEEE Transactions on Neural Networks and Learning Systems 26(2) (2015), 302–317. ISSN 2162-237X. doi:10.1109/TNNLS.2014.2314718.

102. David P. Wipf and Srikantan S. Nagarajan, A new view of automatic relevance determination. In J.C. Platt, Koller, Singer and S.T. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 1625–1632. Curran Associates, Inc., (2008).

103. Ronald R. Yager and Alexander Rybalov, Uninorm aggregation operators, Fuzzy Sets and Systems 80(1) (1996), 111–120.

104. Thi Yen, Wang Yao Nan and Pham Van Cuong, Recurrent fuzzy wavelet neural networks based on robust adaptive sliding mode control for industrial robot manipulators, Neural Computing and Applications (2018), 1–14.

105. Bariah Yusob, Zuriani Mustaffa and Junaida Sulaiman, Anomaly detection in time series data using spiking neural network, Advanced Science Letters 24(10) (2018), 7572–7576.

106. Zuopeng Zhang, Organizational culture and knowledge sharing: Design of incentives and business processes, Business Process Management Journal 24(2) (2018), 384–399.

107. Bonnie Zhu, Anthony Joseph and Shankar Sastry, A taxonomy of cyber attacks on scada systems. In 2011 IEEE International Conferences on Internet of Things, and Cyber, Physical and Social Computing, pages 380–388. IEEE, (2011).