Data Mining & Big Data Group

The Data Mining & Big Data Group develops new methods and systems for big data modelling, prediction, data mining, and rule extraction and understanding, including for real-time data.

Current projects

The lab is working on several projects, including:

Predictive modelling of seismic data

In this research, we have used multiple time-series readings of seismic activity recorded before earthquakes, applying a spiking neural network (SNN) architecture called NeuCube. Although NeuCube was designed to map and model brain signals, we have found that SNN can be successfully used for early and accurate prediction of hazardous events.

Brain-Like Artificial Intelligence (BLAI), pioneered by Professor Nikola Kasabov, is applied here to seismic waveform data.

This project develops novel methods and systems for real-time seismic data modelling and earthquake risk prediction.

If a potential disaster can be averted, lives can be saved and losses avoided. Risk mitigation strategies from health to civil defence often depend on simple models using a few variables based on limited data. Recent advances in machine learning offer the intriguing possibility that disastrous events as diverse as strokes, earthquakes, financial market crises and degenerative brain diseases could be predicted early if the patterns hidden in the complex interactions between spatial and temporal components could be understood. Although such interactions are manifested at different spatial or temporal scales in different application areas, the same information-processing principles can be applied.

Researchers have recently demonstrated that third-generation artificial neural networks, called spiking neural networks (SNN), can be used to learn patterns in spatio- and spectro-temporal data (SSTD). In SNN, information is represented and processed as temporal sequences of spikes, similar to the way the brain processes information. SNN learn from data related to certain events by forming and updating connections between neurons, creating neuronal chains and networks. Moreover, a chain may be incrementally activated when only a small amount of new data is presented. Hence, SNN are capable of fast parallel information processing, compact representation of space and time, learning, and pattern recognition.

In this research, we have used multiple time-series readings of seismic activity recorded before earthquakes, applying an SNN architecture called NeuCube. Although NeuCube was designed to map and model brain signals, we have found that SNN can be successfully used for early and accurate prediction of hazardous events. The models still need to be verified using large-scale global earthquake data from seismic monitoring sites around the world, and some fine-tuning will be needed to find the best prediction horizon and observation period. Nonetheless, this is a promising line of research for hazardous event prediction, with great potential to improve our understanding of geophysical phenomena and to save lives.
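To make the idea of representing a signal as a temporal sequence of spikes more concrete, the following is a minimal sketch of one simple encoding scheme: a spike is emitted whenever the change between two consecutive samples exceeds a threshold. The function name `encode_spikes`, the threshold value and the sample waveform are assumptions made for illustration; this is not necessarily the encoder used in the NeuCube-based system.

```python
from typing import List


def encode_spikes(signal: List[float], threshold: float) -> List[int]:
    """Encode a waveform as a spike train by thresholding sample-to-sample changes.

    Returns one value per consecutive pair of samples:
    +1 for a rise larger than `threshold`,
    -1 for a fall larger than `threshold`,
     0 otherwise.
    """
    spikes = []
    for previous, current in zip(signal, signal[1:]):
        change = current - previous
        if change > threshold:
            spikes.append(1)
        elif change < -threshold:
            spikes.append(-1)
        else:
            spikes.append(0)
    return spikes


# Hypothetical readings from one station (illustrative values, not real data).
waveform = [0.0, 0.1, 0.9, 1.0, 0.2, 0.2]
spike_train = encode_spikes(waveform, threshold=0.5)
print(spike_train)  # [0, 1, 0, -1, 0]
```

With an encoding of this kind, only the relevant changes in the waveform produce spikes, so long quiet periods are represented compactly.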

_{The spiking neural network model of New Zealand seismic stations. Input nodes (dark green spheres) represent the 52 sites of the New Zealand National Seismograph Network (https://www.geonet.org.nz/). Relevant changes (light green dots) of seismic waveforms encoded into spike trains are propagated through the SNN. Analysis of the cause-effect relationships (blue and red lines) might help researchers to understand how seismic activity at different sites affects other sites.}

Related papers and benchmarking

The proposed methods and systems, when compared with traditional statistical and machine learning methods, showed superior results in the following aspects:

Preliminary results of earthquake prediction on a small dataset of seismic activity in the Canterbury region, New Zealand, using NeuCube give us confidence that seismicity data might be a viable precursor for short-term earthquake prediction. In comparison with traditional techniques such as SVM, MLP and NB, NeuCube achieves higher accuracy (see table below).

Measure	SVM	MLP	ECF	NeuCube
Accuracy (%)	54	58	67	92
F-score	0.58	0.58	0.58	0.92
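For readers unfamiliar with the two measures in the table, the sketch below shows how accuracy and the F-score are commonly computed for a two-class problem. The labels are made-up toy values (1 for a strong earthquake, 0 for a moderate one) and do not reproduce the experiment; the function name `accuracy_and_f_score` is hypothetical, and the source does not state whether its F-score is per class or averaged.

```python
from typing import List, Tuple


def accuracy_and_f_score(y_true: List[int], y_pred: List[int],
                         positive: int = 1) -> Tuple[float, float]:
    """Return (accuracy, F-score) for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # The F-score is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (1 = strong, 0 = moderate).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
acc, f_score = accuracy_and_f_score(y_true, y_pred)
print(round(acc, 2), round(f_score, 2))  # 0.67 0.67
```

Accuracy is reported in the table as a percentage, while the F-score is on a scale from 0 to 1.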

Though the experiment is in a very preliminary stage, this research has shown a promising way to predict the occurrence of strong earthquakes by training the SNN to differentiate between strong and moderate earthquakes based on spatio-temporal seismicity precursors.

Modelling the spatial location of the sites, together with 3D and Virtual Reality visualisation, allows researchers to extract spatio-temporal knowledge or rules describing how seismic activity at different sites affects other sites.

R&D system

For this project, an R&D system has been developed based on NeuCube. The system can be obtained for R&D subject to licensing agreement.

Developer

Dr Israel Espinosa Ramos

Air pollution data modelling

A NeuCube-based system is used to model the relationship between ozone concentrations and simultaneous increments of temperature, periods of elevated nitrogen monoxide concentrations and ozone episodes. We can also determine the reasons why nocturnal spikes are most likely to be found in certain areas.

Brain-Like Artificial Intelligence (BLAI), pioneered by Professor Nikola Kasabov, is applied here to an air quality case study.

This project develops novel methods and systems for modelling multisensory streaming data in real time, for pollution estimation and for predicting its effects.

This project uses spiking neural networks and the NeuCube architecture to model multisensory streaming data, with a case study on ozone concentration data. Spiking neural networks can describe the spatial and temporal relationships among the variables that describe the dynamics of a system. In particular, we use the NeuCube architecture to model and study the concentrations of air pollutants such as carbon monoxide (CO) and nitrogen dioxide (NO₂) at the same time, and their relationship with nocturnal spikes in ozone (O₃) concentrations.

Spatial and temporal patterns associated with high ozone concentrations are related to increased hospital admissions. NeuCube can work with high-resolution data, which results in a more effective model. A better understanding of the system makes it possible to evaluate the temporal and spatial occurrence of nocturnal spikes in ozone concentrations. This allows better predictions, which help establish the health impact and its outcomes. In general, better modelling of multisensory streaming data leads to better classification of events, better prediction and a better understanding of the processes measured by the sensors.

In our case study, NeuCube is used to model the relationship between ozone concentrations and simultaneous increments of temperature, periods of elevated nitrogen monoxide concentrations and ozone episodes. We can also determine the reasons why nocturnal spikes are most likely to be found in certain areas. At present, nocturnal spikes in ozone concentration are not associated with health concerns. However, NeuCube can incorporate diverse variables and model their relationships, in order to quantify not only the health costs of air pollution but also the economic effect of emission changes.

_{NeuCube 3D spiking neural network map of southwestern British Columbia showing the Lower Fraser Valley network of monitors with regional and government fixed monitors (dark green circles). Spatio-temporal relationships (lines) and activity (light green circles) of ozone (O₃) (left cube) and carbon monoxide (CO) (right cube) concentrations can be analysed simultaneously.}

Related papers and benchmarking

The proposed methods and systems, when compared with traditional statistical and machine learning methods, showed superior results in the following aspects:

NeuCube can help us to model the relationship between nocturnal spikes in ozone concentrations and simultaneous increments of temperature, periods of elevated nitrogen monoxide concentrations and ozone episodes;
The 3D and Virtual Reality visualisation makes it easier to study why nocturnal spikes in ozone concentration are most likely to be found in certain areas;
At present, nocturnal spikes in ozone concentration are not associated with health concerns. However, NeuCube can incorporate diverse variables and model their relationships, in order to quantify not only the health costs of air pollution but also the economic effect of emission changes.

R&D system

For this project, an R&D system has been developed based on NeuCube. The system can be obtained subject to licensing agreement.

Developer

Dr Israel Espinosa Ramos

Wind turbine energy prediction

This R&D system uses the NeuCube for modelling and understanding spatial and temporal data from multi-sensory networks. NeuCube can incorporate spatial information - such as the geographical coordinates of wind turbines - and temporal data from diverse variables - such as wind velocity and wind direction - for wind energy forecasting.

In the past three decades, research and development in green energy has exploded, yielding hundreds of promising new technologies that can reduce our dependence on coal, oil, and natural gas. In this context, wind energy is a growing industry with high potential and relatively low production costs, supplying electricity to national grids worldwide. In 2015, wind energy supplied about 3.7% of global electricity. However, unlike traditional electricity generation, wind energy cannot be generated on demand because it depends strongly on atmospheric phenomena.

Wind power output is variable: it is very consistent from year to year but varies significantly over shorter time scales. In practice, the variations in thousands of wind turbines, spread out over several different sites and wind regimes, are smoothed out. As the distance between sites increases, the correlation between wind speeds measured at those sites decreases. Thus, while the output from a single turbine can vary greatly and rapidly as local wind speeds vary, as more turbines are connected over larger and larger areas the average power output becomes less variable and more predictable.
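The smoothing effect described above can be illustrated with a standard statistical result. Under the simplifying assumptions that every turbine's output has the same standard deviation and that every pair of turbines has the same correlation, the variance of the average output of N turbines is σ²(1 + (N − 1)ρ)/N. The sketch below evaluates this formula; the function name and the example values are hypothetical and are not taken from the project.

```python
import math


def std_of_average_output(sigma: float, n_turbines: int, correlation: float) -> float:
    """Standard deviation of the mean output of `n_turbines` turbines.

    Assumes each turbine has standard deviation `sigma` and each pair of
    turbines has the same correlation coefficient `correlation`.
    """
    if n_turbines < 1:
        raise ValueError("n_turbines must be at least 1")
    factor = (1 + (n_turbines - 1) * correlation) / n_turbines
    if factor < 0:
        raise ValueError("correlation is too negative for this number of turbines")
    return sigma * math.sqrt(factor)


# Illustrative values only: per-turbine variability normalised to 1.0.
for n, rho in [(1, 0.0), (100, 0.9), (100, 0.5), (100, 0.0)]:
    print(n, rho, round(std_of_average_output(1.0, n, rho), 3))
```

The lower the correlation between sites, which the text links to greater distance between them, the closer the variability of the average comes to the 1/√N limit of fully independent turbines.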

Management of wind energy relies on forecasting methods, but the predictability of any particular wind farm is low for short-term operation. For example, for any particular generator there is an 80% chance that wind output will change less than 10% in an hour and a 40% chance that it will change 10% or more in 5 hours. Therefore, efficient management of wind energy requires new forecasting methods.

In this research, we propose the use of NeuCube as a novel methodology for modelling and understanding spatial and temporal data from multi-sensory networks. For wind energy forecasting, NeuCube can incorporate spatial information such as the geographical coordinates of wind turbines and temporal data from diverse variables such as wind velocity and wind direction. These features of NeuCube make it possible not only to achieve better prediction accuracy but also to analyse the variation and correlation between the variables involved in wind energy production.

_{NeuCube 3D spiking neural network map (left) of a 13-wind turbine farm (dark green circles). Spatio-temporal relationships (lines) and activity (light green circles) of wind speed can be analysed for wind power prediction (right).}

Related papers and benchmarking

The proposed methods and systems, when compared with traditional statistical and optimisation methods, showed superior results in the following aspects:

Though the research is at an early stage, we have exploited the online multisensory data-modelling feature of NeuCube to map the geographical positions of wind turbines and to model the wind speed and the wind direction, both recorded simultaneously from each turbine. This feature allows researchers to discover new information and knowledge on how these two variables interact when forecasting wind power production over different time windows (e.g. 30 min, 1 h, 6 h or 24 h ahead).
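Forecasting over different time windows is often set up by pairing a window of past observations with the value observed a fixed number of steps later. The following sketch shows that generic data preparation step; it is not the NeuCube procedure itself. The function name `make_windows`, the window length and the assumption of one reading every 30 minutes are all hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], window: int,
                 horizon: int) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` past values with the value `horizon` steps ahead.

    `horizon` is counted in sampling steps after the last value of the window.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    samples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = list(series[start:start + window])
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Assuming one reading every 30 minutes, these horizons (in steps) correspond
# to 30 min, 1 h, 6 h and 24 h ahead.
HORIZONS_IN_STEPS = {"30 min": 1, "1 h": 2, "6 h": 12, "24 h": 48}

# Toy wind-speed series (illustrative values, not measured data).
wind_speed = [float(v) for v in range(10)]
pairs = make_windows(wind_speed, window=3, horizon=HORIZONS_IN_STEPS["1 h"])
print(len(pairs))  # 6
print(pairs[0])    # ([0.0, 1.0, 2.0], 4.0)
```

The same function can be applied separately to wind speed and wind direction, or to each turbine, to produce inputs for whichever forecasting model is used.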

R&D system

For this project, an R&D system is being developed. The system can be obtained subject to licensing agreement.

Developer

Dr Israel Espinosa Ramos

Our research groups

The research and development work done by KEDRI's founding director, Professor Nikola Kasabov, and his team is organised into six areas of research.

