A neural network-based ionospheric model for the auroral zone

A new empirical model for the lower ionosphere in the auroral zone, called IMAZ, has been developed, tested and refined for use in the International Reference Ionosphere (IRI) global model. Available ionospheric data have been used to train neural networks (NNs) to predict the high latitude electron density profile. Data from the European Incoherent Scatter Radar (EISCAT), based near Tromsø, Norway (69.581N, 19.231E), combined with rocket-borne measurements (from 611 to 691 geomagnetic latitude) make up the database of reliable Dand E-region data. NNs were trained with different combinations of the following input parameters: day number, time of day, total absorption, local magnetic K index, planetary Ap index, 10.7 cm solar radio flux, solar zenith angle and pressure surface. The output that the NNs were trained to predict was the electron density for a given set of input parameters. The criteria for determining the optimum NN are (a) the root mean square (RMS) error between the measured and predicted output values, and (b) the ability to reproduce the absorption they are meant to represent. An optimum input space was determined and then adapted to suit the requirements of the IRI community. In addition, the true quiet electron densities were simulated and added to the database, thus allowing the final model to be valid for riometer absorptions down to 0 dB. This paper discusses the development of a NN-based model for the high-latitude, lower ionosphere, and presents results from the version developed specifically for the IRI user community. r 2007 Elsevier Ltd. All rights reserved.


Introduction
This paper presents an outline of the development of a new model for the auroral zone D-and E-region ionosphere. Attempts to model the ionospheric electron density in the auroral zone have been made previously by various groups (Jespersen et al., 1968;Miyazaki et al., 1981;Kirkwood and Collis, 1991;Friedrich and Torkar, 1995). More recently, further attempts have been made by Harrich (2001) and Steiner (2003) to provide a model for the electron density based on available data using analytical assimilation and binning techniques, respectively. In all of these abovementioned attempts, an analytical relationship was established, with varying levels of success, between the riometer absorption and the electron density.

ARTICLE IN PRESS
www.elsevier.com/locate/jastp 1364-6826/$ -see front matter r 2007 Elsevier Ltd. All rights reserved. doi:10.1016All rights reserved. doi:10. /j.jastp.2007 A major difference between the new auroral zone model presented in this paper and these previous attempts is the technique that is employed here. The authors have made a pioneering attempt to model the auroral zone using neural networks (NNs).
Recently, NNs have been successfully employed for the prediction of bottomside ionospheric parameters (Altinay et al., 1997;Williscroft and Poole, 1996;Wintoft and Cander, 1999;McKinnell and Poole, 2000) and for the creation of a bottomside electron density profile model (McKinnell, 2002). Most of the work in this area has been undertaken using mid-latitude data from ground-based ionosondes.
For the development of this new model, we initially split the available data into day and night time datasets, with the split at a solar zenith angle (ZA) of 981. We chose this ZA as it was previously found by both theoretical and empirical means to be optimum for quantifying the day-night boundary at auroral latitudes (Stauning, 1996). Therefore, as a first attempt two separate models were produced, one for predicting electron densities at ZAs less than 981 (daytime) and the other for predictions at ZAs greater than 981 (nighttime). The details and results concerning these two models appear in McKinnell et al. (2004) and McKinnell and Friedrich (2003), respectively. This paper covers the details and results from the development of the optimum final model covering all situations, and using inputs relevant to the International Reference Ionosphere (IRI). The IRI is an international project jointly sponsored by the Committee on Space Research and the International Union of Radio Science, who formed a working group in the late 60s to develop a global empirical ionospheric model. Improved versions of the model are released on a regular basis. For given location, time and date, IRI provides monthly averages in the non-auroral ionosphere under magnetically quiet conditions for electron density, electron temperature, ion temperature and ion composition in the altitude range from 60 km to 2000 km (Bilitza, 1997). Current versions of the IRI model offer options to take into account magnetic storm conditions and for the prediction of the lower ionosphere electron densities. A version of our new NN-based model was developed specifically for incorporation into the global IRI model. The differences between the final optimum version and the IRI version will be discussed and results from the IRI version will be presented. The precursor models will be referred to only when it is necessary to make comparisons with the final model. The authors have coined the name IMAZ (Ionospheric Model for the Auroral Zone), and throughout this paper the model will be referred to as such.

Neural networks
In this paper, NNs are used as a tool for producing the model parameters required for predicting the auroral ionospheric electron density at different altitude levels. NNs were also used to determine the optimum input parameters upon which the output, electron density, is dependent.
Briefly, an NN is a computer program that is trained by presenting to its input any number of multidimensional input vectors that correspond to a known measured output parameter. The NN learns to identify the relationship between the input vectors and the known output.
The most essential requirement for training an NN is a large database of data describing the history of the relationship between the input and output parameters. This database usually takes the form of a number of input vectors, each with a corresponding output. Available data for our model are described in the next section. After determining the input space, the network architecture is designed consisting of at least three layers, an input, a hidden and an output layer. Each layer contains a number of nodes; the input and output nodes correspond to the parameters within the known database, while the number of hidden nodes are determined iteratively based on the performance of the training procedure. There are connections between the nodes, which represent the feeding of the output from one node to the other, multiplied by a weight. A simplified version of a three-layer NN architecture is shown in Fig. 1.
Before presenting the data to the network, the dataset is randomly split into training and testing subsets. This procedure is performed in order to prevent the training results from being biased towards a particular section of the database. For the NNs trained for our model, we split the data by using 70% for training and 30% for testing.
Training the NN is an iterative process that starts with randomly chosen weights in the NN model. The input vectors are ordered randomly, and then each is presented in turn to the network. In each case, a predicted output is produced, which is compared with the given measured output. An algorithm is then applied to update the weights in such a way as to minimise the difference. This process is repeated until the root mean square (RMS) error between the given and predicted output on the testing set is stabilised. The algorithm that we make use of for ionospheric models is a feedforward backpropagation algorithm. For more details on the working of NNs, the reader is referred to Haykin (1994).

Data
For this modelling effort, the available data came from two sources. A combination of UHF and VHF data from the European Incoherent Scatter Radar (EISCAT) provided high-latitude electron density profile data covering the period 1985-2004. We combined these EISCAT data with 94 electron density profiles obtained from rocket-borne wave propagation experiments. This gave us a total of 495 388 electron density profile points covering an altitude range of 50-150 km. Since most of our data came from the EISCAT radar, we also restricted the latitude to 701 geomagnetic and only used rocket measurements that fell within a 51 range of this. All profiles included in our database were manually edited to ensure quality.
The data are also restricted according to total absorption. EISCAT data with absorption levels that fall below pre-determined minimum absorption levels are disregarded. This restriction is applied in order to avoid results biased to values above the EISCAT thresholds at low absorptions (Friedrich et al., 2004a), and is consistent with the processing of data in the development of the analytical MEDAL model (Steiner, 2003). By definition, a riometer measures excess absorption beyond a normal quiet value. The measured value is converted to a vertical incidence signal at 27.6 MHz (Friedrich et al., 2002). We use the notation Lv to represent the vertical riometer absorption parameter. Then, the quiet day value (rest absorption) needs to be added to the measured value (Lv) to obtain the total absorption (Li).
Examining the data revealed that most of the available data lie above about 80 km and 0.1 dB. Data from below 70 km are almost exclusively from the rocket measurements. Although we have only a few of these rocket measurements, they allow us access to data at much lower altitudes than we would have from EISCAT alone. The importance of the few rocket data is more pronounced for quiet situations.

True quiet
The ionosphere at high latitudes usually behaves unrelated to solar control, save a distinct day-night difference typically occurring at a solar ZA of 981 (Stauning, 1996). Under very quiet conditions, however (i.e. in the absence of particle precipitation), the ionosphere will behave as outside the polar regions, namely following a Chapman variation of the form N e ¼ N 0 (cos w) n , where N 0 is the electron density at 01 solar ZA w. For a constant recombination rate and negligible negative ions, the exponent n is ideally 0.5. We call this solar controlled electron density as the true quiet (TQ). It can be obtained from the low-density envelope of the electron density data as a function of solar ZA and F10.7 cm solar flux.
It is possible to obtain the TQ electron density profile for every available geophysical condition in the dataset. The TQ electron density profile would be the equivalent of a profile for which the riometer absorption is 0 dB. More details on the determination of the true quiet profiles can be found in Harrich (2001) and Friedrich et al. (2004b). For the training of the IMAZ model, the true quiet profile determined for each geophysical condition in the dataset was added to the dataset. This expanded the development of the IMAZ model to include the prediction of true quiet profiles (i.e. profiles at 0.0 dB), and eliminated the need for a separate model for this purpose.

Determining the input space and training the NN
In addition to establishing the available dataset that can be used for training our NN-based model, it is also important to determine a suitable input space. The input space can consist of all those parameters that are known or suspected to produce an effect in the output. In Williscroft and Poole (1996), NNs were used for the determination of the input space, and a procedure similar to the one described in that paper was carried out here.
The input parameters considered, which were chosen for their known effect on the auroral electron density, were day number, local magnetic time (hl), total absorption (li), local magnetic activity (kt), planetary magnetic activity (ap), solar ZA (za), the F10.7 cm flux value (F) and altitude. The pressure surface, which is a representation of the seasonal variation (day number) and the altitude, was also considered as an input parameter. When the pressure surface is included in the input space, the need for separate inputs representing day number and altitude is eliminated. As an alternative to the solar ZA value, the inverse Chapman function (ic) was also attempted as an input. In order to ensure continuity in the time input, the cyclic components of the hl input were calculated and used as inputs to the NN. Therefore, the hl parameter was represented by two inputs, hls and hlc, calculated as follows: hlc ¼ cosð2phl=24Þ.
The pressure surface rather than altitude was chosen as an input since it is assumed that the ionosphere will behave comparably for the same background pressure. The relationship between altitude and pressure is based on a combination of the CIRA-86 atmosphere and the empirical atmospheric model for Andøya, which was updated by Lu¨bken in 1999 (Friedrich et al., 2004b). Using the pressure surface as an input allows us to have one input representing the seasonal variation as well as the altitude level.
The output was the logarithm of the electron density at a given pressure surface. The logarithm of the electron density was used in order to limit the range of the output parameter to one order of magnitude. The resulting predicted profiles are still plotted against altitude. For the same reason, the logarithm of the pressure surface was presented to the NN as an input.
To determine the optimum input space, seven NNs were trained with various combinations of the above-mentioned input parameters. The optimum NN is determined according to two main criteria: (i) the RMS error between the measured and predicted output parameter over the entire dataset, NeRMS, and (ii) the ability of the predicted profile to reproduce the input absorption, called the simulated absorption. The second criterion is measured by producing a scatter plot of the simulated total absorption versus the input total absorption, and by calculating the RMS error between the measured and simulated absorption values over the entire dataset, LiRMS. The final RMS error that is used to obtain the optimum input space is obtained by adding the NeRMS and LiRMS values. Fig. 2 shows a graph of the two sets of RMS error values plus the combination RMS for each of the NNs trained. The RMS errors reported in this figure are calculated over the entire dataset. The labels on the graph in Fig. 2 refer to the combination of input parameters.
In addition to careful consideration of the RMS errors as per our criteria, as well as a subjective viewing of the simulated absorption scatter plots, the ability of each NN to predict realistic electron density profiles is also taken into account. From this process, and the results shown in Fig. 2, the optimum input space was chosen to be that which included the inputs, hl, li, kt, ic, F and P (the pressure surface). The combined RMS error for this optimum NN, which shall be referred to as IMAZ_v2, was 0.354 (or, equivalently, a factor of 2.26).
However, in order for IMAZ to be incorporated into the global IRI model, it was necessary to adapt some of the input parameters to suit those that are already available to the IRI. Therefore, within the optimum input space, the ic parameter was replaced with the solar ZA, and the kt parameter was replaced with the planetary magnetic Ap index, ap. This NN shall be referred to as IMAZ_IRI, and the combined RMS error after training was 0.403. Since ic has a simple relation with the solar ZA, and the magnetic activity index is still included, just represented by another parameter, the optimum input space is still relevant with this change of parameter. For the remainder of this paper, all discussions and results shall refer to the IMAZ_IRI model, unless otherwise specified.
The NN architecture for the IMAZ_IRI model consisted of the seven determined to be optimum inputs (hls, hlc, li, ap, ZA, F and log(P)) and 1 output (log of electron density, ne). A block diagram of the inputs and output to the IMAZ_IRI NN is shown in Fig. 3. The measured database consisting of EISCAT plus rocket measurements was supplemented by the TQ database. For every geophysical condition that appeared in the measured database, a TQ profile was determined and added to the database.
After training, the RMS error between the measured and predicted output (NeRMS) over the entire measured database was calculated to be 0.266 for IMAZ_v2 and 0.276 for IMAZ_IRI. For the precursor individual daytime and nighttime models, the NeRMS values were 0.177 and 0.289, respectively. It is more instructive to look at the RMS error (NeRMS) at each pressure level (i.e. altitude) and this is shown in Fig. 4. In this figure, the RMS error is plotted against pressure level for both the IMAZ_v2 and IMAZ_IRI models. This shows that the difference between these two models is very little, which is as expected, since the input parameters were not changed but only the indices with which they are measured. Fig. 4 illustrates that as we decrease in altitude the predictive ability of our NN also decreases somewhat exponentially with the rate of increase in RMS error getting larger as the altitude drops. This is to be expected since the amount of data available for training at the lower altitudes is very small, as also shown in Fig. 4. The penalty in using the IMAZ_IRI as opposed to IMAZ_v2 is very small in terms of RMS error, but the gain is large in making IMAZ suitable for the ionospheric community to use. In choosing an input space, a balance will always have to be reached between the RMS error, the realistic predictions acquired and the availability of the input parameters to the average user. It is possible to match the criteria set perfectly, but to end up with a model that requires unusual inputs and does not produce realistic predictions.

Estimating the uncertainty
NNs have also been used to estimate the uncertainty on a prediction. This uncertainty can be determined by finding a statistical measure of the differences between the predicted and measured values. For each input vector (hls, hlc, li, ap, ZA, F and log(P)) of the original dataset used to train the NN that became our IMAZ_IRI model, the difference between the predicted and measured output values (log of the electron density) is evaluated and squared. A second NN (referred to as IMAZ_ERR) is then trained with the same input data as IMAZ_IRI but with the squared differences as the output. Since it is the nature of NNs to find the mean predicted value, the square root of the output of IMAZ_ERR is an RMS difference. This difference represents a measure of the variation that can be expected between any predicted and measured value. More details of this technique and its application to the ionospheric parameter, foF2, can be found in Poole and McKinnell (2000).
The output from the IMAZ_ERR model provides an estimated uncertainty on the log of the electron density, which is dependent on all parameters within the input space, including altitude (in the form of the pressure surface input). This provides the opportunity to include error bars on the predicted electron density values. These error bars represent the maximum possible statistical variation of the average predicted electron density for a given input set. A cut-off value for the estimated uncertainty can then be established to indicate the limitations on the prediction. For example, predictions that have uncertainties that are more than this cut-off value are flagged as unreliable.

Results and discussion
There are various procedures that are used for testing an NN-based ionospheric model. The most common is to vary one input parameter while keeping the rest constant and then examine variations in the output. Another is to predict the output directly for different input sets. We will show results from both of these methods here. Fig. 5 shows the results of interrogating our model for a summer and winter solstice day, at varying altitude and absorption levels, with all other inputs at the median values. These graphs illustrate that at lower altitudes (70 km, top panel) and high absorption levels, our predictions are more unstable. This is to be expected as the amount of data available for training at the lower altitudes was very small. The NN has learnt the relationship between electron density and altitude, which is demonstrated by the expected increase in predicted electron density at higher altitudes. In addition, at the higher absorption levels (greater disturbances), the diurnal variation disappears and the geomagnetic time becomes more important since the increasing particle spectrum has changed shape.
To illustrate the ability of the IMAZ_ERR network to predict the uncertainty on our prediction, we show in Fig. 6  at altitude levels of 70 and 90 km, for three levels of absorption (true quiet, median and high), with the error bars representing the uncertainty prediction. It can be seen, in all cases, that the error bars at 70 km are larger than those at 90 km, which is to be expected since our training database, and therefore our input space, experiences a paucity of data at this altitude.
In Fig. 7 predicted daytime and nighttime profiles are shown for a day in March, which was chosen for having a good diurnal distribution in the database. The profiles were predicted for different absorption levels, with all other inputs set to their median values. From the daytime profiles (Fig. 7a); it is clear that the variation at lower altitudes is dependent on absorption but at higher altitudes the magnetic activity takes precedence over the absorption (indicated by the very small difference between profiles at higher altitudes), which is as expected. In this case, it is to be noted that the magnetic activity input was kept at a constant median value while predicting all five profiles at the different absorption levels. A similar comparison is made in Fig. 8   included to illustrate the differences between morning and afternoon profiles. The diurnal variation of the electron density for median absorption, magnetic and solar activity at constant altitude levels is shown in Fig. 9. These graphs represent the summer and winter variations in the predictions. In this figure we have indicated the electron density values we consider unreliable, i.e. the error predicted by the IMAZ_ERR model that exceeded our cut-off limit, by dotted lines.
One of the criteria for determining our optimum model was to test its ability to reproduce the total absorption the predicted profile is meant to represent. Fig. 10 illustrates this with a scatter plot depicting the simulated total absorption (Lisim) versus the actual total absorption (Li). Each point on this graph represents a full profile predicted for a set of inputs from the measured database and compared with the actual total absorption that forms part of the inputs. The final IMAZ_IRI model, whose results are shown in this section, was also compared with measured data and IRI predictions. Fig. 11 shows the comparisons between rocket measurements obtained during the MaCWAVE campaigns (Croskey et al., 2004), the 2D-region electron density options within IRI-2001 and IMAZ_IRI. These profiles show that the IMAZ model is performing substantially better than either option within IRI. This is not too surprising since one of the major strengths of IMAZ is the fact that it makes use of the absorption, a major factor describing the Dregion electron density, as an input. IRI-2001 does not make use of absorption as an input, and the   IRI-FPT option for predicting D-region electron densities within IRI-2001 is only valid to 601N, whereas we are predicting at auroral latitudes with IMAZ.
In order to illustrate the ability of IMAZ to predict electron densities during solar proton events, as well as PCA events and the more common electron precipitation events, we have chosen three rocket flight measurements to compare with IMAZ predicted profiles for the same conditions. These comparisons are shown in Fig. 12, with the top panel representing the measured electron densities from the 18.1020 rocket flight, which took place during a usual electron precipitation event (Smith et al., 1983). The middle panel illustrates the comparisons between the measured electron densities from the F34 rocket flight, a PCA and a relativistic electron precipitation event, and the IMAZ predicted profile for those conditions. A solar proton event is illustrated by comparing the measured electron densities from the B5 rocket flight (Swider and Dean, 1975) with IMAZ predictions for the same conditions. In all three cases, it is interesting to note that at low altitudes and high absorption IMAZ performs relatively well. This can be attributed to the fact that the few rocket measurements available were included in the training set for the NN, and so this emphasises the importance of having the rocket measurements and including them. Without these vital measurements, we would lose valuable information on the ionospheric behaviour at these altitudes. In addition, what these comparisons make clear is that NNs are able to provide adequate predictions, provided some history of what has come before can be supplied.

Conclusion
This paper has introduced the new auroral zone lower ionosphere model, IMAZ, in a form that allows for easy incorporation into the IRI-2006 model (Bilitza, 2006). An optimum input space was designed using NNs, and then adapted to suit the requirements (mostly easily accessible input parameters) of the IRI community. Results show that this new technique for predicting D-and E-region electron densities in the auroral zone has been successful. In addition, IMAZ provides the IRI with the ability to predict auroral zone electron densities at low altitudes, using the absorption as an input. IMAZ can also provide an uncertainty on the predicted electron density, which is related to the section of the input space where the NN is being interrogated. In addition, the power of the NN technique lies in its ability to learn from historic evidence how the ionosphere behaves under all conditions. Therefore, IMAZ has the ability to deal with disturbed ionospheric conditions within the constraints of its input space (i.e. examples of a disturbed ionosphere appeared in the input space and, therefore, we can expect the NN to reproduce those conditions).
Although we have designed IMAZ to suit the requirements of the IRI community with easily accessible inputs, we do understand that not all users will have the benefit of access to the riometer absorption required as an input. Therefore, a version of IMAZ that does not include the riometer absorption as an input has also been provided for incorporation into the IRI.
Typically this model has been aimed at existing IRI users who are interested in the ionospheric behaviour at lower altitudes within the auroral zone. In addition, radio communication experts who require knowledge of the effect D-region absorption will have on HF communications will be interested in this model. We believe that IMAZ provides greater scope for electron density prediction for the IRI community and that the end user will benefit from this new option.