Cavity Optimisation With Incorporation of Machine Learning Techniques

How to use machine learning to auto-tune cavity filters.

Bozhong Liu
7 min read · May 19, 2021

Time: November 2019 to April 2020

Github Repository: https://github.com/bozliu/Cavity-Optimisation

Automatic fine tuning of cavity filters

Abstract

Cavity filters are a necessary component in the base stations used for telecommunication. Without these filters it would not be possible for base stations to send and receive signals at the same time. Today these cavity filters require fine tuning by hand before they can be deployed. This article describes the design and implementation of a supervised neural network that predicts the radius and height of a cylindrical cavity resonator from its resonant frequency, electric field and magnetic field. Several machine learning methods were evaluated, including decision trees, random forests, k-nearest neighbours, and neural network classification and regression. A relationship between the error and the number of weights in the neural network was observed. The article also presents some rules of thumb for future neural network designs.

1 Objective

Given the three attributes (resonant frequency, electric field and magnetic field), design a system that predicts the height and radius of a cavity filter.

Figure 1. Objective of the design system.

2 System Algorithm Design

2.1 General Machine Learning Algorithm Design

The data is split into a training dataset and a test dataset. On the training dataset, the algorithm first checks whether the data has targets. Targeted data is then handled by classification or regression, depending on whether the target is discrete or continuous. The test dataset is used to validate the trained model, so the accuracy can be evaluated as well.

Figure 2. General machine learning algorithm design.
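As a minimal sketch of that flow (assuming scikit-learn, which the appendix also uses; the variable and function names here are illustrative rather than taken from the repository):

# Sketch of the split / route / evaluate flow from Figure 2 (illustrative names only).
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, r2_score

def evaluate(model, X, y, discrete_target):
    """Train `model` on one split of (X, y) and score it on the held-out test set."""
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Discrete targets -> classification accuracy; continuous targets -> regression R^2.
    return accuracy_score(y_test, y_pred) if discrete_target else r2_score(y_test, y_pred)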

2.2 Neural Network Algorithm Design

Figure 3. Neural network algorithm design

2.3 Decision Tree Algorithm Design

A decision tree is a flowchart-like tree structure in which each internal node represents a feature, each branch represents a decision rule, and each leaf node represents an outcome. The topmost node is known as the root node. The tree learns to partition the data on the basis of attribute values, and it does so recursively, a procedure known as recursive partitioning.

The decision tree algorithm selects the best attribute to split the records using an Attribute Selection Measure (ASM). The selected attribute becomes a decision node, and the dataset is broken into smaller subsets. Tree building repeats this process recursively for each child until all the tuples share the same attribute value, or no attributes or instances remain.

Figure 4. Decision tree algorithm design
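As an illustrative scikit-learn sketch of this procedure (the data below is random stand-in data, not the actual cavity dataset; the criterion argument plays the role of the attribute selection measure):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 3))        # stand-ins for resonant frequency, E-field, H-field
y = rng.integers(0, 34, 200)    # stand-in radius class labels (34 categories)

# criterion selects the attribute selection measure (Gini impurity or entropy)
tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
tree.fit(X, y)                  # recursive partitioning on the best attribute at each node
print(tree.get_depth(), tree.tree_.node_count)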

2.4 Random Forest Algorithm Design

This algorithm works in four steps shown below.

(a) Selecting random samples from a given dataset.

(b) Constructing a decision tree for each sample and getting a prediction from each tree.

(c) Performing a vote for each predicted result.

(d) Selecting the prediction result with the most votes as the final prediction.

Figure 5. Random forest algorithm design
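These four steps correspond to a bagged ensemble of decision trees with majority voting; a minimal scikit-learn sketch with stand-in data:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 3))                    # stand-ins for the three input attributes
y = rng.integers(0, 34, 200)                # stand-in radius class labels

# bootstrap=True draws random samples (step a); one tree is fitted per sample (step b)
forest = RandomForestClassifier(n_estimators=100, bootstrap=True, random_state=0)
forest.fit(X, y)
prediction = forest.predict(X[:1])          # steps c-d: the class with the most votes wins
print(prediction)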

3 Experimental Result

3.1 Neural Network

For the radius prediction, the classification method is applied. In a neural network, the sizes of the input layer and output layer are usually fixed by the problem. Here the input layer has three nodes, representing the resonant frequency, electric field and magnetic field respectively. Since the radius values range from 3.5 to 20 with an interval of 0.5, the radius data falls into 34 categories, and therefore the output layer has 34 nodes. The number of hidden layers can be varied to see the results under different structures.

In the experiments that increase the number of hidden layers and try different numbers of nodes in each layer, five hidden layers with 2048, 1024, 512, 512 and 512 nodes achieved the highest accuracy in radius prediction, approximately 86.8%. When the number of hidden layers is increased further, the accuracy starts to drop.
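The article does not state which framework the network was implemented in, so purely as an assumption the same architecture can be sketched with scikit-learn's MLPClassifier (three input features, the five hidden layers quoted above, and 34 output classes inferred from the labels):

from sklearn.neural_network import MLPClassifier

# Radius classifier sketch: 3 inputs -> 2048-1024-512-512-512 hidden -> 34 classes
radius_clf = MLPClassifier(
    hidden_layer_sizes=(2048, 1024, 512, 512, 512),
    activation="relu",
    max_iter=500,
    random_state=0,
)
# radius_clf.fit(X_train, radius_class_train)             # X_train: (n_samples, 3)
# accuracy = radius_clf.score(X_test, radius_class_test)  # reported as ~86.8% in this work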

In addition, the same classification method could in principle be used to predict the height values. However, when applied to height prediction the accuracy is only about 3.8%. Because of this low accuracy, I switched to a regression method to train the height: the output layer is reduced to a single node, and the R2 score is used as the criterion to evaluate performance. A four-layer neural network achieved the highest R2 score, 94%.
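The regression variant differs only in having a single continuous output scored with R2 (again a scikit-learn sketch; the article only says the best network had four layers, so the hidden sizes below are placeholders):

from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

# Height regressor sketch: one continuous output, evaluated with the R^2 score
height_reg = MLPRegressor(
    hidden_layer_sizes=(512, 256, 128),   # placeholder sizes, not from the article
    activation="relu",
    max_iter=1000,
    random_state=0,
)
# height_reg.fit(X_train, height_train)
# print(r2_score(height_test, height_reg.predict(X_test)))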

3.2 Decision Tree

Classification is applied to train the data, predicting the cavity's radius and height separately. The prediction accuracy on the radius was 86.3%, while the accuracy on the height was only 16.2%. On the other hand, decision tree regression achieves a relatively high R2 score on height prediction, approximately 85.7%.

3.3 Random Forest

A similar classification method is applied here, and the result resembles the decision tree's: the prediction accuracy on the radius was 82.3%, and the accuracy on the height was 6.3%.

3.4 Linear Regression

Since regression performs better than classification for height prediction, linear regression was also applied. The achieved R2 score is 60.8%.

3.5 K-Nearest Neighbours

The k-nearest neighbours model is a better regression model than linear regression here; the achieved R2 score in height prediction is 76.4%.
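Both regressors can be scored on the same split for a direct comparison; a short sketch with placeholder variable names:

from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import r2_score

models = {"linear": LinearRegression(), "knn": KNeighborsRegressor(n_neighbors=5)}
# for name, model in models.items():
#     model.fit(X_train, height_train)                          # placeholder training data
#     print(name, r2_score(height_test, model.predict(X_test)))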

3.6 Time Complexity Comparison

The neural network has the highest time complexity: it requires much more training time than the other classification and regression algorithms. At the other end, the decision tree used the least training time.
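To make the comparison concrete, each model's fit() call can be timed on the same training set; a sketch with the standard library timer (the model list and data names are placeholders):

import time
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

candidates = {
    "decision tree": DecisionTreeClassifier(),
    "random forest": RandomForestClassifier(n_estimators=100),
    "neural network": MLPClassifier(hidden_layer_sizes=(2048, 1024, 512, 512, 512)),
}
# for name, model in candidates.items():
#     start = time.perf_counter()
#     model.fit(X_train, radius_class_train)      # same placeholder training data for all
#     print(f"{name}: {time.perf_counter() - start:.2f} s")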

4 Rules of Thumb for Neural Network Design

4.1 Number of Input Neurons

The effect of varying the number of input neurons is small but noticeable. Between 200 and 1000 input neurons performs best, although the variation is still significant.

4.2 Number of Hidden Neurons

There is a tendency for more hidden neurons to give better performance. It was also shown that, with the exception of 70 hidden neurons, the span between the 25th and 75th percentiles was relatively small despite changes in the number of input neurons. This illustrates that the number of input neurons does not have as significant an effect on the results as the number of hidden neurons.

4.3 Number of Weights

Increasing the number of weights in the neural network improves performance, but the benefit of more weights does not become visible until enough examples have been added. In theory, a larger neural network should be able to generalize at least as well as a smaller one: the larger network could “emulate” the smaller network by setting some weights to zero. Given enough time, a larger neural network should therefore either outperform a smaller neural network or set its weights in such a way that it performs very similarly.

For a network with a single hidden layer (ignoring bias terms), the number of weights can be written as:

W = hidden × (input + output)

where hidden, input and output denote the numbers of hidden, input and output neurons.

In order to increase the number of weights as much as possible, the smaller of the two factors, hidden or (input + output), should be increased. This explains why there was a clear correlation between the validation error and the number of hidden neurons, while no clear correlation was found between the validation error and the number of input neurons.
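Since the formula applies to a single hidden layer, it is easy to check how each factor scales the weight count; a tiny helper with illustrative numbers:

def weight_count(n_input, n_hidden, n_output):
    """Weights in a single-hidden-layer network, ignoring bias terms."""
    return n_hidden * (n_input + n_output)

# Adding one hidden neuron adds (n_input + n_output) weights, whereas adding one
# input neuron only adds n_hidden weights, so the smaller factor matters most.
print(weight_count(n_input=500, n_hidden=70, n_output=1))   # 35070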

It has also been observed that the number of hidden neurons was fairly close to the number of input neurons, which meant that increasing the number of hidden neurons noticeably increased the number of weights in the network. For the larger configurations, the number of hidden neurons is comparatively so small that increasing the number of input neurons barely affects the number of weights.

4.4 Neural Network Size

The effect of changing the neural network size shows that the error rate decays exponentially as the number of examples increases, and that a network with more weights achieves a smaller error once enough examples have been used. From this, one can learn that if the error stops improving despite adding more examples, adding more weights to the network might improve it. In this particular problem it was possible to select both the number of hidden neurons and the number of input neurons; the number of input nodes does not seem to have as significant an effect on the error as the number of weights, and the number of hidden nodes does not appear to have a substantial effect either. It is possible that the difference in error would have been more considerable if the gap between the number of input neurons and the number of hidden neurons had been larger.

In general, increasing the number of weights in the neural network will decrease the error. Nevertheless, in order to take advantage of the extra weights, the number of examples may need to be increased; conversely, if increasing the size of the neural network does not improve the error, adding more examples should help. If each neural network architecture were tested several times with different weight initializations, together with even larger networks, it might be possible to estimate how much lower the error would be if the number of examples stayed the same but the number of weights increased. It might also be possible to estimate how many examples would be needed before the error stops improving, given that the number of weights stays the same.
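One way to test these two estimates in practice is to plot a learning curve per architecture, i.e. the validation score as a function of the number of training examples; a sketch using scikit-learn's learning_curve with placeholder data and architecture:

import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.neural_network import MLPRegressor

# sizes, train_scores, val_scores = learning_curve(
#     MLPRegressor(hidden_layer_sizes=(512, 256, 128)),    # placeholder architecture
#     X, height, train_sizes=np.linspace(0.1, 1.0, 5), cv=5, scoring="r2",
# )
# If the validation score plateaus as `sizes` grows, more weights may help more than more data.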

Appendix: Other Useful Methods

1. AdaBoost

sklearn.ensemble.AdaBoostRegressor(n_estimators=50)
sklearn.ensemble.AdaBoostClassifier()

2. Gradient boosting

sklearn.ensemble.GradientBoostingRegressor(n_estimators=100)
sklearn.ensemble.GradientBoostingClassifier()

3. Bagging

sklearn.ensemble.BaggingRegressor()
sklearn.ensemble.BaggingClassifier()

4. Extra trees

sklearn.ensemble.ExtraTreesRegressor()
sklearn.ensemble.ExtraTreesClassifier()

