# GECCO '21: Proceedings of the Genetic and Evolutionary Computation Conference

Full Citation in the ACM Digital Library## SESSION: Genetic programming

### Measuring feature importance of symbolic regression models using partial effects

In explainable AI, one aspect of a prediction's explanation is to measure each predictor's importance to the decision process. The importance can measure how much variation a predictor promotes locally or how much the predictor contributes to the deviation from a reference point (Shapley value). If we have the ground truth analytical model, we can calculate the former using the Partial Effect, calculated as the predictor's partial derivative. Also, we can estimate the latter by calculating the average partial effect multiplied by the difference between the predictor and the reference value. Symbolic Regression is a gray-box model for regression problems that returns an analytical model approximating the input data. Although it is often associated with interpretability, few works explore this property. This paper will investigate the use of Partial Effect with the analytical models generated by the Interaction-Transformation Evolutionary Algorithm symbolic regressor (ITEA). We show that the regression models returned by ITEA coupled with Partial Effect provide the closest explanations to the ground-truth and a close approximation to Shapley values. These results open up new opportunities to explain symbolic regression models compared to the approximations provided by model-agnostic approaches.

### A novel multi-task genetic programming approach to uncertain capacitated Arc routing problem

Uncertain Capacitated Arc Routing Problem (UCARP) is an NP-hard optimisation problem with many applications in logistics domains. Genetic Programming (GP) is capable of evolving routing policies to handle the uncertain environment of UCARP. There are many different but related UCARP domains in the real world to be solved (e.g. winter gritting and waste collection for different cities). Instead of training a routing policy for each of them, we can use the multi-task learning paradigm to improve the training effectiveness by sharing the common knowledge among the related UCARP domains. Previous studies showed that GP population for solving UCARP loses diversity during its evolution, which decreases the effectiveness of knowledge sharing. To address this issue, in this work we propose a novel multi-task GP approach that takes the uniqueness of transferable knowledge, as well as its quality, into consideration. Additionally, the transferred knowledge is utilised in a manner that improves diversity. We investigated the performance of the proposed method with several experimental studies and demonstrated that the designed knowledge transfer mechanism can significantly improve the performance of GP for solving UCARP.

### Speed benchmarking of genetic programming frameworks

Genetic Programming (GP) is known to suffer from the burden of being computationally expensive by design. While, over the years, many techniques have been developed to mitigate this issue, data vectorization, in particular, is arguably still the most attractive strategy due to the parallel nature of GP. In this work, we employ a series of benchmarks meant to compare both the performance and evolution capabilities of different vectorized and iterative implementation approaches across several existing frameworks. Namely, TensorGP, a novel open-source engine written in Python, is shown to greatly benefit from the TensorFlow library to accelerate the domain evaluation phase in GP. The presented performance benchmarks demonstrate that the TensorGP engine manages to pull ahead, with relative speedups above two orders of magnitude for problems with a higher number of fitness cases. Additionally, as a consequence of being able to compute larger domains, we argue that TensorGP performance gains aid the discovery of more accurate candidate solutions.

### Zoetrope genetic programming for regression

The Zoetrope Genetic Programming (ZGP) algorithm is based on an original representation for mathematical expressions, targeting evolutionary symbolic regression. The zoetropic representation uses repeated fusion operations between partial expressions, starting from the terminal set. Repeated fusions within an individual gradually generate more complex expressions, ending up in what can be viewed as new features. These features are then linearly combined to best fit the training data. ZGP individuals then undergo specific crossover and mutation operators, and selection takes place between parents and offspring. ZGP is validated using a large number of public domain regression datasets, and compared to other symbolic regression algorithms, as well as to traditional machine learning algorithms. ZGP reaches state-of-the-art performance with respect to both types of algorithms, and demonstrates a low computational time compared to other symbolic regression approaches.

### PSB2: the second program synthesis benchmark suite

For the past six years, researchers in genetic programming and other program synthesis disciplines have used the General Program Synthesis Benchmark Suite to benchmark many aspects of automatic program synthesis systems. These problems have been used to make notable progress toward the goal of general program synthesis: automatically creating the types of software that human programmers code. Many of the systems that have attempted the problems in the original benchmark suite have used it to demonstrate performance improvements granted through new techniques. Over time, the suite has gradually become outdated, hindering the accurate measurement of further improvements. The field needs a new set of more difficult benchmark problems to move beyond what was previously possible.

In this paper, we describe the 25 new general program synthesis benchmark problems that make up PSB2, a new benchmark suite. These problems are curated from a variety of sources, including programming katas and college courses. We selected these problems to be more difficult than those in the original suite, and give results using PushGP showing this increase in difficulty. These new problems give plenty of room for improvement, pointing the way for the next six or more years of general program synthesis research.

### CoInGP: convolutional inpainting with genetic programming

We investigate the use of Genetic Programming (GP) as a convolutional predictor for
missing pixels in images. The training phase is performed by sweeping a sliding window
over an image, where the pixels on the border represent the inputs of a GP tree. The
output of the tree is taken as the predicted value for the central pixel. We consider
two topologies for the sliding window, namely the *Moore* and the *Von Neumann neighborhood.* The best GP tree scoring the lowest prediction error over the training set is then
used to predict the pixels in the test set. We experimentally assess our approach
through two experiments. In the first one, we train a GP tree over a subset of 1000
complete images from the MNIST dataset. The results show that GP can learn the distribution
of the pixels with respect to a simple baseline predictor, with no significant differences
observed between the two neighborhoods. In the second experiment, we train a GP convolutional
predictor on two degraded images, removing around 20% of their pixels. In this case,
we observe that the Moore neighborhood works better, although the Von Neumann neighborhood
allows for a larger training set.

### Cooperative coevolutionary multiobjective genetic programming for microarray data classification

DNA microarray data contain valuable biological information, which makes it of great importance in disease analysis and cancer diagnosis. However, the classification of microarray data is still a challenging task because of the high dimension and small sample size, especially for multiclass data accompanied by class imbalance. In this paper, we propose a cooperative coevolutionary multiobjective genetic programming (CC-MOGP) for microarray data classification. It converts a multiclass problem into a set of tractable binary problems and coevolves the corresponding population. And a cooperative coevolutionary Pareto archived evolution strategy (CC-PAES) is employed to approximate the Pareto front. During this procedure, we propose a synergy test method to assist in guiding the coevolution between populations. Experimental results on 8 multiclass microarray data show that CC-MOGP can obtain competitive prediction accuracy compared with several state-of-art evolutionary computation and traditional methods.

### Towards effective GP multi-class classification based on dynamic targets

In the multi-class classification problem GP plays an important role when combined with other non-GP classifiers. However, when GP performs the actual classification (without relying on other classifiers) its classification accuracy is low. This is especially true when the number of classes is high. In this paper, we present DTC, a GP classifier that leverages the effectiveness of the dynamic target approach to evolve a set of discriminant functions (one for each class). Notably, DTC is the first GP classifier that defines the fitness of individuals by using the synergistic combination of linear scaling and the hinge-loss function (commonly used by SVM). Differently, most previous GP classifiers use the number of correct classifications to drive the evolution. We compare DTC with eight state-of-art multi-class classification techniques (e.g., RF, RS, MLP, and SVM) on eight popular datasets. The results show that DTC achieves competitive classification accuracy even with 15 classes, without relying on other classifiers.

### A generalizability measure for program synthesis with genetic programming

The generalizability of programs synthesized by genetic programming (GP) to unseen test cases is one of the main challenges of GP-based program synthesis. Recent work showed that increasing the amount of training data improves the generalizability of the programs synthesized by GP. However, generating training data is usually an expensive task as the output value for every training case must be calculated manually by the user. Therefore, this work suggests an approximation of the expected generalization ability of solution candidates found by GP. To obtain candidate solutions that all solve the training cases, but are structurally different, a GP run is not stopped after the first solution is found that solves all training instances but search continues for more generations. For all found candidate solutions (solving all training cases), we calculate the behavioral vector for a set of randomly generated additional inputs. The proportion of the number of different found candidate solutions generating the same behavioral vector with highest frequency compared to all other found candidate solutions with different behavior can serve as an approximation for the generalizability of the found solutions. The paper presents experimental results for a number of standard program synthesis problems confirming the high prediction accuracy.

### Genetic programming is naturally suited to evolve bagging ensembles

Learning ensembles by bagging can substantially improve the generalization performance of low-bias, high-variance estimators, including those evolved by Genetic Programming (GP). To be efficient, modern GP algorithms for evolving (bagging) ensembles typically rely on several (often inter-connected) mechanisms and respective hyper-parameters, ultimately compromising ease of use. In this paper, we provide experimental evidence that such complexity might not be warranted. We show that minor changes to fitness evaluation and selection are sufficient to make a simple and otherwise-traditional GP algorithm evolve ensembles efficiently. The key to our proposal is to exploit the way bagging works to compute, for each individual in the population, multiple fitness values (instead of one) at a cost that is only marginally higher than the one of a normal fitness evaluation. Experimental comparisons on classification and regression tasks taken and reproduced from prior studies show that our algorithm fares very well against state-of-the-art ensemble and non-ensemble GP algorithms. We further provide insights into the proposed approach by (i) scaling the ensemble size, (ii) ablating the changes to selection, (iii) observing the evolvability induced by traditional subtree variation.

Code: https://github.com/marcovirgolin/2SEGP.

### Evolvability and complexity properties of the digital circuit genotype-phenotype map

Recent research on genotype-phenotype (G-P) maps in natural evolution has contributed significantly to our understanding of neutrality, redundancy, robustness, and evolvability. Here we investigate the properties of the digital logic gate G-P map and show that this map shares many of the common properties of natural G-P maps, with the exception of a positive relationship between evolvability and robustness. Our results show that in some cases robustness and evolvability may be negatively related as a result of the method used to approximate evolvability. We give two definitions of circuit complexity and empirically show that these two definitions are closely related. This study leads to a better understanding of the relationships between redundancy, robustness, and evolvability in genotype-phenotype maps. We further investigate these results in the context of complexity and show the relationships between phenotypic complexity and phenotypic redundancy, robustness and evolvability.