ECJ relies heavily on parameter files for nearly every conceivable parameter setting. It even relies on parameter files to determine which classes to use in different places. This means that understanding parameters and parameter files is crucial to using ECJ.
ECJ's parameters are written one to a line in Java property-list style. They may be in one of the two following formats:
parametername = value
parametername value
The second option is deprecated; please don't use it. Leading and trailing whitespace is stripped. Parameter values may contain internal whitespace, but parameter names may not. Blank lines and lines beginning with a "#" are ignored. Parameter names and values are case-sensitive.
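The line rules above can be sketched as a small parser. This is a minimal illustration under stated assumptions, not ECJ's actual loader: the class and method names (ParamLineSketch, parseLine, parse) are hypothetical, only the preferred "name = value" form is handled, and a "#" is treated as a comment marker even when preceded by whitespace.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the line rules described above; not ECJ's own parser.
final class ParamLineSketch {
    // Returns {name, value} for a parameter line, or null for lines that carry no parameter.
    static String[] parseLine(String line) {
        String trimmed = line.trim();
        // Blank lines and comment lines are ignored.
        if (trimmed.isEmpty() || trimmed.startsWith("#")) return null;
        int eq = trimmed.indexOf('=');
        // Simplification: the deprecated "name value" form is skipped rather than parsed.
        if (eq < 0) return null;
        String name = trimmed.substring(0, eq).trim();
        // Internal whitespace in the value is kept; only the ends are stripped.
        String value = trimmed.substring(eq + 1).trim();
        return new String[] { name, value };
    }

    // Parses a whole file's text into an ordered name-to-value map.
    static Map<String, String> parse(String text) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            String[] pair = parseLine(line);
            if (pair != null) params.put(pair[0], pair[1]);
        }
        return params;
    }

    public static void main(String[] args) {
        String text = "# a comment\n\n  evalthreads =  4  \nstat.file = $out.stat\n";
        System.out.println(parse(text));
    }
}
```

Because names and values are case-sensitive, the sketch does no case folding anywhere.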
Parameter values are interpreted as one of five data types, depending on the parameter:
parametername = true
parametername = false
If a parameter is declared but is not one of these two values, it is assumed to be a "default" value, which varies depending on the parameter. For example, if the default for "myparameter" is true, then:
myparameter = gobbledygook
myparameter =
...both signify that myparameter is to be set to "true".
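The boolean rule can be written down in a few lines. The following is a sketch of the behavior described above, not ECJ's own code; the class name BooleanParamSketch and the method interpret are hypothetical, and the exact-match comparison is an assumption based on the statement that values are case-sensitive.

```java
// Hypothetical sketch of the boolean rule described above; not ECJ's own code.
final class BooleanParamSketch {
    // value is the declared parameter value; defaultValue is that parameter's default.
    static boolean interpret(String value, boolean defaultValue) {
        String v = value.trim();
        if (v.equals("true")) return true;
        if (v.equals("false")) return false;
        // Anything else, including an empty value, falls back to the default.
        return defaultValue;
    }

    public static void main(String[] args) {
        System.out.println(interpret("gobbledygook", true)); // true
        System.out.println(interpret("", true));             // true
        System.out.println(interpret("false", true));        // false
    }
}
```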
java ec.Evolve -file myParameterFile
Parameter files can have multiple parents which define additional parameters. A parameter file specifies that it has a parent with a special parameter:
parent.n = parentFile
...where n indicates that the parent is parent #n. n starts at 0 and increases. Your parents must be assigned with consecutive parameter names starting with parent.0. For example:
parent.0 = ../../myFirstParent.params
parent.1 = ../../../mySecondParent.params
parent.2 = ../foo/bar/myThirdParent.params
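To see why the numbering must be consecutive, consider a loader that walks parent.0, parent.1, and so on until a name is missing. The sketch below assumes that behavior; it is an illustration rather than ECJ's implementation, and ParentListSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; assumes the loader stops at the first missing parent.n.
final class ParentListSketch {
    // Collects parent.0, parent.1, ... in order, stopping at the first gap.
    static List<String> parents(Map<String, String> params) {
        List<String> result = new ArrayList<>();
        for (int n = 0; params.containsKey("parent." + n); n++) {
            result.add(params.get("parent." + n));
        }
        return result;
    }

    public static void main(String[] args) {
        // Map.of requires Java 9 or later.
        Map<String, String> params = Map.of(
            "parent.0", "../../myFirstParent.params",
            "parent.1", "../../../mySecondParent.params",
            "parent.3", "skipped.params"); // never reached: parent.2 is missing
        System.out.println(parents(params));
    }
}
```

Under this assumption, a file that declares parent.0, parent.1, and parent.3 would silently lose its third parent.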
Parameters may also be defined on the command line when running ECJ with the "-p" option, which may appear multiple times. No space may appear between the parameter name, "=", and value. For example:
java ec.Evolve -file my.params -p extraparam=extravalue -p anotherparam=anothervalue
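The no-spaces rule follows from how the shell splits arguments: each "-p" is followed by exactly one argument, which must carry the whole name=value pair. Here is a minimal sketch of that idea; it is not ECJ's argument handling, and CommandLineParamSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of "-p name=value" handling; not ECJ's own code.
final class CommandLineParamSketch {
    static Map<String, String> parse(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equals("-p")) {
                // The single argument after -p must hold the whole pair.
                String pair = args[++i];
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException(
                        "Expected name=value after -p, got: " + pair);
                }
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        String[] example = {
            "-file", "my.params",
            "-p", "extraparam=extravalue",
            "-p", "anotherparam=anothervalue" };
        System.out.println(parse(example));
    }
}
```

Writing "-p extraparam = extravalue" would hand the sketch three separate arguments, and the one right after "-p" would contain no "=" at all.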
Parameters may further be programmatically defined internally by the system, though ECJ presently never does this. If you have two parameters with the same name, here are the rules guiding which ones take precedence:
Since numerous objects read parameters from the parameter database, ECJ organizes its parameter namespace hierarchically using periods to separate elements in parameter names. Let's begin with the simplest situation: some ECJ parameters are simple global parameters. For example,
evalthreads = 4
...tells ECJ that it should spawn 4 threads when doing population evaluation. Other parameters are organized hierarchically because it's cleaner that way. For example, if evalthreads and breedthreads are both 4, then there are 4 seeds for the random number generator which must be defined. They are defined as such (Note the period between seed and the number n):
seed.0 = 2341
seed.1 = 7234123
seed.2 = 411
seed.3 = 34021239
It's common for arrays of objects to be defined like this, with numbers representing their position.
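Reading such an array amounts to building each name from a prefix, a period, and an index. The sketch below illustrates this for the seeds; it is not ECJ's code, SeedArraySketch and readSeeds are hypothetical names, and treating every seed as a plain integer is an assumption.

```java
import java.util.Map;

// Hypothetical sketch of reading an indexed parameter array; not ECJ's own code.
final class SeedArraySketch {
    // Reads seed.0 through seed.(count - 1), failing loudly if one is missing.
    static int[] readSeeds(Map<String, String> params, int count) {
        int[] seeds = new int[count];
        for (int n = 0; n < count; n++) {
            String name = "seed." + n;
            String value = params.get(name);
            if (value == null) {
                throw new IllegalStateException("Missing parameter: " + name);
            }
            seeds[n] = Integer.parseInt(value.trim());
        }
        return seeds;
    }

    public static void main(String[] args) {
        // Map.of requires Java 9 or later.
        Map<String, String> params = Map.of(
            "seed.0", "2341", "seed.1", "7234123",
            "seed.2", "411", "seed.3", "34021239");
        for (int seed : readSeeds(params, 4)) System.out.println(seed);
    }
}
```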
The period is also used for other hierarchical purposes. When an object contains other objects as subordinates, they fall within its hierarchy. Such objects have a parameter base which is prefixed to them. For example, the global Population instance contains an array of Subpopulation instances, each of which in turn contains a variety of objects. Here's how the Population instance is defined, the number of subpops it contains is set, the classes for its various subpopulations are defined, and the number of individuals each one has is set:
# We're doing some coevolution, so we need two
# subpopulations, each with 500 individuals
pop = ec.Population
pop.subpops = 2
pop.subpop.0 = ec.Subpopulation
pop.subpop.0.size = 500
pop.subpop.1 = ec.Subpopulation
pop.subpop.1.size = 500
Note that the parameters for each subpopulation begin with the parameter base pop.subpop.n. Each Subpopulation instance requests a "size" relative to its current parameter base, which is handed to it by its "controlling" object. As you might guess, these hierarchical bases can get very long.
If an object needs a given parameter, and the parameter does not exist with the provided base, then the object can check a default base for the parameter. For example, let's say that breeding pipeline #0 of the species for subpopulation #1 of the population is a MutationPipeline (GP point mutation) and is using Tournament Selection as its source #0 to select individuals. It might declare some information thusly:
pop.subpop.1.species.pipe.0 = ec.gp.koza.MutationPipeline
pop.subpop.1.species.pipe.0.source.0 = ec.select.TournamentSelection
...we can custom-define the tournament size parameter by tacking it onto this base as:
pop.subpop.1.species.pipe.0.source.0.size = 7
...or we can fall back on a "default" setting for this parameter for all Tournament Selection objects as:
select.tournament.size = 2
...In this case the hierarchical parameter base is pop.subpop.1.species.pipe.0.source.0 and the "default base" for Tournament Selection is select.tournament. If the object looks both places and still can't find a parameter defined (or it's improperly defined), it will issue an error. Some global objects don't have default parameter bases, but most every object which can be repeatedly declared in different places will have a default base.
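The two-step lookup just described can be sketched as follows. This is a minimal illustration, not ECJ's parameter database; DefaultBaseSketch and lookup are hypothetical names, and the parameters live in a plain map.

```java
import java.util.Map;

// Hypothetical sketch of base-then-default-base lookup; not ECJ's own code.
final class DefaultBaseSketch {
    // Looks for base.name first, then defaultBase.name; fails if neither exists.
    static String lookup(Map<String, String> params,
                         String base, String defaultBase, String name) {
        String value = params.get(base + "." + name);
        if (value == null && defaultBase != null) {
            value = params.get(defaultBase + "." + name);
        }
        if (value == null) {
            throw new IllegalStateException(
                "No value for " + base + "." + name
                + " or default " + defaultBase + "." + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // Map.of requires Java 9 or later.
        Map<String, String> params = Map.of(
            "pop.subpop.1.species.pipe.0.source.0.size", "7",
            "select.tournament.size", "2");
        // Defined under its own base: prints 7
        System.out.println(lookup(params,
            "pop.subpop.1.species.pipe.0.source.0", "select.tournament", "size"));
        // Not defined under its own base, so the default base is used: prints 2
        System.out.println(lookup(params,
            "pop.subpop.0.species.pipe.0.source.0", "select.tournament", "size"));
    }
}
```

Passing null as the default base models the global objects that have no default base.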
In general, objects which read parameters fall into one of several classes:
The class documentation contains three tables which give information about parameters and parameter bases for instances of that class. The Parameters table indicates the valid parameters declared for that instance. The Default Base indicates the class's default base, if any. The Parameter Bases table indicates the new parameter bases for subsidiary objects to this instance. For example, here are the tables from ec.gp.koza.MutationPipeline, the class responsible for doing the GP point mutation operator:
[Tables for ec.gp.koza.MutationPipeline: Parameters, Default Base, Parameter bases]
MutationPipeline is derived from ec.BreedingPipeline, which adds the following tables:
[Tables for ec.BreedingPipeline: Parameters, Parameter bases]
ec.BreedingPipeline in turn is derived from ec.BreedingSource, which adds the following tables:
[Table for ec.BreedingSource: Parameters]
Although MutationPipeline inherits all these parameters, the parameter base for all of them is the instance's parameter base, handed to it by its controller object. And the default base for all of them is always the last one defined (in this case, "gp.koza.mutate"). Default bases for parent classes are not used.
Back to our original example, imagine that we had a MutationPipeline used as breeding pipeline #0 of the species used in subpopulation #1 of the population:
pop.subpop.1.species.pipe.0 = ec.gp.koza.MutationPipeline
We could specify a probability for this pipeline as:
pop.subpop.1.species.pipe.0.prob = 0.9
...or we might specify a default probability (not necessarily a good idea) for all MutationPipelines as:
gp.koza.mutate.prob = 0.4
MutationPipeline contains two subsidiary instances, one which subclasses from gp.GPNodeSelector, and one which subclasses from gp.GPNodeBuilder. The first is responsible for picking a subtree to mutate, and the second is responsible for creating a new subtree. We specify classes for those instances in their parameters (we'll use a KozaNodeSelector and a GrowBuilder):
pop.subpop.1.species.pipe.0.ns.0 = ec.gp.koza.KozaNodeSelector
pop.subpop.1.species.pipe.0.build.0 = ec.gp.koza.GrowBuilder
Of course, we might provide default choices as well:
gp.koza.mutate.ns.0 = ec.gp.koza.KozaNodeSelector
gp.koza.mutate.build.0 = ec.gp.koza.GrowBuilder
These two objects have parameters to set up as well. Their parameter bases are specified as base.ns and base.build respectively. In this case, it means that their parameter bases are pop.subpop.1.species.pipe.0.ns.0 and pop.subpop.1.species.pipe.0.build.0. And thus the cycle of life continues. For example, KozaNodeSelectors have a default base of gp.koza.ns and a root parameter which specifies the probability that they pick the root of a tree. The root parameter would then be found at pop.subpop.1.species.pipe.0.ns.0.root, or the default value at gp.koza.ns.root.
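Each level of the hierarchy simply appends one more element to the base it was handed. The sketch below shows how the two candidate names for the root parameter come about; ParameterBaseSketch and its push method are hypothetical helpers written for this illustration, not ECJ classes.

```java
// Hypothetical helper for building hierarchical parameter names; not an ECJ class.
final class ParameterBaseSketch {
    // Appends one element to a base, separated by a period.
    static String push(String base, String element) {
        return base.isEmpty() ? element : base + "." + element;
    }

    public static void main(String[] args) {
        // Base handed to the MutationPipeline by its controlling object.
        String pipeBase = "pop.subpop.1.species.pipe.0";
        // The pipeline derives the base for its node selector #0.
        String nsBase = push(push(pipeBase, "ns"), "0");
        // Default base of KozaNodeSelector, as given in the text.
        String defaultBase = "gp.koza.ns";
        System.out.println(push(nsBase, "root"));      // pop.subpop.1.species.pipe.0.ns.0.root
        System.out.println(push(defaultBase, "root")); // gp.koza.ns.root
    }
}
```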
There are way too many possible parameters to discuss here. Here are some places to start digging.
Of the parameters listed below, some are global parameters, some are defined through the parameter base hierarchy, and some are defined through default bases. The parameter files are app/regression/noerc.params, its parent gp/koza/params, and its parent simple/params.
# Number of threads and random number generator seeds
breedthreads = 1
evalthreads = 1
seed.0 = 4357

# Garbage collection
gc = false
aggressive = true
gc-modulo = 1

# Checkpointing
checkpoint = false
checkpoint-modulo = 1
prefix = ec

# Outputting Stuff
nostore = false
flush = true
verbosity = 0

# The EvolutionState Object
state = ec.simple.SimpleEvolutionState

# Evolution Parameters
generations = 51
quit-on-run-complete = true

# The Initializer, Breeder, Exchanger, and Finisher
breed = ec.simple.SimpleBreeder
exch = ec.simple.SimpleExchanger
finish = ec.simple.SimpleFinisher
init = ec.gp.GPInitializer

# The Evaluator and the Problem (ADF stuff is always loaded but not used in this case)
eval = ec.simple.SimpleEvaluator
eval.problem = ec.app.regression.Regression
eval.problem.data = ec.app.regression.RegressionData
eval.problem.stack = ec.gp.ADFStack
eval.problem.stack.context = ec.gp.ADFContext
eval.problem.stack.context.data = ec.app.regression.RegressionData

# The Statistics
stat = ec.gp.koza.KozaStatistics
stat.file = $out.stat

# Default Tournament Selection tournament size
select.tournament.size = 7

# Default HalfBuilder (ramped half/half tree building) parameters
gp.koza.half.growp = 0.5
gp.koza.half.max-depth = 6

# Default KozaNodeSelector parameters
gp.koza.ns.nonterminals = 0.9
gp.koza.ns.root = 0.0
gp.koza.ns.terminals = 0.1

# Default Reproduction operator parameters
gp.koza.reproduce.source.0 = ec.select.TournamentSelection

# Default Crossover operator parameters
gp.koza.xover.maxdepth = 17
gp.koza.xover.ns.0 = ec.gp.koza.KozaNodeSelector
gp.koza.xover.ns.1 = same
gp.koza.xover.source.0 = ec.select.TournamentSelection
gp.koza.xover.source.1 = same
gp.koza.xover.tries = 1

# Function Sets (there's only one)
gp.fs.size = 1
gp.fs.0 = ec.gp.GPFunctionSet
gp.fs.0.info = ec.gp.GPFuncInfo
gp.fs.0.name = f0
gp.fs.0.size = 9
gp.fs.0.func.0 = ec.app.regression.func.X
gp.fs.0.func.0.nc = nc0
gp.fs.0.func.1 = ec.app.regression.func.Add
gp.fs.0.func.1.nc = nc2
gp.fs.0.func.2 = ec.app.regression.func.Mul
gp.fs.0.func.2.nc = nc2
gp.fs.0.func.3 = ec.app.regression.func.Sub
gp.fs.0.func.3.nc = nc2
gp.fs.0.func.4 = ec.app.regression.func.Div
gp.fs.0.func.4.nc = nc2
gp.fs.0.func.5 = ec.app.regression.func.Sin
gp.fs.0.func.5.nc = nc1
gp.fs.0.func.6 = ec.app.regression.func.Cos
gp.fs.0.func.6.nc = nc1
gp.fs.0.func.7 = ec.app.regression.func.Exp
gp.fs.0.func.7.nc = nc1
gp.fs.0.func.8 = ec.app.regression.func.Log
gp.fs.0.func.8.nc = nc1

# Standard Node Constraints for untyped GP with nodes of various arity sizes
gp.nc.size = 7
gp.nc.0 = ec.gp.GPNodeConstraints
gp.nc.0.name = nc0
gp.nc.0.returns = nil
gp.nc.0.size = 0
gp.nc.1 = ec.gp.GPNodeConstraints
gp.nc.1.name = nc1
gp.nc.1.returns = nil
gp.nc.1.size = 1
gp.nc.1.child.0 = nil
gp.nc.2 = ec.gp.GPNodeConstraints
gp.nc.2.name = nc2
gp.nc.2.returns = nil
gp.nc.2.size = 2
gp.nc.2.child.0 = nil
gp.nc.2.child.1 = nil
gp.nc.3 = ec.gp.GPNodeConstraints
gp.nc.3.name = nc3
gp.nc.3.returns = nil
gp.nc.3.size = 3
gp.nc.3.child.0 = nil
gp.nc.3.child.1 = nil
gp.nc.3.child.2 = nil
gp.nc.4 = ec.gp.GPNodeConstraints
gp.nc.4.name = nc4
gp.nc.4.returns = nil
gp.nc.4.size = 4
gp.nc.4.child.0 = nil
gp.nc.4.child.1 = nil
gp.nc.4.child.2 = nil
gp.nc.4.child.3 = nil
gp.nc.5 = ec.gp.GPNodeConstraints
gp.nc.5.name = nc5
gp.nc.5.returns = nil
gp.nc.5.size = 5
gp.nc.5.child.0 = nil
gp.nc.5.child.1 = nil
gp.nc.5.child.2 = nil
gp.nc.5.child.3 = nil
gp.nc.5.child.4 = nil
gp.nc.6 = ec.gp.GPNodeConstraints
gp.nc.6.name = nc6
gp.nc.6.returns = nil
gp.nc.6.size = 6
gp.nc.6.child.0 = nil
gp.nc.6.child.1 = nil
gp.nc.6.child.2 = nil
gp.nc.6.child.3 = nil
gp.nc.6.child.4 = nil
gp.nc.6.child.5 = nil

# Tree Constraints
gp.tc.size = 1
gp.tc.0 = ec.gp.GPTreeConstraints
gp.tc.0.init = ec.gp.koza.HalfBuilder
gp.tc.0.name = tc0
gp.tc.0.returns = nil

# GP Types
gp.type.a.size = 1
gp.type.a.0.name = nil
gp.type.s.size = 0

# The Population, and its one subpopulation, species, breeding pipelines and individuals
pop = ec.Population
pop.subpops = 1
pop.subpop.0 = ec.Subpopulation
pop.subpop.0.duplicate-retries = 100
pop.subpop.0.fitness = ec.gp.koza.KozaFitness
pop.subpop.0.size = 1000
pop.subpop.0.species = ec.gp.GPSpecies
pop.subpop.0.species.ind = ec.gp.GPIndividual
pop.subpop.0.species.ind.numtrees = 1
pop.subpop.0.species.ind.tree.0 = ec.gp.GPTree
pop.subpop.0.species.ind.tree.0.tc = tc0
pop.subpop.0.species.pipe = ec.breed.MultiBreedingPipeline
pop.subpop.0.species.pipe.source.0 = ec.gp.koza.CrossoverPipeline
pop.subpop.0.species.pipe.source.0.prob = 0.9
pop.subpop.0.species.pipe.source.1 = ec.gp.koza.ReproductionPipeline
pop.subpop.0.species.pipe.source.1.prob = 0.1