Quality Diversity Problems #279
Comments
Another associated issue is how we dictate […]. A very invasive solution is to merge […].
So, first of all, love this (I have a soft spot in my heart for QD algorithms).

When we tried (and then reverted) an overhaul of the […]. Is your point here that using that approach requires subclassing […]?

My own intuition was to avoid the idea of subclasses altogether wherever possible in LEAP (as I tend to strongly favor composition in place of inheritance). @markcoletti has made a good case for the usefulness of […].
Composition, or building alternates to […].

The issue with using the existing […].

Also, yes, subclassing […].
Another option is to support dictionary return values for `evaluate`.

Edit: for reference, the Deep RL library Ray Train uses a similar interface to control multiple actors.
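A minimal sketch of how a dictionary-returning `evaluate` might look (the class name, dict keys, and toy fitness function below are illustrative assumptions, not LEAP API):

```python
import numpy as np

class SphereQDProblem:
    """Hypothetical Problem whose evaluate() returns a dict rather than a
    bare fitness value."""

    def evaluate(self, phenome):
        phenome = np.asarray(phenome, dtype=float)
        fitness = -float(np.sum(phenome ** 2))  # quality: negated sphere function
        behavior = phenome[:2]                  # behavior descriptor: first two genes
        return {"fitness": fitness, "behavior": behavior}

# SphereQDProblem().evaluate([0.5, -0.25, 1.0])
# -> {'fitness': -1.3125, 'behavior': array([ 0.5 , -0.25])}
```

A downstream operator or a `fitness` property could then pick out whichever pieces it needs.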
This last option is growing on me. It opens opportunities for adaptive fitness by separating where the evaluation is stored from the fitness property. For instance, rather than returning a fitness at evaluation, the user could return desirable characteristics and subclass […].
@markcoletti and I were chatting about this: what about treating fitness & novelty/diversity evaluation as entirely separate processes? So, the usual evaluation operator/[…]. Then some third operator (?) would synthesize the two in an algorithm-specific way.

The question, I suppose, is whether an approach like this would be flexible enough to model all the major QD algorithms, e.g. novelty search with LS, surprise search, MAP-Elites, etc. (incidentally, I have always viewed MAP-Elites as a form of island model, but that's a discussion for another time).
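One way to picture that split, as a sketch with hypothetical pipeline stages (none of these are existing LEAP operators; individuals are assumed to expose `.genome` and writable attributes):

```python
import numpy as np

def assess_fitness(population):
    # Usual quality evaluation (negated sphere function as a stand-in).
    for ind in population:
        ind.raw_fitness = -float(np.sum(np.asarray(ind.genome, dtype=float) ** 2))
    return population

def assess_novelty(population):
    # Separate novelty pass: mean genome distance to the rest of the population.
    genomes = np.asarray([ind.genome for ind in population], dtype=float)
    for ind, genome in zip(population, genomes):
        ind.novelty = float(np.linalg.norm(genomes - genome, axis=1).mean())
    return population

def synthesize(population, weight=0.5):
    # The "third operator": combine the two signals in an algorithm-specific way.
    for ind in population:
        ind.fitness = (1 - weight) * ind.raw_fitness + weight * ind.novelty
    return population
```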
I don't believe that approach is flexible enough, as you surmised. The trouble is that if the QD algorithm uses a behavior descriptor drawn from the observed states, or has to observe them itself like surprise search, then quality and diversity, or at the very least the values used to determine them, have to be evaluated simultaneously.

On a local branch I've given a trial run at replacing `Individual`'s `fitness` attribute with a property:

```python
@property
def fitness(self):
    # Mapping-like evaluations: pull fitness out by key.
    if hasattr(self.evaluation, "__getitem__") and "fitness" in self.evaluation:
        return self.evaluation["fitness"]
    # Object evaluations: use their fitness attribute.
    elif hasattr(self.evaluation, "fitness"):
        return self.evaluation.fitness
    # Otherwise the evaluation itself is the fitness.
    return self.evaluation
```

This lets you return a dictionary containing fitness, a class with a `fitness` attribute, or a plain fitness value.

For actual usage, I've been working on an implementation of BRNS. I created a subclass of […] with:

```python
@property
def fitness(self):
    # Derive quality and diversity scores from the stored evaluation.
    # (Assumes `import numpy as np` at module level.)
    quality, diversity = self.brns.quality_diversity(
        self.evaluation["quality"], self.evaluation["behavior"])
    if self.global_quality:
        quality = np.mean(self.evaluation["quality"])
    return np.array([quality, diversity])

def evaluate(self):
    evaluation = super().evaluate()
    # Push the new observation so the BRNS train step can learn from it.
    self.brns.push_observation(
        self.evaluation["quality"], self.evaluation["behavior"])
    return evaluation
```

And included the BRNS train step as an operator in my pipeline (see the sketch below). As you can see, relatively painless. As a note, there was a lot of accompanying boilerplate due to the […]
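The train step mentioned above could sit in a pipeline as something like this sketch (`brns.train_step()` is an assumed method name, not code from the branch):

```python
def brns_train_operator(brns):
    """Hypothetical pipeline operator: train the BRNS networks on the
    observations pushed during evaluate(), then pass individuals through."""
    def operator(population):
        brns.train_step()  # assumed BRNS API
        return population
    return operator
```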
In theory the changes I've made could be done in a subclass of its own, although the essentials would need to be reimplemented for […].
Ah, right: we need to collect behavior information from the same "simulation run" as fitness information in any given case, so they really do need to be done at the same time.
I am working on an extra package for QD algorithms. My current solution is a mixin that redirects the result from […].
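Going by that description, such a mixin might look roughly like this (a guess at the shape, not the package's actual code):

```python
class QDEvaluateMixin:
    """Hypothetical mixin: store the full evaluate() result on the
    individual instead of treating it directly as a scalar fitness."""

    def evaluate(self):
        # Redirect the result into an `evaluation` slot; a fitness property
        # or downstream operator derives fitness from it later.
        self.evaluation = super().evaluate()
        return self.evaluation
```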
Quality Diversity algorithms have an additional characteristic recorded when evaluating individuals: their behavior descriptor. Depending on the algorithm, this can be translated into fitness in a variety of ways, and the process is almost always stateful. The current `Problem` cannot introduce new member variables to the `Individual` without subclassing `Individual` and altering the `evaluate` interface. Altering the `Problem` to translate behavior into fitness at evaluate time is not an option, since in distributed evaluation each problem would be accessing different states.

An option is to store the quality and behavior descriptors with fitness as returned by `.evaluate()`, then decompose and convert to fitness using an operator, but this isn't super appealing to me.

A pie-in-the-sky ask would be to alter the base `Problem` and `Individual` evaluate interfaces to support QD throughout. My idea is something along this line, barring more intuitive name choices:

`Problem`:

- A function, `evaluate_fitness`, that calculates the traditional fitness of the individual and defaults to returning `None`.
- A function, `evaluate_behavior`, that calculates the behavior descriptor of the individual and defaults to returning `None`.
- The original `evaluate`, defaulting to calling `evaluate_fitness` and `evaluate_behavior` and returning them as a tuple.

`Individual`:

- `evaluate` changed to store the respective return values to `fitness` and `behavior_descriptor`.

Then operators to join or replace `fitness` with some function of `behavior_descriptor`.

This would let the user implement any combination of behavior and fitness. If they want a normal optimization algorithm, they can simply implement `evaluate_fitness`. If they want to determine behavior separately, or are just doing a diversity-based approach, implement `evaluate_behavior`. If the two have to be determined jointly, they can override `evaluate` and ignore the other two functions. Migrating existing code would just require renaming `evaluate` on existing `Problem` implementations to `evaluate_fitness`.
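For concreteness, a minimal sketch of the proposed interfaces (hypothetical code, not current LEAP classes):

```python
class Problem:
    def evaluate_fitness(self, phenome):
        # Traditional fitness; None by default so diversity-only problems can skip it.
        return None

    def evaluate_behavior(self, phenome):
        # Behavior descriptor; None by default for plain optimization.
        return None

    def evaluate(self, phenome):
        # Default: compute both and return them as a tuple. Problems that must
        # determine quality and behavior jointly override this instead.
        return self.evaluate_fitness(phenome), self.evaluate_behavior(phenome)


class Individual:
    def __init__(self, genome, problem):
        self.genome = genome
        self.problem = problem
        self.fitness = None
        self.behavior_descriptor = None

    def evaluate(self):
        # Store the respective return values on the individual.
        self.fitness, self.behavior_descriptor = self.problem.evaluate(self.genome)
        return self.fitness
```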