Merge pull request #194 from Lux-AI-Challenge/v2.0.7

V2.1.0
Lux-AI-Challenge · Jan 28, 2023 · f19ec3d
2 parents 0f00419 + 1927476
commit f19ec3d
Showing 18 changed files with 166 additions and 425 deletions.
diff --git a/ChangeLog.md b/ChangeLog.md
@@ -1,5 +1,16 @@
 # ChangeLog
 
+### v2.0.7
+
+Added [advanced_specs](https://github.com/Lux-AI-Challenge/Lux-Design-S2/blob/main/docs/advanced_specs.md) document that goes over CPU engine code in depth
+
+Fix bug where repeated actions had their `n` value reset to 1. Actions now internally have an `execution count`. It is not part of the action space, but it is tracked and displayed in action queues.
+
+Fix bug where two heavies entering a tile can both get destroyed if a single light unit there has more power.
+
+Fix bug where clearing action queues with an empty action queue `[]` was not permitted.
+
+Java kit and updated JS, C++ kits are merged in.
 ### v2.0.6
 
 Fix bug where no seed provided to CLI meant no random maps

diff --git a/README.md b/README.md
@@ -45,8 +45,9 @@ The kits folder in this repository holds all of the available starter kits you c
 <!-- - [Reinforcement Learning (Python)](https://github.com/Lux-AI-Challenge/Lux-Design-S2/tree/main/kits/rl-sb3/) and [Reinforcement Learning (Python + Jax Env)](https://github.com/Lux-AI-Challenge/Lux-Design-S2/tree/main/kits/rl-sb3-jax-env/) -->
 - [C++](https://github.com/Lux-AI-Challenge/Lux-Design-S2/tree/main/kits/cpp/)
 - [Javascript](https://github.com/Lux-AI-Challenge/Lux-Design-S2/tree/main/kits/js/)
+- [Java](https://github.com/Lux-AI-Challenge/Lux-Design-S2/tree/main/kits/java/)
 - Typescript - TBA
-- Java - TBA
+- [Go](https://github.com/rooklift/golux2/) - (currently a bare-bones kit external to this repository)
 
 Want to use another language but it's not supported? Feel free to suggest that language to our issues or even better, create a starter kit for the community to use and make a PR to this repository. See our [CONTRIBUTING.md](https://github.com/Lux-AI-Challenge/Lux-Design-S2/tree/main/CONTRIBUTING.md) document for more information on this.
 
@@ -72,7 +73,9 @@ We are proud to announce our sponsors [QuantCo](https://quantco.com/), [Regressi
 
 We like to extend thanks to some of our early core contributors: [@duanwilliam](https://github.com/duanwilliam) (Frontend), [@programjames](https://github.com/programjames) (Map generation, Engine optimization), and [@themmj](https://github.com/themmj) (C++ kit, Engine optimization).
 
-We further like to extend thanks to some of our core contributors during the beta period: [@LeFiz](https://github.com/LeFiz) (Game Design/Architecture), [@jmerle](https://github.com/jmerle) (Visualizer), .
+We further like to extend thanks to some of our core contributors during the beta period: [@LeFiz](https://github.com/LeFiz) (Game Design/Architecture), [@jmerle](https://github.com/jmerle) (Visualizer).
+
+We further like to thank the following contributors during the official competition: [@paradite](https://github.com/paradite) (JS Kit), [@MountainOrc](https://github.com/MountainOrc) (Java Kit), [@ArturBloch](https://github.com/ArturBloch) (Java Kit), [@rooklift](https://github.com/rooklift) (Go Kit).
 
 
 ## Citation

diff --git a/kits/README.md b/kits/README.md
@@ -38,11 +38,11 @@ For the JS kit, forward simulation is possible by setting the `FORWARD_SIM` valu
 
 In each episode there are two competing teams, both of which control factories and units.
 
-In the early phase, the action space is different than the actual game phase. See the starter kit codes (agent.py file) for how they are different.
+In the early phase, the action space differs from that of the normal game phase. See the starter kit code (`agent.py` file) for how they differ.
 
-During the actual game phase, factories have 3 possible actions, `build_light`, `build_heavy`, and `water`. Units/Robots (light or heavy robots) have 5 possible actions: `move`, `dig`, `transfer`, `pickup`, `self_destruct`, `recharge`; where `move, dig, self_destruct` have power costs
+During the normal game phase, factories have 3 possible actions: `build_light`, `build_heavy`, and `water`. Units/Robots (light or heavy robots) have 6 possible actions: `move`, `dig`, `transfer`, `pickup`, `self_destruct`, and `recharge`, of which `move`, `dig`, and `self_destruct` have power costs.
 
-In Lux AI Season 2, the robots's actual action space is a list of actions representing it's action queue and your agent will set this action queue to control robots. This action queue max size is `env_cfg.UNIT_ACTION_QUEUE_SIZE`. Each turn, the unit executes the action at the front of the queue, and repeatedly does this a user-specified `n` times until exhausted. If the action is marked as to be repeated, it is replaced to the back of the queue.
+In Lux AI Season 2, the robot's actual action space is a list of actions representing its action queue, and your agent sets this action queue to control robots. The maximum size of this action queue is `env_cfg.UNIT_ACTION_QUEUE_SIZE`. Each turn, the unit executes the action at the front of the queue, and does so a user-specified `n` times. An execution counts towards `n` only when it is successful, so if your robot runs out of power or lacks a resource to transfer, that turn is not counted towards `n`. Finally, each action can specify a `repeat` value. If `repeat == 0`, the action is removed after `n` executions. If `repeat > 0`, the action is instead recycled to the back of the queue with `n = repeat`.
 
 In code, actions can be given to units as so
 
@@ -52,7 +52,7 @@ actions[unit_id] = [action_0, action_1, ...]
 
 Importantly, whenever you submit a new action queue, the unit incurs an additional power cost to update the queue of `env_cfg.ROBOTS[<robot_type>].ACTION_QUEUE_POWER_COST` power. While you can still compete by submitting a action queue with a single action to every unit (like most environments and Lux AI Season 1), this is power inefficient and would be disadvantageous. Lights consume 1 power and Heavies consume 10 power to update their action queue,
 
-See the example code in the corresponding `agent.py` file for how to give actions, how to set them to repeat or not, and the various utility functions to validate if an action is possible or not (e.g. does the unit have enough power to perform an action).
+See the example code in the corresponding `agent.py` file for how to give actions, how to set their `n` and `repeat` values to control execution count and recycling, and the various utility functions to validate if an action is possible or not (e.g. does the unit have enough power to perform an action).
 
 ## Environment Observations
 
@@ -72,7 +72,7 @@ The general observation given to your bot in the kits will look like below. `Arr
           "unit_type": "LIGHT" or "HEAVY",
           "pos": Array(2),
           "cargo": { "ice": int, "ore": int, "water": int, "metal": int },
-          "action_queue": Array(N, 5)
+          "action_queue": Array(N, 6)
         }
       }
     },
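Review note (kits/README.md): the `n` / `repeat` wording in the hunks above can be made concrete by building a queue by hand. This is a minimal sketch that assumes the six-slot layout described in the `act_space.py` comments (`[type, direction, resource, amount, repeat, n]`). The helpers `move` and `dig` and the unit id `unit_0` are hypothetical and not part of any kit.

```python
import numpy as np

# Slot layout taken from the comments in act_space.py:
# [action type, direction, resource, amount, repeat, n]
MOVE, DIG = 0, 3  # action type ids (0 = move, 3 = dig)


def move(direction: int, repeat: int = 0, n: int = 1) -> np.ndarray:
    """Hypothetical helper: encode a move action as a 6-element vector."""
    return np.array([MOVE, direction, 0, 0, repeat, n])


def dig(repeat: int = 0, n: int = 1) -> np.ndarray:
    """Hypothetical helper: encode a dig action as a 6-element vector."""
    return np.array([DIG, 0, 0, 0, repeat, n])


# Move right (direction 2) twice and then drop the action; dig 5 times,
# then recycle the dig to the back of the queue with n set to 3.
queue = [move(direction=2, n=2), dig(repeat=3, n=5)]
actions = {"unit_0": queue}  # "unit_0" is a made-up unit id
```

Submitting one longer queue like this, rather than a fresh single-action queue every turn, avoids paying `ACTION_QUEUE_POWER_COST` on each step.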

diff --git a/luxai_s2/luxai_runner/logger.py b/luxai_s2/luxai_runner/logger.py
@@ -1,5 +1,3 @@
-from turtle import color
-
 from luxai_s2.globals import TERM_COLORS
 
 try:

diff --git a/luxai_s2/luxai_s2/actions.py b/luxai_s2/luxai_s2/actions.py
@@ -22,7 +22,7 @@ def __init__(self, act_type: str) -> None:
         self.act_type = act_type
         self.n = 1  # number of times to execute the action
         self.power_cost = 0
-        self.repeat = False  # whether to put action to the back of the action queue
+        self.repeat = 0  # if 0, no recycling. If > 0, we recycle action and set n equal to this.
 
     def state_dict(self):
         raise NotImplementedError("")
@@ -62,7 +62,7 @@ def __str__(self) -> str:
 
 class MoveAction(Action):
     def __init__(
-        self, move_dir: int, dist: int = 1, repeat: bool = False, n: int = 1
+        self, move_dir: int, dist: int = 1, repeat: int = 0, n: int = 1
     ) -> None:
         super().__init__("move")
         # a[1] = direction (0 = center, 1 = up, 2 = right, 3 = down, 4 = left)
@@ -85,7 +85,7 @@ def __init__(
         transfer_dir: int,
         resource: int,
         transfer_amount: int,
-        repeat: bool = False,
+        repeat: int = 0,
         n: int = 1,
     ) -> None:
         super().__init__("transfer")
@@ -105,7 +105,7 @@ def state_dict(self):
                 self.resource,
                 self.transfer_amount,
                 self.repeat,
-                self.n,
+                self.n
             ]
         )
 
@@ -115,7 +115,7 @@ def __str__(self) -> str:
 
 class PickupAction(Action):
     def __init__(
-        self, resource: int, pickup_amount: int, repeat: bool = False, n: int = 1
+        self, resource: int, pickup_amount: int, repeat: int = 0, n: int = 1
     ) -> None:
         super().__init__("pickup")
         # a[2] = R = resource type (0 = ice, 1 = ore, 2 = water, 3 = metal, 4 power)
@@ -133,7 +133,7 @@ def __str__(self) -> str:
 
 
 class DigAction(Action):
-    def __init__(self, repeat: bool = False, n: int = 1) -> None:
+    def __init__(self, repeat: int = 0, n: int = 1) -> None:
         super().__init__("dig")
         self.repeat = repeat
         self.n = n
@@ -147,7 +147,7 @@ def __str__(self) -> str:
 
 
 class SelfDestructAction(Action):
-    def __init__(self, repeat: bool = False, n: int = 1) -> None:
+    def __init__(self, repeat: int = 0, n: int = 1) -> None:
         super().__init__("self_destruct")
         self.repeat = repeat
         self.n = n
@@ -161,7 +161,7 @@ def __str__(self) -> str:
 
 
 class RechargeAction(Action):
-    def __init__(self, power: int, repeat: bool = False, n: int = 1) -> None:
+    def __init__(self, power: int, repeat: int = 0, n: int = 1) -> None:
         super().__init__("recharge")
         self.power = power
         self.repeat = repeat
@@ -191,19 +191,21 @@ def format_action_vec(a: np.ndarray):
     # (0 = move, 1 = transfer X amount of R, 2 = pickup X amount of R, 3 = dig, 4 = self destruct, 5 = recharge X)
     a_type = a[0]
     if a_type == 0:
-        return MoveAction(a[1], dist=1, repeat=a[4], n=a[5])
+        act = MoveAction(a[1], dist=1, repeat=a[4], n=a[5])
     elif a_type == 1:
-        return TransferAction(a[1], a[2], a[3], repeat=a[4], n=a[5])
+        act = TransferAction(a[1], a[2], a[3], repeat=a[4], n=a[5])
     elif a_type == 2:
-        return PickupAction(a[2], a[3], repeat=a[4], n=a[5])
+        act = PickupAction(a[2], a[3], repeat=a[4], n=a[5])
     elif a_type == 3:
-        return DigAction(repeat=a[4], n=a[5])
+        act = DigAction(repeat=a[4], n=a[5])
     elif a_type == 4:
-        return SelfDestructAction(repeat=a[4], n=a[5])
+        act = SelfDestructAction(repeat=a[4], n=a[5])
     elif a_type == 5:
-        return RechargeAction(a[3], repeat=a[4], n=a[5])
+        act = RechargeAction(a[3], repeat=a[4], n=a[5])
     else:
         raise ValueError(f"Action {a} is invalid type, {a[0]} is not valid")
+    return act
+
 
 
 # a[1] = direction (0 = center, 1 = up, 2 = right, 3 = down, 4 = left)
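Review note (actions.py): `repeat` changes from `bool` to `int` in these constructors. Callers that still pass `True` directly would keep working, since Python's `bool` is a subclass of `int`, but the recycled `n` would then hold a boolean rather than a plain integer. A minimal sketch of coercing the value at the call site follows; `normalize_repeat` is a hypothetical helper and not part of `luxai_s2`.

```python
def normalize_repeat(repeat) -> int:
    """Hypothetical helper: coerce legacy bool flags to the new int form.

    False -> 0 (no recycling), True -> 1 (recycle with n reset to 1, which
    matches the behaviour before this change); ints pass through unchanged.
    """
    value = int(repeat)
    if value < 0:
        raise ValueError(f"repeat must be >= 0, got {value}")
    return value
```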

diff --git a/luxai_s2/luxai_s2/env.py b/luxai_s2/luxai_s2/env.py
@@ -613,7 +613,7 @@ def _handle_movement_actions(self, actions_by_type: ActionsByType):
                 continue
             if len(heavy_entered_pos[pos_hash]) > 1:
                 # all units collide, find the top 2 units by power
-                (most_power_unit, next_most_power_unit) = get_top_two_power_units(units)
+                (most_power_unit, next_most_power_unit) = get_top_two_power_units(units, UnitType.HEAVY)
                 if most_power_unit.power == next_most_power_unit.power:
                     # tie, all units break
                     for u in units:
@@ -670,7 +670,7 @@ def _handle_movement_actions(self, actions_by_type: ActionsByType):
                         (
                             most_power_unit,
                             next_most_power_unit,
-                        ) = get_top_two_power_units(units)
+                        ) = get_top_two_power_units(units, UnitType.LIGHT)
                         if most_power_unit.power == next_most_power_unit.power:
                             # tie, all units break
                             for u in units:
@@ -810,7 +810,7 @@ def step(
                                 continue
                             formatted_actions = []
                             if type(action) == list or (
-                                type(action) == np.ndarray and len(action.shape) == 2
+                                type(action) == np.ndarray and (len(action.shape) == 2 or len(action) == 0)
                             ):
                                 trunked_actions = action[
                                     : self.env_cfg.UNIT_ACTION_QUEUE_SIZE
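Review note (env.py): the extra `len(action) == 0` clause matters because an empty NumPy array is one-dimensional, so the old `len(action.shape) == 2` test did not treat it as a queue. A small sketch of the updated condition for ndarray inputs; `is_queue` is a hypothetical name used only for illustration.

```python
import numpy as np

empty = np.array([])                    # an empty queue, used to clear actions
single = np.array([0, 2, 0, 0, 0, 1])   # one action vector, 1-D
batch = np.array([[0, 2, 0, 0, 0, 1],
                  [3, 0, 0, 0, 0, 1]])  # a queue of two actions, 2-D


def is_queue(action: np.ndarray) -> bool:
    """Mirror of the updated ndarray condition in step()."""
    return len(action.shape) == 2 or len(action) == 0
```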

diff --git a/luxai_s2/luxai_s2/spaces/act_space.py b/luxai_s2/luxai_s2/spaces/act_space.py
@@ -58,6 +58,9 @@ def contains(self, x: Any) -> bool:
             isinstance(x, np.ndarray) and len(x.shape) == 2 and len(x) > self.max_length
         ):
             return False
+        elif isinstance(x, np.ndarray) and x.shape[0] == 0:
+            # empty action
+            return True
         elif isinstance(x, np.ndarray) and len(x.shape) == 1:
             x = [x]
         # if (not isinstance(x, list) and not isinstance(x, np.ndarray)) or len(x) > self.max_length:
@@ -129,13 +132,14 @@ def get_act_space(
         # a[3] = X, amount of resources transferred or picked up if action is transfer or pickup.
         # If action is recharge, it is how much energy to store before executing the next action in queue
 
-        # a[4] = 0 or 1 (false or true). If True, then action is placed to the back of the action queue after it has been exhausted
+        # a[4] = repeat. If repeat == 0, then action is not recycled and removed once we have executed it a[5] = n times. 
+        # Otherwise if repeat > 0 we recycle this action to the back of the action queue and set n = repeat.
 
-        # a[5] = X, number of times to execute this action before exhausting it and removing it from the front of the action queue. Minimum is 1.
+        # a[5] = n, number of times to execute this action before exhausting it and removing it from the front of the action queue. Minimum is 1.
         act_space[u.unit_id] = ActionsQueue(
             spaces.Box(
                 low=np.array([0, 0, 0, 0, 0, 1]),
-                high=np.array([5, 4, 4, config.max_transfer_amount + 1, 1, 9999]),
+                high=np.array([5, 4, 4, config.max_transfer_amount + 1, 9999, 9999]),
                 shape=(6,),
                 dtype=np.int64,
             ),
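Review note (act_space.py): raising the upper bound of slot 4 from 1 to 9999 is what lets `repeat` carry a count instead of a flag. The sketch below checks a single action vector against the new bounds using plain NumPy. `MAX_TRANSFER_AMOUNT` is an assumed placeholder for `config.max_transfer_amount`, and `in_bounds` is a hypothetical helper, not the `Box.contains` implementation.

```python
import numpy as np

MAX_TRANSFER_AMOUNT = 3000  # assumed value; the real one comes from config

low = np.array([0, 0, 0, 0, 0, 1])
high = np.array([5, 4, 4, MAX_TRANSFER_AMOUNT + 1, 9999, 9999])


def in_bounds(action: np.ndarray) -> bool:
    """Check one 6-element action vector against the low/high bounds."""
    if action.shape != (6,):
        return False
    return bool(np.all((low <= action) & (action <= high)))
```

Under the previous bounds, a vector with `repeat = 25` would have been rejected, since slot 4 was capped at 1.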

diff --git a/luxai_s2/luxai_s2/spaces/obs_space.py b/luxai_s2/luxai_s2/spaces/obs_space.py
@@ -139,11 +139,11 @@ def get_obs_space(config: EnvConfig, agent_names: List[str], agent: int = 0):
             unit_type=UnitTypeSpace(),
         )
         if config.UNIT_ACTION_QUEUE_SIZE != 1:
-            # note, action queue space copied over from act_space.py
+            # same as unit action space
             obs_dict["action_queue"] = ActionsQueue(
                 spaces.Box(
-                    low=np.array([0, 0, 0, 0, 0, 1]),
-                    high=np.array([5, 4, 4, config.max_transfer_amount + 1, 1, 9999]),
+                    low=np.array([0, 0, 0, 0, 0, 1]),
+                    high=np.array([5, 4, 4, config.max_transfer_amount + 1, 9999, 9999]),
                     shape=(6,),
                     dtype=np.int64,
                 ),

diff --git a/luxai_s2/luxai_s2/state/state.py b/luxai_s2/luxai_s2/state/state.py
@@ -2,6 +2,7 @@
 from collections import OrderedDict
 from dataclasses import dataclass, field
 from typing import Dict, List
+
 try:
     from typing import TypedDict    
 except:
@@ -17,7 +18,8 @@
 from luxai_s2.map_generator.generator import GameMap
 from luxai_s2.state.stats import StatsStateDict
 from luxai_s2.team import Team, TeamStateDict
-from luxai_s2.unit import FactionTypes, Unit, UnitCargo, UnitStateDict, UnitType
+from luxai_s2.unit import (FactionTypes, Unit, UnitCargo, UnitStateDict,
+                           UnitType)
 
 
 class SparseBoardStateDict(TypedDict):

diff --git a/luxai_s2/luxai_s2/state/stats.py b/luxai_s2/luxai_s2/state/stats.py
@@ -2,6 +2,7 @@
 from collections import OrderedDict
 from dataclasses import dataclass, field
 from typing import Dict, List
+
 try:
     from typing import TypedDict    
 except:

diff --git a/luxai_s2/luxai_s2/unit.py b/luxai_s2/luxai_s2/unit.py
@@ -2,6 +2,7 @@
 from dataclasses import dataclass
 from enum import Enum
 from typing import List
+
 try:
     from typing import TypedDict    
 except:
@@ -109,8 +110,8 @@ def repeat_action(self, action):
             # remove from front of queue
             self.action_queue.pop(0)
             # endless repeat puts action back at end of queue
-            if action.repeat:
-                action.n = 1
+            if action.repeat > 0:
+                action.n = action.repeat
                 self.action_queue.append(action)
 
     def move_power_cost(self, rubble_at_target: int):
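Review note (unit.py): the change from `action.n = 1` to `action.n = action.repeat` can be traced with a toy queue. This is a sketch under stated assumptions: `Act` is a hypothetical stand-in for the real `Action` class, every execution is treated as successful, and the decrement of `n` is assumed, since only the recycling branch is visible in this hunk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Act:
    """Hypothetical stand-in for Action, keeping only three fields."""
    name: str
    n: int = 1
    repeat: int = 0


def execute_front(queue: List[Act]) -> str:
    """Run the front action once (assumed successful) and update the queue."""
    action = queue[0]
    action.n -= 1  # assumption: one successful execution uses up one count
    if action.n <= 0:
        queue.pop(0)  # exhausted: remove from the front of the queue
        if action.repeat > 0:
            action.n = action.repeat  # recycled with n = repeat
            queue.append(action)
    return action.name


queue = [Act("move", n=1, repeat=2), Act("dig", n=1, repeat=1)]
trace = [execute_front(queue) for _ in range(5)]
# trace == ["move", "dig", "move", "move", "dig"]
```

After its first pass the recycled `move` runs twice per cycle, which is the behaviour the `repeat` count is meant to express.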

diff --git a/luxai_s2/luxai_s2/utils/utils.py b/luxai_s2/luxai_s2/utils/utils.py
@@ -1,28 +1,29 @@
 from typing import List
 
 from luxai_s2.config import EnvConfig
-from luxai_s2.unit import Unit
+from luxai_s2.unit import Unit, UnitType
 
 
 def is_day(config: EnvConfig, env_step):
     return env_step % config.CYCLE_LENGTH < config.DAY_LENGTH
 
 
-def get_top_two_power_units(units: List[Unit]):
+def get_top_two_power_units(units: List[Unit], unit_type: UnitType):
     most_power_unit: Unit = units[0]
     most_power = -1
     next_most_power_unit: Unit = units[1]
     next_most_power = -1
     for u in units:
-        if u.power > most_power:
-            next_most_power_unit = most_power_unit
-            most_power_unit = u
-            most_power = u.power
-        elif (
-            u.power >= next_most_power
-        ):  # >= check since we want to top 2 power units which can tie
-            next_most_power_unit = u
-            next_most_power = u.power
+        if u.unit_type == unit_type:
+            if u.power > most_power:
+                next_most_power_unit = most_power_unit
+                most_power_unit = u
+                most_power = u.power
+            elif (
+                u.power >= next_most_power
+            ):  # >= check since we want to top 2 power units which can tie
+                next_most_power_unit = u
+                next_most_power = u.power
 
     return (most_power_unit, next_most_power_unit)
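Review note (utils.py): the new `unit_type` argument restricts the ranking to units of one weight class, which addresses the changelog case where a light unit with more power sat on the same tile as two heavies. The sketch below expresses the same idea with a filter and a sort. It is an alternative formulation for illustration, not the repository's implementation, and `U` and `top_two_by_power` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class U:
    """Hypothetical stand-in for Unit with only the fields used here."""
    unit_id: str
    unit_type: str  # "LIGHT" or "HEAVY"
    power: int


def top_two_by_power(units: List[U], unit_type: str) -> Tuple[U, U]:
    """Return the two highest-power units of the given type (ties allowed)."""
    same_type = [u for u in units if u.unit_type == unit_type]
    if len(same_type) < 2:
        raise ValueError("need at least two units of the requested type")
    ranked = sorted(same_type, key=lambda u: u.power, reverse=True)
    return ranked[0], ranked[1]


units = [
    U("light_0", "LIGHT", 150),
    U("heavy_0", "HEAVY", 100),
    U("heavy_1", "HEAVY", 100),
]
first, second = top_two_by_power(units, "HEAVY")
```

Here the light unit is ignored and the two heavies are returned as a tie, which is the input the collision code needs when resolving heavy-on-heavy collisions.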
 

diff --git a/luxai_s2/setup.py b/luxai_s2/setup.py
@@ -17,7 +17,7 @@ def read(fname):
     long_description="Code for the Lux AI Challenge Season 2",
     packages=find_packages(exclude="kits"),
     entry_points={"console_scripts": ["luxai-s2 = luxai_runner.cli:main"]},
-    version="2.0.6",
+    version="2.1.0",
     python_requires=">=3.7",
     install_requires=[
         "numpy",