Hello, this is Chris. On November 3rd and 4th, I attended DeepMind & Blizzard’s AI Workshop in Anaheim. I wrote this article during one of the workshop sessions.
In this article, I’m not going to cover reinforcement learning directly. Instead, I’ll walk you through the environment variables of pysc2, the StarCraft II learning environment. If you are interested in DeepMind’s StarCraft II reinforcement learning environment, you need to understand pysc2’s environment variables first, and they are quite complex.
Don’t worry. I’m here for you. Let’s start with the observation variables of pysc2.
Prerequisites
- IntelliJ (or PyCharm)
- Python 3
- StarCraft II
- Git
- This tutorial was written on a Mac.
Tutorial Outline
- StarCraft II Observation Space
- pysc2-examples Star & Fork
- pysc2-examples git clone
- Run the pysc2 application in debug mode to inspect the observation variables.
In this post, I’ll cover just one subject. StarCraft II reinforcement learning consists of observations and actions. An observation is the set of variables that describes the current state of StarCraft II. An action is a command that issues orders in the game.
This post covers the observation features.
Observation
The observation space contains many kinds of state information about the game environment. For example, it includes unit locations, unit health values, and mineral locations.
Atari games have an observation space too. There, the primary observation is the RGB pixels of the game screen, and if you look at those pixels as tensors, they already look quite complicated. The StarCraft II Learning Environment, however, gives you far richer and more complex information than Atari games do.
Spatial/Visual Information
RGB Pixels
Currently, pysc2 does not support RGB pixels.
Feature Layers
pysc2 does not deliver RGB pixels directly; it provides the information as high-level features instead. High-level features are helpful for reinforcement learning because the agent does not have to learn to extract them from raw RGB pixels with a convolutional neural network, which saves a lot of computation. If only RGB pixels were provided, we could not find our unit locations easily.
Currently, the observation space has 20 feature layers: 7 for the minimap and 13 for the screen.
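As a quick sanity check on that count, here is a small sketch that writes down the layer names listed in the minimap and screen sections of this article and adds them up. The constants `MINIMAP_LAYERS` and `SCREEN_LAYERS` are made up for illustration and are not part of pysc2.

```python
# Layer names as listed in this article (illustrative constants, not pysc2 API).
MINIMAP_LAYERS = [
    "height_map", "visibility", "creep", "camera",
    "player_id", "player_relative", "selected",
]
SCREEN_LAYERS = [
    "height_map", "visibility", "creep", "power", "player_id",
    "player_relative", "unit_type", "selected", "hit_points",
    "energy", "shields", "unit_density", "unit_density_aa",
]

total_layers = len(MINIMAP_LAYERS) + len(SCREEN_LAYERS)
print(len(MINIMAP_LAYERS), len(SCREEN_LAYERS), total_layers)  # 7 13 20
```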
minimap
pysc2 provides the minimap information as several feature layers.
height_map : Shows the terrain levels.
visibility : Shows which parts of the map are hidden, have been seen, or are currently visible.
creep : Shows which parts of the minimap are covered by creep.
camera : Shows which part of the minimap is visible in the screen layers.
player_id : Shows who owns the units, with absolute ids.
player_relative : Shows which units are friendly, neutral, or hostile. 0 means background, 1 is self, 2 is ally, 3 is neutral, and 4 is hostile.
selected : Shows which units are currently selected. Pixels that belong to a selected unit have the value 1.
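To make the `player_relative` encoding concrete, here is a minimal sketch that counts how many pixels of a layer fall into each category. The 4 x 6 grid is invented for illustration (real minimap layers are larger and come from the environment), and `count_categories` is a hypothetical helper, not a pysc2 function.

```python
from collections import Counter

# Meaning of each player_relative value, as described above.
PLAYER_RELATIVE_NAMES = {
    0: "background", 1: "self", 2: "ally", 3: "neutral", 4: "hostile",
}

# A made-up 4 x 6 minimap layer; real layers are much larger.
toy_minimap = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 4],
    [0, 1, 0, 3, 0, 4],
    [0, 0, 0, 0, 0, 0],
]

def count_categories(layer):
    """Count how many pixels fall into each player_relative category."""
    counts = Counter(value for row in layer for value in row)
    return {PLAYER_RELATIVE_NAMES[v]: n for v, n in sorted(counts.items())}

category_counts = count_categories(toy_minimap)
print(category_counts)
# {'background': 18, 'self': 3, 'neutral': 1, 'hostile': 2}
```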
screen
The screen layers provide higher-resolution information about what is currently on screen.
height_map : Shows the terrain levels.
visibility : Shows which parts of the map are hidden, have been seen, or are currently visible.
creep : Shows which parts of the screen are covered by creep.
power : Which parts have Protoss power. Only shows your own power.
player_id : Who owns the units, with absolute ids.
player_relative : Which units are friendly vs hostile. Takes values in [0, 4], denoting [background, self, ally, neutral, enemy] units respectively.
unit_type : A unit type id
selected : Which units are selected.
hit_points : How many hit points the unit has.
energy : How much energy the unit has.
shields : How many shield points the unit has. Only for Protoss units.
unit_density : How many units are in this pixel.
unit_density_aa : An anti-aliased version of unit_density with a maximum of 16 per unit per pixel. This gives you sub-pixel unit location and size.
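Because every screen layer covers the same pixels, layers can be combined. The sketch below uses `player_relative` as a mask over `hit_points` to find the weakest friendly pixel. Both grids are invented toy data, and `friendly_hit_points` is a hypothetical helper written for this example.

```python
SELF = 1  # player_relative value for our own units

# Made-up 3 x 4 screen layers; real ones come from the environment.
player_relative = [
    [0, 1, 0, 4],
    [0, 1, 0, 4],
    [0, 0, 0, 0],
]
hit_points = [
    [0, 45, 0, 30],
    [0, 20, 0, 30],
    [0, 0, 0, 0],
]

def friendly_hit_points(relative_layer, hp_layer):
    """Return (row, col, hit_points) for every pixel that belongs to us."""
    result = []
    for r, (rel_row, hp_row) in enumerate(zip(relative_layer, hp_layer)):
        for c, (rel, hp) in enumerate(zip(rel_row, hp_row)):
            if rel == SELF:
                result.append((r, c, hp))
    return result

friendly = friendly_hit_points(player_relative, hit_points)
weakest = min(friendly, key=lambda item: item[2])
print(friendly)  # [(0, 1, 45), (1, 1, 20)]
print(weakest)   # (1, 1, 20)
```

Note that a unit usually covers several pixels, so these are per-pixel values rather than a per-unit list.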
You can find the detailed observation features on this GitHub page.
https://github.com/deepmind/pysc2/blob/master/docs/environment.md
Practice
The documentation alone may not make this concrete. So I prepared a simple exercise to help you understand the internal observation data structures.
1. pysc2-examples Star & Fork
First of all, let’s go to the pysc2-examples GitHub repo.
https://github.com/chris-chris/pysc2-examples
Star it! :)
Fork it! :)
2. pysc2-examples git clone or git pull
Please clone my GitHub project, or update it with git pull if you already have it. :)
git clone https://github.com/chris-chris/pysc2-examples
If you already cloned it, please run git pull to update it to the latest code.
git pull
3. Check out the observation feature values using Debug mode
As we covered in tutorial 1, we are going to check the observation values using IntelliJ (or PyCharm) in debug mode.
Let’s open the project in IntelliJ and open the file tests/scripted_test.py.
After opening scripted_test.py, click the gutter to the left of line 39, just below the # Break Point!! comment.
If you click the area marked by the red box next to line 39, a red circle appears.
Okay, we are now ready to see the internal feature data using debug mode.
Right-click the scripted_test.py file and click [Debug 'Unittests in scripte...].
Yes! StarCraft II is running!
After a moment, you will see StarCraft II pause almost as soon as it starts. This is because of the breakpoint we set.
Now we can freely inspect the observation feature values. Let’s check them out in the Debug Console below.
First, let’s check out the screen field of the feature layers. Inside the screen feature layers, we’ll look at the player_relative layer, which tells us where the hostile, friendly, and neutral units are.
In debug mode, you can inspect internal values by clicking the triangles, as shown below. Let’s click the triangle to the left of the obs variable.
Let’s go deeper. Click the triangles to the left of 0 and observation. Now we are at this point:
obs[0].observation
Next, click the triangles to the left of screen and [0:13]. Now we are at this stage:
obs[0].observation["screen"]
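The path you just expanded in the debugger can be mimicked with plain Python objects. This is only a sketch: `FakeTimeStep` is a made-up stand-in, its `step_type`, `reward`, and `discount` fields are assumptions added for illustration, and the layers are lists of zeros with an assumed 64 x 64 size rather than real game data.

```python
from collections import namedtuple

# Stand-in for the object held in obs[0]; NOT the real pysc2 class.
FakeTimeStep = namedtuple(
    "FakeTimeStep", ["step_type", "reward", "discount", "observation"]
)

def blank_layer(size):
    """Build a size x size grid filled with zeros."""
    return [[0] * size for _ in range(size)]

# 13 screen layers, matching the [0:13] shown in the debugger.
fake_screen = [blank_layer(64) for _ in range(13)]
obs = [
    FakeTimeStep(
        step_type=0, reward=0, discount=1.0,
        observation={"screen": fake_screen},
    )
]

# Same access path as in the debugger: list -> attribute -> dict key.
screen = obs[0].observation["screen"]
print(len(screen), len(screen[0]), len(screen[0][0]))  # 13 64 64
```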
Currently, there are 13 screen feature layers. Of these, we are going to access the 6th one, the player_relative layer, whose index is 5. So let’s dig into the player_relative layer by clicking its triangle, and then click the triangle next to [0:64].
player_relative : Which units are friendly vs hostile. Takes values in [0, 4], denoting [background, self, ally, neutral, enemy] units respectively.
obs[0].observation["screen"][5]
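A bare index like 5 is easy to get wrong, so it can help to look layers up by name. The sketch below assumes the screen layers arrive in the order listed earlier in this article. `SCREEN_LAYERS` and `get_layer` are hypothetical names for this example; pysc2’s own `features` module carries similar name-to-index information, so check there before hard-coding numbers in real code.

```python
# Screen layer order as listed in this article (an assumption for this sketch).
SCREEN_LAYERS = [
    "height_map", "visibility", "creep", "power", "player_id",
    "player_relative", "unit_type", "selected", "hit_points",
    "energy", "shields", "unit_density", "unit_density_aa",
]

PLAYER_RELATIVE_INDEX = SCREEN_LAYERS.index("player_relative")
print(PLAYER_RELATIVE_INDEX)  # 5

def get_layer(screen, name):
    """Pick one layer out of the stacked screen layers by its name."""
    return screen[SCREEN_LAYERS.index(name)]

# Toy stack: 13 layers of size 1 x 1, each holding its own index.
toy_screen = [[[i]] for i in range(len(SCREEN_LAYERS))]
picked = get_layer(toy_screen, "player_relative")
print(picked)  # [[5]]
```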
Finally, we have the feature layer I wanted to show you. You can see the marines inside the 64 x 64 layer.
You can also see the hostile units, marked as 4, on the right side.
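What you read off the debugger by eye can also be computed. The sketch below finds the pixel coordinates of friendly (1) and hostile (4) values in a small invented layer and checks which side of the screen the enemies are on. The 5 x 8 grid stands in for the real 64 x 64 layer, and `locate` and `mean_x` are hypothetical helpers.

```python
SELF, HOSTILE = 1, 4

# Made-up 5 x 8 player_relative layer: our units on the left, enemies on the right.
toy_layer = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 4, 0],
    [0, 1, 0, 0, 0, 0, 4, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

def locate(layer, value):
    """Return (x, y) coordinates of every pixel equal to value."""
    return [
        (x, y)
        for y, row in enumerate(layer)
        for x, cell in enumerate(row)
        if cell == value
    ]

def mean_x(points):
    """Average horizontal position of a list of (x, y) points."""
    return sum(x for x, _ in points) / len(points)

marines = locate(toy_layer, SELF)
enemies = locate(toy_layer, HOSTILE)
screen_width = len(toy_layer[0])
enemies_on_right = mean_x(enemies) > screen_width / 2

print(marines)           # [(1, 1), (1, 2), (1, 3)]
print(enemies)           # [(6, 1), (6, 2)]
print(enemies_on_right)  # True
```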
The essence of today’s tutorial is clear.
Tip: Use debug mode to understand the observation variables.
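When a debugger is not at hand, a small print helper can give a similar overview. This is a sketch under the assumption that each layer is a 2D grid of integers. `summarize_layers` is a hypothetical helper and the two tiny layers are invented data.

```python
def summarize_layers(layers, names):
    """Return one line per layer: name, size, and the distinct values it holds."""
    lines = []
    for name, layer in zip(names, layers):
        height, width = len(layer), len(layer[0])
        values = sorted({cell for row in layer for cell in row})
        lines.append(f"{name}: {height}x{width}, values={values}")
    return lines

# Two made-up 2 x 2 layers, just to show the output format.
toy_names = ["height_map", "player_relative"]
toy_layers = [
    [[0, 0], [0, 0]],
    [[1, 0], [0, 4]],
]

summary = summarize_layers(toy_layers, toy_names)
for line in summary:
    print(line)
# height_map: 2x2, values=[0]
# player_relative: 2x2, values=[0, 1, 4]
```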
Okay, guys. You did a good job. In the next tutorial, I’ll cover the action space of the pysc2 environment.