StarCraft II RL Tutorial 2

DeepMind’s pysc2: Observations

Posted by Chris Hoyean Song on November 6, 2017

Hello, this is Chris. On November 3rd and 4th, I attended DeepMind & Blizzard’s AI Workshop in Anaheim. This article was written during a workshop session.

In this article, I’m not going to cover reinforcement learning directly. Instead, I’m going to walk you through the variables of pysc2, the StarCraft II learning environment. If you are interested in DeepMind’s StarCraft II reinforcement learning environment, you need to understand pysc2’s environment variables first, and they are quite complex.

Don’t worry. I’m here for you. Let’s start with pysc2’s observation variables.

Prerequisites

  • IntelliJ (or PyCharm)
  • Python 3
  • StarCraft II
  • Git
  • This tutorial was written on a Mac.

Tutorial Outline

  • StarCraft II Observation Space
  • pysc2-examples Star & Fork
  • pysc2-examples git clone
  • Run pysc2 application in Debug mode to check out observation variables.

In this post, I’ll cover just one subject. StarCraft II reinforcement learning is built on observations and actions. An observation is the set of variables that describes the current state of StarCraft II. An action is a command that gives orders in the game.

In this post, we will cover the observation features.

Observation

The observation space contains many kinds of state information about the game environment. For example, it includes unit locations, unit health values, and mineral locations.

Of course, Atari games have an observation space too. In Atari games, the primary observations are the RGB pixels of the game. If you look at those RGB pixels as tensors, they already seem quite complicated. However, the StarCraft II Learning Environment offers far more complex information than the Atari games do.

Spatial/Visual Information

RGB Pixels

Currently, pysc2 does not support RGB pixels.

Feature Layers

The pysc2 learning environment does not deliver RGB pixels directly; it provides the information as high-level features instead. High-level features are beneficial for reinforcement learning because the agent does not have to learn to extract them from RGB pixels with a convolutional neural network, which saves a lot of computation. If only RGB pixels were provided, we could not find our unit locations easily.

Currently, the observation space provides 20 feature layers: 7 for the minimap and 13 for the screen.

minimap

pysc2 provides the minimap information as several feature layers.

height_map : It shows the terrain levels.

visibility : It shows which parts of the map are hidden, have been seen, or are currently visible.

creep : It shows which parts of the minimap are covered by creep.

camera : It shows which part of the minimap is visible in the screen layers.

player_id : It shows who owns the units, with absolute ids.

player_relative : It shows which units are friendly, neutral, or hostile. 0 means background, 1 is self, 2 is ally, 3 is neutral, 4 is hostile.

selected : It shows which units are currently selected. If a unit is selected, the value of its pixels is 1.
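To make the player_relative values above a little more concrete, here is a minimal sketch in plain Python. The PlayerRelative enum, the count_categories helper, and the tiny 4 x 6 layer are all made up for illustration; they are not part of pysc2, and real layers come from the running game.

```python
from collections import Counter
from enum import IntEnum


class PlayerRelative(IntEnum):
    """Values of a player_relative layer, as listed in this post."""
    BACKGROUND = 0
    SELF = 1
    ALLY = 2
    NEUTRAL = 3
    HOSTILE = 4


# Made-up 4 x 6 layer for illustration only.
toy_layer = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 4, 0],
    [0, 0, 3, 0, 4, 0],
    [0, 0, 0, 0, 0, 0],
]


def count_categories(layer):
    """Count how many pixels fall into each player_relative category."""
    counts = Counter(value for row in layer for value in row)
    return {PlayerRelative(value).name: n for value, n in counts.items()}


category_counts = count_categories(toy_layer)
print(category_counts)
```

Naming the values this way is just a convenience, so that code reads as SELF or HOSTILE instead of bare 1s and 4s.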

screen

The screen layers provide higher-resolution information about the current screen.

height_map : It shows the terrain levels.

visibility : It shows which parts of the map are hidden, have been seen, or are currently visible.

creep : It shows which parts of the screen are covered by creep.

power : Which parts have Protoss power. Only your own power is shown.

player_id : Who owns the units, with absolute ids.

player_relative : Which units are friendly vs hostile. Takes values in [0, 4], denoting [background, self, ally, neutral, enemy] units respectively.

unit_type : A unit type id

selected : Which units are selected.

hit_points : How many hit points the unit has.

energy : How much energy the unit has.

shields : How many shields the unit has. Only for Protoss units.

unit_density : How many units are in this pixel.

unit_density_aa : An anti-aliased version of unit_density with a maximum of 16 per unit per pixel. This gives you sub-pixel unit location and size.
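The two lists above can also be written down as plain tuples, which makes the layer counts and indices explicit. This is only a sketch: it assumes the layers inside the observation are stored in the same order as they are listed in this post, and the names MINIMAP_LAYERS, SCREEN_LAYERS, and layer_index are my own. In pysc2 itself, I believe the same indices can be read from pysc2.lib.features (for example features.SCREEN_FEATURES.player_relative.index), but the sketch does not depend on it.

```python
# Layer names in the order they are listed in this post (assumed to match
# the order of the layers inside the observation).
MINIMAP_LAYERS = (
    "height_map", "visibility", "creep", "camera",
    "player_id", "player_relative", "selected",
)
SCREEN_LAYERS = (
    "height_map", "visibility", "creep", "power", "player_id",
    "player_relative", "unit_type", "selected", "hit_points",
    "energy", "shields", "unit_density", "unit_density_aa",
)


def layer_index(layer_names, name):
    """Return the zero-based position of a named layer."""
    return layer_names.index(name)


total_layers = len(MINIMAP_LAYERS) + len(SCREEN_LAYERS)
screen_player_relative = layer_index(SCREEN_LAYERS, "player_relative")
minimap_player_relative = layer_index(MINIMAP_LAYERS, "player_relative")

print(total_layers, screen_player_relative, minimap_player_relative)
```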

You can check out the detailed observation features on this GitHub page.

https://github.com/deepmind/pysc2/blob/master/docs/environment.md

Practice

The documentation alone may not make this clear, so I prepared a simple exercise for exploring the internal observation data structures.

1. pysc2-examples Star & Fork

First of all, let’s go to the pysc2-examples GitHub repo.

https://github.com/chris-chris/pysc2-examples

Star it! :)

Fork it! :)

2. pysc2-examples git clone or git pull

Please clone my GitHub project, or update it using git pull :)

git clone https://github.com/chris-chris/pysc2-examples

If you already cloned it, please type git pull to update it to the latest code.

git pull

3. Check out the observation feature values using Debug mode

As we covered in tutorial 1, we are going to check the observation values using IntelliJ (or PyCharm) in debug mode.

Let’s open the project in IntelliJ and open the file tests/scripted_test.py

After opening scripted_test.py, click the area to the left of line 39, below the # Break Point!! comment.

When you click there, a red circle (the breakpoint) appears next to line 39.

Okay, we are now ready to see the internal feature data using debug mode.

Right-click the scripted_test.py file and click [Debug 'Unittests in scripte...]

Yes! StarCraft II has launched!

After a moment, you will see StarCraft II pause as soon as it starts. This is because of the breakpoint we set.

Now we can freely inspect the observation feature values. Let’s check them out in the Debug Console.

First, let’s check out the screen field of the feature layers. Inside the screen feature layers, we’ll look at the player_relative layer, which tells us where the hostile, friendly, and neutral units are.

In debug mode, you can inspect internal values by clicking the triangle next to a variable. Let’s click the triangle on the left side of the obs variable.

Now let’s go deeper. Click the triangles on the left side of 0 and then observation. Okay, now we are at this state.

obs[0].observation

Next, let’s click the triangles on the left side of screen and [0:13]. Alright, we are at this stage.

obs[0].observation["screen"]

Currently, we have 13 screen feature layers. Among them, we are going to access the 6th layer, which is the player_relative layer. Because indexing starts at 0, the index of the player_relative layer is 5. So, let’s dig into the player_relative layer by clicking its triangle, and then click the triangle of [0:64].

player_relative : Which units are friendly vs hostile. Takes values in [0, 4], denoting [background, self, ally, neutral, enemy] units respectively.

obs[0].observation["screen"][5]
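If you want to get a feel for this nesting without launching the game, here is a small stand-in written in plain Python. FakeTimeStep and make_fake_obs are hypothetical names of mine, and the layers are filled with zeros; the only things taken from the debugger session above are the shape (13 layers of 64 x 64) and the index 5.

```python
from collections import namedtuple

# Hypothetical stand-in for the object seen in the debugger: only the
# `observation` field is modelled here.
FakeTimeStep = namedtuple("FakeTimeStep", ["observation"])

NUM_SCREEN_LAYERS = 13   # number of screen layers seen in the debugger
SCREEN_SIZE = 64         # each layer is 64 x 64 in this tutorial
PLAYER_RELATIVE = 5      # index of the player_relative layer


def make_fake_obs():
    """Build a list with one fake time step whose screen layers are all zero."""
    screen = [
        [[0] * SCREEN_SIZE for _ in range(SCREEN_SIZE)]
        for _ in range(NUM_SCREEN_LAYERS)
    ]
    return [FakeTimeStep(observation={"screen": screen})]


obs = make_fake_obs()
screen = obs[0].observation["screen"]
player_relative = screen[PLAYER_RELATIVE]

print(len(screen), len(player_relative), len(player_relative[0]))
```

The access path is the same one we clicked through in the debugger: first element of obs, then observation, then screen, then layer 5.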

Finally, we have the feature layer I wanted to show you. You can see the marines inside the 64 x 64 layer.

Also, you can see the hostile units marked as 4 on the right side.
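Reading the values by eye in the debugger works, but the same lookup can be sketched in a few lines of Python. The layer below is made up (two friendly pixels on the left, one hostile pixel on the right), find_pixels is a hypothetical helper, and treating the first index as the row is my assumption.

```python
SELF, HOSTILE = 1, 4  # player_relative values from this post


def find_pixels(layer, value):
    """Return (row, column) pairs of every pixel equal to `value`."""
    return [
        (r, c)
        for r, row in enumerate(layer)
        for c, pixel in enumerate(row)
        if pixel == value
    ]


# Made-up 64 x 64 layer for illustration; real values come from the game.
layer = [[0] * 64 for _ in range(64)]
layer[30][10] = SELF
layer[31][10] = SELF
layer[30][50] = HOSTILE

own_pixels = find_pixels(layer, SELF)
enemy_pixels = find_pixels(layer, HOSTILE)
print(own_pixels, enemy_pixels)
```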

The essence of today’s tutorial is clear.

Tip: Use debug mode to understand the observation variables.

Okay, guys. You did a good job. In the next tutorial, I’ll cover the action space of the pysc2 environment.