HunterAI: A Devlog on Building an Intelligent Agent
Project Devlog

# Introduction

This devlog documents my attempt to build a reasonably complex AI. Each part of this page describes how I developed the AI program and extended it beyond the previous version.

# Latest State

On 28 January 2020, I created a reflex agent with internal state in an imperfect-information environment. Specifically, I created a top-down shooter AI with a limited vision range. Using a common pathfinding algorithm (in this case, BFS), the agent can efficiently discover monsters and move to the nearest shooting position.

# Part 1: Creating a simple Intelligent Agent

Written on 2019-09-16

## Today's Goal

For the first part, I will try to make an intelligent agent that can hunt monsters. I name this agent Hunter. (In the future, this version will be referred to as Alexander.)

## Defining "Agent"

Before we talk about building an agent, let's understand what an "agent" actually is. How can we build something if we do not know what it is?
What is an agent?
An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. It runs in cycles of perceiving, thinking, and acting.
What is an intelligent agent?
An intelligent agent is an agent that acts in a manner that makes it as successful as it can be. It acts in a way that is expected to maximise its performance measure, given the evidence provided by what it has perceived and whatever built-in knowledge it has. The performance measure defines the criterion of success for an agent.

## Designing an Intelligent Agent

Now that we understand what an intelligent agent is, how do we start designing one?
When designing an intelligent agent, we can describe it by defining PEAS descriptors:

- Performance: how we measure the agent's achievements
- Environment: what the agent is interacting with
- Actuator: what produces the outputs of the agent
- Sensor: what provides the inputs to the agent

Alternatively, we can describe the agent by defining PAGE descriptors:

- Percepts: the inputs to our system
- Actions: the outputs of our system
- Goals: what the agent is expected to achieve
- Environment: what the agent is interacting with
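To make the PEAS idea concrete, here is a minimal sketch of a descriptor written down as plain data. The `PEAS` class and the values filled in for the Hunter are my own illustrative assumptions, not part of the project code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PEAS:
    """Hypothetical container for the four PEAS descriptors."""
    performance: str
    environment: str
    actuators: List[str]
    sensors: List[str]


# Assumed example values for the Hunter agent (illustrative only).
hunter_peas = PEAS(
    performance="number of monsters shot",
    environment="a battlefield that may contain a monster",
    actuators=["gun"],
    sensors=["monster visibility detector"],
)

print(hunter_peas.sensors)
```

Writing the descriptor down first keeps the later code honest: every sensor should show up in a percept, and every actuator in an action.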
Formally, we can define the structure of an agent as:

$Agent = Architecture + Program$

where:

- Architecture: a device with sensors and actuators.
- Agent Program: a program implementing the agent function $f$ on top of an architecture, where the agent function maps a sequence of percepts $P$ to an action $A$:

$f(P) = A$
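As a rough illustration of $f(P) = A$, the sketch below implements an agent function as a lookup table keyed by the whole percept sequence seen so far. The function name, the string percepts, and the table contents are assumptions made up for this example.

```python
from typing import Callable, Dict, Optional, Tuple

PerceptSequence = Tuple[str, ...]


def make_table_driven_agent(
    table: Dict[PerceptSequence, str]
) -> Callable[[str], Optional[str]]:
    """Return an agent function that looks up the full percept history."""
    percepts = []  # grows by one percept per call

    def agent_function(percept: str) -> Optional[str]:
        percepts.append(percept)
        # f(P) = A: the key is the entire sequence, not just the last percept.
        return table.get(tuple(percepts))

    return agent_function


# Hypothetical table: shoot whenever the latest percept is a monster.
example_table = {
    ("monster",): "shoot",
    ("nothing", "monster"): "shoot",
}

agent = make_table_driven_agent(example_table)
first_action = agent("nothing")   # no entry for ("nothing",)
second_action = agent("monster")  # entry for ("nothing", "monster")
```

A table like this grows very quickly with the length of the percept sequence, which is one reason to replace it with a compact agent program.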

## Agent Interaction Concept

I propose the following concept of agent interaction:

- An agent is represented as $\langle program, architecture \rangle$.
- The architecture is responsible for perceiving the environment through sensors and acting upon it through actuators.
- A percept perceived by the architecture is passed to the agent program. According to the agent function, the agent program maps the percept to an action, which is then passed back to the architecture.
- The environment is a representation that is perceived and acted upon by the architecture.
- An action is represented as $\langle id \rangle$, where id identifies the type of the action.
- A percept is represented as $\langle state \rangle$, where state is a key-value map containing the information perceived by the sensors.

Let's start by writing some simple abstract code that implements the above concepts.
concepts.py
```python
class Environment:
    pass


class Action:
    def __init__(self, id):
        self.id = id  # identifies the type of the action


class Percept:
    def __init__(self, state):
        self.state = state  # key-value map of what the sensors perceived


class Architecture:
    def perceive(self, environment):
        raise NotImplementedError()

    def act(self, environment, action):
        raise NotImplementedError()


class Program:
    def process(self, percept):
        raise NotImplementedError()


class Agent:
    def __init__(self, program, architecture):
        self.program = program
        self.architecture = architecture

    def step(self, environment):
        # One perceive-think-act cycle.
        percept = self.architecture.perceive(environment)
        action = self.program.process(percept)
        if action:
            self.architecture.act(environment, action)
```

## Let's start simple

Let's try implementing our Hunter as a simple reflex agent. Because this is our first attempt, let's simplify things:

- There is one monster on the battlefield.
- We only care about the monster's visibility, not its position.
- When a monster is visible, the hunter reacts by shooting it.
- When a monster gets hit, it disappears.

I know this looks oversimplified and very easy, but it is an important foundation to build on.
First, let's define our environment. According to our simplification, there is only one state attribute: monster_visible.
environment.py
```python
from .concepts import Environment


class HunterEnvironment(Environment):
    def __init__(self):
        self.monster_visible = False
```
Next, let's define our Hunter's agent program. According to our simplification, when the agent perceives that a monster is visible, it performs the shoot action.
program.py

```python
from .concepts import Action, Program


class HunterProgram(Program):
    def process(self, percept):
        # Simple reflex rule: shoot if and only if a monster is visible.
        if percept.state['monster_visible'] is True:
            return Action('shoot')
        else:
            return None
```
Next, let's define our architecture. Perceiving is simple: it checks the monster_visible attribute of the environment. If the shoot action is acted upon this environment, the monster disappears, which sets monster_visible to False.
architecture.py

```python
from .concepts import Architecture, Percept
import logging


class HunterArchitecture(Architecture):
    def perceive(self, environment):
        return Percept({'monster_visible': environment.monster_visible})

    def act(self, environment, action):
        if action.id == 'shoot':
            logging.debug('Pew! Shooting monster')
            # The monster disappears once it is hit.
            environment.monster_visible = False
```
We have now finished implementing our agent program and architecture, which is all we need for the agent itself. The agent can be instantiated with the following code:

```python
program = HunterProgram()
architecture = HunterArchitecture()
agent = Agent(program, architecture)
```
However, we currently have no way to run and test our agent. Let's make a simulator to run it.
simulator.py

```python
from .concepts import Agent
from .architecture import HunterArchitecture
from .program import HunterProgram
from .environment import HunterEnvironment
import logging


class Simulator:
    def __init__(self, environment, agents):
        self.environment = environment
        self.agents = agents
        self.time = 0

    def step(self):
        # Let every agent perceive and act once, then advance the clock.
        for agent in self.agents:
            agent.step(self.environment)
        self.time += 1

    def debug(self):
        # Include the current time so the output matches the session logs below.
        logging.debug('[time:%d] monster_visible is %s',
                      self.time, self.environment.monster_visible)

    @staticmethod
    def instantiate():
        environment = HunterEnvironment()

        program = HunterProgram()
        architecture = HunterArchitecture()
        agent = Agent(program, architecture)

        return Simulator(environment, [agent])
```
Our Simulator class represents a world in which our agent runs. As you may have noticed, we have not yet defined what a world is. Let's define a world as a triplet $\langle time, environment, agents \rangle$, where time is an increasing number that represents a moment in the world. A world contains an environment and a list of agents that interact with the environment.
A moment in the world is moved forward by calling the step() function. When stepping, each agent processes the environment by perceiving and acting upon it, and then the time increases.

## Running our agent with simulator

Now that we have finished programming our intelligent agent and simulator, let's run them and play around.
Start a Python interpreter from the directory that contains the alexander package, and import everything from the simulator module.

```text
$ python3

>>> from alexander.simulator import *

>>> logging.basicConfig(level='DEBUG')

>>> simulator = Simulator.instantiate()

>>> simulator.debug()
DEBUG:root:[time:0] monster_visible is False
```
We advance the simulated world by one time unit by calling simulator.step().

```text
>>> simulator.step()

>>> simulator.debug()
DEBUG:root:[time:1] monster_visible is False
```
Let's try updating the environment. We want to see whether our agent reacts accordingly when a monster appears.

```text
>>> simulator.debug()
DEBUG:root:[time:1] monster_visible is False

>>> simulator.environment.monster_visible = True

>>> simulator.debug()
DEBUG:root:[time:1] monster_visible is True
```
After running a step, we should see our Hunter shoot the monster.

```text
>>> simulator.debug()
DEBUG:root:[time:1] monster_visible is True

>>> simulator.step()
DEBUG:root:Pew! Shooting monster

>>> simulator.debug()
DEBUG:root:[time:2] monster_visible is False
```
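The interactive session can also be replayed as a small script. The sketch below is self-contained, so it uses condensed copies of the classes from this part; the `run_scenario` helper and its trace format are my own additions for illustration.

```python
class Action:
    def __init__(self, id):
        self.id = id


class Percept:
    def __init__(self, state):
        self.state = state


class HunterEnvironment:
    def __init__(self):
        self.monster_visible = False


class HunterProgram:
    def process(self, percept):
        return Action('shoot') if percept.state['monster_visible'] else None


class HunterArchitecture:
    def perceive(self, environment):
        return Percept({'monster_visible': environment.monster_visible})

    def act(self, environment, action):
        if action.id == 'shoot':
            environment.monster_visible = False


class Agent:
    def __init__(self, program, architecture):
        self.program = program
        self.architecture = architecture

    def step(self, environment):
        percept = self.architecture.perceive(environment)
        action = self.program.process(percept)
        if action:
            self.architecture.act(environment, action)


def run_scenario():
    """Replay the session and return a trace of (time, monster_visible)."""
    environment = HunterEnvironment()
    agent = Agent(HunterProgram(), HunterArchitecture())
    time = 0
    trace = [(time, environment.monster_visible)]

    agent.step(environment)             # nothing to shoot yet
    time += 1
    trace.append((time, environment.monster_visible))

    environment.monster_visible = True  # a monster appears
    trace.append((time, environment.monster_visible))

    agent.step(environment)             # the hunter shoots it
    time += 1
    trace.append((time, environment.monster_visible))
    return trace


print(run_scenario())
```

The trace follows the same four observations as the interpreter session: idle at time 0 and 1, monster visible at time 1, and gone again at time 2.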

## Conclusion

Today, we learned about intelligent agents and built a very simple one. In the future, we will extend this agent with more advanced features, and I will also try to create a visualization for it. While this seems very simple, it is my first step towards creating a more complex AI program.

# Part 2: Integrating AI Agent with Unity

Written on 2019-09-19

## Today's Goal

In this part, I will create a 3D visualization using Unity and connect it to our agent written in Python. You may want to read the previous part about the Alexander concepts first, because we will reuse many of them. (In the future, today's version will be referred to as Barton.)
Part 2's Milestone

## Idea

I will continue using the concepts introduced in the previous part. However, they need an update to accommodate remote functionality.
Updated remote agent concepts on the Python side:

- Remote environment: similar to the ordinary environment, a representation that is perceived and acted upon by the architecture. The differences are:
  - The architecture is not allowed to modify the environment state directly. Actions from the architecture are stored in remote_action.
  - The environment state is updated at every step. This constant update is required because the remote environment is actually a representation of the corresponding environment objects on the Unity side.
- Remote architecture: implementation-wise, it does not differ from the ordinary architecture. However, it must follow one important rule when actuating actions: it should never modify the environment state directly. Instead, it should call set_remote_action(action) to pass the action (remote_action) to the Unity side, where the corresponding agent object realizes it.

Next, we have to define a remote communication module that allows the Python and Unity sides to communicate. One of the simplest approaches is to run a Python HTTP server and let the Unity side send requests to it and receive responses from it.
Communication Protocol for our AI python process and Unity
The HTTP server behaviour is defined as follows:

- Endpoint /init initializes the time, environment, and agents. The initialization procedure is passed as the init_function parameter when constructing the Remote module.
- Endpoint /step triggers a world step, moving the time forward. It accepts the environment state in JSON format and performs these actions:
  1. Retrieve the environment state from Unity (from the HTTP body) and update the Python representation of the remote environment accordingly.
  2. The agents process the environment: they perceive it and produce a remote_action.
  3. The remote_action is returned as the HTTP response. The Unity side reads this action and actuates it accordingly.
  4. Increase the time.
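To show the shape of the data exchanged on /step, here is a small sketch that turns a request body into a response body without any HTTP machinery. The `handle_step` function is hypothetical, and the inlined shooting rule merely stands in for the real agent; the `{"action": {"id": ...}}` response shape is assumed from the Action concept of Part 1.

```python
import json


def handle_step(body: str) -> str:
    """Map a /step request body (JSON) to a response body (JSON)."""
    state = json.loads(body)
    # Stand-in for the agent: shoot when a monster is visible.
    action = {"id": "shoot"} if state.get("monster_visible") else None
    return json.dumps({"action": action})


visible_response = handle_step('{"monster_visible": true}')
hidden_response = handle_step('{"monster_visible": false}')
print(visible_response)
print(hidden_response)
```

When no action is produced, the response carries a JSON null, so the Unity side has to be prepared for an empty action.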

## The Unity

Unity Editor of our AI visualization
For illustration, the completed Unity project for today is shown above. A quick explanation of what is happening in the Unity scene:

- The blue cube represents the Hunter (player) object.
- The red cube represents the Monster object.
- The blue cube can spawn a Projectile object that knocks the red cube off the ground, after which the red cube is destroyed.

The main problem is how to make the Hunter object communicate with our Hunter agent in Python. As discussed in the Idea section, the Unity side will talk to the Python side through an HTTP client.
First, put an empty object (Simulator) into the Unity scene. Then add HunterSDK and SimulatorScript as components of the Simulator object.
HunterSDK is essentially the code that speaks our server protocol. It provides the Initiate() and Step() methods. Initiate() sends a request to the /init endpoint, initializing the agents and the environment representation on the Python side. Step() accepts the current environment state in Unity, sends it in JSON format with a request to the /step endpoint, and passes the response to the callback function. The response should contain the action to be actuated.
HunterSDK.cs (partial)
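```csharp
using System;
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

public class HunterSDK : MonoBehaviour
{
    public string Host = "http://127.0.0.1:5000/";

    // Ask the Python side to create its environment and agents.
    public void Initiate()
    {
        StartCoroutine(PutRequest(Host + "init"));
    }

    // Send the current Unity state and hand the parsed response to the callback.
    public void Step(State state, Action<StepResponse> callback)
    {
        string stateJson = JsonUtility.ToJson(state);
        StartCoroutine(PutRequest(Host + "step", stateJson, (res) =>
        {
            StepResponse stepResponse = JsonUtility.FromJson<StepResponse>(res);
            callback(stepResponse);
        }));
    }

    IEnumerator PutRequest(string uri, string data = "{}", Action<string> callback = null)
    {
        using (UnityWebRequest webRequest = UnityWebRequest.Put(uri, data))
        {
            yield return webRequest.SendWebRequest();

            // Reconstructed: the partial listing omits the response handling.
            // The callback has to receive the response body for Step() to work.
            if (callback != null)
            {
                callback(webRequest.downloadHandler.text);
            }
        }
    }
}
```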
Next, SimulatorScript is the behavioural code that defines how our agents and environment behave on the Unity side. In Start(), it runs HunterSDK.Initiate(). It then calls SimulatorScript.Step() periodically at a fixed interval. In this step function, you need to determine the current environment state using Unity functionality (e.g. calculate the visibility of the monster) and pass it to HunterSDK.Step(). Finally, you also need to define how each action is actuated (e.g. spawn a projectile).
SimulatorScript.cs
```csharp
public class SimulatorScript : MonoBehaviour
{
    public float stepInterval;
    HunterSDK hunterSDK;

    void Start()
    {
        hunterSDK = gameObject.GetComponent<HunterSDK>();
        hunterSDK.Initiate();
        InvokeRepeating("Step", stepInterval, stepInterval);
    }

    void Step()
    {
        State state = new State();

        // todo: Sensor code block; fill in state values

        hunterSDK.Step(state, (stepResponse) =>
        {
            AgentAction agentAction = stepResponse.action;

            // todo: Actuator code block; actuate action accordingly

        });
    }
}
```

## Python Code

### Remote Agent Concept

As discussed above, let's define the updated concepts in barton/concepts.py. Note that we will still be reusing the old concepts from alexander/concepts.py.
barton/concepts.py
```python
from alexander.concepts import Architecture, Environment


class RemoteArchitecture(Architecture):
    def perceive(self, environment):
        raise NotImplementedError()

    def act(self, environment, action):
        raise NotImplementedError()


class RemoteEnvironment(Environment):
    def __init__(self):
        self.remote_action = None

    def update(self, state):
        # A new step starts: forget the previous action, then sync the state.
        self.remote_action = None
        self.update_state(state)

    def set_remote_action(self, action):
        self.remote_action = action

    def get_remote_action(self):
        return self.remote_action

    def update_state(self, state):
        raise NotImplementedError()
```

### Remote Communication Module

As discussed above, we will run a Python HTTP server and let the Unity side send requests to it and receive responses from it. I use Flask because it makes running an HTTP server simple. The HTTP server behaviour is defined as follows:

- Endpoint /init initializes the time, environment, and agents. The initialization procedure is passed as the init_function parameter when constructing the Remote module.
- Endpoint /step triggers a world step, moving the time forward. It accepts the environment state in JSON format and performs these actions:
  1. Retrieve the environment state from Unity (from the HTTP body) and update the Python representation of the remote environment accordingly.
  2. The agents process the environment: they perceive it and produce a remote_action.
  3. The remote_action is returned as the HTTP response. The Unity side reads this action and actuates it accordingly.
  4. Increase the time.

The implementation is as follows:
barton/remote.py
```python
# The import lines and the route registrations are missing from this listing.
# The ones used here are a reconstruction based on the names the code relies on.
from flask import Flask, jsonify, request
from flask_cors import CORS


class Remote:
    class Controller:
        @staticmethod
        def ping():
            return 'pong'

        @staticmethod
        def init(remote):
            remote.init()
            return jsonify({'ok': True})

        @staticmethod
        def step(remote):
            state = request.get_json()
            response = remote.step(state)
            return jsonify(response)

    def __init__(self, init_function):
        self.init_function = init_function
        self.agents = None
        self.environment = None
        self.time = 0

    def init(self):
        self.environment, self.agents = self.init_function()
        self.time = 0

    def step(self, state):
        self.environment.update(state)
        for agent in self.agents:
            agent.step(self.environment)
        action = self.environment.get_remote_action()
        action_serializable = action.__dict__ if action is not None else None
        self.time += 1
        return {'action': action_serializable}

    def app(self):
        app = Flask(__name__)
        CORS(app)
        # Reconstructed routes (assumption); PUT matches the requests sent by HunterSDK.
        app.add_url_rule('/ping', 'ping', Remote.Controller.ping)
        app.add_url_rule('/init', 'init', lambda: Remote.Controller.init(self), methods=['PUT'])
        app.add_url_rule('/step', 'step', lambda: Remote.Controller.step(self), methods=['PUT'])
        return app
```

## Writing Hunter Agent with the Remote Function in Python

First, let's define our hunter architecture using the updated remote architecture concept. As you may notice, the logic is still similar to the previous alexander/architecture.py. The difference is that instead of changing environment.monster_visible directly, we delegate the action to the Unity side through remote_action.
barton/architecture.py
```python
from alexander.concepts import Architecture, Percept
import logging


class HunterRemoteArchitecture(Architecture):
    def perceive(self, environment):
        return Percept({'monster_visible': environment.monster_visible})

    def act(self, environment, action):
        if action.id == 'shoot':
            logging.debug('Pew! Shooting monster')
            # Do not touch the state here; let the Unity side realize the action.
            environment.set_remote_action(action)
```
Next, let's define our hunter environment using the updated remote environment concept.
barton/environment.py
```python
from .concepts import RemoteEnvironment


class HunterRemoteEnvironment(RemoteEnvironment):
    def __init__(self):
        super().__init__()
        self.monster_visible = False

    def update_state(self, state):
        # Only overwrite the attribute when Unity actually sent it.
        if 'monster_visible' in state:
            self.monster_visible = state['monster_visible']
```
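The life cycle of remote_action across two steps can be traced with a short standalone sketch. It uses condensed copies of the two environment classes so that it runs on its own, and a plain dict stands in for the Action object; both simplifications are mine.

```python
class RemoteEnvironment:
    def __init__(self):
        self.remote_action = None

    def update(self, state):
        self.remote_action = None  # a new step discards the old action
        self.update_state(state)

    def set_remote_action(self, action):
        self.remote_action = action

    def get_remote_action(self):
        return self.remote_action

    def update_state(self, state):
        raise NotImplementedError()


class HunterRemoteEnvironment(RemoteEnvironment):
    def __init__(self):
        super().__init__()
        self.monster_visible = False

    def update_state(self, state):
        if 'monster_visible' in state:
            self.monster_visible = state['monster_visible']


env = HunterRemoteEnvironment()

env.update({'monster_visible': True})    # step 1: state pushed from Unity
env.set_remote_action({'id': 'shoot'})   # the architecture queues an action
first = env.get_remote_action()

env.update({'monster_visible': False})   # step 2: the old action is cleared
second = env.get_remote_action()
```

Clearing the action at the start of every update means an action is sent to Unity at most once, in the step that produced it.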
Finally, let's write our main file. In main, we define the init_function that instantiates the hunter environment and agent. We then pass it to the Remote module and run the Python remote server.
barton/main.py
```python
from .remote import Remote
from .architecture import HunterRemoteArchitecture
from .environment import HunterRemoteEnvironment
from alexander.program import HunterProgram
from alexander.concepts import Agent

if __name__ == '__main__':
    def init_function():
        environment = HunterRemoteEnvironment()
        program = HunterProgram()
        architecture = HunterRemoteArchitecture()
        agent = Agent(program, architecture)
        return environment, [agent]

    remote = Remote(init_function)
    remote.app().run()
```
We can start the Python remote server by running the following shell command:

## Hunter Agent in Unity Side

Remember that we still need to write the code that fills in the environment state values and actuates the agent's actions in the Unity scene.
In our current Hunter scenario, there is only one piece of state information that we need to obtain: monster visibility (monster_visible). Let's simply say that monster_visible is true if the monster object exists and the distance between the player object and the monster object is at most 4 units.
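The visibility rule itself is just a distance check. Below is a minimal Python sketch of it, with positions given as (x, y, z) tuples; the function name and the `VISION_RADIUS` constant are illustrative assumptions, and the actual check in this project runs on the Unity side.

```python
import math

VISION_RADIUS = 4.0  # mirrors the 4-unit threshold described above


def monster_visible(player_pos, monster_pos, radius=VISION_RADIUS):
    """Return True if a monster exists and lies within the vision radius."""
    if monster_pos is None:  # no monster in the scene
        return False
    # math.dist computes the Euclidean distance (Python 3.8+).
    return math.dist(player_pos, monster_pos) <= radius


near = monster_visible((0, 0, 0), (3, 0, 0))
far = monster_visible((0, 0, 0), (0, 5, 0))
missing = monster_visible((0, 0, 0), None)
```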
Next, in the actuator code block, we check whether the action id is "shoot" and, if so, call playerScript.Shoot(). This method essentially spawns a projectile and applies a large force to it in the direction of the monster object. Thanks to the Unity physics engine, the projectile pushes the monster object backward so that it falls off the ground.
I also added a SetAIEnabled() method to toggle the AI on and off. This method is bound to the "Enable AI" toggle in the UI. There is also a SpawnMonster() method that is executed when the "Spawn Monster" button is clicked.
The complete Unity Simulator script is as follows:
SimulatorScript.cs
```csharp
using UnityEngine;

public class SimulatorScript : MonoBehaviour
{
    public GameObject monster;
    public GameObject player;
    public float stepInterval;

    bool aiEnabled = true;
    HunterSDK hunterSDK;
    PlayerScript playerScript;

    void Start()
    {
        hunterSDK = gameObject.GetComponent<HunterSDK>();
        playerScript = player.GetComponent<PlayerScript>();
        hunterSDK.Initiate();
        InvokeRepeating("Step", stepInterval, stepInterval);
    }

    void Step()
    {
        if (aiEnabled)
        {
            playerScript.ShowThinking(true);
            GameObject monsterGameObject = GameObject.FindGameObjectWithTag("Monster");

            // Sensor: the monster is visible if it exists and is within 4 units.
            State state = new State();
            state.monster_visible = monsterGameObject != null
                && Vector3.Distance(player.transform.position, monsterGameObject.transform.position) <= 4.0f;

            hunterSDK.Step(state, (stepResponse) =>
            {
                // Actuator: the action may be empty when the agent decides to do nothing.
                AgentAction agentAction = stepResponse.action;
                if (agentAction != null && agentAction.id == "shoot")
                {
                    playerScript.Shoot();
                }
                Helper.RunLater(this, () => playerScript.ShowThinking(false), 0.1f);
            });
        }
    }

    public void SpawnMonster()
    {
        // Keep at most one monster in the scene.
        GameObject monsterGameObject = GameObject.FindGameObjectWithTag("Monster");
        if (monsterGameObject == null)
        {
            Instantiate(monster, new Vector3(2, 6, 0), transform.rotation);
        }
    }

    public void SetAIEnabled(bool isEnabled)
    {
        aiEnabled = isEnabled;
    }
}
```

## Demo & Conclusion

In Part 2, we learned how to connect an intelligent agent to a Unity scene. The Unity scene acts as a visualization, or frontend, for our agent. In the future, I will keep using visualization to better showcase the agent's progress. And the agent feels more like a real robot now, right?

# Part 3: Top-down Shooter AI

Written on 2020-01-28

## Today's Goal

In this part, I will create a reflex agent with internal state in an imperfect-information environment. The agent is a top-down shooter with a limited vision range, so it needs to store information about the parts of the map it has already visited. Furthermore, it needs to make rational decisions to find enemies and take the actions necessary to shoot them.
In the future, today's version will be referred to as Caleb.
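One way to picture the "internal state" is a grid that remembers what the agent has seen so far. The sketch below is only an illustration of that idea: the `MapMemory` class, the cell markers, and the observation format are assumptions, not the project's implementation.

```python
UNKNOWN, EMPTY, MONSTER = '?', '.', 'M'


class MapMemory:
    """Hypothetical internal state: what the agent remembers about each cell."""

    def __init__(self, width, height):
        # Indexed as cells[y][x]; everything starts out unknown.
        self.cells = [[UNKNOWN] * width for _ in range(height)]

    def update(self, observations):
        """Merge cells currently in vision range: {(x, y): EMPTY or MONSTER}."""
        for (x, y), content in observations.items():
            self.cells[y][x] = content

    def known_monsters(self):
        return [(x, y)
                for y, row in enumerate(self.cells)
                for x, content in enumerate(row)
                if content == MONSTER]


memory = MapMemory(width=3, height=2)
memory.update({(0, 0): EMPTY, (1, 0): MONSTER})  # first glimpse
memory.update({(2, 1): MONSTER})                 # later, from another position
monsters = memory.known_monsters()
```

Cells outside the vision range keep their last known value, which is exactly what distinguishes this agent from the simple reflex agent of Part 1.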
Part 3's Milestone

## The Scenario

We are going to expand on the previous part. In today's scenario, we put our hunter in an M x N grid.

- Some monsters are placed at random coordinates.
- The hunter's vision range is limited, so the hunter does not have perfect information.
- The projectile range is limited, so the hunter needs to position itself appropriately to shoot monsters.
- At any moment, the actions that the hunter can take are:
  - $MoveForward$
  - $RotateLeft$
  - $RotateRight$
  - $Shoot$
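The four actions can be sketched as an enum plus a function that updates the hunter's pose on the grid. The heading order, the coordinate convention (y grows downward), and the rule that a move into a wall does nothing are all assumptions chosen for this illustration.

```python
from enum import Enum


class HunterAction(Enum):
    MOVE_FORWARD = 'MoveForward'
    ROTATE_LEFT = 'RotateLeft'
    ROTATE_RIGHT = 'RotateRight'
    SHOOT = 'Shoot'


# Headings in clockwise order as (dx, dy): north, east, south, west.
# Assumption: y grows downward, so north is (0, -1).
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def apply_action(pose, action, width, height):
    """Return the new (x, y, heading_index) after applying one action."""
    x, y, heading = pose
    if action is HunterAction.ROTATE_LEFT:
        return (x, y, (heading - 1) % 4)
    if action is HunterAction.ROTATE_RIGHT:
        return (x, y, (heading + 1) % 4)
    if action is HunterAction.MOVE_FORWARD:
        dx, dy = HEADINGS[heading]
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            return (nx, ny, heading)
    # Shooting, or a move blocked by the grid boundary, leaves the pose unchanged.
    return pose


moved = apply_action((1, 1, 0), HunterAction.MOVE_FORWARD, 3, 3)
turned = apply_action((1, 1, 0), HunterAction.ROTATE_RIGHT, 3, 3)
```

Because the hunter can only move forward, reaching a neighbouring cell may cost extra rotation steps, which matters when counting the real cost of a path.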

## Demo

I put the finished demo early in this part in the hope of motivating readers.

## Finding Shortest Path with BFS

In this scenario, we face a problem where we need to find the shortest path to one of several cells. For example, there are multiple monsters in the grid, and the hunter needs to approach the nearest one.
To solve this problem, we can use breadth-first search (BFS), with the hunter's coordinates as the starting point and the monsters' coordinates as goals. Because BFS traverses the grid level by level, visiting all cells at distance d before any cell at distance d + 1, the first goal we reach is the nearest one. We can then follow the parent links back to the start, which gives the shortest path to that goal.
BFS to find nearest target
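As a compact, standalone illustration of the idea, here is a multi-goal BFS on an obstacle-free grid. It is an independent sketch rather than the project's listing: the function name, the use of collections.deque, and the dictionary of parents are my own choices.

```python
from collections import deque


def nearest_goal_path(start, goals, width, height):
    """Return the shortest path from start to the nearest goal, or None."""
    goals = set(goals)
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])

    while queue:
        current = queue.popleft()
        if current in goals:
            # Walk the parent links back to the start, then reverse.
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]

        x, y = current
        for nxt in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if inside and nxt not in parent:
                parent[nxt] = current
                queue.append(nxt)

    return None  # no goal is reachable


path = nearest_goal_path((0, 0), [(3, 3), (2, 0)], width=4, height=4)
print(path)
```

Of the two goals, (2, 0) is two moves away and (3, 3) is six, so the search stops at (2, 0) without ever expanding the far corner.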
caleb/algorithm/bfs.py
```python
def neighbors(current, grid):
    # current is an (x, y) tuple; grid is indexed as grid[y][x].
    size_x = len(grid[0])
    size_y = len(grid)

    res = []
    if current[0] - 1 >= 0:
        res.append((current[0] - 1, current[1]))
    if current[0] + 1 < size_x:
        res.append((current[0] + 1, current[1]))
    if current[1] - 1 >= 0:
        res.append((current[0], current[1] - 1))
    if current[1] + 1 < size_y:
        res.append((current[0], current[1] + 1))
    return res
```
17
def bfs(start, goals, grid):
18
"""
19
Using BFS to get path to nearest goal.
20
"""
21
size_x = len(grid[0])
22
size_y = len(grid)
23
24
visited = [[False for _ in range(size_x)] for _ in range(size_y)]
25
parent = [[None for _ in range(size_x)] for _ in range(size_y)]
26
queue = [start]
27
visited[start[1]][start[0]] = True
28
29
while queue