Game of Life with Enterprise Components

DEVnet (the company I work for) has decided to release one of its products – Enterprise Components (called EC below) – as free software. Since I contributed some code to it I would like to advertise it a bit on this blog.

What are Enterprise Components?

EC is a collection of libraries and tools that support building systems based on KDB+ – a database made by Kx Systems. Traditionally KDB+ has been mostly used for time series analytics in finance industry. Part of the reason for that was that the commercial license for KDB+ was very expensive. However, some two months ago Kx announced that they now allow commercial use of the free-of-cost 32-bit version of their product. This opened a path for KDB+ applications in areas well beyond institutions with deep pockets.

The main limitation of the 32-bit version is the amount of data it can handle – about 1GB. This limitation is per-process though and since KDB+ has a built-in inter-process communication protocol one can create very capable systems using the free version just running as many instances as needed.

Enterprise Components can be thought of as a layer above KDB+. EC provide logging, process management, configuration management, user access control, system monitoring and more – all that is needed if you want to build a larger scale system. Typical use cases are covered in the standard components provided with EC so that one can define a basic system with data feeds, in-memory database backed by on-disk journals and storage and archiving at the end of day with configuration only.

Game of Life

In this post I will show how to set up a system that plays Conway’s Game of Life. The Game of Life is a cellular automaton defined on a two-dimensional grid, with cells in one of two states – dead or alive. For each time step the state of a cell depends on what was the state of it and its grid neighbors in the previous step. If a cell is alive, it may be alive in the next step, or it can die of overcrowding if the number of alive neighbors is too large. If a cell is dead, it may be alive in the next step if it has the right number of neighbors alive – as if it got colonized. These simple rules lead to lots of patterns of behavior to observe – steady states, oscillators, patterns moving across grid, colliding with each other and so on.

KDB+ comes with its own language called q (I have written about q on this blog in the past). Of course it would be trivial to write the Game of Life in q. Maybe it would be about as short as the one written in APL. However I want to do in a way a bit different than the obvious one – I want to to represent each cell in the grid by a KDB+ process. The cells will communicate their states to neighbors using KDB+ IPC mechanism. All cells will report to one process that collects the data about the state of the game as whole. I would like to make it clear here that my goal is not to come up with the optimal solution for anything but to present some features of KDB+ and Enterprise Components on a toy problem. Besides I am just curious how it will do.

Implementation

All code and configuration files that I present in this post are licensed with Apache 2.0 license and can be downloaded from here. The code runs on Linux. It can be easily modified for Windows once the Windows version of Enterprise Components is released.

My Game of Life implementation consists of two components (modules). The first one, called cell, represents one cell in the grid. The second one, called admin serves as the central hub that keeps track of the changes in the cells’s states to allow querying for the game statistics and interaction with the GUI (when I get around to write one).

We start with the qsd file for the cell component. The qsd files are a purely EC concept. Their purpose is to encode the list of the configuration variables that a given component needs, together with their types and, optionally, default values. In principle, such information could be provided in the component’s documentation, but having it written in a formal way allows the EC code to check if the system configuration contains all necessary information to run the components and to provide clear error messages when this is not the case. The qsd file for the cell component declares three configuration variables describing the grid dimensions and the number of generations to run. This is how the cell.qsd file looks like:

[group]	
	cfg.gridi = <type(LONG)>
	cfg.gridj = <type(LONG)>
        cfg.generations = <type(INT), default(200)>

The values for these configuration variables are defined in the system.cfg file. The section that describes the collection of cell processes looks as follows:

[[grid.cell:100]]
     command = "q cell.q"
     type = q:cell/cell
     port = ${basePort} + ${EC_COMPONENT_INSTANCE}
     cfg.gridi = 10
     cfg.gridj = 10 

That number after the colon in the [[grid.cell:100]] part is what makes the EC process management tool (called yak) to run 100 processes that each loads the same component code from the cell.q file. They are almost identical (we call them clones) – except that each of them is passed a different integer that numbers them. In our setup this integer allows the component code to derive the names of the processes that are neighbors of a given cell in the grid. The logical names of the processes are of the form grid.cell_0, grid.cell_1, grid.cell_2 and so on.

Now, with the boring stuff out of the way let’s turn to the implementation of the main grid.cell component. This component represents one cell in the grid. The algorithm is very simple: at every time step each cell receives the information about neighbors dead/alive states. Once the cell knows the states of all its neighbors it calculates its own state in the next step and pushes this information to the neighbors, who do the same. All states are also submitted to the life.admin process to be stored for later analysis.
I have to say that the actual implementation is a bit more involved than I expected. The main difficulty here is with the system initialization. With that many connections we have to make sure that the neighbors are ready for the game with all connections up before we start to push data.

We start with some boilerplate EC-specific initialization calls

system"l ",getenv[`EC_QSL_PATH],"/sl.q";
.sl.init[`cell];
.sl.lib["cfgRdr/cfgRdr"];

Then there is the function that initializes the component – gets the values of the configuration variables, defines the rules of when the cell is alive or dead in the next step and opens the connection to the grid.admin component. The .hnd.poAdd[] call sets the callback to be called when the connection to the grid.admin is actually open (the port-open callback). The subsequent .hnd.hopen[] call attempts to open the connection, but if the connection does not open in the time specified by the timeout parameter, the EC handle.q library sets up a timer that repeats connection attempts once a second. Only after the connection is successful the callback is called. This is quite useful when starting large systems when some processes we need to connect to take a long time to start (they are replaying the journal for example).

.sl.main:{
  .log.info[`life] "Starting the Game of Life";
  gridi:.cr.getCfgField[`THIS;`group;`cfg.gridi];
  gridj:.cr.getCfgField[`THIS;`group;`cfg.gridj];
  .cell.maxGenerations:.cr.getCfgField[`THIS;`group;`cfg.generations];
  instance:value .cr.getCfgField[`THIS;`group;`EC_COMPONENT_INSTANCE];
  .cell.i:floor instance%gridj;
  .cell.j:instance mod gridj;
  / get names of neigbors
  .cell.neighbors:.cell.getNeighbors[gridi;gridj;.cell.i;.cell.j];
  .log.info[`life] "coordinates " .Q.s1 (.cell.i;.cell.j);
  .log.info[`life] "neighbors " .Q.s1 .cell.neighbors;
  .cell.initState:.cell.state:.cell.getInitState[.cell.i;.cell.j];
  .log.info[`life] "initial state: ",.Q.s1 .cell.initState;
  / init some counters
  .cell.updCount:()!(); / time step keyed dictionary of number of neighbor updates 
  .cell.nactive:()!();  / time step keyed dictionary of active neighbors
  / init game rules
  .cell.rules:()!();
  .cell.rules[0b]:00010000b;
  .cell.rules[1b]:00110000b;
  / add a callback to run when conection to the admin is open
  .hnd.poAdd[`life.admin;`.cell.onAdminConnection];
  / open connection to the admin with timeout set to 10ms
  .hnd.hopen[`life.admin;10i;`eager];
  };

Because the code for getting the names of the neighbor servers from our instance number is kind of hairy, I decided to put it in a separate function. This function takes the grid dimensions and the cell coordinates and returns a list of neighbor names of the form grid.cell_0, grid.cell_1,... and so on. This is not a typical q code – I have added some spaces for clarity.

.cell.getNeighbors:{[gi;gj;ci;cj]
  {[gi;gj;n]`$"grid.cell_",string (n[1]mod gj)+gj*n[0]mod gi}[gi;gj] each (ci;cj)+/:((-1;0;1)cross(-1;0;1))except enlist(0 0)
  };

Another helper function is one that generates a random initial grid state. One can replace this function to start from some known state.

/ generates a random initial cell state. The arguments are ignored here.
.cell.getInitState:{[ci;cj] system "S ",string `int$.z.t;:0~rand 2};

The next step after the connection to the grid.admin is to set up the “port-open” callbacks and open connections to the neighbors. For those readers who don’t know q syntax the keyword each is like map in Haskell, usually written infix.

.cell.onAdminConnection:{[admin]
  .log.info[`life] "successful admin connection";
  / tell the admin the timestep, cooordinates and initial state
  .hnd.ah[`life.admin](`.admin.upd;0;.cell.i;.cell.j;.cell.initState);
  / set up the callbacks and open the connections to neighbors
  .hnd.poAdd[;`.cell.onCellConnection] each .cell.neighbors;
  .log.info[`cell] "opening connections to neighbors";
  .hnd.hopen[.cell.neighbors;10i;`eager];
  };

What follows is the definition of the port-open callback that we attached to all neighbor’s connections with the line starting from .hnd.poAdd above.

.cell.onCellConnection:{[neighbor]
  .cell.connected+:1;
  .log.info[`cell] "successful neighbor connection to ",(.Q.s1 neighbor)," total ",.Q.s1 .cell.connected;
  if[(.cell.connected~8);
    /notify admin that the cell is ready
    .hnd.ah[`life.admin](`.admin.cellReady;.cell.i;.cell.j);
    ];  
  };

The grid.admin counts the number of calls to its admin.cellReady[] function. When it gets such messages from all cells in the grid, it calls the following function on each cell process to start the game:

.cell.start:{[x]
  .log.info[`cell] "signal from admin to start";
  (.hnd.ah each .cell.neighbors) @\: (`.cell.upd;.cell.i;.cell.j;0;.cell.initState);
  };

The next function is the one that actually implements the game interaction between the cells on the grid. It is called remotely by neighbors’ processes and receives the information about the current state of the neighbor. Once we know the dead/alive state of all neighbors we can determine our own state in the next time step. Then we notify the grid.admin and neighbors about our new state and the process repeats. This goes on until the maximal number of generations specified in the configuration is reached. All communication between the cell processes is done asynchronously to avoid deadlocks.

.cell.upd:{[ci;cj;t;state]
  if[not t in key .cell.updCount;
    .cell.updCount[t]:1;
    .cell.nactive[t]:`long$state;
    :();
    ];
  .cell.updCount[t]+:1;
  .cell.nactive[t]+:state;
  if[.cell.updCount[t]~8; / we know the state of all neighbors for this time step
    .cell.state:.cell.rules[.cell.state;.cell.nactive[t]];
    / notify admin
    .hnd.ah[`life.admin](`.admin.upd;t+1;.cell.i;.cell.j;.cell.state);
    / clean up
    .cell.updCount _:t;
    .cell.nactive _:t;
    if[t>.cell.maxGenerations;:()];
    / tell the neighbors our new state 
    (.hnd.ah each .cell.neighbors) @\: (`.cell.upd;.cell.i;.cell.j;t+1;.cell.state);
    ];
  };

The final line in the cell.q script is one that starts the script through EC standard library.

.sl.run[`cell; `.sl.main;`];

That is all about the cell component implementation. There is also the grid.admin component that collects the info about the state of all cells and allows to query some game statistics. This is not very interesting though and this post is too long already, so I will skip this.

How to run this

If you want to play with this Game of Life implementation the prerequisities and installation are described in the readme file. After that all you need is start the admin

$ yak start life.admin

then start the grid cell processes:

$ yak start grid

and watch yak starting 100 processes. The results can be viewed by connecting to the life.admin process with qStudio or a similar application.

Advertisements

Tags: ,

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: