Advances in Computer Games 2017

Abstract

The Fifteenth International Conference on Advances in Computer Games (ACG 2017) was held at Leiden University, Leiden, The Netherlands. The three-day conference took place on July 3^rd–5^th in conjunction with the 20^th Computer Olympiad and the 23^rd World Computer Chess Championship. The Advances in Computer Games conference series is a major international forum for researchers and developers interested in all aspects of artificial intelligence and computer game playing. Earlier conferences took place in London (1975), Edinburgh (1978), London (1981, 1984), Noordwijkerhout (1987), London (1990), Maastricht (1993, 1996), Paderborn (1999), Graz (2003), Taipei (2005), Pamplona (2009), Tilburg (2011) and Leiden (2015).

Day 1 reported by Mark Winands

In the first talk, Richard Lorentz presented his paper Machine Learning in the Game of Breakthrough. He presented a study of applying machine-learning techniques to the game of Breakthrough. He showed that by using temporal difference learning in a Monte Carlo Tree Search (MCTS) setting, results are achieved almost equal to those obtained by Wanderer, a strong program with a highly tuned evaluation function. The talk also demonstrated that convolutional neural networks, trained by using Wanderer as a provider of expert moves, can produce a program much stronger than the original. Even in an environment with quite slow execution speeds, excellent results were achieved.

The following paper, A Curling Agent Based on the Monte-Carlo Tree Search Considering the Similarity of the Best Action among Similar States, was presented by Katsuki Ohto. He explained that Curling is one of the most strategic winter sports. To compare curling strategies, the Digital Curling system has been used. In this talk, a computer agent based on MCTS was proposed for the Digital Curling framework. It uses a novel action-decision method based on MCTS for Markov decision processes with continuous state space.

In the third talk, Naoki Mizukami presented the paper Exploration Bonuses Based on Upper Confidence Bounds for Sparse Reward Games. A closer look was provided into deep reinforcement learning algorithms that have achieved super-human-level performance in many Atari games. However, the performance of the algorithms falls short of humans in games where rewards are only obtained occasionally. One solution to this problem is to incorporate an explicit and more sophisticated exploration strategy in the agent’s learning process. In this talk, an effective exploration strategy was presented that explicitly considers the progress of training using exploration bonuses based on Upper Confidence Bounds. The method was evaluated on Atari 2600 games with sparse rewards and achieved significant improvements over the A3C algorithm.

After the coffee break Noah Weninger discussed the paper Exploring Positional Linear Go. In the talk he introduced Linear Go, a Go variant played on a 1xn board. Positional Linear Go was investigated: it has a rule set that uses positional superko. The game-theoretic properties of Positional Linear Go were further explored and were incorporated into a solver based on $MTD (f)$ search, solving states on boards up to 1x9.

Tristan Cazenave continued with the presentation Improved Policy Networks for Computer Go. He discussed how to utilize residual policy networks in the Go engine Golois. Two improvements to these residual policy networks were proposed. The first one is to use three output planes. The second one is to add Spatial Batch Normalization. Using three output planes enabled the networks to generalize better and reach a greater accuracy. A new residual layer with Spatial Batch Normalization proved to perform better than existing residual layers.

In the last presentation, An Analysis of Majority Voting in Homogeneous Groups for Checkers: Understanding Group Performance through Unbalance, Danilo Carvalho argued that experimental evidence and theoretical advances over the years have created an academic consensus regarding majority voting systems, namely that the group performs better than its components under certain conditions. However, the underlying reason for such conditions, stochastic independence of agents, is not often explored and may help to improve performance in known setups by changing agent behaviour, or by finding new ways of combining agents to take better advantage of their characteristics. In the talk, an investigation was conducted for homogeneous groups of independent agents playing the game of Checkers. The analysis aimed to find the relationship between the change in performance caused by majority voting, the group size, and the underlying decision process of each agent, which is mapped to its source of non-determinism. A characteristic imbalance in Checkers, due to an apparent initiative disadvantage, served as a pivot for the study from which decisions can be separated into beneficial or detrimental biases. Experimental results indicated that performance changes caused by majority voting may be beneficial or not, and are linked to the game properties and player skill.

After the lunch break there were two keynote lectures. Yori Kamphuis gave an entertaining talk on the purpose and future of technology. In the subsequent talk, Arno Knobbe discussed data-intensive optimization of Olympic training programs. These talks were enjoyed by everybody in the audience. Hereafter, there was a small intermezzo, where the participants were able to watch or even compete in the Computer Olympiad. In the late afternoon, a city walk was organized for the participants, starting at the Snellius Building of Leiden University and ending at the Leiden town hall. At our historic destination, a very pleasant reception was held for all the participants of ACG 2017, WCCC 2017 and the Computer Olympiad.

Day 2 reported by Richard Lorentz

The early morning session 3, dubbed the Theory Session, began at 9:00, was chaired by Hendrik Jan Hoogeboom and saw the presentation of three papers.

The first was titled Analysis of Fred Horn’s “Gloop” Puzzle, authored and presented by Cameron Browne. Gloop is a combinatorial puzzle game devised by Fred Horn in 1995. It uses equally sized square tiles with various lines drawn on them with the goal of arranging the tiles so that the lines line up and certain shapes are achieved. The author explains how a number of problems associated with this puzzle that have stood for over a decade were finally solved. From the conclusion: “This study demonstrates a successful collaboration between human and computer-based approaches to solving a difficult problem ...where neither approach had succeeded in isolation.” We were delighted to see Mr. Horn in the audience during the presentation of this paper.

The second paper, titled Set Matching: An Enhancement of the Hales-Jewett Pairing Strategy, was by Jos Uiterwijk. The Hales-Jewett pairing strategy dates back to 1963 (Hales and Jewett, 1963) and describes a technique useful for helping to solve k-in-a-row games, games like Go Moku. The author presents an enhancement to this technique that provides potential for a significant performance increase over the basic strategy.

The final paper of the early morning session was Yasol: An Open Source Solver for Quantified Mixed Integer Programs by Thorstein Ederer, Michael Hartisch, Ulf Lorenz, Thomas Opfer and Jan Wolf. The presentation was given by Michael Hartisch. Mixed integer linear programs have variables that are either existentially or universally quantified. The task is to find an “especially attractive winning strategy” and the authors show how their open source program Yasol attacks this problem by combining both linear programming and game-tree search techniques. Although it cannot compete with the best commercial solvers, they claim it is competitive with most freely available programs and “...compares well against highly specialized solvers on QSAT instances” even though it is more general purpose.

The late morning session began at 11:00 and was entitled “Dice”. It was chaired by Walter Kosters and again included three papers, the first being Analytical Solution for “EinStein würfelt nicht!” with One Stone by François Bonnet and Simon Viennot. François Bonnet gave the presentation and explained their work with EWN. They studied a special case where players start with only one stone rather than the usual six, but then generalized that situation by looking at all board sizes rather than just the usual 5x5 board. Under these conditions, they completely solved the game. They summarize their results thus: “Most of the time the first player is the winner, but surprisingly, depending on some [not so trivial] conditions on the board size, there are cases when the second player is the winner.”

Fig. 1.

Mark Winands (left) presenting the Best Paper award to Simon Viennot and (right) Francois Bonnet for their paper Toward Solving “EinStein würfelt nicht!”.

The second paper of this session was by the same two authors and was the recipient of the Best Paper Award. The title of the work was Toward Solving “EinStein würfelt nicht!” and this time, the presentation was given by Simon Viennot. They believe that finding the optimal placement for the pieces (if the rule set allows it – an alternate rule set places the pieces randomly) and finding optimal moves for the standard game (6 stones per player, 5x5 board) is still out of reach. However, they achieved a number of interesting results using fewer pieces and/or other-sized boards. For example, on the 3x3 board with 3 pieces per player they show that there are two different important initial placements of the pieces and they tell us “... the Nash Equilibrium consists of playing each option with probability 0.5 for both players. At the Nash Equilibrium, the winning chance for Player 1 is $169 / 243 \approx 0.7$ .” Generating their results required significant use of parallelization.

The final paper of the session and the day was Optimal Play of the Farkle Dice Game by Matthew Busche and Todd W. Neller, presented by Todd Neller. Farkle is a dice game of unknown origin that goes by many different names. It reminds us a little of Yahtzee since the goal is to try to obtain certain patterns when rolling and rerolling the dice, but it incorporates additional strategic elements. In the Conclusions section, they tell us that they “... expressed and solved optimality equations for the 2-player game for the first time. We noted that, between optimal players, the first-player win advantage is $\approx 7.3906 %$ .” They go on to recommend a fair komi for the second player.

Day 3 reported by Matej Guid

The last day of the conference started with Evaluating Chess-like Games Using Generated Natural Language Descriptions by Jakub Kowalski, Lukasz Zarczynski and Andrzej Kisielewicz. The authors presented a method for generating natural language descriptions for arbitrary fairy chess pieces given in the Simplified Boardgames language. They also described an algorithm for generating natural language descriptions of piece movements that can be used for two purposes: to explain the rules of the procedurally generated chess-like games to the human players, and for the task of procedural game generation.

The second talk of the day was on Automated Adaptation and Assessment in Serious Games by Enkhbold Nyamsuren, Wim van der Wegt and Wim Westera, focusing on a Portable Tool for Supporting Learning. An open-source tool was introduced for serious games, capable of adjusting game difficulty to players’ skill-levels. The problem selection process in learning environments may involve various steps: defining the player’s capabilities, estimating a target difficulty rating of a problem, and selecting a problem that closely matches the target difficulty rating. The Adaptation and Assessment (TwoA) component by the authors adopts a fuzzy selection rule in order to reduce the selection bias that may arise for items of similar difficulty. The most important goal of TwoA is to balance to the best possible extent player’s motivation and game challenge. The achievements were validated using multi-agent simulations, with positive results.

Adam Rhys Boulton argued that A Little Bit of Frustration Can Go a Long Way as the result of investigating the relationship between frustration and engagement over time in a carefully selected video game Limbo. This is a game that reviewers had noted for its strong potential to be both highly enjoyable and frustrating. It is not surprising that engagement often falls as frustration rises (and vice versa). However, the authors identified and analysed several situations in which a rise in frustration can give rise to an increase in engagement. They also reported important individual differences between players and identified further research questions associated with this interesting topic.

Kiminori Matsuzaki opened the final session of the conference with the talk on Developing 2048 Player with Backward Temporal Coherence Learning and Restart. He showed that backward learning can be very useful for the game of 2048, since the game has quite a long sequence of moves in a single play. He also demonstrated a restart strategy to improve the learning by focusing on the later stage of the game. Both of these techniques greatly improve the performance of the N-tuple-based players especially when combined with the expectimax search.

The talk by Bruno Bouzy, Playing Hanabi Near-Optimally, showed that the game of Hanabi can be played near-optimally by the computer. Hanabi is a multi-player cooperative card game in which a player sees the cards of the other players but not his own cards. Previous work reached near-optimal results for five players and four cards per player. The author has managed to generalize the near-optimal results to other numbers of players and cards per player: the perfect score was reached 90% of the time on average. The author described his program Hannibal and the experiments performed by the program on the aforementioned card game. One of the interesting conclusions backed up by the results of the experiments is that Hanabi is not a difficult game for the computer.

Matej Guid presented his joint work with Ivan Bratko on Influence of Search Depth on Position Evaluation. He argued that – possibly surprisingly – backed-up evaluations should not respect the minimax relation. He pointed out some implications of this phenomenon on the practice and theory of computer game playing, namely (a) that heuristic evaluations obtained by searching to different search depths are not directly comparable, and (b) that fewer decision changes with deeper search are a direct consequence of this property of heuristic evaluation functions. The reasons why the computer analysis of chess players’ games with the aim of obtaining a sensible ranking of the players should be fixed-depth-based (as opposed to time-limit based) were presented. The authors also demonstrated that knowing this property of heuristic evaluation functions for game playing may be used to develop a method for detecting fortresses in chess, which is regarded as an unsolved task in computer chess.

Finally, Deep df-pn and its Efficient Implementations by Song Zhang, Hiroyuki Iida, and Jaap van den Herik returned to the ever-hot topic of solving games. The paper proposes a new proof number algorithm that aims to reduce the seesaw effect in depth-first proof-number search, which is a powerful variant of proof number search algorithms. The main difference of the new algorithm lies in the proof number or disproof number of unsolved nodes. In the new algorithm, it is a function of depth with two parameters that enable the search to be guided from searching broadly to searching deeply. The idea was successfully implemented in the domain of the game Connect6, leading to improved efficiency. Importantly, the experimental results indicate that improving efficiency by the same adjustment technique is also possible in other domains.

The closing speech was given by Mark Winands and Jaap van den Herik announced the venues for the following events. As usual, there were various exchanges of useful ideas between the participants. The proceedings of the conference will be published by Springer-Verlag in the Lecture Notes on Computer Science series.

References

Hales, A.W. & Jewett, R.I. (1963). Regularity and positional games. Trans. Amer. Math. Soc., 106, 222–229. doi:10.1090/S0002-9947-1963-0143712-1.