Preference Learning from Annotated Game Databases

Christian Wirth and Johannes Fürnkranz
Knowledge Engineering, Technische Universität Darmstadt, Germany


Abstract. In chess, as well as in many other domains, expert feedback is amply available in the form of annotated games. This feedback usually comes in the form of qualitative information because human annotators find it hard to determine precise utility values for game states. Therefore, it is more reasonable to use those annotations for a preference-based learning setup, where it is not required to determine values for the qualitative symbols. We show how game annotations can be used for learning a utility function by translating them to preferences. We evaluate the resulting function by creating multiple heuristics based upon different-sized subsets of the training data and compare them in a tournament scenario.

The results show that learning from game annotations is possible, but our learned functions did not quite reach the performance of the original, manually tuned evaluation function. The reason for this failure seems to lie in the fact that human annotators only annotate interesting positions, so that it is hard to learn basic information, such as material advantage, from game annotations.

1 Introduction

For many problems, human experts are able to demonstrate good judgment about the quality of certain courses of action or solution attempts. Typically, this information is of a qualitative nature, and cannot be expressed numerically without selecting arbitrary values. A particularly well-studied form of qualitative knowledge are so-called pairwise comparisons or preferences.

Humans are often not able to determine a precise utility value of an option, but are typically able to compare the quality of two options (e.g., treatment a is more effective than treatment b). Thurstone's Law of Comparative Judgment essentially states that such pairwise comparisons correspond to an internal, unknown utility scale [14]. Recovering this hidden information from such qualitative preferences is studied in various areas such as ranking theory [12] or voting theory [3]. Most recently, the emerging field of preference learning [6] studies how such qualitative information can be used in a wide variety of machine learning tasks.

Copyright © 2014 by the paper's authors. Copying permitted only for private and academic purposes. In: T. Seidl, M. Hassani, C. Beecks (Eds.): Proceedings of the LWA 2014 Workshops: KDML, IR, FGWM, Aachen, Germany, 8-10 September 2014.
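As background, Thurstone's model can be made concrete with its standard Case V formulation; this is textbook material, not a formula from the paper itself. Each option a is assumed to have a latent utility u_a, and the probability that a is judged better than b grows with their utility difference:

```latex
% Thurstone's Law of Comparative Judgment, Case V:
% equal discriminal dispersions \sigma, uncorrelated options.
% \Phi denotes the standard normal cumulative distribution function.
P(a \succ b) = \Phi\!\left(\frac{u_a - u_b}{\sigma\sqrt{2}}\right)
```

Recovering the hidden scale then amounts to estimating the utilities u from observed comparison outcomes.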

In the game of chess, qualitative human feedback is amply available in the form of game annotations. We show how this information can be used in combination with state-of-the-art ranking algorithms to successfully learn an evaluation function. The learning setup is based on the methodology used by Paulsen and Fürnkranz [13], where it has been used for learning evaluation functions from move preferences of chess players of different strengths. This paper briefly summarizes the key results; for details we refer to Wirth and Fürnkranz [18]. In Section 2, we discuss the information that can typically be found in annotated chess games.

Section 3 shows how to extract preference information from such data and how to use this information for learning an evaluation function via an object ranking algorithm. In our experimental setup (Section 4), we evaluate the proposed approach with a large-scale tournament, discussing the results in Section 5. Section 6 concludes the paper and discusses open questions and possible future work.

2 Game Annotations in Chess

Chess is a game of great interest, which has generated a large amount of literature that analyzes the game. Particularly popular are game annotations, which are regularly published after important or interesting games have been played in tournaments. These annotations reflect the analysis of a particular game by a (typically) strong professional chess player. Annotated chess games are amply available.

For example, the largest database distributed by the company Chessbase contains over five million games, more than 66,000 of which are annotated. Expert players annotate games with alternative lines of play and/or textual descriptions of the evaluation of certain lines or positions. An open format for storing chess games and annotations is the so-called portable game notation (PGN) [4]. Most importantly, however, typical events in a chess game can be encoded with a standardized set of symbols. There is a great variety of these numeric annotation glyphs (NAGs), referring to properties of the positions (e.g., attacking chances, pawn structures, etc.), the moves (e.g., forced moves, better alternatives, etc.), or to external properties of the game (such as time constraints). In this work, we will focus on the most commonly used symbols, which annotate the quality of moves and positions:

- move evaluation: Each move can be annotated with a symbol indicating its quality. Six symbols are commonly used: very poor move (??), poor move (?), speculative move (?!), interesting move (!?), good move (!), very good move (!!).

- position evaluation: Each move can be annotated with a symbol indicating the quality of the position it is leading to: white has a decisive advantage (+-), white has a moderate advantage (±), white has a slight advantage (⩲), equal chances for both sides (=), black has a slight advantage (⩱), black has a moderate advantage (∓), black has a decisive advantage (-+), the evaluation is unclear (∞).
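To make the encoding concrete, here is a small parsing sketch. It is our illustration, not code from the paper, and it assumes the open-source python-chess library; in PGN files the glyphs are stored as numeric codes such as $1 (!), $2 (?), $14 (⩲), or $16 (±).

```python
# Sketch (ours, not from the paper): reading NAG-annotated moves with the
# python-chess library (pip install chess).
import io
import chess.pgn

# Standard numeric NAG codes for the symbols discussed above.
MOVE_NAGS = {1: "!", 2: "?", 3: "!!", 4: "??", 5: "!?", 6: "?!"}
POSITION_NAGS = {10: "=", 13: "unclear", 14: "+/=", 15: "=/+",
                 16: "+/-", 17: "-/+", 18: "+-", 19: "-+"}

pgn = io.StringIO(
    '[Event "Example"]\n'
    '[Result "*"]\n\n'
    "1. e4 e5 2. Nf3 $1 Nc6 3. Bb5 $14 a6 *"  # $1 = '!', $14 = slight white edge
)
game = chess.pgn.read_game(pgn)

for node in game.mainline():       # iterate over the moves actually played
    for nag in node.nags:          # a move may carry several NAG codes
        symbol = MOVE_NAGS.get(nag) or POSITION_NAGS.get(nag)
        # node.board() is the position reached *after* the move; its FEN
        # string uniquely identifies that position.
        print(node.san(), symbol, node.board().fen())
```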

We will denote the set of annotation symbols with A = A_P ∪ A_M, where A_P are the position annotations and A_M are the move annotations. a(m) denotes the annotations associated with a given move m in the annotated game. Move and position evaluations can be organized into a partial order, which we denote with the symbol ≻_A. The position evaluations can be ordered as

+- ≻_A ± ≻_A ⩲ ≻_A = ≻_A ⩱ ≻_A ∓ ≻_A -+

and the move evaluations as

!! ≻_A ! ≻_A !? ≻_A ?! ≻_A ? ≻_A ??.

Note that, even though there is a certain correlation between position and move annotations (good moves tend to lead to better positions and bad moves tend to lead to worse positions), they are not interchangeable. A very good move may be the only move that saves the player from imminent doom, but it need not necessarily lead to a very good position.
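A minimal sketch of this partial order (our illustration, with hypothetical names and ASCII stand-ins for the informant glyphs):

```python
# Sketch of the partial order over annotation symbols. ASCII stand-ins:
# "+/-" for ±, "+/=" for ⩲, "=/+" for ⩱, "-/+" for ∓.
MOVE_ORDER = ["??", "?", "?!", "!?", "!", "!!"]                # worst to best
POSITION_ORDER = ["-+", "-/+", "=/+", "=", "+/=", "+/-", "+-"]

MOVE_RANK = {s: i for i, s in enumerate(MOVE_ORDER)}
POSITION_RANK = {s: i for i, s in enumerate(POSITION_ORDER)}

def prefer(a, b):
    """True if a is preferred over b, False if b over a, None if incomparable.
    The order is only defined within A_M x A_M and A_P x A_P; comparisons
    across the two families (and the 'unclear' symbol) are undefined."""
    for rank in (MOVE_RANK, POSITION_RANK):
        if a in rank and b in rank:
            return rank[a] > rank[b] if rank[a] != rank[b] else None
    return None

assert prefer("!!", "?") is True     # move vs. move
assert prefer("=", "+/-") is False   # position vs. position
assert prefer("!", "+/-") is None    # move vs. position: incomparable
```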

Conversely, a bad move may be a move that misses a chance to mate the opponent right away, but the resulting position may still be good for the player. For this reason, ≻_A is partial in the sense that it is only defined on A_M × A_M and A_P × A_P, but not on A_M × A_P.

In addition to annotating games with NAG symbols, annotators can also add textual comments and variations. These complement the moves that were actually played in the game with alternative lines of play that could have happened or that illustrate the assessment of the annotator. Typically, such variations are short move sequences that lead to more promising states than the moves played in the actual game. Variations can also have NAG symbols, and may themselves be nested.

It is important to note that this feedback is of a qualitative nature, i.e., it is not clear what the expected reward is in terms of, e.g., the percentage of won games from a position with evaluation ±.

However, according to the above-mentioned relation ≻_A, it is clear that positions with evaluation ± are preferable to positions with evaluation ⩲ or worse (=, ⩱, ∓, -+). As we will see in the following, we will collect preference statements over positions. For this, we have to uniquely identify chess positions. Chess positions can be efficiently represented in the Forsyth-Edwards Notation (FEN), which is a serialized, textual representation of the game board, capturing all data that is required to uniquely identify a chess position.

3 Learning an Evaluation Function from Annotated Games

Our approach of learning from qualitative game annotations is based on the idea of transforming the annotation symbols into preference statements between pairs of positions, which we describe in more detail in the following subsections. Each such preference may then be viewed as a constraint on a utility function for chess positions, which can be learned with state-of-the-art machine learning algorithms such as support-vector machines.
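To sketch how such constraints can be exploited, the following illustration (ours, not the paper's implementation; the feature extractor is a deliberately naive stand-in) uses the standard pairwise transformation for object ranking: every preference p ≻ q becomes a classification example on the feature difference φ(p) - φ(q), and the weight vector of a linear SVM then defines the utility u(s) = w·φ(s).

```python
# Object-ranking sketch (ours, not the paper's setup): learn a linear
# utility function from position preferences via a pairwise SVM.
import numpy as np
from sklearn.svm import LinearSVC

PIECES = "PNBRQ"  # white piece letters in FEN; lowercase = black

def features(fen: str) -> np.ndarray:
    """Hypothetical feature map: naive material balance per piece type.
    A real evaluation function would use far richer features."""
    board = fen.split()[0]
    return np.array([board.count(p) - board.count(p.lower()) for p in PIECES],
                    dtype=float)

def train_utility(preferences):
    """preferences: iterable of (better_fen, worse_fen) pairs derived
    from the annotation-based preference statements."""
    X, y = [], []
    for better, worse in preferences:
        d = features(better) - features(worse)
        X.append(d)   # better minus worse -> positive class
        y.append(1)
        X.append(-d)  # mirrored example keeps the classes balanced
        y.append(-1)
    svm = LinearSVC(fit_intercept=False).fit(np.array(X), np.array(y))
    w = svm.coef_.ravel()
    return lambda fen: float(w @ features(fen))  # u(s) = w . phi(s)
```

Positions can then be compared, and candidate moves ordered, by this learned utility value.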

