Janken Data Package
Copyright_OML 2000 M. Ishiguro & S. Sato
The Institute of Statistical Mathematics
This data package can be obtained free of charge from
ISMLIB (ftp://ftp.ism.ac.jp/pub/ISMLIB/janken_data) of the
Institute of Statistical Mathematics.
This data package can be used freely, provided that the terms of the
OPEN MARKET LICENCE for data (version: OML-DT-E-1996,
ftp://ftp.ism.ac.jp/pub/ISMLIB/OML) are observed. The Janken Data
Package can be redistributed, provided that this note is attached as
it is. Appropriate reference must be made when publishing results
obtained using this data package.
Janken is a game played by Japanese children. Every Japanese child
knows how to play it, and even adults play it on some occasions. The
game is believed to have originated in China. There are many
variations; in its simplest form it is played by two players.
Players synchronize the up-and-down motion of their hands by shouting
the words 'Jan---Ken---Pon' together. In each game, on the call of
'Pon', every player makes one of the shapes 'Stone', 'Scissors' or
'Paper' with a hand. All players have to make the shape at the same
instant. 'Stone' is stronger than 'Scissors'. 'Scissors' is stronger
than 'Paper'. 'Paper' is stronger than 'Stone'. The player who shows
the stronger hand wins the game/session. Delaying one's motion is the
most serious violation of the rules, and the reason is clear: a player
who delays can see the opponent's hand before choosing. This foul play
is called Ato-dashi (Delayed-showing).
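The strength relation described above is cyclic, which can be sketched
in a few lines of Python. This is only an illustration: the function
name `winner` and its return values are assumptions, and treating two
equal hands as a tie is the usual convention rather than something
stated in this note.

```python
# Each hand is stronger than exactly one other hand.
BEATS = {"Stone": "Scissors", "Scissors": "Paper", "Paper": "Stone"}

def winner(first, second):
    """Return 1 if `first` wins, 2 if `second` wins, 0 for equal hands (assumed tie)."""
    if first == second:
        return 0
    return 1 if BEATS[first] == second else 2

print(winner("Stone", "Scissors"))   # first player wins
print(winner("Paper", "Scissors"))   # second player wins
```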
In 1999, ISM held a program for children. Janken was chosen as a topic
through which the statistical way of thinking could be introduced
effectively. Twenty PCs were prepared, each loaded with specially
written Janken-playing software. Instead of making shapes with their
hands, the children made their choice by clicking the mouse on one of
three patterns on the screen representing Stone, Scissors and Paper.
One important point was how to remove the children's suspicion that the
PC might commit Ato-dashi. This problem was solved by showing, in
advance, a number which announces the PC's next hand. The number is
designed to be clearly visible if one watches carefully, but
inconspicuous enough that the player can choose a hand without
violating the rule. The role of the 'announcing number' was explained
in the introductory lecture.
The children were invited to play games, and an introductory talk about
statistics, with a little bit of Markov chain, was given.
This software game is designed for two players (the computer and the
human player). A 'match' is over when one side first reaches one
hundred points. One point is added each time a side wins a
game/session. This means that about 300 clicks are necessary in one
match. We had worried that this would be too heavy a task for children,
but we had to demonstrate the power of statistics: statistics can beat
children only when a sufficient amount of data is supplied. In the end
our worry turned out to be groundless. The children enjoyed clicking
and playing the game very much.
And the strength of Statistics was impressive enough!
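The figure of about 300 clicks can be checked with a rough calculation.
The Python sketch below assumes, purely for illustration, that both
sides choose uniformly at random and independently, so that each side
wins a game with probability 1/3 and the remaining 1/3 of the games are
ties. The function name `expected_clicks` is hypothetical, and real
matches against a learning algorithm will differ.

```python
from fractions import Fraction
from math import comb

def expected_clicks(target=100):
    """Expected number of games until one side first reaches `target` points,
    assuming each side wins a game with probability 1/3 and 1/3 are ties."""
    # If the loser ends with k points, the match ends at decisive game
    # target + k: the winner takes that game and target - 1 earlier ones.
    expected_decisive = sum(
        (target + k) * 2 * Fraction(comb(target + k - 1, k), 2 ** (target + k))
        for k in range(target)
    )
    # Only 2/3 of the games are decisive, so 3/2 games are needed per point.
    return float(expected_decisive * Fraction(3, 2))

print(expected_clicks())
```

Under these assumptions the result is somewhat below 300, which is
consistent with the rough figure given above.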
The logs/records of these games are collected here. Some data
collected during the preparation stage are also included.
Five types ('a', 'c', 'd', 'e' and 'f') of algorithms were prepared.
The first four types ('a', 'c', 'd' and 'e') have adjustable
parameters. With different parameter settings, there are 10 algorithms:
a0, a1, c0, c1, c2, d0, d1, e0, e1 and f0.
The 'a' type algorithm has no learning ability. The 'c' type algorithm
provides a simple learning ability; c0, c1 and c2 differ in their
levels of randomization, and 'c0' employs the non-randomized optimal
strategy. The 'd' type algorithm provides a sophisticated learning
ability and a set of CATDAP type models; 'd1' employs a largely
randomized strategy. The 'e' type algorithm is a deterministic
strategy. 'f0' is an extension of 'd0' with more models, some of which
require heavy numerical optimization. Details about these algorithms
are given in
"A study of Janken Data as Two Dimensional Tri-nomial Time Series",
Research Memorandum No. 759 of the Institute of Statistical Mathematics
(in Japanese).
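As a purely illustrative sketch of what a simple learning ability
combined with a non-randomized strategy could look like, the Python
code below counts which hand the human player showed after each
previous hand, and answers with the hand that beats the most frequent
follower. This is NOT the 'c0' algorithm or any other algorithm of this
package, whose details are given only in the memorandum cited above;
the function names and the first-order counting scheme are assumptions.
Hands are coded 1, 2 and 3 for Stone, Scissors and Paper.

```python
from collections import Counter, defaultdict

# BEATEN_BY[h] is the hand that beats h:
# Paper (3) beats Stone (1), Stone (1) beats Scissors (2),
# Scissors (2) beats Paper (3).
BEATEN_BY = {1: 3, 2: 1, 3: 2}

def count_transitions(human_hands):
    """Count how often each hand followed each previous hand."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(human_hands, human_hands[1:]):
        counts[prev][nxt] += 1
    return counts

def respond(counts, last_hand):
    """Pick the hand that beats the most frequent follower of `last_hand`."""
    followers = counts.get(last_hand)
    if not followers:
        return 1  # no history yet: arbitrary default (assumption)
    predicted = followers.most_common(1)[0][0]
    return BEATEN_BY[predicted]

history = [1, 2, 1, 2, 1, 3, 1, 2]  # made-up sequence, not package data
counts = count_transitions(history)
print(respond(counts, 1))  # after Stone, Scissors was most frequent
```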
The names of the files in this package have the form gcp.xm.nn, where
'xm' is the code of the algorithm against which the player played and
'nn' is a sequential number.
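A file name of this form can be split into its parts as sketched below
in Python. The regular expression is an assumption based on the
description above: it accepts an algorithm code made of one of the
letters a, c, d, e, f plus one digit, and a sequential number of one or
more digits, since the exact width of 'nn' is not stated here. The
function name `parse_name` and the example file name are made up.

```python
import re

# 'gcp', then an algorithm code (letter plus digit), then the number.
NAME_PATTERN = re.compile(r"^gcp\.([acdef][0-9])\.([0-9]+)$")

def parse_name(filename):
    """Return (algorithm code, sequential number), or None if no match."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

print(parse_name("gcp.c0.01"))  # hypothetical file name
```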
The data format is simple. A file is composed of three parts, divided
by the lines 'DATASTARTINGLINE' and '-1 -1'. The data between
'DATASTARTINGLINE' and '-1 -1' form an N x 2 matrix, where N is the
data length. The first and second columns of the matrix represent the
hands of the human player and the software, respectively. Here 1, 2
and 3 denote 'Stone', 'Scissors' and 'Paper', respectively.
Everything above 'DATASTARTINGLINE' and below '-1 -1' is free memo.
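The format described above can be read with a short Python sketch such
as the following. It assumes that the two marker lines stand alone on
their lines and that the two columns are separated by whitespace; the
function names are hypothetical, equal hands are counted as ties by
assumption, and the sample text is made up rather than taken from the
package.

```python
def parse_janken(text):
    """Split file contents into (memo before, (human, software) pairs, memo after)."""
    lines = text.splitlines()
    start = next(i for i, line in enumerate(lines)
                 if line.strip() == "DATASTARTINGLINE")
    end = next(i for i in range(start + 1, len(lines))
               if lines[i].split() == ["-1", "-1"])
    games = [tuple(int(v) for v in line.split())
             for line in lines[start + 1:end] if line.strip()]
    return lines[:start], games, lines[end + 1:]

def score(games):
    """Count (human wins, software wins, ties) over a list of games."""
    # Stone (1) beats Scissors (2), Scissors beats Paper (3), Paper beats Stone.
    human = sum(1 for h, s in games if (h, s) in {(1, 2), (2, 3), (3, 1)})
    ties = sum(1 for h, s in games if h == s)
    return human, len(games) - human - ties, ties

sample = """memo: made-up example, not package data
DATASTARTINGLINE
1 2
3 3
2 1
-1 -1
closing memo
"""
head, games, tail = parse_janken(sample)
print(games, score(games))
```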
A Perl script, 'batch.prl', is included. It can be modified to conduct
a systematic analysis of the data in this package.
As explained above, the 'announcing number' was always shown on the
screen. Some games were played with a 'post-it' note covering the
number. It is possible that some players attempted Ato-dashi in some
games.
---------------------------------------------------------------
Open Market Licence for data(version:OML-DT-E-1996)
0. In the following, 'Copyright_OML notice' means 'the copyright
notice with a statement saying that the said data is made
public under this Open Market Licence for data'.
1. On the condition that 'Copyright_OML notice' and attached
statements are copied as they are, all of the said data can
be copied or redistributed.
2. Any portion of the data can be extracted and used freely.
3. Appropriate reference must be made at times of publishing
results of the study of the data.
----------------------------------------------------------------