Replication of
Imperfect Public Monitoring with Costly Punishment: An Experimental Study
Ambrus, A. / Greiner, B. (2012)
American Economic Review 2012, 102(7): 3317–3332

Replication Authors:
Teck Hua Ho

Ambrus and Greiner conduct six treatments combining punishment severity and noise (imperfect monitoring) in repeated public goods games. There is a surprising, non-obvious U-shaped effect of punishment severity on net earnings (earnings after paying punishment costs). We will concentrate only on the basic effect of noise in the regular punishment condition.

Hypothesis to bet on:
When there is imperfect monitoring, allowing punishment reduces net earnings (i.e., earnings after punishment costs; a comparison of the Regular Punishment with noise treatment and the No Punishment with noise treatment).

Power Analysis

The focal effect is the sum of the “regular punishment” coefficient and the “noise * regular punishment” coefficient in the last column in Table 2 (p. 3324), an OLS regression of earnings in each period. Footnote 16 reports that this summed effect is significant at p<0.05 by an F-test (exact p-value=0.050 from authors). As this test is based on a regression including more than the two treatments that will be replicated, the authors provided regression results based on only the two treatments to be replicated. The p-value of the treatment effect in this regression is 0.057 (based on a t-test of the coefficient of the dummy variable for the treatment with regular punishment and noise).

The original sample size is 117 participants in two conditions (n=57 no punishment with noise; n=60 regular punishment with noise). To achieve 90% power the required sample size is 340 participants.
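As a rough cross-check of the 340 figure, the sketch below back-calculates a required sample size from the reported p-value using a normal approximation. It is an assumption about how such a number can be derived, not necessarily the calculation performed for this replication; the function name, the two-sided 5% significance level, and the assumption that the test statistic scales with the square root of the sample size are illustrative choices.

```python
import math
from statistics import NormalDist


def required_sample_size(p_value, n_original, power=0.90, alpha=0.05):
    """Normal-approximation sample size needed to detect the original effect.

    Assumes the test statistic grows with sqrt(n) and that the split between
    the two treatments stays the same as in the original sample.
    """
    std_normal = NormalDist()
    z_original = std_normal.inv_cdf(1 - p_value / 2)  # z implied by the two-sided p-value
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)       # critical value of the replication test
    z_power = std_normal.inv_cdf(power)               # quantile for the target power
    scale = ((z_alpha + z_power) / z_original) ** 2
    return math.ceil(n_original * scale)


n_required = required_sample_size(p_value=0.057, n_original=117)
print(n_required)
```

With the reported p-value of 0.057 and the original n of 117, this approximation gives 340, in line with the figure stated above.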


In the original sample, 339 subjects participated in 12 sessions, with between 24 and 30 subjects per session. The sample for replication consists of 340 participants from the National University of Singapore. There are no exclusion criteria.


We use the material of the original experiment (programmed in z-Tree) along with the original instructions which have been made available by the authors.


We follow the procedure of the original article, with only slight but unavoidable deviations as outlined below. The following summary of the experimental procedure is therefore based on the section “I Experimental Design” (pp. 3320–3321) in the original article.

Upon arrival, participants are seated in front of computers at desks that are separated by dividers. The experiment starts after participants have read the written instructions and completed a short comprehension test at the screen.

At the beginning, participants are randomly and anonymously matched to groups of 3 that stay constant over all 50 rounds. In each round, each of the 3 participants in a group is endowed with 20 tokens and is asked to contribute either all or none of these tokens to a group account. If the endowment is kept, it benefits the participant by 20 points, while if the endowment is contributed, it benefits each of the 3 group members by 0.5 × 20 = 10 points. After all group members have made their choices simultaneously, they are informed about the outcome of the game. In the noise treatments, only a “public record” of each group member’s choice is displayed. If a group member did not contribute, then the public record will always indicate “no contribution.” If the group member contributed, there is a 10 percent chance that the public record shows “no contribution” rather than “contribution.” Participants are fully informed about the structure of the noise.
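The contribution stage and the noisy public record described above can be sketched as follows. This is an illustrative sketch, not the original z-Tree program; the function and constant names are hypothetical.

```python
import random

ENDOWMENT = 20     # tokens per participant per round
MPCR = 0.5         # each contributed token pays 0.5 points to every group member
NOISE_PROB = 0.10  # chance that a contribution is recorded as "no contribution"


def stage_one_earnings(contributions):
    """Return each member's points, given a list of booleans (True = contributed)."""
    group_return = MPCR * ENDOWMENT * sum(contributions)
    return [group_return + (0 if c else ENDOWMENT) for c in contributions]


def public_record(contributions, rng=random):
    """Noisy record: a non-contribution is always shown truthfully, while a
    contribution is shown as "no contribution" with probability NOISE_PROB."""
    return [c and rng.random() >= NOISE_PROB for c in contributions]


earnings = stage_one_earnings([True, True, False])
record = public_record([True, True, False], random.Random(0))
```

In this example two members contribute and one keeps the endowment, so the contributors earn 20 points each and the non-contributor earns 40.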

In the punishment treatments, subjects participate in a second stage in each round. Here they are asked whether they want to assign up to five deduction points to the other two members of their group. In the regular punishment treatments, each assigned deduction point implies a reduction of three points from the punished group member’s income. Received punishment is capped at the earnings from the public goods game in the same round, i.e., the punishment deduction cannot exceed the within-round earnings. A punisher, however, always has to pay for the punishment points assigned.
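The cap on received punishment can be made concrete with a short sketch of per-round net earnings in the regular punishment treatment. Two points are assumptions not stated in this report: the cost to the punisher is set to one point per assigned deduction point, and the five-point limit is read as applying per punished member. All names are hypothetical.

```python
DEDUCTION_PER_POINT = 3  # regular punishment: each point removes 3 points from the target
MAX_POINTS = 5           # ASSUMPTION: the five-point limit is applied per target
COST_PER_POINT = 1       # ASSUMPTION: cost to the punisher is not stated in this report


def net_earnings(stage_one, assigned):
    """stage_one[i]: member i's earnings from the contribution stage.
    assigned[i][j]: deduction points that member i assigns to member j."""
    n = len(stage_one)
    for i in range(n):
        for j in range(n):
            if i != j and not 0 <= assigned[i][j] <= MAX_POINTS:
                raise ValueError("deduction points must lie between 0 and MAX_POINTS")
    result = []
    for i in range(n):
        received = sum(assigned[j][i] for j in range(n) if j != i)
        given = sum(assigned[i][j] for j in range(n) if j != i)
        # The deduction is capped at the member's stage-one earnings ...
        deduction = min(DEDUCTION_PER_POINT * received, stage_one[i])
        # ... but the cost of punishing others is always paid in full.
        result.append(stage_one[i] - deduction - COST_PER_POINT * given)
    return result


# A lone contributor (10 points) is punished by both other members (30 points each).
example = net_earnings([10, 30, 30], [[0, 0, 0], [5, 0, 0], [5, 0, 0]])
```

In this example the nominal deduction of 30 points is capped at the contributor's 10 points, while each punisher still pays the assumed cost for the five points assigned.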

At the end of the experiment, participants fill out a short demographic survey. They are then privately paid in cash based on their cumulated experimental earnings plus a show-up fee of AU$5.00 (average earnings were AU$28.94 per subject in the original study).


The analysis will be performed in the same way as in the original article, but based on a regression including only the two treatments in the replication (rather than all treatments as in the original article; see above). An OLS regression using per-round earnings (net of punishment costs and penalties) will be run, replicating the result in Table 2, rightmost column. The hypothesis will be tested with a t-test of the regression coefficient of the treatment dummy variable for the treatment with regular punishment and noise.
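The logic of this test can be illustrated with a minimal sketch of an OLS regression on a constant and a single treatment dummy. With one dummy regressor, the coefficient equals the difference in mean earnings between the two treatments. The sketch uses plain pooled-variance standard errors and ignores any clustering or further controls the original specification may include; the function name and the toy numbers are hypothetical and are not data from either study.

```python
import math
from statistics import mean


def treatment_t_test(control, treated):
    """OLS of earnings on a constant and a treatment dummy.

    Returns the dummy coefficient, its standard error, and the t-statistic.
    """
    n0, n1 = len(control), len(treated)
    b0 = mean(control)       # intercept: mean in the no-punishment treatment
    b1 = mean(treated) - b0  # dummy coefficient: effect of allowing punishment
    rss = sum((y - b0) ** 2 for y in control)
    rss += sum((y - b0 - b1) ** 2 for y in treated)
    sigma2 = rss / (n0 + n1 - 2)  # residual variance with two estimated parameters
    se = math.sqrt(sigma2 * (1 / n0 + 1 / n1))
    return b1, se, b1 / se


# Hypothetical toy earnings, for illustration only.
coef, se, t_stat = treatment_t_test([30, 28, 32, 30], [26, 24, 28, 26])
```

A negative coefficient with a sufficiently large absolute t-statistic would correspond to the hypothesis that allowing punishment reduces net earnings under noise.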

Differences from Original Study

The replication procedure is identical to that of the original study, with some unavoidable deviations. This replication will be performed at the National University of Singapore in 2015, while the original data were gathered at the Australian School of Business Experimental Research Laboratory at the University of New South Wales in February and March 2010 and 2011. The experiment will be in English, as in the original study.