(This paper is a major revision of http://works.bepress.com/chris_lloyd/15/.) Standard first order P-values suffer from two important drawbacks. First, even for quite large sample sizes they can misrepresent the exact significance, which depends on nuisance parameters unspecified under the null. For most discrete models, their accuracy is variable and breaks down completely at the boundary. Second, different test statistics can give practically different results.
The bootstrap P-value is the exact significance with the null maximum likelihood (ML) estimate of the nuisance parameter substituted. We show that bootstrap P-values based on different first order statistics differ only to second order. We also show that they are appropriately conservative on the boundary. We present numerical evidence that for discrete models mean inferential errors are of O(1/m) rather than the O(1/m^(3/2)) that holds for continuous models. However, worst case errors for bootstrap P-values are an order of magnitude smaller than for first order P-values, even for small sample sizes. This is partly due to boundary effects but may also be related to the way the bootstrap corrects `incorrect' ordering of the sample space. The argument for this feature of the bootstrap only holds when the null ML estimate is used.
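As a purely illustrative sketch of the definition above (not the paper's own code), consider testing equality of two binomial proportions, where the common success probability is the nuisance parameter. The two-sample setting, the pooled z statistic, and all function names below are assumptions chosen for illustration. The exact significance is computed by enumerating the sample space at a given nuisance value, and the bootstrap P-value evaluates it at the null ML estimate.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """Binomial probability P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def z_stat(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 (illustrative choice of statistic).

    Large values are evidence that p2 > p1. Returns 0 when the pooled
    variance estimate is zero (all successes or all failures).
    """
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if var == 0:
        return 0.0
    return (x2 / n2 - x1 / n1) / sqrt(var)


def exact_significance(p, x1, n1, x2, n2, stat=z_stat):
    """P(T >= t_obs) under the null, with the nuisance parameter fixed at p."""
    t_obs = stat(x1, n1, x2, n2)
    total = 0.0
    for y1 in range(n1 + 1):
        for y2 in range(n2 + 1):
            # Small tolerance so ties with t_obs are counted despite rounding.
            if stat(y1, n1, y2, n2) >= t_obs - 1e-12:
                total += binom_pmf(y1, n1, p) * binom_pmf(y2, n2, p)
    return total


def bootstrap_p_value(x1, n1, x2, n2, stat=z_stat):
    """Exact significance evaluated at the null ML estimate of p."""
    p_hat = (x1 + x2) / (n1 + n2)  # ML estimate of the common p under H0
    return exact_significance(p_hat, x1, n1, x2, n2, stat)


if __name__ == "__main__":
    print(bootstrap_p_value(2, 10, 8, 10))
    print(bootstrap_p_value(0, 10, 0, 10))  # boundary: null ML estimate is 0
```

In this sketch, a sample on the boundary (no successes in either group) gives a null ML estimate of zero, so all probability mass sits on the observed point and the resulting P-value is 1. Passing a different `stat` function gives the bootstrap P-value based on another first order statistic.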
Keywords: nuisance parameters; exact test; tests of independence; r-star; bootstrap
Available at: http://works.bepress.com/chris_lloyd/17/