Barney's Stats Problem

Given a set of N distinct values and a counter c initialized to zero, inspect one value at random and increment c. Repeat.

Answers to the questions below should be formulas expressed in terms of N and c, along with any other variables defined for the specific question. Assume each inspection picks uniformly at random from all N values, independently of earlier inspections.

For a given c, what is the probability P(c) that at least one repeat value has been inspected?

    P(1) = 0
    P(2) = P(1) + (1 - P(1)) * 1 / N = 1 / N
    P(3) = P(2) + (1 - P(2)) * 2 / N  // the 2 / N chance only applies when c=2 was not already a dupe
    P(c) = P(c - 1) + (1 - P(c - 1)) * (c - 1) / N
    P(N + 1) = 1
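
One way to check P(c) numerically: the chance of seeing no repeat in c inspections is the product of (1 - k / N) for k = 0 .. c - 1, and P(c) is one minus that product. The sketch below is only an illustration; the function name prob_repeat is made up here, and it uses exact fractions to avoid rounding.

```python
from fractions import Fraction


def prob_repeat(n: int, c: int) -> Fraction:
    """Probability that c uniform inspections of n values include a repeat."""
    if c > n:
        # Pigeonhole: more inspections than values forces a repeat.
        return Fraction(1)
    no_repeat = Fraction(1)
    for k in range(c):
        # Inspection k + 1 must avoid the k values already seen.
        no_repeat *= Fraction(n - k, n)
    return 1 - no_repeat


if __name__ == "__main__":
    for c in (1, 2, 3, 11):
        print(c, prob_repeat(10, c))
```

With N = 365 this is the classic birthday problem, where P(23) is the first value above one half.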

At what c have x duplicate values been inspected (primarily where x = 1)?
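
For x = 1, one possible reading is "the expected c at which the first repeat turns up". If Q(c) is the chance of no repeat in the first c inspections, that expectation is the sum of Q(c) for c = 0 .. N. The sketch below assumes this reading and also reports the smallest c with P(c) >= 1/2. Both function names are hypothetical, and x > 1 is not covered.

```python
def expected_first_repeat(n: int) -> float:
    """Expected value of c at the moment the first repeat is inspected."""
    total = 0.0
    no_repeat = 1.0  # Q(0): zero inspections cannot contain a repeat
    for c in range(n + 1):
        total += no_repeat            # add Q(c)
        no_repeat *= (n - c) / n      # Q(c + 1) = Q(c) * (1 - c / n)
    return total


def median_first_repeat(n: int) -> int:
    """Smallest c for which a repeat is at least as likely as not."""
    no_repeat = 1.0
    c = 0
    while no_repeat > 0.5:
        no_repeat *= (n - c) / n
        c += 1
    return c


if __name__ == "__main__":
    for n in (10, 50, 100, 250, 1000):
        print(n, round(expected_first_repeat(n), 2), median_first_repeat(n))
```

For large N the expectation grows roughly like sqrt(pi * N / 2), so it scales with the square root of N rather than with N itself.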

For a given c, how many duplicates have been inspected?
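
If "duplicates" is taken to mean inspections that returned an already-seen value (an assumption, since the question does not define it), the expected count is c minus the expected number of distinct values seen. Each value is missed by all c inspections with probability (1 - 1 / N)^c, so the expected number of distinct values is N * (1 - (1 - 1 / N)^c). A small sketch with hypothetical function names:

```python
def expected_distinct(n: int, c: int) -> float:
    """Expected number of distinct values seen after c inspections."""
    return n * (1.0 - (1.0 - 1.0 / n) ** c)


def expected_duplicates(n: int, c: int) -> float:
    """Expected number of inspections that hit an already-seen value."""
    return c - expected_distinct(n, c)


if __name__ == "__main__":
    for c in (1, 2, 10, 100):
        print(c, round(expected_duplicates(100, c), 3))
```

As a sanity check, expected_duplicates(N, 2) works out to 1 / N, which matches P(2).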

What c is required to inspect x% of values?
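
One reading of this question is the expected c needed to see m = ceil(x * N / 100) distinct values. Once i values have been seen, a new one appears with probability (N - i) / N, so that step takes N / (N - i) inspections on average, and the answer is the sum over i = 0 .. m - 1. The sketch below assumes that reading; the function name is illustrative only.

```python
import math
from fractions import Fraction


def expected_draws_for_percent(n: int, x_percent: float) -> float:
    """Expected inspections needed to see at least x_percent of n values."""
    # Exact arithmetic for the target count avoids float surprises in ceil().
    target = math.ceil(Fraction(str(x_percent)) * n / 100)
    total = 0.0
    for seen in range(target):
        # With `seen` values already found, a new one shows up with
        # probability (n - seen) / n on each inspection.
        total += n / (n - seen)
    return total


if __name__ == "__main__":
    for x in (50, 90, 100):
        print(x, round(expected_draws_for_percent(100, x), 1))
```

A rougher estimate when x < 100 is c ≈ -N * ln(1 - x / 100), which comes from solving 1 - (1 - 1 / N)^c = x / 100 for large N.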


Everything is based on several runs of the scenario with different N values:

N       # Runs
10      200
50      200
100     515
250     200
1000    214
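
The original run procedure is not described, so the following is only a guess at what a single run could look like: inspect values at random until the first repeat and record c at that point. The run counts are copied from the table above; the function names and the seed are made up for the sketch.

```python
import random
from statistics import mean


def first_repeat_count(n: int, rng: random.Random) -> int:
    """Run the scenario once; return c when the first repeat is inspected."""
    seen = set()
    c = 0
    while True:
        value = rng.randrange(n)  # uniform over the n values
        c += 1
        if value in seen:
            return c
        seen.add(value)


def simulate(n: int, runs: int, seed: int = 0) -> list[int]:
    """Repeat the scenario `runs` times with a seeded generator."""
    rng = random.Random(seed)
    return [first_repeat_count(n, rng) for _ in range(runs)]


if __name__ == "__main__":
    for n, runs in [(10, 200), (50, 200), (100, 515), (250, 200), (1000, 214)]:
        print(n, runs, round(mean(simulate(n, runs)), 2))
```

Comparing the printed means against a formula for the expected first repeat is a quick way to tell whether a candidate formula is plausible.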