DelphiCloud - Elicitation Method


Date: 8-23-2018

Author: Eric Pacuit (based on notes from Jan-Willem Romeijn)

Initial Judgement


A user is invited to answer a question. The question can one of three types: 1. chance: e.g., what is the chance that human will land on Mars by 2024? 2. percentage: e.g., what percent of the US population will vote for Donald Trump in 2020? 3. quantity: e.g., in two weeks, what will the value of bitcoin be in US dollars?

Assumption 1: We assume that the answers to all questions are bounded. For chance and quantity, must be between @@0@@ and @@1@@. For quantity questions, the minimum and maximum are provided by the creator of the question. Let @@2@@ be the minimum possible answer to the question and @@3@@ the maximum answer to a question.

Elicitation Form: The user answers the questions in 4 steps:

  • Initial judgement: The first step is to ask the user to submit the following four numbers:

    • lower_estimate, denoted @@4@@: The value that the user expects to be the lowest answer to the question
    • upper_estimate, denoted @@5@@: The value that the user expects to be the greatest answer to the question
    • best_guess, denoted @@6@@: The user's best estimate regarding the answer to the question
    • confidence, denoted @@7@@: The user's estimated probability that the true value is between the @@8@@ and @@9@@

    Assumption 2: The form ensures that:

    • @@10@@, @@11@@ and @@12@@
    • @@13@@ and @@14@@
    • if @@15@@, then @@16@@
    • @@17@@
  • Develop step: We ask an additional question to fill out the user's judgement. There are two cases:

    • Case 1: The submitted judgement is flush to the left or flush to the right in the interval @@18@@. This would happen when @@19@@ or @@20@@. In this case, we do not ask for any additional input from the user.
    • Case 2: The submitted judgement is strictly contained in the interval @@21@@. This happens when @@22@@. In this case, we select an interval @@23@@ or @@24@@ (whichever one is the widest) and ask the user to provide a probability (the subinterval_probability, denoted @@25@@) for this interval: The user's estimated probability that the true value falls in this interval.

    Assumption 3: The user's probabilities are coherent and, when plotting the step function generated by the inputs, most of the probability mass os over the interval containing the best guess.

  • The user submits an explanation of their judgements.

  • The user submits an explanation of the evidence in the group of people answering the question.

Note: Only the first two steps are relevant to the elicitation method. The next interactive graphs demonstrates the initial judgements.

Loading output library...

The problem


We need to find a distribution to fit the initial judgement submitted by a user during the elicitation procedure defined above. The initial judgement provided by the user can be described as follows:

  • @@0@@ with @@1@@
  • @@2@@, where @@3@@, @@4@@ is the probability assigned to the interval @@5@@, @@6@@ is the probability assigned to the interval @@7@@ and @@8@@ is the probability assigned to the interval @@9@@ subject to the constraints:
    • @@10@@

  • @@11@@ with @@12@@

We are looking for a truncated skewnormal over the interval @@13@@. A skewnormal is a density function @@14@@ made up of two Gaussians with the same mean @@15@@ and with different standard deviations @@16@@:

  • @@17@@ below)

Since the normals and hence the skewnormals do not have an analytic expression for the @@18@@, we have to find the solution using an optimization procedure.



This section contains our implementation of a skewnormal, truncated skewnormal and the cdf for the skewnormal. Consult Kenny Easwaran's notes on skewnormal distributions for more information.

We first define our skewnormal functions. The function skewnormal_pdf is the pdf given the location (i.e., mean), scale and shape (i.e., skewness). We also define a wrapper function dc_skewnormal_pdf that gives the pdf given the location and the standard deviations of the two normal distributions that are merged to form a single skewnormal pdf.

Loading output library...

The problem with the skewnormal distribution is that the range is @@0@@ and we our distributions must be bounded. This means we need a truncated skewnormal (take the probability from outside the bounds and shift it to all the probability mass over the range @@1@@).

Loading output library...

There is no analytic solution


First, a bit of notation. The error function (@@0@@) is defined as follows (this function was already used in the definition of skewnormal_prob): @@1@@

The function @@2@@ (essentially a look-up table) can be used to quickly determine the cumulative distribution for a standard Gaussian (with mean 0 and standard deviation 1):


It can be used to determine the cumulative distribution over an arbitrary interval @@4@@, and for any other Gaussian distribution as well.

By transforming away the skewness and transforming the constraints accordingly (further details below), we can cast the constraint satisfaction problem entirely in terms of @@5@@'s. However, this function can only be used to determine the cumulative distribution for a given interval @@6@@ and for given values of the parameters of the skewnormal. Because it is not itself an analytic expression, it cannot be rewritten into a function that returns values for those parameters on the basis of the value of the cdf over a specified interval.

There is a look-up table for the inverse @@7@@, @@8@@, and we can only determine that if we have the parameters that we are trying to find.

In short, there are difficulties with determining the parameters @@9@@ and @@10@@ analytically in terms of @@11@@ and @@12@@ function in an efficient optimization procedure.

Loading output library...
Loading output library...

The optimization procecure


We first transform the problem into one that is couched entirely in @@0@@'s. Our target is @@1@@.

We translate @@2@@ towards @@3@@, and define the new subdomains @@4@@ and @@5@@ with

  • @@6@@

and new boundaries:

  • @@7@@

The constraints are transformed accordingly:

  • @@8@@

Over the new domain the transformed function is a Gaussian distribution with a mean at 0 and a standard deviation of 1. The cumulative distribution function then is the look-up table @@9@@, with the constraints @@10@@

Notice that the values of @@11@@ and @@12@@ fully dictate the integration boundaries for these constraints. We have fixed


so that


The integration boundaries @@15@@ and @@16@@ are the new targets of our optimization problem: our goal is to find their values, such that the constraints are met. We can then transform these values back into values for @@17@@ in the skewnormal distribution.

The code in the next cell displays a skewnormal and the truncated Gaussian on the interval @@18@@.

Loading output library...

The search itself is done in five stages, using parameters @@0@@ and @@1@@ to approach @@2@@ and @@3@@ respectively. We direct the search by retaining the adequate ratio between the functions @@4@@ in a commensurate way (see stage (4)).

  • Choose @@5@@ low, so that @@6@@.

  • Look for @@7@@ such that @@8@@, aimed at finding the correct ratio between the ERF's.

  • We now calculate the corresponding values for @@9@@. If this deviation passes a certain treshold, we might return to stage (2).

  • We nudge @@10@@ upwards by a small amount @@11@@.

  • After having nudged @@12@@ and for the derivatives of the ERF's, and we loop back to stage (3).

Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...
Loading output library...



1. Choose @@0@@ low, so that @@1@@:


Question: Is there a better algorithm we can use here? I'm always worried when hard-coding in a number like this. Surely, -4 won't work in all cases? Here's the code we are currently using:

def initalize_left_guess(left_probability, xmin, xmax):
    """Initialize the guess for sigma_l
    Currently, we are guessing 4 standard deviations 
    return float(-4.5)

Note that I am sending in @@2@@, @@3@@ and @@4@@ since my intuition is that the initial guess should depend on these values, but we can send in anything we want.

2. Look for @@5@@ such that @@6@@, aimed at finding the correct ratio between the ERF's.


This is working pretty good (at leat in my example above. And also with various other tests I've done. Here is the code to do this:

def compare_ratios(x, alpha_l, alpha_r, v, r_ini, xmin, xmax, best):
    """ Optimization objective. Looking for w such that ERF_u_w / ERF_u_v is equal to R_v. """
    return (erf(float(x)) - erf(float(x)*alpha_r)) - ((erf(v) - erf(v)*alpha_l)) * r_ini * (x/(xmax - best))/(v/(xmin - best))

def right_guess_finder(v, params):
    Given left_guess, return the right_guess that  w that satisfies the constraint that: 
     $\frac{ERF_{r}(w)}{ERF_{l}(v)} = R_{v}$ with $R_{v} = \frac{U (w /(xmax - b))}{L (v / (xmin - b))}$.
    # TODO: Why  is the initial guess at 100?  
    x0 = float(100.0) #initial guess for w 
    sol = minimize(compare_ratios, x0, args=(float(params['alpha_l']),float(params['alpha_r']), 
                                             float(v), float(params['r_ini']), 
                                             float(params['xmin']), float(params['xmax']), float(params['best'])), 
                   method='SLSQP', bounds=[(3, None)], options ={'disp':True})
    w = float(sol.x)
    return w

Question 1: Why initialize x0 at 100.0? Is the idea that 100 standard deviations away is going to be far enough away that we are sure not to miss the true value?

Question 2: We have @@7@@ in the optimization procedure. This means that the minimum value for @@8@@ is 3. I don't understand why that should be true...why is it that this initial procedure will never have a value of @@9@@ lower than 3, is there a mathematical reason for this? When running experiments I notice that very often we end up with @@10@@, but that doesn't seem correct...why would it be that the optimal @@11@@ and @@12@@ always turns out to be @@13@@ and @@14@@. One thing that is true is that we can't have @@15@@.

Question 3: In the optimiziation, we are meant to: Look for @@16@@ such that @@17@@ such that: @@18@@

(This always raises a question about whether, given any v, xmin, xmax, best_guess, lower_estimate, upper_estimate, there always is such a @@19@@ and if there is a unique smallest, but these questions are super important since the optimization code will find the first such @@20@@). Looking at the compare_ratios function, this is what calculation is actually be minimized:

(erf(float(x)) - erf(float(x)*alpha_r)) - ((erf(v) - erf(v)*alpha_l)) * r_ini * (x/(xmax - best))/(v/(xmin - best))

It's hard to parse the parantheses, but this is attempting to minimizing the following: @@21@@

But one thing I don't understand is that we are using the math.erf function here rather thatn @@22@@ @@23@@ as JW uses in the description above. We have determined that these functions should be:

def erf_l(v, b, l, xmin):
    alpha_l = float(b - l) / (b-xmin)
    return (erf(alpha_l*v)-erf(v))*(v /(xmin - b))

def erf_r(w, b, u, xmax):
    alpha_r = float(u - b) / (xmax - b)
    return (erf(w) - erf(alpha_r*w))*(w /(xmax - b))

So, shouldn't the return statment of the compare_ratios function be instead:

erf_r(x) - erf_l(v) * r_ini * (x/(xmax - best))/(v/(xmin - best))

Of course, it might be mathematically equivalent, but I want to verify that is actually the case (sorry if this is pedantic, when I'm stuck on some bit of code, I always go back to basics and ask myself very basic questions about the underlying mathematics, double checking every step).

3. We now calculate the corresponding values for @@24@@. If this deviation passes a certain treshold, we might return to stage (2).


Quesiton 1: I'm confused about how we are calculating @@25@@ here. We are meant to use the above erf_r and erf_l functions, correct? If so, then I don't understand the results I'm seeing. After the initial optimization using the example above we have:

Initial optimization: v = -4.5, w= 4.45942548416

Initial estimate left probability = 0.508422799921
Initial estimate right probability = 0.455908008861

But these probability can't be correct, since they sum to 0.963.... and the initial left/right probabilities were 0.1. I also have situations where the sum of these numbers is greater than 1...something is not clicking for me here.

Question 2: Part of the description says Depending on the step sizes @@26@@. If this deviation passes a certain treshold, we might return to stage (2). I don't see anywhere in the code where we are doing this. Two quesitons about this:

  • What threshold do we use to signal that the deviation is "too far away from @@27@@"?
  • What exactly does it mean to "we might return to stage (2)." If we go back to stage (2) with the same initial guess for @@28@@, then we are just going to redo the same calculations and get the same initial guess for @@29@@. I take it that we need to do a different initial guess for @@30@@. How should we do this?

4. We nudge @@31@@ upwards by a small amount @@32@@ and for the derivatives of the ERF's, and we loop back to stage (3).


The code to do this is described above in the w_optimizer function. One problem I can see that we are having is the following: The test to break out of the loop is:

abs(xup - erf_r(w, best, upper, xmax)) <= threshold and abs(xlow - erf_l(v, best, lower, xmin)) <= threshold

If we start the optimization with efr_l(v, best, upper, xmax) >> xlow (as I found in my experiments). Now, if erf_l(v, best, lower, xmin) increases as v shifts to the right, then this test will never be true.