Thursday, January 29, 2015

Bayesian Network Reconstruction Using Systems Genetics Data: Comparison of MCMC Methods

Several former colleagues and I collaborated on a paper that was finally published in Genetics on January 28.  You can read it for free online here.

I won't regurgitate the entire work here, but I did want to take a moment to say why this research is important for many fields, not just genetics.  In fact, the Bayesian network reconstruction problem arises in numerous places including finance, machine learning, and crystallography.  The premise of it is this:  suppose you had a black box that, based on some inputs that are supplied to it, would perform a possibly random calculation and then give some results back to you.  Now it is your job to figure out how the machine works.  How are the outputs related to the inputs, and more importantly what is the underlying calculation that happens to produce these outputs?  Keep in mind that this machine could be doing nearly anything under the hood:  performing entirely different calculations based on one input, rolling dice and giving a random result, or throwing your inputs out entirely.

To computational biologists right now, the cell is such a black box.  We are able to measure a lot of things about it and do a lot of things to it, but we are unable to actually look at all of the things that are happening inside a cell in real time.  Not only are they happening at really fast timescales, we don't have microscopes capable of watching all of the relevant interactions that happen inside a cell at once, for example DNA unwinding, RNA synthesis, ribosomes making proteins, protein folding, protein-protein and extra-cellular interactions.  As a result, we are left measuring things such as single-nucleotide polymorphism and expression data and then trying to infer how the genes in the cell worked after the fact.

How we reconstruct a cell's regulatory network is remarkably naive:   we construct a completely random network, tweak it to make it more plausible (e.g., simulated annealing or gradient descent), and then repeat this many many times.  Finally, we propose that the links in the resulting networks that appeared most often are likely to be the true connections in the real network.

This isn't a perfect approach, but it has yielded new insights in the past.  Perhaps someday someone will come up with a greatly improved approach to reconstructing Bayesian networks, and it will be a boon to technology and mankind worthy of a Nobel prize.

Saturday, January 10, 2015

Nash Equilibrium in the Sealed-Bid Senior Auction

At work we've been thinking a lot about auction theory, and I was challenged to solve for the Nash equilibrium in the sealed-bid senior auction.  I found the solution and I think the result is pretty interesting, but before I get to that point I want to give some background about what an auction is and this scenario in particular.

An sealed-bid auction is in its purest sense a game played between a number of bidders who all would like an item and would like to pay as little as possible for it to maximize their profit.  Unlike an auction like eBay where the current hammer price is public knowledge, in the sealed-bid auction every bidder will write down how much they want to bid for the item.  The auctioneer will then collect all of these secret bids and select the highest bid to be the winner1.  Nobody has a chance to change their bid after they pass the auctioneer their secret bid.

It's crucial to understand what a Nash Equilibrium is, and why it's important.  Simply put, a Nash equilibrium strategy is a strategy such that if it was known that all other bidders were following this strategy, then it would not be optimal to deviate from the strategy.  A kind of Nash equilibrium is a dominant strategy, where all bidders should follow the dominant strategy regardless of what the other players were doing.

Let's start with a familiar example of a dominant strategy:  the second-price auction.  In a second-price auction, the highest bidder wins but only pays as much as the second-highest bidder.  An example of this is an eBay auction.  In a second-price auction, the dominant strategy is to bid your valuation v of the item at auction.  In the case where you lose the auction, you pay nothing and gain nothing.  In the case where you win the auction, you get the item (with value v) and you pay the second-highest bidder's bid (which we will call b for now).  Your profit is then v - b, and is maximized because if you had bid less than b then you wouldn't have won the auction, if you bid between b and v then you are indifferent because you would have paid b and made profit v - b anyway, and if you had bid more than v you might have actually lost money because you would have paid more than the item is worth to you if the second bidder had bid higher than v.  Thus, no matter what, even if you knew that everyone else was going to bid zero, you would still bid your valuation of the item in the second-price auction.

The first-price sealed-bid auction, on the other hand, does not have a dominant strategy.  In a first-price sealed-bid auction, the highest bid wins the item and pays their bid exactly.  In this case, bidding my valuation v is actually not a good idea:  I will never make a profit!  If I lose, I get nothing, and if I win, my profit is my valuation v minus my bid v, which is zero!

In fact, in the first-price sealed-bid auction, the best strategy actually depends on the distribution of valuations of the item in question.  That is, if I thought everyone valued the item between $0 and $1 with a uniform distribution, then the optimal strategy would be different than if everyone valued the item in a normal distribution around $0.50.  It becomes even more complicated in the case where I am uncertain in my appraisal of the item, in which case I must consider the winner's curse!

I will sidestep all of this complexity by asserting that in every case I study below, the bidders all know that every bidder's valuation of the item is selected from a uniform distribution between 0 and 1, and that every bidder knows how many other bidders there are.  The solutions below are only applicable given this assertion.

In the first-price sealed-bid auction, the ideal bid would be only slightly higher than the second place bidder's bid.  Naively, I will make the zero-order guess that all bidders will actually bid their valuation except for the one who actually does the math.  Now I'd like to know the bid that maximizes this privileged bidder's profit, which is the maximum of the expectation value of profit given a particular bid b, valuation v and number of bidders N.  So,

Equation 1:  Expected profit in the uniform-distribution first-price sealed-bid auction assuming that all bidders except one privileged bidder bid their value.

The first term is the profit if the privileged bidder wins with bid b, and the second term is the probability that all other bidders had valuations and bids less than b.  The expected profit is maximized if the first derivative is zero2

Equation 2:  Solving for the the maximum expected profit in the uniform-distribution first-price sealed-bid auction, if all bidders but one bid their valuation

It is now straightforward to solve for the optimal bidding strategy.

Equation 3:  I will show that this is the Nash equilibrium strategy in the uniform-distribution first-price sealed-bid auction.

I made a crucial assumption above:  that every other player would bid their valuation.  A strategy is only a Nash equilibrium if the strategy would be optimal given that everyone else is using the strategy.  If everyone were using this bidding strategy, would it be worthwhile to deviate?  To find out, I can substitute this bidding strategy into the profit term in equation 1 and solve again.  Now, all of the other bidders are using the strategy in Equation 3.  Can our privileged bidder do better?  The first term in Equation 1 remains the same:  the profit will just be his value minus his bid.  However, the probability that a bidder wins with this bid has changed because everyone else's value could actually be somewhat higher, v = N b / (N - 1).

Equation 4:  Solving for the maximum profit in the uniform-distribution first-price sealed-bid auction, if all other players bid the Nash equilibrium.

Just like in equation 2, I can solve for the new optimal strategy.  It should be obvious that the optimal bid, which is just the solution to

is just the same as equation 3.  Therefore, equation 3 is the Nash equilibrium.

I can now calculate the winner's expected profit in equilibrium by substituting equation 3 into equation 4, yielding

Equation 5:  The expected profit given that all players bid rationally in the uniform-distribution first-price sealed-bid auction.

It turns out that equation 5 is something more universal than might be expected.  The Revenue Equivalence Theorem tells us that as long as we set the rules of the auction so that the highest bidder wins the item and a bidder with value 0 has an expected profit of 0, all sealed-bid uniform-distribution auctions will have the same expected profit as in equation 5!

For example, let's consider an all-pay auction.  In an all-pay auction, every bidder pays their bid but only the highest bidder goes home with the item.  Obviously, a bidder should bid less than in the first-price auction above because their risk is higher: a bidder will always pay his bid but may not actually get the item to show for it.  As long as a bidder's valuation is more than zero, then he should bid more than zero because there is a small chance that he could get a spectacular deal on the item!  How much, exactly, should a bidder bid?

I'll solve this with the revenue equivalence theorem.  The revenue equivalence theorem tells me that the expected profit must be the same as in Equation 5, because the highest bidder still wins the auction and a bidder with the lowest valuation, 0, will simply walk away from the auction neither profiting nor paying.  Therefore, equation to solve is:

The left hand side of the equation comes from the revenue equivalence theorem.  The right hand side is justified as follows:  if a bidder has the highest valuation v (and therefore the highest bid), then he gets the item and recovers value v.  All bidders always pay their bid, b, in the all-pay auction so that value is simply subtracted.   This is trivially solved:

Equation 6:  The Nash equilibrium bid in the uniform-distribution all-pay sealed-bid auction.

Now I am ready to tackle the original challenge:  the Senior Auction.  In the Senior Auction, the first-place and second-place bidders both pay, but nobody else does.  Like all other auctions discussed here, the highest bidder will still be the one who takes home the item.  A rational bidder should bid at least as much in the all-pay auction because their risk is lower, and should bid less than in the first-price auction because their risk is higher.  Let's find out exactly how much using the Revenue Equivalence Theorem.

This should be fairly straightforward to solve.  The profit for the highest bidder is v - b, and the cost for the second-highest bidder is their bid b.  Thus, I have:

Which is just

This is straightforward to solve for b in terms of v:
Equation 7:  The Nash equilibrium bid in the uniform-distribution sealed-bid senior auction.

For example, let's take a look at the Nash equilibrium bid in the first-price, all-pay and senior auction for 10 bidders.

Figure 1:  Nash equilibrium bid as a function of valuation in three different types of auction, given 10 bidders with uniform valuations on [0,1]. 

Figure 1 shows two things that give us confidence in the result from equation 7.  A rational bidder always bids at least as high in the senior auction as in the all pay auction, and never bids more than in the first price auction.  This makes sense because the bidder's risk in the senior auction is higher than in the first price auction, but lower than that in all pay.  Another comforting observation in equation 7 is that for N = 2, the result is exactly the same as in the all pay auction as would be expected.

I'd like to verify that this is actually a Nash equilibrium by inverting equation 7, substituting back into the expected profit function and taking the first derivative, but that is algebraically cumbersome.  Instead, I will employ a computer algebra system, wxMaxima, and verify that the solution is a Nash equilibrium for some chosen values.  In my Maxima notebook (download here), I require both bid(v, B), which is the putative equilibrium bid given valuation v and number of bidders B and value(b, B), which inverts bid(v, B) with respect to valuation. The function ExpectedWinnings(b, v, N, B) gives the expected winnings in a top-N-pay auction given bid b, valuation v, N the number of top bidders who pay their bid, and the number of bidders B.  The 
expression solve(diff(ExpectedWinningsAllPay(b, v, B), b), b) determines the optimal bid b by solving 

given that all other bidders engage in the strategy prescribed by bid and value.  If the result is the same as bid, then bid is a Nash equilibrium because there is nothing to be gained from deviating from this strategy given that everyone else is using it.

If you execute my wxMaxima notebook yourself, you will see that Equation 7 is indeed a Nash equilibrium for the chosen valuation and number of bidders, so it is empirically true that Equation 7 is the Nash equilibrium for the uniform-distribution sealed-bid senior auction.

1 Not all kinds of sealed-bid auctions give the item to the highest bidder, and not all kinds of sealed-bid auctions require that each bidder places only one bid.  For all examples here, the highest bidder wins.  Interested parties might read about unique bid auctions.

2 Actually, it may be maximized or minimized. A more thorough person would check the second derivative to determine whether he found a local maximum or a local minimum. This is an exercise left to the reader.