Tuesday, April 24, 2012

What Science Is

After much discussion about crowdsourcing and democratization of science during the Sage Congress, I feel compelled to commit to writing a revelation--or more accurately a recovered memory--about what science is.

When you are a student studying for exams or preparing your thesis, it's easy to believe that science is about mathematical prowess, or programming ability, or some raw talent of the kind Newton or Gauss or Bernoulli had. If you're a grad student, an undergrad, or a high school student considering a degree in science, you must always keep in mind that this is not the case. These are important skills to have, but they are no more what science is than a powerful engine is what a car is.

Science is first about asking the right questions and second about managing precious resources: your lab, your time, your grant money. This is odd, because these are generally not the things you are graded on, and most scientists have unfortunately never had a class in business. If you want to be a good scientist, when you're sitting in class you should focus more on asking the best questions about the material (even if only to yourself) and less on how the material will help you finish your next homework assignment.

So our curricula, which are supposed to select for the best scientists, actually do very little to cultivate or test for the skills that matter most in a scientist. The result is a society with vast misconceptions about what science is.

I recovered this memory after some discussions about crowdsourcing. In crowdsourcing projects such as FoldIt, EteRNA, the Netflix Challenge, TopCoder Marathon Matches, or the Mechanical Turk, a problem is presented to the internet at large in the hope that someone, or some group, can solve it when it is otherwise too difficult for the researchers. Surprisingly, in projects like FoldIt, EteRNA, or TopCoder, while there are tens of thousands of players, only thirty or so provide truly useful results. Put another way, while boasting an impressive draw, these crowdsourcing projects are not really about getting work from the crowd at large but about finding the very best minds to work on your problem. These minds are far better at solving the problem than even the researcher was--but they are not necessarily better scientists! The scientist's job was to pose the question in the first place, and in this regard the crowd is really more of a special-purpose computer. Nonetheless, our schools are teaching putative scientists how to be better at solving problems instead of teaching us how to pose better questions, which is probably an artifact of an age before Google, the personal computer, and instant globalization through the internet.

Some crowdsourcing projects, like FoldIt, EteRNA, and TopCoder, do indeed seek only to find the very best individuals--and as such, the reward structure of TopCoder at least makes sense. When you want just one best solution, it is absolutely logical to give prizes and recognition only to the best competitors.

Figure 1:  Reward Structure in TopCoder Marathon Matches.
Oligarchical utility, oligarchical rewards.

Further, the reward structure of Mechanical Turk also makes sense. When the work itself is intrinsically valuable, it is reasonable to reward everyone just the same for any work they do.

Figure 2:  Reward Structure in Mechanical Turk.
Democratic utility, democratic rewards.

Finally, I ask myself whether science is more like TopCoder, where only the top performers have much utility, or like Mechanical Turk, where the work done has intrinsic value as long as it is done to standard. In fact, most intelligent questions, when pursued, will not yield Earth-shatteringly important results. That is what science is, though: the systematic testing of hypotheses. Most will fail, but knowledge of what doesn't work is still knowledge, and that is what matters. For example, consider kinases as chemotherapy targets.

There are hundreds of different kinases known to exist in the human body.  Most of them are probably not ideal chemotherapy targets, but knowing which ones are could save thousands of lives.  So, we study each one equally to try to find which one to target, right?  

Wrong.

In fact, just a few dozen kinases are the focus of the vast majority of research, with interest falling off along something like an exponentially decreasing curve (I will cite the appropriate Sage Congress Unplugged talk when it's posted!). Why? Because some rock-star scientist at some point thought a particular kinase was important, the good funding money went with him, and all of the bread-and-butter scientists followed. So, was the rock-star scientist someone deeply brilliant and insightful, like Einstein or Heisenberg?

Absolutely! ...but that was only one factor in his success. There are many others like him whose only fault was to be unlucky.

In fact, those wishing to pursue new and unique avenues of research are frequently ignored because they don't fit in with what everyone else is doing. Stanley Prusiner of prion fame struggled to get any funding or resources to study his eventually Nobel Prize-winning idea. Indeed, Newton or Gauss would likely have struggled to get a job in today's academic climate. The truth is that for every Stanley Prusiner, there are hundreds more like him with potentially revolutionary ideas that don't end up working, and who are forgotten by history. Scientific progress is nearly always based on a shot in the dark, and careers are made and broken by what amounts to a stroke of luck. Good science, whether the study turns out to be successful or not, is intrinsically valuable for what it is. However, the lion's share of the rewards goes to the few who happened upon the rare theory that turns out to work. This leads most scientists to focus on whatever the lucky few happened upon rather than on something potentially revolutionary, because as long as you're somewhere nearby you can scrounge a few scraps of grant money for reflecting someone else's success.

Figure 3: Reward Structure in the Scientific Establishment.
Democratic utility, oligarchical rewards.

This is a huge problem for the democratization of science. The reward structure looks like one in which only the very best solution matters, but the utility is in fact proportional to the amount of good science done. This discourages anyone from straying far from the current hot topics and vastly reduces the total amount of work done. Mechanical Turk would be out of business if only the person who did the most got any reward!
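
To make the contrast between the three reward structures concrete, here is a minimal sketch in Python. The specific numbers and curve shapes (a cutoff of thirty contributors, the steepness of the sigmoid) are assumptions chosen purely for illustration, not measurements; only the qualitative shapes come from the figure captions above.

import math

def topcoder_reward(rank, prizes=30):
    """Oligarchical rewards: only the top-ranked competitors receive anything."""
    return 1.0 if rank <= prizes else 0.0

def topcoder_utility(rank, useful=30):
    """Oligarchical utility: only the top handful of solutions are actually used."""
    return 1.0 if rank <= useful else 0.0

def turk_reward(rank):
    """Democratic rewards: every completed task pays the same, regardless of rank."""
    return 1.0

def turk_utility(rank):
    """Democratic utility: every task done to standard has the same value."""
    return 1.0

def science_reward(rank, midpoint=30, steepness=0.2):
    """Oligarchical rewards (Figure 3): a sigmoid in rank -- nearly all recognition
    and funding go to the handful of labs at the top of the pecking order."""
    return 1.0 / (1.0 + math.exp(steepness * (rank - midpoint)))

def science_utility(rank):
    """Democratic utility (Figure 3): a well-run study is worth roughly the same
    whoever performs it -- negative results are still knowledge."""
    return 1.0

if __name__ == "__main__":
    for rank in (1, 10, 30, 100, 1000):
        print(f"rank {rank:>4}: "
              f"TopCoder reward {topcoder_reward(rank):.2f} | "
              f"Turk reward {turk_reward(rank):.2f} | "
              f"Science reward {science_reward(rank):.2f} "
              f"(Science utility {science_utility(rank):.2f})")

Past the first few dozen ranks, the TopCoder and Science reward columns collapse toward zero while the Turk reward and the Science utility columns stay flat--which is exactly the mismatch Figure 3 describes.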

This is why new cancer therapies are so slow in coming.  It's dangerous to pursue an approach that nobody's done before, because if you fail you might not get your next grant.  You might be passed over for tenure.  Your paper might not make it into a journal with a high impact factor.  If you aren't the top dog in the field, you have to at least closely follow what the best people are doing so that you can claim that you're nearly as good. 

Despite throwing billions of dollars at important problems, we are not making efficient progress, because the reward structure does not reflect the utility of good science. The sigmoid curve of reward versus work has not only concentrated the bulk of progress in a very small number of people, but has also drastically reduced the overall amount of work done and the number of radical, innovative hypotheses tested.
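
To put a rough formula on that last point (the functional form and the constants are an assumption for illustration, not a fit to any data): if w is a scientist's recognized output or standing and R(w) the reward it attracts, the curve behaves something like

    R(w) = R_max / (1 + e^(-k (w - w_0)))

essentially zero until you cross the threshold w_0 set by the handful of labs at the top, then saturating quickly--no matter how much good, careful science is being done below that threshold.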

... all because those in charge of funding have largely forgotten what science is.
