Crowdsourcing, as a term, has been around for something like 12 years according to Wikipedia. OpenStreetMap is a little older, and the idea itself stretches back much further, depending on where you draw the line. Wikipedia traces it to the 1714 Longitude Prize competition. That seems like a stretch too far, but in any case, it’s been around a while.
The ability to use many distributed people to solve a problem has had some obvious recent wins like Wikipedia itself, OpenStreetMap and others. Yet to a large degree these projects require skill. You need to know how to edit the text or the map. In the case of Linux, you need to be able to write and debug software.
Where crowdsourcing is in some ways more interesting is where that barrier to entry is much lower. The simplest way you can contribute to a project is by answering a binary question – something with a ‘yes’ or ‘no’ answer. If we could ask every one of the ~7 billion people in the world whether they were in an urban area right this second, we’d end up with a fair representation of a map of the (urban) world. In fact, the locations of all 7 billion people alone would trace out much the same map.
Tomnod is DigitalGlobe’s crowdsourcing platform, and today it’s running a yes/no campaign to find all the Weddell seals in its imagery of parts of the Antarctic.
The premise is simple and effective: repeatedly look for seals in a box. If there are seals, press 1. If not, press 2. After processing tens of thousands of boxes you get a map of seals, parallelizing the problem across many volunteers.
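The aggregation step behind a campaign like this can be sketched very simply. Below is a minimal, hypothetical illustration (not Tomnod’s actual pipeline): each tile collects yes/no votes from several volunteers, and a majority vote decides whether the tile goes on the seal map.

```python
from collections import Counter

def aggregate_votes(votes):
    """Majority-vote each tile's yes/no labels into one answer.

    votes: dict mapping a tile id to a list of booleans,
    one per volunteer (True means "seal present").
    Returns the set of tile ids the crowd thinks contain seals.
    """
    seal_tiles = set()
    for tile_id, labels in votes.items():
        counts = Counter(labels)
        if counts[True] > counts[False]:
            seal_tiles.add(tile_id)
    return seal_tiles

# Three volunteers each labelled three tiles:
votes = {
    "tile_001": [True, True, False],   # 2 of 3 saw a seal
    "tile_002": [False, False, False],
    "tile_003": [True, True, True],
}
print(sorted(aggregate_votes(votes)))  # ['tile_001', 'tile_003']
```

Real systems weight voters by past accuracy rather than counting everyone equally, but the parallelization idea is the same: many independent, low-skill judgments combine into one map.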
Of course, it helps if you have a lot of data to analyze, with more coming in the door every day. There aren’t that many places in the world where that’s the case and DigitalGlobe is one of them, which is why I’m excited to be joining them to work on crowdsourcing.
Crowdsourcing today is pretty effective yet there are major challenges to be solved. For example:
- How can we use machine learning to help users focus on the most important crowd tasks?
- How can crowds more effectively give feedback to shape how machine learning works?
- Why do crowds sometimes fail, and can we fix it? OpenStreetMap is a beautiful display map yet still lacks basic data like addresses. How can we counter that?
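One common answer to the first question is uncertainty sampling: let a model score every tile, then send humans the tiles the model is least sure about. A minimal sketch, with hypothetical tile names and scores:

```python
def prioritize_tiles(predictions):
    """Rank tiles so the most ambiguous model scores get labelled first.

    predictions: dict of tile id -> model probability that the tile
    contains the target (e.g. a seal). Scores near 0.5 are the least
    certain, so a human look adds the most information there.
    """
    # Uncertainty = closeness to 0.5; most uncertain first.
    return sorted(predictions, key=lambda t: abs(predictions[t] - 0.5))

preds = {"tile_a": 0.97, "tile_b": 0.52, "tile_c": 0.10}
print(prioritize_tiles(preds))  # ['tile_b', 'tile_c', 'tile_a']
```

The confident tiles (0.97, 0.10) can be auto-labelled or spot-checked, so the crowd’s limited attention goes where it matters most.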
These feedback loops between tools, crowds and machine learning to produce actionable information are still in their infancy. Today, the way crowds help ML algorithms is still relatively stilted, as is how ML makes tools better, and so on.
Today, much of this is kind of like batch processing of computer data in the 1960s. You’d build some code and data on punch cards, ship them off to the “priests” who ran the computer and get some results back in a few days. Crowdsourcing in most contexts isn’t dissimilar. We make a simple campaign, ship it to a Mechanical Turk-like service and then get our data back.
> I think one of the things that really separates us from the high primates is that we’re tool builders. I read a study that measured the efficiency of locomotion for various species on the planet. The condor used the least energy to move a kilometer. And, humans came in with a rather unimpressive showing, about a third of the way down the list. It was not too proud a showing for the crown of creation. So, that didn’t look so good. But, then somebody at Scientific American had the insight to test the efficiency of locomotion for a man on a bicycle. And, a man on a bicycle, a human on a bicycle, blew the condor away, completely off the top of the charts.
>
> And that’s what a computer is to me. What a computer is to me is it’s the most remarkable tool that we’ve ever come up with, and it’s the equivalent of a bicycle for our minds.
>
> ~ Steve Jobs
In the future, the one I’m interested in helping build, the links between all these things are going to be a lot more fluid. Computers should serve us, like a bicycle for the mind, to enhance and extend our cognition. To do that, the tools have to learn from the people using them, and the tools have to help make the users more efficient.
This is above and beyond the use of a hammer to efficiently hit nails into a piece of wood. It’s about the tool itself learning, and you can’t do that without a lot of data.
This is all sounding a lot like Clippy, a tool to help people use computers better. But Clippy was a child of an internet that hadn’t yet become what it is today. Clippy wasn’t broken from a lack of trying, or a lack of ideas. It was broken from a lack of feedback. What’s the difference between Clippy and Siri or “OK, Google”? Feedback. Siri gets feedback from billions of internet-connected uses every day, where Clippy had almost no feedback to improve from at all.
Siri’s feedback is predicated upon text: lots and lots of input and output of text. What’s interesting about DigitalGlobe’s primary asset for crowdsourcing is all the imagery, of a planet that’s changing every day. Crowdsourcing across imagery is already helping in disasters, scientific research and 1,001 other fields with some simple tools on websites.
What happens when we add mobile, machine learning and feedback? It’ll be fun to find out.