Archive | maps

A Digital Globe

“Energy Flux,” data source: National Geospatial-Intelligence Agency, September 2000.

Crowdsourcing, as a term, has been around for something like 12 years according to Wikipedia. OpenStreetMap is a little older and the idea stretches back fairly arbitrarily. Wikipedia thinks it goes back to the 1714 Longitude Prize competition. That seems like a stretch too far, but in any case, it’s been around a while.

The ability to use many distributed people to solve a problem has had some obvious recent wins like Wikipedia itself, OpenStreetMap and others. Yet, to a large degree, these projects require skill. You need to know how to edit the text or the map. In the case of Linux, you need to be able to write and debug software.

Where crowdsourcing is in some ways more interesting is where that barrier to entry is much lower. The simplest way you can contribute to a project is by answering a binary question – something with a ‘yes’ or ‘no’ answer. If we could ask every one of the ~7 billion people in the world if they were in an urban area right this second, we’d end up with a fair representation of a map of the (urban) world. In fact, just the locations of all 7 billion people would mimic the same map.

Tomnod is DigitalGlobe’s crowdsourcing platform and today it’s running a yes/no campaign to find all the Weddell seals in their parts of the Antarctic.

The premise is simple and effective: repeatedly look for seals in a box. If there are seals, press 1. If not, press 2. After processing tens of thousands of boxes you get a map of seals, parallelizing the problem across many volunteers.
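To make the yes/no idea concrete, here is a minimal sketch of how many volunteers' answers might be combined into one verdict per box. It assumes a simple majority vote with a minimum number of answers per box; the function name, the `min_votes` and `threshold` parameters and the data layout are all hypothetical, and nothing here describes how Tomnod actually aggregates its results.

```python
from collections import defaultdict


def aggregate_votes(votes, min_votes=3, threshold=0.5):
    """Combine yes/no answers into one verdict per box.

    votes: iterable of (box_id, saw_seals) pairs, where saw_seals is
    True for "press 1" and False for "press 2".
    Boxes with fewer than min_votes answers are left undecided.
    """
    tally = defaultdict(lambda: [0, 0])  # box_id -> [yes_count, total_count]
    for box_id, saw_seals in votes:
        if saw_seals:
            tally[box_id][0] += 1
        tally[box_id][1] += 1

    verdicts = {}
    for box_id, (yes, total) in tally.items():
        if total >= min_votes:
            # Assumed rule: a box "has seals" if more than half said yes.
            verdicts[box_id] = (yes / total) > threshold
    return verdicts


votes = [
    ("box-a", True), ("box-a", True), ("box-a", False),
    ("box-b", False), ("box-b", False), ("box-b", True),
    ("box-c", True),  # only one answer so far, stays undecided
]
print(aggregate_votes(votes))  # {'box-a': True, 'box-b': False}
```

Asking several people about the same box is what makes a single wrong click cheap: one mistaken answer gets outvoted rather than ending up on the map.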

Of course, it helps if you have a lot of data to analyze, with more coming in the door every day. There aren’t that many places in the world where that’s the case and DigitalGlobe is one of them, which is why I’m excited to be joining them to work on crowdsourcing.

Crowdsourcing today is pretty effective yet there are major challenges to be solved. For example:

  • How can we use machine learning to help users focus on the most important crowd tasks?
  • How can crowds more effectively give feedback to shape how machine learning works?
  • Why do crowds sometimes fail, and can we fix it? OpenStreetMap is a beautiful display map yet still lacks basic data like addresses. How can we counter that?

These feedback loops between tools, crowds and machine learning to produce actionable information are still in their infancy. Today, the way crowds help ML algorithms is still relatively stilted, as is how ML makes tools better and so on.

Today, much of this is kind of like batch processing of computer data in the 1960s. You’d build some code and data on punch cards, ship them off to the “priests” who ran the computer and get some results back in a few days. Crowdsourcing in most contexts isn’t dissimilar. We make a simple campaign, ship it to a Mechanical Turk-like service and then get our data back.

I think one of the things that really separates us from the high primates is that we’re tool builders. I read a study that measured the efficiency of locomotion for various species on the planet. The condor used the least energy to move a kilometer. And, humans came in with a rather unimpressive showing, about a third of the way down the list. It was not too proud a showing for the crown of creation. So, that didn’t look so good. But, then somebody at Scientific American had the insight to test the efficiency of locomotion for a man on a bicycle. And, a man on a bicycle, a human on a bicycle, blew the condor away, completely off the top of the charts.
And that’s what a computer is to me. What a computer is to me is it’s the most remarkable tool that we’ve ever come up with, and it’s the equivalent of a bicycle for our minds. ~ Steve Jobs

In the future, the one I’m interested in helping build, the links between all these things are going to be a lot more fluid. Computers should serve us, like a bicycle for the mind, to enhance and extend our cognition. To do that, the tools have to learn from the people using them and the tools have to help make the users more efficient.

This is above and beyond the use of a hammer to efficiently hit nails into a piece of wood. It’s about the tool itself learning, and you can’t do it without a lot of data.

This is all sounding a lot like Clippy, a tool to help people use computers better. But Clippy was a child of the internet before it was the internet it is today. Clippy wasn’t broken because of a lack of trying, or a lack of ideas. It was broken from a lack of feedback. What’s the difference between Clippy and Siri or “ok, Google”? It’s feedback. Siri gets feedback from billions of internet-connected uses every day, where Clippy had almost no feedback to improve at all.

Siri’s feedback is predicated upon text. Lots and lots of input and output of text. What’s interesting about DigitalGlobe’s primary asset for crowdsourcing is all the imagery of a planet that’s changing every day. Crowdsourcing across imagery is already helping in disasters and scientific research and 1,001 other fields with some simple tools on websites.

What happens when we add mobile, machine learning and feedback? It’ll be fun to find out.

OpenLocate

Your phone knows where it is thanks to a suite of sensors that basically try to measure everything they possibly can about their environment. Where does the GPS think I am? What orientation is the device in? What WiFi networks can I see? What are the nearby Bluetooth devices? Have I been moving around a lot lately, according to the accelerometer? What cell phone networks am I connected to?

Unless you’re standing in a field in Kansas with a clear view of the sky for ten minutes (so your GPS has lots of time to settle), your location will be questionable.

The original iPhone used WiFi network data to figure out where it was, because a GPS wasn’t included. Skyhook (I think it was…) drove cars around major cities sniffing for networks while recording their geolocation. Then an iPhone could look up its location by comparing what networks it could see to the database of network locations. Then, it could start adding networks not in the database it could also see at the same place.
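The lookup-then-learn loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the database is a plain dictionary from a network identifier to a (latitude, longitude) pair, the position estimate is a simple average of the known networks, and the names `locate` and `locate_and_learn` are made up here. Real systems weigh signal strength and much more, and this is not a description of how Skyhook or the iPhone actually worked.

```python
def locate(visible_networks, db):
    """Estimate a position as the average location of the known networks.

    db maps a network identifier to a (lat, lon) tuple.
    Returns None when none of the visible networks are in the database.
    """
    known = [db[net] for net in visible_networks if net in db]
    if not known:
        return None
    lat = sum(p[0] for p in known) / len(known)
    lon = sum(p[1] for p in known) / len(known)
    return (lat, lon)


def locate_and_learn(visible_networks, db):
    """Look up a position, then record unknown networks seen at that place."""
    estimate = locate(visible_networks, db)
    if estimate is not None:
        for net in visible_networks:
            # Only add networks the database has never seen before.
            db.setdefault(net, estimate)
    return estimate


db = {"cafe-wifi": (10.0, 20.0), "library-wifi": (12.0, 22.0)}
print(locate_and_learn(["cafe-wifi", "library-wifi", "new-wifi"], db))
print(db["new-wifi"])  # the new network now has an inferred location
```

The second function is the interesting part: every successful lookup is also a chance to grow the database, which is how it can spread beyond the streets the survey cars originally drove.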

As phones added all kinds of sensors, these databases grew and became free-floating associations of place information. We can now correlate almost anything with where you are so that if the GPS doesn’t work (because you’re inside a building, say), devices fail-over to what other clues they have to figure out where you are.

Integrating all this information is still a challenge, especially if you’re driving around a major city. The reliability of all the location signals is questionable, as Pete Tenereillo outlined in a recent LinkedIn post. Driving around San Francisco, you’re still subjected to the map jumping all over the place even with high end phones and the latest software.

Users experience this at the other end too, when they see their Uber or delivery driver jumping around the map on the way to them:

As well as finding your location, many apps want to store it too. There are 1,001 ways to do that. Different amounts of data, different formats, different places to send it. What ends up happening, quite reasonably, is that various location-based app developers both capture and store location data in many different ways, and there are paid-for APIs and SDKs to help with pieces of the puzzle.

What’s changed over time is the value of this data. Aggregating vast amounts of anonymized location data can help with use-cases such as building base maps. If you take all the GPS traces of everyone every day, you can figure out where all the roads are and their speed limits and so on. This data is equally valuable for other uses; advertising and predicting stock prices are two examples. If you know how many people went to WalMart this week you have an indication of its stock value. Things like this appear to have driven the new $164M round for Mapbox – “Mapbox collects more than 200 million miles of anonymized sensor data per day”.

What’s lacking is an open and standardized way to capture and store this data. Enter OpenLocate, an open iOS and Android SDK to simplify capture and storage of location data.

It’s supported by a long list of backers and it should remove a bunch of work when developing anything location-based, much as Auth0 removes having to set up custom authentication. For more, see the announcement blog post here!

Eclipse Poster Kickstarter

I had just wrapped up the last Kickstarter (details here) when the fine people at NASA put out this super accurate map of the eclipse in August. I put together a small Kickstarter to print as many as possible (they’re done at cost!). Learn more about it here.

The map is incredibly accurate, right down to using a topographic model of the moon… will be interesting to see if it succeeds!

 

Kickstarter almost funded

It’s very humbling to look at this graph of funding over the last few days for the OpenStreetMap Stats Kickstarter:

I had expected the whole thing to fail; now it looks like it’ll succeed. I was asked once in a job interview about how much failure I’ve recently had. The idea was that if you’re not failing you’re not really trying – if everything is a success then you can’t be pushing the envelope.

I figured asking for $1k for a statistics site that’s relevant to a minority of a minority in the world was going to be too much to ask for. In the grand scheme of things it’s not a whole lot of cash, but still. And yet, here we are.

Speaking of failure, “failure” itself is the wrong way to model how these things work. Scott Adams has called it “having a system” instead of “goals”. Other people have called it “failing forward”. Either way – the basic idea is that whatever happens you want to win. Adams wrote a whole book about this:

In this case, if the Kickstarter fails then I can shut the project down. This for me is a clear win. I get more time and one less distraction. I don’t have to pay for the hosting any more. I also learn that tiny kickstarters aren’t going to work and not to bother trying them again in a similar context.

On the other hand, if it succeeds that’s great too. I can dedicate the time to fix the site, the hosting is paid for and it proves that there are people out there who care about it.

Setting up situations like this can be enormously beneficial – where you win either way. But, it’s still hard since my lizard brain wants to avoid anything that looks like failure and being judged by those who see it in that way.

There are plenty of smart, educated people out there who think Amazon’s lack of profit is a “failure” for example. I think it’s beautiful. For a start, the definition of “profit” is “we have no idea what to do with the money so we’ll give it to you”. Amazon isn’t running out of ideas worth funding. Second, if they spend all the notional profit then they don’t have to pay tax on it and get some percentage advantage via that. Reinvesting in this way for a few decades leads to some spectacular growth.

This all leads to an idea that’s almost too tantalizing to verbalize: Maybe it’s possible to live by doing Kickstarter after Kickstarter? The idea is insanely fun and the implications profound. If it’s possible to raise $1k in a week then that would lead to a $52k/year revenue, supposing you had 52 great ideas. Perhaps more likely are $10k kickstarters every 2-4 weeks, or $100k kickstarters every month or two. With some number of them failing, plus costs, it should still be possible to live using this method.
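The back-of-the-envelope arithmetic is easy to play with. Below is a rough sketch, assuming every campaign asks for the same amount, that failed campaigns bring in nothing, that the platform takes a flat percentage of what is raised, and that each campaign has a fixed cost whether or not it funds. The function and all of its parameters are invented for illustration, and the example success rate, fee and cost are made-up inputs, not real data.

```python
def annual_income(goal, weeks_between, success_rate=1.0, fee=0.0,
                  cost_per_campaign=0.0):
    """Expected yearly income from running one campaign after another.

    goal: amount raised by a successful campaign, in dollars
    weeks_between: weeks from the start of one campaign to the next
    success_rate: fraction of campaigns that fund (0.0 to 1.0)
    fee: platform's cut of the money raised (0.0 to 1.0)
    cost_per_campaign: fixed cost paid whether or not the campaign funds
    """
    campaigns_per_year = 52 / weeks_between
    expected_per_campaign = goal * success_rate * (1 - fee) - cost_per_campaign
    return campaigns_per_year * expected_per_campaign


# The simple case from the text: $1k every week, nothing fails, no costs.
print(annual_income(1000, 1))  # 52000.0

# A made-up scenario: $10k every 4 weeks, half fail, 10% fee, $500 cost each.
print(round(annual_income(10_000, 4, success_rate=0.5, fee=0.1,
                          cost_per_campaign=500)))
```

With those particular made-up numbers the two scenarios come out about the same, which is the point of the paragraph above: fewer, larger campaigns can absorb a fair amount of failure and cost.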

OpenStreetMap Stats Kickstarter

I’m attempting to raise $1k in a week via Kickstarter to fix the OpenStreetMap Stats site.

The site lets you explore OSM data by country, time and data type:

Sadly it’s suffered bit rot and some countries are broken and not updating. The $1k goes toward fixing, open sourcing and hosting it for a year or two. Else, it gets canned.

So far it’s raised $163 with 6 days to go.

OpenGeoIP

OpenGeoIP is a little project to crowdsource IP address locations and I just made a few updates and bug fixes to it. Most IP-to-geo systems rely on self-reporting in various IP address metadata, which can be pretty inaccurate. What we’re doing here is using actual location data from the browser (usually), which means (usually) WiFi-inferred or GPS location.

There are two primary routes by which the project builds data:

  1. There’s a JS API which allows you to fall back to the database. Normally when you ask the browser for location the user can click ‘no’ and you get nothing at all. In this case, you can automagically fall back to crowdsourced data. If the user clicks ‘yes’ then we can use that to update the fail-over data for everyone else. In theory this feedback loop makes the data better for everyone.
  2. Second, there are a lot of people out there just searching for location data on an IP. There’s a front end which will share out info if you first share yours. Again, this feedback loop should make the data better for everyone.
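The fall-back loop in the first route can be sketched as below. This is a Python illustration of the logic only, not the project's JS API: the function name, the dictionary used as the database and the "source" labels are all assumptions made for the example.

```python
def resolve_location(ip, browser_location, db):
    """Return (location, source) for a request from the given IP address.

    browser_location is a (lat, lon) tuple when the user clicked 'yes',
    or None when they clicked 'no'. db maps IP address -> (lat, lon).
    """
    if browser_location is not None:
        # The user shared a real location: use it, and also store it so
        # it improves the fall-back data for everyone else on this IP.
        db[ip] = browser_location
        return browser_location, "browser"
    if ip in db:
        # The user declined, but someone on this IP shared earlier.
        return db[ip], "crowdsourced"
    return None, "unknown"


db = {}
# Example IP from the reserved documentation range.
print(resolve_location("203.0.113.7", None, db))           # nothing known yet
print(resolve_location("203.0.113.7", (37.27, -107.88), db))  # user said yes
print(resolve_location("203.0.113.7", None, db))           # falls back to db
```

Every 'yes' both answers the current request and fills in the answer for a later 'no' from the same address, which is the whole feedback loop.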

 

Continental Drift Part 2

Phone superglued to foundation

(see part 1)

The phone’s been running for nearly a week collecting data, despite rebooting servers and WiFi failing, so I superglued it to the foundation of the house, which is concrete and in the ground. The data collected so far has all been deleted since it was just a test, but from here on out it’s real. In a week or two I’ll write some scripts to analyze the data and figure out what the error bars are.

OpenGeoCodes iOS and Android Apps – Collect Open Address Data

Open Address data from OpenGeoCodes in Durango, CO. Green pins are manually verified, red are awaiting verification.

OpenGeoCodes now has iOS and Android apps to optimize the hand collection of addresses.

Addresses are the primary limiting factor of OpenStreetMap – there just isn’t much out there that’s easily licensed and OSM itself for a variety of reasons lacks address data. OSM looks pretty – it’s a great display map. It’s also routable with a lot of work. But, you can’t find addresses on it.

OpenGeoCodes has data in the US and some starter data in Canada and the UK to try to fix this.

So what do the apps do?

The apps let you walk around and collect data. Say you’re standing outside 100 Main Street – just tap it, the app records the location and you’re done. Normally the app tries to guess where you are based on location.

But wait, there’s more! As you walk along, the app will optimize what addresses to show you. For example if you’re walking on the even side of a street going north, the app will figure this out and present you ascending even numbers. So if you enter 100 and 102, and the app knows 104 is nearby it will focus on this.
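As a rough illustration of that ordering idea, the sketch below looks at the last two house numbers you tapped, infers the side of the street (odd or even) and the direction (ascending or descending), and puts matching nearby addresses first. The function name and the rule itself are assumptions for the sake of the example and may differ from what the apps actually do.

```python
def suggest_next(collected, candidates):
    """Order nearby house numbers so the likeliest next one comes first.

    collected: house numbers already tapped, in the order they were entered
    candidates: house numbers known to be nearby
    """
    remaining = [n for n in candidates if n not in collected]
    if len(collected) < 2:
        # Not enough history to infer side or direction yet.
        return sorted(remaining)

    prev, last = collected[-2], collected[-1]
    ascending = last > prev
    same_side = (last % 2) == (prev % 2)

    def is_ahead(n):
        # Stay on the same odd/even side if the walker has been doing so.
        if same_side and n % 2 != last % 2:
            return False
        return n > last if ascending else n < last

    # Addresses ahead on this side come first, nearest number first.
    ahead = sorted((n for n in remaining if is_ahead(n)),
                   key=lambda n: abs(n - last))
    rest = sorted(n for n in remaining if n not in ahead)
    return ahead + rest


print(suggest_next([100, 102], [98, 101, 102, 103, 104, 106]))
# [104, 106, 98, 101, 103]
```

After entering 100 and 102, number 104 lands at the top of the list, so collecting the rest of the block really is just tap, tap, tap.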

This makes it easy to walk along and just tap, tap, tap to collect data. We collect this data together and then make it freely downloadable. There’s also a mailing list if you want to get involved.

Where to from here? The feature list includes a more human design, notifications for when near places with no data, OSM upload and fixing and more. Drop me an email if you run into any issues.

 

How Alex Mahrou from CH2M got MapClub shut down

MapClub's funding curve

Kickstarter notified us yesterday that they were shutting MapClub down, and of course wouldn’t share why. This is despite the funding being successful and 15 people signing up. Fair enough, it’s their baby and their rules.

Well now we know why – Alex Mahrou’s long, angry and sarcastic investigative piece on why people shouldn’t do things that he disagrees with.

There are two primary failures in the piece. The biggest, by far, is that he starts out by describing Ryan Holiday’s excellent books on stoicism, not taking things personally and not trying to control the things you can’t change. From there, he leaps into a 54-page detailed blog post about trying to change other people and being angry about them doing something he dislikes… thus missing the entire point of the books (which really are excellent, by the way).

Another book that might help Alex is The Fish that Ate the Whale, which is all about not following rules that other people set you.

Anyhow, problem number two was missing the dry humor in the Kickstarter about Peter, James and me being “luminaries”. More accurate descriptions may be “drunkards” or “skeptics” perhaps. It’s entertaining to see something thrown in there as a self-deprecating joke being taken so far, because I don’t think any of us take ourselves that seriously.

CH2M’s website mentions that they are “turning challenge into opportunity”, which no doubt Alex focuses on day-to-day. The challenge of getting your random idea Kickstarter shut down is, of course, the opportunity to fund it in other ways, without Kickstarter’s 5-10% cut.

One of the more memorable stories in The Fish that Ate the Whale is about exactly this. When Zemurray couldn’t build a bridge to his banana plantations because his competitors had got the government to ban bridges, he built two piers and put a barge between them instead.

So, thanks Alex for your positive contributions to the world. Good luck banning more bridges, I’m off to build more piers.
