
Explore OpenStreetMap Statistics

OSM Stats for Namibia

Ever wanted to explore OSM statistics over time and in depth? OSM Stats is for you. Notice the site asks for your location – this is just to show you your country automagically by default.

The site lets you explore the major types of OSM data by country and over time. The left-hand graph shows you the aggregate count over time, the right-hand graph shows the difference (delta) over the same time period. You can click different data types on the left, change country at the top, and change the time range just above the graphs.

You can find some interesting things. Here’s the default view for the United Kingdom:

What it shows is data growing over time. We like graphs that go up-and-to-the-right. The right-hand graph shows, as expected, the amount of data being added declining over time. This is because there’s less and less to map in the UK as I started the project there.

Compare that to Haiti:

Can you guess what the spikes in data addition are?

Now look at residential roads only in the United States:

Things are declining over time! Where are all those residential roads going? Well a small part of the answer (notice the vertical axis is 2 orders of magnitude less than above) is the growth of living streets in the US:

That’s a small taste of the things you can learn – have fun exploring the site and email me any comments.

Why you should be using BrowserLocation to get your user’s location


When a web app asks the browser for location, it prompts the user if they want to share it.

At this point it becomes binary. Either they click ‘yes’ and you get (usually) pretty good latitude and longitude or they hit ‘no’ and you get nothing at all.

The second problem is you just get latitude and longitude. You have to do some work to turn that in to something meaningful like a city or country.

Enter BrowserLocation.

BrowserLocation is a JavaScript wrapper that asks the browser for location. If the user clicks ‘yes’, then we return the data to you just as before but we add city, state and country information. If they click ‘no’ then we fall back to IP address location. This IP-based information is lower accuracy than GPS or wifi triangulation, but better than nothing at all!

And to make it even better, every use of the API is used to improve the IP address fallback data for other users. And you can download the data. And, you can even query it directly like this.

So think of it as an open IP to geo project, with a wonderful API and a virtuous circle where everyone using it also makes the data better. The data is initially released under CC BY-NC and will leak in to the public domain over time. This creates a space to monetize the data and pay to improve it. It’s been seeded by paying people all around the world to collect some data.

Enjoy. Please email if you have any questions.

License Ascent

Copyright when first envisaged granted a limited-term monopoly on a work which then later fell in to the public domain, or PD. This would give the author some amount of time to make money and pay the mortgage, balanced with allowing people later on to take the work and build upon it. So you write a book, you can sell it but nobody else can, and then some number of years later everybody can do as they please with your book.

This is no longer the case. Copyright is effectively infinite. This means that while we can take old works and build upon them (Pinocchio) we cannot do the same with even pretty old works (Mickey Mouse). Edge cases exist of course, for example you can in many places use old works for parody.

Some people wish to make their work available under less restrictive terms than owning it forever. For them, there are a set of licenses which they can use to release their works.

  • They can claim attribution. Broadly, this means you can use my work but you have to say where it came from.
  • They can claim share-alike. Broadly, this means you can use my work, but any derivative works need to also be sharable. So you can’t take my book and then rewrite portions and claim it for yourself.
  • They can claim commercial rights. Essentially this means you can use my work for anything but profit.
  • You can use some combination of the above.

Thus instead of claiming copyright forever for your new book, photograph or software, you could instead for example say “use it however you wish but all changes must be shared-alike and you can’t use it commercially”. This allows individuals and companies to put works out there and allow them to spread more easily than if they retained all the copyrights.

Two basic methods of making money have emerged while using these open licenses:

  1. The intellectual piece is free, but any physical product costs money. For example, 3D Robotics software for drones is freely downloadable and you can change it. But if you want a physical, flying drone then that costs money. Very similar is this: the basic software is free but some critical piece required for some use case requires payment.
  2. The work is available publicly under a difficult open license, but privately under some commercial agreement. This is known as dual licensing. The downside is that to encourage commercial usage, the open license tends to be as painful as possible. This way, a student at home is unaffected but a company might find a license difficult. Perhaps it requires a legal review, or places burdens on the company like open sourcing everything they do. To avoid this pain, they pay for the commercial license.

The trouble here is we still don’t have things leaking in to the public domain over time. Once a work is released under some license, it seems to be stuck there until the end of man.

What if we changed that?

I propose we engage in some kind of license ascent over time. Perhaps descent would be better. Under this scheme, some work starts out under a restrictive and painful license and over time makes its way in to the public domain. For example:

  1. I write a book. For the first year, it is available under an attribution, share-alike, non-commercial license.
  2. After the first year, it is available attribution, non-commercial.
  3. After the second year, it is available under attribution.
  4. After the third year, it is available public-domain.
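The schedule above can be written down as a small lookup. This is only a sketch of the idea, not a legal mechanism: the function name, the `SCHEDULE` table and the `years_per_step` parameter are made up for illustration, and the step boundaries are assumed to fall on release anniversaries.

```python
from datetime import date

# Steps from the example list: (step number, license in force from that step on).
SCHEDULE = [
    (0, "attribution, share-alike, non-commercial"),
    (1, "attribution, non-commercial"),
    (2, "attribution"),
    (3, "public domain"),
]

def license_on(released: date, today: date, years_per_step: int = 1) -> str:
    """Return the license in force on `today` for a work released on `released`."""
    if today < released:
        raise ValueError("work has not been released yet")
    # Whole years elapsed; an anniversary counts as the start of the next year.
    elapsed = today.year - released.year
    if (today.month, today.day) < (released.month, released.day):
        elapsed -= 1
    current = SCHEDULE[0][1]
    for step, name in SCHEDULE:
        if elapsed >= step * years_per_step:
            current = name
    return current

# A book released 1 June 2015, checked two and a half years later.
print(license_on(date(2015, 6, 1), date(2017, 12, 1)))  # attribution
```

Passing a larger `years_per_step` stretches every step by the same amount, so the same table can describe a slower schedule.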

We are reintroducing the concept of the work leaking in to the public domain gradually. So when I first create some piece of work I own it outright and then over time it becomes less and less burdened.

For static works like a book, the timelines may be longer. Say, two or five years per step. For works which change all the time, like datasets about the world, perhaps each step lasts a year. Why would we want to do this? Two reasons: because otherwise it’s really hard to make money from open source, and because otherwise open projects don’t benefit the public domain.

There are classes of works which require “giving back” like OpenStreetMap in order to attract people to contribute. That is, why would you contribute to OSM if you couldn’t access the data? OSM has now existed for 11 years and the state of mapping in the public domain is still essentially the same as it was 11 years ago. But what if OSM data dropped in to a more liberal license, or the public domain, over time? Perhaps we could have a PD version of OSM but it was 5 years old. It wouldn’t compete with OSM itself, but it would enrich what people could build on without restrictions.

Put another way, do we want OSM to be perfect in another 10 years and the public domain still be essentially unusable? Wouldn’t it be nice to improve both OSM and (for free!) the public domain maps available?

Now imagine you have some new project which requires crowdsourcing to succeed. Dual licensing has the downside that picking the open license has many difficulties. You want to pick something that encourages people to contribute yet allows you to retain space to sell things, and this isn’t easy. If instead you practiced license ascent then everybody gets the data at some point in the future. Perhaps if you are a PD person you wait 3 years, and a share-alike person you have to wait two years. But either way, it’s better than never getting the data under a license that you would consider useful.

And, it does this whilst allowing the project or company to make money off the freshest data. It also creates an incentive to make the data fresher, all the time, because otherwise the old data will be good enough for people.

Now you could argue that any project should be open from the start, but open projects tend to have significant downsides. Open projects are terrible at user interaction and experience. They’re terrible at design. They tend to be incoherent. But they are great at innovation and collecting data. At the other end, private companies which collect data tend to be great at design and so on, but terrible at innovation and collecting data because they don’t have volunteers. I posit that license ascent is a way to achieve both and that it’s better than just picking a license or selling widgets on the side.


Measuring continental drift with your phone!

Most of the work mapping people do is concerned with moving GPS units. Walking, driving, biking with a GPS and doing things with that movement data. But, there is a big use case for static GPS units too!

As far as I can find out, continental drift and glacier movement are measured by buying a GPS, strapping a really big battery/solar pack on to it and embedding it in a lump of concrete (or a glacier) so it won’t move much. Then you come back a year (or whatever) later and see how much it’s moved by averaging out the locations it’s been collecting. The location will vary minute to minute within some bubble (30ft across or so). But over the span of a year you can average it out and get the movement of the glacier or continent. Like this guy is doing:


Guy planting a GPS in a glacier

So the question to me is, instead of having one GPS collecting for a year, could I have 365 GPS units for a day? Or, 8,760 GPS units for an hour…. And get the same result? Or at least something fun to wave around?

I suspect the simple answer is no, because having thousands of GPS units in the same place for an hour will all record the same systematic bias. But, what if you had thousands of people collecting drift information part-time around the planet? That would be fun!
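A small simulation makes the averaging and the bias point concrete. It is only a sketch under loud assumptions: positions are one-dimensional and in metres, the scatter is Gaussian with a 5 m standard deviation, and the 2 m shared offset is invented for illustration. None of these numbers come from real GPS data.

```python
import random
import statistics

def simulate_fixes(true_pos, n, noise_sd, bias=0.0, seed=0):
    """Generate n one-dimensional GPS readings (metres) around true_pos."""
    rng = random.Random(seed)
    return [true_pos + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]

# Many readings with random scatter only: the scatter averages out.
readings = simulate_fixes(true_pos=100.0, n=10_000, noise_sd=5.0)
estimate = statistics.fmean(readings)

# The same readings with a shared 2 m offset: the average keeps the offset.
biased = simulate_fixes(true_pos=100.0, n=10_000, noise_sd=5.0, bias=2.0)
biased_estimate = statistics.fmean(biased)

print(f"unbiased mean error: {estimate - 100.0:+.3f} m")
print(f"biased mean error:   {biased_estimate - 100.0:+.3f} m")
```

More readings shrink the random part of the error, but an offset that every reading shares survives any amount of averaging.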

So I built a little thing I’m calling OpenDrift. If you go to the site with a phone there’s an alpha version of what I’m thinking. What it does is use your accelerometer to wait until your phone is still. Then, it uses the GPS to start recording. As soon as you pick it up it will stop since you moved the phone. If you knock the table it’s on, it will stop. And so on.
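The still-then-record behaviour might look something like the sketch below. It is not the OpenDrift code: the function name, the `threshold` and `settle` parameters, and the idea of feeding it (accelerometer magnitude, GPS fix) pairs are all assumptions made for illustration.

```python
def record_while_still(samples, threshold=0.05, settle=3):
    """Yield GPS fixes only while the phone has been still.

    `samples` is an iterable of (accel_magnitude_g, gps_fix) pairs, where
    accel_magnitude_g is the accelerometer magnitude in g (about 1.0 at rest).
    Recording starts after `settle` consecutive still samples and stops as
    soon as one sample deviates from rest by more than `threshold`.
    """
    still_count = 0
    for accel, fix in samples:
        if abs(accel - 1.0) <= threshold:
            still_count += 1
        else:
            still_count = 0  # phone moved or the table was knocked: start over
        if still_count >= settle:
            yield fix

# Made-up stream: picked up at "a", resting from "b" to "e", knocked at "f".
stream = [(1.3, "a"), (1.0, "b"), (1.01, "c"), (0.99, "d"),
          (1.0, "e"), (1.5, "f"), (1.0, "g")]
print(list(record_while_still(stream)))  # ['d', 'e']
```

Requiring a few still samples in a row before recording is one way to avoid logging fixes in the moment between putting the phone down and it actually settling.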

So, you could imagine instead of leaving your phone to do nothing overnight you could instead leave it to record 8 hours of drift data. We’d anonymize it and record drift information just for the nearest 100 mile square or something so we don’t know where your house is. Then we could aggregate that data with other phones across the world and see if we get something that looks accurate out of it.
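One simple way to blur a location to a big square is to snap it to a coarse grid. The sketch below assumes a cell of 1.5 degrees, which is very roughly 100 miles north to south (a degree of latitude is about 69 miles); east to west the cell gets narrower away from the equator. The function name and cell size are illustrative, not anything the project has settled on.

```python
import math

def coarsen(lat, lon, cell_deg=1.5):
    """Snap a coordinate to the south-west corner of its grid cell."""
    return (math.floor(lat / cell_deg) * cell_deg,
            math.floor(lon / cell_deg) * cell_deg)

# An arbitrary example coordinate.
print(coarsen(39.74, -104.99))  # (39.0, -105.0)
```

Every phone inside the same cell reports the same corner, so the stored position says which square you are in but not which house.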

Maybe, just maybe, we could produce a pretty visualization of that data. It would be a huge, fun citizen science project.

Today the code isn’t actually recording anything, and it can’t distinguish a phone from a laptop (which typically won’t have a real GPS). But the proof is there and I’m working on those things.

That sounds fun, what can I do?

Join the mailing list. Also, the code is on github, feel free to submit patches.

How to not get cancer

In principle, there are few chronic diseases that are more easily preventable than cancer.

Cancer as a Metabolic Disease, Chapter 19.

At the beginning of 2014 I started to get involved in CrossFit after decades of not really doing any exercise but biking and snowboarding. I discovered quite quickly how unfit I was and made rapid improvements.

This led fairly quickly to thinking more about what I was eating. If I was spending a bunch of time to get fitter then I should probably spend some time figuring out where the energy was coming from. What I used to eat was essentially any crap that was available.

There’s a strong paleo bias in the CrossFit community. Paleo essentially means eating roughly what you evolved to eat. The thinking is that it’s been a short time since pre-agrarian society existed and before farming we ate one set of things. After farming, we eat another set of things. Actually a radically different set of things from before farming.

The timeframe is evolutionarily very short. The theory is that we evolved to eat what we as a species ate before farming, and we haven’t evolved to eat the food we get in post-agrarian society. In fact, it takes about an order of magnitude longer for DNA to show some meaningful adaptation than the time we’ve had since someone invented farming.

Paleo led me to learning the three basic food groups: Carbs, protein and fats.

When I grew up, I was taught that fat was bad. It turns out that this isn’t really true. I was taught that carbs are good. Sadly that too doesn’t really have any truth to it.

Even more mind-blowing to me were the results of actually trying calories-in calories-out. This is the myth that if you balance the amount of calories you eat with the amount you burn you can gain or lose weight. It’s scary, but that’s not actually really true either.

You can read about all this in Why We Get Fat and many other books.

Reading that book led to some other interesting things. It turns out Japanese women don’t really get breast cancer. Japanese immigrants to the US do, but if they emigrate back it goes away again. That rules out genetics as the major factor.

Now I don’t know about you, but I was taught that cancer was caused by DNA mutation. A photon comes in from the sun and breaks some DNA, or a virus does the same thing, or some oxidant does it.  The broken DNA somehow causes a bunch of gene signaling that results in cells replicating out of control.

I was floored by the lack of actual evidence for this.

Now, what happens when you have some large metastatic cancer? The current technology is to give you a dose of radioactive glucose. Then you’re put in a large machine that detects that radiation (positrons as it turns out) and with a lot of computation will spit out three-dimensional images of where the radioactive glucose is.

Where is the glucose? It’s at the tumor sites. Think about that for a second.

The resolution of these machines is sub-centimeter (from memory I think it’s 7mm). So what happens if we do a biopsy and find a few cells but it’s not big enough to image or doesn’t have a clearly defined border so we can rip it out in surgery? Basically we irradiate you and kill everything. We hope that normal cells will rejuvenate back and that the cancer cells won’t survive. Unfortunately irradiating you has a lot of downsides which I won’t list, but are essentially horrific. Okay I’ll list one ironic side-effect, which is cancer.

This got me interested. Why was the glucose at the tumor sites?

It turns out someone was thinking about that and got a Nobel Prize for figuring out some of it in 1931. The theory is that cancer cells have broken respiration and are only able to ferment sugar for energy. This neatly (perhaps too neatly) ties a few things together.

First, that women in Japan aren’t eating bucket loads of sugar like we do in the US. If they aren’t eating all that sugar, and cancer requires sugar, then you’d expect cancer incidence rates to be lower. Second, it explains why essentially no progress has been made treating cancer in the last 40+ years since going after DNA wouldn’t be the right thing to go after.

But wait. You saw Jurassic Park where they had a bunch of gene sequencing devices and Thinking Machines supercomputers. What if we took a group of people with the same type of tumor and sequenced the DNA in there. We should discover similar mutations – even the same mutations – in these different people. Then we could target those genetic malfunctions using some space age drugs and stop the cancer.

It turns out that people have been trying exactly this. The problem is they haven’t been finding any common genetic flaws and therefore, the entire working model we have of how cancer works might simply be wrong. That ties in nicely with making no progress in a few generations.

So if it isn’t DNA, what is it? Mitochondria. They supply energy to cells and even have their own DNA. It’s fascinating that mitochondria are inherited from your mother, which is interesting since it means a different set of evolutionary pressures will apply.

It’s worth taking a break from cancer for a second. If you go look at the data you’ll see an explosion of all kinds of other things in the world of chronic disease. Diabetes, Alzheimer’s, Parkinson’s and lots more.

What about MS? Here’s something truly scary: It looks like MS is curable by essentially eating vegetables:

Dr. Wahls also has a few books out you can go find. So if thinking deeply about mitochondria can save someone from MS, what about those other things?

Well, T2 diabetes is your inability to control blood sugar. It turns out that by not eating sugar you can essentially cure T2 diabetes. What about Alzheimer’s and Parkinson’s?

There’s another great book there: Grain Brain. It turns out, again, that not eating sugar helps a lot. This idea that we end up old and get all these conditions, somehow left up to fate, just isn’t really true. What you eat and how you exercise will essentially preclude you from getting any of these things.

And none of this is particularly new. There was a book in the 80’s that’s been rereleased called Pure, White and Deadly. 50 years before that, Warburg was getting his Nobel.

So this made me think, is it too late for me? Growing up I had cereal with sugar for breakfast and an all-round cheap low-fat… Oh let’s also mention eating fat doesn’t make you fat, in fact it’s incredibly good for you… and high-carb diet.

After much more reading I got to Cancer as a Metabolic Disease. This guy decided to give a bunch of mice brain tumors and then deny them sugar to see what happened, and the result is a $130 cancer textbook examining everything from Warburg onward, up to and including treatment propositions.

We can skip back to diet again, where we left off on paleo. Paleo is essentially a low-carb diet which means no sugar. If you go look, it’s kind of interesting to see what health problems paleo people had (think: polio) and how we’ve solved most of them. Since they didn’t have sugar they didn’t get tooth decay, people have dug up their bones and figured that out.

What happens with very low carb diets? You go in to ketosis. The reddit keto community have a great FAQ all about it. It turns out you have this other way of fueling your body when you don’t have any sugars or things to metabolize in to sugars.

This again is interesting since if you were on planet earth ten thousand years ago then periods without food were a normal occurrence. The proposition from Dr. Seyfried and others is that without sugar, cancer cells are put under a lot of stress and they die. How do you deny them sugar? Well, don’t eat sugar, or simply don’t eat. There’s a third way, which simulates not eating: put yourself in to ketosis. You should spend some time on /r/keto and see how people do on the ketogenic diet. It’s insane.


This is a graph (from Seyfried’s book) of the glucose and ketones in someone’s blood over 30 days while they eat a ketogenic diet. Essentially the glucose goes down and the ketones go up in a compensatory manner so you don’t keel over and die.

By extrapolating out from mice models, Seyfried suggests that fasting for a week per year or a few 2-3 day fasts should kill the dysplastic cells you have. Your blood should look like the above graph but with the time axis shortened down from 30 to 7 days. And as it turns out, there’s already a lot of evidence that fasting is good for you.

So I tried it. I managed to get 3.5 days in. The problem was my timing. Let me say upfront I felt fine, actually great for the whole 3.5 days. But I chose to do it just before going on vacation for my birthday with a bunch of stressful driving and screaming kids. That was sub-optimal. At the 3.5 day mark I had some slight heartburn, got pissed off, and had a cookie.

The fascinating thing is how food craving feels. Having a ham sandwich in front of me felt just the same as having a chocolate cake or a beer sitting there. I expected some major difference there since surely chocolate is a treat and a beer is alcohol (and sugar). It was really strange to have the same episodic emotions over plain food.

During the period I recorded my blood sugar and ketone levels with one of these meters. Diabetics will be familiar with them. You lance your fingertip and squeeze to get some blood. Then dip a test strip to the blood and the machine magically spits some numbers out. You can see the numbers in this spreadsheet.

The problem is that the meters are pretty crappy. The readings are only accurate to within roughly 20% and the ketone strips (which are ten times as expensive as the glucose strips) have a relatively narrow reading band. The error on them is good enough for diabetics but not really for what I wanted. I also recorded blood pressure with one of these things.

You can see the glucose stay about the same and the ketones jump up on day 3. I think the glucose is problematic for two reasons. One, the accuracy of the device is roughly the same as the drop I should see, so it’s easy to hide it in the noise. Two, Seyfried warns about having anything but water. He talks about subjects having decaf tea and I was drinking gallons of decaf coffee.

If you look, I lose 2+lbs per day too.

The main barrier to fasting is simply self-discipline in the face of food everywhere. That’s why lots of people go on retreats to do it. You’re perfectly capable of fasting and obese people can fast for months on just water. Go look it up.

So I’m starting another fast again today without decaf coffee this time.

Here’s another video, longer and with more technical detail:


I don’t have a medical degree and I’ve glossed over a lot of detail in all this. I can’t summarize all these great books but hopefully just enough to get you interested. What can you do?

  1. Stop eating sugar. You’ll be amazed at the grocery store trying to find things that don’t have added sugar. Practically everything you pick up will have added sugar, under some name like “evaporated beet juice” or “dextrose” or whatever. Really, go see.
  2. Read. Get the books I’ve mentioned from the library or amazon, and here’s a recent good documentary to watch. It’s more than not eating sugar, but that appears to be a good start. It will be maybe 50 hours of time invested, but it’s a lot cheaper than getting cancer or some neurological malfunction. If you’re anything like me, educated by the government and charities, what you learn will blow your mind.
  3. Find a community. I highly recommend /r/keto as a starting point. It’s a place to ask questions and learn from others’ experience.
  4. Try to find some contradictory evidence. I’ve been trying, there doesn’t seem to be a lot out there that’s very defensible. Remember, people don’t change their minds, they just die and get replaced by new minds.

Cost per mile

Your transportation costs are best thought of per-mile. Most Americans apparently have a much more vague and forgetful way of looking at it; car payments and gasoline are just a background fact of life.

The Federal Government will refund you 50 or 51 cents on the mile which is a decent approximation. That builds in purchase price, maintenance and fuel. So that 10 mile roundtrip to the grocery store is a $5 cost. But we can go deeper than that.

Giant Freedom Twist Electric Bike


Because I was drunk, stupid, or both I bought an electric bike a while ago. The theory was I’d use it because of the crazy hills on the commute from home to work. The reality was the rain was so depressing I rarely used it. Along the way, I kept a spreadsheet.

Recorded costs of electric bike


I got the bike at approximately half-price as it was second hand. From there I recorded all my trips. Right now my cost per mile for the bike is $5. Yes, $5 per mile. I need to ride it another 2,300 miles before it comes close to Federal reimbursement levels. Moving as I have to Colorado, it’s now almost entirely useless. The boost going up hills (and there is only one near me now) doesn’t outweigh the unfortunate speed limit built-in. It tops out at 20mph or so, when I can get up to 35 on my other bikes down hills. This is some safety constraint imposed on electric bikes.
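The break-even arithmetic can be sketched in a few lines. The total cost and mileage below are hypothetical: they are not taken from the spreadsheet, only chosen so that they reproduce a $5/mile figure, and the helper names are made up for illustration. The 50 cent rate is the lower of the two reimbursement figures mentioned earlier.

```python
FEDERAL_RATE = 0.50  # dollars per mile

def cost_per_mile(total_cost, miles):
    """Everything spent so far divided by distance ridden so far."""
    return total_cost / miles

def extra_miles_to_reach(total_cost, miles_so_far, target_rate):
    """Further miles needed for cost per mile to fall to target_rate,
    assuming nothing more is spent on the bike."""
    return max(0.0, total_cost / target_rate - miles_so_far)

# Hypothetical inputs, chosen only to be consistent with $5/mile.
total_cost = 1275.0
miles_so_far = 255.0

print(cost_per_mile(total_cost, miles_so_far))                       # 5.0
print(extra_miles_to_reach(total_cost, miles_so_far, FEDERAL_RATE))  # 2295.0
```

With those made-up inputs the answer lands close to the 2,300 miles quoted, which shows how slowly a fixed purchase price amortizes when the trips are short.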

It’s been fun, but it’s not worth $5/mile. I’m mainly using my normal bike now, every day, and will be selling the electric thing.

In contrast, my car (we’ve gone from three cars to one) is currently costing $1.74/mile. This includes again, the purchase price, gas and maintenance but it has a higher error bar since I didn’t keep track of all the costs exactly. Over time this will drop as the purchase price is amortized over the lifetime of the vehicle.

It’s much easier biking to get groceries knowing that I’m saving $10 or $20 of driving costs, rather than thinking of the car as “free”. As the saying goes; a car burns gas and makes you fat, a bike burns fat and saves you money.

Compare and Contrast

United Airlines will fly me from San Francisco to London, and back, tomorrow, for $2,647. At 5,367 miles each way that’s 25 cents a mile with the added benefit of food and speed. If governments didn’t interfere so much, it would cost about half as much (about 50% of the cost is taxation).

So my electric bike costs twenty times what a 777 costs, and my car costs seven times a 777.

If only I could take a 777 to the grocery store.
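For anyone who wants to check the comparison, here is the same arithmetic as a short sketch. The fare, distance and per-mile figures are the ones quoted in this post; the variable names are just for illustration.

```python
flight_price = 2647.0      # dollars, round trip
flight_miles = 2 * 5367    # miles, there and back

rates = {
    "777": flight_price / flight_miles,
    "car": 1.74,
    "electric bike": 5.00,
}

for name, rate in rates.items():
    ratio = rate / rates["777"]
    print(f"{name}: ${rate:.2f}/mile, {ratio:.1f}x the 777")
```

The unrounded flight rate is a touch under 25 cents, so the ratios come out slightly above the round seven and twenty.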

Where the new things come from

It’s worth asking in a world full of copying, where do the new things come from?

They’re extremely rare for a start. We know that most new things will fail. Is this due to their inherent newness, or that they’re not really new, or that they’re not useful?

It feels like most so-called new things fail because they’re not really new. It’s yet-another wallet on kickstarter or instagram clone in San Francisco. Second, it’s because they’re not useful. It’s a social network just for squirrels, or a three-wheeled bicycle (or, tricycle, of course).

Only last do things fail because they’re new. That would be or webvan. New and just before their time. They’re all coming back in new disguises today.

Therefore when you do something new, it’s likely you should try to really be new and not just a copy of something else or useless.

Which is not to say copying or useless is the same as profitless. Clearly doing another fountain pen on kickstarter is profitable and the pet rock was genius in its uselessness. So, of course, many counter-examples exist.

In England, my default experience observing anyone trying something new was that they were ridiculed and dragged down. In the US, my experience is, on average, the opposite. It certainly feels like the majority of the new things come from the United States, and are then just copied in other places. Everything from the Boeing 737 (Airbus A320) to (

This isn’t universally true, of course. The jet engine would be a counter example. But, that and the Dyson vacuum cleaner are held up as if they represent the pinnacle of British achievement, over and above English or the Westminster system of Government (copied all over the world). Whereas in America similar innovations happen all the time and are just part of the natural background noise of the place.

A random example, the conference I ran, is thought of positively by the attendees (mostly American) whereas all the negative people, the vocal negative people, are on the other side of the planet in England. It’s hard to conceive of running a follow-up in the UK and receiving hate mail from people in the United States about it.

Interestingly, it doesn’t feel like British ex-pats living in the US suffer from this disease. Therefore, if correct, the best people to try new things are the very same people leaving Britain, making it (Britain) an even worse environment to try new things.

Without new things, we remain in the state we are today, with the same problems. Therefore it’s critical we have new things. So it should be shocking that there are so few new things and we readily drag down those who try to build them.

What can you do to try and right this, wherever you are? Try to find a positive way to react to new things and ideas. New things are scary and we don’t like change. We’re quick to find the negatives when presented with anything new. Try to find the positives instead, whether discussing a new idea at a pub or reading a controversial (e.g. new) book.

Are you copying the right thing?

I’m fascinated by the notion of copying in society. It’s everywhere. Making sure you’re copying the right thing appears to be very hard:

Dubai copied the skyscrapers of New York and ended up with a fake city. Hong Kong copied the basic freedoms and ended up with skyscrapers and a real city.

Everyone is copying the black rectangular nature of the iPhone, rather than something deeper like the Apple org-chart or other attributes like secrecy or having a HQ in Cupertino, or a British head of design.

Rich people have nice cars. Therefore people buy nice cars and expect to be rich.

Silicon Valley has venture capital firms. Therefore, European nations and cities create venture capital firms expecting Silicon Valley to show up.

Rich people tend to be educated, therefore you send your child to get educated in the expectation that they will get rich. In fact, richness tends to lead to education not the other way around.

Prosperous countries have money. Therefore if we send metric tons of money to poor countries they will become prosperous.

China’s wholesale copying of the United States; from fast food to an aerospace industry. Everything apart from the important thing; the constitution.

Fit people tend to exercise and eat well. So you try to exercise and eat well but fail, because you’re not copying something deeper like self-discipline.

We live in a cargo cult world.



4.6lbs down in two days, just from going zero carb. Or, as reddit prefers, keto.

Diet has consisted of American-sized steaks, leafy greens, coffee, water, and the savior: mini babybel. Plural might be babybeli?


These things are tasty-awesome. Buy a bag of 15 or so, take them out and pop them like, I don’t know, blueberry vodka shots at 3am.

As it’s 2013 Mini babybel have their own website, twitter feed, facebook page and youtube channel. You too can be friends with a cheese.

Oh how I miss thee


Exercise? I’ve been biking everywhere. Colorado being Colorado, you have to have some kind of extreme outdoor activity every single day. Thus marathon-like hikes across plains that look like some scene from Prometheus only in full color. Without the aliens.  But, with bears, which is almost as bad. Not yogi bear or honey monster. Real, actual, eat your face bears.

Like most things, there’s not a lot of data behind exercise giving you expected outcomes. A rare glimmer of hope was seen recently with this NYT article, neatly distilled by in to something usable:


Just go to, hit start…. and that’s it. I’m totally looking forward to the 6 and 5 minute workouts.

Incredibly, NYT link to the original paper supporting this exercise set. I’m a little skeptical of the depth of proof and unaware of exactly what the “Human Performance Institute” is, but hey, it’s SCIENCE!
