What is MapReduce?

MapReduce is a tool created at Google for processing large data sets in parallel across multiple computers. MapReduce takes your question or problem and splits it into bite-size chunks that can be distributed across a network of computers, which also divides up the time needed to crunch the entire data set.
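The split-and-distribute idea can be sketched with a word count, a common teaching example. This is a minimal illustration and not Google's implementation: the function names are made up for this post, and a local thread pool stands in for the network of computers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(lines, chunk_size):
    """Split the input into bite-size chunks, one per worker task."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Group the emitted values by word so each word can be reduced on its own."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for word, count in pairs:
            grouped[word].append(count)
    return grouped


def reduce_group(word, counts):
    """Reduce step: collapse all values for one word into a single total."""
    return word, sum(counts)


def word_count(lines, chunk_size=2, workers=4):
    """Run the whole pipeline; the thread pool is an assumed stand-in for a cluster."""
    chunks = split_into_chunks(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_chunk, chunks))
    grouped = shuffle(mapped)
    return dict(reduce_group(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["the cat sat", "the dog sat", "the end"]))
```

Each chunk is mapped independently, which is what lets the work spread across machines; only the grouping step needs to see results from every chunk.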

This is a particularly important development given the large amount of data being generated as more business is conducted digitally. The potential to find profitable insights increases as more data is gathered on customers, and efficient tools like MapReduce keep the analysis of that data practical.

Amazon Web Services describes this process in a very effective graphic:

Here’s an interview with the inventors of MapReduce and their explanation of how it works:

The Internet | Mapping the New World

In Pre-Socratic Greece, a philosopher by the name of Anaximander took it upon himself to describe, on paper, the natural landscape surrounding him and to theorize what the world must look like from an outside, third-person perspective. The result was the first theorized map of the world.

Understanding our own world has been at the forefront of human progress since Anaximander. Now that we have successfully mapped our globe, we have moved on to mapping the meta-world of HTTP, site-to-site links, and web pages. The following is a chronology of such pushes toward understanding the landscape that makes up the web.

Internet Mapping Project | Bill Cheswick and Hal Burch | 1998

In yet another wonder right out of Bell Labs, Bill Cheswick and Hal Burch took it upon themselves in the summer of 1998 to collect and save internet topology data over a long period of time. By 1999 they had a working product and a good-looking map of the internet. Cheswick and Burch spun off in 2000 to start Lumeta Corporation, which applies their topological discovery techniques to make discoveries about the web surrounding a client.

According to Bill Cheswick’s site, Cheswick.com, “This mapping consists of frequent traceroute-style path probes, one to each registered internet entity. From this, we build a graph showing the paths to most of the nets on the Internet.”
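The graph-building step in that quote can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the paths are made-up sample data using reserved documentation addresses, no real probes are sent, and `build_graph` and `count_links` are hypothetical helper names.

```python
from collections import defaultdict


def build_graph(paths):
    """Merge hop-by-hop paths into one undirected adjacency map."""
    graph = defaultdict(set)
    for path in paths:
        # Every pair of consecutive hops in a path is one link in the graph.
        for near, far in zip(path, path[1:]):
            graph[near].add(far)
            graph[far].add(near)
    return dict(graph)


def count_links(graph):
    """Each undirected link appears in two adjacency sets, so halve the total."""
    return sum(len(neighbors) for neighbors in graph.values()) // 2


if __name__ == "__main__":
    # Hypothetical traceroute-style results: one path per probed destination.
    sample_paths = [
        ["192.0.2.1", "198.51.100.1", "203.0.113.10"],
        ["192.0.2.1", "198.51.100.1", "203.0.113.20"],
        ["192.0.2.1", "198.51.100.2", "203.0.113.30"],
    ]
    graph = build_graph(sample_paths)
    print(len(graph), "nodes,", count_links(graph), "links")
```

Because the paths share their early hops, merging them collapses the overlap into a tree-like graph fanning out from the probing host.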

The map, when presented by Wired Magazine in 1998, looked like this:

Rocketfuel | University of Washington | 2002

The University of Washington touts their ISP topology mapping engine named Rocketfuel, which uses “routing information to focus our efforts on an ISP at a time, then use ISP specific router naming conventions to understand the topology.”

Starting with 10 ISPs in Europe, Australia, and the USA, Rocketfuel was able to create a database with 50,000 IP addresses representing 45,000 routers in 537 POPs connected by 80,000 links.

You can see the maps produced by Rocketfuel here, and a paper discussing how Rocketfuel works here.

Internet-Map.net | Ruslan Enikeev | 2012

Fast-forward to 2012, when a man by the name of Ruslan Enikeev decided to take the resources available to him and map as much of the internet as possible. Using technologies provided by Google Maps, as well as data on 196 countries, 350,000 sites, and 2,000,000 links between them, Enikeev created the first public map of the internet.

It took a year to complete, even with the help of additional coders, designers, and the creative agency Positive Communications. Enikeev humbly gives credit to Daniel Galper for tile generation and web coding, Vasiliy Pugovkin for web design, Leonid Lil and Vitaly Zuzin for HTML markup and JavaScript, and Sergey Suchkov, the CEO of Positive Communications, who lent his resources to help create the map after seeing a prototype of only 1000 sites.

The utility of the map is debatable, especially in its current form, but you can arguably use it in business for competitive purposes: it lets you see how big your competition’s internet presence is relative to your own, and spot possible up-and-coming acquisition targets in your industry or in second-tier related industries.

Much as graphs do for mathematics, people like Bill Cheswick, Hal Burch, and Ruslan Enikeev have taken it upon themselves to turn the raw meta-data of the internet into something aesthetically pleasing and highly understandable to the masses. Internet mapping today is most likely a very basic form of what it will look like in the coming years, and just as with the historic mapmakers that Anaximander inspired, the future of mapping the internet, built on what we already have, is going to be amazing.