By 1884, halfway through the process, it was evident that it would take a long time. One of the employees of the USCO was therefore asked to design a machine that would process the enormous amount of data much faster for the upcoming 1890 census.
That employee was Herman Hollerith. In fact, William Hunt and Charles Pidgin were asked the same question. A benchmark was prepared in which all three could demonstrate how fast their solutions were. Coding took 144 hours with Hunt's method, 100 hours with Pidgin's method, and 72 hours with Hollerith's method. Processing the data took 55, 44, and 5 hours, respectively. The conclusion: Hollerith's solution was the fastest by far and was, therefore, selected by the USCO.
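To put those benchmark numbers side by side, here is a minimal Python sketch that adds up the coding and processing hours quoted above and compares each method with the fastest one. The variable names are my own, and the figures are simply the rounded hours mentioned in this post.

```python
# Hours quoted in this post for each method: (coding, processing).
benchmark = {
    "Hunt": (144, 55),
    "Pidgin": (100, 44),
    "Hollerith": (72, 5),
}

# Total time per method is coding time plus processing time.
totals = {
    name: coding + processing
    for name, (coding, processing) in benchmark.items()
}

# The method with the smallest total is the fastest overall.
fastest = min(totals, key=totals.get)

for name, total in totals.items():
    ratio = total / totals[fastest]
    print(f"{name}: {total} hours in total, {ratio:.2f}x the fastest")
```

Looking at processing alone, the gap is even larger: 55 hours against 5 hours means Hollerith's method was 11 times faster than Hunt's on that step.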
For the 1890 census, 50,000 men were employed to gather the data and transfer it to punch cards. It was decided to store many more data attributes: 235 instead of the 6 used in the 1880 census. Hollerith also invented a machine for punching cards. This machine made it possible for one person to produce 700 punch cards per day. Because of Hollerith's machines, 6,000,000 people could be counted per day. His machines reduced a ten-year job to a few months. In total, his inventions led to $5 million in savings.
Hollerith's ideas for automating the census are described in Patent No. 395,782 of Jan. 8, 1889, which starts with the following sentence: "The herein described method of compiling statistics ..."
Does this all sound familiar? Massive amounts of data, compiling statistics, the need for better performance. To me it sounds as if Hollerith was working on the first generation of big data systems.
Hollerith started his own company in 1896, the Tabulating Machine Company. In 1911 it merged with some other companies to form the Computing-Tabulating-Recording Company (CTR), and in 1924 the name CTR was changed to IBM. In other words, IBM has always been in the business of big data, analytics, and appliances.
Why did it take so long before we came up with the term big data while, evidently, we have been developing big data systems since the earliest days of computing? You could say that the first information processing system was a big data system using analytics. This means that Hollerith, besides being a very successful inventor, can be considered the grandfather of big data.
Posted February 1, 2013 7:00 AM