Information is not new and nor is data – of whatever order of magnitude. This is not the first ‘big data’ era but the second. The first was the explosion in data collection that occurred from the early 19th century. This was an analogue big data era, different to our current digital one but characterized by some very similar problems and concerns.
The early 19th century was when the collection, analysis and production of various forms of information accelerated at a rate not previously seen in human history. The 18th century had already seen rapid developments in dictionaries of various kinds, including Diderot’s 1751 Encyclopédie (based on Chambers’ Cyclopedia) and Johnson’s 1755 Dictionary of the English Language (not the first of its kind), illustrating a growing need not just to collect but to classify, categorise and order information to make it both meaningful and useful. The idea of, and search for, innate rules and regularity across a wide spectrum of phenomena emerged, and the search for laws of nature followed in the next century.
The 19th century was a pre-digital era in which the ‘computer’ was an individual at a desk doing the counting and calculations manually rather than an electro-mechanical or electronic device, but even this early infrastructure clearly set the scene for our current situation. More specifically, it can be called the first information age. The sciences as we know them were assuming their modern shape: the social sciences were emerging from what were known as ‘political arithmetic’, ‘social physics’ and latterly the ‘moral sciences’, while science became an undertaking distinct from natural philosophy.
The collection of social data had a purpose – understanding and controlling the population in a time of significant social change. To achieve this, new kinds of information and new methods for generating knowledge were required. It is clear that many of the problems in this first big data age and, more specifically, their solutions persist down to the present big data era.
Contemporary problems of data analysis and control are made ‘big’ by a variety of accepted factors, generally including size, complexity and technology issues. Digitisation is a central process in this second big data era, one that seems obvious but which also appears to have reached a new threshold. Until a decade or so ago, ‘big data’ looked just like a digital version of conventional analogue records and systems. Now, however, we see a level of concern and anxiety similar to that faced in the first big data era.
Cataloguing systems had existed for centuries but this period saw their emergence as formalized systems such as the Dewey Decimal System (1876). Floridi, writing on the philosophy of big data, has said quite specifically that the real big data problem we face today is less one of the quantity or quality of data, or even of technical skills, than one of epistemology.
Much of the data collected about human beings by various systems has a history not simply of description or even understanding but of control. Every deviant or ‘underperforming’ social category, once documented, is a warrant for action. Social data is rarely neutral, and the persistence of ‘wicked’ social problems illustrates how their regulation has been preferred to their solution.
Classical sociology distinguishes between structure and agency. Structure is still equated with order against the potential horrors of anarchy, while agency remains couched in moral terms as personal responsibility.
The collection of data about the deviant categories of people, in particular, was a marked feature of the first big data environment. The risk is that ‘big data’ replicates the ideological underpinnings common to much of what has been produced under the small data paradigm.
Our question then is how do we go about re-writing the ideological inheritance of that first data revolution? The need for critical analysis grows apace not just with the production of each new technique or technology but with the uncritical acceptance of the concepts, categories and assumptions that emerged from that first data revolution.
We are in a period that can reasonably be seen as the second ‘big data’ revolution, and it is revolutionary because it challenges our accepted understanding of the world, not simply because of the volume and velocity of data generation in our new digital information technologies.
Many social categories were designed to control, coerce and even oppress their targets. The poor, the unmarried mother, the illegitimate child, the black, the unemployed, the disabled, the dependent elderly – none of these social categories of person is a neutral framing of individual or collective circumstances. They are instead a judgement on their place in modernity and material grounds for research, analysis and policy interventions of various kinds. Two centuries after the first big data revolution many of these categories remain with us almost unchanged and, given what we know of their consequences, we have to ask what will be their situation when this second data revolution draws to a close?
Like that first data revolution, this present one also has ambitions for people and their interactions with the new media emerging in its wake. Following Floridi, we can see this as a significant epistemic and ethical problem in our current big data era.
“Verily, God orders justice and good conduct and [..] forbids immorality and bad conduct and oppression.” (Qur’an, 16:90)