icm2:re (I Changed My Mind Reviewing Everything) is an ongoing web column by Brunella Longo

This column deals with some aspects of the change management processes experienced in almost any industry impacted by the digital revolution: how to select, create, gather, manage, interpret and share data and information, either because of internal and usually incremental needs - such as learning, educational and re-engineering processes - or because of external forces, like mergers and acquisitions, restructuring goals, new regulations or disruptive technologies.

The title - I Changed My Mind Reviewing Everything - is a tribute to authors and scientists from different disciplinary fields who have illuminated my understanding of intentional change and decision-making processes during the last thirty years, explaining how we think - or how we think about the way we think. The logo is a bit of a divertissement, from the Latin divertere, which means to turn in separate ways.

I am not the lady next to me

About the challenges of datafication

How to cite this article?
Longo, Brunella (2017). I am not the lady next to me. About the challenges of datafication. icm2re [I Changed my Mind Reviewing Everything ISSN 2059-688X (Print)], 6.8 (August).

Reality is not always probable, or likely.
Jorge Luis Borges

Xero-civilisation entails, first of all, the collapse of the concept of copyright.
Umberto Eco, De Bibliotheca, 1981

Scholars not only regularly use but highly value libraries. But they do not often take them as a theme for analysis, reflection, or public discussion. Perhaps, as more than one philosopher has suggested, there is a certain inevitability to this: we tend to take for granted the settings in which we all routinely operate.
Michael F. Winter on Umberto Eco, De Bibliotheca, 1994 (1)

[…] only in the mirror of the past does it become possible to recognize the radical otherness of our twentieth-century mental topology, and to become aware of its generative axioms that usually remain below the horizon of contemporary attention.
Ivan Illich (2)

London, 5 February 2018 - Information overload has long been a major topic in the ICT and creative industries. I remember major campaigns and marketing arguments on the issue since the early 1990s. The problem is that almost nobody is going to read or understand what we write, even when we are talking about extraordinary discoveries or achievements, unless people are prepared to prevent and deal with the mass of cognitive debris and emotional distractions that freeze human understanding and judgement every second.

Of course we can call for a self-regulated, savvy reduction of the literature produced, first of all for scientific reasons and secondly to ensure compliance and controls. And yet we cannot expect academics to demonstrate they deserve their jobs by walking dogs, nor civil servants to explain the law by relying on podcasts on how to cook healthy meals - though such practices could surely help everybody’s fitness to practise.

No matter how controversial it can be, or how enormously enhanced and augmented it can be through audiovisual and graphical contributions, the written text has remained the most efficient way to share and advance human knowledge and social and political understanding for millennia.

Will the digital economy change such equilibrium? The question of how much a new medium of communication impacts our social and economic systems and alters our economic activities, influencing also our social relationships, has fascinated social scientists for decades, starting from Harold Innis, who spent several years and wrote three books tracing back such intertwined matters of history and media technologies before his clever student McLuhan squeezed all the juice of Innis’s studies into the famous phrase “the medium is the message”.

The academic world keeps on reflecting and advancing our understanding of the complex relations between media and technologies: now the information overload phenomenon is perceived as the “datafication” of our collective knowledge and social habits, with an overwhelming delegation of control and sense-making to algorithms and platforms. As a data engineer with a socio-technical background, I share much of the interest and some of the concerns, but not the same attitude.

Is the devil always to be seen or sought inside the machines? Do we always need to blame the engineers when we do not understand the world around us? Are all technologists and technicians alike so immoral or even amoral? What an unbearable bias we have had to fight, in the puritanism of the humanities and the social sciences, decade after decade into the digital revolution!

I remember I was called an “amoral technician” and regarded with suspicion from my very first job, because I suggested we could mechanise the production of cards for library catalogs in the early 1980s: that was perceived as the intrusion of an adversarial force and its artillery into the library’s sacrality.

Anyhow, I am sure universities and big IT groups around the world will sooner or later find the definitive way to tackle the problem of information overload, and we will all enjoy being more productive with less waste of paper and of people’s attention - some breakthrough technology or methodology to manage big data to that extent is on its way! In the meantime the onus is on us to try to select and understand what really matters if we want to stay up to date in a certain field or make informed and responsible decisions.

Vicinity does not necessarily mean similarity

A few days ago I attended the presentation of new co-authored research by two social scientists, talking about software platforms’ datafication. A colleague of theirs sat next to me.

I understood (or misunderstood, it doesn’t make much difference for the sake of this story) from some words they exchanged while we were taking our seats that she had been away for a while, possibly because of cancer or another debilitating condition, and had just come back to her work within the same university department.

During the presentation, I used pen and paper to write down some notes (for the record of evidence of my possible personality disorders, I must ask you to pay attention to the fact that I usually take notes in a paper notebook, whereas in other cases I may use my laptop’s Notepad or TextEdit application, and in others my smartphone, either in active mode, which consists of writing notes and taking pictures, or in passive mode, id est lurking on what other people say on social media about the talk or searching for something that has just been mentioned by the speakers; finally, it is true that sometimes I just put down a word on a piece of scrap paper or on a tissue. This last, minimalist option is because there is possibly nothing else I want to remember from the gathering but an idea that may not have any relationship with the actual context in which I heard the word. I hope this clarifies something).

The lady next to me occasionally used a tablet to write something and to check or send emails. I noticed that on at least a couple of occasions she glanced at my notebook, which I find - as I guess everybody else on earth would - annoying.

One of the speakers had used the word “posts” to refer to user-generated content as distinct from the concept of “social data snatched, detached and stored by the platform, together with the actions performed on such data”.

He used the word “post” while I was expecting, or hoping, that he would talk about the creation of content, or authoring, instead.

So I made a comment to say just that: the chosen words revealed a further, underlying and interesting distinction between, on one side, the act of creation by a certain author, who produces a certain content and, on the other side, the possible multiple instances of such content (what the speaker called posts). These posts, fragmented and stored within the platform, with or without manipulations, augmentations and interpolations, with or without the original author’s awareness and consent, create spirals of nodes and links that may not have any residual capacity of representation: they are just another world.

In spite of the welcoming space, obvious reasons of time (ah, ah!) prevented me from adding something else I believe is very important here: the further distinction between a work and its various manifestations is not only very relevant for reflections and methods of interest to social scientists or other researchers dealing with authoring and identity issues but, as far as I can say from my experience as an information designer, it is also a very pragmatic and fundamental data engineering notion.

The distinction is what makes any reverse engineering, tracing or auditing possible. It is what enables us to make corrections, improvements and changes in a deliberate, intentional and controlled manner. In fact the distinction between one data unit and its possible copies, instances, manifestations and so on ensures that inference rules do not go out of control in any automated process, service or system while we design new applications or write new machine learning algorithms - but it is also true that the distinction is important to prevent us from applying laws and regulations without thinking, in a “ticking the boxes” mode that is compliance in nominal terms only.
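To make the notion concrete, here is a minimal sketch in Python of how a system might keep a work separate from its instances, so that any copy can be traced back to its author. Everything in it is an assumption made for illustration: the names Work, Instance and Registry, the field names and the "transformation" labels are hypothetical and do not describe any real platform.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Work:
    """The act of creation: one author, one content (hypothetical model)."""
    work_id: str
    author: str
    content: str


@dataclass(frozen=True)
class Instance:
    """One stored copy or manifestation of a work, e.g. a post."""
    instance_id: str
    work_id: str
    platform: str
    derived_from: Optional[str] = None  # parent instance, if this is a copy of a copy
    transformation: str = "copy"        # e.g. "copy", "excerpt", "annotation"


class Registry:
    """Keeps works and instances apart so that every instance stays auditable."""

    def __init__(self) -> None:
        self.works: Dict[str, Work] = {}
        self.instances: Dict[str, Instance] = {}

    def add_work(self, work: Work) -> None:
        self.works[work.work_id] = work

    def add_instance(self, inst: Instance) -> None:
        # Refuse instances that cannot be tied to a known work or parent:
        # this is what keeps the lineage traceable (and free of cycles).
        if inst.instance_id in self.instances:
            raise ValueError(f"duplicate instance: {inst.instance_id}")
        if inst.work_id not in self.works:
            raise KeyError(f"unknown work: {inst.work_id}")
        if inst.derived_from is not None and inst.derived_from not in self.instances:
            raise KeyError(f"unknown parent instance: {inst.derived_from}")
        self.instances[inst.instance_id] = inst

    def trace(self, instance_id: str) -> List[str]:
        """Walk from an instance back through its parents to the original work."""
        start = self.instances[instance_id]
        chain: List[str] = []
        current: Optional[Instance] = start
        while current is not None:
            chain.append(f"{current.instance_id} ({current.transformation})")
            parent = current.derived_from
            current = self.instances[parent] if parent is not None else None
        work = self.works[start.work_id]
        chain.append(f"{work.work_id} by {work.author}")
        return chain

    def instances_of(self, work_id: str) -> List[str]:
        """List every known instance of a work, e.g. to propagate a correction."""
        return sorted(i.instance_id for i in self.instances.values()
                      if i.work_id == work_id)
```

With such a registry, a correction or a withdrawal of consent can be applied once at the level of the work and then propagated to whatever instances_of returns, while trace answers the auditing question of where a given post came from. A real system would of course need persistence, access control and a richer vocabulary of transformations.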

Computer scientists and marketing experts tend to be unconditionally happy with so-called “polycentric views” of social media developments and do not want, at first, to be bothered with too many abstract governance concerns that they feel could mutilate creativity, serendipity or innovation. In some respects they feed the bias and profit from being seen as the “amoral technicians”.

I am myself pretty happy to design systems that are durable and sustainable over time and, why not, systems able to adapt and fit into new spaces, accepting all sorts of possible value propositions. But that does not mean unconditional openness to behaviours that are likely to contradict the premises or the scope of a process. Reality, Borges once said, is not always likely.

We know, from Harold Innis indeed, that small changes in communications equilibria may cause tremendous shifts of power, and generate unexpected conflicts, new social divides and uncertainties. We do not need hundreds of pages of studies into the Arab Spring or the Russian infiltration into the US elections to understand more about some fundamental notions of modern communications, do we?

While data and information engineers tend to look at the datafication phenomenon with the will, the energy and hopefully some tools to prevent problems, it looks like the platforms’ design and functioning are so far dominated by marketing and computer science experts who prefer to experiment with, study and deal with the “real” thing of continuous changes as they develop, without too much critical reading and discussion of the consequences.

Going back to the story I was writing about, when I very briefly spoke to say what I wanted to say, I had the impression that one of the speakers, or possibly both, looked at the lady next to me and then at me as if they were wondering whether our being seated close to each other was in any way meaningful, whether we had any other sort of affinity or closeness, and then what the meaning of such a relationship could be, if any.

Towards the end of the very interesting lecture, the attention of her colleagues had become so compelling and explicit (I think one of the lecturers called her name to solicit a question, but I did not catch it) that the lady next to me had to say something. She said we should consider the distinction between the “I” and the “me”.

I found it interesting and I think I nodded, then I smiled looking at my notebook.

In my notes, whether I take them on paper or use Notepad / TextEdit on a computer, if I want to remind myself of something like a comment, critique or idea connected to the main discourse, lecture or content given by the speakers, I tend to indent the text and use a clear mark of attribution to remind myself that it is my idea, not an idea of the speakers or a question raised by someone else, as in the following paragraph:

	ME: I think everybody’s got the message here. 


In general, we can surely conclude here that the issue of a hyper-connected production, use and re-use of user-generated content that causes noise, pollution and security concerns is not new, and in any application or new system we design we have to come up with new solutions to address it.

What seems new to me, and potentially fertile too, is the tendency of social scientists and economists to finally look at the “noise” issue of datafication from a holistic and socially responsible point of view, which is usually what we as data engineers or data devils, information designers, architects and so on have always tried to do. In practice, we always have plenty of options to incorporate users’ controls into a system or to allow transparency and accountability. If we do not use these options, it is because we are not good enough or we have been explicitly asked not to do it.

I hope these conclusions do not look more cynical or desperate than they actually are. In fact, I optimistically believe that we can bring about change and engineer a better “dataficated” (not sure I like the word, that is why it is in quotation marks here) universe if we start from a very basic distinction, so basic that the reader may wonder whether it is an offence to suggest it here. I mean the distinction between data (or information) and communications. That is the subject of the next article.


(1) The Library Quarterly, 64 (1994), 2, p.117-129.
(2) In the Mirror of the Past: Lectures and Addresses 1978–1990 (New York: Marion Boyars, 1992), 9–10.