icm2re logo.

icm2re (I Changed My Mind Reviewing Everything) is an ongoing web column by Brunella Longo

This column deals with some aspects of the change management processes experienced in almost any industry impacted by the digital revolution: how to select, create, gather, manage, interpret and share data and information, either because of internal and usually incremental scope - such as learning, educational and re-engineering processes - or because of external forces, like mergers and acquisitions, restructuring goals, new regulations or disruptive technologies.

The title - I Changed My Mind Reviewing Everything - is a tribute to authors and scientists from different disciplinary fields who have illuminated my understanding of intentional change and decision making processes during the last thirty years, explaining how we think - or how we think about the way we think. The logo is a bit of a divertissement, from the Latin divertere, which means to turn in separate ways.



Otzi the iceman and the secrets of rankings

About the credibility of search engines’ results

How to cite this article?
Longo, Brunella (2017). Otzi the iceman and the secrets of rankings. About the credibility of search engines’ results. icm2re [I Changed my Mind Reviewing Everything ISSN 2059-688X (Print)], 6.6 (June).

London, 22 January 2018 - In a chapter of my 99 STARS collection - a forthcoming book on skills and competencies for the digital age - I have considered one of the most effective and successful exercises I designed while I was running, as director and instructional designer, my former Italian business Palestra Internet Panta Rei (1995-2007). Here is how it looks in its synthetic “STAR” format:

SITUATION: A consultancy and training business providing, among other services, advice on educational policies, information retrieval products and electronic information sources through various methodologies, including e-learning courses.

TASK: To design an online exercise about the evaluation of web data quality. This would be widely accessible and focus on the hidden editorial control existing in the representation of a subject within the media text of Google and other search engines. It should also be easy to replicate in one’s own discipline or market sector.

ACTION: I chose a subject that was interesting from various perspectives: the discovery of the natural mummy called Otzi the Iceman in the Otztal Alps in 1991. The mummy had received attention from scientific, educational and commercial organisations. I selected four web sites representative of different types of information available on the web. Participants had to analyse each web site and decide in what relevance ranking order they would suggest their own audiences consult the four web sites. Then they had to investigate the subject using five different search engines and take note of the position of each web site in the results pages of each search engine. Finally, they would compare their own ranking choices with the ones applied by the various search engines’ algorithms to the same four web sites.

RESULTS: The Otzi exercise (the text of which is still available here, in Italian) always delivered its rationale, showing that search engine results pages are not at all neutral representations of the contents available online: their rankings are the result of proprietary formulas that combine many calculations. Conversely, our own human relevance ranking depends on our own perceptions, reasoning and judgements in a certain context (a minimal sketch of the comparison step follows below).
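For readers who want to replicate the comparison step with a subject of their own, here is a minimal sketch in Python of how a participant's ranking can be compared with each engine's ranking; the site labels, positions and engine names are invented placeholders, not data from the original exercise.

    from itertools import combinations

    def rank_agreement(rank_a, rank_b):
        # Kendall-style agreement between two rankings of the same items:
        # +1 means identical order, -1 means completely reversed.
        concordant = discordant = 0
        for x, y in combinations(rank_a, 2):
            product = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
        pairs = len(rank_a) * (len(rank_a) - 1) / 2
        return (concordant - discordant) / pairs

    # Position (1 = first) given to each of the four sites by a participant...
    human = {"museum": 1, "university": 2, "news": 3, "souvenir_shop": 4}

    # ...and the positions observed in each engine's results pages.
    engines = {
        "engine_A": {"museum": 2, "university": 4, "news": 1, "souvenir_shop": 3},
        "engine_B": {"museum": 1, "university": 3, "news": 2, "souvenir_shop": 4},
    }

    for name, ranking in engines.items():
        print(name, "agreement with the human ranking:",
              round(rank_agreement(human, ranking), 2))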

Search engines and the fake news challenge

Fifteen years later, the Otzi exercise came back to my mind in a discussion with information professionals from various walks of life and studies - computer and information science, economics, engineering, history - debating the topic of the moment, fake news!, and how to assess the credibility of what we browse, learn and exchange online.

Oh dear, what a deja vu! This is an industry in which people seem to take three steps forward and two back at all times.

The problem of assessing data quality is at the heart of any digital work or function. The “web is for everyone” but we all want to reassure our customers (and be reassured) that we do have what it takes to separate the wheat from the chaff. That means not just accessing the information we need but being able to use it in a sound manner, understanding the underlying structures that show us some information - and not something else.

That is because the way an information retrieval system works is by reducing everything, from the film Titanic to your boiler’s manual, to numbers, and making calculations with them.
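As a toy illustration of that reduction to numbers - a generic textbook bag-of-words model, not a description of how any particular search engine actually works - documents and a query can be turned into term-count vectors and scored with a similarity measure:

    import math
    from collections import Counter

    def vectorise(text):
        # Turn a piece of text into a bag-of-words vector of term counts.
        return Counter(text.lower().split())

    def cosine(v1, v2):
        # Cosine similarity between two term-count vectors (0 = unrelated).
        dot = sum(count * v2[term] for term, count in v1.items())
        norm = math.sqrt(sum(c * c for c in v1.values())) * \
               math.sqrt(sum(c * c for c in v2.values()))
        return dot / norm if norm else 0.0

    # Two invented "documents" and an invented query, for illustration only.
    documents = {
        "film_titanic":  "the film titanic sank into the cold atlantic ocean",
        "boiler_manual": "boiler manual with pressure settings and cold water supply",
    }
    query = vectorise("cold water boiler pressure")

    for name, text in documents.items():
        print(name, round(cosine(query, vectorise(text)), 3))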

One veteran colleague from the public opinion and market research sector admitted recently, during a conversation about politics, that the reason for the irreducible difference between us is that he has always been interested in assessing quality through perceived values whilst I am more for assessing quality through objective measurements.

Ah! That can still be true, I replied, it has been true for a long time, but I believe it is no longer as simple as we used to think: the two areas of evaluation - subjective values and objective measurements - have in a certain sense collapsed into each other’s arms following the internet revolution. And online search technologies have surely contributed to this, if they are not its main cause.

Yes, you read that right: there was an online world before the world wide web, and that is where I happened to meet innumerable strange individuals including… Otzi the iceman!

The online information journey

In the early 1990s the online information world was very structured and tidy: there were costly databases accessible through various national and international providers in real time (which was so slow that you did not realise it was in real time! it was still a world of command languages, mainly black and green screens, and modem speeds of at most 1200 bits per second).

Information seeking was considered in the context of quite predictable behaviours performed by expert intermediaries who specialised in such tasks, a sort of “all in one” librarian-computer-geek-electrician good at managing research briefings and computer floppy disks as well as telecom cables. These ancestors of today’s Chief Digital Officers and Data Engineers (or the like) applied the notion of relevance as it had been defined since the early experiments on the effectiveness of information retrieval systems in the late 1960s and early 1970s.

At the time it was still part of the formal education of librarians, teachers, perhaps journalists and social researchers to use checklists of pre-defined criteria considered the important bits to assess - or rather to confirm - the authoritativeness and trustworthiness of the sources at their disposal.

An increasing number of sources was also accessible on CD-ROM, and there were significant differences between Anglo-Saxon and Latin countries in terms of market production and usage of commercial databases, points of access (telecommunication networks) and experiments with new multimedia and hypermedia. Overall, though, the information world of the early 1990s was still dominated by the notion of scarcity: the databases’ format - and in some respects also their markets - had been relatively stable since the 1970s. Their market value depended on their scarce availability and accessibility, and so their quality, too, was usually evaluated at the point of access.

The immediate ease of use of the hypertext format first, and the universality of the world wide web soon after, brought a huge disruption into the sector in the mid 1990s. Production, distribution and access were soon to be dis-intermediated and opened up to billions of end users. In 1993, upon request of the Association of Italian Libraries, I published a little handbook on how to use a database (Banca Dati, AIB, 1993): its main purpose was to explain that we, as experts, were the guardians and gatekeepers of such big pots of magic, immaterial, invisible knowledge. The value of the information retrieved from publicly accessible commercial databases was in some respects taken for granted, because it had previously been assessed against the typical needs of a certain group of people or of a whole organisation. The major breakthrough of my first internet project in 1993-94 was exactly that: becoming aware that abundance of access, and no longer scarcity, would drive the evolution of the business models for electronic information vendors and new online publishers - although we would not see too much of a revolution in terms of profitability, but this is another story.

Abundance was everywhere in the early developments of the commercial internet: sources and materials, points of view and opinions, companies and non profit organisations, and so on. The traditional online world reacted with vigour, increasing the number of distributors for the same database: different hosts implementing different command languages to interrogate the same database were aiming at retaining customers and increasing the competition among platforms, whilst undergoing a process of concentration of ownership into a few big players (Thomson Reuters, Dow Jones, Bloomberg, Pearson, Reed Elsevier). But then these too migrated their proprietary closed technical platforms to the WWW, quite precipitously. It was, at first, a disorienting transition for many traditional database users, puzzled by a very rapid turnover of contractual terms, software features, new interfaces, new ways to manage requirements and measure compliance with customer expectations. I invented a way of managing such craziness by saying we would have “jazz sessions” - that was my window of opportunity to learn about some methodological ideas that would later be incorporated into Agile Project Management.

And it was indeed a watershed moment for the whole of the communications industry. I had to find new reference points and new methods to measure data quality. This would inevitably point towards the perceived - thus subjective - value of data retrieved at the point of use, but it should still be comparable against a fixed number of criteria to produce some objective evaluations.

Colleagues from the US academic world moved towards a new idea of “user-based” relevance. There have been innumerable abstract and philosophical formulations of this concept but no substantial practical implementation, progress or innovation for several years, since the internet itself - with the proliferation of access to electronic databases via new web interfaces - was the truly relevant revolution to deal with.

With new compelling alternative sources coming out of the blue every day from all around the world, the first generation of internet-based information services was surely both exciting and confusing at the same time, but it did not have the problem of quality at its core, because we saw a multiplication of access points and interfaces and no significant change in the contents of primary sources.

The Internet cyclone

The majority of the new services available through Gophers and FTP did not fit into any available standard or satisfy bespoke evaluative checklists. How could we then decide what to pick up and integrate into otherwise organised, classified, systematic knowledge and information or business intelligence systems?

I turned to some formal training in 1993-1994 and, among my first hypermedia and internet courses, I could not have found a better mentor than Corrado Pettenati, an engineer turned librarian at CERN in Geneva, exactly where Tim Berners-Lee had his… web kitchen, let’s say (I used to say at the time that knowledge management consists of nothing other than cooking delicious dishes by picking up data ingredients here and there - no wonder traditional librarians thought I would bring their profession into disrepute).

Pettenati fervently explained to the few of us - anxious and impatient information professionals who went from Milan down to his native Genoa to try to get some clues about the new internet prophecies - what the world wide web and Mosaic (the first widely adopted web browser) consisted of, and why and how such a new system had been developed. But of course there were no established patterns to follow: how could we make sense of the internet within an established, organised, efficient, modern electronic library and information system?

We had databases, and computers and telecommunication networks, local and wide area networks and so on and so forth. But we did not have any tool or method that seemed fit for purpose to manage the volatile and variable internet abundance of free sources for business, legal or scientific research. In 1996, fellow documentalists invited me to have a say at the Italian session of the 20th International On Line Information Meeting in London, after I had left the Berlusconi Group and founded my own consultancy.

To cut it short, I said we were in a cyclone, whichever side of the matter you looked at (2).

Farewell to the librarians’ checklists!

A few years after the initial commercial developments of the world wide web, several researchers, besides the American Library Association and other large organisations, started new lines of reflection on information literacy and the need for new critical thinking skills that would be fit for the digital age and would go beyond traditional library and database instruction skills.

At the time I was still concentrating on the IT, organisational and management aspects of the whole thing - working on the organisation, design and development of new services for large organisations, where the migration and then the integration of information retrieval systems and databases into the new web environments were challenging enough per se, without bringing in new contents. But after a couple of years of “jazz sessions” to start up and manage websites, I was asked for advice on how to evaluate the quality of information available on the open web from the perspective of libraries, research organisations and training providers.

Lucky me, I reconnected with the library and information specialists’ community just in time to catch up on… more checklists! Yes, eventually, an almost unanimous universal consensus was reached on how to evaluate websites: we should point people towards familiar criteria such as accuracy, authority, objectivity, currency and coverage or scope of the internet resources potentially relevant for a certain subject, discipline or purpose.

Not exactly the new paradigm that had been called for by researchers and innovators in respect of the idea of relevance, and that is, as far as I know, still waiting for a major discovery (1), nor what I myself had hoped to achieve in my early speculations about the nature of internet contents, but at least it was something useful to share with several professional families dealing with the “eruptive” nature of the internet revolution.

However, such checklists were something we could easily talk about for hours without reaching any common evaluation: they were rarely implementable at the right moment by the right user for the right purpose. Impracticable at the point of use, they were mostly insufficient or inconvenient as a means to evaluate websites proactively, especially in the business environment.

Everything in a web site - except possibly the title and the publication date, exactly as in a library catalogue? - can be considered from so many angles and perspectives that it seemed simply preposterous to insist on the use of interminable checklists to assess the quality of a web source.

Old fashioned enumerative criteria, reflecting the need to establish the authoritativeness and representativeness of unanimously recognised primary and secondary sources, would still be in place and solidly considered. I myself developed a few original ways to assess website contents for the business user that I believe still provide some empirical evidence of effectiveness - I may return to this specific niche on another occasion. And I even tried to make fun of the whole thing, producing interactive childish puzzles that were much appreciated as a way to learn about checklists for the evaluation of legal resources (one survived and is still accessible here from my records, in Italian).

But overall the pressure, scale and pace of exponential growth of information sources and multiple audience groups would make any checklist inadequate. Dealing with the problem of abundance instead of the problem of scarcity of information sources, with cognitive overload instead of blind ignorance, not only came as a blast to our theoretical, formal educational background: it turned upside down our confidence in Shannon’s traditional general theory of information and also required a complete review of our agency role.

We carried on competing with the galloping pace of search engine technologies on the idea of relevance and not very much on the idea of uncertainty, which would have surely made a difference from a user perspective and anticipated some (very… relevant) problems now commonly encountered in AI applications.

Initial findings about using the web for marketing, business and competitive intelligence pointed my attention towards the convenience of end-users’ evaluations. In this way, learning from usability engineers, I came to see that there was another way to shape the problem and to bring into the assessment of the quality of web contents some of the methods established in the field of human computer interaction.

Lessons from the usability tests

At first, users’ words were just denoting or underlining properties and aspects of the information they had found useful to solve a problem. And I appreciated that I would not always have picked them up or prioritised them in the same way. That convinced me that this other, interdisciplinary, messy world of human computer interaction was worth studying and considering.

But after a couple of years I came up with my own conclusions and a user-based grid of criteria that I did not know what to call other than… user-based factors. That seemed to work fine with different groups of people, both in corporate and academic environments, accommodating that idea of user-based relevance some were talking about at Indiana University and in other Schools of Information Science.

This grid of criteria would reveal the inevitably subjective nature of users’ judgements in the selective process, but it would immediately be subject to a sort of literary warrant or to the law of democratic consensus. Unfortunately, it seemed that the more people became aware of such subjectivity, even when they supported a certain degree of constructivism in approaching conclusions about something, the less confident they were in their own evaluations. They preferred authoritative external sources to their own process of making judgements, signalling to me that the recursive or circular nature of this approach could become, at a certain point (or scale), a disadvantage more than an independent success factor and should be equally weighted.

I was inspired to find a better solution, at least partially, by an American colleague who had done extensive research into similar information design issues for several audience groups, and from whom I copied the idea of slightly changing perspective and calling my user-based grids of judgemental criteria “content internal criteria”. In other terms, this was just adding more meta-communication to the whole process: it turned out to be an excellent way to find a compromise, to focus on contents, to simplify the grid and to reposition it in such a way that the concept of trusting the users’ judgement of the source’s relevance became more acceptable to my groups of customers.

In fact, they immediately perceived that they had to dig out such internal criteria by analysing and classifying the contents of a source against a pre-defined set of questions or objectives, rather than relying “just” on their own impressions of and considerations about the source’s quality once and for all: in this way, everybody in a team could be given the necessary consensual reference points to assess first of all the pertinence of a certain website, and to stay open to, or admit, that different evaluations could be made of different contents of the same website or source.

We could develop a complex, multifactorial way to look at the contents from the perspective of the users’ context. It was a quite simple and nonetheless very robust way to challenge the first impression of a “plausibility of arguments” (or literal credibility) that some would consider at first sight relevant for assessing quality in the academic and scientific world (for instance, through a peer review process) but that would then pass neither the test of pertinence of discourse analysis nor a usability test.
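As a purely hypothetical sketch of what such a grid might look like in practice - the questions, weights and scores below are invented for illustration only, not the criteria actually used in my courses - a team could agree on a small set of weighted questions and score a source's contents against them:

    # Every question, weight and score below is invented for illustration.
    criteria_weights = {
        "Does the content answer the agreed research questions?": 3,
        "Are claims supported by identifiable data or references?": 2,
        "Is the content current enough for the intended use?": 1,
    }

    def assess(source_name, answers):
        # answers: agreed team score per question, on a 0-2 scale.
        total = sum(weight * answers[question]
                    for question, weight in criteria_weights.items())
        maximum = 2 * sum(criteria_weights.values())
        return source_name, round(total / maximum, 2)

    print(assess("museum_site", {
        "Does the content answer the agreed research questions?": 2,
        "Are claims supported by identifiable data or references?": 2,
        "Is the content current enough for the intended use?": 1,
    }))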

PageRank or the elephant in the room

It was at that point of my personal journey into alternative methods to evaluate internet information sources that another huge revolution became too important to be ignored any longer. I refer to the popularity rankings introduced by the Google search engine.

It could not have been more welcome, for at least two reasons: first, since Google aimed at measuring the popularity of a certain source in a predictable, independent and transparent way, it offered a certainty that many user-based assessment processes could not guarantee; secondly, it proved exceptionally effective and efficient in picking up relevant sources to answer specific information enquiries that fell outside the perimeter of the users’ main disciplines or sectors of interest.
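For readers unfamiliar with the mechanism, here is a minimal sketch of the idea behind PageRank as published by Brin and Page - not of Google's actual production system - using a tiny link graph invented for illustration: every link is read as an endorsement, and each page's score is redistributed along its outgoing links until the scores stabilise.

    def pagerank(links, damping=0.85, iterations=50):
        # Iteratively redistribute each page's score along its outgoing links
        # until the scores settle (simplified published formulation).
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            rank = {
                p: (1 - damping) / len(pages)
                   + damping * sum(rank[q] / len(links[q])
                                   for q in pages if p in links[q])
                for p in pages
            }
        return rank

    # Four hypothetical sites about the same subject linking to each other.
    links = {
        "museum":        ["university", "news"],
        "university":    ["museum"],
        "news":          ["museum", "souvenir_shop"],
        "souvenir_shop": ["news"],
    }

    for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))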

Despite the immediate success of PageRank and the undoubtedly useful introduction of Google as a tool - if not the main or only tool - to access world wide web contents, the demand for user-based evaluative criteria to assess information sources remained very high and much the same across several different processes (not only business, academic and scientific research, library and information services and education but also procurement, import-export, public policies, communications and public relations - just to name the few I have become very familiar with).

I therefore ended up with a sort of new normative agenda to structure my courses and lectures: for any evaluative and constructive search engine exercise I always recommended defining and including complex content internal criteria, but also benchmarking or comparing the conclusions with Google’s popularity (PageRank) rankings and with other search engines’ results pages.

These comparisons would reveal the credibility, currency, originality or completeness of information that, for a non-expert or unfamiliar user, would be quite difficult to dig out, especially without a comparative analysis of different sources.

That is the story of how the exercise “Otzi and the secrets of rankings” came about, revealing extraordinary insights on the advantages and drawbacks of user-based relevance.

Otzi’s future

Years later, my pilgrimage along early web science or data science pathways still resonates with invaluable lessons. There are now thousands of researchers from different fields who develop, use and embed information retrieval technologies in all sorts of products, services, apps and so on. To this extent, they apply user-based criteria at all times: sometimes they come up with their own and sometimes they use pre-defined criteria embedded in the chosen operating system or platform. Sometimes they have little choice (this is the case for everybody who wants to develop search services and apps for the Facebook world, for instance).

Further progress has been attempted or made with various technological developments, including for instance the PICS project from the W3 Consortium and then the extraordinary success of Amazon’s and others’ recommender algorithms, which have created the notion of online reputation, opening up enormous opportunities (and risks) for advertising, for public relations and for policing the internet, always playing around with search configuration and perceived effectiveness. Also, the idea that we, as human beings who usually decide in a few milliseconds whether or not to trust what we click on, can perform an ordered, sequential assessment of the credibility and quality of online information has been abandoned in favour of studying dual processing models of intellectual activities - the sort of thinking fast and slow evoked in the title of this magazine, icm2re.

In sum, today the landscape for information retrieval applications is potentially infinite: through “discovery” and “serendipity” users can be led towards more compelling, entertaining, interesting and engaging or socialising experiences, and therefore developers are limited only by their lack of creativity. In some niche and business-to-business cases we still see pertinence, productivity and time-saving features as the most valued characteristics of a search or data mining system. But we have also entered new territories, with more AI and machine learning applications available to everybody that tend to persuade us we do not need to be bothered to evaluate anything - just use it or do it or change it. These applications want to determine how we should search and access certain sources, then what we should read, watch and buy, and then what we may or may not wish to learn and understand more about.

Unfortunately we still have not devised and implemented standard controls to measure the reliability and the credibility of what we find online, and that is the true point of the whole “fake news” debate: how both humans and algorithms should approach quality assessment of information is still a matter of art and not science. We have possibly just entered a phase in computer science history in which there is awareness that everything should be rebooted, because there is no actual possibility of any realistic information security expectation without assured computing - and therefore not even protection from fraudulence and falsehood. However, the ridiculous gets prime time once again when we consider that we still do not have common standard languages to define the skills needed for data engineering and for information design and evaluation, and yet we all, including computer science experts and masters of all trades, still blindly accept the authoritativeness of the rankings of search engine results as the indisputable Bible in many daily circumstances.

On almost everything that falls outside our immediate perimeter of expertise, and on everything we need in a hurry, we simply like the idea that search engines’ results are good enough: such judgements multiply the opportunities not only for advertisers to attract us towards new ideas and products but also for scammers and fraudsters. That is why I endorse Norman’s suggestion that procrastination “is good for people” very often, because it triggers more cognition and creativity: planning ahead may be efficient, but it is not realistic in a variable, ever-changing world (3).

We like to trust opaque automated procedures - including of course conversational devices like Alexa, etc. - for which almost no reverse engineering is available. Has anybody educated us to persistently ask how a robot works and not to use it until we have a plausible answer?

We are just content, most of the time, to see that with the right motivation, and being in the right place at the right time, a sufficiently educated person can find the pertinent, predictable and even a bit serendipitous sources to answer whatever question through a search engine, without apparently any special or further need for data literacy skills at the point of use.

We are so proud to be able to find exactly what we are looking for, when we need it!

We like to think that there are plenty of invisible cyber-angels or robots out there that select, cluster and rank the available sources in the best possible ways on our behalf, as if they were the absolute experts on something and not just query handler scripts and interfaces designed to entertain us upon the demand, and investment, of advertisers and sponsors.

There is little market appetite for a sort of “bundling search media literacy”. And yet we should be able to go beyond the appearances of search engine results that, no matter how manipulated and preposterous, we collectively tend to find compelling, surprising and, after all, very pertinent. In this respect, I believe that some ingenious recent directions in search engine developments could and should be reviewed, improved and made less cognitively, socially and economically risky than they are at the moment (4).

Confronted with the results of an updated (and quicker) version of my Otzi exercise, my veteran market research colleague accepted that it would be complicated to demonstrate that some objective measurements are not the results of earlier subjective assessments made by the same groups or clusters of users, in that they provide evidence of the influence or impact of a certain campaign. Oh well, I said, that is how PageRank actually works: Google rankings include an objective measurement of popularity built on all those subjective assessments expressed through relationships (links), and that is why we need to ask ourselves what credibility of information really means.

In sum, I am afraid that my friend Otzi, “the Mummy who came from Ice”, may still know more and judge better than we do! After all, somebody found him - though he probably would not endorse Norman’s fascination with procrastination!

As usual, I look forward to hearing (or preferably reading) from the robots, with optimism.

Notes

(1) Park, T.K. (1994), Toward a theory of user-based relevance: a call for a new paradigm of inquiry, in JASIST, 45(3), p. 135-141. But see also the academic literature produced in the following twenty years on the subject of user-based relevance and credibility, for instance: Xu, Y., Chen, Z. (2006), Relevance Judgment: What Do Information Users Consider Beyond Topicality?, in JASIST, 57(7):961–973; Hjørland, B. (2010), The Foundation of the Concept of Relevance, in JASIST, 61(2):217–237; Johnson, T. J. & Kaye, B. K. (2013), The dark side of the boon? Credibility, selective exposure and the proliferation of online sources of political information, in Computers in Human Behavior, 29(4), 1862-1871; Kwon, N. (2017), How work positions affect the research activity and information behaviour of laboratory scientists in the research lifecycle: applying activity theory, in Information Research, 22(1).

(2) Longo, B. (1997), I professionisti dell'informazione nel ciclone Internet. Paper given at the Italian session of the 20th International On Line Information Meeting, London, 1996, and at the AIDA follow-up seminar "20 anni di informazione online: bilanci e prospettive per il documentalista. Considerazioni a margine del 20 IOLIM", Rome, 16 April 1997. Also published in "AIDA informazioni", 15(1997), n. 3, p. 9-14.

(3) Norman, D. (2014), Why Procrastination is Good. After dinner talk at the 20th Anniversary Celebration of the Human Computer Interaction Institute, CMU, Pittsburgh, PA.

(4) Bordino, I. et al. (2016), Beyond entities: promoting explorative search with bundles, in Inf Retrieval J (2016) 19:447–486.