\section{Using SPECS to evaluate computational serendipity}
\label{specs-overview}

In this section, we use the elements of the conceptual framework described in Section \ref{sec:by-example} to help flesh out this definition and develop detailed evaluation criteria. We adapt the Standardised Procedure for Evaluating Creative Systems (SPECS), a high-level, customisable evaluation strategy that was devised to judge the creativity of computational systems \cite{jordanous:12}.

In the three-step SPECS process, the evaluator defines the concepts and behaviours that signal creativity, converts this definition into clear standards, and then applies those standards to evaluate the target systems. We follow a slightly modified version of Jordanous’s earlier evaluation guidelines: rather than attempting a definition and evaluation of creativity, we carry out the three steps for serendipity.

\textbf{Step 1:} Identify a definition of serendipity that your system should satisfy to be considered serendipitous.

We adopt the definition of serendipity from Section \ref{sec:our-model}.

\textbf{Step 2:} Using Step 1, clearly state what standards you use to evaluate the serendipity of your system.

With our definition and other features of the model in mind, we propose the following standards for evaluating serendipity in computational systems. These criteria allow the evaluator to assess the degree of serendipity present in a given system’s operation.

\paragraph{(A) Definitional characteristics}

The system can be said to have a prepared mind, consisting of previous experiences, background knowledge, a store of unsolved problems, skills, expectations, readiness to learn, and (optionally) a current focus or goal. It then processes a trigger that is at least partially the result of factors outside of its control, including randomness or unexpected events. It classifies this trigger as interesting, constituting a focus shift. The system then uses reasoning techniques and/or social or otherwise externally enacted alternatives to create a bridge from the trigger to a result. The result is evaluated as useful, by the system and/or by an external source. The evaluator should specify all of these aspects relative to the system under consideration at a sufficient degree of precision to show their processual interconnection.
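
To make the interplay of these elements concrete, the following sketch (in Python, with field names of our own invention) shows one way an evaluator might record the Part A elements for a single episode. It is an illustrative encoding under our own naming assumptions, not a prescribed implementation.

\begin{verbatim}
from dataclasses import dataclass

@dataclass
class PartAChecklist:
    """Records the definitional characteristics (Part A) for one episode."""
    prepared_mind: str    # background knowledge, skills, expectations, goals
    trigger: str          # event at least partly outside the system's control
    focus_shift: bool     # was the trigger classified as interesting?
    bridge: str           # reasoning or social process linking trigger to result
    result: str           # the outcome that was produced
    result_valued: bool   # judged useful by the system or an external source

    def satisfied(self) -> bool:
        # Every element must be described, and both judgements affirmative.
        described = all([self.prepared_mind, self.trigger,
                         self.bridge, self.result])
        return described and self.focus_shift and self.result_valued
\end{verbatim}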

\paragraph{(B) Dimensions}

Serendipity, and its various dimensions, can be present to a greater or lesser degree. If the criteria above have been met, we consider the system along several dimensions (and optionally generate ratings as estimated probabilities): (\(\mathbf{a}\): chance) How likely was this trigger to appear to the system? (\(\mathbf{b}\): curiosity) On a population basis, comparing similar circumstances, how likely was the trigger to be identified as interesting? (\(\mathbf{c}\): sagacity) On a population basis, comparing similar circumstances, how likely was it that the trigger would be turned into a result? Finally, again comparing similar results where possible: (\(\mathbf{d}\): value) How valuable is the result that is ultimately produced?
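
Where the optional numerical ratings are produced, they can be collected into a simple profile. The sketch below is a hypothetical helper of our own devising, not part of SPECS itself: it merely gathers the four estimates and checks that each lies in \([0,1]\).

\begin{verbatim}
def serendipity_profile(chance, curiosity, sagacity, value):
    """Gather ratings for dimensions (a)-(d), each an estimated
    probability in [0, 1]; how the estimates are obtained is left
    to the evaluator."""
    ratings = {"a: chance": chance, "b: curiosity": curiosity,
               "c: sagacity": sagacity, "d: value": value}
    for name, p in ratings.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {p}")
    return ratings
\end{verbatim}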

\paragraph{(C) Factors}

Finally, if the criteria from Part A are met, and if the event is deemed sufficiently serendipitous to warrant further investigation according to the criteria in Part B, then in order to deepen our qualitative understanding of the serendipitous behaviour, we ask: To what extent does the system exist in a dynamic world, spanning multiple contexts, featuring multiple tasks, and incorporating multiple influences?
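
As one possible way of organising the answers, the hypothetical record below pairs each of the four environmental factors with free-text evidence supplied by the evaluator; the factor names and interface are our own illustration.

\begin{verbatim}
PART_C_FACTORS = (
    "dynamic world", "multiple contexts",
    "multiple tasks", "multiple influences",
)

def factor_report(evidence):
    """Pair each Part C factor with the evaluator's free-text
    evidence; a missing entry records that no evidence was given."""
    return {factor: evidence.get(factor, "no evidence given")
            for factor in PART_C_FACTORS}
\end{verbatim}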

\textbf{Step 3:} Test your serendipitous system against the standards stated in Step 2 and report the results.

In Section \ref{sec:computational-serendipity}, we will pilot our framework by examining the degree of serendipity of existing and hypothetical computational systems.

\subsection{Heuristics}
\label{specs-heuristics}

\paragraph{Choose relevant populations to produce a useful estimate.}

It isn’t necessary to assign explicit numerical values to \(\mathbf{a}\), \(\mathbf{b}\), \(\mathbf{c}\), and \(\mathbf{d}\), although that can be done if desired. More typically – and in all of the examples that follow – all that is required is to select a relevant population against which to make an estimate. With a population of one, there is no basis for comparison, whereas in a large population, the chance of any highly specific outcome may be vanishingly small. The aim is to highlight what – if anything – is special about the potentially serendipitous development pathway in comparison to other possible paths, and ideally to learn something relevant from the comparison. Thus, we might choose to compare Fleming to other laboratory biologists, or to compare Goodyear to other chemists. Even if we were to shift the analysis and look at the much smaller populations of experimental pathologists or of inventors with an interest in rubber, Fleming and Goodyear would still have features that stand out, particularly when it comes to their curiosity.
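
By way of illustration, one could estimate dimension (\(\mathbf{b}\)) empirically by sampling runs of comparable systems on comparable circumstances and counting how often the trigger is flagged as interesting. The interface below is hypothetical: each member of the population is a callable standing in for a full system run.

\begin{verbatim}
import random

def estimate_curiosity(population, circumstances, trials=100, seed=0):
    """Estimate dimension (b): across a reference population of
    comparable systems, how often is the trigger identified as
    interesting?  Each member of `population` is a callable taking a
    circumstance and returning True on a focus shift (a hypothetical
    stand-in for running a real system)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        system = rng.choice(population)
        circumstance = rng.choice(circumstances)
        hits += bool(system(circumstance))
    return hits / trials
\end{verbatim}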

\paragraph{Find the salient features of the trigger.}

How can we estimate the chance of the trigger appearing, if every trigger is unique? Consider de Mestral’s encounter with burrs. The chance of encountering burrs while out walking is high: many people have had that experience. The unique features of de Mestral’s experience are that he had the curiosity to investigate the burrs under a microscope, and the sagacity (and tenacity) to turn what he discovered into a successful product. The details of the particular burrs that were encountered are essentially irrelevant. This shows that it is not essential for all factors contributing to the likelihood score to be “low” in order for a given process of discovery and invention to be deemed serendipitous. In the general case, we are not interested in the chance of encountering a particular object or set of data. Rather, we are interested in the chance of encountering some trigger that could precipitate an interested response. The trigger itself may be a complex object or event that takes place over a period of time; in other words, it may be a pattern, rather than a fact. Noticing patterns is a key aspect of sagacity as well.
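
This suggests estimating dimension (\(\mathbf{a}\)) over classes of triggers rather than raw events. In the hypothetical sketch below, a \texttt{salient} function of the evaluator’s choosing abstracts each observed event to the features that matter (for de Mestral, roughly “a hooked seed pod that sticks to clothing” rather than the exact species of burr), and frequencies are computed over the resulting classes.

\begin{verbatim}
from collections import Counter

def trigger_chance(observations, salient):
    """Estimate dimension (a) over classes of triggers, not raw
    events.  `observations` is a history of encountered events;
    `salient` maps an event to the features that matter (both are
    hypothetical stand-ins chosen by the evaluator)."""
    classes = Counter(salient(event) for event in observations)
    total = sum(classes.values())
    return {cls: count / total for cls, count in classes.items()}
\end{verbatim}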

\paragraph{Look at long-term behaviour.}

Although it is in no way required by the SPECS methodology outlined above, many systems (including all of the examples below) have an iterative aspect. This means that a result may serve as a trigger for further discovery. In such cases, further indeterminacy may need to be introduced into the system, lest the results converge and become entirely predictable. In applying the criteria to such systems, we consider long-term behaviour.
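
A minimal sketch of such an iterated run is given below, assuming hypothetical \texttt{process} and \texttt{perturb} hooks of our own invention; the point is only that each result feeds back as the next trigger, with occasional injected indeterminacy.

\begin{verbatim}
import random

def long_term_run(process, perturb, initial_trigger,
                  steps=50, noise=0.1, seed=0):
    """Iterate the loop result -> next trigger.  `process` maps a
    trigger to a result; `perturb` injects indeterminacy (both are
    hypothetical stand-ins for components of a real system)."""
    rng = random.Random(seed)
    trigger, history = initial_trigger, []
    for _ in range(steps):
        result = process(trigger)
        history.append(result)
        # Occasionally perturb the feedback so that later iterations
        # can still encounter unexpected triggers.
        trigger = perturb(result, rng) if rng.random() < noise else result
    return history
\end{verbatim}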