A new software technique for analyzing scientific papers shows that basic physics research leads to many more applications than is commonly believed. The technique, developed by Ronald Neil Kostoff at the U.S. Office of Naval Research and Jesus Antonio del Rio of the Centro de Investigación en Energía, Universidad Nacional Autónoma de México, takes a particular research paper and then identifies linguistic patterns in other papers that cite the original piece of work.
(Editor's Note: Paragraphing below is UniSci's, to make the content more readable online, and not that of the authors in their printed article.)
The authors write: "One of the papers examined -- 'Physics of the Granular State', published by Heinrich Jaeger and Sidney Nagel of the University of Chicago in 1992 (Science 256 1523--1531) -- had over 300 citations.
"As is typical with basic research, most of the papers citing this paper were basic research papers in the same field. However, about 20% of the citing papers were research or development papers in other disciplines, or development papers within the same discipline.
"(We found that text mining alone was able to identify all the themes that were not basic research, or were in basic research fields other than the dynamics of sand piles, thereby showing that it is not necessary to actually read all the abstracts to identify these types of applications.)
"There are three interesting features in the results... First, the distribution of the number of citation counts with time has a long tail and shows little sign of abating. This is one characteristic feature of a seminal paper.
"Second, the fraction of basic research papers from other fields citing Jaeger and Nagel ranges from about 15-25% annually, with no latency period evident. This may have been due to a combination of the intrinsic broad-based applicability of the subject matter and the publication of the paper in a high-circulation journal with very broad-based readership.
"Third, there was a four-year gap before the paper was cited by technology/development papers, although the reasons for this are not clear. The latency could have been due to the inability of the technology community to immediately recognize the potential applications of the research, or due to the information remaining in the basic research journals and not reaching the applications community.
"However, it could be that about four years are required for an application to be developed in this discipline."
On their technique, the authors note, "The combination of citation bibliometrics and text mining offers insights that would not emerge if each approach were used independently. Moreover, by removing the need to actually read abstracts (which could number hundreds of thousands in multi-generation citation analyses), text mining will make comprehensive assessments of research impact feasible.
"The results are also important in relation to the sponsorship of basic research. Over the past decade, the trend in both industry and government has been toward requirements-driven research. Globally, governments favour 'strategic research' over 'blue skies research', while industrial research is often funded on a profit-centre basis.
"While this type of needs-driven research may be beneficial in the short-term, it could prove a disadvantage in the long term. Would fundamental sand pile research, for instance, receive funding from fusion, air traffic-control or materials programmes, even though it could impact these or many other applications, as shown by citation mining?
"It is necessary to stress that sponsorship of some unfettered fundamental research must be protected to maintain the strategic long-term benefits it has on global technology and applications."
As the authors describe citation mining:
"Citation mining -- a combination of citation analysis and text mining (i.e., extraction of useful information from text) -- is another approach that can measure the impact, both direct and indirect, of fundamental research.
"Citation mining starts with a group of core papers, retrieves papers that cite these core papers, and then analyses various characteristics of these citing papers to determine the overall impact of the original papers."
The authors argue that traditional quantitative analysis of the impact of research -- such as the market share taken by a new technology -- underestimates the value of basic physics, which tends to have much longer-term effects.
In their words, "Basic physics research has both direct and indirect impacts on numerous areas of science and technology, but many assessment methods underestimate the sum total of these impacts."
They conclude, "The challenge over the next decade is to develop quantitative impact measurement techniques that are more suited to the unique characteristics of basic physics research, and will therefore provide a more balanced account of the impact of physics research. The approaches described in this article -- publications and patent citation techniques, econometric models, network modelling and citation mining -- are a small step in that direction, but much more research is needed."
(Reference: Physics World Digest June 2001.)
[Contact: Dr. Ronald Kostoff]
30-May-2001