Astronomy Thesaurus Introduction

R.M. SHOBBROOK & R.R. SHOBBROOK

INTRODUCTION

The main aim in producing this reference work is to try to standardise the terminology in the field of astronomy for the purposes of aiding unambiguous library cataloguing and more precise recall of data from computer databases. The thesaurus is intended for use by astronomy librarians and scientists. Its compilation was originally requested during a meeting of Commission 5 (Documentation) of the International Astronomical Union at the New Delhi General Assembly of the IAU in 1984.

Vocabulary control has become essential in the use of online databases, particularly where those responsible for data input and for searching databases are non-specialists. Consistency, particularly in the input to computerised databases, is imperative if there is to be high precision recall in information retrieval. Conversely, for those accessing databases (the output process), selecting the relevant terms, or those most specific to the information request, will result in higher precision recall. The thesaurus is also designed to give the librarian, cataloguer or indexer, as well as the astronomer, a context for astronomical terms with which they may not be familiar.

Sections 1 to 6 of this introduction describe the background to the compilation of the thesaurus and technical details of its construction. For information on its use in daily work, the reader may wish to proceed to Section 7, though knowledge of the meaning of the acronyms PT, U, UF, SN, BT, NT and RT, discussed in Section 4, is also required.

SUBJECT COVERAGE

It is expected that most of the terminology used in current research will have been included. Although many terms in regular use may not appear explicitly, they may be synthesised using Boolean logic for keyterm selection and for database searches. The list of primary terms was initially selected from existing subject authority lists as well as from recognised reference texts in the field; the basic reference sources used to compile the thesaurus are given in the Appendices. It has been supplemented and amended according to suggestions from a number of workers in various fields of astronomy during its evaluation in 1992. The authors have included those terms which are clearly related to astronomy, but they have not included all aspects of broader subjects such as history, instrumentation, electronics, computing, physics, rocketry and space research. There are other thesauri which deal specifically with these areas, particularly the NASA Thesaurus (NASA SP-7064) and the INSPEC Thesaurus (1987). In the interests of consistency in the field of astronomy, it is suggested that the terms used in this thesaurus be preferred to similar terms which may be found in other thesauri.

CONSTRUCTION OF THE THESAURUS

For the background to the early development of the project, the reader is referred to the paper presented at IAU Colloquium No 110, Library and Information Services in Astronomy, in Washington (Shobbrook, 1989).

The American National Standards Institute (ANSI) standard Z39.19 (1980) for thesaurus construction was used, and the software package employed for building the thesaurus was LEXICON, published by Brisbane Business Centres, Brisbane, Australia (1989). Information on the principles of thesaurus construction was obtained from many sources, the most useful of which were Foskett (1982), Aitchison & Gilchrist (1987), the American National Standards Institute (1980) and the British Standards Institution (1979).

The terms which have been selected reflect as far as possible 'common usage' by professional astronomers. There are international differences in usage, but we hope with this thesaurus to minimise those differences and to encourage the use of the more widely accepted form of the terms. The BROADER, NARROWER and RELATED terms in the thesaurus show the relationship of the terms one to another. The user is alerted to synonymous terminology and is also referred from a term not in (the widest) current usage to the preferred term.

American spelling has been adopted and users should take care either to truncate or to select both 'international' English and American English in their search strategy. Librarians who are using the thesaurus for assigning keyterms when cataloguing may prefer to use the most common spelling in their country for the term. They should annotate their copy of the thesaurus and make references accordingly.

ABBREVIATIONS

The abbreviations used in the thesaurus are described below.

PRIMARY TERM (PT).
In the thesaurus all terms in UPPER or lower case bold type are the primary terms for each entry. For the difference between upper and lower case terms, see under 'Use' below.

SCOPE NOTE (SN) or message.
These notes are in normal text, except that references to other PTs are in upper case. The notes have been used for the following purposes:

a) To instruct in the use of a term for indexing and retrieval, e.g. ENERGY - (SN) Combine with other terms.
b) To define and to limit use, e.g. GALACTIC - (SN) Used as an adjective only for Milky Way properties.
c) To explain a principle of division (different meanings for the same word), e.g. ABERRATION - (SN) Of starlight; ABERRATIONS - (SN) Of optical images.

U: USE.
Directs users to a preferred term. The term following U will always be in UPPER CASE. The primary term is in bold type - as for all PTs - but in lower case. The user should turn to the given (U) upper case term to find the BTs, NTs and RTs (see below).

Note that the lower case terms, since they are often perfectly acceptable working alternatives to the upper case terms, may well be used as search terms for references to older publications or data. They are included because they were in fairly common use at some time, or by some authors. A typical example is BETA CEPHEI STARS. This is the currently accepted term, but before about 1980 these stars were often called Beta Canis Majoris stars; many publications concerning them will not be found using only the first term.

UF: USE FOR.
The inverse of U(se). Terms - always in lower case - under the (upper case) primary term are those for which the primary term is to be substituted.
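
The U/UF pairing can be pictured as a two-way lookup table. The following Python fragment is only a minimal sketch under stated assumptions: the table contents and the function names (preferred_term, search_terms) are hypothetical and are not part of LEXICON or of the thesaurus itself.

```python
# Hypothetical UF table: preferred term -> non-preferred variants.
USE_FOR = {
    "BETA CEPHEI STARS": ["Beta Canis Majoris stars"],
}

# Inverse (U) table derived from USE_FOR: variant -> preferred term.
USE = {
    variant.lower(): preferred
    for preferred, variants in USE_FOR.items()
    for variant in variants
}


def preferred_term(term):
    """Return the preferred term for an entry.

    Unknown entries are simply upper-cased; mixed-case terms such as
    Be STARS would need special handling that this sketch omits.
    """
    return USE.get(term.lower(), term.upper())


def search_terms(term):
    """Return the preferred term followed by its UF variants (for an OR search)."""
    preferred = preferred_term(term)
    return [preferred] + USE_FOR.get(preferred, [])
```

An indexer would assign only preferred_term(...), whereas a searcher looking for older literature would combine everything returned by search_terms(...) with OR.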

BT: BROADER TERM.
Indicates to which class or genus the term belongs. A term may belong to more than one class and thus may have more than one broader term, e.g.

B STARS are both EARLY TYPE STARS and YOUNG STARS.

NT: NARROWER TERM.
Terms which are sub-categories of the main term.

RT: RELATED TERM.
Directs a user to a selection of other terms which are closely related or associated with the PT but are not sub-categories of it. The RT list will often indicate other terms which might be used instead of, or in conjunction with, the PT.

The U/UF and the BT/NT relationships are reciprocal. In the LEXICON software, when 'TERM1' is entered as an NT for 'TERM2', then 'TERM2' is automatically listed as a BT for 'TERM1'. Similarly, the RT relationship is reciprocal; if TERM1 is entered as an RT for TERM2, then TERM2 is automatically listed amongst the RTs of TERM1.

It is important to realise that in a thesaurus, the BT/NT relationship is that of genus/species. It is not a 'whole/part' relationship. For example, ECLIPSING VARIABLE STARS and PULSATING VARIABLE STARS are NTs of VARIABLE STARS, since they are types of variable stars. On the other hand, PLANETS and SATELLITES are not NTs of SOLAR SYSTEM; they are parts of the solar system and appear only as RTs.
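
The reciprocal links and the genus/species rule described above can be illustrated with a small data structure. This is a sketch only: the class and method names are hypothetical and do not describe how LEXICON is implemented; the terms are taken from the examples in this section.

```python
from collections import defaultdict


class Thesaurus:
    """Toy term store that keeps BT/NT and RT links reciprocal."""

    def __init__(self):
        self.bt = defaultdict(set)  # term -> broader terms
        self.nt = defaultdict(set)  # term -> narrower terms
        self.rt = defaultdict(set)  # term -> related terms

    def add_nt(self, broader, narrower):
        # Entering an NT also records the inverse BT link.
        self.nt[broader].add(narrower)
        self.bt[narrower].add(broader)

    def add_rt(self, term1, term2):
        # RT is symmetric: each term lists the other.
        self.rt[term1].add(term2)
        self.rt[term2].add(term1)


t = Thesaurus()
# Genus/species links: a term may have more than one broader term.
t.add_nt("EARLY TYPE STARS", "B STARS")
t.add_nt("YOUNG STARS", "B STARS")
t.add_nt("VARIABLE STARS", "ECLIPSING VARIABLE STARS")
t.add_nt("VARIABLE STARS", "PULSATING VARIABLE STARS")
# Whole/part links are recorded as RTs, not NTs.
t.add_rt("SOLAR SYSTEM", "PLANETS")
t.add_rt("SOLAR SYSTEM", "SATELLITES")
```

Because each link is written in both directions at entry time, the two views of a relationship cannot drift apart.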

The list of RTs especially (though also to a lesser extent the BTs and NTs) for most terms is usually by no means exhaustive and no attempt has been made to include all terms conceivably related to each PT. Often a related or a broader term may be obvious from the words included in the PT itself and these words are often not listed in the subterms. Perhaps the best example is the word STARS. This word does not appear as a BT anywhere; i.e. STARS has no NTs - the list would be several pages long!

THE LENGTH OF THE THESAURUS

To control the size of the thesaurus without affecting its usefulness, a few general rules have been followed. They are:

  1. The more general terms in the list (such as RADIO ASTRONOMY, OPTICS, PHOTOMETRY, VARIABLE STARS, etc.) generally do not have a complete list of NTs, often including terms only one step down in the hierarchy. The hierarchical list (described in Section 7) can be expected to provide a complete 'tree' of terms for the broader subjects.

  2. In a database search, for instance, terms may be combined with the Boolean operators AND, OR and NOT. For example, we may combine: M STARS (AND) TEMPERATURE; or, we may broaden the search with: LATE TYPE STARS (AND) TEMPERATURE. Pre-coordination of terms into sub-categories has been kept to what we believe is a sensible level. For instance, STELLAR ATMOSPHERES (a major subject field) is used as an important subterm (BT, NT or RT) for many other terms in the thesaurus, but 'stellar magnetic fields', 'stellar radio sources' and 'stellar photometry' are not. Such pre-coordination has been kept to a minimum in the expectation that Boolean logic will be used to form search statements for most of the sub-categories.

  3. The adjectival form GALACTIC has been used mainly to refer to the MILKY WAY galaxy. For extragalactic aspects the term GALAXIES is suggested, e.g. GALAXIES (AND) GLOBULAR CLUSTERS would refer to clusters in other galaxies, whereas GALACTIC GLOBULAR CLUSTERS would refer to those in our own galaxy. An exception has been made in the case of GALACTIC NUCLEI and ACTIVE GALACTIC NUCLEI, since the latter term at least is well entrenched in the literature. In a database search, truncation symbols (e.g. ?, *, #, depending upon the software used) can be used to capture relevant items but may return a higher proportion of unwanted references.

  4. Names of celestial objects have generally been excluded. They are present only as names of types of objects such as S DORADUS STARS or BL LACERTAE OBJECTS for which the object itself is the prototype. It is strongly recommended that references to, or searches for, specific objects or features should use the nomenclature in the First Dictionary of the Nomenclature of Celestial Objects (Fernandez, Lortet & Spite, 1983).

    Although an author will probably prefer his paper title to refer to NGC 4755, or the Jewel Box, rather than to OCl 1250-600, the latter type of designation from the abovementioned publication should be used in the keyword list at least. Of course, the first two designations would always be included in a search profile if all earlier references are to be found. The rules are essentially those described for thesaurus terms under 'U' and 'UF' in Section 4 of this introduction.
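
The Boolean combinations mentioned in rules 2 and 3 above correspond to set operations on the documents indexed under each keyterm. The sketch below assumes a hypothetical index (INDEX) mapping keyterms to invented document numbers; a real database would supply its own query syntax.

```python
# Hypothetical index: keyterm -> set of document identifiers (invented data).
INDEX = {
    "M STARS": {1, 2},
    "LATE TYPE STARS": {1, 2, 3, 4},
    "TEMPERATURE": {2, 4, 5},
    "GALAXIES": {6, 7},
    "GLOBULAR CLUSTERS": {7, 8},
}


def docs(term):
    """Return the documents indexed under a keyterm (empty set if none)."""
    return INDEX.get(term, set())


narrow = docs("M STARS") & docs("TEMPERATURE")           # M STARS (AND) TEMPERATURE
broad = docs("LATE TYPE STARS") & docs("TEMPERATURE")    # broadened search
either = docs("M STARS") | docs("TEMPERATURE")           # OR
excluded = docs("LATE TYPE STARS") - docs("M STARS")     # NOT
extragalactic = docs("GALAXIES") & docs("GLOBULAR CLUSTERS")
```

Broadening from M STARS to LATE TYPE STARS can only add documents to the result, which is why a small set of general terms combined at search time can stand in for many pre-coordinated sub-categories.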

In a few instances lower case letters have been included in a term, even though many databases are case-insensitive. Sometimes lower case letters are an important component of the term, such as in Be STARS or Ap STARS, so that for clear identification of the term the correct case is helpful.

Hyphens are not used. In the case of such terms as HERBIG-HARO OBJECTS or WOLF-RAYET STARS, the hyphen is replaced by a space. In the case of prefixes which are not words, no space replaces the hyphen (e.g. PREMAINSEQUENCE, SEMIREGULAR VARIABLES).

The possessive 's' is not used; for instance, instead of GOULD'S BELT, we have used GOULD BELT. In many cases of laws, constants or theories, it is already customary not to use the possessive case, but to use the name as an adjective (RAMAN EFFECT, HUBBLE LAW, etc.). We have adopted this convention in all cases in the interests of consistency.

The ANSI thesaurus standard rule for plurals has been used for nouns. However, truncation symbols (e.g. '*') are recommended for a database search profile in order to capture all relevant documents. For instance, a search for SUPERNOVA* will find all references to SUPERNOVA, SUPERNOVAE and SUPERNOVAS.
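
Truncation can be imitated with shell-style wildcard matching. The sketch below uses the Python standard library function fnmatch.fnmatchcase; the term list and the function name truncation_search are hypothetical, and the truncation symbol accepted by a particular database may differ.

```python
from fnmatch import fnmatchcase

# Hypothetical list of keyterms found in a database.
TERMS = [
    "SUPERNOVA",
    "SUPERNOVAE",
    "SUPERNOVAS",
    "SUPERNOVA REMNANTS",
    "SUPERGIANT STARS",
    "NOVAE",
]


def truncation_search(pattern, terms):
    """Return the terms matching a pattern in which '*' stands for any ending."""
    return [t for t in terms if fnmatchcase(t.upper(), pattern.upper())]


matches = truncation_search("SUPERNOVA*", TERMS)
```

With this list the pattern also picks up SUPERNOVA REMNANTS, an instance of the unwanted references that truncation may return.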

THE APPENDICES

Terminology which can best be displayed in a table format has been placed in Appendices. It is intended that these terms (the planets, their satellites, the chemical elements and the constellations) be considered as part of the thesaurus and combined with other terms using Boolean operators. Their inclusion in the main thesaurus, together with the large number of terms with which they might be combined, would have added considerably to the bulk of the thesaurus without adding significantly to its information content. Terms such as LITHIUM ABUNDANCE or JUPITER ATMOSPHERE may be synthesised using the Boolean 'AND'. Abbreviations for the constellations and the elements will often be found in the literature but we suggest that the full words be used in keyword lists.

The lower case Greek letters are also included in the appendices. We have used the complete English word in the thesaurus (as in ALPHA PARTICLES, BETA CEPHEI STARS, H ALPHA, for instance), but have included the accepted two or three letter abbreviations for completeness.

THE HIERARCHICAL DISPLAY

The thesaurus software LEXICON has enabled the construction of an hierarchical display, where each indent across the page represents one step down in the hierarchy. The list displays the broader and narrower relationships and is useful for showing much more extensive and detailed hierarchical relationships than is the main thesaurus.

As has already been mentioned in Section 5, in many broad subject areas in the main thesaurus the lists of NTs are usually only one step down in the hierarchies. In the hierarchical list, the whole relationship 'tree' may be inspected at a glance when the uppermost term in a tree has been established. Inspection of the whole tree will often be the quickest method of selecting the most specific keyterms available for the precise description of a subject or publication.

Often a term may appear at two different levels in the hierarchical display. For instance, O STARS and B STARS have both BLUE STARS and OB STARS as broader terms; yet OB STARS are also a subset of BLUE STARS. The hierarchy is therefore (in part):

                    BLUE STARS
                    .       B STARS
                    .       O STARS
                    .       OB STARS
                    .       .       B STARS
                    .       .       O STARS

In this case it is considered that both the broader terms should be included for increased clarity of description of O STARS and B STARS, rather than only the one immediately above (viz. OB STARS) in the hierarchy.
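
An indented display of this kind can be generated by walking the NT links recursively. The following is a minimal sketch, not the LEXICON implementation: the NT table holds only the example above, the function name hierarchy_lines is hypothetical, and the links are assumed to contain no cycles.

```python
# Hypothetical NT table holding only the example above.
NT = {
    "BLUE STARS": ["B STARS", "O STARS", "OB STARS"],
    "OB STARS": ["B STARS", "O STARS"],
}

INDENT = ".       "  # one indent per step down in the hierarchy


def hierarchy_lines(term, depth=0):
    """Return the display lines for a term and everything below it.

    A term with several broader terms is repeated under each of them.
    Assumes the NT links contain no cycles.
    """
    lines = [INDENT * depth + term]
    for narrower in sorted(NT.get(term, [])):
        lines.extend(hierarchy_lines(narrower, depth + 1))
    return lines


print("\n".join(hierarchy_lines("BLUE STARS")))
```

Because B STARS and O STARS are listed under both BLUE STARS and OB STARS, they appear at two levels of the output, as in the display above.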

USE OF THE THESAURUS

It is envisaged that the thesaurus will be used for the following main purposes.

  1. Assigning keyterms to scientific papers; this is usually best done by the authors themselves, rather than by non-specialists. If the thesaurus is to serve as originally intended by Commission 5 of the IAU (to standardise astronomical terminology), then authors should use only the upper case terms in the thesaurus in their keyterm lists.

  2. Extraction of references from databases, when the use of the same list of terms as is used by the authors can be expected to result in high precision recall of relevant data.

  3. Assisting in the updating of library classification schemes.

  4. Inclusion in 'intelligent' databases, to help users select alternatives or extensions to their initial list of terms, as in proximity searching.

Other uses suggested to us have been as a source of terms for exclusion as passwords for observatory computers and as a supplementary dictionary of words used by a spelling checker for word processors.

In selecting terms for a particular purpose, users would have in mind a short list of terms relevant to their subject. For instance, an information manager would have a good idea of the subject matter of a publication to be classified, authors would select the main subject(s) of their publication, and database searchers would have a list of terms about which they wanted information.

However, initial statements for input or searching often need adjustment. Such a need may be apparent from the degree of relevance of the recalled data, in which case alternative or additional terms may be taken from either the thesaurus or the hierarchical list. It should be clear that the dangers inherent in the broadening or narrowing of search profiles (collecting irrelevant data and missing relevant data, respectively) will be significantly reduced if the thesaurus is used both for the original keyterm assignment and for the database search profiles.

Since the terminology in astronomy, as in many other fields, is continually growing and changing, it is recognised that this thesaurus will have to be updated frequently, preferably at intervals of no more than about two years. However, we plan to issue updates only as computer-readable files until a significant change (more than about 10%) has occurred.

ACKNOWLEDGMENTS

Building a thesaurus is a complex and time-consuming task, and this astronomy thesaurus would not have been completed without the help and active support of a number of people. Initially, there was the dedication of the astronomy librarians listed on the title page, who compiled the working list of terms from the sources mentioned above. They have also given continuous moral and practical support.

Further critical contributions during the compilation of the thesaurus were provided by many librarians and astronomers. We cannot mention them all, but would like to thank particularly (in alphabetical order) David Allen, Heinz Andernach, Ellen Bouton, Bob Bray, Marlene Cummins, Helen Knudsen, John Shakeshaft, Wayne Warren and George Wilkins for their help. In addition, Ray Walsh, the supplier of LEXICON, has provided generous support for his software.

This work has been funded primarily by the International Astronomical Union, with significant assistance also from the Anglo-Australian Observatory and the University of Sydney.

The authors and their collaborators hope you will find this thesaurus a most useful reference tool.

BIBLIOGRAPHY

Aitchison, J. & Gilchrist, A. "Thesaurus Construction: a practical manual", 2nd ed., London, ASLIB, 1987.

American National Standards Institute, Inc. Guidelines for thesaurus structure, construction and use. ANSI Z39.19. New York, ANSI, 1980.

British Standards Institution. Guidelines for the establishment and development of monolingual thesauri. BS 5723. London, BSI, 1979.

Foskett, A.C. "The Subject Approach to Information", 4th ed., London, Clive Bingley, 1982.

Shobbrook, R.M. IAU Colloquium 110: Library and Information Services in Astronomy, edited by Stevens-Rayburn, S. and Wilkins, G. Washington DC, U.S. Naval Observatory, 1989.

Townley, H.M. & Gee, R.D. "Thesaurus-Making: grow your own word-stock", London, Andre Deutsch, 1980.
