Chapter 25: Plant Identification

Identification is a basic activity and one of the primary objectives of systematics. Although identification is a separate activity or process, in practice it involves both classification and nomenclature. Identification is simply the determination of the similarities or differences between two elements, i.e., two elements are the same or they are different. The comparison of an unknown plant with a named specimen and the determination that the two elements are the same also involves classification, i.e., when one correctly decides that an unknown belongs to the same group (species, genus, family, etc.) as a known specimen, the information stored in classification systems becomes available and applicable to the material at hand. Both processes--identification and classification--involve comparison and judgment and require a definition of criteria of similarities. Identification is, therefore, a basic process in classification with nomenclature playing an essential role in the retrieval of information and as a means of communication. According to Blackwelder (1967) "identification enables us to retrieve the appropriate facts from the system (classification) to be associated with some specimen at hand" and is "better described as the recovery side of taxonomy." In practice one commonly identifies a plant by direct comparison or the use of keys and arrives at a name. The practical aspects and methods of plant identification and identification systems are discussed in this chapter. For further information see Harrington and Durrell's book How to Identify Plants.

Section A. TRADITIONAL IDENTIFICATION METHODS

The methods of identification include (1) expert determination, (2) recognition, (3) comparison, and (4) the use of keys and similar devices. For a thorough and technical discussion of specimen identification see Sneath and Sokal (1973).

In terms of reliability or accuracy the best method of identification is expert determination. In general the expert will have prepared treatments (monographs, revisions, synopses) of the group in question, and it is probable that the more recent floras or manuals include the expert's concepts of taxa. Although of great reliability, this method presents problems by requiring the valuable time of experts and creating delays for identification. Recognition, according to Morse (1971), approaches expert determination in reliability. This is based on extensive past experience of the identifier with the plant group in question. In some groups this is virtually impossible. A third method is by comparison of an unknown with named specimens, photographs, illustrations, or descriptions. Even though this is a reliable method, it may be very time consuming or virtually impossible due to the lack of suitable materials for comparison. The reliability is, of course, dependent on the accuracy and authenticity of the specimens, illustrations, or descriptions used in the comparison. The use of keys or similar devices (synopses, outlines, etc.) is by far the most widely used method and does not require the time, materials, or experience involved in comparison and recognition.

Keys in the traditional sense are a type of taxonomic literature (see Figures 25-1 & 25-2). Keys are devices consisting of a series of contrasting or contradictory statements or propositions requiring the identifier to make comparisons and decisions based on statements in the key as related to the material to be identified. The first modern type keys (dichotomous) clearly designed for identification were those of Lamarck in his Flore Francaise in 1778 (see Voss, 1952, for an interesting account of the history of keys and phylogenetic trees in systematic biology). The comments of A. P. de Candolle in his dedication to Lamarck in the third edition of this flora concerning keys are equally appropriate today:

"As to the artificial methods I have without hesitation, given preference to the one which you have contrived, and which consists, of leading the student to the name of the plant by always forcing. him to choose between two contradictory characters: in this I have permitted myself only the slight changes necessitated by the increase in the number of plants described. There, after your example, I have sought to distinguish the plants by the easiest and most apparent characters; and then these characters were not constant, I have tried to foresee their aberrations and to arrive at the same name by different routes; but this ease in the distinguishing of plants is very different in different families: in some, such as the crucifers, it is impossible to distinguish the genera without examination of the fruit. . . . When beginners undergo these difficulties in the use of the analytic method, I beg them, before blaming it, to reflect that the most accomplished botanists meet with the same embarrassment, and -that no method can make the work easier to students than it is to the masters . . . . But when the pupil knows the name, let him take care not to think he knows the thing! Referred by a number in the analytic method to the description, he will find in this second part the details which put together constitute the whole science." (Quoted from translation by Voss, 1952)

Section B-III of this chapter includes a discussion of the construction of identification keys and the application of computers to this process.

The following lists of suggestions are for the use and construction of traditional dichotomous keys.

Suggestions for the Use of Keys

Select appropriate keys for the materials to be identified. The keys may be in a flora, manual, guide, handbook, monograph, or revision (see Chapter 30). If the locality of an unknown plant is known, select a flora, guide, or manual treating the plants of that geographic area (see Guides to Floras in Chapter 30). If the family or genus is recognized, one may choose to use a monograph or revision. If locality is unknown, select a general work. If materials to be identified were cultivated, select one of the manuals treating such plants since most floras do not include cultivated plants unless naturalized.
Read the introductory comments on format details, abbreviations, etc. before using the key.
Read both leads of a couplet before making a choice. Even though the first lead may seem to describe the unknown material, the second lead may be even more appropriate.
Use a glossary to check the meaning of terms you do not understand.
Measure several similar structures when measurements are used in the key, e.g., measure several leaves, not a single leaf. Do not base your decisions on a single observation. It is often desirable to examine several specimens.
Try both choices when dichotomies are not clear or when information is insufficient, and make a decision as to which of the two answers best fits the descriptions.
Verify your results by reading a description, comparing the specimen with an illustration or an authentically named herbarium specimen.

Suggestions for Construction of Keys

Identify all groups to be included in a key.
Prepare a description of each taxon (see Chapter 24 for details for description and descriptive format).
Select "key characters" with contrasting character states. Use macroscopic, morphological characters and constant character states when possible. Avoid characteristics that can only be seen in the field or on specially prepared specimens, i.e., use those characteristics that are generally available to the user.
Prepare a Comparison Chart (see Figure 25-3).
Construct strictly dichotomous keys.
Use parallel construction and comparative terminology in each lead of a couplet.
Use at least two characters per lead when possible.
Follow key format (indented or bracketed; see Figures 25-1 and 25-2).
Start both leads of a couplet with the same word if at all possible and successive leads with different words.
Mention the name of the plant part (e.g., leaves or flowers) before descriptive phrases: flowers blue, not blue flowers; leaves alternate, not alternate leaves.
Place those groups with numerous variable character states in a key several times when necessary.
Construct separate keys for dioecious plants, for flowering or fruiting materials and for vegetative materials when pertinent.

A DICHOTOMOUS KEY TO SELECTED GENERA OF SAXIFRAGACEAE

Shrub or woody vine.
  Woody vine; petals 7 or more .... 3. Decumaria
  Shrub; petals 4 or 5.
    Leaves alternate or on short spur branches.
      Leaves pinnately veined; ovary superior; fruit a capsule .... 1. Itea
      Leaves palmately veined; ovary inferior; fruit a berry .... 2. Ribes
    Leaves opposite.
      Petals usually 4; stamens 20-40; fruit longitudinally dehiscent, not ribbed .... 4. Philadelphus
      Petals usually 5; stamens 8-10; fruit poricidally dehiscent, 10- to 15-ribbed .... 5. Hydrangea
Herbs.
  Staminodia present; petals more than 10 mm long .... 6. Parnassia
  Staminodia absent; petals less than 10 mm long.
    Leaves ternately decompound .... 7. Astilbe
    Leaves simple.
      Flowers solitary in leaf axils, or in short, leafy cymes.
        Sepals 4; carpels 2 .... 8. Chrysosplenium
        Sepals 5; carpels 3 .... 9. Lepuropetalon
      Flowers in racemes or panicles.
        Petals pinnatifid or fringed; stem leaves opposite .... 10. Mitella
        Petals not pinnatifid or fringed; stem leaves alternate or absent.
          Ovary 1-celled.
            Inflorescence paniculate; stamens 5 .... 11. Heuchera
            Inflorescence racemose; stamens 10 .... 12. Tiarella
          Ovary 2-celled.
            Stamens 5; leaves palmately lobed .... 13. Boykinia
            Stamens 10; leaves not palmately lobed .... 14. Saxifraga

Figure 25-1. Example of an indented (yoked) key. (From Radford, A. E., H. E. Ahles, and C. R. Bell. 1968. Manual of the Vascular Flora of the Carolinas. University of North Carolina Press. Chapel Hill, North Carolina. Used with permission.)

A DICHOTOMOUS KEY TO SELECTED GENERA OF SAXIFRAGACEAE

1. Shrub or woody vine .... 2
1. Herbs .... 6
2. Woody vine; petals 7 or more .... Decumaria
2. Shrub; petals 4 or 5 .... 3
3. Leaves alternate or on short spur branches .... 4
3. Leaves opposite .... 5
4. Leaves pinnately veined; ovary superior; fruit a capsule .... Itea
4. Leaves palmately veined; ovary inferior; fruit a berry .... Ribes
5. Petals usually 4; stamens 20-40; fruit longitudinally dehiscent, not ribbed .... Philadelphus
5. Petals usually 5; stamens 8-10; fruit poricidally dehiscent, 10- to 15-ribbed .... Hydrangea
6. Staminodia present; petals more than 10 mm long .... Parnassia
6. Staminodia absent; petals less than 10 mm long .... 7
7. Leaves ternately decompound .... Astilbe
7. Leaves simple .... 8
8. Flowers solitary in leaf axils, or in short, leafy cymes .... 9
8. Flowers in racemes or panicles .... 10
9. Sepals 4; carpels 2 .... Chrysosplenium
9. Sepals 5; carpels 3 .... Lepuropetalon
10. Petals pinnatifid or fringed; stem leaves opposite .... Mitella
10. Petals not pinnatifid or fringed; stem leaves alternate or absent .... 11
11. Ovary 1-celled .... 12
11. Ovary 2-celled .... 13
12. Inflorescence paniculate; stamens 5 .... Heuchera
12. Inflorescence racemose; stamens 10 .... Tiarella
13. Stamens 5; leaves palmately lobed .... Boykinia
13. Stamens 10; leaves not palmately lobed .... Saxifraga

Figure 25-2. Example of a bracketed key. (Modified from Radford, A. E., H. E. Ahles, and C. R. Bell. 1968. Manual of the Vascular Flora of the Carolinas. University of North Carolina Press. Chapel Hill, North Carolina. Used with permission.)
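Because each lead of a bracketed key ends either in a couplet number or in a name, such a key can be stored as a simple lookup table. The following Python sketch encodes only the woody half of the key in Figure 25-2; the dictionary layout and the function name run_key are illustrative assumptions, not a published format.

```python
# Each couplet number maps to its two leads: (description, destination).
# A destination is either the next couplet number (int) or a genus name (str).
# Only couplets 1-5 of Figure 25-2 are encoded in this sketch.
KEY = {
    1: [("Shrub or woody vine", 2), ("Herbs", 6)],
    2: [("Woody vine; petals 7 or more", "Decumaria"),
        ("Shrub; petals 4 or 5", 3)],
    3: [("Leaves alternate or on short spur branches", 4),
        ("Leaves opposite", 5)],
    4: [("Leaves pinnately veined; ovary superior; fruit a capsule", "Itea"),
        ("Leaves palmately veined; ovary inferior; fruit a berry", "Ribes")],
    5: [("Petals usually 4; stamens 20-40", "Philadelphus"),
        ("Petals usually 5; stamens 8-10", "Hydrangea")],
}

def run_key(key, choices, start=1):
    """Follow a sequence of lead choices (0 = first lead, 1 = second lead).

    Returns a genus name, or None if the choices run out or lead to a
    couplet that is not encoded in `key`.
    """
    node = start
    for choice in choices:
        if node not in key:
            return None
        _, node = key[node][choice]
        if isinstance(node, str):
            return node
    return None

print(run_key(KEY, [0, 1, 0, 1]))  # Ribes
```

The user still makes every botanical decision; the table only records where each lead goes.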

PLANT IDENTIFICATION EXERCISE

1. Identification of an unknown. Select an unknown specimen and identify it by keying in an appropriate manual, flora, or monograph. Verify your results by reading a description, by comparing with an illustration or by checking with your instructor.

2. Preparation of a comparison chart. Select 5 or more specimens from the group provided by your instructor. Identify each by keying. Verify your results. Prepare a description of each similar to those in a flora or manual. Be sure characters and character states are in the same order. Select contrasting character states and prepare a comparison chart (see Figure 25-3).

3. Construction of keys. Construct a dichotomous key to these specimens using the information in the comparison chart.

COMPARISON CHART
	Decumaria	Itea	Ribes	Parnassia	Heuchera	Saxifraga
Habit	Woody vine	Shrub	Shrub	Herb	Herb	Herb
Leaf arrangement	Opposite	Alternate	Alternate or on spur branches	Basal (Rosulate)	Basal (Rosulate)	Basal (Rosulate)
Petal Number	7-10	5	5	5	5	5
Locule Number	7-10	2	1	1	1	2
Stamen Number	7+	5	5	5 (staminodia 5)	5	10
Fruit Type	Capsule	Capsule	Berry	Capsule	Capsule	Capsule

Figure 25-3. A comparison chart used in the construction of keys (for six of the genera in Figures 25-1 and 25-2).

Section B. RECENT AND NEW IDENTIFICATION METHODS*

*Adapted from "Specimen Identification and Key Construction with Time-Sharing Computers" by Larry E. Morse (Harvard University, Cambridge, Massachusetts), in Taxon 20: 269-282 (1971), with extensive revisions by Mr. Morse. Used with permission.

I. POLYCLAVES

Polyclaves of various kinds allow one to select the characteristics for use in identifying each specimen, taking his choices from some character set and repeating an elimination process until a tentative identification is made. A printed data table, chart, or matrix giving the status of various taxa for useful characteristics is readily used as a polyclave by listing the possible taxa on scratch paper and crossing out those which do not agree with the specimen's characters. Such data tables appear irregularly in the taxonomic literature, often for only the more difficult groups involved but occasionally for all the treated taxa, as done for medical bacteria by Cowan and Steel (1965). For large groups, the diagnostic tables are not only more powerful than the equivalent key, but also take less space to print. Lists of taxa having various characters were among the first nontabular polyclaves. These resemble the inverted files common in computerized information systems, where entries are listed according to their characteristics rather than characteristics by entries. Lists of taxa lacking specified features have also been produced; this modification expedites use as one may then jot down the possible taxa and rapidly cross off those differing from the specimen. Polyclaves are readily mechanized, as shown by the familiar edge-punched cards and the less familiar window keys, as well as various mechanical devices. The possibility of a computerized polyclave was noted by Sokal and Sneath (1966) and by Williams (1967); implementation is straightforward once suitable data formats have been devised. The computerized polyclave system we developed at Michigan State University (Morse, in press) uses a General Electric Mark II timesharing system, but has also been tested on several other computers. Other computerized polyclaves include those of Boughey, Bridges, and Ikeda (1968), Dybowski and Franklin (1968), Goodall (1968), and Walker et al. (1968). As yet no comparison of these many parallel approaches has been made.
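The scratch-paper elimination described above can be sketched in a few lines of Python. The table below transcribes three rows of Figure 25-3; the dictionary layout and the function name eliminate are assumptions made for illustration and do not reproduce any of the cited programs.

```python
# Taxon/character table transcribed from part of Figure 25-3.
TABLE = {
    "Decumaria": {"habit": "woody vine", "fruit": "capsule", "stamens": "7+"},
    "Itea":      {"habit": "shrub", "fruit": "capsule", "stamens": "5"},
    "Ribes":     {"habit": "shrub", "fruit": "berry", "stamens": "5"},
    "Parnassia": {"habit": "herb", "fruit": "capsule", "stamens": "5"},
    "Heuchera":  {"habit": "herb", "fruit": "capsule", "stamens": "5"},
    "Saxifraga": {"habit": "herb", "fruit": "capsule", "stamens": "10"},
}

def eliminate(candidates, table, character, observed):
    """Cross out every candidate whose recorded state differs from the specimen."""
    return [t for t in candidates if table[t][character] == observed]

# The user picks the characters, in any order, until few candidates remain.
remaining = list(TABLE)
remaining = eliminate(remaining, TABLE, "habit", "herb")
remaining = eliminate(remaining, TABLE, "stamens", "10")
print(remaining)  # ['Saxifraga']
```

Unlike a key, nothing fixes the order of the questions: a specimen lacking fruit can still be run through the table on habit and stamen number alone.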

If an identification method requires, in general, use of all the characters in some list, it is no longer a polyclave, for no user options remain in selecting characters. Several such character-set methods have been devised, mostly statistical. Some computerized "keys" are also based on this model (e.g., Gyllenberg, 1965; Bogdanescu and Racotta, 1967; and Rypka, 1971). Although well intended, character-set algorithms seem to offer no advantages over polyclaves. However, pattern recognition methods, using no formal characters, might be employed effectively for identification in some cases. Such techniques linked with optical scanners, spectroscopy, or chromatography could even offer fully automated identification.

II. TAXA, CHARACTERS, AND DATA MATRICES

An objective identification system requires a priori information of three kinds: the pertinent taxa, the useful differentiating characters of these taxa, and the taxon/character data themselves.

The hierarchy of natural taxa is the foundation of biological information retrieval as well as systematic synthesis, since data can be stored, retrieved, and studied at any level of generalization (Morse, 1974). In data processing, we may usually regard a taxon as a group of one or more individuals or lower taxa judged sufficiently similar to each other to be treated together formally as a single evolutionary or informational unit at a particular level in the taxonomic hierarchy, and sufficiently different from other such groups of the same rank to be treated separately from them. Traditionally each taxon (except the highest) belongs to one and only one taxon of the next higher rank, implying each individual belongs to exactly one species (and has one name) in any particular taxonomic treatment of its group. Taxa are either monothetic or polythetic. For monothetic taxa, possession of a certain set of diagnostic characters is both necessary and sufficient for membership. Polythetic taxa are more loosely circumscribed, since presence of only "a large number" (rather than all) of a list of characters is required for membership. In other words, the members of a polythetic taxon exhibit overall similarity. Modern systematists employ the polythetic taxon concept in most of their work, as has been widely recognized in the past decade. However, dichotomous keys and other monothetic methods of identification are still used. Theoretically, identification schemes for modern biology ought to assure that no single-character difference, either in population variability or observer error, can result in a misidentification. Polythetic polyclaves offer a solution; here, no possibility is eliminated until several differences have accumulated between the taxon description and the unknown specimen. The appropriate threshold depends on the variability of the taxa, but toleration of just one or two differences often gives a marked improvement in identification success.
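A polythetic polyclave of the kind just described can be sketched as follows. The taxa T1-T3, the characters c1-c4, and the function name are hypothetical placeholders; only the rule of tolerating a small number of differences comes from the text.

```python
def polythetic_candidates(specimen, descriptions, max_differences=1):
    """Keep taxa that differ from the specimen in at most `max_differences`
    of the characters recorded for the specimen.

    Returns a dict mapping each surviving taxon to its number of differences.
    """
    kept = {}
    for taxon, description in descriptions.items():
        differences = sum(
            1 for character, state in specimen.items()
            if description[character] != state
        )
        if differences <= max_differences:
            kept[taxon] = differences
    return kept

# Hypothetical taxon descriptions coded as couplet states A or B.
DESCRIPTIONS = {
    "T1": {"c1": "A", "c2": "A", "c3": "A", "c4": "A"},
    "T2": {"c1": "A", "c2": "B", "c3": "B", "c4": "B"},
    "T3": {"c1": "B", "c2": "B", "c3": "B", "c4": "B"},
}

# A T1 specimen that is aberrant (or misread) for character c4.
specimen = {"c1": "A", "c2": "A", "c3": "A", "c4": "B"}

print(polythetic_candidates(specimen, DESCRIPTIONS, max_differences=0))  # {}
print(polythetic_candidates(specimen, DESCRIPTIONS, max_differences=1))  # {'T1': 1}
```

With no tolerance the single odd character eliminates every taxon, which is the monothetic failure the text warns about; allowing one difference recovers T1.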

Our character model was originally developed for a list-structure representation of dichotomous keys (Morse, Beaman, and Shetler, 1968). It centers on the concept of character couplets (or two-state characters), and also allows coding of "dependent characters" (Williams, 1969). In our model, each character couplet represents two contrasting alternatives about possible features of an individual population. (For convenience we often label the possibilities A and B, or true and false.) In other words, the character couplets are equivalent to the pairs (couplets) of contrasting leads in dichotomous keys, specifying a part of the organism and asking which of two alternative modifications of it is present. Such a question implies the specified part itself is present--the couplet "petals red" versus "petals yellow" is meaningless for apetalous plants as well as those having petals of different colors. A couplet can thus be inapplicable or non-comparable, often coded NC in numerical-taxonomy data. An additional code is needed when a specimen is variable in its expression of the character states of a couplet; more precisely, variable means sometimes A and sometimes B. At other times the state of a couplet is not known for a specimen, perhaps because the character is difficult to determine or not recorded, or the specimen is incomplete, poorly preserved, or at the wrong stage of development; these cases are coded unknown. All binary-couplet characters can be readily encoded with these five character states, listed below with our code numbers for them:

0 = unknown
1 = true, or A
2 = variable
3 = false, or B
4 = inapplicable

Each character couplet in our model is a hierarchy of two binary (two-state) characters, one implicit and one explicit. Expression of either state of the explicit character depends on the presence of one particular state of the implied more general character--petal color depends on petal presence. As well illustrated in a diagram by Dale (1968), larger character hierarchies can be built by combination of a number of such couplets.
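The five codes and the dependence of the explicit character on the implicit one can be illustrated with a short Python sketch. The code numbers are those listed above; the class name State, the function code_couplet, and its arguments are assumptions introduced here for illustration.

```python
from enum import IntEnum

class State(IntEnum):
    UNKNOWN = 0
    TRUE = 1          # state A
    VARIABLE = 2      # sometimes A, sometimes B
    FALSE = 3         # state B
    INAPPLICABLE = 4

def code_couplet(part_present, observations):
    """Code one couplet, e.g. "petals red" (A) versus "petals yellow" (B).

    part_present: True or False, or None if it could not be determined
                  (the implicit, more general character).
    observations: readings of the explicit character taken from the
                  material, normally "A" or "B".
    """
    if part_present is False:
        return State.INAPPLICABLE       # e.g. apetalous plants
    if part_present is None or not observations:
        return State.UNKNOWN
    seen = set(observations)
    if not seen <= {"A", "B"}:
        return State.INAPPLICABLE       # e.g. petals of some other color
    if seen == {"A"}:
        return State.TRUE
    if seen == {"B"}:
        return State.FALSE
    return State.VARIABLE

print(code_couplet(True, ["A", "A", "B"]).name)  # VARIABLE
```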

Extensions to multi-state and quantitative characters are, of course, necessary before our methods can be seriously considered for handling taxonomic information in general. For this we implemented a file system involving three kinds of taxon/character data: two-state characters, as described above; multi-state characters; and quantitative (numeric) characters (Morse, Peters, and Hamel, 1971). Multi-state and quantitative characters may also be coded as couplets for use with the current programs. The former can always be expressed as a list of yes/no binary characters, while the ranges of quantitative characters may be segmented as appropriate. Details of the MSU programs are available elsewhere (Morse, in press).
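The two recodings mentioned above might look as follows in Python. Both function names and the naming of the derived yes/no characters are hypothetical; the MSU file formats themselves are not reproduced here.

```python
def multistate_to_couplets(character, states, observed):
    """Express one multi-state character as a list of yes/no characters."""
    return {f"{character} = {state}": (state == observed) for state in states}

def segment_range(character, cut_points, value):
    """Segment a quantitative character into yes/no characters at cut points."""
    return {f"{character} < {cut}": (value < cut) for cut in cut_points}

habit = multistate_to_couplets("habit", ["woody vine", "shrub", "herb"], "herb")
print(habit)
# {'habit = woody vine': False, 'habit = shrub': False, 'habit = herb': True}

petals = segment_range("petal length (mm)", [10], 12)
print(petals)
# {'petal length (mm) < 10': False}
```

The cut point of 10 mm echoes the Parnassia couplet in Figures 25-1 and 25-2; where to segment a range in practice is a judgment for whoever prepares the data.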

For identification and allied procedures, the preparation of data matrices as taxon/character rather than specimen/character entries is preferable. In compiling effective taxon/character descriptions it is of course important to study an adequate sampling of specimens and other information, including the relevant literature, as otherwise the data cannot indicate the nature of the taxon as a whole. Yet if enough examples of a taxon were examined, virtually all the characters studied would eventually be marked variable, rendering the whole description useless. However, with polythetic identification methods the occasional oddity can be safely ignored in preparing a taxon description. For polythetic taxa, the code variable means noticeably variable, while A (or B) means usually A (or B), perhaps 80% (Cowan and Steel, 1965) or 85% (Lapage et al., 1973), or a number calculated from the data (Moller, 1962). In our present system, taxon/character data are presented in the format illustrated elsewhere (Morse, in press; Shetler et al., 1971). Briefly, such files consist of a taxon list, a character list, and a coded taxon/character data matrix, with additional fields for various parameters used by the programs. Allowance is also made for extensive documentation.

Hierarchical matrices aid in processing of taxonomic data since different characters may be used for different subgroups of taxa. In a hierarchical taxonomic matrix system, each taxon-description line may refer to another complete matrix differentiating subordinate taxa, permitting programs to work down through the hierarchy much as one uses a key to families, then a key to genera, and then a key to species. This method also allows the user to query the data bank beginning at any level he desires--if he knows his specimen is a Rhododendron, he need not first verify that it is an angiosperm!
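A hierarchical matrix system of this kind can be sketched as a set of named matrices in which each taxon line may point to a subordinate matrix. All taxa, characters, and names below are placeholders, and the layout is an assumption for illustration only.

```python
# Each matrix maps a taxon to (description, name of sub-matrix or None).
MATRICES = {
    "families": {
        "Family X": ({"habit": "woody"}, "genera of X"),
        "Family Y": ({"habit": "herb"}, None),
    },
    "genera of X": {
        "Genus X1": ({"leaves": "opposite"}, None),
        "Genus X2": ({"leaves": "alternate"}, None),
    },
}

def identify(matrices, start, specimen):
    """Work down the hierarchy from matrix `start`; return the path of taxa."""
    path, name = [], start
    while name is not None:
        matches = [
            (taxon, sub)
            for taxon, (description, sub) in matrices[name].items()
            if all(specimen.get(c) == s for c, s in description.items())
        ]
        if len(matches) != 1:
            break  # no match, or ambiguous: stop at this level
        taxon, name = matches[0]
        path.append(taxon)
    return path

# Start at the top of the hierarchy ...
print(identify(MATRICES, "families", {"habit": "woody", "leaves": "alternate"}))
# ['Family X', 'Genus X2']

# ... or enter directly at a lower level when the higher taxon is known.
print(identify(MATRICES, "genera of X", {"leaves": "opposite"}))
# ['Genus X1']
```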

In preparing identification data, one should include only characters of potential value in making identifications. A number of computer algorithms are available for character-set selection, but subjective screening appears ordinarily adequate for choosing a manageable set of characters for further study. The traditional key characters of the group offer an initial list, but new diagnostic characters should be sought, especially in difficult groups. Guidelines for character selection, offered in a multitude of works, include the following: presence in the usual material to be identified, ease of observation and interpretation, distinctness between the various states, independence from other characters used, and tolerance to environmental influences. Identification characters must not only be well expressed in nature but should also be clearly expressible verbally or pictorially. Occasionally one should describe what the user will think he is seeing, whether or not this be the actual case morphologically. The previous experience of the intended users should also be considered, and especially in educational works the characters might be easily learned and remembered in association with their taxa.

III. CONSTRUCTION OF IDENTIFICATION KEYS

The presentation of dichotomous keys is usually considered a mandatory part of scholarly taxonomic publication, yet until recent years few advances were made in key-constructing theory over the methods of Grew or Lamarck, or even Aristotle and Theophrastus (Voss, 1952). In his study of the theory of keys, Osborne (1963) concludes that the most efficient key is the one in which each dichotomy divides the remaining taxa symmetrically, or as nearly so as possible, assuming the characters have equal probabilities of misinterpretation and the taxa are equally abundant. However, in reality some characters are far easier to use than others, and some taxa are keyed out more often.

The relative convenience, or ease of use, of the various characters should be a major consideration in key construction, as one wants the easiest and fastest key which is sufficiently accurate. Ledley and Lusted (1960) suggest the characters (disease symptoms in their paper) be grouped into numbered sets or blocks of similar difficulty. In making a diagnosis they employ all available characters from one block before using any from the next. We allow nine such blocks, with the characters in block one hardest and those in block nine easiest; any characters labelled zero are completely ignored by some routines. In our programs we treat the block codes, or character convenience values, as natural logarithms of a convenience-of-use number. Thus the nine symbols represent a range of about three to eight thousand. When rating characters, one should assign codes such that the intended users would prefer to study any three characters at some level rather than any single character at the next lower level, seven to one for a two-level difference, twenty to one for three levels, and so forth. The conveniences may, of course, be revised with experience or a change in user groups. Hall (1970) considers conveniences which vary from taxon to taxon, but finds it easier to rate each character only once, instead of once per taxon.
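The arithmetic behind these ratios is simply that of the exponential function, as the following Python sketch shows. The function name is hypothetical, and returning zero for a code of zero is one assumed way of ignoring such characters.

```python
import math

def convenience_weight(block_code):
    """Treat a block code (1-9) as the natural logarithm of a
    convenience-of-use number; a code of 0 is given no weight (assumption)."""
    if block_code == 0:
        return 0.0
    return math.exp(block_code)

# The nine codes span roughly 3 to 8103.
print(round(convenience_weight(1), 1), round(convenience_weight(9)))  # 2.7 8103

# Ratio between characters one, two, and three levels apart.
for levels in (1, 2, 3):
    ratio = convenience_weight(5 + levels) / convenience_weight(5)
    print(levels, round(ratio, 1))
# 1 2.7
# 2 7.4
# 3 20.1
```

A one-level difference is thus worth a factor of e, about 2.7, which is the source of the "any three characters" rule of thumb.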

If the relative frequencies of the various taxa can be estimated for a situation, then the taxa can be considered accordingly in determining the evenness of division at a dichotomy, in effect taking each expected occurrence as a separate object to be identified. Determining such frequencies is not at all easy, since we are interested in how often a taxon will be collected and keyed out, a quite different value than how often it occurs in the field.

The intra-taxon variability of the characters is also an important consideration in keys. In our methods, characters are coded either as variable or as (nearly) constant, leaving selection of a variability threshold to the researcher preparing the data. An alternative approach often used involves stating the variability of each taxon for each character as a quantitative entry in the data matrix, usually as a percentage. This percentage matrix would be valuable in a general taxonomic information system, especially while data were being collected and variability patterns were developing. However, the true-false-variable matrices present their data more clearly, and take far less effort to prepare. Also, we know no programs which utilize the percentages directly in constructing keys.

Many texts recommend the use of data tables in writing keys, but other mechanical aids are rarely described. Metcalf (1954) presents an index-card technique, and Peters (1969) used a computer to help incorporate additional taxa in keys. While developing our key-constructing techniques we prepared a key-editing program (Morse, Beaman, and Shetler, 1968), but work on the advanced system outlined there was suspended in favor of our current research on key construction and identification procedures. Key editing is important in large projects such as Flora Europaea and the planned Flora North America, where numerous minor improvements are made in the keys during editing. However, if taxa are added or deleted, it is often better to develop an entirely new key.

The possibility of computerized key construction is often mentioned, yet we know of only four programs for construction of biological keys, namely those of Moller (1962), Hall (1970), and Pankhurst (1970, 1971), as well as our own (Morse, 1971 and in press). Pankhurst (1974) provides a comparison of some of these programs. Moller's method requires complete binary data, and has attracted little attention. Hall's program utilizes quantitative data, printing a numeric version of the key which must then be rewritten before use. Pankhurst's algorithm differs from ours primarily in his use of rigid character-convenience blocks and his employment of the attribute-value rather than hierarchical-couplet character concept. His program, like ours, prints the key directly. The production version of our MSU programs is described below; these allow mixed-data key construction with our new data matrices. Several matrix-reduction and monothetic-divisive algorithms in the literature resemble key construction; the KEYCALC program by Niemelä, Hopkins, and Quadling (1968) is typical. Also, some aspects of decision-tree and game-tree research in computer science could contribute to the theory of keys.

In preparing a key, one usually divides the initial group of taxa by a character couplet into two subgroups, each of which is independently divided into further subgroups, and so forth, until every taxon is distinguished from all others. Indeed, the further subdivision of a subgroup can be considered as construction of a full key to that local group, suggesting a concise recursive algorithm for computerization: construct the key by dividing the original taxa into two subgroups by the best possible character couplet, then consider each of these subgroups separately, dividing them similarly. It is readily established that this procedure will produce a key if one can be made at all, but the optimality of such a key apparently remains uninvestigated. The four key-construction programs cited above all operate on this principle.
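The recursive procedure can be sketched compactly in Python. This is a minimal illustration under stated assumptions, not any of the cited programs: the data are complete and coded only A or B, the taxa and characters are hypothetical, and the "best" couplet is chosen by a placeholder criterion (the most even split).

```python
def build_key(taxa, data, characters):
    """Recursively divide `taxa` into two subgroups.

    Returns a taxon name, or a tuple (character, key_for_A, key_for_B).
    """
    if len(taxa) == 1:
        return taxa[0]
    best = None
    for c in characters:
        group_a = [t for t in taxa if data[t][c] == "A"]
        group_b = [t for t in taxa if data[t][c] == "B"]
        if group_a and group_b:
            evenness = len(group_a) * len(group_b)  # placeholder criterion
            if best is None or evenness > best[0]:
                best = (evenness, c, group_a, group_b)
    if best is None:
        raise ValueError(f"no character separates {taxa}")
    _, c, group_a, group_b = best
    # Each subgroup is keyed exactly as the whole group was.
    return (c,
            build_key(group_a, data, characters),
            build_key(group_b, data, characters))

DATA = {
    "T1": {"c1": "A", "c2": "A"},
    "T2": {"c1": "A", "c2": "B"},
    "T3": {"c1": "B", "c2": "A"},
}

print(build_key(["T1", "T2", "T3"], DATA, ["c1", "c2"]))
# ('c1', ('c2', 'T1', 'T2'), 'T3')
```

The nested tuple is one possible stored form of the key; printing it in indented or bracketed style is a separate step.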

For use of this algorithm one needs a means for selecting the "best" character couplet for a given dichotomy, given the appropriate taxon/character data for a set of potentially useful characters. In our program, a preliminary screening of the characters is followed by a detailed evaluation of the potentially useful ones. In the preliminary scan, a character is eliminated from further consideration if it is coded as unknown or as inapplicable for any taxon in the local group, or if it is coded in only one way within that group. In other words, the remaining characters are all coded partly true and partly false; true and variable; false and variable; or true, false, and variable.
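The preliminary scan can be written directly from this description. The sketch below uses the code numbers given earlier in this section (0 unknown, 1 true, 2 variable, 3 false, 4 inapplicable); the function name, the matrix layout, and the sample data are illustrative assumptions.

```python
UNKNOWN, TRUE, VARIABLE, FALSE, INAPPLICABLE = 0, 1, 2, 3, 4

def screen_characters(local_group, matrix, characters):
    """Return the characters that survive the preliminary scan for this group."""
    usable = []
    for c in characters:
        codes = {matrix[taxon][c] for taxon in local_group}
        if UNKNOWN in codes or INAPPLICABLE in codes:
            continue  # unknown or inapplicable for some taxon in the group
        if len(codes) < 2:
            continue  # coded in only one way within the group
        usable.append(c)
    return usable

# Hypothetical coded matrix for a local group of two taxa.
MATRIX = {
    "T1": {"c1": TRUE,  "c2": TRUE, "c3": UNKNOWN, "c4": TRUE},
    "T2": {"c1": FALSE, "c2": TRUE, "c3": TRUE,    "c4": VARIABLE},
}

print(screen_characters(["T1", "T2"], MATRIX, ["c1", "c2", "c3", "c4"]))
# ['c1', 'c4']
```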

For keys these are obviously the only characters of immediate interest. Next, a combined measure of ease of use and dichotomizing power is determined for each character. A tally is made for each state (true, false, and variable), giving the sums of the taxon frequencies of the taxa so coded for a character. In effect, these three tallies give the total number of encounters expected under each state of a character. The tally for variable is then multiplied by the sum of the other two tallies, and half this product is added to twice the product of the true and false tallies, giving a "dichotomizing value."

DV = 2qTqF + (1/2)qV(qT + qF),

where qT, qF, and qV are the respective tallies. High values of DV indicate well-divided dichotomies. In considering the character conveniences, these ratings are treated in exponential (values 3 through 8103) rather than condensed (1-9) form. The "best" couplet is found by maximizing the product of the character convenience exponential and the dichotomizing value. The character couplet having this product greatest is then used for that dichotomy of the key, and the taxa in the local group are sorted according to their recorded status for that character, placing variable taxa in both subgroups. Note the character conveniences have a powerful but not exclusive effect: for ten equally frequent taxa with no variable entries, a 9-1 split at convenience 6 (18 x 403 = 7254) scores higher than a 6-4 (or worse) split at convenience 5 (48 x 148 = 7104), but lower than a 5-5 split at convenience 5 (50 x 148 = 7400).
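The selection rule can be checked numerically with a short Python sketch. The formula for DV is the one given above; the function names and the three candidate couplets are hypothetical, and each taxon is given a frequency of one.

```python
import math

def dichotomizing_value(q_true, q_false, q_variable=0):
    """DV = 2*qT*qF + (1/2)*qV*(qT + qF), from tallies of taxon frequencies."""
    return 2 * q_true * q_false + 0.5 * q_variable * (q_true + q_false)

def score(convenience_code, q_true, q_false, q_variable=0):
    """Product of the convenience exponential and the dichotomizing value."""
    return math.exp(convenience_code) * dichotomizing_value(
        q_true, q_false, q_variable)

# Three hypothetical couplets dividing ten equally frequent taxa.
candidates = {
    "5-5 split at convenience 5": score(5, 5, 5),
    "9-1 split at convenience 6": score(6, 9, 1),
    "6-4 split at convenience 5": score(5, 6, 4),
}
best = max(candidates, key=candidates.get)
print(best)  # 5-5 split at convenience 5
```

The easier character (convenience 6) beats a moderately even split at convenience 5 despite dividing the taxa 9 to 1, but it loses to a perfectly even split, so convenience weighs heavily without deciding the matter alone.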

If the indented style is desired, the program prints the key as it selects the dichotomies. However, for the bracketed style the key must be stored (in our list-structure representation) until completed, since division of variable taxa affects the numbering. It is then printed in one operation. The list-structure condensation can also be printed if desired, perhaps for later use in editing.

Ordinarily a batch-processing computer is adequate for key construction since no user interaction during processing is necessary. However, on-line key construction allows the effects of changes in the data to be seen quickly. An on-line key-editing system would also be possible, but most changes can be made with the text-editing software usually supplied with time-sharing systems. For small groups, these methods may also be used manually.

IV. DYNAMIC POLYCLAVES

Polyclaves need not be automated, but might be computerized when a large, rapidly developing information base is involved; otherwise, imaginative publication methods suffice (e.g., Ogden, 1953; Leenhouts, 1966; Archbald, 1967; Duke, 1969; Hansen and Rahn, 1969; Shultz, 1973). Here on-line (usually time-sharing) computers offer a clear advantage over batch-processing systems, since the on-line system allows the user to submit additional data during the execution of a program. This permits a dialogue or conversation between the user and the computer, with the machine printing questions and awaiting responses before continuing the processing. Such conversations can be about as terse or verbose as desired; we have taken a middle course, writing out most questions and taxon names in full, but sometimes using numerical codes for character states and couplets, as well as for several program options. Numeric coding of lengthy answers saves printing or typing time and reduces the chances for errors, assuming the user can copy a short number more accurately than a long descriptive phrase. For regular users of the system a faster abbreviated terminology is planned. As yet, the ideal of free-form language input is not practical for this or almost any information-retrieval system (Simmons, 1970).

Following selection of a group of possible taxa by the choice of the initial data file, our polyclave algorithm consists of three steps, repeated as necessary: (1) request the user to give one or more characteristics of his specimen; (2) eliminate all possibilities inconsistent with this partial description; and (3) print the results of the elimination, either an identification or some other action, and recycle to the first step if necessary. Several alternatives are available, including production of a useful-characters list and deletion of the effects of the last character set submitted. Since a user may start with a data matrix at any level in the taxonomic hierarchy, continuation to a subsequent matrix may be possible after a family, genus, or other higher-level taxon is identified. After each identification, several diagnostic or peculiar characters of the taxon are listed as an immediate check of the suggested identification. Particularly with higher level matrices, "false positives" may occasionally occur when no single species has a character combination suggested by the generalized description of a taxon.

Polythetic identification, where one difference no longer implies elimination, is available as an option on our computer system. Here the program tallies the number of differences and eliminates a taxon only when its tally exceeds a user-determined value, commonly one, two, or three. Although it allows for greater taxon variability or user error, the polythetic polyclave is slower than the monothetic one since more characters must be submitted to assure complete elimination of all the other taxa.
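The elimination step, in both its monothetic and polythetic forms, can be sketched as follows. Several details are assumptions made for this illustration only: a taxon coded variable, unknown, or inapplicable for a character is never counted as differing from the specimen, the specimen is described only by true or false states, and the names `count_differences` and `eliminate` and the sample matrix are hypothetical. State codes follow the chapter's coding legend.

```python
# State codes as listed in the chapter's coding legend.
UNKNOWN, TRUE, VARIABLE, FALSE, INAPPLICABLE = 0, 1, 2, 3, 4


def count_differences(taxon_states, observations):
    """Tally the characters on which a taxon contradicts the specimen."""
    differences = 0
    for character, observed in observations.items():
        coded = taxon_states.get(character, UNKNOWN)
        # ASSUMPTION: only a definite true/false coding can contradict.
        if coded in (TRUE, FALSE) and coded != observed:
            differences += 1
    return differences


def eliminate(matrix, observations, max_differences=0):
    """Keep taxa whose tally does not exceed max_differences.

    max_differences=0 gives the monothetic polyclave (one difference
    eliminates); 1, 2, or 3 gives the polythetic option.
    """
    return [taxon for taxon, states in matrix.items()
            if count_differences(states, observations) <= max_differences]


# Hypothetical data matrix, invented purely for illustration.
matrix = {
    "Genus A": {"leaves opposite": TRUE, "petals fused": TRUE,
                "stipules present": FALSE},
    "Genus B": {"leaves opposite": FALSE, "petals fused": TRUE,
                "stipules present": FALSE},
    "Genus C": {"leaves opposite": VARIABLE, "petals fused": FALSE,
                "stipules present": TRUE},
}

specimen = {"leaves opposite": TRUE, "petals fused": TRUE}
print(eliminate(matrix, specimen))                      # monothetic
print(eliminate(matrix, specimen, max_differences=1))   # polythetic

specimen["stipules present"] = FALSE                    # a further character
print(eliminate(matrix, specimen, max_differences=1))
```

With two characters the monothetic run already isolates one genus, while the polythetic run with a tolerance of one still retains all three and needs the third character to drop a taxon. This is the slower convergence noted above.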

When a list of useful characteristics is desired part way through a polyclave procedure, a subroutine selects the best characters to continue dividing the set of remaining possibilities. Actually, a portion of the key-constructing algorithm is used to determine the first such character, and repetition of the procedure (ignoring this character) gives the next-best character, and so forth. With the polyclave, of course, such recommended characters are merely suggestions which need not be employed. No matter which state the specimen shows, the recommended character will eliminate about half the possibilities. Any other character, on the average, will delete fewer taxa because it eliminates a larger number only when in its rarer state. However, if the user happens to notice his specimen displays a rare character, he can eliminate a large number of possibilities at once and identify his specimen much more rapidly. Since the ability to recognize rare characters and realize their power comes only through training and experience, the expert delights in the efficiency and power of a polyclave, while the neophyte is lost in its multitude of choices and prefers the supervision and security of the traditional dichotomous key.
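The trade-off between an evenly dividing character and a rare one can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every remaining taxon is equally likely to be the specimen's taxon, that no taxon is variable for the character, and that elimination is monothetic. The function names are hypothetical.

```python
def expected_eliminations(n_true, n_false):
    """Average number of taxa eliminated by one character.

    With probability n_true/n the specimen shows the true state and the
    n_false taxa coded false are eliminated, and vice versa, so the
    expectation is 2 * n_true * n_false / n.
    """
    n = n_true + n_false
    return 2 * n_true * n_false / n


def eliminated_if(observed_true, n_true, n_false):
    """Taxa eliminated once the specimen's state is actually known."""
    return n_false if observed_true else n_true


# Ten remaining taxa: an even character versus a rare-state character.
print(expected_eliminations(5, 5))    # recommended, even character
print(expected_eliminations(9, 1))    # lopsided character, on average
print(eliminated_if(False, 9, 1))     # specimen shows the rare state
```

For ten taxa the even character removes five whichever state is seen, the 9-1 character removes fewer than two on average, yet the same 9-1 character removes nine at a stroke when the specimen happens to show its rare state.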

PLANT IDENTIFICATION LITERATURE

Archbald, D. 1967. Quick-Key Guide to Trees: Trees of Northeastern and Central North America. Doubleday. Garden City, New York.

Blackwelder, R. E. 1967. Taxonomy, A Text and Reference Book. John Wiley & Sons, Inc. New York.

Bogdanescu, V., & R. Racotta. 1967. Identification of mycobacteria by overall similarity analysis. Journal of General Microbiology 48: 111-126.

Bossert, W. 1969. Computer techniques in systematics. In: Systematic Biology. National Academy of Science Publication 1692. Washington, D. C.

Boughey, A. S., K. W. Bridges, and A. G. Ikeda. 1968. An automated biological identification key. Museum of Systematic Biology. University of California (Irvine).

Cowan, S. T., and K. J. Steel. 1965. Manual for the Identification of Medical Bacteria. Cambridge University Press. Cambridge, England.

Dale, M. B. 1968. On property structure, numerical taxonomy, and data handling. In: Heywood, V. H. Modern Methods in Plant Taxonomy. Academic Press. London and New York.

Duke, J. A. 1969. On tropical tree seedlings. I. Seeds, seedlings, systems, and systematics. Annals of the Missouri Botanical Garden 56: 125-161.

Dybowski, W., and D. A. Franklin. 1968. Conditional probability and the identification of bacteria: a pilot study. Journal of General Microbiology 54: 215-229.

Goodall, D. W. 1968. Identification by computer. BioScience 18: 485-488.

Gyllenberg, H. 1965. A model for computer identification of micro-organisms. Journal of General Microbiology 39: 401-405.

Hall, A. V. 1970. A computer-based system for forming identification keys. Taxon 19: 12-18.

Hansen, B., and K. Rahn. 1969. Determination of angiosperm families by means of a punched-card system. Dansk Botanisk Arkiv 26: 1-46 + 172 punched cards.

Harrington, H. D., and L. W. Durrell. 1957. How to Identify Plants. Swallow Press. Chicago.

Lapage, S. P., S. Bascomb, W. R. Willcox, and M. A. Curtis. 1973. Identification of bacteria by computer: general aspects and perspectives. Journal of General Microbiology 77: 273-290.

Ledley, R. S., and L. B. Lusted. 1960. The use of electronic computers in medical data processing: aids in diagnosis, current information retrieval, and medical record keeping. Institute of Radio Engineers Transactions on Medical Electronics 7: 31-47.

Leenhouts, P. W. 1966. Keys in biology: a survey and a proposal of a new kind. Proceedings Koninklijke Nederlandse Akademie Van Wetenschappen 69 (ser. C): 571-596.

Metcalf, Z. P. 1954. The construction of keys. Systematic Zoology 3: 38-45.

Möller, F. 1962. Quantitative methods in the systematics of Actinomycetales. IV. The theory and application of a probabilistic identification key. Giornale di Microbiologia 10: 29-47.

Morse, L. E. 1974. Computer-assisted storage and retrieval of the data of taxonomy and systematics. Taxon 23, in press.

_______. (in press). Computer programs for specimen identification, key construction, and description printing using taxonomic data matrices. Publications of the Museum, Michigan State University (Biological Series). East Lansing, Michigan.

_______, J. H. Beaman, and S. G. Shetler. 1968. A computer system for editing diagnostic keys for Flora North America. Taxon 17: 479-483.

_______, J. A. Peters, and P. B. Hamel. 1971. A general data format for summarizing taxonomic information. BioScience 21: 174-180, 186.

Niemela, S. I., J. W. Hopkins, and C. Quadling. 1968. Selecting an economical binary test battery for a set of microbial cultures. Canadian Journal of Microbiology 14: 271-279.

Ogden, E. C. 1953. Key to the North American species of Potamogeton. New York State Museum Circular 31.

Osborne, D. V. 1963. Some aspects of the theory of dichotomous keys. New Phytologist 62: 144-160.

Pankhurst, R. J. 1970. A computer program for generating diagnostic keys. Computer Journal 13: 145-151.

_______. 1971. Botanical keys generated by computer. Watsonia 8: 357-368.

Pankhurst, R. J. 1974. Automated identification in systematics. Taxon 23, in press.

Peters, J. A. 1969. Discussion. In: Systematic Biology. National Academy of Science Publication 1692. Washington, D. C.

_______ and B. B. Collette. 1968. The role of time-sharing computers in museum research. Curator 11: 65-75.

Rypka, E. W. 1971. Truth table classification and identification. Space Life Sciences 3: 135-156.

Shetler, S. G., J. H. Beaman, M. E. Hale, L. E. Morse, J. J. Crockett, and R. A. Creighton. 1971. Pilot data processing systems for floristic information. In: J. L. Cutbill. Advances in Data Processing for Biology and Geology. Academic Press. London and New York.

Shultz, L. M. 1973. Random-access key to the genera of Colorado wildflowers. University of Colorado Museum. Boulder, Colorado.

Simmons, R. F. 1970. Natural language question-answering systems: 1969. Communications of the Association for Computing Machinery 13: 15-30.

Sneath, P. H. A., and R. R. Sokal. 1973. Numerical Taxonomy: The Principles and Practice of Numerical Classification. W. H. Freeman and Company. San Francisco.

Sokal, R. R., and P. H. A. Sneath. 166. 'Efficiency in taxonomy. Taxon 15: 1-21.

Voss, E. G. 1952. The history of keys and phylogenetic trees in systematic biology. Journal of the Scientific Laboratories of Denison University 43: 1-25.

Walker, D., P. Milne, J. Guppy, and J. Williams. 1968. The computer-assisted storage and retrieval of pollen morphological data. Pollen et Spores 10: 251-262.

Williams, W. T. 1967. The computer botanist. Australian Journal of Science 29: 266-271.

_______. 1969. The problem of attribute-weighting in numerical classification. Taxon 18: 369-374.

POLYCLAVE EXERCISE

Preparation of a simple polyclave. Select a family of ten or more genera. Construct a master sheet listing an abbreviation or number for each genus, and print it in red on a 5 x 8 card. This list is the taxa master list underlay. By examining herbarium specimens or using an appropriate manual or monograph, select and compile a list of 50 or more characters and character states for each genus. Duplicate the taxa master list underlay in black, making one copy for each character state selected. These copies will become the character state overlay cards. Label each card with a specific character state from the character state list. Search each specimen or description for the character state in question, and punch out the name or number on the card of each genus having that character state. Punch a card for each character state; e.g., if one character state selected is leaves opposite, punch out the names of all genera known to have opposite leaves.

To use the polyclave, select a specimen identified as belonging to the family in question. Note its character states and sort through the overlay cards until you find a card whose character state matches the specimen. Place this character overlay card over the taxa master list. The names or numbers appearing in red through the holes represent all genera possessing the character state in question. Continue sorting and overlaying until all cards have been used or until a single genus name or number appears.
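The card exercise has a simple computational analogue: each overlay card is the set of genera punched out of it, and stacking cards over the master list leaves visible only the genera common to every card. The sketch below is an illustration under that reading. The genus numbers, the three character states, and the function name `stack_overlays` are hypothetical.

```python
def stack_overlays(master_list, overlays):
    """Return the genera still visible through every stacked card.

    master_list -- genus numbers printed on the underlay
    overlays    -- list of sets; each set holds the genera punched out
                   of one character state overlay card
    """
    visible = set(master_list)
    for punched in overlays:
        # A genus shows through only if it is punched on this card too.
        visible &= punched
    return sorted(visible)


# Hypothetical family of ten genera, numbered 1 through 10.
master_list = range(1, 11)

# Hypothetical overlay cards: genera having each character state.
cards = {
    "leaves opposite": {1, 2, 5, 7},
    "flowers yellow": {2, 3, 5, 9},
    "fruit a capsule": {5, 8},
}

print(stack_overlays(master_list, [cards["leaves opposite"]]))
print(stack_overlays(master_list, [cards["leaves opposite"],
                                   cards["flowers yellow"]]))
print(stack_overlays(master_list, list(cards.values())))
```

Each added card can only shrink the visible list, so the order in which the cards are stacked does not affect the final result.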

0 = unknown
1 = true, or A
2 = variable
3 = false, or B
4 = inapplicable