Using the Soundex with Census Records
An index and filing system called the Soundex is the key to finding the names of individuals among the millions listed in the 1880, 1900, 1910, 1920, and, for some states, 1930 federal censuses. The Soundex indexes include heads of households and persons of different surnames in each household.
The Soundex indexes are coded surname (last name) indexes based on the progression of consonants rather than the spelling of the surname. This coding system was developed and implemented by the WPA in the 1930s for the Social Security Administration in response to that agency’s need to identify individuals who would be eligible to apply for old-age benefits. Because early birth records are unavailable in a number of states, the 1880 census manuscripts became the most dependable means of verifying dates of birth for people who would qualify—those born in the 1870s. Widespread misspelling caused so many problems in matching names, however, that the Soundex system was adopted. Because locating eligible Social Security beneficiaries was the sole reason for creating the 1880 Soundex, only households with children ten years of age or under were included in that index. All households were included in the Soundex indexes for the 1900, 1910, 1920 and 1930 censuses.
- 1 How the Soundex Works
- 1.1 Coding a Surname
- 1.2 Use of Zero in Coding Surnames
- 1.3 Names with Prefixes
- 1.4 Names with Adjacent Letters Having the Same Equivalent Number
- 1.5 Different Names within a Single Code
- 1.6 Alphabetical Arrangement of First or Given Names within the Code
- 1.7 Mixed Codes
- 1.8 Soundex Reference Guide
- 1.9 Soundex Abbreviations
- 1.10 Native Americans, Asians, and Nuns
- 2 Soundex Research Tips
How the Soundex Works
Soundex index entries are arranged on cards, first in Soundex code order and then alphabetically by first name of the head of household. For each person in the house, the Soundex card should show name, race, month and year of birth, age, citizenship status, place of residence by state and county, civil division, and, where appropriate for urban dwellers, the city name, house number, and street name. The cards also list the volume number, enumeration district number, and page and line numbers of the original schedules from which the information was taken.
Coding a Surname
To search for a name it is necessary to first determine its Soundex code. Every Soundex code consists of a letter and three numbers; for example, S655. The letter is always the first letter of the surname. The numbers are assigned according to the following Soundex coding guide:
- 1: B, P, F, V
- 2: C, S, K, G, J, Q, X, Z
- 3: D, T
- 4: L
- 5: M, N
- 6: R
The letters A, E, I, O, U, W, Y, and H are disregarded. Consonants that sound alike share the same code number.
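The coding guide can be expressed as a small lookup table. The following is a minimal Python sketch; the names `SOUNDEX_GUIDE`, `LETTER_TO_DIGIT`, and `digit_for` are illustrative only. L maps to 4 and R to 6, as in the worked examples Kelly (K400) and Pfeiffer (P160).

```python
# Illustrative lookup table for the Soundex coding guide.
# Each digit stands for a group of similar-sounding consonants.
SOUNDEX_GUIDE = {
    "1": "BPFV",
    "2": "CSKGJQXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}

# Invert the guide so each letter points at its digit.
LETTER_TO_DIGIT = {
    letter: digit
    for digit, letters in SOUNDEX_GUIDE.items()
    for letter in letters
}


def digit_for(letter):
    """Return the code digit for a letter, or None if the letter is disregarded."""
    return LETTER_TO_DIGIT.get(letter.upper())
```

The disregarded letters (A, E, I, O, U, W, Y, and H) are simply absent from the table, so looking them up returns `None`.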
Use of Zero in Coding Surnames
A surname that yields no code numbers, such as Lee, is L000; one yielding only one code number, such as Kuhne, takes two zeros and is coded as K500; and one yielding two code numbers takes just one zero; thus, Ebell is coded as E140. No more than three digits are ever used, so Ebelson would be coded as E142, not E1425.
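The zero-filling and three-digit limit can be sketched as a single step. In this hedged Python example, the hypothetical function `finish_code` takes the first letter of the surname and whatever code digits the name yielded, then pads or cuts the result to one letter plus three digits.

```python
def finish_code(first_letter, digits):
    """Pad with zeros, or cut extra digits, to get one letter plus three digits.

    `digits` is the string of code numbers the surname yielded (possibly empty).
    """
    return (first_letter.upper() + digits[:3]).ljust(4, "0")


# Examples taken from the text above:
print(finish_code("L", ""))      # Lee     -> L000
print(finish_code("K", "5"))     # Kuhne   -> K500
print(finish_code("E", "14"))    # Ebell   -> E140
print(finish_code("E", "1425"))  # Ebelson -> E142
```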
Names with Prefixes
Because the Soundex does not treat prefixes consistently, surnames beginning with, for example, Van, Vander, Von, De, Di, or Le may be listed with or without the prefix, making it necessary to search for both possibilities. Search for the surname van Devanter, for example, with (V531) and without (D153) the “van-” prefix. Mc- and Mac- are not considered prefixes.
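A search list covering both possibilities can be generated mechanically. The sketch below is an assumption-laden illustration, not an official procedure: `soundex` is a compact helper that collapses only literally adjacent letters with the same number, `PREFIXES` holds just the prefixes named above, and `codes_to_search` only recognizes a prefix written as a separate word.

```python
# Letter-to-digit table from the Soundex coding guide.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "BPFV", "2": "CSKGJQXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}

# Prefixes named in the text; Mc- and Mac- are deliberately left out.
PREFIXES = ("van", "vander", "von", "de", "di", "le")


def soundex(surname):
    """Hypothetical helper: code a surname, collapsing adjacent same-number letters."""
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits = []
    previous = CODES.get(letters[0])  # the first letter stands in for its own number
    for ch in letters[1:]:
        digit = CODES.get(ch)  # None for disregarded letters
        if digit is not None and digit != previous:
            digits.append(digit)
        previous = digit
    return (letters[0] + "".join(digits[:3])).ljust(4, "0")


def codes_to_search(surname):
    """Return the code with the prefix and, if a listed prefix is present, without it."""
    parts = surname.split()
    codes = [soundex("".join(parts))]
    if len(parts) > 1 and parts[0].lower() in PREFIXES:
        bare = soundex("".join(parts[1:]))
        if bare not in codes:
            codes.append(bare)
    return codes


print(codes_to_search("van Devanter"))  # ['V531', 'D153']
```

Surnames in which the prefix is fused to the rest of the name would need a separate check that this sketch does not attempt.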
Names with Adjacent Letters Having the Same Equivalent Number
When two or more key letters or equivalents appear together (adjacent) they are coded as one letter with a single number. Thus a double “f” takes a single code (1). This rule is also followed in surnames when the first two letters have the same number equivalent. Pfeiffer, for example, is coded P160. Because “P” is the first letter of the surname, it is used (P---). The next letter, “f”, carries the same code (1) as does its equivalent “p” so is disregarded (the “P” takes the place of a 1). The vowels “e” and “i” are disregarded. Next is a double appearance of the letter “f” which is coded as 1 (the second “f” is disregarded). The vowel “e” is disregarded. The letter “r” is represented by 6, and in the absence of additional consonants, the code is rounded off with a zero. Other examples of double-letter names are Lennon (L550), Kelly (K400), Buerck (B620), Lloyd (L300), Schaefer (S160), Szucs (S200), and Orricks (O620). Occasionally the indexers themselves made mistakes in coding names, so it may be useful to try alternate codes based on possible errors. Also be aware that some immigrants with difficult last names may have been Soundexed under their first name; these names would then be listed alphabetically by last name.
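The adjacent-letter rule, together with the coding guide and zero-filling, can be put together in a short function. This is a minimal Python sketch under stated assumptions: the name `soundex` is illustrative, and the function collapses only letters that are literally adjacent, as the rule is described above. It applies no special handling to H or W standing between two consonants with the same number, which some formulations of the rule include.

```python
# Letter-to-digit table from the Soundex coding guide.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "BPFV", "2": "CSKGJQXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(surname):
    """Code a surname as its first letter plus three digits (a sketch, not an official tool)."""
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits = []
    # The first letter is kept as a letter, but its number still counts
    # when deciding whether the second letter is a repeat (Pfeiffer, Lloyd).
    previous = CODES.get(letters[0])
    for ch in letters[1:]:
        digit = CODES.get(ch)  # None for A, E, I, O, U, W, Y, H
        if digit is not None and digit != previous:
            digits.append(digit)
        # A disregarded letter resets `previous`, so the two n's around
        # the "o" in Lennon are each coded.
        previous = digit
    # Keep at most three digits and fill any gap with zeros.
    return (letters[0] + "".join(digits[:3])).ljust(4, "0")


for name in ["Pfeiffer", "Lennon", "Kelly", "Buerck", "Lloyd", "Schaefer", "Szucs", "Orricks"]:
    print(name, soundex(name))
```

Run on the double-letter examples above, the sketch reproduces the codes given in the text (P160, L550, K400, B620, L300, S160, S200, and O620).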
Different Names within a Single Code
With this indexing formula, many different surnames may be included within the same Soundex code. For example, the similar-sounding surnames Scherman, Schurman, Sherman, Shireman, and Shurman are indexed together as S655 and will appear in the same group with other surnames, such as Sauerman or Sermon. Names that do not sound alike may also be included within a single code: Sinclair, Singler, Snegolski, Snuckel, Sanislo, San Miguel, Sungaila, and Szmegalski are all coded as S524.
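Grouping a list of surnames by code shows how both similar and dissimilar names end up filed together. The following Python sketch is illustrative: `soundex` is a compact helper that collapses only literally adjacent letters with the same number, and `group_by_code` is a hypothetical name.

```python
from collections import defaultdict

# Letter-to-digit table from the Soundex coding guide.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "BPFV", "2": "CSKGJQXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(surname):
    """Hypothetical helper: code a surname, collapsing adjacent same-number letters."""
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits = []
    previous = CODES.get(letters[0])
    for ch in letters[1:]:
        digit = CODES.get(ch)  # None for disregarded letters
        if digit is not None and digit != previous:
            digits.append(digit)
        previous = digit
    return (letters[0] + "".join(digits[:3])).ljust(4, "0")


def group_by_code(surnames):
    """Map each Soundex code to the surnames that share it, in input order."""
    groups = defaultdict(list)
    for name in surnames:
        groups[soundex(name)].append(name)
    return dict(groups)


names = [
    "Scherman", "Schurman", "Sherman", "Shireman", "Shurman", "Sauerman", "Sermon",
    "Sinclair", "Singler", "Snegolski", "Snuckel", "Sanislo", "San Miguel",
    "Sungaila", "Szmegalski",
]
for code, members in group_by_code(names).items():
    print(code, members)
```

With the surnames listed above, the sketch produces two groups, S655 and S524.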
Alphabetical Arrangement of First or Given Names within the Code
As described earlier, multiple surnames appear within most Soundex codes. Within each Soundex code, the individual and family cards are arranged alphabetically by given name. Marked divider cards separate most Soundex codes. When searching, also check for known nicknames, middle names, and abbreviations of the first name.
Mixed Codes
Divider cards show most code numbers, but not all. For instance, one divider may be numbered 350 and the next one 400. Between the two divider cards there may be names coded 353, 350, 360, 365, and 355, but instead of being in numerical order, they are interfiled alphabetically by given name.
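The interfiling described here can be modeled as two steps: assign each card to the nearest divider at or below its code number, then sort by given name within that divider. The Python sketch below is an illustration only; `file_cards` is a hypothetical name, the cards are reduced to a code number and a given name, and the given names are invented for the example.

```python
from bisect import bisect_right


def file_cards(cards, dividers):
    """Place cards behind divider cards, then sort each section by given name.

    cards:    list of (code_number, given_name) tuples
    dividers: sorted list of the code numbers printed on divider cards
    """
    sections = {divider: [] for divider in dividers}
    for number, given_name in cards:
        # Find the last divider whose number is not greater than the card's.
        index = bisect_right(dividers, number) - 1
        if index < 0:
            raise ValueError(f"no divider at or below {number}")
        sections[dividers[index]].append((number, given_name))
    for section in sections.values():
        # Within a section the code number is ignored; only the given name counts.
        section.sort(key=lambda card: card[1].lower())
    return sections


# Invented given names attached to the code numbers from the text.
cards = [(353, "Walter"), (350, "Anna"), (360, "Henry"), (365, "Bertha"), (355, "Mary")]
for number, given_name in file_cards(cards, [350, 400])[350]:
    print(given_name, number)
```

In practice this means a researcher looking for code 360 should scan the entire section behind divider 350 by given name, not look for a block of 360 cards.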
Soundex Reference Guide
For those who are unsure of their Soundex skills, most genealogical software programs and many genealogy websites include a Soundex Calculator. Also, most genealogical libraries have a copy of Bradley W. Steuart’s The Soundex Reference Guide: Soundex Codes to Over 125,000 Surnames.
Soundex Abbreviations
In addition to the letter-numerical codes, Soundex also uses a number of abbreviations, most of which relate to residents’ relationships to the head of the household (see table 5-3). NR (not recorded) is a frequently found abbreviation.
Native Americans, Asians, and Nuns
Names of nuns, Native Americans, and Asians pose special problems. Phonetically spelled Asian and Native American names were either coded as one continuous name or by what seemed to be a surname. For example, the Native American name Shinka-Wa-Sa may have been coded as Shinka (S520) or Sa (S000). Nuns were coded as if “Sister” were the surname, and they appear in each state’s Soundex under the code S236, but not necessarily in alphabetical order.
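Because a multi-part name may have been coded in more than one way, it can help to list every plausible code before searching. The Python sketch below is a hedged illustration: `soundex` is a compact helper that collapses only literally adjacent letters with the same number, and `candidate_codes` is a hypothetical function that codes the name as one continuous string and then codes each part separately. Trying every part, not just the first and last, is an assumption of this sketch.

```python
import re

# Letter-to-digit table from the Soundex coding guide.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "BPFV", "2": "CSKGJQXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(surname):
    """Hypothetical helper: code a surname, collapsing adjacent same-number letters."""
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits = []
    previous = CODES.get(letters[0])
    for ch in letters[1:]:
        digit = CODES.get(ch)  # None for disregarded letters
        if digit is not None and digit != previous:
            digits.append(digit)
        previous = digit
    return (letters[0] + "".join(digits[:3])).ljust(4, "0")


def candidate_codes(name):
    """List codes for the whole name run together and for each part on its own."""
    parts = [part for part in re.split(r"[-\s]+", name) if part]
    candidates = [soundex("".join(parts))] + [soundex(part) for part in parts]
    unique = []
    for code in candidates:
        if code not in unique:  # keep first occurrence, preserve order
            unique.append(code)
    return unique


print(candidate_codes("Shinka-Wa-Sa"))  # ['S522', 'S520', 'W000', 'S000']
print(soundex("Sister"))                # S236
```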
Soundex Research Tips
The Soundex indexes can be especially useful in identifying family units, because all members of the household are listed on the Soundex cards under the name of the head of the household. Often, census searches begin with only a surname and the name of the state in which a person or family lived in a given census year. In such cases, the Soundex can be a means of determining surname distribution throughout the state, and a search can often be narrowed to a smaller geographic area within it. Once the county of origin is determined through census work, whole new paths of research open up. The Soundex can also be used to locate orphaned children living with persons of other surnames and to identify families with grandparents living under the same roof. These relatives are sometimes listed on the Soundex card for the household, even though they may not be indexed separately.