Editor's note: Readers unfamiliar with the basics of Boolean operators (using
ANDs and ORs) should refer to Mr. Neill's previous article, which provides a basic
overview of these terms and their uses.
The Internet contains a vast amount of information. One significant
difficulty is locating Web sites that might be useful in researching a
specific genealogical problem. Lists of links are one way to deal with
this problem. However, they have several drawbacks:
1) The links might be outdated.
2) I must understand the list's organizational structure.
3) The list might not be complete or contain the site I want or need.
4) The list of links will not include every surname that appears on a
given Web page or site.
This is not a criticism of lists of links, but these drawbacks limit
how effectively any set of links can be used. Search engines take a
much broader approach to finding Web pages and are not limited to
those pages that have been categorized on a list of links.
However, search engines have their own limitations, including:
1) They don't contain every page on the Internet.
2) They contain hundreds of millions of pages, so useful pages can be
buried among irrelevant ones.
3) Query boards are frequently not searched by search engines.
4) Online databases are not searched by search engines.
5) I might have wrong information that hinders my ability to locate the
page I need, including:
a) Incorrect spelling of surname
b) Incorrect location(s)
c) Incorrect dates of vital events
Some of these limitations can be hurdled more easily than others. It
should be remembered that not every page containing genealogical
information is included in the search engines.
The extended example that follows concentrates on a German immigrant,
Peter Bieger. Peter was born in Germany in the 1830s and emigrated to
Warsaw, Hancock County, Illinois. His wife's name was Barbara. As Bieger
is not a common name, it might be best to begin with a simple search
such as:
peter AND bieger
This search assumes the Web page will contain the spelling "bieger."
This spelling may be incorrect, or may be one of many variants.
Alternate spellings could be incorporated into the search:
peter AND (bieger OR bickert OR berger OR beger)
All the variant spellings are placed in one set of parentheses,
replacing the single spelling of the surname. Alternate spellings are
connected with ORs and not ANDs, as one page may not contain all
possible surname variants.
(It should be noted that there are other possible spelling variants
which are omitted in the interest of brevity.)
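For readers who keep their variant lists in a file, the same search string can be assembled by a short script instead of typed by hand. The following is only a sketch in Python; the helper names or_group and and_query are made up for this illustration and are not part of any search engine.

```python
def or_group(terms):
    """Join alternatives with OR; add parentheses when there is more than one."""
    terms = [t.strip().lower() for t in terms if t.strip()]
    if not terms:
        raise ValueError("an OR group needs at least one term")
    if len(terms) == 1:
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


def and_query(*groups):
    """Connect each group of alternatives with AND."""
    return " AND ".join(or_group(group) for group in groups)


# One list per piece of information; each list holds the known variants.
query = and_query(["peter"], ["bieger", "bickert", "berger", "beger"])
print(query)
```

Adding another known fact (a birthplace, for instance) is then a matter of passing one more list to and_query.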
If a search returns too many hits, it may be necessary to refine the
search further (this would be especially true if the names were
extremely common). The search could be refined by using other known
information about the subject, such as Peter's birthplace.
peter AND (bieger OR bickert OR berger OR beger) AND germany
Or the state of Illinois could be used:
peter AND (bieger OR bickert OR berger OR beger) AND illinois
There are potential pitfalls to using a location in order to refine the
search. One is that a Web page containing information on Peter might not
mention the word "Germany" or "Illinois." A more serious problem is
that the location may be abbreviated, "Ger" for Germany and "Ill,"
"Ills.," or "IL" for Illinois (while Ills. is generally no longer used
as an abbreviation for Illinois, it may appear in the transcription of
an original document). These abbreviations can also be incorporated into
the search:
peter AND (bieger OR bickert OR berger OR beger) AND (germany OR ger)
Again, parentheses are added around the last portion because we are
replacing the search for Germany with the broader search "germany OR
ger."
It may be necessary to perform a similar search involving Illinois:
peter AND (bieger OR bickert OR berger OR beger) AND (illinois OR ill OR
ills)
A search can also be conducted that combines both locations:
peter AND (bieger OR bickert OR berger OR beger) AND ( (germany OR ger)
AND (illinois OR il) )
It may help to analyze this search in order to understand it fully. We
can think of the search as being conducted in several parts:
- Searching for peter
- Searching for bieger OR bickert OR berger OR beger
- Searching for germany OR ger
- Searching for illinois OR il
The last two searches are grouped together by an outer set of
parentheses in the original search term, beginning with the second "("
and ending with the final ")". This group is referred to below as
Pot 3. The way this search is constructed, we can think of three large
pots:
- Pot 1 is pages that contain Peter
- Pot 2 is pages that contain bieger OR bickert OR berger OR beger
- Pot 3 is pages that contain (germany OR ger) AND (illinois OR il)
All three pots are connected with ANDs. This means the only pages that
will appear in the final pot are those pages that appear in Pots 1, 2,
and 3. Pots 1 and 2 are fairly straightforward. A closer look at Pot 3
is in order.
Pot 3 is a more complex search, which can be thought of in two parts.
Part 1 locates those pages that contain "Germany" or "ger." Part 2
locates those pages that contain "illinois" or "il." Parts 1 and 2 are
connected with an AND, which means that only pages that appear in both
Part 1 and in Part 2 will appear in the combination, which has been
termed Pot 3.
The previous search requires that both Germany and Illinois (or one of
the variant spellings) appear on the Web page. It would be reasonable to
modify the search so that only one of the locations needed to be on the
page. This could be done by replacing the final AND with an OR,
obtaining:
peter AND (bieger OR bickert OR berger OR beger) AND ( (germany OR ger)
OR (illinois OR il) )
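The pots can be modeled as sets to see how the AND and OR versions of the location search differ. This is only a sketch in Python: the four sample pages are invented for illustration, and a real search engine indexes and matches pages in its own way.

```python
# Hypothetical sample pages, invented purely for illustration.
pages = {
    "page_a": "Peter Bieger was born in Germany and settled in Warsaw, Illinois",
    "page_b": "Peter Berger of Ohio",
    "page_c": "Bieger family of Hancock County, IL",
    "page_d": "Peter Bickert emigrated from Germany",
}


def words(text):
    """Lowercase the text and split it into a set of words, dropping punctuation."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return set(cleaned.split())


def pot(*terms):
    """Names of the pages that contain at least one of the terms (an OR)."""
    return {name for name, text in pages.items() if words(text) & set(terms)}


pot1 = pot("peter")
pot2 = pot("bieger", "bickert", "berger", "beger")
part1 = pot("germany", "ger")
part2 = pot("illinois", "il")

# AND keeps only pages found in every pot; "&" is set intersection.
both_locations = pot1 & pot2 & (part1 & part2)
# Replacing the final AND with OR; "|" is set union.
either_location = pot1 & pot2 & (part1 | part2)

print(sorted(both_locations))
print(sorted(either_location))
```

With these sample pages the AND version keeps only the page that mentions both places, while the OR version also keeps the page that mentions Germany alone.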
Using Counties and Other Localities
A search that uses only states and countries as localities may still
result in a large number of matches, especially if the first and last
names being used are common. Using more specific geographic information
will narrow your search, and should only be done if broader searches
produce too many results to review effectively.
In this case the names are unusual enough that using just the county
name is not a problem. However, there is more than one Hancock County
in the United States, and this may result in hits outside the area of
research. A more focused search would be:
peter AND (bieger OR bickert OR berger OR beger) AND ( hancock AND
(illinois OR il) )
Variant Spellings
Variant spellings can always be included in your search. If I'm
searching for:
john AND trautvetter
I can replace the search with:
john AND (trautvetter OR troutfetter OR trautfetter OR trantvetter)
Including variant spellings is easy. Replace the original word with a
set of parentheses that contains the variant spellings connected with
OR. Don't use AND, or the search will require all the variant spellings
to be on the same page.
Nicknames present a similar problem to alternate spellings and location
abbreviations. They can be dealt with in a similar manner.
A search for:
elizabeth AND rampley
can be more effectively entered as:
(elizabeth OR betsy OR beth OR eliza) AND rampley
Why Not Use Genealogy?
You can use the word "genealogy" as a part of your search (by adding
"AND genealogy" to your search phrase). However, not all pages that
have genealogical information contain the word "genealogy." The word
genealogy can also be misspelled in one of several ways ("geneology"
being the most prevalent).
It might seem easier to just search for the phrase "Peter Bieger."
However, a phrase search of this type will not catch references where
the word "Peter" is not directly in front of the word "Bieger." The
page will not be returned as a hit if the name appears as "Bieger,
Peter" (as it might in an index); with a middle name between "Peter"
and "Bieger" (as it might if someone knows his middle name); or in a
phrase similar to "the first Bieger ancestor was named Peter." Chances
are you do not want to miss those references.
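The difference between a phrase search and an AND search can be tried out on a few lines of text. This is a sketch in Python under stated assumptions: the snippets are invented, the middle name "Johann" is purely hypothetical, and real search engines may treat punctuation and word boundaries differently.

```python
import re

# Hypothetical snippets of page text, invented for illustration.
# The middle name "Johann" is made up and does not come from any record.
snippets = [
    "Peter Bieger emigrated in the 1850s",
    "Bieger, Peter",
    "Peter Johann Bieger",
    "the first Bieger ancestor was named Peter",
]


def phrase_match(text, phrase):
    """True when the exact phrase occurs in the text (ignoring case)."""
    return phrase.lower() in text.lower()


def and_match(text, *terms):
    """True when every term occurs somewhere in the text as a whole word."""
    found = set(re.findall(r"[a-z]+", text.lower()))
    return all(term.lower() in found for term in terms)


phrase_hits = [s for s in snippets if phrase_match(s, "Peter Bieger")]
and_hits = [s for s in snippets if and_match(s, "peter", "bieger")]

print(len(phrase_hits), "of", len(snippets), "match the phrase")
print(len(and_hits), "of", len(snippets), "match peter AND bieger")
```

Only the first snippet contains the exact phrase, while all four contain both words somewhere.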
Proximity Operators
Some search engines allow the use of the NEAR operator in addition to
ANDs and ORs. NEAR functions similarly to an AND, but the difference is
that the words on either side of the NEAR must be within a certain
number of words of each other (a "word" is generally defined to be a
series of characters not separated by a space). Some search engines
allow the user to enter a number to indicate just how "near" and others
have only one setting. Users should read the help pages for the specific
search engine they are using to determine if the NEAR operator can be
used (not all sites support it) and how to specify the word distance. If
the distance is not specified, a default value will be used (generally
ten).
The search for Peter Bieger can be refined using NEAR, as in:
peter NEAR bieger
This search would return those pages where the words "peter" and
"bieger" are near each other (how "near" depends upon the search
engine). The advantage to using NEAR is that the researcher may not be
interested in those pages where "peter" and "bieger" are 1200 words
apart. Use of NEAR may be especially desired when searching for common
first names or surnames.
If the researcher were looking for Web pages on Hancock County,
Illinois, a search could be entered using the NEAR operator as:
hancock NEAR illinois
The following phrases (among others) would be located with this search:
"Hancock County, Illinois"
"Town, Hancock, Illinois"
"In Illinois, Hancock County I think"
"State of Illinois, County of Hancock"
and similar phrases where Hancock and Illinois are within ten words of
each other.
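What NEAR checks can be sketched in a few lines of Python. This is an illustration only: the function name near is made up, the default distance of ten simply follows the figure mentioned above, and each search engine counts words and measures distance in its own way.

```python
import re


def near(text, first, second, distance=10):
    """True when the two words occur within `distance` words of each other."""
    tokens = re.findall(r"[a-z]+", text.lower())
    first_positions = [i for i, word in enumerate(tokens) if word == first.lower()]
    second_positions = [i for i, word in enumerate(tokens) if word == second.lower()]
    return any(abs(i - j) <= distance
               for i in first_positions
               for j in second_positions)


phrases = [
    "Hancock County, Illinois",
    "Town, Hancock, Illinois",
    "In Illinois, Hancock County I think",
    "State of Illinois, County of Hancock",
]
for phrase in phrases:
    print(near(phrase, "hancock", "illinois"), "-", phrase)

# Fifty filler words between the two terms puts them too far apart.
far_apart = "hancock " + "filler " * 50 + "illinois"
print(near(far_apart, "hancock", "illinois"))
```

The order of the two words does not matter in this sketch, which is why "In Illinois, Hancock County" is found as readily as "Hancock County, Illinois."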
But I Don't Want To Type All Those Searches!
You don't have to. Use the power of your computer. Type the searches
into your word processor and then simply cut and paste them from that
program into the search box at the search engine's Web site. Once you have
the searches entered in your word processor you can use them in whatever
search engines you are using (assuming they support Boolean searches and
use of the word NEAR). I would not enter the same search in fifty search
engines. Using two or three of the major ones should catch the majority
of pages. Save the searches so that you can run them again a few weeks
or months later to look for new pages. Somewhere in the document that
contains the text of your searches, record the search
engine's name and URL and the date you performed the searches. Remember,
it's just as important to track online research as it is to track
offline research.
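A word processor document works well for this record keeping. For those who prefer a spreadsheet-style log, the same information can be kept as rows of date, engine, URL, and search. The Python sketch below is one possible layout, not a required one; the function name log_search and the date shown are invented for the example.

```python
import csv
import io
from datetime import date


def log_search(writer, engine, url, query, when=None):
    """Write one row recording which search was run, where, and when."""
    when = when or date.today()
    writer.writerow([when.isoformat(), engine, url, query])


# A StringIO buffer stands in for a real file here; a file opened with
# open("searches.csv", "a", newline="") could be used the same way.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["date", "engine", "url", "query"])

# The date below is hypothetical, chosen only for the example.
log_search(writer, "Altavista", "http://www.altavista.com",
           "peter AND (bieger OR bickert OR berger OR beger)",
           when=date(2000, 1, 15))

print(buffer.getvalue())
```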
Avoid Overly Complex Searches
It's possible to create searches more complicated than the ones used
here. However, the more complex your search, the greater the chance
that it might not search in exactly the way you think it will
(especially if several sets of nested parentheses are used). If you
aren't certain how the search will be conducted, you should not use it.
Not knowing what you are searching for is not effective and is not good
genealogy.
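One easy mistake in a longer search is a missing or extra parenthesis. A saved search can be checked before it is pasted into a search engine. The following Python sketch uses a made-up function name, check_parentheses, and reports whether the parentheses pair up and how deeply they are nested; it says nothing about whether the search itself is sensible.

```python
def check_parentheses(query):
    """Return (balanced, deepest_nesting) for a search string."""
    depth = 0
    deepest = 0
    for ch in query:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ")" appeared before its matching "(".
                return False, deepest
    return depth == 0, deepest


good = ("peter AND (bieger OR bickert OR berger OR beger) AND "
        "( (germany OR ger) AND (illinois OR il) )")
bad = "peter AND (bieger OR beger) AND ( hancock AND (illinois OR il)"

print(check_parentheses(good))
print(check_parentheses(bad))
```

A search that reports a nesting depth of three or more may be worth splitting into simpler searches.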
Search Engines
Altavista
http://www.altavista.com (use the advanced search
feature, which supports Boolean searches)
Hotbot
http://www.hotbot.com
Metacrawler
http://www.metacrawler.com
Good Luck!
Michael John Neill is the Course I Coordinator at the Genealogical
Institute of Mid America (GIMA) held annually in Springfield, Illinois,
and is also on the faculty of Carl Sandburg College in Galesburg,
Illinois. Michael is the education columnist for the FGS FORUM and is on
the editorial board of the Illinois State Genealogical Society
Quarterly. He conducts seminars and lectures on a wide variety of
genealogical and computer topics and contributes to several genealogical
publications, including Ancestry and Genealogical Computing.