Advanced Database Searching
When searching through the thousands of literature resources available online and in subject databases, it often takes some skill to get exactly the results you are looking for. There are a number of simple techniques that can be used to broaden or narrow the results of a search, but you must learn to use them effectively. By understanding how databases use controlled vocabularies and learning to use the tools for combining search terms, you can drastically increase the efficiency of your literature search. The following are very common and very simple ways to increase the effectiveness of your search:
- Phrase searching using quotation marks
- Boolean operators AND, OR, and NOT
- Controlled vocabulary for precise subject headings
- Proximity searching to find a word close to another word in the text
- Combining all of the above into precise search strings
Database searching is very often an iterative process, whereby a user may create a search, view the results, and then go back and reformulate a search. By using these techniques regularly, and reformulating your search in different ways, you can quickly develop the skill you need to effectively and thoroughly conduct your literature search.
Phrase searching is one of the simplest, yet most effective, techniques for getting the search results you want. When you enter a string of keywords into a search, these words can co-occur anywhere in the text being searched. However, by placing a phrase within quotation marks, you restrict your search to include that phrase exactly as you enter it. If you are looking for a very narrow topic or even a title or other key phrase, using quotation marks can efficiently restrict your search to yield more relevant results.
"hearing aids" - as a phrase search returns more specific results for auditory assistive devices than a keyword search without the quotes. The latter might also return results about "a UN hearing on the AIDS epidemic".
"speech impediments" - as a phrase is much more specific than a keyword search, which might return results about "a congressional speech on impediments to health care".
Be very careful with phrase searching in quotes, since you can easily overlimit by using unusual or lengthy phrases. The longer a phrase is, the less likely it is there is a result that will contain that exact wording. If you find you are receiving too few results, try shortening the search phrase appearing in quotation marks or try using keywords without quotes.
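The difference between a keyword search and a phrase search can be sketched in a few lines of Python. This is only an illustration of the matching logic on two made-up titles, not the way any particular database is implemented; the function names `keyword_match` and `phrase_match` are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def keyword_match(query, text):
    """True if every query word occurs somewhere in the text, in any order."""
    words = set(tokenize(text))
    return all(term in words for term in tokenize(query))

def phrase_match(phrase, text):
    """True only if the words of the phrase occur consecutively and in order."""
    needle = tokenize(phrase)
    haystack = tokenize(text)
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))

# Hypothetical titles, modeled on the examples in the text
titles = [
    "Fitting hearing aids in older adults",
    "A UN hearing on the AIDS epidemic",
]

keyword_hits = [t for t in titles if keyword_match("hearing aids", t)]
phrase_hits = [t for t in titles if phrase_match("hearing aids", t)]
```

Here the keyword search matches both titles, while the phrase search keeps only the title about auditory devices.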
Wildcard searching refers to the use of a symbol to stand for a letter or series of letters in a search string. Most databases use either an asterisk (*) or a question mark (?) to accomplish this function. This technique can effectively broaden a search and can be used to capture variant spellings or different endings to a word root.
For example, if you wanted to perform a single search that returned articles containing either of the two common spellings of the word "pediatric" (pediatric and paediatric), you could use a wildcard. By entering "p?ediatric" in your search box, you would be given results containing either of these two spellings.
- wom?n = women, woman
- orthop?edic = orthopedic, orthopaedic
Some resources use different symbols for wildcard searching. Commonly used symbols include !, *, #, and ?. Look to the help page of each resource for the symbol it uses and for specifics on how to do this type of searching in that database.
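As a rough sketch, a wildcard pattern can be thought of as a small template that is checked against each word. The Python below assumes that "?" stands for zero or one letter, which is what makes both "p?ediatric" and "wom?n" behave as described above; many databases define the symbol differently (for example, exactly one character), so treat this rule and the helper names as assumptions.

```python
import re

def wildcard_to_regex(pattern):
    """Compile a search pattern in which '?' stands for zero or one letter (assumed rule)."""
    parts = [r"[a-z]?" if ch == "?" else re.escape(ch) for ch in pattern.lower()]
    return re.compile("".join(parts))

def wildcard_match(pattern, word):
    """True if the whole word fits the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word.lower()) is not None

words = ["pediatric", "paediatric", "podiatric", "woman", "women", "orthopaedic"]

matches_pediatric = [w for w in words if wildcard_match("p?ediatric", w)]
matches_woman = [w for w in words if wildcard_match("wom?n", w)]
```

With this word list, "p?ediatric" picks up both spellings of pediatric but not "podiatric", and "wom?n" picks up "woman" and "women".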
Truncation functions similarly, but is used to capture multiple endings, often lengthy ones, on a root word. Also called "stemming", since this technique typically attaches a symbol to the end of a root stem, this method can also broaden your search to include results with many variants of a word. The same symbols are used here as with wildcard searching (!, ?, #, *), though "*" is the most common.
For example, searching for "genet*" lets the system fill in the ending of the word with any combination of text, so this would give results including genetic, genetics, genetically, etc.
- child* = child, children, child's, children's, childhood
- rehabilitat* = rehabilitate, rehabilitates, rehabilitating, rehabilitation, rehabilitative
- surg* = surgery, surgical, surgeon, surgeons
Be careful not to reduce the word too far or you may return an enormous number of results. In the above example of genetic, etc., truncating your search string all the way down to "ge*" would yield millions of results, since it would match any result containing a word starting with the letters "ge".
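Truncation amounts to a prefix test: everything before the symbol must match, and anything may follow. A minimal Python sketch, assuming "*" as the truncation symbol and using a small made-up word list, shows how quickly a short stem over-matches.

```python
def truncation_match(pattern, word):
    """True if the word starts with the stem that precedes the trailing '*'."""
    if not pattern.endswith("*"):
        # No truncation symbol: require the exact word
        return pattern.lower() == word.lower()
    stem = pattern[:-1].lower()
    return word.lower().startswith(stem)

# Hypothetical vocabulary for illustration
vocabulary = ["genetic", "genetics", "genetically", "general", "geology", "gene", "surgeon"]

narrow = [w for w in vocabulary if truncation_match("genet*", w)]
broad = [w for w in vocabulary if truncation_match("ge*", w)]
```

The stem "genet*" keeps only the three genetics-related words, while "ge*" also pulls in "general", "geology", and "gene".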
Most database companies offer a help page with specifics on the symbols used and how to perform these functions in a given database. You can either consult the help page or ask your librarian for additional help.
Boolean operators, named for the mathematician George Boole, are simple connector words that can be used to combine search terms when querying a database. By combining search terms, users can either expand or limit their results by being very explicit about combinations of terms in their results. AND and NOT operators will more tightly constrain a search, yielding fewer results, while OR will broaden a search to include even more results.
Using the search operator AND limits results by requiring that they contain both terms specified. This is a good way to narrow your search when you are retrieving too many hits. The more terms you link up with AND, the more specific the search becomes, and the fewer results will be retrieved.
- knee AND surgery
- exercise AND "physical therapy"
- nursing AND pain AND drugs
Be careful not to overlimit with AND or eventually you get zero results, since the search can be so specific that nothing matches the search string.
Summary of Boolean AND:
- Used to NARROW a search to fewer results
- Results must contain all terms
- Default function in most databases
- May use the symbol "+" in place of AND
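One way to picture AND is as the intersection of sets: each term has a set of matching records, and only records in every set survive. The sketch below builds a tiny word index over three made-up records; the record titles and the helper names `build_index` and `search_and` are assumptions for illustration, not part of any database's interface.

```python
import re

# Hypothetical records: id -> title
records = {
    1: "Knee surgery outcomes in athletes",
    2: "Knee pain and exercise",
    3: "Cardiac surgery recovery",
}

def build_index(docs):
    """Map each lowercase word to the set of record ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index.setdefault(word, set()).add(doc_id)
    return index

def search_and(index, *terms):
    """Intersect the id sets: a record must contain every term."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()

index = build_index(records)
knee_and_surgery = search_and(index, "knee", "surgery")
overlimited = search_and(index, "knee", "surgery", "exercise")
```

In this sample, "knee AND surgery" leaves a single record, and adding a third term with AND leaves none, which is the overlimiting described above.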
The OR connector can be used to broaden results using multiple terms. This type of search will return results containing any one of the specified terms. This can be very useful for capturing variant spellings of a word or for searching on synonyms (words with similar meaning). The OR operator expands your search since an item can contain either one term or the other.
- pediatric OR paediatric
- psychiatry OR psychology
- college OR university
Using OR can quickly increase the number of results as you add terms. By adding enough terms or by using very common words, you might even retrieve every item in the database!
Summary of Boolean OR:
- Used to BROADEN the results of a search
- Combines multiple similar concepts in a single search
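In set terms, OR is a union: a record is kept if it matches at least one of the terms. A minimal Python sketch over three made-up records (the titles and the `search_or` helper are hypothetical):

```python
import re

# Hypothetical records: id -> title
records = {
    1: "Pediatric asthma management",
    2: "Paediatric nursing in rural clinics",
    3: "Adult asthma management",
}

def contains(text, term):
    """True if the term occurs as a whole word in the text (case-insensitive)."""
    return term.lower() in re.findall(r"[a-z]+", text.lower())

def search_or(docs, *terms):
    """Return ids of records that contain at least one of the terms."""
    return {doc_id for doc_id, text in docs.items()
            if any(contains(text, term) for term in terms)}

one_spelling = search_or(records, "pediatric")
either_spelling = search_or(records, "pediatric", "paediatric")
```

Searching on one spelling finds one record; adding the variant spelling with OR finds both.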
The NOT connector can be used to exclude results containing certain terms. This can be useful for limiting results by removing items that might be matching results that are not wanted. For example, if I were searching for articles about the game pool, that is billiards, I might create this search: "pool NOT swimming". This would help to exclude articles about swimming pools, but retain the rest of the results, most of which would presumably be about the game.
Using NOT can also help to carve out specific areas of a very broad field by removing the inapplicable areas.
- nursing NOT surgical
- "pain management" NOT drugs
It is VERY easy to create a search that gives zero results using NOT. Either choosing too many words or choosing a very common word can quickly reduce the number of results.
Summary of Boolean NOT:
- Used to NARROW the results of a search
- Excludes items containing specified words
- May be expressed with the symbol "-" in some databases
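NOT can be pictured as removing one set of records from another. The sketch below applies "pool NOT swimming" to three made-up titles; the data and the `search_not` helper are assumptions for illustration only.

```python
import re

# Hypothetical records: id -> title
records = {
    1: "Pool strategy for beginners: bank shots and breaks",
    2: "Swimming pool maintenance",
    3: "Billiards and pool halls of Chicago",
}

def words(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def search_not(docs, include, exclude):
    """Keep records that contain `include` but do not contain `exclude`."""
    return {doc_id for doc_id, text in docs.items()
            if include.lower() in words(text) and exclude.lower() not in words(text)}

pool_only = {doc_id for doc_id, text in records.items() if "pool" in words(text)}
pool_not_swimming = search_not(records, "pool", "swimming")
```

All three records mention "pool", but the record about swimming pools is dropped once NOT is applied.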
Using proximity operators when searching a database allows a user to locate results where words occur near each other. Typically, this is done by searching for words within X number of words of another word in a phrase or sentence. This is a great technique for expanding simple phrases into variants of that phrase to include a larger number of results. It also allows for more specificity than Boolean operators like AND, since the words must not only co-occur in the same text, but must appear near each other in the text.
The "Near Operator" can be used to locate words that occur near each other. The user can choose how close the specified words should appear using a number. This is usually specified using an "n", but sometimes requires "n/" followed by a number. If no number is used, the system will search for words immediately adjacent to one another.
- therapy n3 occupational (retrieves "occupational therapy" but also "therapy in an occupational context", since "occupational" occurs within 3 words of "therapy")
- limb n4 pain (returns results matching "limb pain" but also the phrase "pain in the upper limb")
Note: The near operator does not require the words to occur in any particular order; either word may come before the other.
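The near operator can be sketched by comparing word positions. The Python below assumes that "within n words" means the two words sit at most n positions apart, which is consistent with the examples above; databases count the distance in slightly different ways, so this rule and the function name `near` are assumptions.

```python
import re

def positions(word, tokens):
    """Return every index at which the word occurs in the token list."""
    return [i for i, tok in enumerate(tokens) if tok == word]

def near(first, second, n, text):
    """True if the two words occur at most n positions apart, in either order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(abs(i - j) <= n
               for i in positions(first.lower(), tokens)
               for j in positions(second.lower(), tokens))

adjacent = near("therapy", "occupational", 3, "occupational therapy")
spread_out = near("therapy", "occupational", 3, "therapy in an occupational context")
reversed_order = near("limb", "pain", 4, "pain in the upper limb")
too_far = near("limb", "pain", 4, "pain that radiates from the lower back to the limb")
```

The first three checks succeed regardless of word order; the last fails because the two words are more than four positions apart.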
The "Within Operator" can also be used to locate words that occur near each other. Again the user specifies a number to indicate how closely the words may occur. This usually is done by using the letter "w", though sometimes "w/" is needed, followed by a number.
- Hillary w2 Clinton (retrieves Hillary Clinton and Hillary Rodham Clinton)
- William w2 Hearst (retrieves William Hearst and William Randolph Hearst)
Note: The order of the words in the "within" search phrase specifies the order in which they occur in the text.
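The within operator adds an ordering requirement to the same position check. In this sketch the second word must come after the first and be no more than n positions away; as with the near operator, the exact counting rule varies by database, and the function name `within` is hypothetical.

```python
import re

def positions(word, tokens):
    """Return every index at which the word occurs in the token list."""
    return [i for i, tok in enumerate(tokens) if tok == word]

def within(first, second, n, text):
    """True if `second` follows `first` by at least 1 and at most n positions."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(0 < j - i <= n
               for i in positions(first.lower(), tokens)
               for j in positions(second.lower(), tokens))

two_names = within("William", "Hearst", 2, "William Hearst")
three_names = within("William", "Hearst", 2, "William Randolph Hearst")
wrong_order = within("William", "Hearst", 2, "Hearst, William")
```

Both forms of the name are matched, but the reversed form "Hearst, William" is not, because the words appear in the wrong order.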
Summary of proximity operators:
- Proximity operators locate words near each other in the text
- A number indicates how closely words should occur
- Symbols are usually either N# or N/# and W# or W/#
- Within operator requires words to appear in the specified order
Many databases and catalogs use a special vocabulary for indexing articles and other resources, much like the index of a book will only use specific terms. This is a carefully chosen set of words used to describe the nature of the content very precisely. This means that regardless of any idiosyncratic language used in the article, if it covers a given topic it will be indexed with a controlled term.
For a very simple example, maybe one article refers to cats as "cats", while another refers to them exclusively as "felines". An index would use a single subject term to index both, say CATS, thus capturing them all. Typically, the controlled term is the most commonly used term in the field or fields covered by the database.
Medical Subject Headings (MeSH) uses the term STROKE to cover:
- "A group of pathological conditions characterized by sudden, non-convulsive loss of neurological function due to brain ischemia or intracranial hemorrhages"
This then is the term that would be used as a subject heading for any article on this topic, even if the article only ever refers to this idea with the words "cerebrovascular accidents" or "apoplexy".
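The effect of a controlled vocabulary can be sketched as a lookup from the many phrases authors use to the single heading the index uses. In the Python below, the mapping repeats the STROKE and CATS examples from the text, while the article titles are made up; in practice headings are assigned by indexers or indexing systems, not by a simple substring test like this one.

```python
# Phrases an author might use -> the controlled heading used by the index
ENTRY_TERMS = {
    "stroke": "STROKE",
    "cerebrovascular accidents": "STROKE",
    "apoplexy": "STROKE",
    "cats": "CATS",
    "felines": "CATS",
}

def assign_headings(text):
    """Return the controlled headings whose entry phrases appear in the text."""
    lowered = text.lower()
    return {heading for phrase, heading in ENTRY_TERMS.items() if phrase in lowered}

# Hypothetical article titles
articles = {
    1: "Rehabilitation after cerebrovascular accidents",
    2: "Apoplexy in nineteenth-century medical writing",
    3: "Dietary needs of felines",
}

# Build a subject index: heading -> set of article ids
subject_index = {}
for article_id, title in articles.items():
    for heading in assign_headings(title):
        subject_index.setdefault(heading, set()).add(article_id)

stroke_by_subject = subject_index.get("STROKE", set())
stroke_by_keyword = {article_id for article_id, title in articles.items()
                     if "stroke" in title.lower()}
```

A keyword search for "stroke" finds neither of the first two articles, since the word never appears in them, while a search on the subject heading STROKE finds both.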
Each database may use slightly different language to capture these subject headings, so it helps to explore individual databases and pay attention to the vocabulary used for the indexing. Subject headings can usually be seen within the catalog entry for each item. For example, below is an article entry from ProQuest Science & Technology. You can see the database-specific subjects listed just below the abstract.
By clicking on any of these terms, a search can be created within the database for other items that list that same subject.
You can also utilize these terms by combining them into more specific searches instead of simply searching for keywords.
In most databases, controlled vocabulary:
- Utilizes a drop-down menu to select subjects rather than keywords
- May be listed as "Subject Heading", "Subject", "Main Subject" or "SU"
- May require utilizing "Advanced Searching" features
While modern keyword searching is incredibly effective, utilizing database-specific subject headings instead of just keywords is a helpful way to limit your results to something very specific. By searching using the controlled subject headings, one can limit to articles that are actually about a specific topic, instead of simply containing a word somewhere in the article. These subject headings or thesauri can also be browsed to get a sense of how those terms are used. See Scope Notes and Related Terms below.
In order to aid understanding of exactly which concepts a controlled term or subject heading may include, most databases offer a browsable index of the subject terms. These include "scope notes" that define the range of coverage for each term. For example, the definition for STROKE within the Medical Subject Headings (MeSH) above came from the scope notes. These notes also include a number of related terms, narrower and broader terms, and indicate other phrases that might be used, but for which the subject heading is the preferred term.
One example is the MeSH term PARAPLEGIA. The MeSH listing for PARAPLEGIA is given in the screenshot below.
MeSH term PARAPLEGIA
The scope notes for PARAPLEGIA state that this term should be used to refer to:
- "severe or complete loss of motor function in the lower extremities and lower portions of the trunk."
It also notes that it should be used in place of many synonym phrases that mean roughly the same thing, such as "paralysis of the lower limbs".
Searching on the subject heading PARAPLEGIA rather than PARALYSIS returns much more specific results that match the concept of paraplegia, whereas a search on PARALYSIS might return thousands of articles on other forms of paralysis. This is a very useful technique for narrowing a large number of results when needed.
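A thesaurus entry of this kind can be modeled as a small record holding the heading, its scope note, the phrases it replaces, and its broader terms. The sketch below uses the PARAPLEGIA details quoted above; the field names are hypothetical, and listing PARALYSIS as the broader term is an assumption drawn from the comparison in the text rather than a copy of the actual MeSH record.

```python
from dataclasses import dataclass, field

@dataclass
class ThesaurusEntry:
    heading: str
    scope_note: str
    used_for: list = field(default_factory=list)  # non-preferred phrases
    broader: list = field(default_factory=list)   # broader headings (assumed)

THESAURUS = {
    "PARAPLEGIA": ThesaurusEntry(
        heading="PARAPLEGIA",
        scope_note=("severe or complete loss of motor function in the lower "
                    "extremities and lower portions of the trunk."),
        used_for=["paralysis of the lower limbs"],
        broader=["PARALYSIS"],
    ),
}

def preferred_heading(phrase):
    """Return the heading to search on for a phrase, or None if it is not listed."""
    wanted = phrase.strip().lower()
    for entry in THESAURUS.values():
        synonyms = [term.lower() for term in entry.used_for]
        if wanted == entry.heading.lower() or wanted in synonyms:
            return entry.heading
    return None

from_synonym = preferred_heading("Paralysis of the lower limbs")
from_heading = preferred_heading("paraplegia")
not_listed = preferred_heading("hemiplegia")
```

Looking up either the heading itself or a phrase it replaces leads back to PARAPLEGIA, which is the term to use in a subject search.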
ERIC (Education Resources Information Center): READING DIFFICULTIES
The scope notes for "Reading Difficulties" show that this term is used to refer to:
- "problems in reading, caused either by disabilities associated with psychological processes or by such factors as physical or sensory impairments, cultural background, low ability, etc."
Note how this might capture different results than the ERIC subject heading for "Reading Failure", which captures simply "Lack of achievement or accomplishment in reading", though these terms are obviously closely related.
Exploring and understanding these vocabularies as they apply to the common databases in your field is very useful in increasing the efficiency of any literature search you might do. These subject headings can be viewed at the bottom of most article records within a database, and they can also be browsed and explored by locating them on the internet. A few specific database vocabularies are discussed below.