Search for Terms and Phrases (Full-Text Boolean Searching)
Connectors Summary:
Connector | Function |
---|---|
[space] | Words separated only by a space are treated as a phrase by default |
" " | Forced Phrase - used for phrases that use the words "and", "or", or "not" in them |
AND | Includes two terms, phrases, or conditions anywhere (even hundreds of pages apart) in the same document |
OR | Allows for either of two terms, phrases, or conditions |
AND NOT | Excludes a term, phrase, or condition |
w/n | Returns results where words appear within a specified proximity of each other in either direction |
NOT w/n | Finds a given word on the condition that it is not within a specified proximity of another given word |
pre/n | Returns results wherein the second word follows the first word within a specified proximity of words |
* | Allows for pluralized forms of words as well as any endings added on to whatever is typed before the * |
( ) | Groups conditions together to be treated as one concept |
xfirstword | Refers to the location of the first word in a document when used with a proximity connector like w/n or pre/n |
Search Strategy Summary (Video Below):
# | Concept | Guidance |
---|---|---|
1 | Add criteria | At minimum, consider adding a date range. Also consider document categories, industries, exchanges, etc. |
2 | Type a basic search | Consider using and. Consider using or. (Example): shareholder rights plan (an exact phrase, so it does not require an and or an or) |
3 | Use proximity connectors | Think about replacing phrases with relationships using proximity connectors like w/5 (within 5 words of) or w/500 (within 500 words of). (Example): shareholder rights plan - only retrieves this phrase. shareholder w/5 rights w/5 plan - also gets shareholder rights protection plan, shareholder protection rights plan, etc. |
4 | Synonyms | Look at the words in your search and consider which of them could be interchanged with other words that target the same concept or disclosure. Add these synonyms, separating them with or. (Example): shareholder or unitholder w/5 rights w/5 plan or program - also gets unitholder rights program and shareholder rights program |
5 | Plurals and suffixes | Look for words you would like variations of and put an asterisk * at the end of the word. (Example): shareholder* or unitholder* w/5 rights w/5 plan or program - also gets shareholders, shareholder's, shareholders', unitholders, unitholder's, unitholders', etc. |
Video #1: How Much Do Full-Text Searches Really Help?
The video and accompanying case study below explore the impact of full-text searching. They compare general searches (without full-text capabilities) to four increasingly refined levels of full-text searching, demonstrating just how valuable this approach can be.
Key Takeaways:
- Save time: Full-text searching can eliminate 95% of wasted time compared to simple Ctrl+F searches.
- Get better results: Using proximity connectors, wildcards, and synonyms can generate more than 10 times as many relevant results as basic keyword searches.
- Learn the basics: Even if you don’t need full-text searches all the time, understanding the fundamentals can make a big difference. Start by watching the video below!
How Much Do Full-Text Searches Really Help? – Case Study
This case study accompanies the video above and shows how each component of the search strategy summarized above affects a set of search results in real time. The numbers below are taken from the video.
Step 1: Searching by Category Only (No Search Terms)
Objective: Find Financial Statements and Management Discussion and Analysis (MD&A) documents from Consumer Products and Industrial Products companies.
Steps:
- In the SEDAR Filings dataset, locate the Industry and Document Category criteria among your usable criteria to the left of the Search button.
- If the Industry and Document Category criteria are not visible (check carefully, as they are most likely visible already), add them by clicking + Add Criteria in the upper left, underneath the full-text search field.
- Once added, make the following selections:
  - Under Industry, select Consumer Products and Industrial Products.
  - Under Document Category, select Financial Statements and Management Discussion and Analysis.
  - Set Filing Date to Last 2 Years.
- Click Search.
Results:
- 1000 documents retrieved.
- These are all financial statements and MD&As, but there’s no way to tell if they mention net profit or sales growth without manually searching inside each document.
- This search does not use a full-text component, so we have no idea how many of the results contain the language we're interested in.
Step 2: Adding Search Terms with AND
Question: Do I need to find specific words or phrases in these documents?
Steps:
- Type net profit and sales growth
  - This instructs Avantis to limit your results to only documents that have these two exact phrases anywhere in them.
  - You do not need to enclose the phrases in quotation marks. Any words without recognized connectors in between them will be taken as exact phrases.
- Click Search again.
Results:
- 34 documents retrieved (compared to 1000 before).
- This means 966 (96.6%) of the original results were irrelevant noise.
- Each document now highlights where "net profit" and "sales growth" appear.
- If you open your left panel and choose the Keywords tab, you will see every context within which the phrases occur in the document.
- However, the search does not account for variations in phrasing (it is not even looking for plurals) or related phrases.
Takeaway: Even this simple search saves 95% of the time compared to manually checking 1000 documents.
Step 3: Using a Proximity Connector ("w/n")
Questions:
- Are my exact phrases too limiting or too exact?
- Do I need to find related words in the same discussion, even if I can't predict all the various ways they might be used in a phrase?
Steps:
- Update your existing search
  - Old version: net profit and sales growth
  - New version: net profit AND (sales w/5 growth)
    - AND does not need to be capitalized; it is capitalized here only to make the search easier to read
    - Avantis reads AND and and as exactly the same word
    - The w/5 instructs Avantis to retrieve only documents that contain the exact phrase "net profit" as well as any occurrence of the word "sales" within 5 words of the word "growth"
    - Had you used w/10 instead of w/5, Avantis would look for both words within 10 words of each other instead of 5 words
    - Had you used w/100 instead of w/5, Avantis would look for both words within 100 words of each other instead of 5 words
    - Proximity connectors allow you to set maximum distances between your terms with absolute control and precision
- Click Search again.
Results:
- 81 documents retrieved (more than double the amount in Step 2).
- This search finds variations like:
  - "sales growth"
  - "growth in fourth-quarter sales"
  - "growth in international sales"
- The w/5 proximity connector ensures "sales" and "growth" appear within 5 words of each other, making the search more flexible while still being precise.
Takeaway: Proximity searching expands results without adding noise, helping you capture relevant discussions.
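As a rough illustration of what w/n is described as doing above, the Python sketch below checks whether two words fall within n words of each other in either direction. It is not how Avantis is implemented: the `tokenize` and `within` helpers, the choice to count a hyphenated word as a single word, and the way distance is counted are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into word tokens.

    Assumption: a hyphenated word such as "fourth-quarter" counts as one word.
    """
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def within(text, first, second, n):
    """Return True if `first` and `second` occur within n words of each other.

    Order does not matter, mirroring the "in either direction" behaviour of w/n.
    """
    words = tokenize(text)
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    return any(abs(i - j) <= n for i in first_positions for j in second_positions)


# Phrases listed in the Step 3 results
print(within("sales growth", "sales", "growth", 5))                    # True
print(within("growth in fourth-quarter sales", "sales", "growth", 5))  # True

# The two words are nine positions apart here, so w/5 does not match
far_apart = "Sales were flat while the company expects long term growth"
print(within(far_apart, "sales", "growth", 5))   # False
print(within(far_apart, "sales", "growth", 10))  # True
```

The last two lines show the effect of changing w/5 to w/10: the same sentence goes from no match to a match once the allowed distance is widened.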
Step 4: Adding Synonyms (Using "or")
Question: Can any of my search terms be replaced with other words that would yield good results?
Steps:
- Update your existing search
  - Old version: net profit AND (sales w/5 growth)
  - New version: net profit AND (sales or market w/5 growth or increase)
    - This will now also include results where the word market is within 5 words of growth or increase, and where sales is within 5 words of increase
- Click Search again.
Results:
- 250 documents retrieved (more than 7x the amount in Step 2).
- This could be expanded further with other synonyms like improve, augment, ramp up, or double in place of growth or increase, depending on your needs and appetite for variation.
  - To do this, simply put "or" between each synonym, e.g. net profit AND (sales or market w/5 growth or increase or augment or double or ramp up)
Takeaway: Using synonyms helps capture different ways the same concept is expressed, significantly improving search coverage without much effort.
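One way to picture what the synonym step adds is to list every word pairing that the two "or" groups cover. The short Python sketch below does only that. The variable names are illustrative, and it assumes, as the Step 4 description indicates, that each "or" group is treated as a unit on either side of w/5.

```python
from itertools import product

# The two "or" groups from the Step 4 search
left_group = ["sales", "market"]
right_group = ["growth", "increase"]

# Every word pairing that the single w/5 connector now covers
pairs = [f"{left} w/5 {right}" for left, right in product(left_group, right_group)]

for pair in pairs:
    print(pair)

print(len(pairs), "pairings covered by one search")
```

Adding a third synonym to either group adds another full set of pairings, which is why a few extra words can widen the results so quickly.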
Step 5: Adding a Wildcard ("*")
Question: Do I need variations of the forms of my search terms (plurals, suffixes, tenses, etc.)?
Steps:
- Update your search
  - Old version: net profit AND (sales or market w/5 growth or increase)
  - New version: net profit* AND (sales or market* w/5 grow* or grew or increas*)
    - This instructs Avantis to retrieve only documents that contain the exact phrase net profit or net profits, as well as any occurrence of the word sales, or any variation of the word market, within 5 words of any variation of the word grow or increase
    - Note that simply putting an asterisk after grow will not find the word grew, since it doesn't start with the same four letters as grow, so we add grew as a separate synonym
- Click Search again.
Results:
- 380 documents retrieved (more than 11x the amount in Step 2).
- Wildcard (*) expands search terms:
  - profit* → profit, profits (since "profit" is part of a phrase, this is unlikely to pick up words like "profiting", which rarely follow the word "net")
  - market* → market, markets, marketplace, marketplaces
  - grow* → grow, grows, growing, grown, growth
  - increas* → increase, increases, increased, increasing
- We had to manually add grew as a synonym since it won't be generated by grow*.
Takeaway: Using wildcards increases flexibility without requiring separate search terms for every variation.
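The reason grow* finds growth but not grew can be seen by treating the wildcard as a simple "starts with" test. The Python sketch below is an illustration of that idea only. `matches_wildcard` is a hypothetical helper written for this example and is not part of Avantis.

```python
def matches_wildcard(pattern, word):
    """Treat a trailing * as "any ending"; without it, require an exact match."""
    pattern = pattern.lower()
    word = word.lower()
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


candidates = ["grow", "grows", "growing", "grown", "growth", "grew"]

matched = [word for word in candidates if matches_wildcard("grow*", word)]
missed = [word for word in candidates if not matches_wildcard("grow*", word)]

print("grow* matches:", matched)
print("grow* misses:", missed)
```

Only grew lands in the missed list, which is why it has to be added to the search as its own term.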
Final Thoughts
Search Level | Results Retrieved | Effect |
---|---|---|
Step 1: Criteria Only | 1000 | Baseline |
Step 2: Basic Keywords | 34 | 96.6% of Step 1 results removed as noise |
Step 3: Prox. Connectors ("w/n") | 81 | 2.4x the Step 2 results |
Step 4: Synonyms ("or") | 250 | 7.4x the Step 2 results |
Step 5: Wildcards ("*") | 380 | 11.2x the Step 2 results |
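The multipliers in this summary can be recomputed from the result counts quoted in each step. The Python snippet below is just that arithmetic, rounded to one decimal place. The dictionary keys are labels chosen for this example.

```python
# Result counts quoted in the case study
basic_keywords = 34  # Step 2 result count, used as the comparison point
refined = {
    "proximity connector": 81,
    "synonyms": 250,
    "wildcards": 380,
}

# Ratio of each refined search to the basic keyword search
ratios = {name: round(count / basic_keywords, 1) for name, count in refined.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x the basic keyword results")
```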
Key Takeaways:
✅ Even a basic full-text search (Step 2) eliminates more than 95% of irrelevant results.
✅ Proximity connectors, synonyms, and wildcards dramatically increase relevant results.
✅ Combining these techniques can make your searches far more effective.
The best way to get into the habit of terms and connectors (Boolean) searching would be to apply one step at a time until you are confident enough that you can mix the steps together as you type out your search. There is nothing wrong with building your search out step by step every time you want to run a comprehensive search.
Would you like to see an interactive demo of these techniques? Let us know!
CASE STUDY, CONCLUSIONS:
- Even the most basic search, the search in step 2 above, cuts out 95% of the noise you would have to wade through on SEDAR or with any method that relies on Ctrl+F to find words in documents
- The most specific search, the search in step 5 above, using (1) a proximity connector, (2) synonyms (separated by an or), and (3) wildcards, found over 11 times as many relevant documents as the basic search in step 2 above
- By sorting results by rank, you will start with the most relevant results (those in which the search terms occur most frequently and are most tightly clustered together), so you don’t have to look at all of the 380 results but can just look at the highest/best matches among them
- Open the viewing panel in the left margin of the document and click on the Keywords tab to see all the uses of your search terms throughout the document
To search with connectors
1. Intro Level: Make sure you are at least familiar with the first 5 connectors in the chart below ([space], AND, OR, AND NOT, *)
   a. Look at the 6th connector (w/n) and consider whether it would be useful in your searching. If the answer is no, you will not need to know more about terms and connectors searching than the basics above
2. Basic Level: Look at the entire list of connectors in the Basic Terms and Connectors Searching chart below
   a. If you find you have no unanswered questions and you are not interested in knowing more, you will not need more than the basic level
3. Intermediate and Advanced Levels: Look at the second chart below – Intermediate and Advanced Terms and Connectors Searching
   a. The first 3 examples are intermediate and the last 3 are advanced applications of terms and connectors searching
      i. Understanding and employing intermediate and advanced terms and connectors searching gives you a lot more power over what you look at and allows you to cut out a lot of noise from your "needle in a haystack" searches
Table #1: Basic Terms and Connectors Searching
Connector | Example | Retrieves | Highlights |
---|---|---|---|
(space) | region of incorporation | Documents that contain the exact phrase searched for. Quotation marks are not normally necessary. IMPORTANT EXCEPTION: Some phrases do require quotation marks in order to be recognized – see "" (quotes) connector below. | The exact phrase region of incorporation |
AND | warrant AND consideration | Documents that contain both terms | Both terms anywhere in the document, regardless of proximity |
OR | warrant OR consideration | Documents that contain either term OR both terms | Either term anywhere in the document |
AND NOT | warrant AND NOT consideration | Documents that contain one term but must not contain the other | Only the term warrant; the document must not contain the term consideration |
* | warrant* | Documents that contain any term beginning with the specified string | Any term that starts with warrant – including warrants, warranted, warranty, warranties, etc. |
w/n | warrant w/10 consideration | Documents containing one term within a certain number of words of the other. Not looking for an exact phrase, but for terms that form part of an idea or topic. | Either term whenever it appears within 10 words of the other |
pre/n | warrant pre/10 consideration | Documents where one term precedes the other by a specified number of words (or fewer) | Both warrant and consideration, where warrant precedes consideration by 10 words or fewer. If warrant is 11+ words before, neither is highlighted |
NOT w/n | warrant NOT w/10 consideration | Documents with at least one instance of a term not within a specified distance of another | Highlights warrant only when it is not within 10 words of consideration. Warrant may still appear within 10 words of consideration elsewhere, but those instances won’t be highlighted |
xfirstword | warrant w/10 xfirstword | Specifies the first word in a document. Used with w/n to find terms near the beginning | Every instance of warrant that appears within 10 words of the first word in the document |
"" (quotes) | "warranties and representations"; "incorporated or deemed to be incorporated"; "not limited to" | Documents with the exact phrase, with connector words like and, or, not treated as normal search terms | Use quotes when the exact phrase includes and, or, or not – so they’re treated as regular search terms, not connectors |
% | wa%rrant | Documents containing words similar to the one searched | Finds misspellings like warant, warrrant, etc. |
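To make the difference between pre/n and NOT w/n concrete, the Python sketch below applies the descriptions from the chart to a list of words. It is an illustration under assumptions, not the Avantis implementation: the helper names `positions`, `pre_n`, and `not_within` are invented for this example, and distance is counted as the difference between word positions.

```python
def positions(words, term):
    """Return every index at which `term` appears in the word list."""
    return [i for i, word in enumerate(words) if word == term]


def pre_n(words, first, second, n):
    """first pre/n second: `first` comes before `second` by n words or fewer."""
    return any(
        0 < j - i <= n
        for i in positions(words, first)
        for j in positions(words, second)
    )


def not_within(words, first, second, n):
    """first NOT w/n second: at least one `first` has no `second` within n words."""
    second_positions = positions(words, second)
    return any(
        all(abs(i - j) > n for j in second_positions)
        for i in positions(words, first)
    )


words = "the consideration for each warrant is payable in cash".split()

# consideration comes three words before warrant, so order matters for pre/n
print(pre_n(words, "warrant", "consideration", 10))  # False
print(pre_n(words, "consideration", "warrant", 10))  # True

# The only warrant is within 10 words of consideration, but not within 2
print(not_within(words, "warrant", "consideration", 10))  # False
print(not_within(words, "warrant", "consideration", 2))   # True
```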
Video #2: Basic Terms and Connectors Searching
Table #2: Intermediate and Advanced Terms and Connectors Searching
Level | Search | Retrieves | Highlights |
---|---|---|---|
Intermediate | (warrant AND consideration) OR common shares | Documents that: 1) Have both warrant and consideration 2) But don’t necessarily contain common shares. OR documents that: 1) Contain common shares 2) But don’t necessarily contain either warrant or consideration | Highlights any occurrences of warrant, consideration, or common shares that are found in the relationships specified in the search. Warrant will only be highlighted if the term consideration is in the document. |
Intermediate | warrant AND (consideration OR common shares) | Documents that: 1) Contain warrant 2) And also contain EITHER consideration or common shares | Highlights any occurrences of warrant, consideration, or common shares that are found in the relationships specified in the search. |
Intermediate | (warrant AND consideration) w/10 common shares | Documents that: 1) Contain warrant within 10 words of common shares 2) As long as the same document ALSO contains consideration within 10 words of common shares | Warrant is highlighted only if: • It is within 10 words of common shares, and • Consideration is ALSO within 10 words of common shares. Consideration is highlighted only if: • It is within 10 words of common shares, and • Warrant is ALSO within 10 words of common shares. Common shares is highlighted only if: • It is within 10 words of both warrant and consideration |
Advanced | common shares w/20 warrant w/10 consideration | Documents that: 1) Contain warrant within 20 words of common shares 2) Contain warrant within 10 words of consideration 3) Also contain consideration within 10 words of common shares | Common shares must be: • Within 20 words of warrant, and • Within 10 words of consideration. Warrant must be: • Within 20 words of common shares, and • Within 10 words of consideration. Consideration must be: • Within 10 words of common shares, and • Within 10 words of warrant |
Advanced | common shares w/20 (warrant w/10 consideration) | Documents that: 1) Contain warrant within 10 words of consideration 2) Contain warrant OR ELSE consideration within 20 words of common shares 3) Consideration can be any distance from common shares if the first two conditions are met | Common shares must be: • Within 20 words of warrant, OR • Within 20 words of consideration. Warrant must be: • Within 10 words of consideration, and • Within 20 words of common shares ONLY IF consideration is not: - in the 10 words that follow consideration, - and in the 20 words following common shares. Consideration must be: • Within 10 words of warrant, and • Within 20 words of common shares ONLY IF warrant is not |
Advanced | common shares w/10 warrant w/15 consideration w/20 collectively | Documents that contain: 1) Warrant within 10 words of common shares 2) Consideration within 15 words of common shares 3) Collectively within 20 words of common shares • Warrant within 15 words of consideration • Consideration within 20 words of collectively. In this search, common shares is the anchor term. | Common shares must be: • Within 10 words of warrant, • Within 15 words of consideration, and • Within 20 words of collectively. Warrant must be: • Within 10 words of common shares, and • Within 15 words of consideration. Consideration must be: • Within 15 words of common shares, • Within 15 words of warrant, and • Within 20 words of collectively. Collectively must be: • Within 20 words of common shares, and • Within 15 words of consideration |
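The first two intermediate searches use the same three terms and differ only in where the parentheses sit. The Python sketch below shows how that changes which documents qualify, using set operations on four made-up documents. The documents, their labels, and the `has` helper are all hypothetical and exist only for this illustration.

```python
# Four made-up documents, each described by the terms it contains
docs = {
    "A": {"warrant", "consideration"},
    "B": {"common shares"},
    "C": {"warrant", "common shares"},
    "D": {"warrant"},
}


def has(term):
    """Return the labels of the documents that contain `term`."""
    return {label for label, terms in docs.items() if term in terms}


# (warrant AND consideration) OR common shares
first = (has("warrant") & has("consideration")) | has("common shares")

# warrant AND (consideration OR common shares)
second = has("warrant") & (has("consideration") | has("common shares"))

print(sorted(first))   # ['A', 'B', 'C']
print(sorted(second))  # ['A', 'C']
```

Document B drops out of the second search because warrant has become mandatory, and document D never qualifies because warrant alone satisfies neither grouping.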