‘The Principles Of Unknown Opinion Retrieval’ by Mark Whitney

As legal researchers, we are servants to the tasks of retrieving known and unknown opinions.

Everyone understands the simple administrative process for retrieving a known opinion from a database. One enters the citation and clicks Search. The process is flawless because it forces us to speak in a three-part code consisting of volume number, reporter edition and page number.
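To make the three-part code concrete, here is a minimal Python sketch that splits a citation into its volume, reporter and page. The names `Citation` and `parse_citation` are hypothetical and introduced only for illustration; the pattern assumes the simple "volume reporter page" form and does not describe how any particular database parses citations.

```python
import re
from typing import NamedTuple


class Citation(NamedTuple):
    """The three-part code: volume number, reporter, page number."""
    volume: int
    reporter: str
    page: int


# Assumption: a citation is digits, then the reporter text, then digits.
_CITATION = re.compile(r"^\s*(\d+)\s+(.+?)\s+(\d+)\s*$")


def parse_citation(text: str) -> Citation:
    """Split a citation such as '821 F.2d 913' into its three parts."""
    match = _CITATION.match(text)
    if match is None:
        raise ValueError(f"not a volume/reporter/page citation: {text!r}")
    volume, reporter, page = match.groups()
    return Citation(int(volume), reporter, int(page))


print(parse_citation("821 F.2d 913"))
```

Anything that does not fit the three-part form, such as a bare phrase, is rejected rather than guessed at, which is the point of speaking in code.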

But what about unknown opinions? Why is there not some simple administrative process for quickly surfacing the most cited and/or most recent and/or most binding and/or most relevant unknown opinions citing your case for your point of law in your primary jurisdiction(s)? And how do you know you didn’t miss something?

Minutes from now, you will completely understand how to retrieve — with precision — all relevant unknown opinions from a database, by simply leveraging the identical code and/or concepts you use every day in letters, memos, briefs and arguments.

This memo reveals the principles used by my company when supporting professional researchers from every jurisdiction and practice area. To my knowledge, this is the first memo to distinguish the ‘retrieval task’ (surfacing all relevant unknown opinions in a database) from the ‘research task’ (selecting and analyzing the most relevant opinions from a set of results). The principles set forth in this memo solve for any unknown opinion retrieval task you may confront, regardless of practice area or jurisdiction.

In 1875, because it was impossible to read every opinion in the library, the Key Number System (KNS) was created so that researchers could browse a partial, man-made index of the law by concept. KNS has always been imperfect, imprecise and contrived, but for 100 years it was the best we could do.

In 1975, West Publishing (now Canadian media conglomerate Thomson Reuters) introduced an opinion database. By definition, such databases electronically index 100% of opinion text. It is beyond dispute that the complete index is superior to the partial index, and that when one enters a term, one is in fact ‘reading’ all the opinions in the database.

The significance of this point cannot be overstated to anyone concerned with crafting a brief or argument of unsurpassed quality. Our recently released Apples To Apples case study reveals a woeful lack of continuity between WestlawNext search results and West’s revered Statutes Annotated. Nearly 50% of the first ten opinions identified as “Most Relevant” by the WestlawNext algorithm receive no mention in the annotations of the related statute!

It is astonishing and unfortunate that some 35 years after the introduction of the electronic opinion database, not a single law school in the United States offers a course called “Retrieval.”

Accordingly, legal research continues to be about browsing inferior partial indexes on the page or screen. No standards govern the task of retrieving unknown opinions from a complete electronic index, more commonly known as a case law database. This memo solves for that problem.

CODE OR CONCEPT: The Two Types Of Opinion Retrieval Questions

We do not have thousands or tens of thousands or millions of potential questions controlling our opinion retrieval tasks. We have two.

Every retrieval task is controlled by either a codified or non-codified item of information. “Code or Concept?” This is the question we silently ask ourselves in support of professional researchers, from every practice and jurisdiction, whenever they reach out to us for answers to their toughest questions.

If your retrieval task is controlled by a codified item of information (a constitutional amendment, statute, rule, regulation or case citation), you must retrieve that controlling item and print it before entering a single term. To avoid ‘word guessing’ and to obtain pure results, you will need ongoing reference to the black letter text so that, as you refine your Search, you consistently speak in the language of the law. Happenstance is not a process. If we have learned anything at TheLaw.net Corporation, it is that legal researchers today do not subscribe to a uniform set of best practices.

A research support call to TheLaw.net frequently sounds like this:

CALLER: I need cases because I want sanctions against opposing counsel because they did blah, blah, blah. When I enter ‘sanctions’ all I get is blah, blah, blah.
THELAWNET: What is the local rule of civil procedure governing your search?

All research starts with the black letter law. Retrieve it. Print it. Read it and make this Rule One in your office. Then, and only then, are you ready to perform the nearly administrative task of retrieving any existing unknown opinions.

If your retrieval task is controlled by a non-codified item of information (a concept developed by judges through judicial or administrative opinions rather than through legislative statute or executive regulation), the Concept itself leads you to related opinions, sometimes resulting in information overload.

The superior, first-mover analytics provided by our CiteTrak algorithm facilitate at-a-glance identification and selection of the Most Relevant, Most Recent, Most Cited, and/or Most Binding opinions citing your case for your Code and/or Concept.


I know of one perfect search that solves for any legal research question. “Court.” We know the term court appears in every opinion. Enter it and read the cases. I know what you’re thinking: “Impractical.”

What else do we know?

  • We know that if our Search is driven by a statute, then the opinion mentions that statute number.
  • We know if our Search is driven by a rule, the opinion mentions the rule number.
  • We know if our Search is driven by a regulation, the opinion mentions the regulation number.
  • We know if our Search implicates a constitutional concern, the opinion mentions the amendment.
  • We know if our Search is driven by a particular opinion, related opinions mention our book and page citation.

Ultimately, every Search is driven by one of the five foregoing codified items of information. Accordingly, we Anchor our query with Code and Filter it by Concept(s).
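The Anchor-and-Filter pattern can be sketched in a few lines of Python. This is an illustration only: `quote` and `build_query` are hypothetical helper names, the lowercase `and` connector and the quoting of multi-word phrases follow the sample queries in this memo, and nothing here describes how any search engine processes the resulting string.

```python
def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_query(code: str, *concepts: str) -> str:
    """Anchor the query with a Code, then Filter it by zero or more Concepts."""
    if not code.strip():
        raise ValueError("a query needs a Code to anchor it")
    return " and ".join(quote(term) for term in (code, *concepts))


# A statute section number as the Anchor, one Concept as the Filter.
print(build_query("1983", "excessive force"))

# A citation as the Anchor, a refined Concept as the Filter.
print(build_query("821 F.2d 913", "qualified immunity"))
```

Making the Code a required first argument mirrors the rule that the Anchor comes first and the Concepts only narrow what the Anchor retrieves.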

Search Driven By Constitutional Amendment

  • CORRECT: “fifth amendment” and “ex post facto”
  • INCORRECT: “due process” and “ex post facto”

Fifth Amendment is Code. It’s all you need. Due Process standing alone is an ambiguous term that implicates multiple Federal and state constitutional guarantees. Ambiguous terms waste your time because they lead to diffuse results. Codes lead to relevant results.

Search Driven By Statute, Regulation or Rule

  • CORRECT: 1983 and “excessive force”
  • INCORRECT: “civil rights” and “excessive force”

We are in the Second Circuit. The controlling section number of the Federal civil rights statute is Code; “civil rights” is not. Our Filter (and “excessive force”) retrieves the subset of Federal civil rights opinions that include at least one instance of the controlling Concept, in this instance, excessive force. The CiteTrak algorithm tells us that 821 F.2d 913 is the Most Relevant and Most Cited opinion matching our search terms. Watch how we incorporate this into the final step of our Search, resulting in our Hot List.

Note To Boolean Ninjas: You may be asking, “What about the within connector?”

With our new CiteTrak algorithm, you don’t need it. In fact, with our search engine and with the algorithm powering WestlawNext, adding w/5 or /5 or /s or /p overrides algorithms that already consider the Proximity of terms to each other. You are, in effect, adding an arbitrary second steering wheel that distorts your Relevancy Ranking and severely reduces the quality of your results. For the same reason, we no longer need to include the Title Number (42) in our query. ‘1983’ does the trick. And yes, you can use the ampersand if you prefer.

Search Driven By Citation Code

  • CORRECT: “821 F.2d 913” and “qualified immunity”
  • INCORRECT: “821 F.2d 913” and “excessive force” and “qualified immunity”

821 F.2d 913 has been cited hundreds of times nationally, for reasons that are unknown to us. We like the discussion of qualified immunity contained therein. Accordingly, we Anchor our final query with the Citation Code and Filter it by our refined Concept, qualified immunity. We omit excessive force from our revised query because, by implication, it is already included in the Citation Code. Why overthink it?

CiteTrak retrieves fewer than 100 quality opinions scattered among 15 of the 315 jurisdictions indexed in the database. Again, relying on the best-of-breed power of the CiteTrak algorithm, we click one more time, and without re-executing our Search, a list of fewer than 20 opinions is displayed and they are all in the Second Circuit. This is our Hot List!

All 20 opinions include at least one express reference to our Citation Code 821 F.2d 913, together with our Concepts excessive force and qualified immunity. From our Hot List of 20 opinions, CiteTrak automatically tells us that 66 F.3d 416 is Most Cited and that 494 F.3d 344 is Most Recent, representing the Second Circuit’s last word on this topic. But, of course, these results are so directly on-point, we want to see the other 18 opinions, too.
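The narrowing step described above (restrict an existing result set to one jurisdiction, then pick out the Most Cited and Most Recent opinions) can be sketched as follows. This is not the CiteTrak algorithm, whose internals are not described in this memo. The `Opinion` record, the function names, and every value in the sample data are hypothetical placeholders, not real opinions or real citation counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    # All fields are assumptions about what a result record might carry.
    citation: str
    jurisdiction: str
    year: int
    times_cited: int


def hot_list(results, jurisdiction):
    """Narrow results already retrieved, without running a new search."""
    return [o for o in results if o.jurisdiction == jurisdiction]


def most_cited(opinions):
    """Opinion with the highest citation count."""
    return max(opinions, key=lambda o: o.times_cited)


def most_recent(opinions):
    """Opinion with the latest decision year."""
    return max(opinions, key=lambda o: o.year)


# Placeholder data for illustration only.
results = [
    Opinion("Opinion A", "2d Cir.", 1995, 40),
    Opinion("Opinion B", "2d Cir.", 2007, 12),
    Opinion("Opinion C", "9th Cir.", 2010, 55),
]

second_circuit = hot_list(results, "2d Cir.")
print([o.citation for o in second_circuit])
print(most_cited(second_circuit).citation)
print(most_recent(second_circuit).citation)
```

Because the jurisdiction filter runs on results already in hand, the Anchor and Filter of the original query are left untouched; only the view of the results changes.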