Does the standard review protocol need to change to find treasure in short message data?
You know the usual drill for starting a new document review: You explain the case to your document review team, and then assign them each a batch of documents to start reviewing and coding for responsiveness and privilege.
If you have put a lot of thought into the review, the batches may be organized in any number of ways:
(a) custodian, so that each reviewer has the chance to become familiar with a particular custodian’s job function and involvement;
(b) time period, so that the reviewers can keep an eye on the action and see how documents or chats might fit into the timeline; and/or
(c) predicted responsiveness, so that documents your system’s AI deems most likely to be responsive, based on some initial coding done by the case team, come first.
If you are using Technology-Assisted Review and AI tools, the next sets of batches, perhaps while preserving each reviewer’s subject-matter focus, will target the documents most likely to be responsive, with garbage files systematically relegated to the tail end of the review or categorically culled from the review set.
If you haven’t put a lot of thought into the review, batches may be simply random, or even organized by how they were uploaded: the dreaded “linear review”. Your reviewers are slogging through custodians’ lunch planning emails and online shopping, while potentially relevant documents are buried in the haystack. Given how bored and devalued your reviewers will feel by the time they get to those documents, some may be missed.
Even if you’re being thoughtful about using prioritization, batching by family, and making reviewers the masters of specific subjects, your reviewers are almost always limited to their batches. They cannot go searching through the database for context to explain jargon terms, new “characters,” or the timing of particular events. After all, they are being judged on how many documents they tag per hour, not whether they identified a new source for privileged content or formed a better understanding of the key jargon used in the finance department.
Enter short message data to make things even worse. The chance of a highly relevant document being missed is elevated with “short message” formats like text messages and Slack chats. In short message exchanges, the participants are having an ongoing, highly informal conversation. If the reviewer is looking at only one or a few messages in a long string, they likely lack the context needed to understand the conversation and where it fits. When that happens, critical information gets overlooked or misinterpreted.
Say the case’s “silver bullet” document is a chat between defendants in a trade secrets case in which one asks another, “could u to create another algo from scratch? Trying to make this go away. 😬” (Source, modified.) Figuring out whether this is a relevant and responsive document requires understanding “algo” as shorthand for “algorithm,” the timing of the conversation (the day after the trade secret suit was filed), the roles of the participants, and perhaps the anxiety conveyed by the emoji. A reviewer in the last days of the review may have developed this context and will recognize the chat’s importance, but what if it surfaced on day one of the review? It might not have been identified as key … or it might not have been identified as responsive at all.
Should we change the review process to account for this disconnect? This is an ongoing conversation, and one the tech-focused side of the aisle often misses. Here are a few of my thoughts on how we can ensure short messages don’t fall through the cracks.
1. Catch it in the Quality Control Process
A good QC process might recover our example chat from the dumpster in a number of ways. For example:
From a chronology of key discussions identified during review, the QC team notices the lack of any key communications reacting to the filing of the lawsuit, and decides to look back through.
At the end of the review, we search the non-responsive documents for the roots of some key search terms, such as “algo!”
AI flags the chat as one it thinks we probably got wrong, because other key responsive documents referred to “the algo.”
AI identifies the chat as having unusually emotional content, given the anxious grimace emoji and the desire to “make this go away.”
A QC reviewer looks through the list of emojis used in chats in the database and identifies a few as unusual or indicating a problem.
2. Empower Your Reviewers to Investigate
Reviewers are told to remain within the “four corners” of a document when doing electronic review. This is a necessary consequence of larger, more distributed, often outsourced review teams reviewing enormous volumes of data. We lock everyone into their lane, limiting their ability to revisit previous documents, examine context, look at duplicates and family members, etc.
But picture your favorite legal show or movie. Our protagonists, a plucky pair, are paging through drawers, piles, or boxes of paper, looking for evidence that will solve the case. The junior member of the team is about to toss a folder into the trash pile when she re-focuses on a post-it note on the first page. “Where did I see that before…” she murmurs to herself, and then dives into another pile. “Aha! Yes! I knew I’d seen that acronym before. This is how they were getting into the banks’ systems!”
Fantastical and outdated, maybe. But those flashes of inspiration can’t happen when reviewers are discouraged from looking outside their current batch. Conversations in office environments may span several different communications platforms over the course of a day, with an email request that results in a group chat and file share followed by a subsequent text to close the loop. Even the best batching techniques might not group the email, chat, files, and text into the same batch, but it’s important your reviewers be able to connect the dots.
Tech solutions might include expanding the “sidebar references” that currently supply a reviewer with information on family members and duplicates. Perhaps a visual of where the document falls in the case timeline, or a list of other communications between the same parties that day? Or a list of documents with similar content, identified by AI analysis?
Alternatively, it may be that the problem proposes its own solution: Although our reviewers are rarely in the same room these days, a group chat might allow them to make connections between documents that they might not otherwise have seen alone.