How the Legal Services Corporation is analyzing court data to increase access to justice

Kristen Sonday, Co-founder & CEO of Paladin

· 6 minute read

Legal Services Corporation, which funds legal access for low-income Americans, is seeking to use real-time civil court data so legal aid providers can better help their clients

Founded in 1974 and funded by the United States Congress, the Legal Services Corporation (LSC) is the country’s single largest funder of civil legal aid for low-income Americans, as well as of justice tech-related projects. One of its largest internal projects is the Civil Court Data Initiative (CCDI), which explores how access to real-time civil court data might help legal aid providers respond better to changing legal needs. I spoke with Holly Stevens, LSC’s Chief Data Officer, to learn more about the program and its implications.

Kristen Sonday: Tell us about your role within the Legal Services Corporation.

Holly Stevens: Within LSC, I lead the Office of Data Governance and Analysis. We are a group of data professionals — researchers, data engineers, data scientists, analysts, and web developers — working to provide data to legal aid organizations and other stakeholders.

Kristen Sonday: What is the Civil Court Data Initiative?

Holly Stevens: Every day, millions of Americans struggle with civil legal issues that limit their access to housing, healthcare, employment, education, and more. When these cases go to court, they are heard in more than 15,000 individual state and local courts across the country, and most individuals do not have access to legal assistance or representation.

Due to the highly decentralized nature of our courts, key stakeholders lack access to data about these cases — for example, the impact of legal representation (or lack thereof) and the case outcomes. This lack of access to data hinders transparency, informed decision-making, resource allocation, the identification of disparate treatment and outcomes, and the ability to measure the impact of interventions. Without data, it’s challenging to address systemic issues, improve access to justice, and ensure that the court systems operate in a fair and equitable manner.

The CCDI began in 2019 to determine how access to real-time civil court data could inform legal aid providers’ response to changing local needs. Our vision is to democratize court data and get it into the hands of organizations that can help enhance engagement with the courts, decrease default or failure-to-appear rates, and improve legal services’ use of data.

Kristen Sonday: How does CCDI work, and what are a few examples of real-world applications for the open-source library, CleanCourt?

Holly Stevens: In the existing system, the names of those involved in a civil case are recorded in electronic filing systems as free text or transcribed from paper dockets, leading to typos or differences in formatting when recorded over time. While these differences may be slight, they can pose significant challenges when analyzing the data in aggregate — for example, when a corporation has changed its business designation multiple times.

CCDI has developed several processes for cleaning names in aggregate from court lookup websites and data-sharing initiatives and has consolidated these methods in the CleanCourt open-source library. These methods employ natural language processing to parse and group party names and are more complex than common methods of data cleaning. As a result, we can drastically reduce the time required to run the complex string cleaning with high-quality results. In fact, CleanCourt can analyze more than four million records in less than 90 minutes.
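To make the name-cleaning problem concrete, here is a minimal sketch of grouping party-name variants under a shared key. It is not CleanCourt's API or method: the function names and the `DESIGNATIONS` list are hypothetical, and simple rule-based normalization stands in for the natural language processing the library is described as using.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately short list of business designations.
# A production list would be longer and jurisdiction-aware.
DESIGNATIONS = {
    "llc", "inc", "corp", "co", "ltd", "lp", "llp",
    "incorporated", "corporation", "company", "limited",
}


def normalize_party_name(raw: str) -> str:
    """Lowercase a name, strip punctuation, and drop trailing designations."""
    text = raw.lower().replace("&", " and ")
    # Removing punctuation turns "L.L.C." into "llc" and "Inc." into "inc".
    text = re.sub(r"[^a-z0-9\s]", "", text)
    tokens = text.split()
    # Drop trailing designations, but never reduce the name to nothing.
    while len(tokens) > 1 and tokens[-1] in DESIGNATIONS:
        tokens.pop()
    return " ".join(tokens)


def group_party_names(names):
    """Map each normalized key to the raw spellings that share it."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_party_name(name)].append(name)
    return dict(groups)
```

Under these assumptions, "Acme Holdings, L.L.C." and "ACME HOLDINGS INC" fall into the same group, which is the kind of consolidation needed when a company has changed its business designation over time. Typos would need an additional fuzzy-matching step that this sketch does not attempt.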

Standardizing court data is important so we can have high confidence and integrity in the data with which we’re working and understand the impact of certain patterns on the justice system accurately. For example, recent research in Michigan showed that more than 70% of debt collection cases were filed by just 10 companies. The same trend has been documented in eviction cases in cities across the country: a small number of landlords disproportionately drive eviction filings. Being able to identify the repeat players who engage in bad practices, overload our court system, and treat tenants unlawfully is crucial to ensuring these cases are properly addressed.
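A concentration figure like "the top 10 filers account for more than 70% of cases" can be computed with a simple count once plaintiff names are consistent. The sketch below is illustrative only; `top_filer_share` is a hypothetical helper, not part of any LSC tool, and it assumes the input names have already been standardized.

```python
from collections import Counter


def top_filer_share(plaintiffs, n=10):
    """Return the fraction of cases filed by the n most frequent plaintiffs.

    `plaintiffs` holds one (already standardized) plaintiff name per case.
    """
    counts = Counter(plaintiffs)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total
```

If name variants are not merged first, one filer is split across several spellings and the computed share understates how concentrated the filings really are.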

In another example, as pandemic-related eviction protections expired in Virginia, we developed an early warning system for eviction filings accessible to legal aid providers across the state. The system includes weekly reports on evictions in every local jurisdiction in Virginia. Advocates use the reporting to develop outreach activities, including having lawyers in courts on specific dates that have a large number of eviction hearings scheduled, and working with community organizers to respond to mass evictions in specific properties.
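The core of such a report, finding court dates with many eviction hearings scheduled, can be sketched as a count per jurisdiction and date. This is an assumption-laden illustration, not the Virginia system itself: the function name, the input shape, and the threshold are all hypothetical.

```python
from collections import Counter
from datetime import date


def busy_hearing_dates(hearings, threshold):
    """Return (jurisdiction, hearing_date, count) rows at or above threshold.

    `hearings` is an iterable of (jurisdiction, hearing_date) tuples, one per
    scheduled hearing. Rows are sorted with the busiest dates first.
    """
    counts = Counter(hearings)
    flagged = [
        (jurisdiction, day, n)
        for (jurisdiction, day), n in counts.items()
        if n >= threshold
    ]
    return sorted(flagged, key=lambda row: (-row[2], row[0], row[1]))
```

An advocate could run this weekly over upcoming dockets and staff the flagged dates first; the right threshold would depend on the size of each court's usual docket.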

Kristen Sonday: What are some of the challenges of using this data for good?

Holly Stevens: The biggest challenge is balancing the privacy of individuals with the need for data collection and increased transparency. There’s a real risk to people facing eviction or debt collection: commercial entities can use the data to build better algorithms for screening out tenants with a prior eviction, or to target those who have faced debt collection with predatory lending schemes.

Kristen Sonday: How can other organizations access LSC’s civil court data and collaborate?

Holly Stevens: Organizations can access the data set at, and we are always open to collaboration. We work directly with legal aid organizations to customize data for their use cases, which is important given their lack of resources. We ensure we get them what they need to improve outreach, put lawyers in rural courts when hearings are scheduled, evaluate their practices, or advocate for systemic change.

Kristen Sonday: How do you think about CCDI’s potential impact?

Holly Stevens: When you’re looking at millions of records of data that are messy and challenging to understand, you have to remember that this data is really about people. People who are parties to a case, people trying to keep up with fast-moving dockets, and people who are trying to help (and some who aren’t).

Our biases are reflected in the ways we’ve collected the information, the ways we have or have not invested in better data collection and analysis, and the ways we analyze the data. Courts are in the unenviable position of handling cases that the law is often ill-equipped to help resolve. Low-income families may have one triggering event behind the legal problem, such as a medical bill or a child in the hospital, that can lead to job loss and then missed rent payments. We see so much opportunity in this data to help.
