
How Thomson Reuters develops reliable AI technology

A fundamental approach to rigorous testing by engineers, data scientists, and industry experts

Every professional who depends on technology, especially products that employ artificial intelligence (AI), is rightly concerned about issues such as data quality and lack of transparency, consistency, and reliability. 

That’s why we employ an exacting testing and validation process to ensure every product or program we release performs to the highest possible standards. We’ve been doing this for more than 30 years across the products we develop for legal, tax and accounting, risk and fraud, and news professionals, as well as professional services firms.

With a rich history in applied research, Thomson Reuters Labs focuses on applying revolutionary technology to real-world business problems. Our customers’ need for trustworthy information drives our research, along with recent breakthroughs in machine learning and AI.

The people we serve require the right information — in the proper context — and often under tight time constraints. So, we’ve adopted a comprehensive approach that closely examines accuracy, bias, usability, depth and breadth of information, and data drift to give professionals the confidence that the information they receive is accurate and complete.

This approach requires data scientists, engineers, and domain experts to test hundreds of examples to determine whether a product or program meets our standards.

How we approach AI development

Every time we begin a project incorporating AI, we start by defining the task: what exactly is the AI meant to accomplish? 

Defining the AI model by what it is supposed to accomplish lets us stress test it without having to design the complete AI solution first. We can therefore lay out the features that allow the product or program to function properly and develop a reliable AI model in parallel. Once both meet their requirements, the developers integrate the AI model into the newly developed product feature and perform further rigorous testing to ensure that its usability meets our standards.

During model development, we refine the model and adjust it to meet the predefined requirements. We then perform error analysis, which entails observing how and where the model failed to make the correct prediction or provide the proper information. From there, we work to understand why and determine how to make corrections.
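
As a purely illustrative sketch (not Thomson Reuters’ internal tooling), error analysis over a labeled evaluation set can be as simple as collecting the cases the model got wrong and grouping them by a metadata field such as topic; the records and field names below are hypothetical:

```python
from collections import Counter

# Hypothetical evaluation records: each pairs a model prediction with the
# answer the reviewers expected, plus a topic label for slicing errors.
eval_records = [
    {"id": "q-001", "topic": "tax", "expected": "deductible", "predicted": "deductible"},
    {"id": "q-002", "topic": "tax", "expected": "not deductible", "predicted": "deductible"},
    {"id": "q-003", "topic": "legal", "expected": "statute applies", "predicted": "statute applies"},
    {"id": "q-004", "topic": "legal", "expected": "statute repealed", "predicted": "statute applies"},
]

# Keep the cases where the prediction did not match the expected answer.
errors = [r for r in eval_records if r["predicted"] != r["expected"]]

# Group errors by topic so reviewers can see where the model struggles most.
errors_by_topic = Counter(r["topic"] for r in errors)

print(f"{len(errors)} errors out of {len(eval_records)} examples")
for topic, count in errors_by_topic.most_common():
    print(f"  {topic}: {count} error(s)")
```

Each error bucket then becomes a question for the team: is the problem in the data, the prompt, or the model itself?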

This cycle of iterative refinements continues until we are satisfied that the AI is performing the appropriate tasks accurately.

Critical components of developing AI

We must integrate two critical components into AI’s development: the involvement of domain experts and the implementation of continuous evaluation.

Involvement of domain experts

Model development is not the responsibility of scientists alone. To create a model that performs the way it needs to, we need domain experts who understand firsthand what the end user will need. These experts help the scientists develop the appropriate prompts to test for accuracy, bias, and consistency.
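
As a simplified illustration of what expert-authored test prompts can look like, the cases, required facts, and the ask_model interface below are hypothetical stand-ins for whatever a real evaluation harness would use:

```python
# Hypothetical expert-written test cases: each prompt is paired with the
# facts a correct answer must mention.
expert_prompts = [
    {
        "prompt": "Is interest on a personal car loan deductible for a sole proprietor?",
        "must_mention": ["business use"],
    },
    {
        "prompt": "How long does a party have to sue for breach of a written contract?",
        "must_mention": ["statute of limitations", "varies by state"],
    },
]

def passes(answer: str, must_mention: list[str]) -> bool:
    """True if the answer covers every fact the expert flagged as required."""
    return all(fact.lower() in answer.lower() for fact in must_mention)

def run_suite(ask_model) -> float:
    """Run every expert prompt through the model and return the pass rate.

    `ask_model` is any callable that takes a prompt string and returns the
    model's answer as a string.
    """
    results = [passes(ask_model(case["prompt"]), case["must_mention"]) for case in expert_prompts]
    return sum(results) / len(results)
```

Running the same suite after every model change makes it easy to see whether a fix for one domain quietly degrades another.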

If we’re designing AI to benefit the legal profession, we need experts with that specific background. Likewise, if it’s meant to work for accounting or tax professionals, we require domain experts who know what is accurate and most beneficial to those users.

That said, a domain expert developing prompts can’t solve every problem or issue. We need scientists to maintain the rigor of the testing, figure out how to overcome challenges, and improve what needs fixing.

Implementation of continuous evaluation

We employ strict scientific evaluation methods to ensure the models’ continued accuracy and high-quality answers.

Checking accuracy

For accuracy, we ask: How accurate was the information produced? How complete was it?

We measure accuracy by assessing the extent to which an AI system consistently produces correct predictions or outputs for the inputs it is given. If the accuracy is unsatisfactory, the model is adjusted and reevaluated until it reaches the level we require.
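
In its simplest form, that measurement reduces to the fraction of outputs that match the expected answers; the exact-match comparison below is only a sketch, since real evaluations typically use more forgiving scoring such as semantic matching or expert grading:

```python
def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of model outputs that exactly match the expected answers."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must align one-to-one")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

# Hypothetical example: three of four answers match, so accuracy is 0.75.
print(accuracy(["a", "b", "c", "d"], ["a", "b", "c", "x"]))
```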

Identifying bias

It’s important to be alert to potential bias in what the AI delivers. Eliminating AI bias requires drilling down into datasets, machine learning algorithms, and other elements of AI systems to identify sources of potential bias.
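
One concrete way to surface such bias, shown here only as a sketch with hypothetical field names, is to compute the same accuracy metric separately for each slice of the data (for example, by jurisdiction or document type) and look for large gaps between groups:

```python
from collections import defaultdict

def accuracy_by_group(records: list[dict]) -> dict[str, float]:
    """Accuracy per group, to surface slices where the model underperforms.

    Each record is assumed to carry 'group', 'predicted', and 'expected' fields.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        correct[r["group"]] += int(r["predicted"] == r["expected"])
    return {group: correct[group] / totals[group] for group in totals}

# A large gap between groups is a prompt to inspect the data and the model,
# not proof of bias on its own.
```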

Regulating data drift

Data drift refers to changes in the input data's statistical characteristics over time, which can affect the model's performance. Information changes rapidly and often: every field using our technology sees new regulations, laws, interpretations, and other data that shifts and updates constantly. Keeping abreast of those changes is vital to ensuring the model is as accurate as possible.
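
A common way to watch for drift, sketched here with an illustrative feature and threshold, is to compare the distribution of current inputs against a reference sample from training time using a statistical test such as the two-sample Kolmogorov-Smirnov test:

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(reference: np.ndarray, current: np.ndarray, alpha: float = 0.01) -> bool:
    """Flag drift when current inputs look statistically different from the
    reference (training-time) sample under a two-sample KS test."""
    result = ks_2samp(reference, current)
    return result.pvalue < alpha

# Hypothetical example: document lengths seen at training time vs. today.
rng = np.random.default_rng(0)
reference_lengths = rng.normal(1200, 300, size=5000)
current_lengths = rng.normal(1500, 300, size=5000)  # the distribution has shifted
print(drift_detected(reference_lengths, current_lengths))  # True
```

In practice, drift monitoring covers many features at once and triggers review or retraining rather than a single yes-or-no flag.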

Establishing usability

We evaluate how well the model works under real-world conditions. First, we perform research to determine what users need. Then, we test the model to see how it works in everyday situations. Next, we solicit feedback from industry users to help us refine the model and build trust in its validity. 

Human involvement at every stage

By involving human expertise in developing, testing, and using AI — also known as a “human in the loop” approach — we can reduce risks around privacy and inaccuracy. Ultimately, AI is a tool created by and for humans, enhancing our capabilities rather than replacing us. Human values and ethics should always guide its development to ensure it complements human efforts.

The result of careful testing and validation

The rigorous testing and validation of AI technology is not just a technical necessity; it's a profound responsibility. It is what allows professionals to trust and embrace the transformative power of AI.

Each test we conduct, each validation we complete, is a step toward building a world where technology enhances professionals’ capabilities and gives greater confidence in decision making.

CoCounsel

Rely on our trusted generative AI assistant, trained by industry experts and backed by authoritative content