Streamlining Commission Reconciliation: An AI Approach
By Maria Mahin and Sam Islam
Background and problem statement
Insurance carriers pay brokers commissions on a monthly basis, but brokers must follow certain guidelines to remain eligible; otherwise, they will not be paid. If a broker believes they were incorrectly denied a commission, they can reach out to Oscar to ask for an explanation, a process we call a “commission reconciliation.”
Determining why a broker was not paid is challenging: it requires translating broker data and business rules into a human-readable, interpretable response, which can be time intensive. The data and rules governing commission payments also change over time, making it difficult to consistently and accurately pinpoint why a commission was withheld without subject matter expertise.
Today, our broker operations team performs commission reconciliations by hand. Broker ops specialists must combine historical knowledge with a deep understanding of the underlying Oscar data to develop a concise answer for each broker.
Technical implementation/overview
To help answer these questions, we leveraged AI to apply business domain rules to existing broker data and explain a commission outcome.
First, we took a set of commission identifiers as input and used them to query our database for the relevant liability records, which served as the basis for further processing.
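As a minimal sketch of this first step, the fetch might look like the following. The dataclass, table name, and columns are illustrative stand-ins, not our actual schema:

from dataclasses import dataclass

@dataclass
class CommissionIdentifiers:
    # Illustrative identifier fields; the real input set differs.
    broker_id: str
    policy_id: str
    commission_month: str  # e.g. "2024-03"

def fetch_liability_records(db, ids: CommissionIdentifiers) -> list[dict]:
    # Fetch the liability rows that anchor the rest of the reconciliation.
    # "commission_liabilities" is a placeholder table name.
    query = """
        SELECT *
        FROM commission_liabilities
        WHERE broker_id = %s
          AND policy_id = %s
          AND commission_month = %s
    """
    with db.cursor() as cur:
        cur.execute(query, (ids.broker_id, ids.policy_id, ids.commission_month))
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]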
Depending on the information contained within the liability data, we determined which question type to proceed with (i.e., whether the focus should be a commission block or the commission’s recipient). This dictated which tables were queried in subsequent data fetching, as well as the final prompt and business rules supplied to the LLM.
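A simplified sketch of this routing step, assuming a hypothetical block_reason field on the liability rows; the real pathway logic checks values against a larger set of predefined use cases:

from enum import Enum, auto

class QuestionType(Enum):
    COMMISSION_BLOCK = auto()
    COMMISSION_RECIPIENT = auto()

def classify_question(liability_records: list[dict]) -> QuestionType:
    # If any liability row carries a block reason, investigate the block;
    # otherwise, focus on who the commission's recipient should be.
    for record in liability_records:
        if record.get("block_reason") is not None:
            return QuestionType.COMMISSION_BLOCK
    return QuestionType.COMMISSION_RECIPIENT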
Once we determined an execution pathway by checking values from the fetched data against a number of predefined use cases, we retrieved the additional data required to answer the target question. This included the expected broker and their agency alignments, policy attribution information, and other data points like broker FFM certification and appointment status. From this data, we extracted the relevant portions to supply to the LLM as context and post-processed them into a format better suited for model interpretation: currently, comma-separated rows representing an abbreviated picture of the source database records.
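The serialization step can be sketched with the standard csv module; the helper and field names here are illustrative:

import csv
import io

def records_to_context(records: list[dict], fields: list[str]) -> str:
    # Keep only the relevant columns and emit comma-separated rows:
    # an abbreviated picture of the source database records.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

# e.g. records_to_context(alignments, ["broker_id", "agency_id", "effective_date"])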
Finally, we supplied a commission question, the relevant context data, and a set of business rules to the LLM, and asked it to answer the question using the data provided. Business rules are represented in natural language and stored in Oscar’s dynamic configuration system (called Configutron). This configurability lets us support versioned rule sets, facilitates collaboration on prompts across engineering, product, and operations, and speeds up development workflows since rules can be updated without deploying code changes. The resulting response is surfaced back to the user, along with the context data used to derive the answer; this data can serve as a reference point for cross-validating the LLM-generated response.
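Putting it together, the prompt assembly might look something like the sketch below. The rule text, context rows, and IDs are dummy values; in production, the rule set would come from the configuration system under a versioned key:

def build_prompt(question: str, context_csv: str, business_rules: str) -> str:
    # Combine the natural-language rule set, the context rows, and the
    # commission question into a single instruction for the model.
    return (
        "You are helping explain a commission payment outcome.\n\n"
        f"Business rules:\n{business_rules}\n\n"
        f"Context data (CSV):\n{context_csv}\n"
        f"Question: {question}\n"
        "Answer the question using only the data provided."
    )

# Illustrative values only; real rule sets are longer and versioned.
rules = "A broker must hold an active FFM certification for the plan year."
context = "broker_id,ffm_certified,appointment_status\nBRK-001,false,active\n"
print(build_prompt("Why was broker BRK-001 not paid for 2024-03?", context, rules))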
Below is an example of a case and the AI bot’s response. We are currently using dummy IDs since we are in a pilot phase, but we will move to more human-readable IDs in the future:
1. Input: a given broker, the person the broker believes should be paid the commission, and the commission month in which they expect a payment.
2. The AI bot returns a response and the relevant contextual data to show its proof of work.
3. Operations can then use that response and contextual data to quickly and accurately respond to the broker.
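For illustration, a pilot request and response might be shaped like this; every ID and value below is a dummy:

# Hypothetical shapes for the flow above; all IDs and values are dummies.
request = {
    "broker_id": "BRK-12345",
    "expected_recipient_id": "BRK-12345",  # who the broker believes should be paid
    "commission_month": "2024-03",
}
response = {
    "answer": "No commission was paid for 2024-03 because the broker's FFM "
              "certification lapsed before the policy's effective date.",
    "context_data": "broker_id,ffm_certified,certification_end_date\n"
                    "BRK-12345,false,2023-12-31\n",
}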
Findings
Our current sample size is small, as this is just a pilot, but we’re already seeing clear signs of greatly improved accuracy compared to manual review by operations.
After comparing the current performance of our AI bot to the manual reconciliations conducted by operations in 2023, we've observed a significant improvement. We define an “accurate case” strictly as one that is completely correct, down to the minor details, and does not require a second review by a higher-level expert. While Oscar is largely accurate in reconciling commissions, some minor errors that do not impact the outcome of the reconciliation may still require further manual review. The AI bot's first-pass accuracy rate, where the case would not require an escalation, is 19 percentage points better than that of the initial manual checks. In further test scenarios, the bot’s accuracy has increased by as much as 38 percentage points.
Areas for improvement and next steps
Refine prompts and handle additional cases: Iterate on prompts for conciseness, clarity, and detail when expressing business rules. Handle cases where the LLM falls short in interpreting data (e.g., date arithmetic when handling grace periods; one mitigation is sketched after this list). Examine failed test cases, determine the root cause of incorrect responses, and incrementally address these gaps.
Continue to build out a suite of test cases: Define more test cases derived from real-world examples, and work with operations to evaluate LLM outputs against human-produced responses.
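One way to address the date-arithmetic gap called out above is to pre-compute those values deterministically and hand the LLM conclusions rather than a math problem. A sketch, assuming a 90-day grace period:

from datetime import date, timedelta

GRACE_PERIOD_DAYS = 90  # assumed value for illustration

def grace_period_fact(premium_due_date: date, payment_date: date | None) -> str:
    # Do the date arithmetic in code so the LLM never has to.
    grace_end = premium_due_date + timedelta(days=GRACE_PERIOD_DAYS)
    if payment_date is None:
        return f"no payment received; grace period ended {grace_end.isoformat()}"
    if payment_date <= grace_end:
        return f"payment on {payment_date.isoformat()} was within the grace period"
    return (f"payment on {payment_date.isoformat()} arrived after the grace "
            f"period ended on {grace_end.isoformat()}")

print(grace_period_fact(date(2024, 1, 1), date(2024, 3, 15)))
# -> payment on 2024-03-15 was within the grace period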
We are continuing to iterate on the prompt so that we can expand beyond the pilot phase to fully cover all commission reconciliation reasons, enabling operations to spend less time reconciling commissions and to provide faster responses to our brokers.