OscarAI Blog
In the spirit of constantly learning, we are experimenting with several AI use cases and documenting our discoveries.
Inside our Latest Hackathon
We recently wrapped up our latest Hackathon, an event that brought together Oscar employees from various teams to brainstorm new AI use cases and explore how our tech can enhance the healthcare experience for our members, providers, and brokers.
Now is the Time to Prepare Best Practices to Use AI Responsibly to its Full Potential
Research on AI is moving unpredictably fast. Though the use of AI holds great potential in the healthcare space to improve affordability, efficiency, and even outcomes, the stakes are inherently higher in use cases that involve patient care.
Oscar Health’s early observations on OpenAI o1-preview
Yesterday, OpenAI dropped a preview of a new series of generative models focused on reasoning, dubbed “OpenAI o1.” Our initial observations are that the “o1-preview” model is more autonomous in generating a set of steps to solve a problem, more precise, and able to handle more complex tasks with higher consistency in output.
Harnessing OpenAI to Enhance the Healthcare Experience
The healthcare industry is too complex, costly, and cumbersome. Since our inception, we have been at the forefront of using innovative technologies and leveraging our tech stack to simplify the insurance process and improve member outcomes.
Our team recently worked with OpenAI on a case study analyzing Oscar's use of AI across our business — and our opportunity to improve healthcare. We found success within the following use cases…
Needle in a Haystack, Part 2: Testing GPT’s Ability to Read Clinical Guidelines
Part 2 of 3
Our providers leverage a repository of guidelines, varying by state, that outline conditions under which a service will be covered. These guidelines are specific to a member’s policy and plan type, and it can be a lot of information for providers to sift through. Our team is streamlining this process by teaching GPT to help clinicians filter through paperwork, allowing them more time to care for patients.
Needle in a Haystack: Using LLMs to Search for Answers in Medical Records
Part 1 of 3
We are constantly thinking about how to make the workflows of clinicians less tedious, so they can do what they do best: serve patients and provide care. In this next series of posts, we will share our learnings from a different application with the same goal: improving clinician workflows to deliver faster, better care for our members.
Related Condition Search
In the first step of our ‘find care’ pipeline, Oscar has an omni-search bar that’s able to distinguish between reasons for visit, providers, facilities, and drugs. The bar uses lexicographic (i.e., close in dictionary order) techniques to find autocomplete matches from each group type and a rules-based model decides which results from each group should be surfaced.
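The lexicographic autocomplete described above can be sketched as a binary search over sorted terms per group, with a simple rule deciding which group's matches surface first. This is a minimal illustration, not Oscar's implementation; the group names, sample terms, and the "most matches wins" rule are all assumptions for the sketch.

```python
import bisect

# Hypothetical group types and entries; illustrative only, not Oscar's data.
GROUPS = {
    "reasons": ["back pain", "cough", "fever", "flu shot"],
    "providers": ["dr. alice chen", "dr. bob diaz"],
    "drugs": ["metformin", "metoprolol"],
}

def prefix_matches(sorted_terms, query, limit=3):
    """Return up to `limit` terms that extend `query`, found by
    locating the query's position in dictionary order."""
    lo = bisect.bisect_left(sorted_terms, query)
    out = []
    for term in sorted_terms[lo:lo + limit]:
        if term.startswith(query):
            out.append(term)
    return out

def omni_search(query, limit=3):
    """Collect autocomplete candidates from each group, then apply a
    toy surfacing rule: groups with more matches rank first."""
    query = query.lower()
    results = {g: prefix_matches(sorted(terms), query, limit)
               for g, terms in GROUPS.items()}
    ranked = sorted(results.items(), key=lambda kv: len(kv[1]), reverse=True)
    return [(g, m) for g, m in ranked if m]
```

A query like `omni_search("met")` would match only the drugs group here; a production rules-based model would weigh far more signals than match count.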
Curious Language Model Limitations
Language models are awesome and all, but my favorite research papers are those that show where they fail. It's easier to understand hard limits than soft capabilities. Here are four recent papers with good examples of the limits of current LLMs.
Why Notation Matters
A practical observation on LLMs that is both encouraging and worrisome: it is surprising how much NOTATION matters. Simply how you write something down makes it much easier or harder for a transformer to understand the actual meaning behind it. It’s a little like LLMs are able to read a quantum physics book, but only if it’s written in 18-point Comic Sans to look like a Dr. Seuss book.
Streamlining Commission Reconciliation: An AI Approach
Brokers are paid commissions by insurance carriers on a per-month basis, but they must follow certain guidelines to remain eligible; otherwise, they will not be paid their commission. If a broker believes they were incorrectly denied a commission, they can reach out to Oscar to ask for an explanation, a process we call a “commission reconciliation.”
AI Use Case: Messaging Encounter Documentation
Last year, we expanded our AI functionality to include a new use case: generating initial drafts for providers to document their secure messaging-based visits with patients.
Call Summarization: Comparing AI and Human Work
Summarization is considered a top use case for generative AI. Our call quality team ran a side-by-side evaluation on real member calls. The initial findings show that AI performs comparably to our Care Guides in summarizing calls overall, but this performance isn’t evenly distributed.
Enforced planning and reasoning within our LLM Claim Assistant
The Claim Assistant starts by formulating a strategic plan to tackle the problem at hand. This plan is an array of thoughts or subgoals, much like breaking down a large task into smaller, manageable pieces.
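The plan-then-execute loop described above can be sketched as: generate an array of subgoals, then resolve them in order while carrying intermediate results forward. This is a minimal sketch with stubbed functions; in the real Claim Assistant, `make_plan` and `execute_step` would be LLM and tool calls, and all names here are hypothetical.

```python
def make_plan(question):
    """Stub for an LLM call that decomposes a claim question into
    an ordered array of subgoals (the strategic plan)."""
    return [
        "identify the claim and its denial code",
        "look up the rule that produced that code",
        "explain the rule in plain language",
    ]

def execute_step(step, context):
    """Stub for an LLM/tool call that resolves one subgoal; the
    result is appended to shared context for later steps."""
    result = f"done: {step}"
    context.append(result)
    return result

def run_assistant(question):
    """Break the big task into manageable pieces, then work through
    them sequentially, accumulating intermediate results."""
    context = []
    for step in make_plan(question):
        execute_step(step, context)
    return context
```

The key design choice is that later subgoals see the outputs of earlier ones via the shared context, mirroring how a large task is broken into smaller, dependent pieces.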
A User-Centric Approach to Working with AI
Take a look at how user research helped inform our experimentation in two areas: automated messaging with members and clinical documentation for primary care providers.
GPT-4 Turbo Benchmarking
The pace of improvement of large language models (LLMs) has been relentless over the past year and a half, with new features and techniques introduced on a monthly basis. In order to rapidly assess performance for new models and model versions, we built a benchmarking data set and protocol composed of representative AI use cases in healthcare we can quickly run and re-run as needed.
Evaluating the Behavior of Call Chaining LLM Agents
We’re developing a GPT-powered agent designed to answer queries about claims processing. However, providing GPT-4 with sufficient context to respond to questions about internal systems presents a significant challenge due to the API request’s limited payload size.
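One common workaround for a bounded request payload is to chain calls: split the context into chunks that fit, and carry a running summary from one call to the next. The sketch below illustrates that pattern only; `llm_stub`, the chunk size, and the prompt format are assumptions, not Oscar's agent.

```python
def llm_stub(prompt):
    """Placeholder for a chat-completion call; echoes a truncated
    tail of the prompt so the chaining loop is runnable."""
    return prompt[-200:]

def chunk_text(text, max_chars):
    """Split context into pieces small enough for a size-limited payload."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def chained_answer(question, context, max_chars=1000):
    """Chain one call per chunk, feeding each call the running summary
    so the final call sees a compressed view of all prior context."""
    summary = ""
    for chunk in chunk_text(context, max_chars):
        prompt = (f"Summary so far: {summary}\n"
                  f"New context: {chunk}\n"
                  f"Question: {question}")
        summary = llm_stub(prompt)
    return summary
```

The trade-off is that each hop compresses earlier context, so details can be lost across long chains; evaluating that behavior is exactly what this kind of agent testing is for.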
AI Use Case: Electronic Lab Review
Oscar continues to experiment and iterate on clinical-AI-human use cases through Oscar Medical Group (OMG). OMG is a team of 120+ providers who offer virtual urgent and primary care for our members. It operates on top of Oscar’s in-house technology stack, including our internally-built Electronic Health Record (EHR) system.
A Simple Example for Limits on LLM Prompting Complexity
LLMs are capable of spectacular feats, and they are also capable of spectacularly random flame-outs. A big systems engineering challenge is figuring out how to tell one from the other. Here is an example of the latter.
Oscar Claim Assistant Built On GPT-4
Behind each health insurance claim are millions of combinations of variables that take into account a mixture of rules relating to regulations, contracts, and common industry practices, among other factors. When a doctor has a question about a particularly complex claim, they contact one of Oscar’s claim specialists, who interprets the claims system’s output. It’s a complex and labor-intensive process for health insurers.
Campaign Builder Actions
GPT enables new types of automation through Campaign Builder. This allows Oscar and +Oscar clients to deliver relevant interventions and intelligently monitor for signals to better serve members’ and patients’ clinical needs.