> we build in verification and deterministic checks, so customers can confirm it was done right
I didn't see this in your demo, how is this being implemented? You're entering fairly important information into EHRs like allergies and medications, that's more than just high-stakes, that's medical malpractice territory. I think you folks are really pushing the limits of the FDA's medical device exemptions for administrative work here. Are you working with the FDA on any kind of AI medical device certification?
What EHRs do you integrate with?
arcb 2 days ago [-]
Great questions. You're right that this is a high-stakes domain. Today, we only perform data entry in cases where we can deterministically verify that the information was correctly entered. Otherwise, we fail the task and flag it to the team. Re how: in the data entry case, we compare our source and destination data. For example, a JSON entry in our source must be present, without transformation, in the appropriate section of the EHR, verified by OCR. I'm taking a note to add this to our video. We also wouldn't take on anything close to diagnosis or treatment.
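To make that concrete, here's a toy version of that check in Python (the record, OCR text, and function names are illustrative, not our production code):

```python
# Toy deterministic check: every value in the source record must appear
# verbatim (no transformation) in the OCR'd text of the target EHR section.
# If anything is missing, we fail the task and flag it for the team.

def verify_entry(source_record: dict, ocr_text: str) -> list[str]:
    """Return the list of source fields that could not be verified."""
    return [
        field for field, value in source_record.items()
        if str(value) not in ocr_text  # exact match only, by design
    ]

missing = verify_entry(
    {"allergy": "penicillin", "severity": "severe"},
    ocr_text="Allergies: penicillin (severe)",
)
if missing:
    print(f"Verification failed, flagging for review: {missing}")
```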
We're also not operating autonomously: 100% of our outputs are reviewed by the clinical team shortly after entry, as part of their regular process. That feedback loop is essential, both for safety and for evolving the system responsibly.
Amongst EHRs, we currently work with Athena, though we do a lot of work on isolated file stores that our customers create for us.
ljm 2 days ago [-]
How do you deterministically verify that information was correctly entered?
Is it just
validates :email, format: :email
In Rails?
arcb 2 days ago [-]
It depends on the source and destination. The trickiest case is when we're using browser agents for data entry. We can use the fact that we focus on repetitive tasks to our advantage - we know what sections of UI we need to check, and for what data. We can verify correctness by re-extracting data from the destination UI (via the DOM or OCR) and checking:
- That all expected fields were entered correctly.
- That no unexpected or extraneous data was added.
When we have access to a direct data source (like an API or file store), verification is simpler — we can do deterministic checks and directly confirm the values.
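A stripped-down sketch of that two-way comparison (the field names and values are invented):

```python
# Two-way check: compare the fields re-extracted from the destination UI
# (via DOM or OCR) against the expected payload. Nothing missing, nothing
# extra, or the task fails.

def diff_entry(expected: dict, extracted: dict) -> dict:
    missing = {k: v for k, v in expected.items() if extracted.get(k) != v}
    extraneous = {k: v for k, v in extracted.items() if k not in expected}
    return {
        "ok": not missing and not extraneous,
        "missing_or_wrong": missing,
        "extraneous": extraneous,
    }

print(diff_entry(
    expected={"medication": "metformin", "dose": "500 mg"},
    extracted={"medication": "metformin", "dose": "500 mg", "notes": "n/a"},
))  # "notes" is extraneous -> ok is False, so we'd fail the task and alert
```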
We're also building a validation library for common field types (dates, contact info, etc.) to enforce constraints and avoid propagating unverifiable or out-of-range values.
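For a flavor of what those validators look like (the specific rules below, like the date window and US phone format, are illustrative assumptions):

```python
# Sketch of per-field-type validators; unknown types fail closed so
# unverifiable values never propagate.
import re
from datetime import date, datetime

def valid_date(s: str) -> bool:
    try:
        d = datetime.strptime(s, "%Y-%m-%d").date()
    except ValueError:
        return False
    return date(1900, 1, 1) <= d <= date.today()  # reject out-of-range dates

def valid_phone(s: str) -> bool:
    return re.fullmatch(r"\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}", s) is not None

VALIDATORS = {"date": valid_date, "phone": valid_phone}

def check_field(field_type: str, value: str) -> bool:
    validator = VALIDATORS.get(field_type)
    return bool(validator and validator(value))

print(check_field("date", "2031-01-01"))  # False: future date fails closed
```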
candiddevmike 2 days ago [-]
If you have an API, why are you using browser agents?
arcb 2 days ago [-]
We don't use browser agents if and when we have an API - we prefer the strongest data types we can access. It comes down to what our customers can work with. Some of them are fairly technical (have an IT team), and some aren't (have a legacy portal and operate on spreadsheets / paper).
sibeliuss 1 day ago [-]
Coming from a family of medical professionals I can now go back and tell them: just wait, the future looks bright.
This is one of the most grueling, labor-intensive, boring, error-prone, hate-my-job areas possible. It makes perfect sense for agents to perform these tasks, but as with everything else there will be a labor impact.
arcb 1 day ago [-]
We hope to be part of that brighter future! On the labor impact front, what we're seeing is that there is so much pent-up demand for care that any time we free up staff, they enable more throughput or more depth on casework. I hope we create more potential for humans to improve care because of our work.
sibeliuss 1 day ago [-]
This will certainly be the case, and the promise of this tech. Thanks for taking on this space.
Mashimo 2 days ago [-]
Oh wow, I work with hospital software and we have APIs for referrals (from general practitioners). Thank god.
What we're currently looking at is scheduling of staff, because somehow that involves different software (dr. vs. nurse), and the staff builds a spreadsheet and then enters it into other software. Totally whack how much time and effort they spend on that.
arcb 1 day ago [-]
I hear you. We constantly think of the value of clinicians' time. What could they be doing if they didn't have to do high-volume data entry? In one of our customers' cases, they instantly started helping doctors respond to patients directly.
noleary 2 days ago [-]
You mentioned your early customer in obesity medicine.
Are there specific kinds of clinics that are an especially good fit for you? Are you seeing any patterns in the kinds of clinics that are relatively eager to adopt an AI product like yours?
I don't have any feedback on what you're up to, I just think it's interesting!
arcb 2 days ago [-]
Thanks! From our side, we’re currently focused on specialty and multi-specialty groups. For example, obesity medicine, cardiology, pathology centers, radiology clinics... These groups tend to have repeatable workflows and a lot of operational toil. That makes them a good fit for automation. Even modest time savings let clinicians go deeper on casework, or see more cases. We're also working with smaller and medium-sized groups (think 10 to 100 doctors), since it helps us sit directly with clinicians and get high-quality feedback.
Compared to 3 or 4 years ago, clinicians are much more open to AI. They've heard of ChatGPT or ambient scribes, and they often come to us with specific ideas of what they want AI to solve. Talking to them is one of my favorite parts of the job.
That said, we also hear a lot of requests from groups that we have to turn down. Sometimes we can't guarantee success, or the product just isn’t ready for that use case. As an example, a ton of clinical interfaces only work on desktops, which we'd like to support but don't yet. We're hoping to grow into those over time.
terrib1e 2 days ago [-]
Are you hiring? I currently work at a large organization that administers govt contracts including the health marketplaces, so I have some insight into this stuff.
arcb 2 days ago [-]
We're looking to hire our first few engineers in the next few months, and we would love to talk to anyone who has an interest in working in this domain! If you'd like to talk more, we're at founders at bitboard.work and are fast to respond.
MK_Dev 2 days ago [-]
This is a pretty cool idea and implementation. Any more details on the tech stack you guys are using (besides `browser-use`)?
arcb 2 days ago [-]
Thank you! We have a fork of browser-use that lets us hand-hold web agents, since we know our tasks are repetitive. We can cache expected paths and fire alerts if we go off the rails. We'd love to contribute it back at some point; it's mainly a question of bandwidth.
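As a toy illustration of the path-caching idea (this isn't our fork's actual code, and the URLs are made up):

```python
# Record the URL sequence from a known-good run, then alert the moment a
# later run steps off that cached path.
CACHED_PATH = [
    "https://ehr.example.com/login",
    "https://ehr.example.com/patients/search",
    "https://ehr.example.com/patients/123/allergies",
]

def check_step(step: int, url: str) -> None:
    if step >= len(CACHED_PATH) or url != CACHED_PATH[step]:
        # Off the rails: halt the agent and fire an alert.
        raise RuntimeError(f"Unexpected navigation at step {step}: {url}")

for i, url in enumerate(CACHED_PATH):  # a conforming run passes silently
    check_step(i, url)
```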
We're evaluating Cua (https://www.ycombinator.com/companies/cua) to containerize our agents; am a fan so far. We're also putting Computer Use agents (from OpenAI and Anthropic) to the test. Many legacy ERPs don't run in the browser, and we have to meet them there. I think we're a few months away from things working reliably and efficiently.
We're evaluating several of the top models (both open and closed) for browser navigation (Claude's winning atm) and PDF extraction. Since we're performing repetitive tasks, the goal is to make our workflows RL-able. Being able to rely on OSS models will help a lot here.
We're building our own data sets and evaluations for many of the subtasks. We're using openai's evals (https://github.com/openai/evals) as a framework to guide our own tooling.
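For basic match-style evals in that framework, the samples are JSONL records with "input" chat messages and an "ideal" answer. A sketch of generating such a dataset (the task and data here are invented for illustration):

```python
# Emit samples in the JSONL format used by openai/evals' basic match evals:
# an "input" message list plus an "ideal" answer.
import json

samples = [{
    "input": [
        {"role": "system", "content": "Extract the medication name from the note."},
        {"role": "user", "content": "Pt started on metformin 500 mg BID."},
    ],
    "ideal": "metformin",
}]

with open("field_extraction.jsonl", "w") as f:
    for s in samples:
        f.write(json.dumps(s) + "\n")
```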
Apart from that, we write in TypeScript, Python, and Golang. We use Postgres for persistence (nothing fancy here). We host on AWS, and might go on-premises for some customers. We plan on investing a lot into our workflow system as the backbone of our product.
I prefer open source when possible. Everything's new and early, and many things require source changes that others might not be able to prioritize.
Edit - one thing I'd love to find a good solution for is reliably extracting handwriting from PDF documents. Clinicians have to do this a ton to keep the trains running on time, and being able to digitize that knowledge on the go will be huge.
Very open to ideas here. We're seeing great tools and products come up by the day, including from our own YC batch.
ahstilde 2 days ago [-]
very cool! What's the biggest thing you learned from Forward that you're applying here?
arcb 2 days ago [-]
Thank you! One of the big ones is that clinicians don't want more screens; they're already overloaded. So we're succeeding if we're invisible and yet effective for our customers. We're not dogmatic here - we can see customer-facing UIs being useful for focused and necessary use cases. For example, changing their process should be something close to an email or a chat with their agent, not a complex process builder whose runtime complexity they have to manage.
Another is that once you free up clinician time, they will quickly find higher-leverage tasks. It shows how overloaded the system is, and that there's pent-up demand to make it better.
beingalikhan 2 days ago [-]
How are you guys handling HIPAA compliance for AI agents? How is it that data in motion is secure?
arcb 1 day ago [-]
Great question. In the web agent case, we solely use HTTPS, and only between resources we either directly control (our servers) or whitelisted customer websites. A plain-HTTP connection fails the call, as does visiting a non-whitelisted link. A lot of our work happens away from the browser (in APIs and data stores), where we encrypt at rest and in motion.
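In rough Python pseudocode, that guard looks something like this (hostnames are illustrative, not our actual allowlist):

```python
# Refuse any navigation that isn't HTTPS or whose host isn't whitelisted.
from urllib.parse import urlparse

ALLOWED_HOSTS = {"portal.customer-ehr.example.com"}

def guard_navigation(url: str) -> None:
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise PermissionError(f"Blocked non-HTTPS request: {url}")
    if parsed.hostname not in ALLOWED_HOSTS:
        raise PermissionError(f"Blocked non-whitelisted host: {parsed.hostname}")

guard_navigation("https://portal.customer-ehr.example.com/patients")  # ok
# guard_navigation("http://portal.customer-ehr.example.com/")  # would raise
```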
mikeqq2024 2 days ago [-]
same question for HIPAA compliance
SuperNinKenDo 2 days ago [-]
I wish you a very be on the receiving end of your "services" and their consequences.