Tosijs-schema is a super lightweight schema-first LLM-native JSON schema library (npmjs.com)

46 points by podperson 76 days ago | 28 comments

podperson 76 days ago [-]

I wrote this library this weekend after realizing that Zod was really not designed for the use-cases I want JSON schemas for: 1) defining response formats for LLMs and 2) as a single source of truth for data structures.

7thpower 76 days ago [-]

What led you to that conclusion?

dsabanin 76 days ago [-]

Zod's validation errors are awful, the JSON schema it generates for LLMs is ugly and often confusing, the type structures Zod creates are often unintelligible, and there's not even a good way to pretty-print a schema when you're debugging. Things are even worse if you're stuck with zod/v3.

light_hue_1 76 days ago [-]

None of this makes a lot of sense. Validation errors are largely irrelevant for LLMs and they can understand them just fine. The type structure looks good for LLMs. You can definitely pretty print a schema at runtime.

This all seems pretty uninformed.

sesm 76 days ago [-]

What's wrong with Zod validation errors?

nerdponx 76 days ago [-]

And what makes this different? What makes it LLM-native?

podperson 76 days ago [-]

It generates schemas that are strict by default while Zod requires you to set everything manually.

This is actually discussed in the linked article (the README file).
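For readers unsure what "strict" means here: one common reading in JSON Schema terms is that every property is listed in `required` and `additionalProperties` is `false`. The sketch below writes such a schema by hand and checks an object against it. It does not use tosijs-schema or Zod; the `ObjectSchema` type, `strictUser`, and both helper functions are hypothetical names made up for illustration.

```typescript
// Hand-written illustration only; none of these names come from a real library.
type ObjectSchema = {
  type: "object";
  properties: Record<string, { type: string }>;
  required: string[];
  additionalProperties: boolean;
};

// A "strict" object schema: all keys required, no extra keys allowed.
const strictUser: ObjectSchema = {
  type: "object",
  properties: { name: { type: "string" }, age: { type: "number" } },
  required: ["name", "age"],
  additionalProperties: false,
};

// Keys present in the value that the schema does not declare.
function unexpectedKeys(schema: ObjectSchema, value: Record<string, unknown>): string[] {
  if (schema.additionalProperties) return [];
  return Object.keys(value).filter(
    (k) => !Object.prototype.hasOwnProperty.call(schema.properties, k)
  );
}

// Required keys that the value lacks.
function missingKeys(schema: ObjectSchema, value: Record<string, unknown>): string[] {
  return schema.required.filter(
    (k) => !Object.prototype.hasOwnProperty.call(value, k)
  );
}

console.log(unexpectedKeys(strictUser, { name: "Ada", age: 36, admin: true }));
console.log(missingKeys(strictUser, { name: "Ada" }));
```

With `additionalProperties` set to `true`, the first check would report nothing, which is the non-strict behaviour the comment contrasts against.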

halayli 76 days ago [-]

That's not true based on zod docs. https://zod.dev/api?id=objects

Most of the claims you're making against Zod are inaccurate. The README feels like false claims by AI.

podperson 76 days ago [-]

It seems to be true to me. And aside from the API stuff (because I am far from an expert user of Zod) all of this has been carefully verified.

podperson 76 days ago [-]

1. Zod's documentation, such as it is. 2. Code examples.

taveras 76 days ago [-]

Happy to see more tools in the data schema space.

Will you support Standard Schema (https://standardschema.dev)? How does this compare to typebox (https://github.com/sinclairzx81/typebox)?

kevmo314 76 days ago [-]

> For large arrays (>97 items) and large dictionaries

How did we end up in a world where 97 items is considered large?

vages 75 days ago [-]

Mind your off-by-1s: 97 items is not large, 98 is.

yunohn 76 days ago [-]

> It checks a fixed sample of items (roughly 1%) regardless of size

> This provides O(1) performance

Wouldn’t 1% of N still imply O(N) performance?

podperson 76 days ago [-]

N is increasing. O(1) means constant (actually capped). We never check more than 100 items.

SkiFire13 76 days ago [-]

Then it's not 1%, because if you have 100k items and you check at most 100 you have checked at most 0.1% of items.
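The arithmetic in this exchange can be checked with a short sketch. It assumes only what the thread states, a hard cap of 100 checked items; `itemsChecked` and `fractionChecked` are hypothetical helpers, not the library's code. A capped sample is O(1) in work, but the checked fraction shrinks as N grows, so it equals 1% only at N = 10,000.

```typescript
// Hypothetical helpers for the arithmetic only; not tosijs-schema's implementation.
// Assumes a hard cap of 100 checked items, as stated in the thread.
function itemsChecked(n: number, cap: number = 100): number {
  return Math.min(n, cap);
}

// Fraction of the collection that a capped sampler actually inspects.
function fractionChecked(n: number, cap: number = 100): number {
  return n === 0 ? 0 : itemsChecked(n, cap) / n;
}

for (const n of [50, 100, 10000, 100000]) {
  const pct = (fractionChecked(n) * 100).toFixed(1);
  console.log(`${n} items: checks ${itemsChecked(n)} (${pct}%)`);
}
```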

bbminner 76 days ago [-]

While llms accept json schemas for constrained decoding, they might not respect all of the constraints.

_heimdall 76 days ago [-]

Had you considered using something like XML as the transport format rather than JSON? If the UX is similar to zod it wouldn't matter what the underlying data format is, and XML is meant to support schemas unlike JSON.

podperson 76 days ago [-]

JSON Schema is a schema built on JSON and it’s already being used. Using XML would mean converting the XML into JSON schema to define the response from the LLM.

That said, JSON is “language neutral” but also super convenient for JavaScript developers and typically more convenient for most people than XML.

_heimdall 76 days ago [-]

Maybe I missed a detail here, sorry if that's the case!

Why would we need to convert XML, which already supports schemas and is well understood by LLMs, back to JSON Schema?

verdverm 76 days ago [-]

Because most of the world uses JSON and has rich tooling for JSON Schema. Notably, many LLM providers allow a JSON Schema to be part of the request when you want structured output.

_heimdall 75 days ago [-]

LLM providers allow sending any string of text though, right? In my experience the LLM understands XML really well, though obviously that doesn't preclude it from understanding JSON Schema.

verdverm 74 days ago [-]

No, it's more than just text now, and for the most part it's more than just an LLM now too. They are agentic systems with multiple LLMs, tools, and guardrails.

When you provide a JSON Schema, the result from the LLM is validated by the code in between, before it is passed on to the next step. Yes, the LLM is reading it too, but non-LLM parts of the system use the schema as well.

This is arguably much more important for tools and subagents, but also these things are being trained with JSONSchema for tool calling and structured output
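A minimal sketch of that "code in between" step, assuming a made-up two-field tool call: the raw model output is parsed and checked before anything downstream sees it. The `ToolCall` shape and `checkToolCall` are hypothetical; a real system would more likely hand the schema to a JSON Schema validator than hand-write the checks.

```typescript
// Hypothetical shape of a structured tool call; not from any provider's API.
type ToolCall = { tool: string; query: string };
type Result = { ok: true; value: ToolCall } | { ok: false; error: string };

// Validate raw model output before passing it on to the next step.
function checkToolCall(raw: string): Result {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return { ok: false, error: "expected an object" };
  }
  const record = data as Record<string, unknown>;
  for (const key of ["tool", "query"]) {
    if (typeof record[key] !== "string") {
      return { ok: false, error: `"${key}" must be a string` };
    }
  }
  // Strict: reject keys the schema does not declare.
  const extra = Object.keys(record).filter((k) => k !== "tool" && k !== "query");
  if (extra.length > 0) {
    return { ok: false, error: `unexpected keys: ${extra.join(", ")}` };
  }
  return { ok: true, value: { tool: record.tool as string, query: record.query as string } };
}

console.log(checkToolCall('{"tool":"search","query":"zod strict objects"}'));
console.log(checkToolCall('{"tool":"search"}'));
```

Returning a result object instead of throwing lets the surrounding pipeline decide whether to retry the model call or stop.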

yeasku 76 days ago [-]

LLMs are not people.

We want a format for LLMs or for people?

drowsspa 76 days ago [-]

As a person myself, I very much prefer JSON

_heimdall 75 days ago [-]

MCP isn't meant for humans though. I'm not sure why it matters what a human would prefer.

podperson 76 days ago [-]

JSON schema is very human readable.

_heimdall 75 days ago [-]

Why does that matter though? MCP is meant for LLMs, not humans, and for something like this lib it seems the human side of the API is based on JavaScript, not JSON.
