Launch HN: General Instinct (YC P26) – Frontier models on edge devices

14 points by guanming0717 1 hour ago | 7 comments

rohansood15 22 minutes ago

Have you benchmarked against other 3-bit dynamic quants like Unsloth? I'm sorry, but this framing against a full-precision, newer, smaller MoE just seems misleading. Also, Gemma-4-26B-A4B is not the SOTA for edge. Even at launch, that would be the 31B.

guanming0717 18 minutes ago

Yes, I did, against other SOTA quant methods like HQQ, AWQ, etc. You can find more info in our blog :) https://general-instinct.com/blog/frontier-moe-sub-4-bit

rohansood15 3 minutes ago

I can't find it. Can you state your performance versus comparable 3-bit quantization from Unsloth/Bartowski?

XenophileJKO 28 minutes ago

I'm still kind of surprised that people are targeting edge deployment of MoE models. By definition, they optimize for computation cost at the expense of memory efficiency. We generally need the opposite on the edge.

I'm hoping to see more work in the other direction, with cyclic/looped transformers and other memory-dense approaches.
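
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It assumes the "26B-A4B" naming means about 26B total parameters with about 4B active per token, that weights are stored at a flat 3 bits each, and that decode cost is roughly 2 FLOPs per active parameter per token. The dense 4B comparison model is hypothetical, and none of these figures come from the post or the linked blog.

```python
# Rough sketch: an MoE's memory scales with TOTAL parameters,
# while its per-token compute scales with ACTIVE parameters.
# All numbers below are illustrative assumptions, not measurements.

def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (ignores KV cache, scales, overhead)."""
    return total_params * bits_per_weight / 8 / 1e9


def flops_per_token(active_params: float) -> float:
    """Common rule of thumb: ~2 FLOPs per active parameter per decoded token."""
    return 2 * active_params


MOE_TOTAL = 26e9   # assumed total parameters of the MoE
MOE_ACTIVE = 4e9   # assumed active parameters per token
DENSE = 4e9        # hypothetical dense model with the same active size
BITS = 3           # assumed flat 3-bit quantization

moe_mem = weight_memory_gb(MOE_TOTAL, BITS)
dense_mem = weight_memory_gb(DENSE, BITS)

print(f"MoE weights:   {moe_mem:.2f} GB, {flops_per_token(MOE_ACTIVE):.1e} FLOPs/token")
print(f"Dense weights: {dense_mem:.2f} GB, {flops_per_token(DENSE):.1e} FLOPs/token")
print(f"Memory ratio at equal per-token compute: {moe_mem / dense_mem:.1f}x")
```

Under these assumptions the two models cost about the same compute per token, but the MoE needs several times more memory just to hold its weights, which is the constraint being pointed at here.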

VikRubenfeld 1 hour ago

You've likely heard about this - he'd probably like to talk to you and might potentially give you some good PR.

https://www.youtube.com/watch?v=rAzT5lcezPs&t=467s

smokel 33 minutes ago

For those too lazy to watch someone talk on video for ages to make a point:

The link is to a video by a famous YouTuber called PewDiePie, who uses a local LLM to parse his email and save time. They have an autoreply system and get notified about urgent matters.

guanming0717 1 hour ago

Thanks for sharing! I'd love to chat with him. Would you be open to introducing us? :)
