Next.js App Router + React Server Components Demo

NHacker Next

new
past
show
ask
show
jobs
submit

▲Launch HN: Cactus (YC S25) – AI inference on smartphones (github.com)

88 points by HenryNdubuaku 12 hours ago | 42 comments

cco 12 hours ago [-]

I've been using Cactus for a few months, great product!

Makes it really easy to plug and play different models on my phone.

If anybody is curious what a Pixel 9 Pro is capable of:

Tokens: 277- TTFT: 1609ms 9 tok/sec

qwen2.5 1.5b instruct q6_k

Sure, here's a simple implementation of the Bubble Sort algorithm in Python:

def bubble_sort(arr): n = len(arr) for i in range(n): # Flag to detect any swap in current pass swapped = False for j in range(0, n-i-1): # Swap if the element found is greater than the next element if arr[j] > arr[j+1]: arr[j], arr[j+1] = arr[j+1], arr[j] swapped = True # If no swap occurs in the inner loop, the array is already sorted if not swapped: break

# Example usage: arr = [64, 34, 25, 12, 22, 11, 90] bubble_sort(arr) print("Sorted array is:", arr)

This function sorts the array in ascending order using the Butbble Sort algorithm. The outer loop runs n times, where n is the length of the array. The inner loop runs through the array, comparing adjacent elements and swapping them if they are in the wrong order. The swapped flag is used to detect if any elements were swapped in the current pass, which would indicate that the array is already sorted and can be exited early.

HenryNdubuaku 12 hours ago [-]

Thanks for the kind words, we’ve improved performance now actually, follow the instructions on the core repo.

Same model should run 3x faster on the same phone.

These improvements are still being pushed to the SDKs though.

cco 9 hours ago [-]

Wow! 3x is huge.

I've had great experiences with gpt-oss20b on my laptop, a genuinely useful local model.

3x probably doesn't get my Pixel Pro 9 to being able to run 20b models, but its getting close!

HenryNdubuaku 8 hours ago [-]

Although GPT OSS 20B has 1.7B activated parameters which will be fast, 20B weights is a lot for developers to bundle or consumers to download. That’s the actual problem.

ks2048 5 hours ago [-]

This paragraph is a bit confusing:

> While Cactus can be used for all Apple devices including Macbooks due to their design, for computers/AMD/Intel/Nvidia generally, please use HuggingFace, Llama.cpp, Ollama, vLLM, MLX. They're built for those, support x86, and are all great!

It reads like you're saying for all Apple devices (which would include iOS), use these other things.(?) For iOS, are you trying to beat performance of other options? If so, it would be helpful to include comparison benchmarks.

HenryNdubuaku 2 hours ago [-]

We are focused on phones and we did add some benchmarks and will add more. However, anyone can see performance for themselves with the repo directly.

pzo 11 hours ago [-]

FWIW They change license 2 weeks ago from Apache 2.0 to non commercial. Understand they need to pay the bills but lost trust with such move. Will stick with react-native-ai [0] that is extension of vercel aisdk but with also local inference on edge devices

[0] react-native-ai.dev

observationist 11 hours ago [-]

Open source for the PR, then switching to non-open licensing is a cowardly, bullshit move.

https://github.com/cactus-compute/cactus/commit/b1b5650d1132...

Use open source and stick with it, or don't touch it at all, and tell any VC shitheels saying otherwise to pound sand.

If your business is so fragile or unoriginal that it can't survive being open source, then it will fail anyway. If you make it open source, embrace the ethos and build community, then your product or service will be stronger for it. If the big players clone your work, you get instant underdog credibility and notoriety.

HenryNdubuaku 11 hours ago [-]

Thanks for sharing your thoughts. Honestly, I’d be annoyed too and it might sound like an excuse, but our circumstance was quite unique, it was a difficult decision at that time being an open-source contributor myself.

It’s still free for the community, just that corporations need a license. Should we make this clearer in the license?

typpilol 10 hours ago [-]

Yes.

Just say that in the license.

HenryNdubuaku 10 hours ago [-]

Done, thanks, let us know anything else.

HenryNdubuaku 11 hours ago [-]

Understandable, though to explain, Cactus is still free for personal & small projects if you fall into that category. We’re early and would definitely consider your concerns on license in our next steps, thanks.

mdaniel 11 hours ago [-]

For fear of having dang show up and scold me, I'll just add the factual statement that I will never ever believe any open source claim in any Launch HN ever. I can now save myself the trouble of checking, because I can be certain it's untrue

I already knew to avoid "please share your thoughts," although I guess I am kind of violating that one by even commenting

theturtletalks 10 hours ago [-]

I agree, I've seen so many products start open source to gain traction, get into YC, and then either go closed source or change the license. That's a bait and switch and I appreciate the comment pointing it out.

I downloaded Cactus a couple months back because I saw a comment, but bait and switch like this makes we want to look for an actual open source solution.

HenryNdubuaku 9 hours ago [-]

The license change doesn’t affect you based on your explanation actually, the licence has been updated with clearer words. We really appreciate you as a user, please share any more feedback you have, thanks.

theturtletalks 9 hours ago [-]

I don’t appreciate you dismissing my claim. When I installed Cactus chat months ago, the company was claiming that Cactus chat would allow users to connect to other apps on their device and allow them to be controlled by AI.

Your license change goes against that. You say it’s free for personal use but how many times do people create something for personal use and monetize it later? What if I use Cactus chat to control a commercial app? Does that make Cactus chat use “commercial”?

HenryNdubuaku 10 hours ago [-]

It’s absolutely fine to share your thoughts, that’s the point of this post, we want to understand where people’s heads are at, it’s what determines our next decisions. What do you really think? I’m genuinely asking so I don’t think mods will react.

trollbridge 7 hours ago [-]

Here’s an example of what I want to do: ship our application entirely open source/free (AGPL3), but with options for interested parties who want to pay us for support/consulting to do so. Likewise, we want interested parties who want to build their own proprietary app on top of our stack to be able to do so.

Mixing in a “you have to pay if you’re a corporation” licence makes this difficult if not impossible, particularly if we wanted deep integration with eg Cactus. We don’t want to police a “corporation” who wants to use our open source software.

HenryNdubuaku 2 hours ago [-]

Thanks for pointing this out, another factor for us to figure out. We waive the license for such cases, wanna get in touch? I don’t think your consumers have to worry about the license.

giveita 6 hours ago [-]

Tried the android app but model download froze. Are you using the same docker-style repositories as Ollama. Because they suck. If you do I suggest use your own s3 instead.

HenryNdubuaku 2 hours ago [-]

We host on HuggingFace, were you able to get it to work eventually?

mritchie712 12 hours ago [-]

how many GB does an app packaged with Qwen3 600m + Cactus take up?

e.g. if I built a basic LLM chat app with Qwen3 600m + Cactus, whats the total app size?

HenryNdubuaku 12 hours ago [-]

400mb if you ship the model as an asset. However, you can also build the app to download the model post-install, Cactus SDKs support this, as well as agentic workflows you’d need.

VladVladikoff 12 hours ago [-]

How does this startup plan to make money?

HenryNdubuaku 12 hours ago [-]

Cactus is free for hobbyists and personal projects, but we charge a tiny fee for commercial use which comes with more features that are relevant for enterprises.

binary132 5 hours ago [-]

I couldn’t find a pricing page on your site. How tiny is tiny?

HenryNdubuaku 2 hours ago [-]

It’s custom for now as we are calibrating to see what works for everyone, wanna get in touch?

nextworddev 10 hours ago [-]

Will this drain my battery

HenryNdubuaku 10 hours ago [-]

This was one of the issues we set out to solve, so not as much as you’d expect.

cientifico 8 hours ago [-]

Can you clarify the following sentence:

> We are open-source (https://github.com/cactus-compute/cactus). Cactus is free for hobbyists and personal projects, with a paid license required for commercial use.

If it is open-source, one is free to distribute even for commercial use by definition. Which one is correct and what's your business model?

kvakkefly 8 hours ago [-]

Why do you believe open source means free to use and distribute commercially?

Cheer2171 2 hours ago [-]

Are you joking or just new? This is a foundational, bedrock principal of open source.

https://opensource.org/faq#commercial

ApolloRising 7 hours ago [-]

Would you consider adding a mode where it could go online if the user instructed it to?

HenryNdubuaku 2 hours ago [-]

You can add a web_search tool, checkout what these guys did with Cactus: https://anythingllm.com/mobile

joseph4521 5 hours ago [-]

AI Dungeon should contact you to make an offline mode again.

HenryNdubuaku 2 hours ago [-]

Ok, looking forward to it!

dcreater 11 hours ago [-]

Does it incorporate web search tool?

HenryNdubuaku 11 hours ago [-]

It can incorporate any tool you want at all. This company’s app use exactly that feature, you can download and get a sense of it before digging in. https://anythingllm.com/mobile

dcreater 11 hours ago [-]

The first picture on the android app store page shows Claude Haiku as the model

HenryNdubuaku 11 hours ago [-]

Thanks for noticing! The app is just a demo for the framework, so devs can compare the open-source models against frontier Cloud models and make a decision. We removed the comparison now so those screenshots indeed has to be updated.

apwell23 8 hours ago [-]

curious. what are the usecases for <100ms latency ?

HenryNdubuaku 8 hours ago [-]

Real-time video and audio inference.

asdfrgtfhgnjn 6 hours ago [-]

[dead]

Rendered at 03:51:02 GMT+0000 (Coordinated Universal Time) with Vercel.