What you learn by making a new programming language (ntietz.com)
tombert 155 days ago [-]
I've had two projects that end up being "oops, I made an interpreter".

It starts innocently enough, you just have a JSON that has some basic functionality. Then you decide it would be cool to nest functionality because there's no reason not to, so you build a recursive parser. Then you think it'd be neat to be able to add some arguments to the recursive stuff, because then you can more easily parameterize the JSON. Then you realize it might be nice to assign names to these things as you call them, you implement that, and now you're treating JSON as an AST and you're stuck with maintaining an interpreter, and life is pain.
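A minimal sketch of where that path ends up, assuming a made-up node shape (`op`, `args`, `name`, `value`, `body` are all hypothetical keys, not taken from any real project). Nesting, arguments, and named bindings are each only a few lines, which is exactly why it is so easy to slide into:

```python
import json


def evaluate(node, env):
    """Evaluate a JSON-encoded expression tree against a dict of names."""
    # Plain JSON scalars evaluate to themselves.
    if node is None or isinstance(node, (bool, int, float, str)):
        return node
    if isinstance(node, list):
        return [evaluate(item, env) for item in node]

    op = node["op"]
    if op == "var":  # look up a previously bound name
        return env[node["name"]]
    if op == "let":  # bind a name, then evaluate the body with it in scope
        child = dict(env)
        child[node["name"]] = evaluate(node["value"], env)
        return evaluate(node["body"], child)

    # Everything else is a call: evaluate the (possibly nested) arguments first.
    args = [evaluate(arg, env) for arg in node.get("args", [])]
    if op == "add":
        return sum(args)
    if op == "concat":
        return "".join(str(arg) for arg in args)
    raise ValueError(f"unknown op: {op!r}")


PROGRAM = json.loads("""
{"op": "let", "name": "x", "value": {"op": "add", "args": [1, 2]},
 "body": {"op": "concat", "args": ["x = ", {"op": "var", "name": "x"}]}}
""")

print(evaluate(PROGRAM, {}))  # x = 3
```

At this point the JSON file is an AST and `evaluate` is the interpreter you now have to maintain.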

rqtwteye 155 days ago [-]
This happened to me with an XML based rules engine I wrote. First I needed conditions, then I introduced variables, then loops, then if-then-else. When I needed to handle errors, I realized that I just had invented something like BASIC in XML format. The interpreter was surprisingly short and concise mainly because XML ensured I didn't have to do the parsing.

Switched to dynamically compiled C# eventually.

sjducb 155 days ago [-]
> and life is pain

So glad you finished with this. Right now I’m working with a guy who wants to write an interpreter…

prisenco 155 days ago [-]
Writing an interpreter for a project that isn't explicitly an interpreter is a code smell equivalent to microwaving fish.

It is an incredibly fun side project though.

bunderbunder 155 days ago [-]
I think this really depends a lot on the task as well as how you frame it.

A whole lot of the "parse, don't validate" ethos is potentially about handling incoming data with complicated formats by writing a minimalist interpreter that reads the input and outputs either guaranteed-valid domain objects or error messages. Most of the time you can get away with just a parser, but one occasionally runs into more complex formats that are easier to handle if you think of them as a declarative language that you handle using general-purpose interpreter implementation patterns. And it's just not a wheel worth reinventing.
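A small sketch of the simple end of that spectrum, with hypothetical names (`Port`, `ParseError`, `parse_port` are invented for illustration). The function never returns a "maybe valid" value: callers get either a domain object that is valid by construction or an error message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """A TCP port number known to be in the range 1..65535."""
    number: int


@dataclass(frozen=True)
class ParseError:
    message: str


def parse_port(raw):
    """Turn untrusted input into a Port, or explain why that is impossible."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool) or not isinstance(raw, (int, str)):
        return ParseError(f"expected int or str, got {type(raw).__name__}")
    try:
        number = int(raw)
    except ValueError:
        return ParseError(f"not an integer: {raw!r}")
    if not 1 <= number <= 65535:
        return ParseError(f"out of range: {number}")
    return Port(number)
```

Code downstream that accepts a `Port` no longer needs to re-check the range; the more complex formats described above apply the same idea with an interpreter in place of `int()`.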

The "very clever JSON" example the parent poster gives isn't particularly farfetched. The description reminds me of, for example, every Jetbrains REST API I've ever had to interact with. Also the data file formats for some commercial applications I've had to interact with. I don't like formats like this, but when I'm stuck with them, choosing the most maintainable option from the choices I've been left with is not a code smell; it's just sound engineering.

diggan 155 days ago [-]
> Right now I’m working with a guy who wants to write an interpreter…

Stuff like that works out great if you have a static set of features that will never change and you can at one point say "it's done", and won't have to touch it again.

Problem is that features are almost never static across their lifetime, and some poor sucker has to modify it at one point.

__s 154 days ago [-]
Skip straight to Lua
gavinhoward 155 days ago [-]
See also the Configuration Complexity Clock: https://mikehadlow.blogspot.com/2012/05/configuration-comple... .

This is why I have a separate config language and code language.

My config language is essentially JSON with newline separators and a first-class binary type (base64). I added little else.

When I get the temptation to add code to it, I just pull out my other, general-purpose language instead.

aapoalas 155 days ago [-]
I'm basically in charge of maintaining and developing a product that (on purpose) started at 9 o'clock. We've resisted the call to implement loops and such in our DSL but I can hear the wolves howling and I doubt I have much longer...
everforward 155 days ago [-]
I've hit this point a few times, and I always advocate for cutting the losses and switching to embedding an existing interpreter. Lua and Lisp both have embeddings for most popular languages, Starlark is Python-y and embeddable, many languages can embed a Javascript engine.

Imo, users asking for stuff like loops is a signal that they want a real language and not a bolted-on DSL. I've seen far too many cases of "extending a DSL until it's a full-fledged but awkward and horrible language".

Ansible's YAML and Terraform's are my prime examples. Both have grown into basically full languages featuring imports, for loops, etc, and both suck to use because of how awkward reaching for them is. I don't want to have to remember Ansible's bastardized, YAML-encoded for loop syntax, just let me use a Python for loop.

pxc 155 days ago [-]
I think the world needs programmable config langs because slinging YAML and JSON quickly becomes miserable (as is extending them through templating alone) and general-purpose programming languages usually have shitty ergonomics for writing configuration.

The only question in my mind is whether our common config langs should be Turing-complete (e.g., Nix, Nickel, Jsonnet, Pkl) or not (e.g., HCL, CUE, Starlark, Dhall), which I think can only be determined through experience over the long term.

__MatrixMan__ 154 days ago [-]
If possible, I'd say it's best to just stay at 12:00.

Let them be "hard coded" in ~/.config/myapp/config.py or such so that it's very clear which edits should only be done by a wizard and which ones are suitable for a user.

This way all your users are suitably placed on a slippery slope where one edit in the same language but to another file will transition them from user to contributor.

Jail them in a separate language and they'll stay users forever.

Besides, config languages never have the level of polish (re: tab completion and static analysis) that the main language does. Why deny your users the niceties that you set up for yourself?

pxc 154 days ago [-]
> Besides, config languages never have the level of polish (re: tab completion and static analysis) that the main language does.

They don't have to!

> Why deny your users the niceties that you set up for yourself?

Don't do this, of course. :)

For instance, Nickel gained[1] an LSP implementation years before it hit version 1.0. When editing Nickel configs, you get all the stuff you'd expect for a programming language: syntax highlighting, type hints, autocompletion, etc. But beyond static analysis, Nickel can also validate your configurations dynamically, according to arbitrary predicates-- and that too, integrates with your editor through the language server. Nickel ships with a command-line evaluator, REPL, pretty printer for data structures, code formatter, documentation generator, and CLI tab completions for Bash, zsh, Fish, PowerShell, and Elvish. Upstream distributes the tooling (which is, Go-style, a single executable with subcommands) as a Nix flake, a Docker image, and static executables for Linux aarch64 and x86_64 (< 30 MB, for now at least). Nickel doesn't yet have a package manager[3], but other than that it honestly has a better tooling story than many general-purpose programming languages currently in widespread use.

CUE, an older (but not old) configuration language (very cool in its own right), likewise has great tooling. It ships with a command-line evaluator, a pretty printer, a formatter, package management based on Go modules (i.e., it has both a `mod` and a `get` subcommand), a few refactoring tools, and tab completions for bash, zsh, and fish.[4] It currently lacks a language server[5] but there is recent and ongoing work to enable that.

There's no reason that a configuration language can't have first-rate tooling!

There is a new crop of really innovative, powerful configuration languages out there. I hope they start catching on.

> This way all your users are suitably placed on a slippery slope where one edit in the same language but to another file will transition them from user to contributor.

> Jail them in a separate language and they'll stay users forever.

This, I admit, deeply appeals to me. It's one of the things I really like about the Nix language, wart-y though it may be: the tools you use to configure your NixOS computer are the same ones you can go on to use to extend NixOS. I think that's an enormous strength.

But I also like for simple configuration to look really simple, and not trouble you much with the details of the language (or, God forbid, configuring the runtime). Perhaps programmable configuration languages, then, are good fits for tools like build systems and packaging systems (where the 'upstream' code users might next want to play with is also essentially fancy, expressive configuration in a special domain), but embedded DSLs might be better for applications (and work for packaging and build systems, too).

Re: creating a continuous road for users to hack on the software they use, GNU is really trying to do something special there: with Guix you have application configuration, package management, and system configuration handled in a DSL in Guile Scheme. But then Guix itself is also written in Guile! And at the same time, Guile is the official extension language of the GNU project, so GNU applications all (hopefully?) have Guile bindings, so you can use the same language to plug into (hopefully?) all the other GNU applications on your system. It's a cool idea.

--

1: https://github.com/tweag/nickel/pull/405

2: https://www.tweag.io/blog/2024-05-16-nickel-programmable-lsp...

3: https://github.com/tweag/nickel/issues/1585

4: https://cuelang.org/docs/reference/command/

5: https://github.com/cue-lang/cue/issues/142

__MatrixMan__ 154 days ago [-]
> there's no reason that a configuration language can't have first-rate tooling!

Perhaps I shouldn't have made it about the configuration language's maturity. What I mean is that however mature it is, it's still yet another thing. At least from what I've been seeing lately, fewer tools > better tools.

Suppose that on first run you generate a config file like this:

    from configtypes import Strategy

    def strategy() -> Strategy:
        return Strategy.balanced
And then the user maybe changes it to

    from datetime import datetime
    from configtypes import Strategy

    def strategy() -> Strategy:
        # be aggressive on Tuesdays (weekday() counts Monday as 0)
        if datetime.today().weekday() == 1:
            return Strategy.aggressive
        return Strategy.balanced
But actually it's `Strategy.assertive` not `Strategy.aggressive`, so this is an invalid configuration. I'm sure you can set up an editing experience for the user which gives them a red squiggly when they try to reference a nonexistent value, or which gives them an error that indicates a line number when they try to use it. But those are things you want for your dev environment anyway, right?
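A rough sketch of the "error when they try to use it" half, under loud assumptions: `Strategy` is modelled here as a plain `enum.Enum`, the config source is passed as a string, and `load_strategy` is a hypothetical loader name. It hands `Strategy` to the config directly instead of importing the `configtypes` module from the example above.

```python
import enum


class Strategy(enum.Enum):
    balanced = "balanced"
    assertive = "assertive"


def load_strategy(source):
    """Run a Python config and return the Strategy it selects."""
    namespace = {"Strategy": Strategy}
    # Note: exec runs arbitrary code, which is the trade-off of config-as-code.
    exec(source, namespace)
    result = namespace["strategy"]()
    if not isinstance(result, Strategy):
        raise TypeError(f"strategy() returned {result!r}, expected a Strategy")
    return result


GOOD = "def strategy():\n    return Strategy.balanced\n"
BAD = "def strategy():\n    return Strategy.aggressive\n"

print(load_strategy(GOOD))  # Strategy.balanced
try:
    load_strategy(BAD)
except AttributeError as err:
    print("invalid config:", err)
```

The failure for the bad config is an ordinary Python `AttributeError`, so the traceback, editor, and linter the developers already use apply to it unchanged.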

And maybe the user is curious: "how is this used?". There are editor actions like "go to definition" or "find references" which they can use to find more information. Those will be interrupted if the language is different.

I just feel like whatever time you might spend on integrating the configuration language is going to be better spent on docs, error messages, or making it easy to summon a well-configured editor environment (which is the same both for devs or config authors).

---

I've experienced what you're describing with nix. It's magical. I haven't toyed with Guix yet, but I want to eventually write helix plugins (it uses a scheme for its plugin language, or it will...), so maybe I'll give Guix a try when I get to that "learn scheme" item on my todo list.

For now I'm focused on how to help users who aren't accustomed to navigating a project with files in more than one language (scientists mostly), and I just don't think the juice is worth the squeeze re: creating more than one world, one for them and one for "proper" devs. Maybe I'm overlooking the use case of large non-user-serviceable things because I wish things in general were smaller and more user-serviceable.

pxc 154 days ago [-]
> What I mean is that however mature it is, it's still yet another thing.

That's definitely true. And I didn't have scientific applications in mind as I wrote, either. I was thinking of domains where external DSLs for configuration are already predominant, and also of DSLs that I've enjoyed working with and their particular advantages. But everything you're saying makes perfect sense to me.

> And maybe the user is curious: "how is this used?". There are editor actions like "go to definition" or "find references" which they can use to find more information. Those will be interrupted if the language is different.

Right, the validation errors your config lang hands you might tell you what's wrong, but they won't be Python or R or $PROGLANG errors. It would be extra and special work to somehow get an editor to know how to jump between those two worlds, and I'm not even immediately sure how I'd try to do it if I wanted to.

> For now I'm focused on how to help users who aren't accustomed to navigating a project with files in more than one language (scientists mostly), and I just don't think the juice is worth the squeeze re: creating more than one world, one for them and one for "proper" devs.

I have to admit that I think you might be right, partial as I am to some fancy new config langs. But on some level it's already the case that those two separate worlds exist, right? Polyglot projects and companies are super common in software.

> that "learn scheme" item on my todo list.

I'm doing this right now in a real casual way with some friends! We're reading a chunk of The Little Schemer every other week. The book is easy, and that pace is gentle. If you're interested in joining or following along, feel free to DM me on NixOS Discourse.

__MatrixMan__ 154 days ago [-]
I think that we're each focusing on the right thing for the domains that we're thinking of, they're just different domains. When I first heard about Nickel I was like "wow, I'm gonna use this everywhere"--because I too get excited about fancy new languages--and I think that was premature. I'm looking forward to when the opportunity arises, but I don't think it'll become everywhere.

Config languages are definitely the way to go for system config and polyglot projects and maybe also places where the config/project divide is natural in some way such that users reasonably can spend their whole life writing only config code (kubernetes comes to mind). I just tend to lean somewhat radically away from those things... to such an extreme that I've been thinking about making the code editor (and all its various configs) a project-level dependency instead of a system config.

I know it sounds crazy, but if you bundle the tools needed to work on a project into that project, then wherever it ends up, the user has what they need to work on it, even if the internet is down: Little dev-focused operating systems bundled with each package so that the actual operating system needs to be interacted with hardly at all. Something something right to repair...

---

Re: The Little Schemer, I actually have a copy of that on my desk, I have spent less than 5 minutes on it. I'll find you on NixOS Discourse :)

pxc 154 days ago [-]
> I've been thinking about making the code editor (and all its various configs) a project-level dependency instead of a system config.

> I know it sounds crazy, but If you bundle the tools needed to work on a project into that project, then wherever it ends up, the user has what they need to work on it, even if the internet is down: Little dev-focused operating systems bundled with each package so that the actual operating system needs to be interacted with hardly at all.

I think it's totally sensible. Presumably some people will always like to bring their own editor, and for that, having direnv supply a language server, compiler or interpreter, etc., plus including a generic editorconfig file is nice. But why not go all the way and provide ready-to-use, pre-configured VSCodium and Neovim?

I'm planning on putting together something like that for my roommate who is a non-programmer but interested in learning to code. Right now he's working through Learn Enough Developer Tools to Be Dangerous to get some Linux and CLI basics down, then we're gonna do Linux From Scratch together and some programming (starting with The Little Schemer and hopefully ending with SICP). I'm making Nix environments for all the programming books I'm currently reading, and the Scheme one will bundle Emacs and DrRacket.

__MatrixMan__ 153 days ago [-]
I was also motivated in this direction by some friends I want to teach to code. There's so much for a newbie to overcome that has nothing to do with code.

What I'm struggling with right now is:

How do you make supplying your own vim, say, versus the one that's default for this project... How do you make that feel like supplying a nondefault "Editor" type argument to a function? I'm aware that flake inputs support that sort of trickery via input overrides from the CLI, but there's nothing about that (besides perhaps the name of the input) which ensures that you only put editors in that slot. And then correspondingly, editors are a bit too varied to be interchangeable like that. If a project defaults to VSCodium and I want to inject helix... that's like gonna take some hackery which prevents it from feeling like dependency-injected magic.

pxc 153 days ago [-]
I hadn't planned on something so neat. I was just thinking I'd put them on the path with nonstandard executable names like `pvim` or `pcode`, and then people could use them in combination or not, individually enable-able as devenv modules. One of them could be aliased to `edit` or something like that if configured as the default editor, though.

So the interface would look like

  editors.vim.enable = true;
  editors.vim.package = pkgs.neovim;
  editors.vim.plugins = [];
  editors.vscodium.enable = true;
  editors.defaultEditor = "vim";
and users would set or modify that in devenv.local.nix¹.

And the editors module would handle constructing the customized editor packages from the specified base packages, lists of plugins/extensions, included config, etc., as well as wrapping them so that they are called by the names `pvim`, `pcode`, `edit`, etc., and pointing them to gitignored RC files that live in-repo which users can edit.

--

¹ https://devenv.sh/files-and-variables/

gavinhoward 155 days ago [-]
Hate to burst your bubble, but because of `break` and early return, Starlark is basically Turing-complete.

https://gavinhoward.com/2024/03/what-computers-cannot-do-the...

pxc 154 days ago [-]
The Starlark spec discusses this, and I consulted it before I wrote my comment.

> It is a dynamic error for a function to call itself or another function value with the same declaration.

> This rule, combined with the invariant that all loops are iterations over finite sequences, implies that Starlark programs are not Turing-complete. However, an implementation may allow clients to disable this check, allowing unbounded recursion.

https://github.com/bazelbuild/starlark/blob/master/spec.md
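To make the quoted rule concrete, here is a toy sketch of a dynamic recursion check written as a Python decorator. It only illustrates the idea; it is not how any Starlark implementation does it, and `no_recursion` is a made-up name.

```python
import functools

_active = set()  # functions that currently have a call in progress


def no_recursion(fn):
    """Raise at call time if fn is re-entered, directly or via other functions."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if fn in _active:
            raise RuntimeError(f"{fn.__name__} called recursively")
        _active.add(fn)
        try:
            return fn(*args, **kwargs)
        finally:
            _active.discard(fn)
    return wrapper


@no_recursion
def total(items):
    # Iterating over a finite sequence is allowed.
    result = 0
    for item in items:
        result += item
    return result


@no_recursion
def countdown(n):
    # Self-calls are rejected by the wrapper on re-entry.
    return 0 if n == 0 else countdown(n - 1)
```

With re-entry forbidden and every loop bounded by a finite sequence, each call performs a bounded amount of work, which is the termination argument the spec is making.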

I do kind of agree with you about this, though:

> At that point, just give your users a `while` loop for usability's sake.

I'm generally in favor of powerful, purpose-built configuration languages, including Turing-complete ones. And the goal of assuring users that their programs will terminate at all is definitely aligned with a fuzzier goal, less formalizable, that their programs will terminate in a 'reasonable' period of time, whatever that means.

jolt42 155 days ago [-]
Personally I've seen 9pm be a holy grail, an exceedingly advantageous position to be in. But, it's exceedingly challenging to figure out what that DSL should and should not have as well.
CyberDildonics 155 days ago [-]
Lots of people learn the same lesson at some price.

Data and execution are two separate things and should remain as separate as possible.

The reason is that once you mix data and execution, suddenly you don't know what your data is until you execute it. Then you can only deal with it in the context of whatever tools you write and you can never just look at it straight.

On the other hand now execution depends on lots of data and is no longer modular or general/generic and so becomes a one off solution somewhere.

At some point everyone has the "what if" idea and hopefully it only burns them instead of lots of other people through poor design.

renox 154 days ago [-]
Nice theory, how do you do in the real world where you have a 'data' whose value depends on the OS used?
CyberDildonics 154 days ago [-]
What are you talking about exactly? This thread is about static data files. Did you think I was saying a program shouldn't have variables?
anonymoushn 155 days ago [-]
I'm pretty happy with "json scripting" for an implementation[0] of a card game[1] with relatively low rules complexity. For a time it could evaluate arithmetic expressions, but I got rid of that because it was a bit unwieldy. The main pain point is that it runs slower than I'd like, so I may end up porting it all to actual JavaScript functions or to Zig.

[0]: https://github.com/sharpobject/yisim/blob/master/swogi.json

[1]: https://store.steampowered.com/app/1948800/Yi_Xian_The_Culti...

tombert 155 days ago [-]
The first time I did this, I ended up writing stuff that could replace strings with regular expressions, do arithmetic, string concatenation, and a few other things. It was honestly kind of cool but it was horrible to maintain and I wish I had just kept the JSON as data.
axegon_ 155 days ago [-]
Likewise, I've done that more times than I'd like to admit. And I took it a step further with an LLM like a month and a half ago fenced behind a number of json and yaml instructions, containing conditions and validators. It works like a charm but yeah... Oops, I did it again.
082349872349872 155 days ago [-]
> Oops, I did it again

I played with `eval`, got lost in codegen: O(AB), maybe?

lallysingh 155 days ago [-]
Every program expands until it becomes a compiler or checks mail. Emacs's sin was doing both.
shortrounddev2 155 days ago [-]
> Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp

https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

karmakaze 155 days ago [-]
It's especially prevalent when expanding configuration capabilities.
efilife 155 days ago [-]
Is this a joke I'm not getting? Any examples of such software?
steeleduncan 155 days ago [-]
I think the point is that

- you write some piece of software

- then you add a straightforward configuration language

- then you add variables because you don't want too much copy/paste

- then you add if statements to allow conditional configuration

- then you add loops because you are sick of seeing configuration that consists of unrolled loops

- ...

At some point your configuration system is Turing complete, so could be considered a programming language. However, it was never designed as such, so it is a horrible one

TeX would be a good example of this. Ansible playbooks are (I believe) Turing complete, and YAML itself has such a huge spec, that if it isn't Turing complete and/or self aware already, I can't imagine it'll be long before it is

AtlasBarfed 155 days ago [-]
But if configuration is the example, a full fledged Lisp program as your configuration is TOO MUCH. Do you want configuration files functioning as malware vectors?

And what about Lisp is enduser friendly? Configuration files are intended to address people that are programming-lite to poweruser-not-programmer level. Lisp s-expressions are recursive tree structures with prefix ordering, which is a "great filter" for the IQ of users.

Greenspun's law, being a joke of Lisp users, actually indicates the inability of the Lisp community to understand why their language is niche to ultra-high-IQ people. If syntax patterns, formats, and programming languages survive for decades, there is a grassroots practical reason for their existence that ironically the ultra-smart all-knowing Lisp hacker just can't properly fathom.

everforward 155 days ago [-]
> And what about Lisp is enduser friendly?

It's not, it's developer-friendly. A basic Lisp interpreter is like a few hundred lines of code, so it's a very tempting system to embed. I think Lua is similar, but I haven't messed with it.

It's also powerful enough that most things aren't incredibly awkward to specify in it. YAML/JSON/TOML/etc have issues expressing repetitive things that can be solved in a for-loop/map in Lisp.

I don't even think Lisp should be an IQ filter. I wouldn't call myself a great or even good Lisp programmer, but the prefix syntax and s-expr's aren't all that hard to pick up. I don't think it's any more difficult than JSON, although it is far less common.

kazinator 154 days ago [-]
A fully bracketed prefix notation is unambiguous. It requires no knowledge of precedence rules. Small children can learn it. Your editor can indent it in a consistent way, no matter how you break it into multiple lines. You're almost never left wondering what element of the program is a child of what other element. You know the argument position of everything; there is no guesswork. If you misplace a parenthesis, the wrong indentation clues you in. Working with Lisp syntax requires few brain cycles; therefore, the syntax per se doesn't demand high intelligence.

If you can fog a mirror, you can probably edit Lisp.

pkkm 154 days ago [-]
> ultra-high-IQ people

I think that's much more stereotype than reality. People see all these parentheses and think that it must be a difficult language that requires a big mindset shift, like these purely functional languages with dependent types. In reality, it takes a week or two, plus an editor plugin that highlights matching parentheses, to get used to writing (function arg-one arg-two) instead of function(arg_one, arg_two). After that, it's smooth sailing. Just another decently designed language that's dynamically but strongly typed. Feels like writing Python, except with a bit of metaprogramming from time to time.

I think you're overestimating the importance of syntax and underestimating semantics. It doesn't take a very long program to get bitten by bad semantics. A few years ago, I was writing a game extension in Lua. It was only a couple hundred lines, and I still wasted hours debugging problems that would've been one minute fixes in Python or Lisp. All because of Lua's ridiculous idea of conflating arrays with hashtables, plus its habit of returning nil instead of raising an error when you do something that doesn't make sense.

jimbokun 155 days ago [-]
The joke is that C and C++ lack important features found in Common Lisp, so real world programs end up adding those features indirectly as the size of the program grows.
nosioptar 155 days ago [-]
Ratpoison, a tiling window manager for Linux, did this. The devs hit the point where they realized they were basically implementing parts of Lisp to allow more configuration. They ended up creating StumpWM in Common Lisp to be like Ratpoison if it allowed more configuration.
port19 154 days ago [-]
That's a cool piece of trivia
saghm 155 days ago [-]
I think the initial intent of the quote was to ridicule non-lisp users for reinventing the wheel, but to me it always reads as a fundamental misunderstanding of the fact that programming language constraints are _features_ rather than downsides.
jjtheblunt 155 days ago [-]
It's a famous quote and it is of course meant to be humorous, and a statement of how vast the functionality of common lisp is (huge spec and book)
pfdietz 155 days ago [-]
I'm not sure I'd call it "huge" anymore, when the C++ specification (at least, this one from 2020) is 1841 pages long.

https://isocpp.org/files/papers/N4860.pdf

jjtheblunt 152 days ago [-]
yeah fair point, agreed!
efilife 152 days ago [-]
This thread is very interesting. My question got downvoted. However, everyone gives completely different answers for the question, which means the answer isn't really that obvious. Reddit mentality
jpgvm 155 days ago [-]
Angry upvote.

Maybe its true failing though was stopping short of becoming the OS.

yjftsjthsd-h 155 days ago [-]
It's not its own kernel, but emacs can run on Linux as PID 1, at which point it rather seems like it should count as an OS. Given its affiliation, emacs/Linux probably still counts as GNU/Linux, but still...
01HNNWZ0MV43FF 155 days ago [-]
I say if it's not running as the kernel, it's not an OS.

Otherwise you could "boot to" any terminal app.

Nano? OS. Cat? OS. Echo? Believe it or not, OS. Good grief.

kbolino 155 days ago [-]
PID 1 has some unique responsibilities. In particular, any process that gets orphaned when its original parent exits is automatically reparented to PID 1, and so the process running as PID 1 must watch for unexpected SIGCHLD and clean up the zombies with waitpid or similar.
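A minimal sketch of that reaping duty using Python's standard `os` and `signal` modules on a POSIX system. The helper names are invented for illustration, and a real init has more to handle (signal forwarding, shutdown, and so on).

```python
import os
import signal


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) pairs for the zombies cleaned up.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none have exited yet
        reaped.append((pid, os.waitstatus_to_exitcode(status)))
    return reaped


def install_sigchld_handler():
    """Reap whenever the kernel reports that some child changed state."""
    signal.signal(signal.SIGCHLD, lambda signum, frame: reap_children())
```

The loop matters because several children can exit before the handler runs, and pending SIGCHLD signals are not queued one per child.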
striking 155 days ago [-]
One could argue (many have, in the form of a tedious copypasta) that Linux is not the operating system because an operating system needs both a kernel and a userland.

Why wouldn't `nano` count as a userland in the same way as anything provided by a typical distro today?

yjftsjthsd-h 154 days ago [-]
Well, cat and echo can't run other programs, so no they wouldn't qualify. I'd have to check the nano docs to see if it can spawn other programs, but I'd be shocked if it's any good at it. In stark contrast, emacs can and is good at running programs, including interactively in terminals that it manages itself. I'm happy to agree that the kernel is a big part of the OS, but a kernel alone does not an OS make.
randomdata 155 days ago [-]
What we do know is that the S in OS stands for system. A kernel alone does not a system make. There needs to be other components to round out an entire system. Why does Emacs not fit the bill?
saghm 155 days ago [-]
Maybe a potential use case for GNU Hurd?
lallysingh 148 days ago [-]
There are no use cases for Hurd.
bunderbunder 155 days ago [-]
In fairness, this is also every modern Web browser's sin.
Doches 155 days ago [-]
> It's special that we make our own tools

I've always taken this to heart, but not necessarily with programming languages. Any piece of software that helps run my business that I can reasonably make and maintain myself, I do. I build my own CI/CD app, orchestration/deployment tool, task planner, bug tracker, release note editing & publishing tools, blog editor, logstash/viewer for exceptions, etc.

Does building (and especially maintaining!) all of these tools take up a lot of time, and distract me from the primary business of building the software that I actually sell? Sure, of course it does. But it also keeps me fresh and forces me to expand the scope of ideas that I regularly work with, and keeps me from becoming "that guy who makes that one app" and who isn't capable of moving outside of his comfort zone.

And while that doesn't (yet) extend to building my own tools in my own languages, it certainly does extend to writing my own DSLs for things like configuration management or infrastructure. My tools may be homerolled and second-rate, but they're mine (dammit!) and -- this part is important -- no one can take them away from me or catch me out with a licensing rug-pull.

norir 155 days ago [-]
I think many people underestimate how easy it is to get started writing a language. It is a bit like improvising music: it's just one note followed by another note followed by another. Almost any intermediate level programmer can write a program that parses a hello, world program and translates it into the language they already know. Once you have hello, world, you add features. Eventually you realize you made mistakes in your initial design and start over but your second design will be better than the first because of the knowledge that you now have. After enough iterations, you will have a good language that you like (even though it won't be everyone's cup of tea).
sixthDot 154 days ago [-]
Very true. Working on a proglang is more like running a marathon than a 100-meter dash. Another thing that would surprise people is how few complex data structures or subtle algorithms are required. For example in styx-lang, my little baby language, there are literally no binary searches, no sorting, no hash maps, no hash sets, yet it is still fast because most of the time a compiler only has to deal with very small vectors (5 parameters, 10 enum members, 15 statements in a body, etc.).
codr7 154 days ago [-]
Yeah, the high stakes are part of the thrill, because it's oh so easy to paint yourself into enough of a corner to have to throw the whole thing away and start over.
JohnMakin 155 days ago [-]
One of the most fundamental experiences I ever had was attempting a graduate-level course at the end of a long series on compilers. You really get an eye-opening view of how languages are translated into the language the machine understands. After going through a few toy languages and then finally tackling creating a simple JVM, here is the #1 thing I would go back and scream at myself until I was blue in the face:

Make your initial grammar SUPER simple. Like, don't go guns blazing and try to add all these cool features you saw in another language or always thought about. Start stupid, stupid simple, get that working, then build on top of it.

lioeters 155 days ago [-]
This is why the Lisp syntax is a great candidate for an exercise in making your own language. For example, Make a Lisp. https://github.com/kanaka/mal

It's simple to lex and parse into an abstract syntax tree, so you can get on with exploring more interesting aspects of programming beyond the mere syntax. (Not to say that there aren't interesting aspects of grammar and innovative syntax, but those can probably be explored later on as macros.)

Last time I created a toy language, I implemented a C-like infix syntax but still used a Lisp evaluator at its core from a previous project.
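
To give a sense of how little code "lex and parse into an abstract syntax tree" takes for a Lisp-like syntax, here is a minimal sketch in Python. It is not taken from MAL; the function names (`tokenize`, `parse`, `read`) are my own, and it only handles integers, symbols and parentheses (no strings, quotes or comments).

```python
def tokenize(src):
    """Split source text into parentheses and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return one AST node.

    Lists become Python lists, integers become int, everything else
    is kept as a string and treated as a symbol.
    """
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        node = []
        while tokens and tokens[0] != ")":
            node.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing ')'
        return node
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return int(tok)
    except ValueError:
        return tok  # a symbol such as '+' or 'define'


def read(src):
    """Parse a single expression from a source string."""
    return parse(tokenize(src))
```

Calling `read("(+ 1 (* 2 3))")` gives a nested list, which is already the AST; an evaluator can then walk it directly.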

JohnMakin 155 days ago [-]
> It's simple to lex and parse into an abstract syntax tree,

There are some "cheats" for this, tools like ANTLR etc. that are good at generating parsers from a particular grammar. But of course, I think a beginner should try to do this on their own to get a feel for it.

Personally, for me, I do find writing parsers a little tedious and not the most fun part of making a language.

systemBuilder 154 days ago [-]
The difficulty in learning a language is proportional to the SQUARE of the number of BNF rules! Let that sink in. When last I looked, C had 120 rules and C++ had 250. C++ was already out of control and has a bunch of really stupid features that nobody with any intelligence uses for anything other than showing off (and let me tell you - there are A LOT of showoffs at Google!) Anyway, that's why C++ is 4x harder to learn than C ... I call it ... "Don's Law".
hoosieree 155 days ago [-]
Building an interpreter for myself, can confirm. Saying "no" to feature requests is especially hard because I'm the one requesting them!
maxbond 154 days ago [-]
> Make your initial grammar SUPER simple.

I would go further and say don't write a parser at first, unless what's novel and interesting about your language is its syntax. Use the configuration language of your choice (like TOML or YAML) to write your ASTs directly, so you can focus on writing the runtime/backend and playing with the semantics of your language (where the novelty probably is).

When you feel it's the appropriate time, you can circle back to the frontend and implement a parser. But if you're writing a DSL, a config language may well be good enough. An additional bonus is that your config language syntax will make it easier to write certain unit tests.

I've had several experiments in writing a new language, and the first few times I got completely bogged down in parsing. I learned a lot about parsing, which is great, but it made it difficult for me to get at the meat of a project.
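
A rough sketch of what I mean, in Python. I'm using JSON here only because it ships in the standard library; the same shape works in TOML or YAML. The node layout (`op`, `args`, `name`, `value`, `body`) is made up for this example and is not any standard format.

```python
import json

# The "source code" is the AST itself, written by hand as data.
AST_SOURCE = """
{"op": "let", "name": "x", "value": {"op": "add", "args": [2, 3]},
 "body": {"op": "mul", "args": [{"op": "var", "name": "x"}, 10]}}
"""


def evaluate(node, env=None):
    """Evaluate a hand-written AST node against an environment of variables."""
    env = env or {}
    if isinstance(node, (int, float)):
        return node  # literals evaluate to themselves
    op = node["op"]
    if op == "var":
        return env[node["name"]]
    if op == "add":
        return sum(evaluate(arg, env) for arg in node["args"])
    if op == "mul":
        result = 1
        for arg in node["args"]:
            result *= evaluate(arg, env)
        return result
    if op == "let":
        # Bind a name in a copy of the environment, then evaluate the body.
        inner = dict(env)
        inner[node["name"]] = evaluate(node["value"], env)
        return evaluate(node["body"], inner)
    raise ValueError(f"unknown op: {op!r}")


program = json.loads(AST_SOURCE)
```

Unit tests then become a matter of feeding small data literals to `evaluate`, with no parser in the loop.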

whartung 155 days ago [-]
For many simple languages, the most complex construct is the expression.

Lots of things come to light there. Lots of recursion/fun with stacks, operator precedence, the type system, parameter passing. Pretty much a good solid chunk of language is wrapped up in expressions.

Get expressions working, and the rest starts to readily fall into place.
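
As an illustration of how recursion and operator precedence meet in expressions, here is a small precedence-climbing evaluator in Python. It is only a sketch under my own assumptions: integer literals, the four left-associative operators in `PRECEDENCE`, and parentheses. The names are invented for this example.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    """Turn the input into a list of number, operator and parenthesis tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"bad input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def apply_op(op, a, b):
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    return a / b  # true division


def parse_atom(tokens):
    """Parse a number or a parenthesized sub-expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        value = parse_expr(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return value
    if tok.isdigit():
        return int(tok)
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_expr(tokens, min_prec=1):
    """Evaluate operators that bind at least as tightly as min_prec."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Left-associative: the right operand may only use tighter operators.
        right = parse_expr(tokens, PRECEDENCE[op] + 1)
        left = apply_op(op, left, right)
    return left


def calc(text):
    tokens = tokenize(text)
    value = parse_expr(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return value
```

Returning AST nodes instead of numbers from `apply_op` turns the same loop into a parser, and that is usually where the type system and parameter passing start to hook in.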

Hunpeter 155 days ago [-]
I've been working on a toy compiler on-and-off, which is basically just a "reinvent-the-wheel simulator" since I really haven't looked at much existing literature. A very janky, bug-prone part of it is a sort-of mini parser generator, which you can feed a dictionary of rules (as strings) to. This, while slowing down the compilation speed, has allowed me to expand the grammar incrementally from dead simple to more complex, which has been a nice thing.
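
For anyone curious what "feed it a dictionary of rules as strings" can look like, here is a much smaller sketch in Python than my actual thing. The rule format (alternatives separated by `|`, terminals in single quotes) and all names are assumptions made up for this example. It is a plain backtracking recognizer, so left-recursive rules will recurse forever and a quoted `'|'` terminal is not supported.

```python
RULES = {
    "expr": "term '+' expr | term",
    "term": "'(' expr ')' | digit",
    "digit": "'0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'",
}


def compile_rules(rules):
    """Split each rule string into alternatives, each a list of symbols."""
    return {name: [alt.split() for alt in body.split("|")]
            for name, body in rules.items()}


def match(grammar, symbol, text, pos):
    """Yield every position where `symbol` can end when it starts at `pos`."""
    if symbol.startswith("'"):
        literal = symbol[1:-1]
        if text.startswith(literal, pos):
            yield pos + len(literal)
        return
    for alternative in grammar[symbol]:
        yield from match_seq(grammar, alternative, text, pos)


def match_seq(grammar, symbols, text, pos):
    """Yield every end position for a sequence of symbols starting at `pos`."""
    if not symbols:
        yield pos
        return
    for mid in match(grammar, symbols[0], text, pos):
        yield from match_seq(grammar, symbols[1:], text, mid)


def recognizes(rules, start, text):
    """True if the whole text can be derived from the start symbol."""
    grammar = compile_rules(rules)
    return any(end == len(text) for end in match(grammar, start, text, 0))
```

Growing the grammar is then just a matter of adding entries to `RULES`, at the cost of speed since every alternative is tried in turn.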
Jeaye 155 days ago [-]
Anyone wanting to work on a new language is most welcome to help out on mine: jank. It's a native Clojure dialect on LLVM with C++ interop and all the JIT goodies one expects from a lisp.

jank is currently part of a mentorship program, too, so you can join (for free) and get mentored by me on C++, compiler dev, and Clojure runtime internals.

1. https://jank-lang.org/

2. https://clojureverse.org/t/announcing-the-scicloj-open-sourc...

aeonik 154 days ago [-]
Were you able to implement transducers at the core of Jank? Or did you end up sticking to the existing Java implementation as much as possible?
Jeaye 154 days ago [-]
Transducers, in Clojure, are implemented in Clojure, rather than in Java. In jank, the Clojure source for transducers is exactly the same. For example: https://github.com/jank-lang/jank/blob/a14f4d7c7e8097d5ab588...
aeonik 145 days ago [-]
You're right, my bad. I was thinking back to when Rich Hickey talked about how he wishes he could go back and bake transducers into the core of Clojure.

I can't find the reference right now though.

Might be a good Clojure 2.0 thing.

graypegg 155 days ago [-]
I think maybe a good middle ground is write an interpreter for an already spec'd esoteric language like brainfuck. [0]

It's really fun. Brainfuck specifically is great because there's a lot to optimize with only 8 total operations. (For example, multiplication has to be done as repeated addition in a loop, so make a multiply AST node! [1]) And you could knock out a (BF => AST => anything you want) compiler in an afternoon!

Bonus, there's a lot of really impressive brainfuck scripts out there. Nothing compares to seeing your compiler take in some nonsense ASCII, and spit out an application that draws the Mandelbrot fractal.

[0] https://esolangs.org/wiki/Brainfuck

[1] https://github.com/graypegg/unfuck/blob/master/src/optimiser...
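
To show the BF => AST step in miniature, here is a hedged Python sketch. It is not the code from the repository linked above; the node names (`add`, `move`, `out`, `in`, `loop`) are my own, and the only optimization is folding runs of `+`/`-` and `>`/`<` into single nodes. Recognizing multiply loops would be a further pass over this AST.

```python
def compile_bf(src):
    """Turn Brainfuck source into nested (op, arg) nodes, folding repeated ops."""
    root = []
    stack = [root]  # stack of open blocks; the innermost loop body is last
    for ch in src:
        block = stack[-1]
        if ch in "+-":
            delta = 1 if ch == "+" else -1
            if block and block[-1][0] == "add":
                block[-1] = ("add", block[-1][1] + delta)
            else:
                block.append(("add", delta))
        elif ch in "<>":
            delta = 1 if ch == ">" else -1
            if block and block[-1][0] == "move":
                block[-1] = ("move", block[-1][1] + delta)
            else:
                block.append(("move", delta))
        elif ch == ".":
            block.append(("out", None))
        elif ch == ",":
            block.append(("in", None))
        elif ch == "[":
            body = []
            block.append(("loop", body))
            stack.append(body)
        elif ch == "]":
            if len(stack) == 1:
                raise SyntaxError("unmatched ']'")
            stack.pop()
        # any other character is a comment and is ignored
    if len(stack) != 1:
        raise SyntaxError("unmatched '['")
    return root


def run_bf(ast, stdin=""):
    """Interpret the AST on a 30000-cell tape of wrapping bytes; return output."""
    tape, out = [0] * 30000, []
    ptr, read = 0, 0

    def execute(block):
        nonlocal ptr, read
        for op, arg in block:
            if op == "add":
                tape[ptr] = (tape[ptr] + arg) % 256
            elif op == "move":
                ptr += arg
            elif op == "out":
                out.append(chr(tape[ptr]))
            elif op == "in":
                # End of input is read as 0 in this sketch.
                tape[ptr] = ord(stdin[read]) % 256 if read < len(stdin) else 0
                read += 1
            elif op == "loop":
                while tape[ptr]:
                    execute(arg)

    execute(ast)
    return "".join(out)
```

Swapping `run_bf` for a function that prints C or assembly for each node is the "anything you want" part.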

mckn1ght 155 days ago [-]
Similarly, I’ve been thinking lately about forking Swift, because there’s a lot I love about the language, but also a lot of, IMO, unnecessary sugar and redundant surface area.
danielvaughn 155 days ago [-]
Last year I tried to build a language and I wholeheartedly agree - it's amazing how much it teaches you. My particular language was merely meant to be transpiled to other languages, so I didn't get into the runtime or compilation stuff. But I quickly learned why braces and ignoring whitespace is so important. I also had to think extremely hard and carefully about the exact syntax and what each token meant. It's a very rewarding intellectual activity.

One thing I'd like to add is that even though you can totally write your own parser, it's an absolute joy to use Tree-sitter:

https://tree-sitter.github.io/tree-sitter

I plug it every time I get a chance. It makes refactoring your grammar incredibly easy, and lets you just focus on your syntax.

cvoss 155 days ago [-]
I've dreamed about making my own language for about 10 years or so. Started out just messing around. My vision for what it would be and its purpose has changed over time. About 2 years ago, I "got serious" about designing and implementing it, though that doesn't mean I've spent a serious amount of time on it yet. But it's happening!

It's a language for the domain of writing and verifying formal proofs. Basically, I didn't enjoy the experience of working with the couple of proof assistants I tried, so I'm doing my own thing. My objective is to create a language where I can document "everything I know" about math, if for no other reason than to prove to myself that I know those things, and to return to that knowledge if it ever slips away.

It's so much fun!

alexwashere_ 155 days ago [-]
Sounds neat - got any example code you could share?
cvoss 155 days ago [-]
Not ready to do that yet. But I hope so one day!
zX41ZdbW 155 days ago [-]
It will be good to add "Programming Language Checklist" to the references: https://www.mcmillen.dev/language_checklist.html
rzimmerman 155 days ago [-]
I spent time on a compile-to-JS language and found it very rewarding: https://github.com/rzimmerman/kal

This was before async/generators were added to JS and callback hell was quite real. I wanted to shape it in the way I’d learned to program in Visual Basic. Very human readable. The result is no longer useful, but it was a fun goal to have the compiler compile itself.

kstenerud 155 days ago [-]
I haven't made a programming language (and never will), but I did build a BNF-inspired metalanguage for describing text and binary formats to scratch the itch of trying to describe a binary data format I was developing:

The metalanguage: https://dogma-lang.org/

It's even got a syntax highlighter: https://marketplace.visualstudio.com/items?itemName=ksteneru...

The binary format I wanted to describe: https://github.com/kstenerud/concise-encoding/blob/master/cb...

cardiffspaceman 155 days ago [-]
Landin wrote a paper called, "The next seven hundred programming languages."[1] The paper predicts quite a bit of the present. So I named my programming language, DCC.

[1] https://www.cs.cmu.edu/~crary/819-f09/Landin66.pdf

liamilan 154 days ago [-]
I built Crumb (https://github.com/liam-ilan/crumb) a year ago, before starting university. It completely changed the way I conceptualized programming as a whole. You start feeling deja-vu every time you open a new language, and the "ah-ha!" feeling you get when you see something in another language you had to think about when implementing your own is super rewarding.

A year later (this summer) I used Crumb to land my first job at a pretty cool startup! The payoff was way more than I could have ever expected.

morning-coffee 155 days ago [-]
The timing of this article is great for me as lately I'm fascinated by the Forth language and the simplicity behind its apparent strangeness. I've been tempted to start playing with similar ideas just for fun.

(https://ratfactor.com/forth/the_programming_language_that_wr... is a great read, btw.)

Lerc 154 days ago [-]
I still hope to make one, but a combination of ADHD, depression and so many things to do and learn keep getting in the way.

At my current rate I'll know everything and be ready to get started on the day I die.

But my feature wishlist is

First class functions, Garbage collection, Transparent parallelism, Explicit parallelism, Type inference with strong type consistency, Dynamically typed by annotation, Operator overloading (every language should either have vector/matrix operators or the ability to build them)

And a few more that I can't recall just now. I've made it hard for myself.

Failing that maybe just a hacked JavaScript without implicit type conversion between objects/strings etc. (the source of most "Wat?"), frozen array tuples, operator overloading. Implied "this." on identifiers defined and used inside class definitions.

CleanRoomClub 154 days ago [-]
As someone with similar mental health barriers, what strategies do you employ to overcome them?
Lerc 154 days ago [-]
I have not overcome them. Currently on dexamphetamine, my partner has noticed an improvement, me less so. Have been productive for the last week or two so making the most of it.

List making, planning, routines, and pretty much anything that follows someone saying "You just need to ..." Doesn't work for me, or for any of the other ADHD sufferers I know of.

maxbond 154 days ago [-]
I wouldn't say I've overcome anything either but I'll throw in my two cents.

Personally I have accepted a.) most of my projects will be unfinished when I die and b.) if I am just staring at an empty terminal session and nothing is happening, it's not going to help to heap abuse on myself and feel guilty for not being productive. The only path forward will involve accepting that work isn't happening and doing whatever it is I can find a spoon for. On really bad days, that means just going to sleep. The sooner I accept that the sooner I will find a way to the other side, and nothing is getting done before then.

Reading about Taoism and Zen has helped. In terms of productivity, I have appreciated the book The Creative Act: A Way of Being by Rick Rubin. It's an attempt to apply Taoist ideas in a modern context, but where a Taoist text would say "the sage does XYZ," Rubin says "the artist." It's about finding a way to continue being creative when doing so is painful.

Medication and therapy have been a mixed bag for me, but my friends have had to go through multiple medications and therapists, so I remain optimistic. I'm trying to get into the habit of exercising and meditating.

Wishing the best for you both.

mjhay 155 days ago [-]
> It will be a bad language, and that's okay

This same advice could be applied to most hobbies. It doesn't have to be good, and it certainly doesn't have to make money. It just has to be fun and rewarding. If you learn something, even better.

> Go Forth, make something fun

*golfclap*

loscoala 154 days ago [-]
I also came across FORTH through hacker news. I ended up developing a compiler that can translate FORTH into C.

For the interested reader:

https://github.com/loscoala/goforth

It was a great experience and I can only recommend trying to develop a programming language yourself.

stevekemp 155 days ago [-]
And here's my tutorial FORTH, based upon a thread from hacker news:

https://github.com/skx/foth

Forth is always appealing, whether literally, or in puns.

mjhay 155 days ago [-]
Very cool, I'll check it out. I've been meaning to give Go a spin.
atum47 155 days ago [-]
A long time ago I was stuck at the airport and I ended up writing an interpreter in Python. I stopped the project at arithmetic, so it was basically a fancy calculator. After that I saw a really interesting video about the Shunting Yard algorithm, so I gave that a go as well [1]. At some point I want to try to write a programming language. I know a little bit about assembly, but it is mostly theory; I haven't done much programming with it (only basic stuff, back in college), but I find it fascinating.

1 - https://github.com/victorqribeiro/shuntingYard
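
For readers who haven't seen it, the core of the Shunting Yard algorithm fits in a few lines. This is a Python sketch written for this comment, not the code from the repository above; it assumes the input is already split into tokens, and the operator table `OPS` (precedence plus associativity) is my own choice.

```python
# operator -> (precedence, associativity)
OPS = {"+": (1, "L"), "-": (1, "L"), "*": (2, "L"), "/": (2, "L"), "^": (3, "R")}


def to_rpn(tokens):
    """Convert infix tokens to reverse Polish notation (postfix)."""
    output, stack = [], []
    for tok in tokens:
        if tok in OPS:
            prec, assoc = OPS[tok]
            # Pop operators that must be applied before this one.
            while stack and stack[-1] in OPS:
                top_prec = OPS[stack[-1]][0]
                if top_prec > prec or (top_prec == prec and assoc == "L"):
                    output.append(stack.pop())
                else:
                    break
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise SyntaxError("unmatched ')'")
            stack.pop()  # discard the '('
        else:
            output.append(tok)  # operand
    while stack:
        top = stack.pop()
        if top == "(":
            raise SyntaxError("unmatched '('")
        output.append(top)
    return output
```

The postfix output can be evaluated with a single stack, which is why this pairs so well with a calculator-style interpreter.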

PodgieTar 155 days ago [-]
I made a little toy compiler for a university project many years back, and I agree with the article - it's quite a nice way to get hands on with syntax and helps you think a bit more deeply about what is actually happening.

https://github.com/Podginator/Rattle/tree/master

It used JavaCC, which I found to be a pretty simple way to get up and running.

I also worked a job that used yacc to create their own DSL for a piece of software. Same thing, really. Easy enough to get up and running, and messing around with.

tunesmith 155 days ago [-]
I've had a pet idea for a long time that is nonsensical but I still wish existed. I'd like a programming language that encodes the "why", like forces you to compile in the business reason for the code in question. And then it'd automatically survey you, and if your prior business assumptions are no longer true, then compilation would fail, forcing you to remove or rewrite until the code fits your "why" again.
pclmulqdq 155 days ago [-]
One project I started recently was a dice rolling discord bot. Halfway into it, I realized that I really wanted an AST for dice roll expressions to avoid a whole bunch of bugs and special cases, so that project is now mainly a compiler.

More seriously, I have been very tempted recently to make a programming language specifically for cryptography, but I am holding off until I can no longer stand assembly.
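
To illustrate the dice-roll AST idea (this is a toy Python sketch, not the bot's actual code), the version below parses expressions like `2d6 + 3` into a flat list of signed terms and evaluates them separately. The grammar, node shapes and function names are all assumptions made for this example.

```python
import random
import re

DICE_TOKEN = re.compile(r"\s*(\d*d\d+|\d+|[+-])")


def parse_roll(text):
    """Parse e.g. '2d6 + 3' into [(sign, node), ...].

    A node is ('dice', count, sides) or ('const', value).
    """
    text = text.strip()
    terms, pos, sign = [], 0, 1
    expect_term = True
    while pos < len(text):
        m = DICE_TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"bad input at position {pos}")
        tok, pos = m.group(1), m.end()
        if tok in ("+", "-"):
            if expect_term:
                raise SyntaxError("expected a dice term or a number")
            sign = 1 if tok == "+" else -1
            expect_term = True
        else:
            if not expect_term:
                raise SyntaxError("expected '+' or '-'")
            if "d" in tok:
                count, sides = tok.split("d")
                node = ("dice", int(count or 1), int(sides))
                if node[2] < 1:
                    raise SyntaxError("a die needs at least one side")
            else:
                node = ("const", int(tok))
            terms.append((sign, node))
            expect_term = False
    if expect_term:
        raise SyntaxError("expression ends early")
    return terms


def roll(terms, rng=random):
    """Evaluate parsed terms, rolling each die with rng.randint."""
    total = 0
    for sign, node in terms:
        if node[0] == "dice":
            _, count, sides = node
            value = sum(rng.randint(1, sides) for _ in range(count))
        else:
            value = node[1]
        total += sign * value
    return total
```

Keeping parsing and rolling apart is what removes the special cases: malformed input fails in `parse_roll`, and `roll` only ever sees well-formed nodes.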

lifeisstillgood 154 days ago [-]
I just realised I went down the rabbit hole yesterday. I needed to track work to be done on my work project, so I started writing Todos. And then grepped them, and started extracting things from the todos, like a dotted name (foo.component.feature), and now I have realised it's a DSL and I should have used lex, not regexes, and … it's a good idea though!
systemBuilder 154 days ago [-]
When I was in high school I learned about BNF and so i wrote a program that let you type in BNF rules and then it would run a recognizer on an arbitrary string to decide if the string met the BNF rules. I don't know if I could write that again, but it was a definite eye-opener and I learned a ton from that project ...
TrackerFF 155 days ago [-]
In woodworking you make tools all the time, namely jigs. So much of woodworking, whether it is making cabinets, musical instruments, art, or whatever involves making your jigs, templates, and a bunch of other stuff. Hell, even larger projects like complex workbenches.

You rarely make actual tools, though. It's unheard of that a woodworker goes on to make their own router, band saw, planer, jointer, chisels, etc. - but you can learn a ton by starting with the absolute bare basics, before investing a ton in expensive tools.

Kind of makes me wonder where this analogy fits (if at all) in the world of software engineering: Some tools are probably either too complex, or don't really make sense making, if you're going to use it to actually build something.

I mean, it is a good intellectual exercise for the curious, and you pick up a bunch of things along the way, but at some point it is probably good to ask yourself if your time is better spent on something else.

mikewarot 154 days ago [-]
Making a whole new software/hardware ecosystem is what happens if you keep going... behold... the BitGrid.

Turing Complete, Actually performant, and possibly crazy enough to work

[1] https://esolangs.org/wiki/Bitgrid

[2] https://github.com/mikewarot/Bitgrid

And my latest attempt at things, because Pascal isn't popular enough, includes C and a Web emulator

[3] https://github.com/mikewarot/Bitgrid_C

bitbasher 154 days ago [-]
I've been thinking about programming languages a bit and I agree. I was looking at Odin and some other languages.

I find it fascinating a language can be purpose-built, like Odin. It's a language that was built for game development. It got me thinking what a language built for web development may look like.

It also got me thinking, we have 'game engines' which are tools built around the same idea. Why don't we have web-CRUD engines? Sure we have frameworks, but we don't have entire purpose built applications for building fullstack applications.

elashri 155 days ago [-]
It has always been my dream to write an interpreter for my imaginary language that provides a more human-readable syntax for writing scripts. It is basically a language that compiles to bash, because bash is everywhere, but I really hate writing bash even though I write a lot of bash/zsh code.

Yes, I could use Python (or another scripting language) for that, but this is not as cool as writing your own programming language, and you don't always have access to a Python environment. Bash just runs everywhere (at least for me). It is also my motivation to learn about compilers and how they work.

news_to_me 155 days ago [-]
This is so true. I feel like I've learned more about programming in the last two years making Cassette[1] than in the past decade of professional software development.

Every developer should make a language at some point. I want to see everyone's weird, half-baked languages! I want to hear about what other people struggled with, and the tradeoffs they've made!

[1]: https://cassette-lang.com

giancarlostoro 155 days ago [-]
This looks neat. I tried VB.NET a few weeks ago, and felt like it was mostly pointless since almost all resources are done in C# and so you have to translate from C# to VB.NET anytime you want to do anything, which can be a bit of a pain.

But one thing I really liked was the "begin end" type of code blocks.

martyalain 154 days ago [-]
I learned that all programming languages come from the lambda calculus, created in the 30s, ten years before the computer era: a simple text rewriting machine using nothing but 2 operators, abstraction and application, working on a sea of words. Something like the Yin and Yang of the Tao. More at http://lambdaway.fr .
flobosg 155 days ago [-]
“Beautiful Racket” comes to mind: https://beautifulracket.com/
Moon_Y 153 days ago [-]
Similar to woodworkers making their own tools, as programmers, we have the ability to create everything from operating systems to tools. This privilege is rare in other fields, allowing us to get closer to the tools we use. This autonomy enables us to have a deeper understanding and mastery of the tools and technologies we rely on.
0x70run 153 days ago [-]
Are you a summary bot?
Moon_Y 153 days ago [-]
I just feel like this analogy is quite fitting.
looneysquash 155 days ago [-]
Are there some good resources for learning type theory?
throwaway17_17 154 days ago [-]
I would suggest Robert Harper's "Practical Foundations for Programming Languages". [1] This takes a relatively thorough approach to the development of language semantics, with particular emphasis on Type Theory. Additional work on Type Theory is scattered throughout academic work, but Harper is a good place to start. The implementations of 'normal' type systems in mainstream languages are for the most part just semi-performant, straightforward instances of the more formal algorithms presented in the literature. Searching for lectures by Neel Krishnaswami on YouTube will yield some decent results on developing Type Systems as well.

1 - http://www.cs.cmu.edu/~rwh/pfpl.html

codr7 154 days ago [-]
Highly recommended! And I do mean new. Because having an opinion is good, and validating it is even better. I find it to be mostly educational, humbling and fulfilling; but occasionally VERY frustrating because there's just no end of things to fix and improve.

https://github.com/codr7/sharpl

giancarlostoro 154 days ago [-]
Nice, I really want to see a Clojure equivalent on .NET where it's a Lisp language, but it fits in with its ecosystem.
neonsunset 154 days ago [-]
There is https://github.com/clojure/clojure-clr

(I cannot testify for usability, the project is alive but it seems to need more contributors to really get going)

IWeldMelons 155 days ago [-]
https://old.reddit.com/r/ProgrammingLanguages/ is a treasure trove for those who want to engage in the quest.
DeborahWrites 155 days ago [-]
Every ~3yrs an article like this floats across my screen.

Every time, I think it sounds SO fun.

And then do nothing.

Given this time I have the excellent excuse of a new job, I don't expect to break the trend this year.

But one day . . . one day . . .

bsnnkv 155 days ago [-]
My version of this was building my own tiling window manager (I also ended up building my own hotkey daemon with its own syntax and parser along the way)
kabdib 155 days ago [-]
Write the debugger first. Or at least, very early.

You're going to need one anyway . . .

allanren 155 days ago [-]
Seems the future of programming languages is natural language
itishappy 155 days ago [-]
Good news, you can make new natural languages too!
maxbond 154 days ago [-]
Well, it would be a constructed language if you did it intentionally, but your point stands in that you can make a new language.
AnimalMuppet 155 days ago [-]
No. No, I really shouldn't.

Yes, I might gain a better understanding of how things work. I can do that lots of ways; others may be a better use of my limited time. (For that matter, doing other things that don't give me a better understanding of how things work may be a better, more valuable use of my limited time.)

Yes, I could get a language that fits what I need better... in theory. In practice, it would be buggy, inconsistent, and incompletely implemented. It would not actually fit what I need better than existing languages.

whartung 155 days ago [-]
You should write a language for the same reason you should write a game.

It's not hard for a game to touch all sorts of corners of the computer world that routine work will never come near. Games are great because they're easy to focus on, you know pretty much precisely what you want it to do. They also tend to be made of small parts, yet extensible, in terms of adding features if you still have wind in your sails.

And if not, hey, you have a few more data structures under your belt. Who knows when those may come in handy again.

Languages are similar. Get past a few "hard parts" and they can be a fun ride.

For extra fun, put the language into your game. Dog food for everyone.

kemitche 155 days ago [-]
The article doesn't claim that you should make a language and actually use it - the author focuses on the learning experience. He makes valid points.

No, it's probably not the right learning experience for everyone (clearly, it won't be yours), but there's definitely value in it for some people. And if reading the article nudges a few more people to take that learning dive, I think that's a good thing.

ted_dunning 154 days ago [-]
Having written several toy languages, I think that the process is supremely educational in ways that other efforts are not. In particular, it gives you an awareness of how these things can go wrong so that you can see where your day job efforts can go wrong if extended too far in some direction. You always learn about distinctions that were never apparent before you got into the details.

Without the freedom to make a mess of things, you live too cautious a life. It's worth digging in to some of these efforts even if they do wind up a mess. It's not because the mess will be meaningful. It's because you will understand better what was good and bad about what you did and how you got there.

pc86 155 days ago [-]
How is using a language that is buggy, inconsistent, and incomplete any better than using a better language to write an app that is buggy, inconsistent, and incomplete?

Checkmate.

anta40 155 days ago [-]
Different context, though. I'm pretty sure that my Java/Go/Kotlin apps are buggy. So what, bugs always exist. Just fix them. But as a language user, it's not my job to fix compiler/RTL library bugs. Write a bug report on GitHub, instead.

But when you are designing a programming language and writing the interpreter/compiler, well you are on your own :D

AnimalMuppet 155 days ago [-]
Yes, my apps are buggy, inconsistent, and incomplete (as are all apps). But they're less bad than my language would be, because they're my day job, and my language wouldn't be.

So, not checkmate at all.

jbs789 155 days ago [-]
Umm if we are looking for analogies. I would prefer a castle built on rock rather than one built on sand.
TeMPOraL 155 days ago [-]
> rather than one built on sand

Well, that is exactly what your app is if it's built on modern "state of the art" tooling - you're probably pulling in a thousand trivial dependencies; every day, something gets updated, and you're rolling dice on whether or not your program will build.

Writing your own language to build your app in creates a lot of big problems, but not those particular problems.

PhilipRoman 155 days ago [-]
Better than a sandcastle built on a rock, I suppose...
napierzaza 155 days ago [-]
I remember when I was making music. It was really fun and engaging. Then I learned about microcontroller projects for making my own synthesizer. It was fun and it took me two years to build my synth. At the end of it I had not made music for long enough to have lost the knack, and I was sick of even learning how to use the synth because I was worried I would have to debug it. So I just didn't make music ever again. Why consume yourself with some sub project that goes nowhere?