Language | quietfanatic talks about programming

Just about every programming language has operators, right? An operator is a piece of syntax (or a function that looks like a piece of syntax) that operates on one or more terms. Those terms can themselves be composed out of terms and operators, of course. The most familiar operators are the arithmetical ones (+, -, *, etc.), but programming languages require other operators for a whole host of other things like function definitions, message passing, declarations, etc.

So what is the tersest operator? It is the operator that is the easiest to type. The easiest operator to type is the one you don’t have to type: the operator that isn’t there. When you stick two terms next to one another with no operator in between, that is called concatenation. Some languages are called Concatenative Languages because they believe their use of concatenation is the most fundamental use.

In a well-designed language, the things you type most often tend to be the shortest ones. So the invisible concatenation operator tends to be assigned to the thing you want to do the most often, or at least to something that happens commonly. Hence:

In Haskell and family, concatenation is used to apply a function.
In C and family, it’s used to declare a name with a type (though it’s parsed weirdly).
In Smalltalk and family, putting a word after something sends a message to it (calls a method on it).
In algebra, putting two letters next to one another means multiplication.
In regular expressions, putting two atoms together does sequential matching.
In Perl, putting two terms in a row is a syntax error.

Ironic, I know. Actually, Perl isn’t that bad, because it has noun markers. Leaving the ‘&’ sigil off of a function name makes it a prefix operator rather than a term, and subsequent terms are arguments to the function. Thus function calls are just as terse as in Haskell (but with the opposite precedence and associativity).
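To make “opposite associativity” concrete, here is a small Python sketch that groups a row of juxtaposed terms the way each language would. It is only an illustration: the helper names group_left, group_right, and show are made up, and it ignores how juxtaposition ranks against other operators.

```python
from functools import reduce

def group_left(terms):
    """Haskell-style application: f g x groups as ((f g) x)."""
    return reduce(lambda acc, term: (acc, term), terms)

def group_right(terms):
    """Perl-style prefix operators: f g x groups as (f (g x))."""
    return reduce(lambda acc, term: (term, acc), reversed(terms))

def show(tree):
    """Render a nested pair as a parenthesized string."""
    if isinstance(tree, str):
        return tree
    return "(" + show(tree[0]) + " " + show(tree[1]) + ")"

print(show(group_left(["f", "g", "x"])))   # ((f g) x)
print(show(group_right(["f", "g", "x"])))  # (f (g x))
```

So in Haskell f is applied to g and the result is applied to x, while in Perl f receives whatever g x returns.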

So this is one of the questions I am considering regarding my language (which I shall soon have to name). I am mostly fluctuating between C’s usage and Haskell’s usage. If I adopt concatenation for function calls, type annotations will require a ‘:’ like in ML. If I adopt concatenation for type annotations, function calls will require parens. Let’s contrive some sample code for both:

number_finder s:Str = {
    .match = find_number s
    .value : Int = match.parse
}

number_finder (Str s) = {
    .match = find_number(s)
    Int .value = match.parse
}

(Requisite background info: this is a prototype-based object system. Members of an object are made public by putting a ‘.’ before them when you declare them.)

Both uses of concatenation have their advantages and disadvantages, but we must also consider the nature of the language when deciding. In Haskell, pretty much every operation is a function call. But in this language, many calculations are performed with method calls instead, and an infix ‘.’ is almost as easy to type as a space. In addition, there will be less currying than in Haskell. So using concatenation for function calls won’t gain as much as it does in Haskell. On the flip side, using it for type annotations won’t gain as much as it does in C, because type annotations aren’t as necessary as they are in C (notice we left it off of .match).

The third alternative, I suppose, is using concatenation for method calls, like in Smalltalk and Self. However, this is a point at which I believe other linguistic concerns also start affecting the picture. Having a ‘.’ between method calls creates a feeling of high-precedence cohesion that a space would break. In addition, it would nullify the sweet syntax of using a prefix ‘.’ in a member declaration to make that member public. And again, ‘.’ is really easy to type.

What are the non-terseness factors affecting the other choices? One of the principles of clarity is to make the more important parts appear more prominent. When you’re declaring a member of an object, the name is usually the most important part, which suggests that value : Int = (...) is more clear than Int value = (...), because it puts the name first and all the names in the object will line up. But object members aren’t the only names you’re declaring. When you declare a function parameter, usually the type is more important than the name (at least, from an outsider’s perspective). This would suggest that find (Str area, Int num) is more clear than find (area:Str, num:Int). And you’ll probably leave types off of member declarations more often than function parameters.

A language could conceivably use concatenation for both things, provided all the types are predeclared (so the parser knows whether the left side is a type or a function), but I hate predeclarations.
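Here is a rough Python sketch of why that scheme needs predeclaration. It assumes a toy one-pass parser in which every statement is exactly two words, plus a made-up "type Name" statement for declaring types; parse_program and the built-in type set are hypothetical and are not real syntax from the language under design.

```python
def parse_program(lines):
    """Classify each two-word statement using only the types seen so far."""
    known_types = {"Int", "Str"}  # assumed built-in types
    result = []
    for line in lines:
        left, right = line.split()
        if left == "type":
            # Hypothetical declaration form: "type Point" registers a type.
            known_types.add(right)
            result.append(("typedef", right))
        elif left in known_types:
            # Left side is a known type, so this declares a name.
            result.append(("declare", right, left))
        else:
            # Otherwise the left side is assumed to be a function.
            result.append(("apply", left, right))
    return result

# Type declared first: "Point p" is read as a declaration.
print(parse_program(["type Point", "Point p", "sqrt x"]))

# Type declared afterwards: "Point p" is misread as a function call.
print(parse_program(["Point p", "type Point"]))
```

The second call shows the catch: without a predeclaration, the parser has no way to tell that Point is a type at the moment it reads "Point p".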

So what do you think? I’m leaning toward the C usage, because I think it looks a little cleaner; perhaps I’m just more used to it because I learned C++ before Haskell. Which reminds me that you shouldn’t underestimate historical conventions either. What should I use the tersest operator for? Or should I push the waterbed in a different direction like Perl does?

How do you go about doing that?

And now your language is already doomed to obscurity. What? Already? That’s not very fair! Let me explain why.

Inspired by someone I know who did design a rather successful programming language, I thought it would be the coolest thing to design an even better language. Because I’m hubristic like that, I guess. I’m pretty sure he won’t mind.

But I have realized that a programming language is a tool more than it is a work of art. Art can be created for its own sake. Tools must be created for a purpose. If the single thing you want to do is to create a programming language, that will determine the nature of the language you create. It will probably be a beautiful language. And it will be useless for any real programming. So nobody will use it except those few people who are interested in art languages. Humans are good at working toward a purpose; your end result will reflect your purpose.

Oh, but if you want to create an actually useful language, you don’t have to give up on that. To do that, you have to find a purpose for it, and pretend that was why you wanted to make the language all along. Humans are good at pretending.

Humans are also good at becoming what they pretend to be. In many of my other projects, I’ve had to wrestle with the problems presented to me by the underlying language. This is natural for a programmer, of course; no tool is perfect in every way. I’m sure some if not most programmers have dreamed of creating their own language at some time or other. Most of them are content enough to stick with what they’ve got. A few get frustrated enough to attempt to create their own language. A few of those succeed. I hope to be one of them.

So, what will be the purpose of my programming language? To be multi-purpose, of course!

Right, right.

In order to make a good tool you need to think of a more concrete goal than that. I can imagine some wise person saying “In order to design for everyone, you must design for yourself.” Well, it may not be strictly true, but being my own customer does ensure that my product is useful to at least someone. As large as this world is, there are probably people with the same needs as me, who could use my language.

So clearly, to be as relevant as possible, I need to be as needy as possible.

So here is the need I have come up with. It happens to match well with my other programming hobby. I want to create a video game with my programming language.

But there are already plenty of game-oriented languages out there, right? Not exactly. Those other languages like GML or Squirrel or Lua or ActionScript are scripting languages, made for the high-level specification of events that happen in a low-level engine that was written in C++ or something. I want my language to provide for every part of the system. Including, hopefully, the design process. So, to sum up my long-term plans, the language needs to:

Wrangle complicated data around at compile-time, like polygon shapes, item specifications, room specifications, tilemaps, etc.
Allow for easily writing up stateful actor logic.
Connect to C and C++ libraries for OpenGL and a physics engine (if I don’t write the latter myself).
Produce an executable that is fast and efficient, and that performs no dynamic memory allocations during the main loop, to attain realtime performance (I have this going in C++ and it is pretty fun to work around this constraint).
Let me create an IDE that serves as a level editor and image editor, and lets me manipulate various kinds of data as both code and WYSIWYG, using both a CLI and a GUI which is itself modifiable. Features like copying and pasting data and undoing will be implemented on the language level.

Now that you see it, you can admit that this list is much more daunting than the stated purpose of “game programming” would lead you to believe. In fact, these requirements are already beyond any other programming language I know of. Oh hey, just for good measure, let’s add in another insane requirement, though it’s kind of unrelated:

The core of the language should be flexible enough to compile to weird things like JavaScript. Not all the core features will be available there, of course (such as file IO and unboxed types), but those that are can be taken advantage of.

How on earth is the language going to do all those things, some of which are completely incompatible? The keyword is as above: flexibility. Flexibility in letting the core provide slightly different features in different environments, and in letting the programmer sandbox themselves into a more restricted system with a modified core. Believe it or not, I have fairly concrete ideas on how all of these things can work together and form an elegant language in the end. You’ll see more specifics here in the months and years to follow.

You are skeptical of my ambition, I can tell. I am approaching this as a constraint-solving puzzle, and it is definitely the largest puzzle I have ever taken on in my life. I have been thinking about it for two whole years already. Only now am I confident enough in my designs to show some of them off in public. This is a form of art that I am choosing to dedicate much of my time to.

The Tersest Operator

So You Want to Design a Programming Language