Separate parsing from reading shell input by jpco · Pull Request #263 · wryun/es-shell

jpco · 2026-03-16T05:49:15Z

NOTE: I've been a bit aggressive with unilaterally merging PRs lately, but this one will not be getting merged without explicit feedback.

The short version: This PR changes the $&parse primitive. Previously, $&parse took an optional set of prompts and used them to read and parse shell input, producing a parsed command. Now, $&parse takes a command and calls it one or more times to read shell input, which it then parses. %parse has been rewritten on top of the new $&parse in such a way that its outward-facing behavior hasn't changed.

The impact of all this is that it decouples prompting, reading, parsing, and writing to history. You can see that in these snippets from initial.es. In this first one, we define the %read-line function, which takes a prompt, prints it, and then reads a line. If it exists, the $&readline primitive can implement this function, or it can be done with an echo and a $&read.

if {~ <=$&primitives readline} {
	fn-%read-line = $&readline
} {
	fn %read-line prompt {
		echo -n $prompt
		$&read
	}
}

The second snippet uses this %read-line function, along with the new $&parse and the %write-history hook, to do all the work that would previously happen within $&parse:

let (in = (); p = $prompt(1))
unwind-protect {
	$&parse {
		let (r = <={%read-line $p}) {
			in = $in $r
			p = $prompt(2)
			result $r
		}
	}
} {
	if {!~ $#fn-%write-history 0 && !~ $#in 0} {
		%write-history <={%flatten \n $in}
	}
}

Doing this allows arbitrary flexibility in writing to history or prompting -- per-line history writing such as before #65 can be done by people who want that, or a different prompt could be used for each and every line. Something like zsh's TRANSIENT_RPROMPT could be added. You could even add a hook to transform read-in text before parsing it, in order to add !!-style history expansion or old-school string-replacement-based aliases.
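For readers who don't know es, here is a rough Python model of the pattern in the snippet above. It is an illustration only, not code from this PR: `parse_with_history`, `read_line`, `write_history`, and `transform` are hypothetical names, `parse` stands in for $&parse, and recording the untransformed text in history is an assumption of the sketch.

```python
from typing import Callable, List, Optional

Reader = Callable[[], Optional[str]]


def parse_with_history(
    parse: Callable[[Reader], Optional[str]],
    read_line: Callable[[str], Optional[str]],
    prompts: List[str],
    write_history: Callable[[str], None],
    transform: Callable[[str], str] = lambda s: s,
) -> Optional[str]:
    """Run one parse, handing the parser a reader that prompts and records."""
    seen: List[str] = []   # every line read during this parse
    prompt = prompts[0]    # first prompt for the first line

    def reader() -> Optional[str]:
        nonlocal prompt
        line = read_line(prompt)
        prompt = prompts[1]        # continuation prompt from now on
        if line is None:           # end of input
            return None
        seen.append(line)
        return transform(line)     # hook: rewrite text before it is parsed

    try:
        return parse(reader)
    finally:
        # Like the unwind-protect cleanup: runs even if parsing fails.
        if seen:
            write_history("\n".join(seen))
```

Because the reader is an ordinary function, a different prompt per line, per-line history writes, or a `transform` that does !!-style expansion are all changes to the wrapper and not to the parser.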

This PR also moves both $&parse and $&readline towards being "normal" primitives. In fact, with this PR, all the readline code in the shell is collected into a single file, readline.c. Adding support for an alternative library should only require adding similar primitives in a similar file and an es script to tie things together, which is a much simpler integration story than the dozen or so #if HAVE_READLINE blocks the shell has had till now. (This starts to connect to a potential concept of "extensible primitives", or even "modules", which is, in my opinion, a very interesting direction to take the shell. Es could inherit loadable modules from Inferno's sh, a shell that arguably descended from es!)

Improving the potential to swap out $&parse may sound strange, but could also have practical benefits. In addition to changes to internals (switching to a hand-written parser for better error reporting, or an incremental parser, etc.), swappable parsers could be a coarse way to allow certain changes to syntax.

A note on the particular API of $&parse -- why not go even simpler and have $&parse take an already-read string which it is to parse? This would make $&parse, effectively, a "push-style" parser, where input is read and then "pushed into" the parser by its caller. The major issue with this for es is that push-style parsers must be able to be called multiple times over the course of a single parse: it would be called with one line, return a value indicating it needs more input, and then would be called again with the next, and this would repeat until enough has been fed to it that it can return a fully-parsed statement.
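To make the push-style shape concrete, here is a toy Python sketch. It is not es's grammar: the hypothetical `PushParser` treats a command as complete once its braces balance, and returning `None` to mean "need more input" is an assumption of the example.

```python
from typing import List, Optional


class PushParser:
    """Toy push-style parser: the caller feeds it one line at a time."""

    def __init__(self) -> None:
        self._lines: List[str] = []
        self._depth = 0   # count of unclosed braces so far

    def feed(self, line: str) -> Optional[str]:
        """Return the finished command text, or None if more input is needed."""
        self._lines.append(line)
        self._depth += line.count("{") - line.count("}")
        if self._depth > 0:
            return None
        done = "\n".join(self._lines)
        self._lines, self._depth = [], 0   # reset for the next command
        return done
```

The partial parse lives in the `PushParser` object between calls to `feed`, so the caller has to hold on to that object until the command is complete.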

In es, the problem with this pattern is managing parser state across calls. How do we differentiate a second $&parse call for an in-progress bit of code from a second, unrelated $&parse call, especially given es lacks opaque handles, so we can't just say $&parse $parser ...? It is much simpler to use a "pull-style" parser, where we give $&parse a way to read input for itself, and have each $&parse call correspond with a single complete parse run.
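A pull-style parser can be sketched in Python as follows. This is a toy under stated assumptions, not the PR's implementation: the hypothetical `parse` function considers a command complete once its braces balance, and `reader` is any callable that returns one line, or `None` at end of input.

```python
from typing import Callable, List, Optional


def parse(reader: Callable[[], Optional[str]]) -> Optional[str]:
    """Toy pull-style parser: one call is one complete parse run.

    Calls `reader` as many times as needed and returns the command text,
    or None if input ended before any line was read.
    """
    lines: List[str] = []
    depth = 0   # count of unclosed braces so far
    while True:
        line = reader()
        if line is None:
            if lines:
                raise SyntaxError("unexpected end of input")
            return None
        lines.append(line)
        depth += line.count("{") - line.count("}")
        if depth <= 0:
            return "\n".join(lines)
```

All parser state is held in local variables for the duration of one call, so two unrelated calls cannot be confused with each other and no handle has to be passed around.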

One behavior change that comes with this PR affects the -i flag. Previously, the following would happen:

; echo 'echo hello world' | ./es -i
; echo hello world
hello world
; ;

Now, this happens:

; echo 'echo hello world' | ./es -i
hello world
;

The difference is that before, es would call the readline library if the shell was interactive and reading from stdin. Now, it calls the readline library if it's interactive and reading from a TTY. This is a corner case, and other shells aren't consistent on the behavior; I believe that the new behavior is easier to model and explain in the shell (given that the reader commands are always reading from their stdin, it breaks some abstractions to make them judge whether it's "really" the shell's stdin).
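The two rules can be written side by side. This is a Python sketch and not the shell's C code; both function names are hypothetical, and `os.isatty` stands in for whatever check the shell actually performs.

```python
import os


def use_readline_before(interactive: bool, reading_stdin: bool) -> bool:
    # Old rule: use readline whenever an interactive shell reads from
    # stdin, even if stdin is actually a pipe.
    return interactive and reading_stdin


def use_readline_now(interactive: bool, fd: int) -> bool:
    # New rule: use readline only when the input descriptor is a terminal.
    return interactive and os.isatty(fd)
```

With the read end of a pipe as `fd`, which is the situation in `echo ... | ./es -i`, the old rule still selects readline while the new rule does not.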

At least one potential follow-on improvement to consider: we may want to add some information for $&parse to pass to reader commands when calling them. One example would be something indicating whether a heredoc is being read, in order to do something like print a different prompt. I'm not in a hurry to add anything here; I think more time is needed to think of what might be included. (I also imagine that a richer API to a currently-running parser might eventually be useful for something like a tab-completion system, but also don't have any real design for that yet.)

$&newparse is like $&parse, except instead of taking prompt arguments and doing all the reading itself, it takes a reader command as its arguments and calls out to that to fetch input. Ideally $&parse should be replaced with $&newparse and then a lot of code can be cleaned up.

wryun · 2026-03-25T09:29:49Z

Having quickly read what you've written, this sounds good to me. But ultimately I think you'll have to live with being the only active contributor and make these kind of calls.

From my perspective, if you're primarily refining/reworking the internals I'm not overly concerned, particularly when you're adding tests. Ok, strictly primitives are user facing, but even advanced es users are going to touch these only once or twice.

jpco · 2026-03-25T17:21:44Z

> Having quickly read what you've written, this sounds good to me. But ultimately I think you'll have to live with being the only active contributor and make these kind of calls.

I just want to make sure I'm not trying to take things in a direction nobody else likes :) as long as I have some assent on the broad strokes, I'm happy.

> From my perspective, if you're primarily refining/reworking the internals I'm not overly concerned, particularly when you're adding tests.

Reworking the internals is my favorite part!

> Ok, strictly primitives are user facing, but even advanced es users are going to touch these only once or twice.

Yeah, the fact that this PR is user-facing is the thing that distinguishes it from #205, #223, #261, #262 -- even if we leave the %parse function backwards-compatible.

Primitive backward compatibility is something I feel a bit queasy about, and it's also relevant to #210. I think (hopefully for reasons better than the fact that it's convenient for me) that, at least at this point in es' development, primitives should be allowed to change backwards-incompatibly. The fact that the API of each primitive isn't really documented in the man page seems like a point in agreement with me. However, it's not exactly user-friendly to just do this without giving users a prior warning about what behaviors they can or can't expect to have breaking changes.

One of the goals with the "extensible primitives" stuff I've been thinking about is to help make this story better; I'd like to make it possible for users to pick which version of which primitives they want to use. But that's further out.

jpco · 2026-03-28T14:41:31Z

I'm separating out the change to $&read, and the addition of $&readline, into their own PRs, #264 and #265 respectively. I want to make sure to get the same level of confidence with how each of those changes looks as I have about the changes to $&parse here.

jpco added 7 commits February 22, 2026 06:48

Create $&readline primitive

e442242

This primitive will be used as a target for $&prompt's new "reader commands". We don't shuffle any code around for this yet; consolidating readline logic into something like a readline.c file will be done later.

Post-$&newparse changes and cleanup

519c4f5

Not all of this actually needs $&newparse as a prerequisite, like input->eof and removing the various fill functions.

Merge remote-tracking branch 'upstream/master' into scriptparse

6783dce

Remove at-startup inithistory()

365c7c4

Fix memory leak

705b2a4

Fix segfault in $&readline

b2d82a2

jpco marked this pull request as draft March 17, 2026 04:09

jpco added 4 commits March 18, 2026 07:39

Fix fd behavior for $&parse and $&readline

417b44b

This makes $&parse redirect shell input to stdin as if it's an actual redirection operator, and has $&readline actually use the stdin and stdout the shell has given it.

Expose "no-fd" Input to readers as /dev/null on stdin

ac7bb44

I don't know how ideal this is (or the implementation), but it seems to work well at least initially. Also add some tests for $&parse, not all of which are passing yet.

Fix up tests

40b4352

Only actually readline if the input is a tty

d75f6af

Skip NUL characters in $&read

9721992

This is pretty consistent with other shells' behavior, and makes $&read work as an input to $&parse.

jpco mentioned this pull request Mar 29, 2026

Create a $&readline primitive #265

Merged

Merge remote-tracking branch 'upstream/master' into scriptparse

e9c0a02

jpco force-pushed the scriptparse branch from e737a56 to e9c0a02 Compare March 30, 2026 04:12

jpco added 2 commits March 30, 2026 07:28

Handle NUL bytes in $&parse's builtin $&read

5d0e333

Re-add $&read fallback to $&readline.

a634195

It's not the right thing long-term, but it's the narrowest change that gives good behavior now.

jpco marked this pull request as ready for review March 30, 2026 15:06

Fix tests with --enable-strict

a5fd4de

jpco merged commit d77a440 into wryun:master Mar 31, 2026
1 check passed

jpco deleted the scriptparse branch March 31, 2026 15:48

jpco merged 16 commits into wryun:master from jpco:scriptparse
