Guide to Procs and Funcs

YSH has two major units of code: shell-like procs and Python-like funcs.

This doc compares the two mechanisms, and gives rough guidelines.

Table of Contents
Tip: Start Simple
At a Glance
Procs vs. Funcs
Func Calls and Defs
Proc Calls and Defs
Common Features
Spread Args, Rest Params
The error builtin raises exceptions
Out Params: &myvar is of type value.Place
Proc-Only Features
Lazy Arg Lists where [x > 10]
Open Proc Signatures bind argv
Methods are Funcs Bound to Objects
Usage Notes
3 Ways to Return a Value
Procs Compose in Pipelines / "Bernstein Chaining"
Summary
Appendix
Implementation Details
Related

Tip: Start Simple

Before going into detail, here's a quick reminder that you don't have to use either procs or funcs. YSH is a language that scales both down and up.

You can start with just a list of plain commands:

mkdir -p /tmp/dest
cp --verbose *.txt /tmp/dest

Then copy those into procs as the script gets bigger:

proc build-app {
  ninja --verbose
}

proc deploy {
  mkdir -p /tmp/dest
  cp --verbose *.txt /tmp/dest
}

build-app
deploy

Then add funcs if you need pure computation:

func isTestFile(name) {
  return (name => endsWith('_test.py'))
}

if (isTestFile('my_test.py')) {
  echo 'yes'
}

At a Glance

Procs vs. Funcs

This table summarizes the difference between procs and funcs. The rest of the doc will elaborate on these issues.

Design Influence
  Proc: Shell-like.
  Func: Python- and JavaScript-like, but pure.

Shape
  Proc: Procs are shaped like Unix processes: with argv, an integer return code, and stdin / stdout streams. They're a generalization of Bourne shell "functions".
  Func: Funcs are shaped like mathematical functions.

Architectural Role (Oils is Exterior First)
  Proc: Exterior: processes and files.
  Func: Interior: functions and garbage-collected data structures.

I/O
  Proc: Procs may start external processes and pipelines. Can perform I/O anywhere.
  Func: Funcs need an explicit io param to perform I/O.

Example Definition
  Proc:
    proc print-max (; x, y) {
      echo $[x if x > y else y]
    }
  Func:
    func computeMax(x, y) {
      return (x if x > y else y)
    }

Example Call
  Proc:
    print-max (3, 4)

    Procs can be put in pipelines:

    print-max (3, 4) | tee out.txt
  Func:
    var m = computeMax(3, 4)

    Or throw away the return value, which is useful for functions that mutate:

    call computeMax(3, 4)

Naming Convention
  Proc: kebab-case
  Func: camelCase

Syntax Mode of call site
  Proc: Command Mode
  Func: Expression Mode

Kinds of Parameters / Arguments
  Proc:
    1. Word aka string
    2. Typed and Positional
    3. Typed and Named
    4. Block

    Examples shown below.
  Func:
    1. Positional
    2. Named

    (both typed)

Return Value
  Proc: Integer status 0-255
  Func: Any type of value, e.g.
    return ([42, {name: 'bob'}])

Relation to Objects
  Proc: none
  Func: May be bound to objects:
    var x = obj.myMethod()
    call obj->myMutatingMethod()

Interface Evolution
  Proc: Slower: Procs exposed to the outside world may need to evolve in a compatible or "versionless" way.
  Func: Faster: Funcs may be refactored internally.

Parallelism?
  Proc: Procs can be parallel with:
    • shell constructs: pipelines, & aka fork
    • external tools and the $0 Dispatch Pattern: xargs, make, Ninja, etc.
  Func: Funcs are inherently serial, unless wrapped in a proc.

More proc features ...

Kinds of Signature
  Proc:
    Open proc p { or
    Closed proc p () {
  Func: -

Lazy Args
  Proc:
    assert [42 === x]
  Func: -

Func Calls and Defs

Now that we've compared procs and funcs, let's look more closely at funcs. They're inherently simpler: they have 2 types of args and params, rather than 4.

YSH argument binding is based on Julia, which has all the power of Python, but without the "evolved warts" (e.g. / and *).

In general, with all the bells and whistles, func definitions look like:

# pos args and named args separated with ;
func f(p1, p2, ...rest_pos; n1=42, n2='foo', ...rest_named) {
  return (len(rest_pos) + len(rest_named))
}

Func calls look like:

# spread operator ... at call site
var pos_args = [3, 4]
var named_args = {foo: 'bar'}
var x = f(1, 2, ...pos_args; n1=43, ...named_args)

Note that positional args/params and named args/params can be thought of as two "separate worlds".

This table shows simpler, more common cases.

Positional Args
  Call Site:
    var x = myMax(3, 4)
  Definition:
    func myMax(x, y) {
      return (x if x > y else y)
    }

Spread Pos Args
  Call Site:
    var args = [3, 4]
    var x = myMax(...args)
  Definition: (as above)

Rest Pos Params
  Call Site:
    var x = myPrintf("%s is %d", 'bob', 30)
  Definition:
    func myPrintf(fmt, ...args) {
      # ...
    }

...

Named Args
  Call Site:
    var x = mySum(3, 4, start=5)
  Definition:
    func mySum(x, y; start=0) {
      return (x + y + start)
    }

Spread Named Args
  Call Site:
    var opts = {start: 5}
    var x = mySum(3, 4; ...opts)
  Definition: (as above)

Rest Named Params
  Call Site:
    var x = f(start=5, end=7)
  Definition:
    func f(; ...opts) {
      if ('start' not in opts) {
        setvar opts.start = 0
      }
      # ...
    }

Proc Calls and Defs

Like funcs, procs have 2 kinds of typed args/params: positional and named.

But they may also have string aka word args/params, and a block arg/param.

In general, a proc signature has 4 sections, like this:

proc p (
    w1, w2, ...rest_word;     # word params
    p1, p2, ...rest_pos;      # pos params
    n1, n2, ...rest_named;    # named params
    block                     # block param
) {
  echo 'body'
}

In general, a proc call looks like this:

var pos_args = [3, 4]
var named_args = {foo: 'bar'}

p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args) {
  echo 'block'
}

The block can also be passed as an expression after a second semicolon:

p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args; block)

Some simpler examples:

Word args
  Call Site:
    my-cd /tmp
  Definition:
    proc my-cd (dest) {
      cd $dest
    }

Rest Word Params
  Call Site:
    my-cd -L /tmp
  Definition:
    proc my-cd (...flags) {
      cd @flags
    }

Spread Word Args
  Call Site:
    var flags = :| -L /tmp |
    my-cd @flags
  Definition: (as above)

...

Typed Pos Arg
  Call Site:
    print-max (3, 4)
  Definition:
    proc print-max ( ; x, y) {
      echo $[x if x > y else y]
    }

Typed Named Arg
  Call Site:
    print-max (3, 4, start=5)
  Definition:
    proc print-max ( ; x, y; start=0) {
      # ...
    }

...

Block Argument
  Call Site:
    my-cd /tmp {
      echo $PWD
      echo hi
    }
  Definition:
    proc my-cd (dest; ; ; block) {
      cd $dest (; ; block)
    }

All Four Kinds
  Call Site:
    p 'word' (42, verbose=true) {
      echo $PWD
      echo hi
    }
  Definition:
    proc p (w; myint; verbose=false; block) {
      = w
      = myint
      = verbose
      = block
    }

Common Features

Let's recap the common features of procs and funcs.

Spread Args, Rest Params
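Both procs and funcs can collect "extra" arguments into a rest param, and both can be called with a spread list. Here is a minimal sketch of the two sides. The names sumAll and show-words are hypothetical, chosen only for illustration:

```ysh
# Hypothetical func: the rest param ...nums collects all positional args
func sumAll(...nums) {
  var total = 0
  for n in (nums) {
    setvar total += n
  }
  return (total)
}

# Hypothetical proc: the rest param ...words collects all word args
proc show-words (...words) {
  write -- @words
}

# Spread a list into positional args with ...
var nums = [1, 2, 3]
var total = sumAll(...nums)
echo "total = $total"

# Spread a list into word args with @
var words = :| a b c |
show-words @words
```

Typed args are spread with ... inside the parentheses, while word args are spread with @, as in the tables above.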

The error builtin raises exceptions

The error builtin is idiomatic in both funcs and procs:

func f(x) {   
  if (x <= 0) {
    error 'Should be positive' (status=99)
  }
}

Tip: reserve such errors for exceptional situations. For example, an input string being invalid may not be uncommon, while a disk full I/O error is more exceptional.

(The error builtin is implemented with C++ exceptions, which are slow in the error case.)

Out Params: &myvar is of type value.Place

Out params are more common in procs, because they don't have a typed return value.

proc p ( ; out) {
  call out->setValue(42)
}
var x
p (&x)
echo "x set to $x"  # => x set to 42

But they can also be used in funcs:

func f (out) {
  call out->setValue(42)
}
var x
call f(&x)
echo "x set to $x"  # => x set to 42

Observation: procs can do everything funcs can. But you may want the purity and familiar syntax of a func.


Design note: out params are a nicer way of doing what bash does with declare -n aka nameref variables. They don't rely on dynamic scope.

Proc-Only Features

Procs have some features that funcs don't have.

Lazy Arg Lists where [x > 10]

A lazy arg list is implemented with shopt --set parse_bracket, and is syntax sugar for an unevaluated value.Expr.

Longhand:

var my_expr = ^[42 === x]  # value of type Expr
assert (my_expr)

Shorthand:

assert [42 === x]  # equivalent to the above

Open Proc Signatures bind argv

TODO: Implement new ARGV semantics.

When a proc signature omits (), it's called "open" because the caller can pass "extra" arguments:

proc my-open {
  write 'args are' @ARGV
}
# All valid:
my-open
my-open 1 
my-open 1 2

Stricter closed procs:

proc my-closed (x) {
  write 'arg is' $x
}
my-closed      # runtime error: missing argument
my-closed 1    # valid
my-closed 1 2  # runtime error: too many arguments

An "open" proc is nearly identical to a shell function:

shfunc() {
  write 'args are' @ARGV
}

Methods are Funcs Bound to Objects

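Values of type Obj have an ordered set of name-value bindings, as well as a prototype chain of more Obj instances ("parents"). Methods are called with the operators shown in the At a Glance table: . as in obj.myMethod(), and -> for mutating methods, as in obj->myMutatingMethod().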

Usage Notes

3 Ways to Return a Value

Let's review the recommended ways to "return" a value:

  1. return (x) in a func.
  2. Pass a value.Place instance to a proc or func.
  3. Print to stdout in a proc
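
The three recommended styles can be sketched side by side. The names double, set-answer, and print-answer are hypothetical, used only for this illustration:

```ysh
# 1. return (x) in a func
func double(x) {
  return (x * 2)
}
var a = double(21)

# 2. Pass a value.Place instance to a proc
proc set-answer (; out) {
  call out->setValue(42)
}
var b
set-answer (&b)

# 3. Print to stdout in a proc, and capture it with a command sub
proc print-answer {
  echo 42
}
var c = $(print-answer)

echo "a=$a b=$b c=$c"
```

Note that the first two styles preserve the type of the value, while the third always produces a string, because stdout is a byte stream.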

Obsolete ways of "returning":

  1. Using declare -n aka nameref variables in bash.
  2. Relying on dynamic scope in POSIX shell.

Procs Compose in Pipelines / "Bernstein Chaining"

Some YSH users may tend toward funcs because they're more familiar. But shell composition with procs is very powerful!

They have at least two kinds of composition that funcs don't have.

See #shell-the-good-parts:

  1. Shell Has a Forth-Like Quality - Bernstein chaining.
  2. Pipelines Support Vectorized, Point-Free, and Imperative Style - the shell can transparently run procs as elements of pipelines.
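
Here is a rough sketch of both kinds of composition. The procs upper, first-two, and in-tmp are hypothetical, and the sketch assumes that a spliced array like @cmd may appear in command position:

```ysh
# Hypothetical procs that read stdin and write stdout, like Unix filters
proc upper {
  tr a-z A-Z
}

proc first-two {
  head -n 2
}

# Pipelines: procs can be stages, mixed with builtins and external commands
write apple banana cherry | upper | first-two

# Bernstein chaining: a proc sets up some state, then runs the rest of
# its argv as a command (assumption: @cmd is allowed as the command word)
proc in-tmp (...cmd) {
  cd /tmp {
    @cmd
  }
}

in-tmp pwd
```

A func can't be used in either position directly. It would have to be wrapped in a proc first.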

Summary

YSH is influenced by both shell and Python, so it has both procs and funcs.

Many programmers will gravitate towards funcs because they're familiar, but procs are more powerful and shell-like.

Make your YSH programs more powerful by learning to use procs!

Appendix

Implementation Details

Procs and funcs both have these concerns:

  1. Evaluation of default args at definition time.
  2. Evaluation of actual args at the call site.
  3. Arg-Param binding for builtin functions, e.g. with typed_args.Reader.
  4. Arg-Param binding for user-defined functions.

So the implementation can be thought of as a 2 × 4 matrix, with some code shared. This code is mostly in ysh/func_proc.py.

Related

Generated on Sun, 25 Aug 2024 12:30:01 -0400