Remove `Vec` requirement #392

therealbnut · 2024-12-12T05:05:43Z

Hi,

I've wanted to be able to store repeated values into a structure of my choosing, instead of a Vec.

This change does that, without needing a temporary intermediate Vec. I believe it will also improve performance as expressions like $(expression ** delim) will currently create a temporary Vec.

Two examples from the tests of how this may improve some usage:

-        = v:( "a" / "\n" )* { v.len() }
+        = v:( "a" / "\n" )* {
+            // Does not store the items.
+            let counter: ItemCounter = v;
+            counter.count()
+        }

     pub rule keyvals() -> HashMap<i64, i64>
-        = kvs:keyval() ++ "\n" {
-            kvs.iter().cloned().collect::<HashMap<i64, i64>>()
-        }
+        = kvs:keyval() ++ "\n" { kvs }

Currently this works by requiring the collection the values are stored into to implement Default and Extend. It may sometimes be necessary to be explicit about the type like the ItemCounter example above. It also may not be able to infer the type of the Extend<T>'s T if it implements it for multiple types, in which case the new-type pattern might be necessary. I don't imagine that will be common though.

Another pattern that might be useful is to be able to instantiate the collection type with some context. I'm not sure of a good way to do this at the moment. I looked into returning an iterator, but you still need to be able to evaluate whether parsing fails before the iterator is evaluated.

therealbnut · 2024-12-12T05:32:49Z

Instead of Default it may be useful to have our own trait which can be instantiated from the rule parameters (if there are any). Examples of where this might be useful is for something like string interning, or putting the AST in contiguous memory storage for cache efficiency. I'm not sure how you'd achieve this without something like specialisation though. I'll think on this more.

kevinmehall · 2024-12-26T19:33:00Z

I'm concerned about the reliance on type inference here. It's definitely a breaking change (which isn't necessarily a problem; there will be an 0.9). Especially because the workaround for type inference failure requires introducing a variable in the rule action, which is rather ugly.

I'm wondering if it would be better to make it explicit with something like x()* as HashSet or maybe even x()* in (context.new_list()). I used parentheses on the latter because the rust-peg metagrammar has a rule covering Rust's type expressions, but Rust value expressions are more daunting to parse, so all Rust value expressions in the grammar are wrapped in () or {} so rustc's parser is responsible for delimiting them.

For ** or ++, not sure if it would want to be written as x() ** as HashMap "," vs x() ** "," as HashMap. Somewhat awkward to have an operator with 3+ params.

Using extend(Some(x)) to do push via a std trait is kind of clever.

therealbnut · 2024-12-26T19:52:54Z

I'm wondering if it would be better to make it explicit with something like x()* as HashSet or maybe even x()* in (context.new_list()).

One alternative I’ve been considering is having it expressed at the rule level. This makes the type explicit on the rule definition.

A possible way to express this is if the context has the same name as part of the rule:

rule keyvals(kvs: &mut HashMap<i64, i64>) -> 
         = kvs:keyval() ++ "\n"

This would extend into the kvs context rather than creating a new variable. I haven’t checked yet, but hopefully it’s an error to shadow parameters like this currently, and so wouldn’t be a breaking change anymore.

kevinmehall · 2024-12-26T20:36:52Z

What's the use case where the caller constructs the container? It seems like it would be more common for the rule to want to construct and return it.

And I believe shadowing is currently allowed. Variable binding just expands to let or nested match, and Rust allows shadowing.

therealbnut · 2024-12-26T20:44:10Z

The use case is the same as the rest of the PR. I was imagining it might be a cleaner way to express it.

Instead of your suggestion like this:

rule xs() = x()* in (context.new_list())

From the caller you would do:

xs(context.new_list())

therealbnut · 2024-12-26T20:51:07Z

To clarify, the use case is to be able to store it in custom collections, and use things like memory pools and flattened/contiguous storage. Although it was more about keeping the rule syntax simple and flexible than it was about the responsibility of the caller/callee.

I would expect that if you aren’t passing in the thing to be extended then it would use the old behaviour. Although that does make it cumbersome in a similar way as the type inference. It’s an idea I’m still thinking through.

therealbnut added 2 commits December 12, 2024 15:54

proof of concept

c445b4b

Fix grammar type inference and run bootstrap

9c49d86

fix core namespaces

b14b1c8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove `Vec` requirement #392

Remove `Vec` requirement #392

therealbnut commented Dec 12, 2024 •

edited

Loading

therealbnut commented Dec 12, 2024 •

edited

Loading

kevinmehall commented Dec 26, 2024

therealbnut commented Dec 26, 2024 •

edited

Loading

kevinmehall commented Dec 26, 2024

therealbnut commented Dec 26, 2024

therealbnut commented Dec 26, 2024 •

edited

Loading

Remove Vec requirement #392

Are you sure you want to change the base?

Remove Vec requirement #392

Conversation

therealbnut commented Dec 12, 2024 • edited Loading

therealbnut commented Dec 12, 2024 • edited Loading

kevinmehall commented Dec 26, 2024

therealbnut commented Dec 26, 2024 • edited Loading

kevinmehall commented Dec 26, 2024

therealbnut commented Dec 26, 2024

therealbnut commented Dec 26, 2024 • edited Loading

Remove `Vec` requirement #392

Remove `Vec` requirement #392

therealbnut commented Dec 12, 2024 •

edited

Loading

therealbnut commented Dec 12, 2024 •

edited

Loading

therealbnut commented Dec 26, 2024 •

edited

Loading

therealbnut commented Dec 26, 2024 •

edited

Loading