Diffstat (limited to 'src/lib.rs')
-rw-r--r-- src/lib.rs 127
1 files changed, 57 insertions, 70 deletions
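
Reviewer note: the patch documents literals as quoted strings with `%` as the escape character, where `%%` and `%'` are the only allowed escape codes. A minimal sketch of how such a literal body might be decoded follows. The function name `unescape_literal` and its `Option` return type are assumptions made for illustration; they are not taken from datafu, and this is not the crate's actual parser.

```rust
/// Decodes the body of a literal, i.e. the text between the `'` delimiters.
/// Hypothetical helper, not part of datafu.
///
/// Returns `None` if the body contains an unescaped `'`, a trailing `%`,
/// or any escape code other than `%%` and `%'`.
fn unescape_literal(body: &str) -> Option<String> {
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        match c {
            // `%` must be followed by `%` or `'`; anything else is rejected.
            '%' => match chars.next() {
                Some('%') => out.push('%'),
                Some('\'') => out.push('\''),
                _ => return None,
            },
            // A bare delimiter inside the body is not allowed.
            '\'' => return None,
            // Every other Unicode scalar value is taken verbatim.
            other => out.push(other),
        }
    }
    Some(out)
}
```

Under the same assumptions, the regex rule would differ only in the delimiter (`/` instead of `'`), so the delimiter could be passed as a parameter rather than hard-coded.
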
diff --git a/src/lib.rs b/src/lib.rs
index 6d8a7e0..824d54e 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -14,89 +14,76 @@
 //!
 //! ## Syntax Elements of Datafu Expressions
 //!
-//! FIXME still need to update these...
-//!
-//! An arrow is `->` and indicates indexing/iteration. Whether indexing or
-//! iteration is used is defined by the elements that follow, with iteration
-//! being used by default.
-//!
-//! A variable is a sequence of alphanumeric characters, not starting with
-//! a digit. The value of the matched element will be identified by this name.
-//!
-//! A literal is a sequence of characters delimited by `'`, optionally
-//! followed by `?`, with `%` as the escape character, and defines a
-//! string-keyed indexing operation. A literal can contain any character,
-//! except unescaped `%` or `'` symbols, which must be escaped as
-//! `%%` and `%'`, respectively. The sequence of characters defined by
-//! a literal is used as the string object in the indexing operation.
-//!
-//! A parameter is `$`, optionally followed by `?`, followed by a
-//! sequence of alphanumeric characters, not starting with a digit, and
-//! defines an object-keyed indexing operation. The sequence of characters
-//! defined by a parameter is used to retrieve, from the pattern's
-//! definitions, the object to be used in the indexing operation.
-//!
-//! A regex is a sequence of characters delimited by `/`, optionally
-//! followed by `?`, with `%` as the escape character. A regex can
-//! contain any character, except unescaped `%` or `/` symbols, which
-//! must be escaped as `%%` and `%/`, respectively. The sequence of
-//! characters defined by a regex is passed to the `regex` crate, which
-//! may apply further restrictions on the characters used, and is used to
-//! accept the respective keys processed by the iterator.
-//!
-//! A predicate is `:`, optionally followed by `?`, followed by an
-//! `$` and a sequence of alphanumeric characters, not starting with a
-//! digit, and is used to accept values to be processed based on an
-//! external [`Predicate`].
-//!
-//! A key match is a datafu expression (including, but not limited to, the
-//! empty datafu expression) enclosed within `[` and `]`, optionally
-//! prefixed with an identifier and zero or more predicates, and applies the
-//! enclosed predicates and datafu expression to the key (or index) being
-//! processed. A key match enables additional validation of keys and/or
-//! extraction of values from keys, and accepts a key if and only if the
-//! enclosed predicates accept the key and the enclosed expression matches the
-//! key. The matched key is stored in the identifier.
-//!
-//! A subvalue is a datafu expression (including, but not limited to, the
-//! empty datafu expression) enclosed within `(` and `)`, and applies
-//! the enclosed datafu expression to the value (or index) being processed.
-//! A subvalue enables the ability to match multiple values on the same
-//! object, and accepts a value if and only the enclosed expression
-//! matches the value. A subvalue can be made optional by the presence of
-//! a `?` after the subvalue - in case of no match, it will just omit
-//! the relevant keys in the result. Optional subvalues are unrelated to
-//! non-validating syntax elements (see below), they just use the same
-//! syntax.
+//! A datafu pattern starts with an optional value matcher and is otherwise a
+//! sequence of iterative elements (see "arrow" below) followed by subvalues.
+//!
+//! A datafu pattern is composed of the following elements:
+//!
+//! 1. An arrow, `->` indicates iteration. (Note that datafu operates directly
+//!     on serde and deserialization is an iterative process.)
+//!
+//!     Always followed by a matcher, or a name and an optional matcher.
+//! 2. A name is a sequence of alphanumeric characters, not starting with a
+//!     digit. A name collects a matched value. The value will be identified by
+//!     the name.
+//! 3. A matcher is one of the following 5 elements.
+//! 4. A literal is a quoted string delimited by `'` with `%` as the escape
+//!     character. A literal can contain any Unicode scalar value, and the only
+//!     allowed escape codes are `%%` and `%'`, for `%` and `'` respectively,
+//!     which must be escaped. A literal matches a string value and can be
+//!     optional.
+//! 5. A parameter is `$`, optionally followed by `?`, followed by a sequence
+//!     of alphanumeric characters, not starting with a digit. This is
+//!     currently unimplemented.
+//! 6. A regex is a quoted string delimited by `/`, with `%` as the escape
+//!     character. A regex can contain any Unicode scalar value, and the only
+//!     allowed escape codes are `%%` and `%/`, for `%` and `/` respectively,
+//!     which must be escaped. A regex element matches a string value against
+//!     the regex it represents, and can be optional.
+//! 7. A predicate is `:`, optionally followed by `?`, followed by `$` and a
+//!     sequence of alphanumeric characters, not starting with a digit. This is
+//!     currently unimplemented.
+//! 8. A type is `:`, optionally followed by `?`, followed by one of the
+//!     following keywords: TODO
+//! 9. A key match is a sequence of iterative elements followed by subvalues,
+//!     prefixed by either a matcher or a name and an optional matcher,
+//!     and enclosed within `[` and `]`, optionally followed by `?`. A key
+//!     match applies to map keys and sequence indices, and aside from the
+//!     previously mentioned requirement is otherwise just a datafu pattern.
+//! 10. A subvalue is a sequence of iterative elements followed by subvalues,
+//!     enclosed within `(` and `)`, optionally followed by `?`. A sequence of
+//!     subvalues enables matching distinct patterns on the same value, like
+//!     distinct fields of a struct.
 //!
 //! Some syntax elements can be validating or non-validating. Validating
 //! syntax elements will return a [`errors::MatchError::ValidationError`]
-//! whenever a non-accepted element is encountered, whereas non-validating
-//! ones will skip them. Whether an element is validating is determined by
-//! the absence of an optional `?` in the documented position. Note that
-//! it is possible for a validating syntax element to still yield results
-//! before returning a [`errors::MatchError::ValidationError`], so one
-//! needs to be careful when writing code where such behaviour could
-//! result in a security vulnerability.
+//! whenever a value or subpattern fails to match, whereas non-validating
+//! ones will skip them. In general, whether an element is validating is
+//! determined by the absence of an optional `?` in the documented position,
+//! with the exception of key matches, which instead use it to filter entries
+//! that didn't match.
 //!
-//! The empty pattern matches anything, but only does so once.
+//! The empty pattern matches anything, but only does so once. Empty subvalues
+//! are ignored.
 //!
 //! ## Syntax of Datafu Expressions
 //!
 //! Datafu Expressions follow the given syntax, in (pseudo-)extended BNF:
 //!
 //! ```text
-//! expression ::= [type] [predicate] {arrow tag} {subvalue}
-//! tag ::= identifier [arg] [predicate] | arg [predicate]
-//! arg ::= parameter | literal | regex | keymatch
+//! pattern ::= [matcher] tree
+//! tree ::= {tag} {subvalue}
+//! tag ::= arrow [keymatch] valuematch
+//! valuematch ::= matcher | name [matcher]
+//! matcher ::= parameter | literal | regex | predicate | type
 //!
 //! arrow ::= '->'
-//! keymatch ::= '[' [name] expression ']' ['?']
-//! subvalue ::= '(' expression ')' ['?']
+//! keymatch ::= '[' valuematch tree ']' ['?']
+//! subvalue ::= '(' tree ')' ['?']
 //! ```
 //!
-//! For a description of the terminals "parameter", "literal", "regex" and
-//! "predicate", see "Syntax Elements of Datafu Expressions" above.
+//! For a description of the terminals "parameter", "literal", "regex", "type"
+//! and "predicate", see "Syntax Elements of Datafu Expressions" above.
 //!
 //! # Examples
 //!
@@ -116,7 +103,7 @@
 //! {"a": {"1": {"y": true}, "2": {"x": true, "y": true}}}
 //! ```
 //!
-//! Produces the results for the sub-JSON
+//! Produces the same results as if matched against the sub-JSON
 //!
 //! ```json
 //! {"a": {"2": {"x": true, "y": true}}}