Data::Match - Complex data structure pattern matching |
Data::Match - Complex data structure pattern matching
use Data::Match qw(:all); my ($match, $results) = match($structure, $pattern);
use Data::Match; my $obj = new Data::Match; my ($match, $results) = $obj->execute($structure, $pattern);
Data::Match provides extensible complex Perl data structure searching and matching.
None are exported by default. :func
exports match
and matches
, :pat
exports all the pattern element generators below, :all
exports :func
and :pat
.
A data pattern is a complex data structure that possibly matches another complex data structure. For example:
matches([ 1, 2 ], [ 1, 2 ]); # TRUE
matches([ 1, 2, 3 ], [ 1, ANY, 3 ]); # TRUE
matches([ 1, 2, 3 ], [ 1, ANY, 2 ]); # FALSE: 3 != 2
ANY
matches anything, including an undefined value.
my $results = matches([ 1, 2, 1 ], [ BIND('x'), ANY, BIND('x') ]); # TRUE
BIND($name)
matches anything and remembers each match and its position with every BIND($name)
in $result->{'BIND'}{$name}
. If BIND($name)
is not the same as the first value bound to BIND($name)
it does not match. For example:
my $results = matches([ 1, 2, 3 ], [ BIND('x'), 2, BIND('x') ]); # FALSE: 3 != 1
COLLECT($name)
is similar to BIND but does not compare first bound values.
REST
matches all remaining elements of an array or hash.
matches([ 1, 2, 3 ], [ 1, REST() ]); # TRUE matches({ 'a'=>1, 'b'=>1 }, { 'b'=>1, REST() => REST() }); # TRUE
FIND
searches at all depths for matching sub-patterns.
matches([ 1, [ 1, 2 ], 3], FIND(COLLECT('x', [ 1, REST() ])); # is true.
See the test script t/t1.t
in the package distribution for more pattern examples.
When a BIND
or COLLECT
matches a datum, an entry is collected in $result->{BIND}
and $result->{COLLECT}
, respectively. (This might change in the future)
Each entry for the binding name is a hash containing 'v'
, 'p'
and 'ps'
lists.
'v'
'p'
match_path_*
. 'p'
is not collected if $matchobj->{'no_collect_path'}
.
'ps'
match_path_str
) that describes where the match was for each match. 'ps'
is collected only if $matchobj->{'collect_path_str'}
.
All patterns can have sub-patterns. Most patterns match the AND-ed results of their sub-patterns and their own behavior, first trying the sub-patterns before attempting to match the intrinsic behavior. However, OR
and ANY
match any sub-patterns;
For example:
match([ ['a', 1 ], ['b', 2], ['a', 3] ], EACH(COLLECT('x', ['a', ANY() ])) ) # TRUE
The above pattern means:
For EACH element in the root structure (an array):
COLLECT each element, into collection named 'x'
, that is,
An ARRAY of length 2 that starts with 'a'
.
=back
On the other hand.
match([ ['a', 1 ], ['b', 2], ['a', 3] ], ALL(COLLECT('x', [ 'a', ANY() ])) ) # IS FALSE
Because the second root element (an array) does not start with 'a'
. But,
match( [ ['a', 1 ], ['a', 2], ['a', 3] ], ALL(COLLECT('x', [ 'a', ANY() ])) ) # IS TRUE
The pattern below flattens the nested array into atoms:
match( [ 1, 'x', [ 2, 'x', [ 3, 'x'], [ 4, [ 5, [ 'x' ] ], 6 ] ] ], FIND(COLLECT('x', EXPR(q{! ref}))), { 'no_collect_path' => 1 } )->{'COLLECT'}{'x'}{'v'};
no_collect_path
causes COLLECT
and BIND
to not collect any paths.
Match slices are objects that contain slices of matched portions of a data structure. This is useful for inflicting change into substructures matched by patterns like REST
.
For example:
do { my $a = [ 1, 2, 3, 4 ]; my $p = [ 1, ANY, REST(BIND('s')) ]; my $r = matches($a, $p); ok($r); # TRUE ok(Compare($r->{'BIND'}{'s'}{'v'}[0], [ 3, 4 ])); # TRUE $r->{'BIND'}{'s'}{'v'}[0][0] = 'x'; # Change match slice matches($a, [ 1, 2, 'x', 4 ]); # TRUE }
Hash match slices are generated for each key-value pair for a hash matched by EACH
and ALL
. Each of these match slices can be matched as a hash with a single key-value pair.
Match slices are useful for search and replace missions.
By default Data::Match is blind to Perl object interfaces. To instruct Data::Match to not traverse object implementation containers and honor object interfaces you must provide a visitation adapter. A visitation adapter tells Data::Match how to traverse through an object interface and how to keep track of how it got through.
For example:
package Foo; sub new { my ($cls, %opts) = @_; bless \%opts, $cls; } sub x { shift->{x}; } sub parent { shift->{parent}; } sub children { shift->{children}; } sub add_child { my $self = shift; for my $c ( @_ ) { $c->{parent} = $self; } push(@{$self->{children}}, @_); }
my $foos = [ map(new Foo('x' => $_), 1 .. 10) ]; for my $f ( @$foos ) { $f->add_child($foos->[rand($#$foo)); }
my $pat = FIND( COLLECT('Foo', ISA('Foo', { 'parent' => $foos->[0], REST() => REST() } ) ) ); $match->match($foos, $pat);
The problem with the above example is: FIND
will not honor the interface of class Foo by default and will eventually find a Foo where
$_->parent eq $foos->[0]
through all the parent and child links in the objects' implementation container. To force Data::Match to honor an interface (or a subset of an interface) during FIND
traversal we create a 'find' adapter sub that will do the right thing.
my $opts = { 'find' => { 'Foo' => sub { my ($self, $visitor, $match) = @_;
# Always do 'x'. $visitor->($self->x, 'METHOD', 'x');
# Optional children traversal. if ( $match->{'Foo_find_children'} ) { $visitor->($self->children, 'METHOD', 'children'); }
# Optional parent traversal. if ( $match->{'Foo_find_parent'} ) { $visitor->($self->parent, 'METHOD', 'parent'); } } } } my $match = new Data::Match($opts, 'Foo_find_children' => 1); $match = $match->execute($foos, $pat);
See t/t4.t
for more examples of visitation adapters.
Data::Match employs a mostly-functional external interface since this module was inspired by a Lisp tutorial (``The Little Lisper'', maybe) I read too many years ago; besides, pattern matching is largely recursively functional. The optional control hashes and traverse adapter interfaces are better represented by an object interface so I implemented a functional veneer over the core object interface.
Internally, objects are used to represent the pattern primitives because most of the pattern primitives have common behavior. There are a few design patterns that are particularly applicable in Data::Match: Visitor and Adapter. Adapter is used to provide the extensibility for the traversal of blessed structures such that Data::Match can honor the external interfaces of a class and not blindly violate encapsulation. Visitor is the basis for some of the FIND
pattern implementation. The Data::Match::Slice
classes that provide the match slices are probably a Veneer on the array and hash types through the tie meta-behaviors.
UNIVERSAL::isa
to search for the adapter; it uses ref
. This will be fixed in a future release.
Since hash keys do not retain blessedness (what was Larry thinking?) it is difficult to have patterns match keys without resorting to some bizarre regexp instead of using isa
.
match_path_set
and match_path_ref
do not work through 'METHOD'
path boundaries. This will be fixed in a future release.
BIND
and COLLECT
need scoping operators for deeply collected patterns.
If you find this to be useful please contact the author. This is alpha software; all APIs, semantics and behaviors are subject to change.
This section describes the external interface of this module.
Default options for match
.
Matches a structure against a pattern. In a list context, returns both the match success and results; in a scalar context returns the results hash if match succeeded or undef.
use Data::Match; my $obj = new Data::Match(); my $matched = $obj->execute($thing, $pattern);
use Data::Match qw(match); match($thing, $pattern, @opts)
is equivalent to:
use Data::Match; Data::Match->new(@opts)->execute($thing, $pattern);
Same as match
in scalar context.
Returns a perl expression that will generate code to point to the element of the path.
$matchobj->match_path_str($path, $str);
$str
defaults to '$_'
.
Returns a string suitable for Data::DRef.
$matchobj->match_path_DRef_path($path, $str, $sep);
$str
is used as a prefix for the Data::DRef path.
$str
defaults to ''
;
$sep
defaults to $Data::DRef::Separator
or '.'
;
Returns the value pointing to the location for the match path in the root.
$matchobj->match_path_get($path, $root);
$root
defaults to $matchobj-gt{'root'}
;
Example:
my $results = matches($thing, FIND(BIND('x', [ 'x', REST ]))); my $x = $results->match_path_get($thing, $results->{'BIND'}{'x'}{'p'}[0]);
The above example returns the first array that begins with 'x'
.
Returns the value pointing to the location for the match path in the root.
$matchobj->match_path_set($path, $value, $root);
$root
defaults to $matchobj-gt{'root'}
;
Example:
my $results = matches($thing, FIND(BIND('x', [ 'x', REST ]))); $results->match_path_set($thing, $results->{'BIND'}{'x'}{'p'}[0], 'y');
The above example replaces the first array found that starts with 'x' with 'y';
Returns a scalar ref pointing to the location for the match path in the root.
$matchobj->match_path_ref($path, $root);
$root
defaults to $matchobj-gt{'root'}
;
Example:
my $results = matches($thing, FIND(BIND('x', [ 'x', REST ]))); my $ref = $results->match_path_ref($thing, $results->{'BIND'}{'x'}{'p'}[0]); $$ref = 'y';
The above example replaces the first array that starts with 'x' with 'y';
Version 0.05, $Revision: 1.12 $.
Kurt A. Stephens <ks.perl@kurtstephens.com>
Copyright (c) 2001, 2002 Kurt A. Stephens and ION, INC.
the perl manpage, the Array::PatternMatcher manpage, the Data::Compare manpage, the Data::Dumper manpage, the Data::DRef manpage, the Data::Walker manpage, the Data::Diver manpage and ``Easy Matching of nested data structures'' at http://perlmonks.org/
Data::Match - Complex data structure pattern matching |