Hash::MostUtils - Yet another collection of tools for operating pairwise on lists.
This module provides a number of functions for processing hashes as lists of key, value pairs.
    use Hash::MostUtils qw(hashmap hashgrep lkeys lvalues leach hashsort);

    my @found_and_transformed =
        hashmap { uc($b) => 100 + $a }
        hashgrep { $a < 100 && $b =~ /[aeiou]/i } (
            1   => 'cwm',
            2   => 'apple',
            100 => 'cherimoya',
        );

    my @keys = lkeys @found_and_transformed;
    my @vals = lvalues @found_and_transformed;

    foreach my $key (@keys) {
        my $value = shift @vals;
        print "$key => $value\n";
    }

    while (my ($key, $val) = leach @found_and_transformed) {
        print "$key => $val\n";
    }

    my $serialized = join ',', hashsort { $a->{key} cmp $b->{key} } %hash;
By default, none. On request, any of the following:
Return the "keys" of LIST. Perl’s keys() keyword only operates on hashes; lkeys() offers an approximation of the same functionality for lists.
    my @evens = lkeys 1..10;    # the even-positioned items: (1, 3, 5, 7, 9)

    my @keys =
        lkeys                                    # give me back those keys (i.e. the letters)
        hashgrep { $b > 100 }                    # find key/value pairs where the value is > 100
        map { $_ => int(rand(1000)) } 'a'..'z';  # turn 'a'..'z' into key/value pairs with random values
The "keys" of a list are the even-positioned items. Note that in the case of an ">empty slot<" in a sparse array, the key will be "undef".
Return the "values" of LIST. Perl’s values() keyword only operates on hashes; lvalues() offers an approximation of the same functionality for lists.
    my @odds = lvalues 1..10;   # the odd-positioned items: (2, 4, 6, 8, 10)

    my @values =
        lvalues                                  # give me back those values (i.e. the letters)
        hashgrep { $a > 100 }                    # look for key/value pairs where the key is > 100
        map { int(rand(1000)) => $_ } 'a'..'z';  # make 26 random keys in 0..999, with fixed values
The "values" of a list are the odd-positioned items. Note that in the case of an ">empty slot<" in a sparse array, the value will be "undef".
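The even/odd positioning described for "lkeys" and "lvalues" can be sketched in plain Perl. The helpers my_lkeys and my_lvalues below are hypothetical stand-ins written for illustration, not the module's own code; unlike the documented behavior, this sketch does not pad an odd-length list with a trailing undef value.

```perl
use strict;
use warnings;

# Hypothetical stand-ins, not the module's implementation.
# Keys sit at indices 0, 2, 4, ...; values sit at indices 1, 3, 5, ...
sub my_lkeys   { my @l = @_; return @l[ grep { $_ % 2 == 0 } 0 .. $#l ] }
sub my_lvalues { my @l = @_; return @l[ grep { $_ % 2 == 1 } 0 .. $#l ] }

my @keys   = my_lkeys(1 .. 10);
my @values = my_lvalues(1 .. 10);

print "keys:   @keys\n";      # keys:   1 3 5 7 9
print "values: @values\n";    # values: 2 4 6 8 10
```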
Iterate over an ARRAY, HASH, ARRAYREF, or HASHREF, returning successive "key/value" pairs. This behaves functionally identically to Perl’s built-in "each" keyword; however, it is useful for arrays and for array and hash references. This function handles objects which are built around blessed array and hash references.
    my @array = (1..4);
    while (my ($k, $v) = leach @array) {
        print "$k => $v\n";
    }
    print "$_\n" for @array;

    __END__
    1 => 2
    3 => 4
    1
    2
    3
    4
Using "leach" to gather key/value pairs from a collection is guaranteed to be non-destructive to that collection. One pattern that’s useful for iterating arrays and array references in pairs is to use "splice", which has the possibly unintended side effect of destroying the subject collection:
    my @array = (1..4);
    while (my ($k, $v) = splice @array, 0, 2) {
        print "$k => $v\n";
    }
    print "$_\n" for @array;

    __END__
    1 => 2
    3 => 4
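For comparison, a non-destructive pairwise walk can be sketched in plain Perl with a closure that tracks its own position. The helper pair_iterator is hypothetical and only illustrates the idea; it is not how "leach" is implemented.

```perl
use strict;
use warnings;

# pair_iterator() is a hypothetical helper, not part of Hash::MostUtils.
# It returns a closure that hands back two elements per call and never
# modifies the array it walks.
sub pair_iterator {
    my ($array) = @_;
    my $i = 0;
    return sub {
        return if $i > $#$array;               # exhausted: the empty list ends a while loop
        my @pair = @{$array}[ $i, $i + 1 ];    # rvalue slice, leaves @$array alone
        $i += 2;
        return @pair;
    };
}

my @array = (1 .. 4);
my $next  = pair_iterator(\@array);

my @seen;
while (my ($k, $v) = $next->()) {
    push @seen, "$k => $v";
}

print "$_\n" for @seen;     # 1 => 2, then 3 => 4
print "@array\n";           # 1 2 3 4 (still intact)
```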
Note the distinction between saying that this function is
leach ARRAY
rather than
leach LIST
Perl does not allow this behavior:
    while (my ($k, $v) = leach 1..10) {  # can't leach a list, only an array
        # do something with this key/value tuple
    }
But don’t worry, Perl also doesn’t allow for this behavior:
    while (my ($k, $v) = splice 1..10, 0, 2) {  # can't splice a list, only an array
        # do something with this key/value tuple
    }
"hashmap", "hashgrep", and "hashapply" all act like their corresponding "map", "grep", and "List::MoreUtils::apply" but for one notable exception: whereas "map", "grep", and "apply" all eat items from the given list one-by-one and assign the current value to $_, "hashmap", "hashgrep", and "hashapply" all eat items from the given list two-by-two, assigning them to $a and $b.
The names $a and $b were chosen because they are package variables that Perl already exempts from "use strict" declarations due to "sort"’s need for them.
If you have a single occurrence of $a and $b within your program, you will probably see this warning from Perl:
    Name 'main::a' used only once: possible typo at ...
    Name 'main::b' used only once: possible typo at ...
I’ve just gotten in the habit of adding:
    use strict;
    use warnings;
    no warnings 'once';
when I see that message.
This acts similar to
map BLOCK LIST
with the exception that "map" eats items off of LIST one at a time, assigning the current value to $_; whereas "hashmap" eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.
    # naive transformation of this hash into (101 => 'A', 102 => 'B')
    my %hash = (
        a => 1,
        b => 2,
    );
    my %transformed =
        hashmap { $b + 100 => uc($a) }
        %hash;
Just like "map", your BLOCK will be called without any arguments. Like perl’s keyword "map", this function maintains the order of LIST.
"hashmap" is simply a prototyped alias for n_map(2, CODEREF, LIST), so all of the documentation to "n_map" applies here.
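To make the $a/$b calling convention concrete, here is a minimal plain-Perl sketch of a pairwise map. The name pairwise_map is hypothetical, and the glob-localizing approach is an assumption borrowed from the way pure-Perl list utilities commonly expose $a and $b to a block; the module's real implementation may differ.

```perl
use strict;
use warnings;
no warnings 'once';

# pairwise_map() is a hypothetical stand-in for hashmap, written only to
# illustrate how a block can see each pair as $a and $b.
sub pairwise_map(&@) {
    my ($code, @list) = @_;
    my $caller = caller;
    no strict 'refs';
    # Temporarily point the caller's package variables $a and $b at two
    # lexicals owned by this sub; the originals return when we leave.
    local *{"${caller}::a"} = \my $a;
    local *{"${caller}::b"} = \my $b;
    my @out;
    while (@list) {
        ($a, $b) = splice @list, 0, 2;   # consumes our private copy only
        push @out, $code->();            # the block is called with no arguments
    }
    return @out;
}

my %hash = (a => 1, b => 2);
my %transformed = pairwise_map { $b + 100 => uc($a) } %hash;
# %transformed = (101 => 'A', 102 => 'B')
```

Because the list is copied into @list before being spliced, the caller's hash is left untouched in this sketch.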
This acts similar to
grep BLOCK LIST
with the exception that "grep" eats items off of LIST one at a time, assigning the current value to $_; whereas "hashgrep" eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.
    # lame object dumper
    my $object = Some::Class->new(...);
    my %dump =
        hashgrep { $a !~ /^_/ && ! ref($b) }  # hide private fields and internal data structures
        %$object;
Just like "grep", your BLOCK will be called without any arguments. Like perl’s keyword "grep", this function maintains the order of LIST.
"hashgrep" is simply a prototyped alias for n_grep(2, CODEREF, LIST), so all of the documentation to "n_grep" applies here.
This is similar to "List::MoreUtils::apply":
apply BLOCK LIST
with the usual exception: "apply" eats items off of LIST one at a time, assigning to $_; whereas "hashapply" eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.
Normal "apply" can be written as map:
    my @words  = qw(apple banana cherimoya);
    my @clean1 = map { tr/aeiou//d; $_ } @words;
    # @clean1 = @words = qw(ppl bnn chrmy);

    @words = qw(apple banana cherimoya);
    my @clean2 = apply { tr/aeiou//d } @words;
    # @clean2 = qw(ppl bnn chrmy);
    # @words  = qw(apple banana cherimoya);
Note that "apply" does not transform the original data, whereas "map" does whenever its block modifies $_, because $_ is an alias to each element of the original list. Similarly, "hashapply" does not transform the original data, whereas "hashmap" might.
Note that "apply" does not need to explicitly return $_, whereas "map" does. Similarly, "hashapply" does not need to explicitly return a key/value tuple ($a, $b), whereas "hashmap" does need to return something.
Like "apply", "hashapply" will not transform the original LIST.
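The difference between the two comes down to aliasing versus copying, which the following plain-Perl sketch demonstrates. The helper my_apply is a hypothetical stand-in for "List::MoreUtils::apply", written here so the example has no dependencies.

```perl
use strict;
use warnings;

# my_apply() is a hypothetical stand-in for List::MoreUtils::apply.
sub my_apply(&@) {
    my ($code, @copy) = @_;   # copying out of @_ breaks the alias to the caller's data
    $code->() for @copy;      # "for" aliases $_ to each element of the copy
    return @copy;             # the block's own return value is ignored
}

my @words = qw(apple banana cherimoya);

my @clean2      = my_apply { tr/aeiou//d } @words;
my $after_apply = "@words";   # apple banana cherimoya (untouched)

my @clean1    = map { tr/aeiou//d; $_ } @words;
my $after_map = "@words";     # ppl bnn chrmy (modified through the $_ alias)
```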
Sort LIST by BLOCK, handling two tuples at a time. $a and $b will each have the form:
    $a = +{key => ..., value => ...};
    $b = +{key => ..., value => ...};
This call:
    my %hash = (a => 1, n => 14, m => 13, b => 2, z => 26);
    my @sorted =
        hashsort { $b->{key} cmp $a->{key} }
        %hash;
Is equivalent to this:
    my %hash = (a => 1, n => 14, m => 13, b => 2, z => 26);
    my @sorted =
        map { ($_->{key} => $_->{value}) }
        sort { $b->{key} cmp $a->{key} }
        map { +{key => $_, value => $hash{$_}} }
        keys %hash;
"hashsort" is the "sort" body of a Schwartzian transform over a list of tuples.
With the exception of "hashsort", each of the pairwise functions mentioned so far ("leach", "hashmap", "hashgrep", "hashapply") is actually implemented in terms of a more generic N-ary form. This means that if you need to process a list in sets of N, where N is > 2, you may use the n_* forms of these functions.
Variable naming becomes more interesting when moving beyond 2 items. Whereas $a and $b are always available without declaration, once you go to N of 3, you need to agree on some variable naming convention.
$a and $b work nicely for the first two elements of a list; so $c is the third, and $d the fourth, and so on. One limitation of this naming scheme is that you may not easily go beyond N of 26; but if you find yourself needing that, you’ll find the code simple to extend.
In order to prevent ’strict vars’ from complaining about $c..$z, you’ll need to address those variables a bit differently:
    my @sets =
        n_map 6, sub { [$a, $b, $::c, $::d, $::e, $::f] },
        n_apply 3, sub { $_ *= 3 for $a, $b, $::c },
        n_grep 3, sub { $::c > 4 },
        (1..9);
    # @sets = ([12, 15, 18, 21, 24, 27]);
I personally find the transition between $b and $::c to be a bit jarring visually, so the one time I wrote a line like the above I chose to write it as $::a and $::b.
    my @sets =
        n_map 6, sub { [$::a, $::b, $::c, $::d, $::e, $::f] },
        n_apply 3, sub { $_ *= 3 for $::a, $::b, $::c },
        n_grep 3, sub { $::c > 4 },
        (1..9);
    # @sets = ([12, 15, 18, 21, 24, 27]);
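The $a, $b, $::c naming scheme can be sketched in plain Perl as follows. The function my_n_map is a hypothetical stand-in for "n_map"; it assumes the calling code lives in package main and, for brevity, assigns to the package variables directly rather than localizing them as a careful implementation would.

```perl
use strict;
use warnings;
no warnings 'once';

# my_n_map() is a hypothetical stand-in for n_map, not the module's code.
sub my_n_map {
    my ($n, $code, @list) = @_;
    my $caller = caller;
    my @names  = ('a' .. 'z')[ 0 .. $n - 1 ];   # one variable name per slot
    my @out;
    no strict 'refs';
    while (@list) {
        my @chunk = splice @list, 0, $n;
        # Copy this chunk into the caller's $a, $b, $c, ... by name.
        ${"${caller}::$names[$_]"} = $chunk[$_] for 0 .. $#names;
        push @out, $code->();
    }
    return @out;
}

my @sets = my_n_map(3, sub { [$::a, $::b, $::c] }, 1 .. 6);
# @sets = ([1, 2, 3], [4, 5, 6])
```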
Iterate over LIST, returning successive "key/values" sets.
    my @list = (1..9);
    while (my ($k, @v) = n_each 3, @list) {
        # do something with this $k and @v
    }
There’s nothing that says your N needs to remain constant:
    my @list = (
        a => 1,
        b => 1, 2,
        c => 1, 2, 3,
        d => 1, 2, 3, 4,
    );
    my $n = 2;
    my %triangle;
    while (my ($k, @v) = n_each $n++, @list) {
        $triangle{$k} = \@v;
    }

    __END__
    %triangle = (
        a => [1],
        b => [1, 2],
        c => [1, 2, 3],
        d => [1, 2, 3, 4],
    );
There’s probably something clever that you can do with this that I just don’t understand. Please drop me a line if you know what it is.
"map" CODEREF over LIST, operating in N-sized chunks. Within the context of CODEREF, values of LIST will be selected and aliased. The length of LIST must be evenly divisible by N.
See "GENERIC N-ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.
    my @transformed = n_map(
        3,
        sub { "$a, $b $::c!\n" },
        qw(goodnight sweet prince goodbye cruel world),
    );
    # @transformed = ("goodnight, sweet prince!\n", "goodbye, cruel world!\n");
If you are consistently n_map’ping by some N, then you might consider wrapping n_map so the call syntax looks more like one of Perl’s functional keywords:
    sub tri_map (&@) { unshift @_, 3; goto &n_map }

    my @transformed =
        tri_map { "$::a, $::b $::c!\n" }
        qw(goodnight sweet prince goodbye cruel world);
    # @transformed = ("goodnight, sweet prince!\n", "goodbye, cruel world!\n");
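One plausible reason for writing the wrapper with "goto &n_map" rather than a plain call is that "goto &sub" replaces the wrapper's stack frame, so the target sees the wrapper's caller as its own. That matters for any function that inspects "caller", for instance to decide whose $a and $b to set. Whether n_map relies on this is an assumption, not something stated here; the sketch below only demonstrates the Perl behavior, using a hypothetical probe function.

```perl
use strict;
use warnings;

# who_called() is a hypothetical probe that reports its caller's package.
sub who_called { return (caller)[0] }

package Wrapper;
sub via_call { return main::who_called() }   # adds a frame: caller is Wrapper
sub via_goto { goto &main::who_called }      # replaces the frame: caller is our caller

package main;

my $plain = Wrapper::via_call();   # "Wrapper"
my $goto  = Wrapper::via_goto();   # "main"
```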
"grep" for CODEREF over LIST, operating in N-sized chunks. Within the context of CODEREF, values of LIST will be selected and aliased. The length of LIST must be evenly divisible by N.
See "GENERIC N-ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.
    my @found = n_grep(
        3,
        sub { $a =~ /good/ && $::c =~ /prince/ },
        qw(goodnight sweet prince goodbye cruel world),
    );
    # @found = qw(goodnight sweet prince);
Just as with "n_map", writing a small bit of gloss to make your N of n_grep work in a functional manner is simple, and makes your code more readable:
    sub tri_grep (&@) { unshift @_, 3; goto &n_grep }

    my @found =
        tri_grep { $::a =~ /good/ && $::c =~ /prince/ }
        qw(goodnight sweet prince goodbye cruel world);
    # @found = qw(goodnight sweet prince);
"List::MoreUtils::apply" CODEREF to LIST, operating in N-sized chunks. The length of LIST must be evenly divisible by N.
See "GENERIC N-ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.
    my @uppercase = n_apply(
        3,
        sub { $::c = uc $::c },   # modify the value in place, as with "apply"
        qw(goodnight sweet prince goodbye cruel world),
    );
    # @uppercase = qw(goodnight sweet PRINCE goodbye cruel WORLD);
Just as with "n_map", writing a small bit of gloss to make your N of n_apply work in a functional manner is simple, and makes your code more readable:
    sub tri_apply (&@) { unshift @_, 3; goto &n_apply }

    my @uppercase =
        tri_apply { $::c = uc $::c }
        qw(goodnight sweet prince goodbye cruel world);
    # @uppercase = qw(goodnight sweet PRINCE goodbye cruel WORLD);
I like these functions, but they’re decidedly different from everything up to this point. They are mostly used to turn an existing hash reference or object into a smaller representation of itself.
Looks into HASHREF and extracts the key/value pairs of the keys named in LIST. If a key in LIST is not present in HASHREF, its value in the result is undef.
    my %hash = (1..10);
    my %slice = hash_slice_of \%hash, qw(5 7 9 11);

    __END__
    %slice = (
        5  => 6,
        7  => 8,
        9  => 10,
        11 => undef,
    );
If you only want to get back key/value pairs for keys in LIST that exist in HASHREF, just add a "hashgrep":
    my %hash = (1..10);
    my %slice =
        hashgrep { exists $hash{$a} }
        hash_slice_of \%hash, qw(5 7 9 11);

    __END__
    %slice = (
        5 => 6,
        7 => 8,
        9 => 10,
    );
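For readers who want to see the semantics without the module, both results above can be reproduced with core Perl. This is an equivalent written for illustration under the behavior described here, not the module's implementation; the variable @want is introduced only for the example.

```perl
use strict;
use warnings;

my %hash = (1 .. 10);           # (1 => 2, 3 => 4, 5 => 6, 7 => 8, 9 => 10)
my @want = qw(5 7 9 11);

# Every requested key, with undef for keys missing from %hash.
my %slice = map { $_ => $hash{$_} } @want;

# Only the requested keys that actually exist in %hash.
my %present = map { $_ => $hash{$_} } grep { exists $hash{$_} } @want;
```

Reading $hash{11} in this way does not create the key in %hash.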
Calls the methods named in LIST on OBJECT and returns a hash of the results. If a method in LIST can not be performed on OBJECT, you will get the standard "Can’t call method ->... on object" error that Perl throws in this circumstance.
    my $object = ...;
    my %out = hash_slice_by $object, qw(foo bar baz);

    __END__
    %out = (
        foo => 'output of foo',
        bar => 'output of bar',
        baz => 'output of baz',
    );
Note that you may not use "hash_slice_by" to pass arguments to the methods given in LIST. Note too that your methods are invoked in scalar context.
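The behavior described above (no arguments, scalar context) corresponds roughly to the following core-Perl idiom. The Fruit class and its methods are hypothetical, defined only so the example runs on its own.

```perl
use strict;
use warnings;

package Fruit;    # hypothetical class for the demonstration
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub name  { return $_[0]{name} }
sub color { return $_[0]{color} }

package main;

my $object = Fruit->new(name => 'cherimoya', color => 'green');

# Call each named method with no arguments, in scalar context,
# and pair the method name with its result.
my %out = map { $_ => scalar $object->$_() } qw(name color);
# %out = (name => 'cherimoya', color => 'green')
```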
Rename the keys in HASH by the mapping table provided by BLOCK. HASH may be a real hash, or it may be an array that you are treating like a key/value store.
    my %hash = (crow => 'black', snow => 'white', libro => 'read all over');
    my %spanish = rekey { crow => 'corvino', snow => 'nieve' } %hash;

    __END__
    %spanish = (
        corvino => 'black',
        nieve   => 'white',
        libro   => 'read all over',
    );
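The example shows that keys without an entry in the mapping table ("libro" here) pass through unchanged. A plain-Perl sketch of that behavior, written for illustration with a hypothetical %map variable and not taken from the module:

```perl
use strict;
use warnings;

my %hash = (crow => 'black', snow => 'white', libro => 'read all over');
my %map  = (crow => 'corvino', snow => 'nieve');   # old key => new key

my %spanish;
for my $key (keys %hash) {
    # Use the mapped name when there is one, otherwise keep the key as-is.
    my $new_key = exists $map{$key} ? $map{$key} : $key;
    $spanish{$new_key} = $hash{$key};
}
# %spanish = (corvino => 'black', nieve => 'white', libro => 'read all over')
```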
Rename the values in HASH to the mapping table provided by BLOCK. HASH may be a real hash, or it may be an array that you are treating like a key/value store.
    my @start = (apple => 'red', apple => 'green');
    my @translated = revalue { red => 'rojo', green => 'verde' } @start;

    __END__
    @translated = (
        apple => 'rojo',
        apple => 'verde',
    );
Reorder the values in LIST by the mapping table provided by BLOCK. LIST may be either an array or a list. In general this function will not work on hashes.
    my @array = (1..5);
    my @reindexed = reindex { map { $_ => $_ + 1 } 0..$#array } @array;

    __END__
    @reindexed = (undef, 1..5);
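In the example the mapping table reads as "old index => new index", which is why every element shifts one slot to the right and slot 0 is left undefined. The same result in core Perl, as an illustrative sketch with a hypothetical %move table and not the module's code:

```perl
use strict;
use warnings;

my @array = (1 .. 5);
my %move  = map { $_ => $_ + 1 } 0 .. $#array;   # old index => new index

my @reindexed;
$reindexed[ $move{$_} ] = $array[$_] for keys %move;
# @reindexed = (undef, 1, 2, 3, 4, 5)
```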
The names and behaviors of most of these functions were initially developed at AirWave Wireless, Inc. I’ve re-implemented them here.
This software would be trapped on my hard drive were it not for Logan Bell’s encouragement to release it. Separating the personal time I have put into this from the professional time afforded by my employer, Shutterstock, Inc. would be very difficult. Thankfully I haven’t needed to; when I asked to share this, Dan McCormick simply said, "Go for it! Thanks for hacking."
(c) 2013 by Belden Lyman
This library is free software: you may redistribute it and/or modify it under the same terms as Perl itself; either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.