Hash::MostUtils - Yet another collection of tools for operating pairwise on lists.

NAME DESCRIPTION SYNOPSIS EXPORTS FUNCTIONS TO MAKE ARRAYS ACT LIKE HASHES lkeys LIST lvalues LIST leach [ ARRAY | HASH | ARRAYREF | HASHREF ] FUNCTIONS TO OPERATE ON LISTS, ARRAYS, AND HASHES AS TUPLES hashmap BLOCK LIST hashgrep BLOCK LIST hashapply BLOCK LIST hashsort BLOCK LIST GENERIC N−ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS n_each N, LIST n_map N, CODEREF, LIST n_grep N, CODEREF, LIST n_apply N, CODEREF, LIST GRAB BAG hash_slice_of HASHREF, LIST hash_slice_by OBJECT, LIST rekey BLOCK HASH revalue BLOCK HASH reindex BLOCK LIST ACKNOWLEDGEMENTS COPYRIGHT AND LICENSE

NAME

Hash::MostUtils − Yet another collection of tools for operating pairwise on lists.

DESCRIPTION

This module provides a number of functions for processing hashes as lists of key, value pairs.

my @found_and_transformed =
hashmap { uc($b) => 100 + $a }
hashgrep { $a < 100 && $b =˜ /[aeiou]/i } (
1 => 'cwm',
2 => 'apple',
100 => 'cherimoya',
);
my @keys = lkeys @found_and_transformed;
my @vals = lvalues @found_and_transformed;
foreach my $key (@keys) {
my $value = shift @vals;
print "$key => $val\n";
}
while (my ($key, $val) = leach @found_and_transformed) {
print "$key => $val\n";
}
my $serialized = join ',', hashsort { $a−>{key} cmp $b−>{key} } %hash;

EXPORTS

By default, none. On request, any of the following:

FUNCTIONS TO MAKE ARRAYS ACT LIKE HASHES

lkeys LIST

Return the "keys" of LIST. Perl’s keys() keyword only operates on hashes; lkeys() offers an approximation of the same functionality for lists.

my @evens = lkeys 1..10;
my @keys =
lkeys # give me back those keys (i.e. the letters)
hashgrep { $b > 100 } # find key/value pairs where the value is > 100
map { $_ => int(rand(1000)) } 'a'..'z'; # turn 'a'..'z' into key/value pairs with random values

The "keys" of a list are the even-positioned items. Note that in the case of an ">empty slot<" in a sparse array, the key will be "undef".

lvalues LIST

Return the "values" of LIST. Perl’s values() keyword only operates on hashes; lvalues() offers an approximation of the same functionality for lists.

my @odds = lkeys 1..10;
my @values =
lvalues # give me back those values (i.e. the letters)
hashgrep { $a > 100 } # look for key/value pairs where the key is > 100
map { int(rand(1000)) => $_ } 'a'..'z'; # make 26 random keys from 1−1000, with fixed keys

The "values" of a list are the odd-positioned items. Note that in the case of an ">empty slot<" in a sparse array, the value will be "undef".

leach [ ARRAY | HASH | ARRAYREF | HASHREF ]

Iterate over an ARRAY, HASH, ARRAYREF, or HASHREF, returning successive "key/value" pairs. This behaves functionally identically to Perl’s built-in "each" keyword; however, it is useful for arrays and array− and hash-references. This function handles objects which are built around blessed array− and hash-references.

my @array = (1..4);
while (my ($k, $v) = leach @array) {
print "$k => $v\n";
}
print "$_\n" for @array;
__END__
1 => 2
3 => 4
1
2
3
4

Using "leach" to gather key/value pairs from a collection is guaranteed to be non-destructive to that collection. One pattern that’s useful for iterating arrays and arrary references in pairs is to use "splice", which has the possibly unintended side effect of destroying the subject collection:

my @array = (1..4);
while (my ($k, $v) = splice @array, 0, 2) {
print "$k => $v\n";
}
print "$_\n" for @array;
__END__
1 => 2
3 => 4

Note the distinction between saying that this function is

leach ARRAY

rather than

leach LIST

Perl does not allow this behavior:

while (my ($k, $v) = leach 1..10) { # can't leach a list, only an array
# do something with this key/value tuple
}

But don’t worry, Perl also doesn’t allow for this behavior:

while (my ($k, $v) = splice 1..10, 0, 2) { # can't splice a list, only an array
# do something with this key/value tuple
}

FUNCTIONS TO OPERATE ON LISTS, ARRAYS, AND HASHES AS TUPLES

"hashmap", "hashgrep", and "hashapply" all act like their corresponding "map", "grep", and "List::Utils::apply" but for one notable exception: whereas "map", "grep", and "apply" all eat items from the given list one-by-one and assign that current value to $_, "hashmap", "hashgrep", and "hashapply" all eat items from the given list two-by-two, and assigns them to $a and $b.

The names $a and $b were chosen because they’re already in lexical scope in Perl due to "sort"’s need for them.

If you have a singular occurance of $a and $b within your program, you will probably see this warning from Perl:

Name 'main::a' used only once: possible typo at ...
Name 'main::b' used only once: possible typo at ...

I’ve just gotten in the habit of adding:

use strict;
use warnings; no warnings 'once';

when I see that message.

hashmap BLOCK LIST

This acts similar to

map BLOCK LIST

with the exception that "map" eats items off of LIST one at a time, assigning the current value to $_; whereas "hashmap" eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.

# naive transformation of this hash into (101 => 'A', 102 => 'B')
my %hash = (
a => 1,
b => 2,
);
my %transformed =
hashmap { $b + 100 => uc($a) }
%hash;

Just like "map", your BLOCK will be called without any arguments. Like perl’s keyword "map", this function maintains the order of LIST.

"hashmap" is simply a prototyped alias for n_map(2, CODEREF, LIST), so all of the documentation to "n_map" applies here.

hashgrep BLOCK LIST

This acts similar to

grep BLOCK LIST

with the exception that "grep" eats items off of LIST one at a time, assigning the current value to $_; whereas "hashgrep" eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.

# lame object dumper
my $object = Some::Class−>new(...);
my %dump =
hashgrep { $a !˜ /ˆ_/ && ! ref($b) } # hide private fields and internal data structures
%$object;

Just like "grep", your BLOCK will be called without any arguments. Like perl’s keyword "grep", this function maintains the order of LIST.

"hashgrep" is simply a prototyped alias for n_grep(2, CODEREF, LIST), so all of the documentation to "n_grep" applies here.

hashapply BLOCK LIST

This is similar to "List::MoreUtils::apply":

apply BLOCK LIST

with the usual exception: "apply" eats items off of LIST one at a time, assigning to $_; whereas "hashapply" eats items off of LIST two at a time, assigning the first value to $a and the second value to $b.

Normal "apply" can be written as map:

my @words = qw(apple banana cherimoya); my @clean1 = map { tr/aeiou//d; $_ } @words; # @clean1 = @words = qw(ppl bnn chrmy);

@words = qw(apple banana cherimoya); my @clean2 = apply { tr/aeiou//d } @words; # @clean2 = qw(ppl bnn chrmy); @words = qw(apple banana cherimoya);

Note that "apply" does not transform the original data, whereas "map" does. Similarly, "hashapply" does not transform the original data, whereas "hashmap" might.

Note that "apply" does not need to explicitly return $_, whereas "map" does. Similarly, "hashapply" does not need to explicitly return a key/value tuple ($a, $b), whereas "hashmap" does need to return something.

Like "apply", "hashapply" will not transform the original LIST.

hashsort BLOCK LIST

Sort LIST by BLOCK, handling two tuples at a time. $a and $b will each have the form:

$a = +{key => ..., value => ...};
$b = +{key => ..., value => ...};

This call:

my %hash = (a => 1, n => 14, m => 13, b => 2, z => 26);
my @sorted =
hashsort { $b−>{key} cmp $a−>{key} }
%hash;

Is equivalent to this:

my %hash = (a => 1, n => 14, m => 13, b => 2, z => 26);
my @sorted =
map { ($_−>{key} => $_−>{value}) }
sort { $b−>{key} cmp $a−>{key} }
map { +{key => $_, value => $hash{$_} }
keys %hash;

"hashsort" is the "sort"−body of a Schwartzian transform over a list of tuples.

GENERIC N−ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS

With the exception of "hashsort", each of the pairwise functions mentioned so far − "leach", "hashmap", "hashgrep", "hashapply" − are actually implemented in terms of more generic N−ary forms. This means that if you need to process a list in sets of N, where N is > 2, you may use the n_* forms of these functions.

Variable naming becomes more interesting when moving beyond 2 items. Whereas $a and $b are always in lexical scope, once you go to N of 3, you need to agree on some variable naming convention.

$a and $b work nicely for the first two elements of a list; so $c is the third, and $d the fourth, and so on. One limitation of this naming scheme is that you may not easily go beyond N of 26 − but if you find yourself needing that, you’ll find the code simple to extend.

In order to prevent ’strict refs’ from complaining about $c..$z, you’ll need to address those variables a bit differently:

my @sets =
n_map 6, sub { [$a, $b, $::c, $::d, $::e, $::f] },
n_apply 3, sub { $_ *= 3 for $a, $b, $::c },
n_grep 3, sub { $::c > 4 },
(1..9); # @sets = ([12, 15, 18, 21, 24, 27]);

I personally find the transition between $b and $::c to be a bit jarring visually, so the one time I wrote a line like the above I chose to write it as $::a and $::b.

my @sets =
n_map 6, sub { [$::a, $::b, $::c, $::d, $::e, $::f] },
n_apply 3, sub { $_ *= 3 for $::a, $::b, $::c },
n_grep 3, sub { $::c > 4 },
(1..9); # @sets = ([12, 15, 18, 21, 24, 27]);

n_each N, LIST

Iterate over LIST, returning successive "key/values" sets.

my @list = (1..9);
while (my ($k, @v) = n_each 3, @list) {
# do something with this $k and @v
}

There’s nothing that says your N needs to remain constant:

my @list = (
a => 1,
b => 1, 2,
c => 1, 2, 3,
d => 1, 2, 3, 4,
);
my $n = 2;
my %triangle;
while (my ($k, @v) = n_each $n++, @list) {
$triangle{$k} = \@v;
}
__END__
%triangle = (
a => [1],
b => [1, 2],
c => [1, 2, 3],
d => [1, 2, 3, 4],
);

There’s probably something clever that you can do with this that I just don’t understand. Please drop me a line if you know what it is.

n_map N, CODEREF, LIST

"map" CODEREF over LIST, operating in N−sized chunks. Within the context of CODEREF, values of LIST will be selected and aliased. LIST must be evenly divisible by N.

See "GENERIC N−ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.

my @transformed = n_map(
3,
sub { "$a, $b $::c!\n" },
qw(goodnight sweet prince goodbye cruel world),
);
# @transformed = ("goodnight, sweet prince!\n", "goodbye, cruel world!");

If you are consistently n_map’ping by some N, then you might consider wrapping n_map so the call syntax looks more like one of Perl’s functional keywords:

sub tri_map (&@) { unshift @_, 3; goto &n_map }
my @transformed =
tri_map { "$::a, $::b $::c!\n" }
qw(goodnight sweet prince goodbye cruel world);
# @transformed = ("goodnight, sweet prince!\n", "goodbye, cruel world!");

n_grep N, CODEREF, LIST

"grep" for CODEREF over LIST, operating in N−sized chunks. Within the context of CODEREF, values of LIST will be selected and aliased. LIST must be evenly divisible by N.

See "GENERIC N−ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.

my @found = n_grep(
3,
sub { $a =˜ /good/ && $::c =˜ /prince/ },
qw(goodnight sweet prince goodbye cruel world),
);
# @found = qw(goodnight sweet prince);

Just as with "n_map", writing a small bit of gloss to make your N of n_grep work in a functional manner is simple, and makes your code more readable:

sub tri_grep (&@) { unshift @_, 3; goto &n_grep }
my @found =
tri_grep { $::a =˜ /good/ && $::c =˜ /prince/ }
qw(goodnight sweet prince goodbye cruel world);
# @found = qw(goodnight sweet prince);

n_apply N, CODEREF, LIST

"List::Utils::apply" CODEREF to LIST, operating in N−sized chunks. LIST must be evenly divisible by N.

See "GENERIC N−ARY FORMS OF VARIOUS LIST-WISE FUNCTIONS" for a discussion of variable names.

my @uppercase = n_apply(
3,
sub { uc $::c }
qw(goodnight sweet prince goodbye cruel world),
);
# @uppercase = qw(goodnight sweet PRINCE goodbye cruel WORLD);

Just as with "n_map", writing a small bit of gloss to make your N of n_apply work in a functional manner is simple, and makes your code more readable:

sub tri_apply (&@) { unshift @_, 3; goto &n_apply }
my @uppercase =
tri_apply { uc $::c }
qw(goodnight sweet prince goodbye cruel world);
# @uppercase = qw(goodnight sweet PRINCE goodbye cruel WORLD);

GRAB BAG

I like these functions, but they’re decidedly different from everything up to this point. They are mostly used to turn an existing hash reference or object into a smaller representation of itself.

hash_slice_of HASHREF, LIST

Looks into HASHREF and extracts the key/value pairs of the keys named in LIST. If a key in LIST is not present in HASHREF, returns undefined.

my %hash = (1..10);
my %slice = hash_slice_of \%hash, qw(5, 7, 9, 11);
__END__
%slice = (
5 => 6,
7 => 8,
9 => 10,
11 => undef,
);

If you only want to get back key/value pairs for keys in LIST that exist in HASHREF, just add a "hashgrep":

my %hash = (1..10);
my %slice =
hashgrep { exists $hash{$a} }
hash_slice_of \%hash, qw(5, 7, 9, 11);
__END__
%slice = (
5 => 6,
7 => 8,
9 => 10,
);

hash_slice_by OBJECT, LIST

Calls the methods named in LIST on OBJECT and returns a hash of the results. If a method in LIST can not be performed on OBJECT, you will get the standard "Can’t call method −>... on object" error that Perl throws in this circumstance.

my $object = ...;
my %out = hash_slice_by $object, qw(foo bar baz);
__END__
%out = (
foo => 'output of foo',
bar => 'output of bar',
baz => 'output of baz',
);

Note that you may not use "hash_slice_by" to pass arguments to the methods given in LIST. Note too that your methods are invoked in scalar context.

rekey BLOCK HASH

Rename the keys in HASH by the mapping table provided by BLOCK. HASH may be a real hash, or it may be an array that you are treating like a key/value store.

my %hash = (crow => 'black', snow => 'white', libro => 'read all over');
my %spanish = rekey { crow => 'corvino', snow => 'nieve' } %hash;
__END__
%spanish = (
corvino => 'black',
nieve => 'white',
libro => 'read all over',
);

revalue BLOCK HASH

Rename the values in HASH to the mapping table provided by BLOCK. HASH may be a real hash, or it may be an array that you are treating like a key/value store.

my @start = (apple => 'red', apple => 'green');
my @translated = revalue { red => 'rojo', green => 'verde' } @start;
__END__
@translated = (
apple => 'rojo',
apple => 'verde',
);

reindex BLOCK LIST

Reorder the values in LIST by the mapping table provided by BLOCK. LIST may be either an array or a list. In general this function will not work on hashes.

my @array = (1..5);
my @reindexed = reindex { map { $_ => $_ + 1 } 0..$#array } @array;
__END__
@reindexed = (undef, 1..5);

ACKNOWLEDGEMENTS

The names and behaviors of most of these functions were initially developed at AirWave Wireless, Inc. I’ve re-implemented them here.

This software would be trapped on my hard drive were it not for Logan Bell’s encouragement to release it. Separating the personal time I have put into this from the professional time afforded by my employer, Shutterstock, Inc. would be very difficult. Thankfully I haven’t needed to; when I asked to share this, Dan McCormick simply said, "Go for it! Thanks for hacking."

COPYRIGHT AND LICENSE

This library is free software: you may redistribute it and/or modify it under the same terms as Perl itself; either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.

Updated 2024-01-29 - jenkler.se | uex.se