Solution
For newline-terminated lists (like /usr/share/dict/words) in files named input1 and input2:
join -t "$(printf '\n')" -1 2 -2 2 -o 1.1,2.1 input1 input2 | paste -d '' - -
Explanation
join will consider each line of input1 and input2 as an array of fields separated by newline characters (-t "$(printf '\n')"). Since there is exactly one newline character per complete line, each line (minus its terminating newline character) forms the first field entirely; all later fields are "virtual" and empty.
-1 2 -2 2 tells join to join lines where the second field of the first file matches the second field of the second file. As stated, these fields are empty, so each line from input1 will match each line from input2. The result will be the Cartesian product of the two sets of lines. For each member the tool will print the first field from the first file followed by the first field from the second file (-o 1.1,2.1), but because the first fields are in fact our whole lines (without newlines), we will get all possible combinations in the form of:
line from input1
line from input2
The newline after "line from input2" appears because a record ends here; this is fine. The newline after "line from input1" appears because our chosen separator is the newline character. This separator was perfect for the input, but it is wrong here in the output. paste -d '' - - fixes this. The tool takes one line from its standard input (-) and concatenates it with the next line, also from its standard input (-), with nothing in between (-d ''); and so on. This way each ordered pair of lines that is a member of the Cartesian product becomes:
line from input1line from input2
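To make the two stages concrete, here is a minimal sketch that runs the command on two tiny sample lists (the contents a, b and x, y are made up for illustration). It assumes GNU coreutils. One caveat: command substitution strips trailing newlines, so "$(printf '\n')" reaches join as an empty string, and the sketch relies on GNU join, whose documentation says that with an empty -t argument the whole line is considered.

```shell
# Work in a throwaway directory so the sample files do not clobber anything.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Two tiny newline-terminated lists (sample data, not from the answer above).
printf '%s\n' a b > input1
printf '%s\n' x y > input2

# Stage 1: join alone. Each ordered pair still occupies two lines,
# because the output field separator is the newline character.
stage1=$(join -t "$(printf '\n')" -1 2 -2 2 -o 1.1,2.1 input1 input2)

# Stage 2: the full pipeline. paste glues every two lines into one.
product=$(join -t "$(printf '\n')" -1 2 -2 2 -o 1.1,2.1 input1 input2 | paste -d '' - -)

printf '%s\n' "$product"
```

Under those assumptions the final printf should show the four combinations ax, ay, bx, by, one per line, while stage1 holds the same data spread over eight lines. POSIX spells the empty delimiter for paste as -d '\0'; GNU paste accepts both forms.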
Notes
If a list uses spaces and/or tabs as separators, use the following command to convert it to a newline-terminated list:
{ tr ' \t' '\n'; echo; } <blank-separated-list | grep . >newline-terminated-list
Leading separators will be ignored, trailing separators will be ignored, consecutive separators will be treated as one. If blank-separated-list contains an incomplete line then echo will fix this. There will be no empty lines in newline-terminated-list.
Posixly, join requires its inputs to be text files, and so does grep (see this answer to learn what this means). paste requires text files, except there is no limit to line lengths. tr shall accept any input.
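The conversion of a blank-separated list can be checked on a small made-up input that has leading blanks, a tab, doubled spaces and no terminating newline. This is only a sketch: the sample words are hypothetical, and the redirection is written after the brace group, which is the form POSIX shells such as bash and dash accept for compound commands.

```shell
# Work in a throwaway directory so the sample files do not clobber anything.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical sample: leading blanks, a tab, doubled spaces, no final newline.
printf '  foo bar\tbaz  qux' > blank-separated-list

# tr turns every space and tab into a newline, echo completes the last line,
# and grep . drops the empty lines left by leading or repeated separators.
{ tr ' \t' '\n'; echo; } <blank-separated-list | grep . >newline-terminated-list

cat newline-terminated-list
```

The resulting file should contain foo, bar, baz and qux, each on its own newline-terminated line, ready to be used as input1 or input2.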