Channel: User Kamil Maciorowski - Super User
Answer by Kamil Maciorowski for Find files (strictly) older than another file

Preliminary note

The purpose of this answer is to provide code that is both strict and portable. For comparison, the find … -newerct … and [ … -ot … approaches from other answers are not portable.


Basic solution

For every file that passes your ! -newer deployment_metadata.txt test, check whether deployment_metadata.txt is -newer than that file:

find . ! -newer deployment_metadata.txt -exec sh -c '
   [ -n "$(find deployment_metadata.txt -prune -newer "$1")" ]
' find-sh {} \; -print

Notes

  • find-sh is explained here: What is the second sh in sh -c 'some shell code' sh?

  • -prune is in case deployment_metadata.txt is a directory.

  • [ -n "$(find …)" ] converts non-empty or empty output from the inner find into exit status 0 or 1 respectively. This becomes the exit status of sh, and then -exec of the outer find evaluates as true or false respectively.

  • We want to know whether the resulting string is empty. It can only be empty or be exactly the name of the reference file (deployment_metadata.txt in our case). Remember that $(…) strips all trailing newlines: if the name of the reference file consisted of newline characters only, the result would be empty either way and our test could not tell the two possibilities apart. To solve this problem you should supply a broader path, e.g. ./…. The problem occurs only for a basename consisting of nothing but newline character(s); in practice you don't need to care, unless you deliberately use such a name instead of deployment_metadata.txt.

  • In [, -n is the default test. I used -n explicitly in case the name of the reference file starts with - and could be interpreted by [ as an option. The point is that not every implementation of [ is smart enough to recognize the default case early from the fact that there is exactly one argument before ]. The name of your reference file is deployment_metadata.txt and it's safe anyway, but if you ever want to use -whatever, then with the explicit -n it should still work.

  • We need a shell to make $(…) work. There will be one sh and one find for every file that passes your original test. Creating a new process is relatively slow, so the solution may perform poorly.

  • Our -exec is enough to test what we want, so ! -newer deployment_metadata.txt is not strictly needed. It's useful though, because it improves performance: each time ! -newer evaluates to false, our costly -exec is not evaluated. Without this preliminary test the -exec would be evaluated for every file tested by the outer find.

  • You can add more tests/actions before our -exec and/or directly before (or instead of) -print. Our whole -exec … \; is equivalent to the hypothetical -older deployment_metadata.txt you'd like to have. This is the beauty of find: with -exec you can build virtually any test.
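
To see the strictness in action, here is a minimal sketch that runs the basic solution in a scratch directory. The directory and the file names same_age.txt, older.txt and newer.txt are hypothetical and exist only for this demonstration; mktemp -d is assumed to be available (it is common, but not required by POSIX).

```shell
#!/bin/sh
# Scratch directory with hypothetical demo files (not part of the original question).
dir=$(mktemp -d) || exit 1
touch -t 202001010000 "$dir/deployment_metadata.txt"          # reference file
touch -r "$dir/deployment_metadata.txt" "$dir/same_age.txt"   # identical mtime
touch -t 201901010000 "$dir/older.txt"                        # strictly older
touch "$dir/newer.txt"                                        # mtime is "now"

# Run the basic solution inside a subshell, so the caller's directory is unchanged.
result=$(
   cd "$dir" &&
   find . ! -newer deployment_metadata.txt -exec sh -c '
      [ -n "$(find deployment_metadata.txt -prune -newer "$1")" ]
   ' find-sh {} \; -print
)
printf '%s\n' "$result"
rm -rf "$dir"
```

Under these assumptions only ./older.txt should be printed. same_age.txt and the reference file itself pass ! -newer, but the inner find rejects them because the reference file is not newer than either of them.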


Possible improvement

If all you want is to -print the result, the method can be optimized slightly:

find . ! -newer deployment_metadata.txt -exec sh -c '
   for f do
      [ -n "$(find deployment_metadata.txt -prune -newer "$f")" ] \
      && printf "%s\n" "$f"
   done
' find-sh {} +

In this approach one sh serves multiple pathnames supplied by the outer find. The outer find may still run several sh processes in sequence: if passing all the pathnames at once would trigger an "argument list too long" error, find splits them into batches (it is that smart). This way we lower the number of sh processes. We added printf, but printf is a builtin in virtually any implementation of sh, so it runs as part of the sh process, not as a separate process.
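
The effect on the number of sh processes can be observed with a small sketch. The five file names are hypothetical, and each sh does nothing except print one line, so counting lines counts sh invocations.

```shell
#!/bin/sh
# Scratch directory with five hypothetical files.
dir=$(mktemp -d) || exit 1
touch "$dir/a" "$dir/b" "$dir/c" "$dir/d" "$dir/e"

# With \; the outer find starts one sh per pathname.
per_file=$(cd "$dir" && find . -type f -exec sh -c 'echo run' find-sh {} \; | wc -l)

# With + the pathnames are passed in batches; five short names fit in one batch.
batched=$(cd "$dir" && find . -type f -exec sh -c 'echo run' find-sh {} + | wc -l)

printf 'one sh per file: %s\n' "$per_file"
printf 'batched with +:  %s\n' "$batched"
rm -rf "$dir"
```

With so few pathnames the batched form should need a single sh, while the per-file form needs five.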

You can still add more find-specific tests before our -exec, but not after (technically you can, but this won't work as expected because -exec … + always evaluates as true). Properly crafted shell code just before (or instead of) printf may work as further tests, but because it will probably involve spawning additional processes, it will defeat the purpose.
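
The difference in how the two forms of -exec evaluate can be checked with a short sketch. It uses false as a stand-in for a test that always fails; the scratch directory and the three file names are hypothetical.

```shell
#!/bin/sh
# Scratch directory with three hypothetical files.
dir=$(mktemp -d) || exit 1
touch "$dir/a" "$dir/b" "$dir/c"

# -exec … \; takes the result of the command: false fails, so -print is never reached.
after_semicolon=$(cd "$dir" && find . -type f -exec false {} \; -print | wc -l)

# -exec … + always evaluates as true, so -print runs for every file anyway.
after_plus=$(cd "$dir" && find . -type f -exec false {} + -print | wc -l)

printf 'printed after the semicolon form: %s\n' "$after_semicolon"
printf 'printed after the plus form:      %s\n' "$after_plus"
rm -rf "$dir"
```

This is why any filtering has to happen inside the shell code when the + form is used: a test placed after -exec … + sees every pathname.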

