Channel: User Kamil Maciorowski - Super User

Answer by Kamil Maciorowski for How can I get the extension(s) of a file based on its content?


Why does file --extension not work for me?

Not only for you. See this question. One of the comments there seems right:

Maybe just a very, very incomplete feature?

I haven't found any standard Unix tool to do the conversion, so your idea may be the easiest solution anyway.

An idea would be to use file --mime-type and then create a dispatch table array that maps known mime-types to their extensions, but I'd much rather have a simpler and safer solution.
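For illustration, the dispatch table idea quoted above could look roughly like this. It is only a sketch: the function name mime_to_ext and the handful of types and extensions listed are my own picks for the example, not a complete or authoritative map.

```shell
# Hypothetical dispatch table: map a few MIME types to one chosen extension.
# Prints the extension and returns 0, or prints nothing and returns 1.
mime_to_ext() {
   case "$1" in
      text/plain)      echo txt ;;
      image/jpeg)      echo jpg ;;
      image/png)       echo png ;;
      application/pdf) echo pdf ;;
      *)               return 1 ;;   # unknown type
   esac
}
```

It would be fed with the output of file, e.g. mime_to_ext "$(file -b --mime-type some_file)", where some_file is a placeholder.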

Note that such a map already exists: it's /etc/mime.types. See this other question on Unix & Linux SE. Based on one of the answers there, I came up with the following function:

function getext() {
   [ "$#" != 1 ] && { echo "Wrong number of arguments. Provide exactly one." >&2; return 254; }
   [ -r "$1" ] || { echo "Not a file, nonexistent or unreadable." >&2; return 1; }
   grep "^$(file -b --mime-type "$1")"$'\t' /etc/mime.types |
      awk -F '\t+' '{print $2}'
}

Usage:

getext test_text_file.txt   # it takes just one argument

Tailor it to your needs, make it a script etc. The main concerns:

  • On success (exit status 0), the output may be non-empty or empty (not even \n).

  • Some MIME types map to more than one extension. You can use cut -d ' ' -f 1 to get at most one, though it may not be the one you want.

  • So a custom map file instead of /etc/mime.types may be useful. This command will show you which mime-types exist in the current directory (and subdirectories):

      find . -type f -exec file -b --mime-type {} + | sort | uniq
  • grep shouldn't match more than once (at least with /etc/mime.types); ^ (line start) and $'\t' (tab) are there to avoid partial matching. Use grep -m 1 ... (or head -n 1 later) to be sure you'll get at most one line.
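To tie these concerns together, here is a sketch of a variant that accepts a custom map file and prints at most one extension. The names first_ext_for_mime and getext_first are hypothetical, and I assume the map uses the same layout as /etc/mime.types (MIME type, whitespace, space-separated extensions). It swaps grep for an exact string comparison in awk, so characters like . in a MIME type are not treated as regex syntax.

```shell
# Hypothetical helper: print the first extension listed for a MIME type.
# $1 = MIME type, $2 = map file in /etc/mime.types layout.
# Prints nothing if the type is missing or has no extensions.
first_ext_for_mime() {
   # Default field splitting handles tabs and spaces alike;
   # "exit" stops after the first matching line.
   awk -v mime="$1" '$1 == mime { if (NF > 1) print $2; exit }' "$2"
}

# Hypothetical wrapper: detect the MIME type with file, then consult the map.
# $1 = file to inspect, $2 = optional map file (defaults to /etc/mime.types).
getext_first() {
   [ -r "$1" ] || { echo "Not a file, nonexistent or unreadable." >&2; return 1; }
   first_ext_for_mime "$(file -b --mime-type "$1")" "${2:-/etc/mime.types}"
}
```

Usage would be something like getext_first test_text_file.txt my_map.types, where my_map.types is a placeholder for your own map. Put the extension you prefer first on each line of that map.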

