In case you want to recover one or a few text files with partially known content
If the file you want to recover is a plain text file (as Linux understands it, i.e. UTF-8) and the filesystem where the file used to be is/was neither encrypted nor compressed, in Linux use strings on the block device (partition) holding the filesystem.
For each file given, GNU strings prints the printable character sequences that are at least 4 characters long (or the number given with the options below) and are followed by an unprintable character.
(source: man 1 strings)
You want something like:
strings -aw -e S -n 10 /dev/sdX1 >/another/filesystem/extracted
(or pv /dev/sdX1 | strings -aw -e S -n 10 >/another/filesystem/extracted to see the progress).
Then extracted will be a text file you can view with less, search with grep etc. In my tests -e S was crucial to detect UTF-8 text with multi-byte characters.
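If you would rather rehearse the command before pointing it at a real device, something like the following can be run on a throwaway image file. This is only a sketch, assuming GNU strings (binutils) is installed; the image name, the padding and the sample phrase are all made up for illustration.

```shell
#!/bin/sh
# Rehearsal on a throwaway image instead of a real block device.
# Assumes GNU strings (binutils). All names and the sample text are hypothetical.
workdir=$(mktemp -d)
image="$workdir/fake-partition.img"
extracted="$workdir/extracted"

# Build a tiny "partition": zero padding, one UTF-8 text fragment, zero padding.
{
  head -c 4096 /dev/zero
  printf 'Zażółć gęślą jaźń: a known phrase\n'
  head -c 4096 /dev/zero
} >"$image"

# Same options as for the real device, only the input differs.
strings -aw -e S -n 10 "$image" >"$extracted"

# Search the result for a string known to be in the lost file.
grep -a -F 'known phrase' "$extracted"
```

A real partition will of course produce far more output than this toy image. Remove the temporary directory ($workdir) when you are done.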
Notes:
-n 10 tells the tool to print sequences at least 10 bytes long. The manual says "characters" but my tests with UTF-8 multi-byte characters show it's "bytes" for sure. The lower the number, the more garbage you will get. On the other hand, you should not exceed the block size used by the source filesystem, which is at least 512 (the lowest common sector size for block devices). The point is your file may be fragmented, and -n higher than the block size will miss a textual block if it happens to be between non-textual data. If your file was tiny (smaller than -n you used) then you might miss it completely. Similarly, you may miss the tail part of your desired file if the part happens not to be adjacent to other text.
extracted will probably be relatively huge anyway, too big for "manual" inspection. You will probably need to use a good text editor or a pager (capable of handling large text files) to interactively search for the string you know was in the file you want to recover. Or use grep (possibly with -A, -B; see man 1 grep) to search for the string. This way you will hopefully locate the relevant fragment of extracted.
The file you're after may be fragmented, scattered, not necessarily in sequence. In extracted there may be old versions, there may be fragments of other files (garbage, including text-alike fragments of binary files); all these possibly interleaved. extracted as a whole will be a textual jigsaw puzzle. Consider using the -s (--output-separator) option of strings, but keep in mind if there are unrelated fragments strictly adjacent in the filesystem then you won't get a separator between them, as if they were one bigger chunk.
If the filesystem you're trying to recover data from is on SSD and TRIM was performed after the mishap in which you lost the file, then there's a risk the content of the file is gone. This is a bad scenario.
On the other hand, if the filesystem is on SSD and TRIM was performed before the mishap, and there was no TRIM later, then the TRIM may have wiped out unrelated old data, old versions of files etc., but not the content of the file you're after. In effect you will get less garbage from strings. This is a good scenario.
As you can see, SSD may be a disadvantage or an advantage. For HDD these scenarios do not apply. Virtual disks may support something similar to TRIM.
In the beginning I wrote "the filesystem […] neither encrypted nor compressed". An encrypted or compressed filesystem would store textual data not in its plain form, so strings would be useless. I guess some other features of some filesystems may lower your chances or cause some extra garbage.
If you have enough RAM, consider copying extracted to /dev/shm (or use vmtouch -l) to speed up your work with grep or something.
The whole idea requires a string you know was in the file. Using the name of the file as a known string won't help you locate the content because in general filenames and actual data are stored separately, not necessarily near each other. This observation leads us to a preemptive strategy (i.e. in advance, before any mishap) that can make your important text files more prone to be recovered by our method after a future mishap.
Let's suppose you want to store a SerialKey for VeryImportantSoftware in a text file. The key is J7f9e7sc. Do not store the key only, build the text file like this:
SerialKey for VeryImportantSoftware: J7f9e7sc
In case you ever need to recover this file and you decide to use our method with strings, search for SerialKey and/or VeryImportantSoftware, or even for SerialKey for VeryImportantSoftware if you remember this is the exact string.
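The search for such a label could look roughly like this. It is a sketch only: the small file created below stands in for the real (much larger) extracted, and its other lines are invented to play the role of unrelated garbage.

```shell
#!/bin/sh
# Sketch of searching a strings dump for a known label.
# The file below is a stand-in for the real "extracted"; its contents are invented.
extracted=$(mktemp)
cat >"$extracted" <<'EOF'
GCC: (GNU) fragment of some binary
lorem ipsum from an unrelated document
SerialKey for VeryImportantSoftware: J7f9e7sc
/usr/share/doc/unrelated/path
EOF

# -a     treat the dump as text even if it contains stray non-text bytes
# -n     print line numbers within the dump
# -F     match the label literally, not as a regular expression
# -B/-A  print one line of context before and after each hit
grep -a -n -F -B 1 -A 1 'SerialKey for VeryImportantSoftware' "$extracted"
```

With a real dump you would probably want more context than one line on each side, because the rest of the file you're after may sit next to the line that matched.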