
Answer by Kamil Maciorowski for How to print all lines matching the pattern 1 and print only line with pattern 2 which is the line before line containing pattern 1?

<data sed -n '
   /someheader/ {h;d}
   /zzz/ {
      x
      /./ p
      s/.*//
      x
      p
      d
   }'

Explanation:

  • If the current line contains someheader, copy it to the hold space for later; delete the pattern space and start a new cycle, so the line cannot match zzz later in the code. In effect, such a line will be treated as a header even if it contains zzz; it will not be printed immediately. (But see the note below.)
  • Otherwise, if the current line contains zzz, do the following:
    • Exchange the content of the hold and pattern spaces, so zzz is now in the hold space, and the previously stored header (if any) in the pattern space.
    • If the pattern space is not empty, print it. This prints the previously stored header, but prints nothing (not even an empty line) if there was no header or if the header has been purposely forgotten.
    • Replace the whole pattern space with an empty string, so the header (if any) just printed is now forgotten and will not be printed again if another zzz occurs before the next header.
    • Exchange again. The current line containing zzz is now back in the pattern space.
    • Print the pattern space, i.e. the current line with zzz.
    • Delete the pattern space and start a new cycle. This step is not strictly necessary in the code as posted, but it lets you easily adjust the code in the way noted below.
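
The steps above can be traced on a tiny input. This is a minimal sketch, assuming a POSIX shell and a sed that accepts the script as posted (GNU sed, for example); the function name print_zzz_with_header and the sample lines are made up for illustration:

```shell
# Hypothetical wrapper around the sed script from the answer; reads stdin.
print_zzz_with_header() {
  sed -n '
    /someheader/ {h;d}
    /zzz/ {
      x
      /./ p
      s/.*//
      x
      p
      d
    }'
}

# Made-up sample: one zzz before any header, then two zzz under one header.
printf '%s\n' \
  'zzz orphan' \
  'USA someheader' \
  'xxx' \
  'zzz 1' \
  'zzz 2' |
  print_zzz_with_header
```

On this input the sketch should print "zzz orphan", "USA someheader", "zzz 1" and "zzz 2", one per line: the orphan zzz gets no header and no empty line, and the header is printed once, before the first zzz that follows it.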

Note:

  • As stated, a line containing someheader will be treated as a header, even if it contains zzz. If you want such a line to be treated as a non-header, just move the /someheader/ {…} line below the /zzz/ {…} block, so that the /zzz/ {…} block comes first.
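
The reordering described in the note might look like the following. Again this is only a sketch under the same assumptions (POSIX shell, a sed that accepts the script as posted); the function name print_zzz_first and the sample lines are hypothetical:

```shell
# Hypothetical variant: the /zzz/ block comes first, so a line containing
# both patterns is handled as an ordinary zzz line. The final d in that
# block keeps such a line from reaching the /someheader/ command.
print_zzz_first() {
  sed -n '
    /zzz/ {
      x
      /./ p
      s/.*//
      x
      p
      d
    }
    /someheader/ {h;d}'
}

# Made-up sample with an ambiguous line that matches both patterns.
printf '%s\n' \
  'UK someheader' \
  'ambiguous someheader zzz' \
  'zzz' |
  print_zzz_first
```

With this ordering the sample should print "UK someheader", "ambiguous someheader zzz" and "zzz". With the original ordering the ambiguous line would replace "UK someheader" in the hold space, and the output would be only "ambiguous someheader zzz" followed by "zzz".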

The solution allows you to parse data in a sane way, even if the data is like this:

zzz without header
USA someheader
xxx
yyy
zzz
UK someheader
ambiguous someheader zzz   # a header or a non-header?
aa
xxx
zzz
bzzz                       # multiple zzz under one header
INDIA someheader
xx
sss
yyy
