    <data sed -n '
    /someheader/ {h;d}
    /zzz/ {
       x
       /./ p
       s/.*//
       x
       p
       d
       }'
Explanation:
- If the current line contains `someheader`, copy it to the `h`old space for later; `d`elete the pattern space and start a new cycle, so the line cannot match `zzz` later in the code. In effect, such a line is treated as a header even if it contains `zzz`; it is not printed immediately. (But see the note below.)
- Otherwise, if the current line contains `zzz`, do the following:
  - E`x`change the contents of the hold and pattern spaces, so the `zzz` line is now in the hold space and the previously stored header (if any) is in the pattern space.
  - If the pattern space is not empty, `p`rint it. This prints the previously stored header, but prints nothing (not even an empty line) if there was no header or if the header has been purposely forgotten.
  - Replace the whole pattern space with an empty string, so the header (if any) just printed is now forgotten and will not be printed again if another `zzz` occurs before the next header.
  - E`x`change again. The current line containing `zzz` is now back in the pattern space.
  - `p`rint the pattern space, i.e. the current line with `zzz`.
  - `d`elete the pattern space and start a new cycle. This step is not really necessary in the code as posted, but it allows you to easily adjust the code in the way noted below.
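The steps above can be watched on a small input. This is only a sketch: it assumes a POSIX shell, the function name `print_zzz_with_header` and the four input lines are made up for illustration, and the script is the same sequence of commands laid out one per line so each can carry a comment.

```shell
# Hypothetical helper: the script from the answer, one command per line,
# reading from standard input.
print_zzz_with_header() {
  sed -n '
    # Header line: store it in the hold space, print nothing yet.
    /someheader/ {
      h
      d
    }
    /zzz/ {
      # Swap: the stored header (or nothing) comes into the pattern space.
      x
      # Print the header only if there is one.
      /./ p
      # Forget the header so it is printed at most once.
      s/.*//
      # Swap back: the zzz line returns, the hold space is now empty.
      x
      p
      d
    }'
}

# Made-up input: one header, one unrelated line, two zzz lines.
printf '%s\n' 'A someheader' 'foo' 'zzz 1' 'zzz 2' | print_zzz_with_header
```

This prints `A someheader`, `zzz 1` and `zzz 2`, one per line: the header appears once, before the first `zzz` line, and is not repeated for the second.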
Note:
- As stated, a line containing `someheader` will be treated as a header, even if it contains `zzz`. If you want such a line to be treated as a non-header, just move the `/someheader/ {…}` line to under the `/zzz/ {…}` block, so the `/zzz/ {…}` block is first.
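The reordering described in the note might look like the following sketch. The function name `zzz_block_first` and the two input lines are assumptions for illustration; the commands are the ones from the answer, only with the `/zzz/` block placed first and written one command per line.

```shell
# Hypothetical variant: the /zzz/ block comes first, so a line containing
# both someheader and zzz is handled as a zzz line, not as a header.
zzz_block_first() {
  sed -n '
    /zzz/ {
      x
      /./ p
      s/.*//
      x
      p
      d
    }
    # Reached only by lines without zzz, because of the d above.
    /someheader/ {
      h
      d
    }'
}

# Made-up input: a plain header followed by an ambiguous line.
printf '%s\n' 'UK someheader' 'ambiguous someheader zzz' | zzz_block_first
```

Here the output is `UK someheader` followed by `ambiguous someheader zzz`. With the original order, the same input prints nothing, because the ambiguous line is stored as a new header and no later `zzz` line follows it.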
The solution allows you to parse `data` in a sane way, even if the data is like this:
    zzz without header
    USA someheader
    xxx
    yyy
    zzz
    UK someheader
    ambiguous someheader zzz # a header or a non-header?
    aa
    xxx
    zzz
    bzzz # multiple zzz under one header
    INDIA someheader
    xx
    sss
    yyy
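As a rough check, the script can be run over this sample. The sketch below makes some assumptions: the line breaks in the sample are reconstructed as shown in the here-document, the helper names `sample_data` and `headers_for_zzz` are hypothetical, and the data is piped in rather than read from a file named `data`.

```shell
# Hypothetical helper: emits the sample, with assumed line breaks.
sample_data() {
  cat <<'EOF'
zzz without header
USA someheader
xxx
yyy
zzz
UK someheader
ambiguous someheader zzz # a header or a non-header?
aa
xxx
zzz
bzzz # multiple zzz under one header
INDIA someheader
xx
sss
yyy
EOF
}

# Hypothetical helper: the script from the answer, original block order.
headers_for_zzz() {
  sed -n '
    /someheader/ {
      h
      d
    }
    /zzz/ {
      x
      /./ p
      s/.*//
      x
      p
      d
    }'
}

sample_data | headers_for_zzz
```

Under those assumptions the output is:

```
zzz without header
USA someheader
zzz
ambiguous someheader zzz # a header or a non-header?
zzz
bzzz # multiple zzz under one header
```

`UK someheader` and `INDIA someheader` never appear because no `zzz` line follows them before the next header (or the end of input), and the ambiguous line is printed as a header because the `/someheader/` test runs first.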