I have a bunch of files that contain XML tags like:
<h> PIDAT <h> O
I need to delete everything that comes after the first <h>
in that line, so I can get this:
<h>
For that I'm using
sed -i -e 's/(^<.*?>).+/$1/' *.conll
But it seems that sed is not recognizing the $1. (As I understand it, $1 should keep only what the group matched and drop everything else.) Is there a way I can achieve this? I'd really appreciate it if you could point me in the right direction.
PS: I tested those expressions in a regex app and they worked, but they do not work from the command line.
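
For comparison, here is a minimal sketch of how the same substitution might be written in sed's own regex dialect. It assumes a sed that supports -E (GNU and BSD sed both do); the variable names line and result are only illustrative, and the sample line is the one shown above. The regex app was most likely using Perl-style syntax, where .*? is lazy and $1 refers to the first group; sed instead writes backreferences as \1 and has no lazy quantifier, so a negated character class is used here in its place.

```shell
# Sample input, copied from the example line above.
line='<h> PIDAT <h> O'

# -E switches to extended regular expressions, so ( ) group without backslashes.
# [^>]* replaces the lazy .*? : it cannot run past the first '>'.
# \1 (not $1) inserts the text captured by the first group.
result=$(printf '%s\n' "$line" | sed -E 's/^(<[^>]*>).*/\1/')

printf '%s\n' "$result"
```

If that gives the expected <h>, the same expression should be usable in place with something like sed -i -E 's/^(<[^>]*>).*/\1/' *.conll. That form assumes GNU sed; BSD/macOS sed expects an argument after -i (for example -i ''), so it is worth trying on a copy of one file first.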