Match a row as long as possible
I need to parse a position-based (fixed-width) file from a legacy system. Each
column in the file has a fixed width, and each row can be at most 80 chars
long. The problem is that you don't know how long a row is. Sometimes only
the first five columns are filled in, and sometimes all columns are
used.
If I KNEW that all 80 chars were used, I could simply do this:
^\s*
(?<a>\w{3})
(?<b>[ \d]{2})
(?<c>[ 0-9a-fA-F]{2})
(?<d>.{20})
...
But the problem with this is that if the last columns are missing, the row
will not match. The last column that is present can even have fewer chars
than the maximum for that column.
See example:
Text to match       a   b  c  d
"AQM45A3A text " => AQM 45 A3 "A text "  // group d has 9 chars instead of 20
"AQM45F5"        => AQM 45 F5            // group d is missing
"AQM4"           => AQM 4                // group b has 1 char instead of 2
"COM*A comment"  => Comments do not match (all comments are prefixed with COM*)
" "              => Empty lines do not match
In this example, EACH row that I want to parse starts with AQM.
How should I design the regular expression to match this?
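One possible way to approach this, sketched in Python purely as an assumption (the pattern above uses `(?<name>...)` groups, which Python spells `(?P<name>...)`): nest every later column inside an optional group, and let a column be shorter than its width only when the row ends right after it. The names `ROW` and `parse_row` are made up for illustration, group a is hard-coded to AQM because every row to parse starts with it, and only the columns a to d shown above are covered. The columns hidden behind the "..." would presumably nest the same way, with d then getting the same "full width, or shorter at end of row" treatment.

```python
import re
from typing import Optional

# Hypothetical sketch: covers only columns a-d from the question.
ROW = re.compile(
    r"^\s*"
    r"(?P<a>AQM)"                            # rows to parse start with AQM
    r"(?:"                                   # everything after a is optional
    r"(?P<b>[ \d]{2}|[ \d]$)"                # full width, or 1 char at end of row
    r"(?:"                                   # everything after b is optional
    r"(?P<c>[ 0-9a-fA-F]{2}|[ 0-9a-fA-F]$)"  # full width, or 1 char at end of row
    r"(?P<d>.{1,20})?"                       # last column here: 1 to 20 chars
    r")?"
    r")?"
    r"$"
)


def parse_row(line: str) -> Optional[dict]:
    """Return the named groups of a row, or None if the row does not match.

    Columns that are missing from a short row come back as None.
    """
    match = ROW.match(line.rstrip("\r\n"))
    return match.groupdict() if match else None


if __name__ == "__main__":
    for text in ["AQM45A3A text ", "AQM45F5", "AQM4", "COM*A comment", " "]:
        print(repr(text), "=>", parse_row(text))
```

Because a short column is only accepted directly before the end of the row, a row such as "AQM4A3" is rejected instead of being silently read with shifted columns. Whether that strictness is wanted depends on how the legacy system writes its rows.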