Regex: To pull out a sub-string between two tags in a string

"I have a file in the following format:

Data Data
Data
[Start]
Data I want
[End]
Data

I'd like to grab the Data I want from between the [Start] and [End] tags using a Regex. Can anyone show me how this might be done?"

Asked by: Guest | Views: 330
Total answers/comments: 4
Guest [Entry]

"\[start\]\s*(((?!\[start\]|\[end\]).)+)\s*\[end\]

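Capture group 1 then holds the text between the markers, without the [start] and [end] tags themselves. Match case-insensitively, since the file uses [Start] and [End], and apply the pattern to the whole text rather than line by line, because the tags sit on their own lines."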
Guest [Entry]

"$text = "Data Data Data start Data i want end Data";
($content) = $text =~ m/ start (.*) end /;
print $content;

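I had a similar problem for a while, and I can tell you this method works. Note that it matches the bare words start and end surrounded by spaces; for the bracketed tags in the question, the brackets must be escaped, as in \[Start\] and \[End\]."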
Guest [Entry]

"A more complete discussion of the pitfalls of using a regex to find matching tags can be found at: http://faq.perl.org/perlfaq4.html#How_do_I_find_matchi. In particular, be aware that nested tags really need a full-fledged parser in order to be interpreted correctly.

Note that the match must be case-insensitive in order to answer the question as stated, because the file uses [Start] and [End]. In Perl, that's the i modifier:

$ echo "Data Data Data [Start] Data i want [End] Data" \
| perl -ne '/\[start\](.*?)\[end\]/i; print "$1\n"'
Data i want

The other trick is to use the non-greedy *? quantifier, so that the capture stops at the first closing tag instead of the last. For instance, if the input has an extra, unmatched [end] tag:

Data Data [Start] Data i want [End] Data [end]

you probably don't want to capture:

Data i want [End] Data"