Finding occurrences of all words in text
I want to find which words from a given list occur in a string. Currently,
I split the string on whitespace and check each word in the list against
the resulting tokens. However, this obviously fails when the words in the
string aren't separated by whitespace. My strings can take various forms,
and I would like a single approach that handles all of them.
Below is my code:
mytext = "this is just a test for foo bar"
words = ["foo", "test", "awesome"]
mytextwords = mytext.toLowerCase().split(/\s+/)
assert ["foo", "test"] == words.findAll{it.toLowerCase() in mytextwords}
The above works fine; however, it fails when mytext is any of the following:
mytext = "/this/is/just/a/test/for/foo/bar" //foo and test should be found
mytext = "this is just a+test for foo+bar" //foo and test should be found
mytext = "this is just (atest) for /foo" //only foo and not test should be found
Question
I guess it would be best to split the string on whitespace and all other
special characters. How can I do that?
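
One possible way to do this, as a minimal sketch rather than a definitive answer, is to split on runs of non-word characters (`\W+`) instead of whitespace (`\s+`). The helper name `findWords` is hypothetical and only used for illustration. Note the assumptions: `\W` treats letters, digits and the underscore as word characters, and by default it only recognises ASCII letters, so this would need adjusting for text containing accented characters or underscores used as separators.

```groovy
// Hypothetical helper: returns the entries of `words` that occur
// as whole tokens in `text`, ignoring case.
def findWords(String text, List<String> words) {
    // Split on runs of non-word characters, i.e. anything other than
    // ASCII letters, digits, or underscore. A leading separator yields
    // an empty first token, which is harmless here.
    def tokens = text.toLowerCase().split(/\W+/) as Set
    words.findAll { it.toLowerCase() in tokens }
}

def words = ["foo", "test", "awesome"]

assert findWords("this is just a test for foo bar", words) == ["foo", "test"]
assert findWords("/this/is/just/a/test/for/foo/bar", words) == ["foo", "test"]
assert findWords("this is just a+test for foo+bar", words) == ["foo", "test"]
// "atest" is a single token, so "test" is not matched.
assert findWords("this is just (atest) for /foo", words) == ["foo"]
```

Because whole tokens are compared rather than substrings, "atest" does not count as an occurrence of "test", which matches the expected behaviour of the last example.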