Sunday, January 7, 2018

yogi_Split A String In Cell C2 To Pull Items In Master List (Cells D2:D)

Google Spreadsheet   Post  #2344

Yogi Anand, D.Eng, P.E.      ANAND Enterprises LLC -- Rochester Hills MI     Jan-07-2018
Filtering an array where elements may occur within each other as substrings
I am trying to take an input string and return a set of substrings that are found within it. The issue is that my master substrings are very similar, and some contain the contents of others.
Here is a sample data set:
- ABC News
- CBC News
- MTV News

If I input "ABC and MTV News," I would like the return value to be "ABC" and "MTV News," NOT including MTV (despite the fact that it is technically a substring). I would also be able to input "ABC and ABC News" and have both returned correctly. Ideally I'd like to do it in a way that isn't limited to this set of data (so an if statement like "if MTV is found AND MTV News is found then..." would not be suitable). My intention is to use the filtered data for vlookup purposes, so hardcoding the data differently isn't really an option either. Another user suggested using regexextract and regexmatch and changing the master data, but I'm wondering if it may be possible to utilize them without having to do that.

Most critically, the input string is not restricted to a certain format. All of the following should be valid and return the same thing:

"ABC News and MTV"
"ABC News, MTV"
"I like watching TV. My favourite news channel is ABC News. I also like MTV."

All of the above should return "ABC News" and "MTV" as found strings, and exclude "ABC," as its only appearance is within an existing element. However, the following should also work:

"ABC, ABC News, and MTV"
"ABC News, ABC, MTV"
"I like watching ABC, especially ABC News. I also like MTV."

All of the above should return "ABC," "ABC News," and "MTV," with no exclusion of "ABC" because it was found independent of its appearance within "ABC News."

Sample sheet:

Input: B1
Master data: D and E
Vlookup sample: B13