Ruby on Rails - Given 2 strings, get array of all separate substring phrases greater than x in length


Let's say I have an array of strings:

["carflam fizz peanut butter", "fizz foo", "carflam foo peanut butter"] 

The output of a function get_array_of_substrings_larger_than(min), called as get_array_of_substrings_larger_than(3), should be ["peanut butter", "carflam", "fizz"], because at least 2 strings share each of those elements.

I can't quite figure out how to write this. Note that it's not the same as comparing every string with the others and taking the largest substring; in the example above, "carflam" is the second largest substring.

"peanut butter" is in the output because when you compare "carflam fizz peanut butter" and "carflam foo peanut butter", the largest common substring is "peanut butter". The second largest substring is "carflam", and both of these should be in the output independently. However, "peanut" and "butter" should not be in the output, because both are contained in a larger substring.

Thanks for the help.

So first of all, I think it's clear that you're asking for the largest phrase, for lack of a better word. The largest substrings I see in your example array are "carflam f" and " peanut butter". Also, feel free to change the ary argument if that's a known quantity in whatever class you're using:

    def get_array_of_phrases_larger_than(ary, min)
      all = []

      # This is ugly, but span the range of possible phrases for each item
      # in the array, building them into a one-dimensional array if they
      # meet the minimum length requirements
      ary.each do |phrase|
        words = phrase.split
        last = words.length - 1
        (0..last).each do |from|
          (from..last).each do |to|
            p = words[from..to].join(" ")
            all << p if p.size > min
          end
        end
      end

      # Get the list of repeated keys
      repeated = all.group_by(&:to_s).select { |_, v| v.size > 1 }
      keys = repeated.keys

      # Get the list of longest keys, so as to exclude "peanut" and "butter"
      # if "peanut butter" exists
      longest = repeated.select do |key, _|
        keys.select { |k| k.include?(key) }.size == 1
      end

      # Sort in reverse order of length
      longest.keys.sort_by { |k| -k.size }
    end

    @ary = ["carflam fizz peanut butter", "fizz foo", "carflam foo peanut butter"]

    get_array_of_phrases_larger_than @ary, 3
    # => ["peanut butter", "carflam", "fizz"]

Note that this is agnostic about which strings the phrases come from, so you could potentially get a false positive, such as ["butter butter", "foo", "baz"] returning ["butter"], but I'll leave that as an exercise for the reader.
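One possible way to approach that exercise is to count each phrase at most once per source string, so that a phrase only qualifies when it appears in at least two different strings. This is only a rough sketch: the method name shared_phrases_larger_than is made up for illustration, and it keeps the same word-level splitting and include? containment check as the method above.

```ruby
# Hypothetical variant (not part of the original answer): a phrase counts
# as shared only if it occurs in at least two *different* strings.
def shared_phrases_larger_than(ary, min)
  # For every string, collect its distinct word-level phrases longer than min.
  per_string = ary.map do |str|
    words = str.split
    phrases = (1..words.length).flat_map do |len|
      words.each_cons(len).map { |slice| slice.join(" ") }
    end
    phrases.select { |phrase| phrase.size > min }.uniq
  end

  # Count how many different strings contain each phrase.
  counts = Hash.new(0)
  per_string.each { |phrases| phrases.each { |phrase| counts[phrase] += 1 } }
  shared = counts.select { |_, n| n > 1 }.keys

  # Drop phrases contained in another shared phrase, then sort longest first.
  shared
    .reject { |key| shared.any? { |other| other != key && other.include?(key) } }
    .sort_by { |key| -key.size }
end

shared_phrases_larger_than(["butter butter", "foo", "baz"], 3)
# => []
```

The uniq call per string is what removes the false positive: "butter" is seen twice inside "butter butter" but is only counted once for that string.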

