algorithm - Matching CLOSEST file in given ASCII Text Files


Problem:

I have around 20 ASCII text files, each smaller than 10^9 bytes. Another ASCII text file (say foo) is given. The program should strategically match the contents of foo against the given 20 files and print the name of the closest matching file. The contents of foo might match only partially.

Since the files are large, I'm wondering:

1. How can I use information retrieval here (I don't know IR)?

2. Which data structure should I use to store such information?

3. What is the best algorithm to implement it?

I know I'm asking a lot, but I'm stuck on this problem and can't work out how to approach it. Any help is appreciated. Thanks!

I assume the files contain only text, so each file can be treated as one big string. Make 20 vectors or arrays, one per file, then go through each file and store each word as an element of its vector. Build a word vector for the given file (foo) in the same way, and create one more vector of size 20 to hold the match count for each file. Then loop over the word vectors: whenever the word at a given index in one of the 20 vectors matches the word at the same index in the given file's vector, increase the value for that file in the match-count vector. At the end, the highest value in the match-count vector indicates the file that is the best match.
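
The approach above can be sketched roughly as follows in Python. This is only an illustration under some assumptions: words are taken to be whitespace-separated tokens, and the function names (`words`, `positional_matches`, `closest_file`) are made up for this example. Because the files can be close to 10^9 bytes, the sketch streams words from disk instead of loading whole word vectors into memory, but the comparison is the same index-by-index match count.

```python
def words(path):
    """Yield whitespace-separated words from a text file, one line at a time."""
    # errors="replace" is an assumption: stray non-ASCII bytes are tolerated
    # instead of raising an exception.
    with open(path, encoding="ascii", errors="replace") as handle:
        for line in handle:
            yield from line.split()


def positional_matches(query_path, candidate_path):
    """Count the indices at which both files contain the same word."""
    # zip stops at the end of the shorter file, so only the shared
    # index range is compared.
    pairs = zip(words(query_path), words(candidate_path))
    return sum(1 for query_word, candidate_word in pairs
               if query_word == candidate_word)


def closest_file(query_path, candidate_paths):
    """Return (best matching path, match count per candidate path)."""
    scores = {path: positional_matches(query_path, path)
              for path in candidate_paths}
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `closest_file("foo", list_of_20_paths)` returns the name of the file with the highest match count together with all the counts. Note that comparing words at the same index is sensitive to shifts: a single extra word near the start of one file misaligns everything after it, so this scoring works best when the matching parts of the files line up.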

