Optimizing php/mysql translation lookup with huge database and hash indexes -


i'm using utf8 mysql database. checks if translation in database , if not, translation , stores in database.

select * `translations` `input_text`=? , `input_lang`=? , `output_lang`=?; 

(the other field "output_text".) basic database, first compare, letter letter, input text "input_text" "text" field. long characters matching keep comparing them. if stop matching, go onto next row.

i don't know how databases work @ low level assume basic database, search @ least 1 character every row in database before decides input text isn't in database.

ideally input text converted hash code (e.g. using sha1) , each "input_text" hash. if database sorted rapidly find of rows match hash , check actual text. if there no matching hashes return no results though each row wasn't manually checked.

is there type of mysql storage engine can or there additional php can optimize things? should "input_text" set kind of "index"? (primary/unique/index/fulltext)

is there alternative type of database compatible php far superior mysql?

edit: talks b-tree vs hash indexes mysql:

http://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html

none of limitations hash indexes problem me. says

they used equality comparisons use = or <=> operators (but very fast)

["very" italicized them]

new question:

how set "input_text" text hash index? btw multiple rows contain same "input_text"... alright hash index?

http://dev.mysql.com/doc/refman/5.5/en/column-indexes.html

says "the memory storage engine uses hash indexes default" - mean i've got change storage engine , set column index index?

a normal index clause should enough (be sure index fields, it'll big on disk, faster). fulltext indexes when you're using like clauses ;-)

anyway, kind of lookups, should use nosql store redis, it's blazingly fast , has in-memory store , data persistence through snapshots.

there extension php here : https://github.com/nicolasff/phpredis

and you'll have redis keys in following form: your_project:input_lang:word:output_lang better data management, replace each value values , you're go ;)


Comments

Popular posts from this blog

monitor web browser programmatically in Android? -

Shrink a YouTube video to responsive width -

wpf - PdfWriter.GetInstance throws System.NullReferenceException -