Hadoop not suitable for distributed processing across many sites?
I've read a few articles suggesting that Hadoop is designed to work on a cluster at a single physical location, not on a number of distributed nodes (e.g. running a distributed cluster over the Internet across multiple sites).
Does anyone have real experience trying to use Hadoop across multiple sites? What kind of issues did you run into? Or is it better to go with a different framework (e.g. BOINC)?
If there's a difference between executing on a set of relatively local nodes and on a set of widely distributed nodes, it is in the increased time required to move large amounts of data back and forth between nodes. If you have a problem that involves crunching, aggregating, and joining large amounts of data, you will be sending large amounts of data between nodes. That means that no matter which platform you choose (Hadoop, Storm, etc.), you will have to deal with this issue. BOINC or another volunteer-based system may be cheaper, but your implementation will still be hit by high data transfer costs. Furthermore, you'll introduce node heterogeneity into the mix, which will make your implementation more interesting to develop and debug.
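To put a rough number on the data transfer point, here is a back-of-the-envelope sketch. The link speeds (10 Gbit/s inside a data center, 100 Mbit/s between sites) and the 1 TB data size are assumptions I picked for illustration, not measurements, and the function name is hypothetical. It also ignores latency, protocol overhead, and contention, so real numbers will be worse.

```python
def transfer_seconds(data_bytes: int, bandwidth_bits_per_sec: int) -> float:
    """Idealized time to push data_bytes over one link.

    Ignores latency, protocol overhead, and contention.
    """
    if bandwidth_bits_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return data_bytes * 8 / bandwidth_bits_per_sec


ONE_TB = 10**12            # assumed amount of data shuffled between nodes
LAN_BPS = 10 * 10**9       # assumed 10 Gbit/s link inside one site
WAN_BPS = 100 * 10**6      # assumed 100 Mbit/s link between sites

lan = transfer_seconds(ONE_TB, LAN_BPS)
wan = transfer_seconds(ONE_TB, WAN_BPS)

print(f"local cluster: {lan / 60:.1f} minutes")
print(f"across sites:  {wan / 3600:.1f} hours")
print(f"slowdown:      {wan / lan:.0f}x")
```

With these assumed numbers, the same shuffle goes from minutes to most of a day, and that holds whichever framework sits on top.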
And by the way, Hadoop and BOINC are two different animals solving different problems.