MapReduce: Simplified Data Processing on Large Clusters
pair for each input document (where the hostname is extracted from the URL of the document). The reduce function is passed all per-document term vectors for a given host. It adds these term vectors together, throwing away infrequent terms, and then emits a final ⟨hostname, term vector⟩ pair.
To appear in OSDI 2004
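The term-vector-per-host computation described above can be sketched in Python as follows. This is only an illustrative, single-process sketch and not the paper's implementation: the names `map_fn`, `reduce_fn`, and `run`, the whitespace tokenization, and the `min_count` cutoff for "infrequent" terms are all assumptions made for the example, and the in-memory grouping in `run` merely stands in for the framework's step of collecting values by key.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def map_fn(url, text):
    """Emit one (hostname, term vector) pair for a single input document."""
    hostname = urlparse(url).hostname
    # Assumed tokenization: lowercase the text and split on whitespace.
    yield hostname, Counter(text.lower().split())


def reduce_fn(hostname, term_vectors, min_count=2):
    """Add the per-document term vectors for one host and drop infrequent terms."""
    total = Counter()
    for vector in term_vectors:
        total.update(vector)  # element-wise addition of term counts
    # min_count is a hypothetical cutoff; the text does not say how
    # "infrequent" is defined.
    frequent = {term: n for term, n in total.items() if n >= min_count}
    return hostname, frequent


def run(documents, min_count=2):
    """Toy driver: group map output by hostname, then reduce each group."""
    grouped = defaultdict(list)
    for url, text in documents:
        for host, vector in map_fn(url, text):
            grouped[host].append(vector)
    return dict(reduce_fn(host, vectors, min_count)
                for host, vectors in grouped.items())


# Made-up sample documents, purely for illustration.
docs = [
    ("http://example.com/a", "map reduce map"),
    ("http://example.com/b", "reduce cluster"),
    ("http://example.org/c", "cluster cluster map"),
]
result = run(docs)
print(result)
```

With the sample input, the two `example.com` documents are merged into a single term vector, and any term seen fewer than `min_count` times across a host's documents is discarded before the final pair is emitted.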