Main features
- Cheap storage for huge amounts of data
- Processes huge amounts of data quickly
- Can store unstructured data such as text, images and video
- Scalable almost without limit by adding nodes
- Data is protected in software against hardware failure (HDFS keeps several copies of each block on different nodes)
Components included in the basic download
- HDFS: a Java-based distributed file system that can store all kinds of data without prior organization
- MapReduce: a programming model for parallel computing
- YARN: schedules and handles resource requests from distributed applications
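To make the MapReduce model a bit more concrete, here is a minimal sketch of its three steps (map, shuffle, reduce) as a word count in plain Java. It does not use the Hadoop API at all and runs in a single JVM; the class name MapReduceSketch and the method names are made up for illustration. On a real cluster, Hadoop runs the map and reduce steps on many nodes and does the grouping between them for you.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: imitates the MapReduce model in one JVM, no Hadoop involved.
public class MapReduceSketch {

    // Map step: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce step: combine all values that share one key into a single count.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: map every line, group the pairs by key (the "shuffle"), then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("big data big cluster", "data node")));
    }
}
```

The point of the model is that each call to map looks at one line only and each call to reduce looks at one key only, which is what lets the work be spread across many machines.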
Other components exist: Pig, Hive, HBase, ZooKeeper, Ambari, Flume, Sqoop, Oozie
How does data get into Hadoop:
- You can load files into HDFS using simple commands, for example hadoop fs -put from the shell, or the Java FileSystem API.
- If you have many files, you can invoke a script that loads the files in parallel.
- .......
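As a rough sketch of the "load many files in parallel" idea, the Java program below starts one hadoop fs -put process per local file on a small thread pool. It assumes the hadoop command is installed and on the PATH; the class name ParallelHdfsLoader, the pool size of 4 and the two command-line arguments are choices made for this example, not something Hadoop prescribes.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustration only: shells out to the hadoop CLI, which is assumed to be on the PATH.
public class ParallelHdfsLoader {

    // Build the command that copies one local file into an HDFS directory.
    static List<String> putCommand(Path localFile, String hdfsDir) {
        return List.of("hadoop", "fs", "-put", localFile.toString(), hdfsDir);
    }

    // Run one put command per file on a fixed-size pool; return how many uploads failed.
    static int loadAll(List<Path> files, String hdfsDir, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> results = new ArrayList<>();
        for (Path file : files) {
            // Each task starts the external process and waits for its exit code.
            Callable<Integer> task = () ->
                    new ProcessBuilder(putCommand(file, hdfsDir)).inheritIO().start().waitFor();
            results.add(pool.submit(task));
        }
        pool.shutdown();
        int failures = 0;
        for (Future<Integer> result : results) {
            try {
                if (result.get() != 0) {
                    failures++; // non-zero exit code from hadoop fs -put
                }
            } catch (ExecutionException e) {
                failures++; // the process could not be started at all
            }
        }
        return failures;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: ParallelHdfsLoader <local-dir> <hdfs-dir>");
            return;
        }
        List<Path> files;
        try (Stream<Path> entries = Files.list(Paths.get(args[0]))) {
            files = entries.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int failures = loadAll(files, args[1], 4);
        System.out.println(files.size() + " files submitted, " + failures + " failed");
    }
}
```

Starting a separate JVM for every file is simple but slow for very many small files; in that case a single program that uses the Java FileSystem API with the same thread pool would probably be a better fit.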
Nathan