Apache Hadoop is a free, open-source implementation of frameworks for reliable, scalable, distributed computing and data storage. It enables applications to work with thousands of nodes and petabytes of data, and as such is a valuable tool for both research and business operations. Hadoop was inspired by Google's MapReduce and Google File System papers.
Hadoop is a large-scale distributed batch processing infrastructure. While it can be used on a single machine, its true power lies in its ability to scale to hundreds or thousands of computers, each with several processor cores, and to distribute large amounts of work efficiently across that set of machines.
The Hadoop framework is implemented in Java, and you can develop MapReduce applications in Java or any JVM-based language, or use one of the following interfaces:

- Hadoop Streaming: a utility that allows you to create and run jobs with any executables (for example, shell utilities) as the mapper and/or the reducer.
- Hadoop Pipes: a SWIG-compatible C++ API (not based on JNI) for implementing MapReduce applications.
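To make the Streaming model concrete, here is a minimal word-count sketch in Python. With Hadoop Streaming, the mapper reads lines from standard input and emits tab-separated key/value pairs; the framework sorts the map output by key before the reducer sees it, so the reducer can sum counts with a simple group-by. The file name `wordcount.py` and the `map`/`reduce` role argument are illustrative choices, not part of any Hadoop API.

```python
import sys
from itertools import groupby

def mapper(lines):
    # Emit "word<TAB>1" for every whitespace-separated token.
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(pairs):
    # Streaming reducers receive map output sorted by key,
    # so consecutive lines with the same word can be summed with groupby.
    keyed = (pair.rstrip("\n").split("\t") for pair in pairs)
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

if __name__ == "__main__":
    # One script serves as both stages; pick the role on the command line:
    #   python wordcount.py map    < input.txt
    #   python wordcount.py reduce < sorted-map-output.txt
    role = sys.argv[1] if len(sys.argv) > 1 else "map"
    stage = mapper if role == "map" else reducer
    for out in stage(sys.stdin):
        print(out)
```

On a cluster, you would submit this with the Streaming jar, for example `hadoop jar hadoop-streaming-*.jar -input in -output out -mapper "python wordcount.py map" -reducer "python wordcount.py reduce"` (the jar's exact path depends on your Hadoop installation). Locally, you can simulate the shuffle with a pipe through `sort`.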