

Using the following Vagrantfile, simply type vagrant up on the command line and you’ll have a running Linux cluster of two VMs with the specified properties (CPU, RAM, IP address, etc.). Then, vagrant ssh into one of the nodes and test the connection to the other node with ping. When you’re done, type vagrant destroy to shut down and delete the VMs.
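As a rough illustration of what a two-node Vagrantfile can look like, here is a minimal sketch. It is not the Vagrantfile used for the test system described in this post: the box name, host names, private IP addresses, and the CPU and RAM values are all assumptions chosen for the example, so adjust them to your own setup.

```ruby
# -*- mode: ruby -*-
# Minimal two-node sketch. All names, addresses, and sizes below are
# hypothetical example values, not the settings from the original post.

NODES = [
  { name: "node1", ip: "192.168.33.11" },
  { name: "node2", ip: "192.168.33.12" },
].freeze

Vagrant.configure("2") do |config|
  # Assumed base box; any Linux box available to your Vagrant install works.
  config.vm.box = "ubuntu/trusty64"

  NODES.each do |spec|
    # Define one VM per entry so that each can be addressed by name,
    # e.g. "vagrant ssh node1".
    config.vm.define spec[:name] do |node|
      node.vm.hostname = spec[:name]

      # Host-only network with a fixed IP so the nodes can reach each other.
      node.vm.network "private_network", ip: spec[:ip]

      # VirtualBox-specific resources (assumed values).
      node.vm.provider "virtualbox" do |vb|
        vb.cpus = 2
        vb.memory = 2048 # in MB
      end
    end
  end
end
```

With a file like this, vagrant up brings up both machines, vagrant ssh node1 opens a shell on the first one, and from there ping 192.168.33.12 checks that the second node is reachable.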
#DOWNLOAD VIRTUALBOX VM 5.0.32 FOR MAC PRO#
Vagrant takes the idea of the cloud and makes it available on anyone’s desktop. With just two commands and zero configuration, Vagrant automatically builds and configures a fully featured virtual machine for any purpose. Test system: MacBook Pro (Late 2013), 2.3 GHz Intel Core i7, 16 GB RAM, 512 GB SSD.


For most small-scale applications, a single-node cluster should suffice. But for relatively large-scale, distributed applications, a multi-node cluster is more suitable. To achieve this, one option is to build real physical servers (on-premises) for the cluster environment. The clustered servers (nodes) are connected by physical cables and by software. Creating multiple nodes, wiring, networking, and complex software installations take a lot of time. Another option is using the cloud, a revolutionary movement founded on the idea of cheap, disposable computing resources.

Let’s think about the Hadoop ecosystem: Hadoop’s pseudo-distributed mode, or local mode on a single node, is an option to get started. However, as mentioned in the beginning, what if we want to test our Hadoop/Spark programs or our Hive or Pig scripts in real Hadoop distributed mode? And what if we don’t want to rent cloud services whose ongoing costs we cannot afford? We have a quick solution called Vagrant, which is a tool for building complete development environments.
