Hadoop - Configuration
Configuration Files
Hadoop uses the values from its default files (for example hdfs-default.xml) unless a site file overrides them.
The site configuration files can be found in either $HADOOP_HOME/conf or $HADOOP_CONF_DIR:
- yarn-site.xml
- core-site.xml
- hdfs-site.xml
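The default/site layering can be illustrated with a minimal Python sketch. It is not how Hadoop itself loads configuration; it only assumes that both files share the usual <configuration>/<property> layout and that a site value replaces the default value of the same name. The helper names read_properties and effective_config are hypothetical, and the XML strings are shortened stand-ins for the real files.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def effective_config(default_xml, site_xml):
    """Start from the defaults, then let every site property override them."""
    merged = read_properties(default_xml)
    merged.update(read_properties(site_xml))
    return merged


# Shortened stand-in for a default file (assumption: only two properties shown).
DEFAULT_XML = """<configuration>
  <property><name>fs.default.name</name><value>file:///</value></property>
  <property><name>hadoop.tmp.dir</name><value>/tmp/hadoop-${user.name}</value></property>
</configuration>"""

# Stand-in for core-site.xml that overrides a single property.
SITE_XML = """<configuration>
  <property><name>fs.default.name</name><value>hdfs://localhost:54310</value></property>
</configuration>"""

config = effective_config(DEFAULT_XML, SITE_XML)
print(config["fs.default.name"])  # value taken from the site file
print(config["hadoop.tmp.dir"])   # untouched default
```

Properties that the site file does not mention keep their default value, which is why a site file only needs to list the settings you want to change.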
Core
fs.default.name
The default is the local file system, so do not be surprised if you see your local files when calling $ hadoop fs -ls. The default value is::
<property>
<name>fs.default.name</name>
<value>file:///</value>
</property>
To use HDFS instead, set it in the site file::
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
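To see what a given fs.default.name value points at, the URI can be split into its scheme, host, and port. The following Python sketch uses only the standard library; the function describe_default_fs is a hypothetical helper for illustration, not part of Hadoop.

```python
from urllib.parse import urlparse


def describe_default_fs(uri):
    """Describe which file system a fs.default.name URI selects."""
    parsed = urlparse(uri)
    if parsed.scheme == "file":
        return "local file system"
    if parsed.scheme == "hdfs":
        # hostname and port are taken from the part after "hdfs://"
        return f"HDFS namenode at {parsed.hostname}:{parsed.port}"
    return f"other file system ({parsed.scheme})"


print(describe_default_fs("file:///"))
print(describe_default_fs("hdfs://localhost:54310"))
```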
fs.trash.interval
Enables the trash bin, which is disabled by default. The value is the retention interval in minutes (1440 min = 24 hr)::
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
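Because the interval is given in minutes, it can help to compute the value from a more readable period. This is a small Python sketch; trash_interval_minutes is a hypothetical helper name.

```python
from datetime import timedelta


def trash_interval_minutes(retention):
    """Convert a retention period to the whole number of minutes used as the property value."""
    return int(retention.total_seconds() // 60)


print(trash_interval_minutes(timedelta(hours=24)))  # 1440
print(trash_interval_minutes(timedelta(days=7)))    # 10080
```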
hadoop.tmp.dir
The default temporary directory is /tmp/hadoop-${user.name}::
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop-${user.name}</value>
</property>
Add the following to the site file to override it::
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop</value>
</property>
HDFS
default: $HADOOP_HOME/src/hdfs/hdfs-default.xml
site: $HADOOP_HOME/conf/hdfs-site.xml
dfs.name.dir / dfs.data.dir
The directories in which the namenode and datanode store their data. By default, ${hadoop.tmp.dir}/dfs/name and ${hadoop.tmp.dir}/dfs/data are used. To use other directories::
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/dfs/data</value>
</property>
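The default values reference other properties with the ${...} syntax, so changing hadoop.tmp.dir also moves the default namenode and datanode directories. The Python sketch below imitates that substitution under simplifying assumptions: the function expand and its max_depth guard are hypothetical, and user.name is supplied by hand here rather than taken from the running user.

```python
import re

VAR = re.compile(r"\$\{([^}]+)\}")


def expand(value, props, max_depth=20):
    """Replace ${name} references with values from props until nothing changes."""
    for _ in range(max_depth):
        # Unknown names are left as they are.
        expanded = VAR.sub(lambda m: props.get(m.group(1), m.group(0)), value)
        if expanded == value:
            return value
        value = expanded
    return value


defaults = {
    "user.name": "hadoop",  # assumption: set by hand for the example
    "hadoop.tmp.dir": "/tmp/hadoop-${user.name}",
    "dfs.name.dir": "${hadoop.tmp.dir}/dfs/name",
    "dfs.data.dir": "${hadoop.tmp.dir}/dfs/data",
}
print(expand(defaults["dfs.name.dir"], defaults))

# Overriding hadoop.tmp.dir changes where the default dfs directories resolve.
overridden = dict(defaults, **{"hadoop.tmp.dir": "/tmp/hadoop"})
print(expand(overridden["dfs.data.dir"], overridden))
```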
dfs.replication
Set the replication factor to 1 for a pseudo-cluster, since there is only one datanode to hold each block::
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
MapReduce
default: $HADOOP_HOME/src/mapred/mapred-default.xml
site: $HADOOP_HOME/conf/mapred-site.xml
mapred.job.tracker
The host and port at which the MapReduce job tracker runs. The default is "local", meaning jobs are run in-process as a single map and reduce task::
<property>
<name>mapred.job.tracker</name>
<value>local</value>
</property>
Set it to a host and port on localhost in the site file::
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
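The two forms of this value, the literal "local" and a host:port pair, can be told apart with a few lines of Python. This is only an illustrative sketch; parse_job_tracker is a hypothetical helper and does not reflect how Hadoop parses the setting internally.

```python
def parse_job_tracker(value):
    """Return None for in-process mode, or (host, port) for a job tracker address."""
    if value == "local":
        return None
    host, _, port = value.rpartition(":")
    return host, int(port)


print(parse_job_tracker("local"))
print(parse_job_tracker("localhost:54311"))
```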
hadoop-env.sh
Remember to set JAVA_HOME in this file.