Hadoop - Scala API

Last Updated: 2022-02-06

Scala code can be executed as scripts, without compiling first. Just make sure hadoop and all other required jars are added to classpath

$ scala -cp `hadoop classpath` script.scala

where hadoop classpath can conveniently generate a string with all the jars.

A simple example script to print the path names:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path

val conf = new Configuration()
val fs = FileSystem.get(conf)

fs.listStatus(new Path("path")).map(s => println(s.getPath().getName))