Spark Trouble Shooting - Total size of serialized results is bigger than spark.driver.maxResultSize

Updated: 2018-12-11

Error

ERROR TaskSetManager: Total size of serialized results of 8113 tasks (1131.0 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
ERROR TaskSetManager: Total size of serialized results of 8114 tasks (1131.1 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
ERROR TaskSetManager: Total size of serialized results of 8115 tasks (1131.2 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
ERROR TaskSetManager: Total size of serialized results of 8116 tasks (1131.3 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)

Cause

  • caused by actions like RDD's collect() that send big chunk of data to the driver

Solution

  • set by SparkConf: conf.set("spark.driver.maxResultSize", "3g")
  • set by spark-defaults.conf: spark.driver.maxResultSize 3g
  • set when calling spark-submit: --conf spark.driver.maxResultSize=3g