Spark and HDFS on Kubernetes data locality
I'm trying to run Spark on Kubernetes and am struggling a bit with data locality. I'm using the native Spark support for Kubernetes, and I have just watched https://databricks.com/session/hdfs-on-kubernetes-lessons-learned. I followed the steps there to set up my HDFS cluster (namenode on the first Kubernetes node, using host networking). Does anyone know whether the Spark driver fix presented in that talk has been merged into the mainline Spark code?
I ask because I still see tasks running with locality level ANY in places where I'd expect NODE_LOCAL.
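To make the symptom concrete, here is a minimal Python sketch of the mismatch as I understand it. It is a toy model, not Spark's scheduler code: the function names, node names and IPs are all made up for illustration. It assumes locality is decided by comparing the datanode hosts reported for a block against the host each executor is registered under, and that executors register under their pod IP unless something maps that IP back to the node.

```python
from typing import Dict, Iterable


def locality_level(block_hosts: Iterable[str], executor_host: str) -> str:
    """Toy model: NODE_LOCAL if the executor's host holds a replica, else ANY."""
    return "NODE_LOCAL" if executor_host in set(block_hosts) else "ANY"


def resolve_executor_host(pod_ip: str, pod_to_node: Dict[str, str]) -> str:
    """Hypothetical helper: map an executor pod IP to the node it runs on.

    Falls back to the pod IP itself when no mapping is known.
    """
    return pod_to_node.get(pod_ip, pod_ip)


# Hypothetical cluster: datanodes use host networking, so HDFS reports node names.
block_hosts = ["k8s-node-1", "k8s-node-2"]
# The executor pod sits on k8s-node-1 but is known by its pod IP.
pod_to_node = {"10.244.1.17": "k8s-node-1"}
executor_pod_ip = "10.244.1.17"

# Comparing the raw pod IP against datanode hosts never matches.
without_fix = locality_level(block_hosts, executor_pod_ip)

# Resolving the pod IP to its node first lets the comparison succeed.
with_fix = locality_level(
    block_hosts, resolve_executor_host(executor_pod_ip, pod_to_node)
)

print(without_fix, with_fix)
```

In this model the first comparison yields ANY and the second NODE_LOCAL, which matches what I'm seeing versus what I'd expect if the driver resolved executor pods to their nodes.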