Telemetry/Custom analysis with spark: Difference between revisions

Added FAQ
Line 46:


   dataset = sqlContext.read.load("s3://the_bucket/the_prefix/the_version", "parquet")
=== I got a REMOTE HOST IDENTIFICATION HAS CHANGED! error ===
AWS recycles hostnames, so a new cluster can be assigned a hostname whose previous host key is still stored in $HOME/.ssh/known_hosts. Removing the offending key from that file clears the error. To find the line to remove, look in the SSH output for a line such as:
  Offending key in /path/to/hosts/known_hosts:2
Here 2 is the line number of the key to delete. Remove that line, save the file, and try again.
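
If you would rather not edit the file by hand, the removal can be scripted. The following is only a minimal sketch, not part of the telemetry tooling: the function name remove_known_hosts_line and the ".bak" backup file are assumptions made for illustration, and the default path assumes the usual per-user known_hosts location.

```python
from pathlib import Path


def remove_known_hosts_line(line_number, known_hosts=None):
    """Delete one 1-indexed line from a known_hosts file and return its text.

    Hypothetical helper for illustration; it is not provided by SSH or Spark.
    """
    # Default to the usual per-user file; pass the path SSH reported if it differs.
    path = Path(known_hosts) if known_hosts else Path.home() / ".ssh" / "known_hosts"
    lines = path.read_text().splitlines(keepends=True)
    if not 1 <= line_number <= len(lines):
        raise ValueError(
            f"{path} has {len(lines)} lines, cannot remove line {line_number}"
        )
    # Write a backup copy next to the original so the change can be undone.
    backup = path.with_name(path.name + ".bak")
    backup.write_text("".join(lines))
    # SSH reports 1-indexed line numbers, Python lists are 0-indexed.
    removed = lines.pop(line_number - 1)
    path.write_text("".join(lines))
    return removed
```

For the message shown above, the call would be remove_known_hosts_line(2). OpenSSH also ships ssh-keygen -R followed by the hostname, which removes every stored key for that host from known_hosts without needing the line number.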