Business Intelligence Blogs

View blogs by industry experts on topics such as SSAS, SSIS, SSRS, Power BI, Performance Tuning, Azure, Big Data and much more! You can also sign up to post your own business intelligence blog.

Power Query to HDFS – Remote Name Could Not Be Resolved

  • 2 May 2014
  • Author: DevinKnight
  • Number of views: 7784
  • 0 Comments

I wanted to share a quick resolution I had to a problem with connecting Power Query to data stored in HDFS.

Seems easy enough. Launch Excel and through Power Query select From Other Sources –> From Hadoop Files (HDFS)

image

Next you’re prompted to provide your Hadoop server name then click OK.  I’ll be using the Hortonworks Sandbox environment for this demo.

image

I then see all the files available on HDFS.

image

If I try to explore the data in these files by clicking Binary next to any of the files listed I’m presented with the following error: DataSource.Error: HDFS cannot connect to server ‘sandbox.hortonworks.com’. The remote name could not be resolved: ‘sandbox.hortonworks.com’.

image

The issue here is with the name resolution and is a configuration issue. To fix this you must put your Hadoop cluster details in the HOSTS file on your machine that’s running Power Query. 

To edit the HOSTS file (I recommend you take a backup of it first) you’ll need to launch a text editor as administrator then open the HOSTS file found in the following folder:

C:\Windows\System32\drivers\etc

Provide the IP address and host name of the server as shown below then save.

image

Next time I try to connect to a file on HDFS everything now works properly.

image

Hope this helps!

Print
Tags:
Rate this article:
No rating
DevinKnight

DevinKnightDevinKnight

Other posts by DevinKnight

Please login or register to post comments.