Being Productive with HDInsight

  • 9 April 2013
  • Author: cprice1979

This post is a holding place for miscellaneous tools and tips for working with HDInsight.

Build Tools

1. Apache Ant

Extract the archive to c:\ant\, then set ANT_HOME and add Ant's bin folder to the PATH:

set ANT_HOME=c:\ant

set PATH=%PATH%;%ANT_HOME%\bin

2. Apache Ivy

  • Copy the Ivy JAR to Ant's lib folder (c:\ant\lib)

3. Git client
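
A quick way to sanity-check the Ant setup from step 1 is to confirm that the bin folder under ANT_HOME really shows up as a PATH entry. The following is a minimal sketch, not part of any tool listed here; the helper name ant_bin_on_path is made up for illustration, and it assumes Windows-style paths separated by semicolons.

```python
import ntpath
import os


def ant_bin_on_path(env):
    """Return True if <ANT_HOME>\\bin appears as an entry in PATH.

    `env` is any mapping of environment variables (hypothetical helper,
    assumes Windows path rules and ';' as the PATH separator).
    """
    ant_home = env.get("ANT_HOME", "").strip()
    if not ant_home:
        return False
    # Normalise case and trailing slashes so c:\ant\bin matches C:\Ant\bin\
    expected = ntpath.normcase(ntpath.normpath(ntpath.join(ant_home, "bin")))
    entries = env.get("PATH", "").split(";")
    return any(
        ntpath.normcase(ntpath.normpath(entry.strip())) == expected
        for entry in entries
        if entry.strip()
    )


if __name__ == "__main__":
    print("Ant bin folder on PATH:", ant_bin_on_path(os.environ))
```

If this reports False in a new command prompt, the set commands were most likely run in a different session, since set only affects the current one.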


Data Preparation/Research Tools

1. cURL


3. Enthought Data Platform (EDP)

4. GNU Parallel



PiggyBank: community-contributed user-defined functions (UDFs) for Pig

  • Retrieve source from Git:
    git clone
    cd Pig
    git checkout -b branch-0.9 remotes/origin/branch-0
  • Build Pig and then PiggyBank using Ant (run ant in the pig folder, then again in contrib\piggybank\java to produce piggybank.jar)
  • Pig Script:
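    -- myscript.pig
    -- Register the PiggyBank JAR built in the previous step
    REGISTER C:\Users\Administrator\pig\contrib\piggybank\java\piggybank.jar;
    A = LOAD 'student_data' AS (name: chararray, age: int, gpa: float);
    -- UPPER ships in PiggyBank's string package, so call it by its full class name
    B = FOREACH A GENERATE org.apache.pig.piggybank.evaluation.string.UPPER(name);
    DUMP B;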
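
To make it clearer what the script above computes, here is a rough Python equivalent of the LOAD, FOREACH and DUMP steps. It is only a sketch: the sample rows are invented, the function name upper_names is hypothetical, and it assumes student_data is tab-delimited (the default for Pig's loader), with an empty name treated as null.

```python
def upper_names(lines):
    """Mimic LOAD ... AS (name, age, gpa) followed by GENERATE UPPER(name).

    Each input line is split on tabs; only the first field (name) is kept,
    upper-cased, and returned as a one-element tuple, like a Pig tuple.
    """
    result = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # An empty first field is treated as null (None), which stays null
        name = fields[0] if fields[0] != "" else None
        result.append((name.upper() if name is not None else None,))
    return result


# Invented sample rows in (name, age, gpa) order
sample = ["John\t18\t4.0", "Mary\t19\t3.8"]
print(upper_names(sample))  # [('JOHN',), ('MARY',)]
```

Running the real script through Pig should give one single-field tuple per input row in the same way, with age and gpa dropped because only name is projected.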