Big Data Training Series

Practical Guide to Big Data Analytics with Pig Latin, Hive and Scilab

Start your Big Data journey with hands-on examples using open source tools!

 


“A must-attend training. You will acquire the necessary skills to harness large data sets for valuable information.”

Course Synopsis


Big data is a term for data sets that are so large or complex that conventional data processing applications are unable to handle them. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying and information privacy/security. The term "big data" often refers simply to the use of predictive analytics, user behaviour analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set.

Data analytics is now a priority for organisations seeking to identify market opportunities for their services and products. Reportedly, more than 77% of top organisations consider data analytics a critical component of business performance.


Who Should Attend

This training is suitable for engineers, scientists, data analysts, academics, researchers and anyone else who would like to understand how to deal with big data using the available open source tools.

Human Resource Development Fund (HRDF)

Our courses may be submitted to HRDF for SBL claims. Kindly check with your Human Resource Department or Training Unit. Alternatively, we could also assist you in your application. Call us now to enquire!

 


Course Outline

Apache Hadoop

Apache Hadoop is an open-source software framework used for distributed storage and processing of very large data sets. Apache Ambari aims to make Hadoop management simpler via the Ambari Viewer. In this section, you will familiarize yourself with the Ambari Viewer interface.

  • Introduction to Apache Hadoop
  • Getting familiar with Ambari Viewer

Learning Pig Latin Language

Apache Pig is a high-level platform for creating programs that run on Apache Hadoop. It uses the Pig Latin language. In this section, you will learn the fundamentals of the Pig Latin language. This includes learning how to load data from HDFS, write commands to perform analysis and store the results.

  • Introduction to Apache Pig
  • Learning the basics of Pig Latin: loading data from HDFS, performing data processing and storing results to HDFS
  • Running a script inside Pig View

Learning Hive Query Language

Apache Hive is a data warehouse infrastructure built on top of Hadoop. It uses Hive Query Language (HiveQL), a SQL-like language, to access the data in Hadoop. In this section, you will learn the fundamentals of HiveQL. This includes learning how to create and load tables, send queries and perform simple visualization in Hive Viewer.

  • Introduction to Apache Hive
  • Learning the basics of HiveQL by creating and loading a Hive table and sending queries to Hadoop
  • Simple visualization inside Hive Viewer

WebHDFS, HCatalog, WebHCat

WebHDFS is a built-in component of HDFS that allows you to access HDFS via a REST API. HCatalog is a table and storage management layer for Hadoop. It allows other Hadoop tools such as Pig to access the Hive metadata. WebHCat is the REST API component of HCatalog. It allows you to send Pig and Hive jobs through a REST API. In this section, you will learn how to access HDFS via the REST API, how to access Hive tables in Pig, and how to send Pig or Hive jobs via the REST API.

  • Introduction 
  • Accessing HDFS via HTTP
  • Accessing Hive Tables via HCatalog
  • Sending Pig/Hive jobs via WebHCat
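
To give a flavour of the REST access described in this section, here is a minimal sketch in Python that only builds the requests and does not send them. The host name, the user name and the port numbers (50070 for WebHDFS and 50111 for WebHCat are common defaults on Hadoop 2.x clusters) are assumptions for illustration; the function names are hypothetical, and your own cluster may be configured differently.

```python
from urllib.parse import urlencode, quote


def webhdfs_url(host, port, hdfs_path, op, **params):
    """Build a WebHDFS URL: http://<host>:<port>/webhdfs/v1/<path>?op=<OP>.

    host and port are assumptions; use the values of your own cluster.
    """
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"


def webhcat_hive_request(host, port, user, hiveql):
    """Return (url, form_body) for submitting a HiveQL statement to WebHCat.

    The body is meant to be sent with an HTTP POST; nothing is sent here.
    """
    url = f"http://{host}:{port}/templeton/v1/hive?{urlencode({'user.name': user})}"
    body = urlencode({"execute": hiveql})
    return url, body


# Hypothetical sandbox host, used only for illustration.
list_url = webhdfs_url("sandbox.example.com", 50070, "/user/demo", "LISTSTATUS")
hive_url, hive_body = webhcat_hive_request(
    "sandbox.example.com", 50111, "demo", "SHOW TABLES;"
)
print(list_url)
print(hive_url)
print(hive_body)
```

The first URL can be opened with an HTTP GET to list a directory in HDFS, while the second would be used with an HTTP POST carrying the form body to submit the Hive statement as a job.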

Scilab

Scilab is a high-level, open source numerical computation software package. In this section, you will learn how to access Hadoop and send Pig and Hive jobs from Scilab. You will also learn to perform data analysis and visualization inside Scilab.

  • Introduction to Scilab
  • Scilab Basics
  • Accessing Hadoop from Scilab
  • Sending Pig/Hive Jobs from Scilab
  • Data Analysis using Scilab
  • Visualization inside Scilab

To obtain details of the course (fee, location, etc.), kindly request a registration form by emailing tina@tritytech.com

Please provide us with your name, organization and mobile contact number.

You may also call us at +603-80637737 or fill in our Training Enquiry form.

 

© 2010-2017 Trity Technologies Sdn Bhd. All Rights Reserved.