Accessing Hadoop Data Using Hive Cognitive Class Exam Quiz Answers

Question 1: Which company first developed Hive?

  • Starbucks
  • Facebook
  • HP
  • Yahoo

Question 2: Hive is a Data Warehouse system built on top of Hadoop. True or false?

  • True
  • False

Question 3: Which of the following is NOT a valid Hive Metastore config?

  • Server Metastore
  • Local Metastore
  • Remote Metastore
  • Embedded Metastore

Question 1: Which of the following commands will list the databases in the Hive system?

  • DISPLAY ALL DB;
  • SHOW ME THE DATABASES;
  • DISPLAY DB;
  • SHOW DATABASES;

Question 2: MAPS are a Hive complex data type. True or false?

  • True
  • False

Question 3: An index can be created on a Hive table. True or false?

  • True
  • False

Question 1: LOAD DATA LOCAL means that the data should be loaded from HDFS. True or false?

  • True
  • False

Question 2: Which of the following commands is used to generate a Hive query plan?

  • QUERYPLAN
  • SHOWME
  • HOW
  • EXPLAIN

Question 3: Data can be exported out of Hive. True or false?

  • True
  • False

Question 1: Which of the following is NOT a built-in Hive function?

  • triplemultiple
  • floor
  • upper
  • round

Question 2: Users can create their own custom user defined functions. True or false?

  • True
  • False

Question 3: Which of the following is NOT a valid Hive relational operator?

  • A ATE B
  • A IS NOT NULL
  • A LIKE B
  • A IS NULL

Question 1: What is the primary purpose of Hive in the Hadoop architecture?

  • To provide logging support for Hadoop jobs
  • To support the execution of workflows consisting of a collection of actions
  • To support SQL-like queries of data stored in Hadoop in place of writing MapReduce applications
  • To move data into HDFS

Question 2: Hive is SQL-92 compliant and supports row-level inserts, updates, and deletes. True or false?

  • True
  • False

Question 3: In a production setting, you should configure the Hive metastore as

  • Remote
  • Local
  • Embedded
  • None of the above

Question 4: The Hive Command Line Interface (CLI) allows you to

  • retrieve query explain plans
  • view and manipulate table metadata
  • perform queries, DML, and DDL
  • All of the above

Question 5: When using the Hive CLI, which option allows you to execute HiveQL that’s saved in a text file?

  • hive -d
  • hive -S
  • hive -e
  • hive -f

Question 6: Which statement is true of “Managed” tables in Hive?

  • Dropping a table deletes the table’s metadata, NOT the actual data
  • You can easily share your data with other Hadoop tools
  • Table data is stored in a directory outside of Hive
  • None of the Above

Question 7: Hive Data Types include

  • Maps
  • Arrays
  • Structs
  • A subset of RDBMS primitive types
  • All of the Above

Question 8: The PARTITION BY clause in Hive can be used to improve performance by storing all the data associated with a specified column’s value in the same folder. True or false?

  • True
  • False

Question 9: The LOAD DATA LOCAL command in Hive is used to move a datafile in HDFS into a Hive table structure. True or false?

  • True
  • False

Question 10: The INSERT OVERWRITE LOCAL DIRECTORY command in Hive is used to

  • copy data into an externally managed table
  • load data into a Hive Table
  • append rows to an existing Hive Table
  • export data from Hive to the local file system

Question 11: Hive supports which type of join?

  • Left Semi-Join
  • Inner Join
  • Full Outer Join
  • Equi-join
  • All of the Above

Question 12: With Hive, you can write your own user defined functions in Java and invoke them using HiveQL. True or false?

  • True
  • False

Question 13: Which of the following is a valid Hive operator for complex data types?

  • S.x where S is a struct and x is the name of the field you wish to retrieve
  • M[k] where M is a map and k is a key value
  • A[n] where A is an array and n is an int
  • All of the above

Introduction to Accessing Hadoop Data Using Hive

Apache Hive is a data warehouse infrastructure built on top of Hadoop that provides a SQL-like query language called HiveQL for querying and managing large datasets. It allows users to interact with data stored in Hadoop Distributed File System (HDFS) using familiar SQL syntax. Here’s a basic guide on how to access Hadoop data using Hive:

1. Setup Hadoop and Hive:

  • Ensure that you have Hadoop and Hive installed and configured on your system or cluster. You can download Hadoop and Hive from the Apache Hadoop and Apache Hive websites.

2. Start Hadoop Services:

  • Start the necessary Hadoop services, including the Hadoop Distributed File System (HDFS) and YARN (Yet Another Resource Negotiator).
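On a standard Apache Hadoop installation, the services can be started with the bundled scripts shown below; script names and paths vary by distribution, so treat this as a sketch:

```shell
# Assumes the Hadoop sbin directory is on your PATH; adjust for your distribution.
start-dfs.sh     # start the HDFS NameNode and DataNodes
start-yarn.sh    # start the YARN ResourceManager and NodeManagers
jps              # list running Java daemons to verify everything came up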

3. Start Hive Metastore:

  • Hive uses a metastore to store metadata about tables and partitions. Start the Hive metastore service.
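A minimal sketch of bringing up the metastore, assuming a recent Hive release. Derby is shown for the schema initialization; a production deployment would typically use MySQL or PostgreSQL with a remote metastore instead:

```shell
# One-time schema initialization (Derby shown; use your RDBMS type in production).
schematool -dbType derby -initSchema

# Run the metastore service in the background.
hive --service metastore &
```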

4. Launch Hive Shell:

  • Open the Hive shell by running the hive command in your terminal.
  • This opens the Hive interactive shell, where you can enter HiveQL statements.
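Either of the following works, depending on your setup; the Beeline connection string assumes HiveServer2 is running on its default port (10000) on the local machine:

```shell
hive                                     # classic Hive CLI
# or, on installs that use HiveServer2:
beeline -u jdbc:hive2://localhost:10000  # connect via JDBC with Beeline
```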

5. Create a Database:

  • Create a database to organize your Hive tables.
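For example, in the Hive shell (the database name sales_db is illustrative):

```sql
-- Create a database and switch to it.
CREATE DATABASE IF NOT EXISTS sales_db
COMMENT 'Example database for data stored in HDFS';
USE sales_db;

SHOW DATABASES;  -- confirm it was created
```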

6. Define an External Table:

  • Define an external table that points to your data in HDFS. For example, if you have a CSV file in HDFS, you can create an external table whose schema matches that file.
  • Adjust the column names, data types, and file format to match your data.
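A sketch for a comma-delimited file; the table name, columns, and HDFS path are all placeholders you would replace with your own:

```sql
-- External table over a CSV directory in HDFS. Dropping an external
-- table removes only the metadata, not the underlying files.
CREATE EXTERNAL TABLE IF NOT EXISTS sales (
  order_id   INT,
  customer   STRING,
  amount     DOUBLE,
  order_date STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/hive/data/sales';
```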

7. Query Data:

  • Once your table is defined, you can query the data using HiveQL.
  • Use standard SQL queries to analyze and manipulate your data.
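For instance, assuming a hypothetical sales table with customer and amount columns:

```sql
-- Standard HiveQL: filter, sort, and limit, just as in SQL.
SELECT customer, amount
FROM sales
WHERE amount > 100
ORDER BY amount DESC
LIMIT 10;
```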

8. Perform Operations:

  • Hive supports various operations such as filtering, aggregating, and joining data.
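A sketch combining an aggregation with an inner join; the sales and customers tables and their columns are illustrative:

```sql
-- Total sales per region: join two tables, then group and aggregate.
SELECT c.region, SUM(s.amount) AS total_sales
FROM sales s
JOIN customers c ON s.customer = c.name
GROUP BY c.region;
```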

9. Save Results:

  • You can save the results of a query to another Hive table or export them to a file in HDFS or the local file system.
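Both options sketched below use illustrative table names and paths:

```sql
-- Save query results into a new Hive table (CTAS).
CREATE TABLE top_sales AS
SELECT * FROM sales WHERE amount > 1000;

-- Or export results to a directory on the local file system.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/sales_export'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM sales;
```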

10. Exit Hive Shell:

  • Type exit; to exit the Hive shell when you are finished.

Additional Tips:

  • You can also use Hive with HiveQL in non-interactive mode by storing your queries in a script file and executing them with the hive -f command.
  • Hive supports various file formats, including ORC (Optimized Row Columnar), Parquet, and Avro. Choose the format that best suits your data and performance requirements.
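For example, with your statements saved in a file (queries.hql here is a placeholder name):

```shell
# Run a script of HiveQL statements; -S (silent) suppresses log output.
hive -S -f queries.hql

# Or run a one-off statement inline with -e.
hive -e 'SELECT COUNT(*) FROM sales_db.sales;'
```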

This is a basic guide to get you started with accessing Hadoop data using Hive. Depending on your specific use case and data structure, you may need to customize the HiveQL statements and configurations accordingly.
