Accessing Hadoop Data Using Hive Cognitive Class Certification Answers
Module 1 – Introduction to Hive Quiz Answers – Cognitive Class
Question 1: Which company first developed Hive?
- Facebook
- Starbucks
- HP
- Yahoo
Question 2: Hive is a Data Warehouse system built on top of Hadoop. True or false?
- True
- False
Question 3: Which of the following is NOT a valid Hive Metastore config?
- Server Metastore
- Local Metastore
- Remote Metastore
- Embedded Metastore
Module 2 – Hive DDL Quiz Answers – Cognitive Class
Question 1: Which of the following commands will list the databases in the Hive system?
- DISPLAY ALL DB;
- SHOW ME THE DATABASES;
- DISPLAY DB;
- SHOW DATABASES;
Question 2: MAPS are a Hive complex data type. True or false?
- True
- False
Question 3: An index can be created on a Hive table. True or false?
- True
- False
Module 3 – Hive DML Quiz Answers – Cognitive Class
Question 1: LOAD DATA LOCAL means that the data should be loaded from HDFS. True or false?
- True
- False
Question 2: Which of the following commands is used to generate a Hive query plan?
- QUERYPLAN
- SHOWME
- HOW
- EXPLAIN
Question 3: Data can be exported out of Hive. True or false?
- True
- False
Module 4 – Hive Operators and Functions Quiz Answers – Cognitive Class
Question 1: Which of the following is NOT a built-in Hive function?
- triplemultiple
- floor
- upper
- round
Question 2: Users can create their own custom user defined functions. True or false?
- True
- False
Question 3: Which of the following is NOT a valid Hive relational operator?
- A ATE B
- A IS NOT NULL
- A LIKE B
- A IS NULL
Accessing Hadoop Data Using Hive Final Exam Answers – Cognitive Class
Question 1: What is the primary purpose of Hive in the Hadoop architecture?
- To provide logging support for Hadoop jobs
- To support the execution of workflows consisting of a collection of actions
- To support SQL-like queries of data stored in Hadoop in place of writing MapReduce applications
- To move data into HDFS
Question 2: Hive is SQL-92 compliant and supports row-level inserts, updates, and deletes. True or false?
- True
- False
Question 3: In a production setting, you should configure the Hive metastore as
- Remote
- Local
- Embedded
- None of the above
Question 4: The Hive Command Line Interface (CLI) allows you to
- retrieve query explain plans
- view and manipulate table metadata
- perform queries, DML, and DDL
- All of the above
Question 5: When using the Hive CLI, which option allows you to execute HiveQL that’s saved in a text file?
- hive -d
- hive -S
- hive -e
- hive -f
Question 6: Which statement is true of “Managed” tables in Hive?
- Dropping a table deletes the table’s metadata, NOT the actual data
- You can easily share your data with other Hadoop tools
- Table data is stored in a directory outside of Hive
- None of the Above
Question 7: Hive Data Types include
- Maps
- Arrays
- Structs
- A subset of RDBMS primitive types
- All of the Above
Question 8: The PARTITIONED BY clause in Hive can be used to improve performance by storing all the data associated with a specified column’s value in the same folder. True or false?
- True
- False
Question 9: The LOAD DATA LOCAL command in Hive is used to move a datafile in HDFS into a Hive table structure. True or false?
- True
- False
Question 10: The INSERT OVERWRITE LOCAL DIRECTORY command in Hive is used to
- copy data into an externally managed table
- load data into a Hive Table
- append rows to an existing Hive Table
- export data from Hive to the local file system
Question 11: Hive supports which type of join?
- Left Semi-Join
- Inner Join
- Full Outer Join
- Equi-join
- All of the Above
Question 12: With Hive, you can write your own user defined functions in Java and invoke them using HiveQL. True or false?
- True
- False
Question 13: Which of the following is a valid Hive operator for complex data types?
- S.x where S is a struct and x is the name of the field you wish to retrieve
- M[k] where M is a map and k is a key value
- A[n] where A is an array and n is an int
- All of the above
Introduction to Accessing Hadoop Data Using Hive
Apache Hive is a data warehouse infrastructure built on top of Hadoop that provides a SQL-like query language called HiveQL for querying and managing large datasets. It lets users work with data stored in the Hadoop Distributed File System (HDFS) using familiar SQL syntax. Here’s a basic guide on how to access Hadoop data using Hive:
1. Set Up Hadoop and Hive:
- Ensure that you have Hadoop and Hive installed and configured on your system or cluster. You can download Hadoop and Hive from the Apache Hadoop and Apache Hive websites.
2. Start Hadoop Services:
- Start the necessary Hadoop services, including the Hadoop Distributed File System (HDFS) and YARN (Yet Another Resource Negotiator).
3. Start Hive Metastore:
- Hive uses a metastore to store metadata about tables and partitions. Start the Hive metastore service.
4. Launch Hive Shell:
- Open the Hive shell by typing hive in your terminal.
- This will open the Hive interactive shell.
5. Create a Database:
- Create a database to organize your Hive tables.
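As a minimal sketch of this step, the commands below write a short HiveQL script and run it only if the hive CLI is available. The database name sales_db and the file name create_db.hql are hypothetical, chosen for illustration.

```shell
# Write the HiveQL to a script file. sales_db is a hypothetical name.
cat > create_db.hql <<'EOF'
CREATE DATABASE IF NOT EXISTS sales_db;
SHOW DATABASES;
USE sales_db;
EOF

# Run the script only when the hive CLI is on the PATH.
if command -v hive >/dev/null 2>&1; then
  hive -f create_db.hql
else
  echo "hive not found; wrote create_db.hql without running it"
fi
```

The same three statements can also be typed directly at the Hive shell prompt.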
6. Define an External Table:
- Define an external table that points to your Hadoop data in HDFS. For example, if you have a CSV file in HDFS, you can create a table.
- Adjust the column names, data types, and file format according to your data.
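The table definition for a comma-separated file might look like the sketch below. The schema (order_id, customer, amount, order_date), the sales_db.sales table name, and the HDFS path are all assumptions made for illustration. Note that LOCATION names the HDFS directory that holds the file, not the file itself.

```shell
# Write a hypothetical external-table definition to a script file.
cat > create_table.hql <<'EOF'
-- Hypothetical schema and HDFS path. Adjust both to match your data.
CREATE EXTERNAL TABLE IF NOT EXISTS sales_db.sales (
  order_id   INT,
  customer   STRING,
  amount     DOUBLE,
  order_date STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/hive/data/sales';
EOF

# Run the script only when the hive CLI is on the PATH.
if command -v hive >/dev/null 2>&1; then
  hive -f create_table.hql
else
  echo "hive not found; wrote create_table.hql without running it"
fi
```

Because the table is external, dropping it later removes only the metadata and leaves the files in HDFS untouched.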
7. Query Data:
- Once your table is defined, you can query the data using HiveQL.
- Use standard SQL queries to analyze and manipulate your data.
8. Perform Operations:
- Hive supports various operations such as filtering, aggregating, and joining data.
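The querying and operations steps could be sketched as follows. The queries assume the hypothetical sales_db.sales table from the earlier illustration plus a second hypothetical table, sales_db.customers, with name and region columns. None of these names come from a real dataset.

```shell
# Write a few example queries to a script file.
cat > queries.hql <<'EOF'
-- Preview a few rows.
SELECT * FROM sales_db.sales LIMIT 10;

-- Filter and aggregate: total spend per customer for orders over 100.
SELECT customer, SUM(amount) AS total_spent
FROM sales_db.sales
WHERE amount > 100
GROUP BY customer
ORDER BY total_spent DESC
LIMIT 10;

-- Join against a second (hypothetical) table to count orders per region.
SELECT c.region, COUNT(*) AS order_count
FROM sales_db.sales s
JOIN sales_db.customers c ON s.customer = c.name
GROUP BY c.region;
EOF

# Run the script only when the hive CLI is on the PATH.
if command -v hive >/dev/null 2>&1; then
  hive -f queries.hql
else
  echo "hive not found; wrote queries.hql without running it"
fi
```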
9. Save Results:
- You can save the results of a query to another Hive table or export them to a file in HDFS or the local file system.
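One way to sketch the three destinations mentioned here is a CREATE TABLE AS SELECT for a new Hive table and INSERT OVERWRITE DIRECTORY for HDFS or local export. The table name customer_totals and both output paths are hypothetical. INSERT OVERWRITE replaces whatever is already in the target directory, so point it at a dedicated location.

```shell
# Write the save/export statements to a script file.
cat > save_results.hql <<'EOF'
-- Save query results to another Hive table.
CREATE TABLE sales_db.customer_totals AS
SELECT customer, SUM(amount) AS total_spent
FROM sales_db.sales
GROUP BY customer;

-- Export to a directory in HDFS (hypothetical path).
INSERT OVERWRITE DIRECTORY '/user/hive/export/customer_totals'
SELECT * FROM sales_db.customer_totals;

-- Export to a directory on the local file system (hypothetical path).
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/customer_totals'
SELECT * FROM sales_db.customer_totals;
EOF

# Run the script only when the hive CLI is on the PATH.
if command -v hive >/dev/null 2>&1; then
  hive -f save_results.hql
else
  echo "hive not found; wrote save_results.hql without running it"
fi
```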
10. Exit Hive Shell:
- Type exit; to exit the Hive shell when you are finished.
Additional Tips:
- You can also run HiveQL non-interactively by storing your queries in a script file and executing it with the hive -f command.
- Hive supports various file formats, including ORC (Optimized Row Columnar), Parquet, and Avro. Choose the format that best suits your data and performance requirements.
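The non-interactive tip above can be sketched as follows. The script name report.hql, the variable min_amount, and the sales_db.sales table are hypothetical. The -S flag runs Hive in silent mode so that only query results reach standard output, and --hivevar passes a value that the script reads as ${hivevar:min_amount}.

```shell
# Write a parameterized script. The quoted heredoc delimiter keeps the
# shell from expanding ${hivevar:min_amount} so Hive sees it literally.
cat > report.hql <<'EOF'
SELECT customer, SUM(amount) AS total_spent
FROM sales_db.sales
WHERE amount > ${hivevar:min_amount}
GROUP BY customer;
EOF

# Run the script in silent mode and capture the rows in a local file.
if command -v hive >/dev/null 2>&1; then
  hive -S --hivevar min_amount=100 -f report.hql > report.tsv
else
  echo "hive not found; wrote report.hql without running it"
fi
```

Keeping the threshold outside the script lets the same file be reused with different values from a scheduler or wrapper script.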
This is a basic guide to get you started with accessing Hadoop data using Hive. Depending on your specific use case and data structure, you may need to customize the HiveQL statements and configurations accordingly.