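The difference between internal and external tables is one of the most frequently asked Hive interview questions. For an internal (managed) table, Hive manages both the metadata and the actual data, so if the table is dropped, both the schema and the data are lost. For an external table, dropping the table removes only the metadata from the metastore, not the actual data. Hive simply stops tracking the data instead of deleting it, which makes an external table the safer choice when the data must outlive the table definition.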
Hive provides data warehousing facilities on top of an existing Hadoop cluster, along with an SQL-like interface.
You can create a table in two different ways.
a. Create External Table
-- Supply the field delimiter used by your data (a tab here).
CREATE EXTERNAL TABLE students (id INT, name STRING, batch STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/hdfs/students';
For external tables, Hive does not move the data into its warehouse directory. If the external table is dropped, the table metadata is deleted but the data is not.
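As a rough illustration of this behaviour, the sketch below drops the external table and then lists its directory from the Hive shell. It assumes the students table defined above exists and that data files are already present under '/user/hdfs/students'; dfs is the Hive shell's pass-through to the HDFS file system commands.

```sql
-- Remove the table definition from the metastore.
DROP TABLE students;

-- The table should no longer be listed...
SHOW TABLES LIKE 'students';

-- ...but the files under the external location should still be there.
dfs -ls /user/hdfs/students;
```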
b. Create Normal Table
-- Supply the field delimiter used by your data (a tab here).
CREATE TABLE students (id INT, name STRING, batch STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/hdfs/students';
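For normal (managed) tables, Hive takes ownership of the data: loaded data is moved into the table's directory, which sits under the Hive warehouse directory unless a LOCATION is given. If the table is dropped, both the table metadata and the data are deleted.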
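A minimal sketch of that life cycle follows, assuming the managed students table above has been created. The input path '/user/hdfs/incoming/students.tsv' is a hypothetical file used only for illustration.

```sql
-- LOAD DATA INPATH moves (not copies) the HDFS file into the table's directory.
-- The source path below is a hypothetical example.
LOAD DATA INPATH '/user/hdfs/incoming/students.tsv' INTO TABLE students;

-- The rows are now readable through the table.
SELECT * FROM students LIMIT 5;

-- Dropping a managed table removes the metadata and the table's data files.
DROP TABLE students;
```

If HDFS trash is enabled on the cluster, the dropped files may be moved to the trash directory first rather than being removed immediately; DROP TABLE students PURGE skips the trash.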
Hive has a relational database on the master node, the metastore, which it uses to keep track of state. For instance, when you run CREATE TABLE FOO(foo string) LOCATION 'hdfs://tmp/';, the table schema is stored in this database. If you have a partitioned table, the partitions are stored in the database as well (this allows Hive to use lists of partitions without going to the filesystem to find them). These sorts of things are the 'metadata'.
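The metadata described above can be inspected from the Hive shell. The sketch below assumes a students table already exists; students_by_batch is a hypothetical partitioned table created only for this example.

```sql
-- Shows the schema plus metastore details, including the table type
-- (MANAGED_TABLE or EXTERNAL_TABLE) and the table's location.
DESCRIBE FORMATTED students;

-- Hypothetical partitioned table, used only to illustrate partition metadata.
CREATE TABLE students_by_batch (id INT, name STRING)
PARTITIONED BY (batch STRING);

-- Register a partition in the metastore.
ALTER TABLE students_by_batch ADD PARTITION (batch='2024');

-- The partition list is answered from the metastore.
SHOW PARTITIONS students_by_batch;
```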
When you drop an internal table, Hive drops both the data and the metadata.
When you drop an external table, Hive drops only the metadata. Hive no longer knows about that data, but it does not touch the data itself.
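Because the files survive, an external table can be re-attached to them later. The following sketch assumes the external students table and the '/user/hdfs/students' location from the earlier statement, with data files still present in that directory.

```sql
-- Drop only the definition; the files stay in place.
DROP TABLE students;

-- Re-create the definition over the same directory.
CREATE EXTERNAL TABLE students (id INT, name STRING, batch STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/hdfs/students';

-- The existing files should be queryable again without reloading anything.
SELECT COUNT(*) FROM students;
```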