Understanding Default Delimiters

In Hive, by default, each fields are delimited by CTRL-A (ie Octal \001).

If a field is a complex type(such as ARRAY, STRUCT or MAP) containing Primitive Type(like INT, STRING), then each value inside the complex type is delimited by CTRL-B (\002)

Example field : array('val1', 'val2')

If a field is a complex type containing complex type(we call this as Nested Type), then delimiters CTRL-B (\002) followed by CTRL-C(\003), CTRL-D(\004), etc... are used depending on the depth of the nesting.

Example field : array(named_struct('var1', 'val1'))

Let us test this concept with an example(a Table containing 4 fields)
create table marks as
select 
   'bob'                                           as name, 
   named_struct('engtot', '104', 'mathtot', '135') as total, 
   array(
      array(
         named_struct(
            'eng', 
            '33',
            'math',
            '44'
         ),
         named_struct(
            'eng',          
            '35',
            'math',
            '45'
         )
      ),
      array(
         named_struct(
            'eng', 
            '36',
            'math',
            '46'
         )
      )      
   )                                               as testmarks,
   map('engavg', '34.6', 'mathavg', '45')          as average;

hive> desc marks;
name                 string
total                struct
testmarks            array<array<struct<eng:string,math:string>>>
average              map   

hive> select * from marks;
bob   {"engtot":"104","mathtot":"135"} [[{"eng":"33","math":"44"},{"eng":"35","math":"45"}],[{"eng":"36","math":"46"}]] {"engavg":"34.6","mathavg":"45"}

Let us pull the table data into Local file system & verify the delimiters in VI Editor

hdfs dfs -getmerge /user/hive/warehouse/marks result.txt
vi result.txt
bob^A104^B135^A33^D44^C35^D45^B36^D46^Aengavg^C34.6^Bmathavg^C45

Let us dissect this data for clarity,

bob                           ^A
104            ^B 135         ^A
33^D44^C35^D45 ^B 36^D46      ^A
engavg^C34.6   ^B mathavg^C45
(Note : In a map, Key & Value is always delimited by ^C)

2 comments:

  1. The method involved with finding and baiting reasonable work contender to fill unfilled positions inside a business is known as obtaining. It involves searching for and speaking with potential candidates through different stages, including position sheets, virtual entertainment, representative suggestions, and systems administration social affairs.

    Taking everything into account, obtaining focuses on finding and drawing in appropriate individuals to fill open positions, procurement focuses on securing ability through consolidations or acquisitions, and enlisting focuses on recognizing and choosing possibility for open jobs. The progress of an association's staffing methodology is affected by every one of the three of these significant ability obtaining systems>> talent sourcing specialist

    ReplyDelete