编程知识 cdmana.com

Hive data type and case practice

Hive data type And Case practice

Basic data type

Hive data type Java data type length Example
TINYINT byte 1byte Signed integers 20
SMALLINT short 2byte Signed integers 20
INT int 4byte Signed integers 20
BIGINT long 8byte Signed integers 20
BOOLEAN boolean Boolean type ,true or false TRUE
FLOAT float Single-precision floating-point 3.14
DOUBLE double Double precision floating point 3.14
STRING string Character series , The character set... Can be specified , You can use single or double quotation marks ‘now’ “now time”
TIMESTAMP Time type
BINARY Byte array

about Hive Of String Type is equivalent to database varchar type , This type is a variable string , But it can't declare how many characters it can store at most , In theory it can store 2GB The number of characters .

Collection data type

data type describe Example
STRUCT and c In language struct similar , Both can pass “ spot ” Symbols access element content . for example , If a column of The data type is STRUCT{first STRING, last STRING}, So the first 1 Elements can be passed through the words paragraph .first To quote . struct()
MAP MAP It's a set of keys - Value pair tuple set , Using arrays Notation can access data . for example , If a column The data type of is MAP, Middle key -> It's worth it yes ’first’->’John’ and ’last’->’Doe’, Then you can. By field name [‘last’] Get last element map()
ARRAY An array is a collection of variables of the same type and name . These variables are called elements of an array , Number of each Group elements all have a number , The number starts from zero . example Such as , The array value is [‘John’, ‘Doe’], So the first 2 individual Elements can be passed through array names [1] reference . Array()
{
    
    "name": "John Doe",
    "salary": 100000.0 ,
    "subordinates": ["Mary Smith" , "Todd Jones"] ,   // list Array, subordinates[1]="Tood Jones"
    "deductions": {
                                      // Key value Map, deductions['Federal Taxes']=0.2
        "Federal Taxes": 0.2 ,
        "State Taxes": 0.05,
        "Insurance": 0.1
    }
    "address": {
                                         // structure Struct, address.city="Chicago"
        "street": "1 Michigan Ave." ,
        "city": "Chicago" ,
        "state": "IL" ,
        "zip": 60600
    }
}

Case practice

data

Based on the above json data structure , We are Hive Create the corresponding table in , And import data .
Create local file text.txt

[root@hadoop101 datas]# pwd
/opt/module/hive-1.2.1/datas
[root@hadoop101 datas]# cat text.txt 
John Doe,100000.0,Mary Smith_Todd Jones,Federal Taxes:0.2_State Taxes:0.05_Insurance:0.1,1 Michigan Ave._Chicago_1L_60600
Tom Smith,90000.0,Jan_Hello Ketty,Federal Taxes:0.2_State Taxes:0.05_Insurance:0.1,Guang dong._China_0.5L_60661

Build table

Hive Create a test table on test

CREATE  TABLE text(
name STRING,
sa1ary FLOAT,
subordinates ARRAY<STRING>,
deductions MAP<STRING, FLOAT>,
address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '_'
MAP KEYS TERMINATED BY ':'
LINES TERMINATED BY '\n';

Field explanation :

Field explain remarks
row format delimited fields terminated by ‘,’ Column separator The storage in the table is the column separator
collection items terminated by ‘_’, MAP STRUCT and ARRAY The delimiter ( Data segmentation symbols )
map keys terminated by ‘:’ MAP Medium key And value The delimiter
LINES TERMINATED BY ‘\n’ Line separator Default ’\n’

Import local data

hive (default)> load data local inpath '/opt/module/hive-1.2.1/datas/text.txt' overwrite into table text;
Loading data to table default.text
Table default.text stats: [numFiles=1, numRows=0, totalSize=234, rawDataSize=0]
OK
Time taken: 2.325 seconds

Inquire about

hive (default)> select * from text;
OK
text.name       text.sa1ary     text.subordinates       text.deductions text.address
John Doe        100000.0        ["Mary Smith","Todd Jones"]     {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1}        {"street":"1 Michigan Ave.","city":"Chicago","state":"1L","zip":60600}
Tom Smith       90000.0 ["Jan","Hello Ketty"]   {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1}        {"street":"Guang dong.","city":"China","state":"0.5L","zip":60661}
Time taken: 0.614 seconds, Fetched: 2 row(s)
hive (default)> select name,subordinates[0],deductions['State Taxes'],address.city from text;
OK
name    _c1     _c2     city
John Doe        Mary Smith      0.05    Chicago
Tom Smith       Jan     0.05    China
Time taken: 0.175 seconds, Fetched: 2 row(s)

Type conversion

Hive The atomic data type of can be implicitly converted , Be similar to Java Type conversion of .

For example, an expression uses INT type ,TINYINT Will automatically switch to INT type , however Hive No reverse conversion , for example , An expression uses TINYINT type ,INT Will not automatically convert to TINYINT type , It will return an error , Unless use CAST operation .

Implicit type conversion rules

(1) Any integer type can be implicitly converted to a broader type , Such as TINYINT Can convert become INT,INT It can be converted into BIGINT.
(2) All integer types 、FLOAT and STRING Types can be implicitly converted to DOUBLE.
(3)TINYINT、SMALLINT、INT Can be converted into FLOAT.
(4)BOOLEAN Type cannot be converted to any other type .

Explicit data type conversion

have access to CAST The operation explicitly performs data type conversion .
for example CAST(‘1’ AS INT) Will put string ’1’ transformation In whole numbers 1; If the cast fails , Returns a null value NULL.

Reference resources

Hive Collection data type
Hive Data type conversion

版权声明
本文为[yljphp]所创,转载请带上原文链接,感谢

Scroll to Top