Hive query output to file
I am running Hive queries from Java code. Example:
"SELECT * FROM table WHERE id > 100"
How can I export the results to an HDFS file?
The following query writes the results directly to HDFS:
INSERT OVERWRITE DIRECTORY '/path/to/output/dir' SELECT * FROM table WHERE id > 100;
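When no ROW FORMAT clause is given, Hive serializes the directory output as text with columns separated by the Ctrl-A character (\001), which is awkward to read directly. A minimal sketch for viewing it, assuming the `hdfs` client is on the PATH; `ctrl_a_to_tab` and `show_export` are hypothetical helper names, not Hive or Hadoop commands:

```shell
# Convert Hive's default field separator (Ctrl-A, octal 001) to tabs.
ctrl_a_to_tab() {
    tr '\001' '\t'
}

# Print every file in an HDFS export directory with tabs between columns.
# $1 is the directory that was given to INSERT OVERWRITE DIRECTORY.
# The glob is quoted so that hdfs, not the local shell, expands it.
show_export() {
    hdfs dfs -cat "$1/*" | ctrl_a_to_tab
}

# Example call (commented out so the snippet can be sourced without a cluster):
# show_export /path/to/output/dir
```

Adding ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' to the query itself avoids the conversion step.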
This command redirects the output to a text file of your choice:
$hive -e "select * from table where id > 10" > ~/sample_output.txt
This writes the results as tab-delimited files under a local directory:
INSERT OVERWRITE LOCAL DIRECTORY '/home/hadoop/YourTableDir'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
SELECT * FROM table WHERE id > 100;
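Hive writes one or more part files into the target directory rather than a single named file. If one file is wanted, the parts can be concatenated afterwards. A small sketch under that assumption; `merge_parts` is a hypothetical helper and the destination file name is only an illustration:

```shell
# Concatenate all visible files in a Hive output directory into one file.
# $1: directory written by INSERT OVERWRITE LOCAL DIRECTORY
# $2: destination file (should live outside $1)
merge_parts() {
    # The * glob skips dot-files such as the .crc checksum files Hadoop may leave.
    cat "$1"/* > "$2"
}

# Example call:
# merge_parts /home/hadoop/YourTableDir /home/hadoop/YourTable.tsv
```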
Ideally, use "INSERT OVERWRITE DIRECTORY '/pathtofile' select * from temp where id > 100" rather than "hive -e 'select * from...' > /filepath.txt".
@sarath If I want to run another select * against a different table and write to the same file, how do I overwrite the file?
INSERT OVERWRITE LOCAL DIRECTORY '/home/training/mydata/outputs' SELECT expl , count(expl) as total
FROM ( SELECT explode(splits) as expl FROM ( SELECT split(words,' ') as splits FROM wordcount ) t2 ) t3 GROUP BY expl ;
This is an example for sarath's question.
The above is a word-count job whose output is stored in files in a local directory :)
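I agree with tnguyen80's response. Note that when the query contains literal string values, it is best to wrap the whole query in double quotes so that the single-quoted literals reach Hive intact.
For example:
$hive -e "select * from table where city = 'London' and id >=100" > "/home/user/outputdirectory/city details.csv"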
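Quoting matters on both sides of the redirect: the query goes to hive as one double-quoted argument, and an output path that contains spaces has to be quoted as well. A sketch that wraps both in a function, assuming `hive` is on the PATH; `export_query` is a hypothetical name, and `-S` is Hive's silent-mode flag:

```shell
# Run a Hive query non-interactively and save its stdout to a local file.
# $1: HiveQL text, passed to hive as a single argument
# $2: local destination file; may contain spaces
export_query() {
    # -S (silent mode) suppresses the progress messages Hive would otherwise print.
    hive -S -e "$1" > "$2"
}

# Example call (commented out so the snippet can be sourced without running Hive):
# export_query "select * from table where city = 'London' and id >= 100" \
#     "/home/user/outputdirectory/city details.csv"
```

Note that the rows printed by hive -e are tab-separated, whatever extension the file name carries.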
To save the file directly in HDFS, use the following command:
hive> insert overwrite directory '/user/cloudera/Sample' row format delimited fields terminated by '\t' stored as textfile select * from table where id >100;
This puts the contents in the /user/cloudera/Sample folder in HDFS.
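To pull such an HDFS directory down into a single local file, the `hdfs dfs -getmerge` command can concatenate the part files. A sketch assuming the `hdfs` client is installed; `fetch_export` is a hypothetical wrapper and the local file name is only an illustration:

```shell
# Merge all files of an HDFS directory into one local file.
# $1: HDFS directory written by INSERT OVERWRITE DIRECTORY
# $2: local destination file
fetch_export() {
    hdfs dfs -getmerge "$1" "$2"
}

# Example call:
# fetch_export /user/cloudera/Sample "$HOME/sample.tsv"
```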
Example:
Create an external table that stores the query results in '/user/myName/projectA_additionaData/'
CREATE EXTERNAL TABLE additionaData
(
ID INT,
latitude STRING,
longitude STRING
)
COMMENT 'Additional Data gathered by joining of the identified cities with latitude and longitude data'
ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' STORED AS TEXTFILE location '/user/myName/projectA_additionaData/';
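Load the query results into the temporary table
insert into additionaData
Select T.ID, C.latitude, C.longitude
from TWITER T
join CITY C on (T.location_name = C.location);
Drop the temporary table (for an external table this normally removes only the metadata; the files under its location stay in place)
drop table additionaData;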
There are two ways to store HQL query results:
INSERT OVERWRITE DIRECTORY "HDFS Path" ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
SELECT * FROM XXXX LIMIT 10;
$hive -e "select * from table_Name" > ~/sample_output.txt
$hive -e "select * from table where city = 'London' and id >=100" > "/home/user/outputdirectory/city details.csv"
Enter this line in the Hive command-line interface:
insert overwrite directory '/data/test' row format delimited fields terminated by '\\t' stored as textfile select * from testViewQuery;
testViewQuery - some specific view
To set the output directory, the output file format, and so on, try the following:
INSERT OVERWRITE [LOCAL] DIRECTORY directory1
[ROW FORMAT row_format] [STORED AS file_format]
SELECT ... FROM ...
Example:
INSERT OVERWRITE DIRECTORY '/path/to/output/dir'
ROW FORMAT DELIMITED
STORED AS PARQUET
SELECT * FROM table WHERE id > 100;