[Hive] Running Hive statements from a shell script | Creating a Hive table partitioned by date | lin
Author: / Published 2017/6/2 / 198
#!/bin/bash
source /etc/profile;

##################################################
# Author: ouyangyewei                            #
#                                                #
# Content: Combineorder Algorithm                #
##################################################

# change workspace to here
cd /
cd /home/deploy/recsys/algorithm/schedule/project/combineorder

# generate product_sell data (sales over the last week, written to yesterday's partition)
yesterday=$(date -d '-1 day' '+%Y-%m-%d')
lastweek=$(date -d '-1 week' '+%Y-%m-%d')

/usr/local/cloud/hive/bin/hive<<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS product_sell(
    category_id bigint,
    province_id bigint,
    product_id bigint,
    price double,
    sell_num bigint
)
PARTITIONED BY (ds string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

INSERT OVERWRITE TABLE product_sell PARTITION (ds='$yesterday')
select a.category_id,
       b.good_receiver_province_id as province_id,
       a.id as product_id,
       (b.sell_amount/b.sell_num) as price,
       b.sell_num
from product a
join (select si.product_id,
             s.good_receiver_province_id,
             sum(si.order_item_amount) sell_amount,
             sum(si.order_item_num) sell_num
      from so_item si
      join so s on (si.order_id=s.id)
      where si.is_gift=0 and si.is_hidden=0
        and si.ds between '$lastweek' and '$yesterday'
      group by s.good_receiver_province_id, si.product_id) b
  on (a.id=b.product_id);
EOF

# generate yhd_gmv_month data (sales over the last month, bucketed by price range)
yesterday=$(date -d '-1 day' '+%Y-%m-%d')
lastmonth=$(date -d '-1 month' '+%Y-%m-%d')

/usr/local/cloud/hive/bin/hive<<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS yhd_gmv_month(
    province_id bigint,
    price_area int,
    product_id bigint,
    sell_num bigint
)
PARTITIONED BY (ds string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

INSERT OVERWRITE TABLE yhd_gmv_month PARTITION (ds='$yesterday')
select ssi.province_id,
       (case when price>0.0 and price<=10.0 then 0
             when price>10.0 and price<=20.0 then 1
             when price>20.0 and price<=30.0 then 2
             when price>30.0 then 3
             else -1 end) as price_area,
       ssi.product_id,
       ssi.sell_num
from (select s.good_receiver_province_id as province_id,
             si.product_id,
             sum(si.order_item_num) as sell_num,
             sum(si.order_item_amount)/sum(si.order_item_num) as price
      from so_item si
      join so s on (si.order_id=s.id)
      where si.is_hidden=0 and si.is_gift=0
        and si.ds between '$lastmonth' and '$yesterday'
      group by s.good_receiver_province_id, si.product_id) ssi;
EOF

# execute the combineorder algorithm job
cd /
cd /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/pms_category_rec_prod
hadoop jar /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/recommender-dm-1.0-SNAPSHOT.jar com.yhd.recommender.combineorder.schedule.CombineorderRecommendScheduler

# export "pms_category_rec_prod" data to mysql
cd /
cd /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/pms_category_rec_prod
hadoop jar /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/recommender-dm-1.0-SNAPSHOT.jar com.yhd.recommender.exporter.db.HdfsToDBProcessor

# check that yesterday's "yhd_gmv_month" partition exists
# (the original tested the hard-coded path ds=2014-08-27 and left $yesterday unused)
yesterday=$(date -d '-1 day' '+%Y-%m-%d')
hadoop fs -test -e /user/hive/warehouse/yhd_gmv_month/ds=${yesterday}
if [ $? -ne 0 ]; then
    echo 'Error! Directory does not exist'
else
    # auto modify date time: point the three version entries at the last three partitions
    oldestVersionDay=$(date -d '-3 day' '+%Y-%m-%d')
    olderVersionDay=$(date -d '-2 day' '+%Y-%m-%d')
    newVersionDay=$(date -d '-1 day' '+%Y-%m-%d')

    sed -r -i '{s/oldestVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds=.*/oldestVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds='"${oldestVersionDay}"'/}' \
        /home/deploy/recsys/algorithm/schedule/verifaction/combineorder/yhd_gmv_month/input/verification.properties
    sed -r -i '{s/olderVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds=.*/olderVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds='"${olderVersionDay}"'/}' \
        /home/deploy/recsys/algorithm/schedule/verifaction/combineorder/yhd_gmv_month/input/verification.properties
    sed -r -i '{s/newVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds=.*/newVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds='"${newVersionDay}"'/}' \
        /home/deploy/recsys/algorithm/schedule/verifaction/combineorder/yhd_gmv_month/input/verification.properties

    # export "yhd_gmv_month" data to mysql
    cd /
    cd /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/yhd_gmv_month
    hadoop jar /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/recommender-dm-1.0-SNAPSHOT.jar com.yhd.recommender.exporter.db.HdfsToDBProcessor
fi
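
The three sed calls at the end of the script differ only in the property key and the day offset. Here is a minimal sketch of how they might be folded into one helper. The names days_ago and update_version and the throwaway demo file are hypothetical and not part of the original script, and date -d assumes GNU date, as the original does.

```shell
#!/bin/bash
# Sketch only: days_ago and update_version are hypothetical helpers,
# not part of the original script.

# Print the date N days ago as YYYY-MM-DD (GNU date, like the original).
days_ago() {
    date -d "-$1 day" '+%Y-%m-%d'
}

# Point "<key>=/user/hive/warehouse/yhd_gmv_month/ds=..." in a properties
# file at the given day. '|' as the sed delimiter avoids escaping each '/'.
# Writing to a temp file and moving it back avoids relying on sed -i.
update_version() {
    local key="$1" day="$2" file="$3"
    local base="/user/hive/warehouse/yhd_gmv_month"
    sed "s|${key}=${base}/ds=.*|${key}=${base}/ds=${day}|" "$file" > "${file}.tmp" \
        && mv "${file}.tmp" "$file"
}

# Demo on a throwaway file rather than the real verification.properties.
demo_file=$(mktemp)
printf '%s\n' \
    'oldestVersion=/user/hive/warehouse/yhd_gmv_month/ds=1970-01-01' \
    'olderVersion=/user/hive/warehouse/yhd_gmv_month/ds=1970-01-01' \
    'newVersion=/user/hive/warehouse/yhd_gmv_month/ds=1970-01-01' > "$demo_file"

# Same offsets as the original: oldest = -3 days, older = -2, new = -1.
update_version oldestVersion "$(days_ago 3)" "$demo_file"
update_version olderVersion  "$(days_ago 2)" "$demo_file"
update_version newVersion    "$(days_ago 1)" "$demo_file"

cat "$demo_file"
rm -f "$demo_file"
```

In the real job the third argument would be the verification.properties path used by the script. The helper only touches the line whose key matches, so the three calls can run in any order.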