
Flink 1.11 + Hive: A Unified Batch-Streaming Data Warehouse

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"导读"},{"type":"text","text":": Flink从1.9.0开始提供与Hive集成的功能,随着几个版本的迭代,在最新的Flink 1.11中,与Hive集成的功能进一步深化,并且开始尝试将流计算场景与Hive进行整合。本文主要分享在Flink 1.11中对接Hive的新特性,以及如何利用Flink对Hive数仓进行实时化改造,从而实现批流一体的目标。主要内容包括:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Flink与Hive集成的背景介绍"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Flink 1.11中的新特性"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"打造Hive批流一体数仓"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"01 Flink与Hive集成的背景介绍"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"为什么要做Flink和Hive集成的功能呢?最早的初衷是我们希望挖掘Flink在批处理方面的能力。众所周知,Flink在流计算方面已经是成功的引擎了,使用的用户也非常多。在Flink的设计理念当中,批计算是流处理中的一个特例。也就意味着,如果Flink在流计算方面做好,其实它的架构也能很好的支持批计算的场景。在批计算的场景中,SQL是一个很重要的切入点。因为做数据分析的同学,他们更习惯使用SQL进行开发,而不是去写DataStream或者DataSet这样的程序。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}

Copyright notice
This article was originally created by [InfoQ]. Please include a link to the original when reproducing it.
https://www.infoq.cn/article/Yj5zE53NzNmM7bPFRaLz?utm_source=rss&utm_medium=article
