flink SQL
2016-09-13 21:56:26 0 举报
Flink SQL是Apache Flink的流处理和批处理SQL查询引擎,它使得用户可以方便地使用SQL语言进行数据处理。Flink SQL支持多种数据源和数据格式,包括关系型数据库、NoSQL数据库、文件系统等。通过Flink SQL,用户可以编写复杂的SQL查询,对数据进行过滤、聚合、排序等操作,从而实现数据的分析和挖掘。Flink SQL还提供了丰富的窗口函数和时间处理功能,使得用户可以在大规模数据集上进行高效的实时计算和离线计算。总之,Flink SQL是一个功能强大、易用性好的大数据处理工具,适用于各种场景下的数据分析和业务决策。
作者其他创作
大纲/内容
GroupReduceSinkOperator
DataSetAggregate
translateToPlan
Program
sqlQueryByOptimizer
Planner
...
transToPlan
RelNode => Operator
FlinkChainedProgram
DataSetAggregateRule
final RelNode
Operation
SQL
QueryOperation
translate operation of Planner
StreamGraph
Operation with expression
translateToExecNodePlan
CalciteCatalogReader
RelNode
Parser
DQL
FlinkSchema
Operation with relNode
PlannerQueryOperation
- RelNode: calciteTree- TableSchema: Schema
DDL
CatalogCalciteSchema
SupportsFilterPushDown
TableSource
CatalogManagerCalciteSchema
LogicalAggregate
FlinkLogicalAggregate
SqlToOperationConverter
Rule
UseOperation
ShowOperation
translateToRel
tranformation
ExecNode
ModifyOperation
PlannerContext
Table
解析
Transfomation
getRelNode and optimize
Ruleset
rootSchema
DML
FlinkLogicalAggregateConvertor
Logical optimize
Flink SQL 解析流程
Catalog
+ manage database+ manage table+ manage function+ manage ...
build and optimize
translate
Schema
DataSet
SupportsLimitPushDown
Transformation
TableEnvironment
Physical optimize
CatalogManager
- catalogs- currentCatalog
+ manage catalog+ manage table+ manage view+ Set<String> listSchemas(String catalogName)+ execute command
Operation类别
Calcite
getStreamGraph
Optimizer
Optimizer 内部组织
Flink Catalog
执行
parse调用时机:1 TableEnvironment (execute sql or sqlUpdate or sqlQuery)transform调用时机:1、StreamTableEnv转DataStream2、Table execute3、TableEnvironment (execute sql or sqlUpdate)
optimize
parse
FlinkCalciteCatalogReader
StreamGraphGenerator
0 条评论
下一页