Athena Views!
Sounds like SQL Views, correct? To some extent they are, means athena view is also a logical table and it does not hold any physical data.
Good news is that manually, it is created similarly as SQL view, but what if you need to create it programmatically?
In today’s era most of the companies follow the process of automating the creation of AWS infrastructure to save time and resources to maintain. But as Athena views are recently introduced in AWS, so there is no proper documentation to create views using AWS CDK but guess what? It’s possible. Happy news, right?
Let’s get into it.
When we create athena view using AWS CDK, it creates the view in the Glue Data Catalog and then Data Catalog treats views as tables, which logically it is. So, this approach creates the view using CfnTable
construct. You can check more details (attributes and possible values) here.
Below are the things that you would need to create the view-
- A JSON which will hold the configuration of the view. Ex- ‘SELECT query’, and column data types etc.
parameters
should havepresto_view:true
. (as Athena does not support Hive views)table_type
should beVIRTUAL_VIEW
view_original_text
should have a base 64 encoded Presto view.
should have all the columns that the view will have (mentioned in the JSON) and their type casted data types. Type casting is required otherwise view will be created but you won’t be able to query it because athena will not be able to resolve the datatype. You can read more on this here.storage_descriptor
.columns
Now comes the question that what would be the exact content of the JSON that we talked about earlier, so it will look something like below. I have included the typecasting example as well for one of the columns-
View Configuration
athena_json = {
"originalSql": "SELECT cola, colb, colc FROM tableA"
"catalog": "awsdatacatalog",
"schema": "your_database",
"columns": [
{
"name": "cola",
"type": "varchar"
},
{
"name": "colb",
"type": "double"
},
{
"name": "colc",
"type": "real"
}
],
}
So, now the view configuration is ready, and you just have to write the final snippet to create the view in Glue Catalog. Below is the example for that-
'''Typecasting columns data types'''
type_casted_col_list = [{"name": item['name'],
"type": "string" if item['type'] =="varchar" else "float" if item['type'] == "real" else item['type']}
for item in athena_json['columns']]
'''base64 encoding view config'''
b64_en_view_config = (base64.b64encode((json.dumps(athena_json)).encode('utf-8'))).decode('utf-8')
'''creating view'''
view = glue.CfnTable(self, "MyView",
catalog_id="AccountID",
database_name="databaseName",
table_input=glue.CfnTable.TableInputProperty(
name="view_name",
parameters={
"presto_view" = "true"
"comment" = "Presto View"
},
table_type ="VIRTUAL_VIEW",
storage_descriptor=glue.CfnTable.StorageDescriptorProperty(
columns=type_casted_col_list,
),
view_original_text = f'/* Presto View: {b64_en_view_config} */"
view_expanded_text = "/* Presto View */"
This would create athena view which would be visible in the AWS Glue Catalog and can be queried through athena. You might face errors while querying if there is any mistake in the syntax.
Try this out and let me know in comments in case you face any errors, will be happy to help!
I am getting error ‘INVALID_VIEW’
LikeLike