Create AWS Athena View Using AWS CDK

Athena Views!
Sounds like SQL Views, correct? To some extent they are, means athena view is also a logical table and it does not hold any physical data.
Good news is that manually, it is created similarly as SQL view, but what if you need to create it programmatically?

In today’s era most of the companies follow the process of automating the creation of AWS infrastructure to save time and resources to maintain. But as Athena views are recently introduced in AWS, so there is no proper documentation to create views using AWS CDK but guess what? It’s possible. Happy news, right?

Advertisements

Let’s get into it.

When we create athena view using AWS CDK, it creates the view in the Glue Data Catalog and then Data Catalog treats views as tables, which logically it is. So, this approach creates the view using CfnTable construct. You can check more details (attributes and possible values) here.

Below are the things that you would need to create the view-

  • A JSON which will hold the configuration of the view. Ex- ‘SELECT query’, and column data types etc.
  • parameters should have presto_view:true. (as Athena does not support Hive views)
  • table_type should be VIRTUAL_VIEW
  • view_original_text should have a base 64 encoded Presto view.
  • storage_descriptor.columns should have all the columns that the view will have (mentioned in the JSON) and their type casted data types. Type casting is required otherwise view will be created but you won’t be able to query it because athena will not be able to resolve the datatype. You can read more on this here.
Advertisements

Now comes the question that what would be the exact content of the JSON that we talked about earlier, so it will look something like below. I have included the typecasting example as well for one of the columns-

View Configuration

athena_json = {
"originalSql": "SELECT cola, colb, colc FROM tableA"
"catalog": "awsdatacatalog",
"schema": "your_database",
"columns": [
{
"name": "cola",
"type": "varchar"
},
{
"name": "colb",
"type": "double"
},
{
"name": "colc",
"type": "real"
}
],
}

So, now the view configuration is ready, and you just have to write the final snippet to create the view in Glue Catalog. Below is the example for that-

'''Typecasting columns data types'''
type_casted_col_list = [{"name": item['name'],
"type": "string" if item['type'] =="varchar" else "float" if item['type'] == "real" else item['type']}
for item in athena_json['columns']]

'''base64 encoding view config'''
b64_en_view_config = (base64.b64encode((json.dumps(athena_json)).encode('utf-8'))).decode('utf-8')

'''creating view'''
view = glue.CfnTable(self, "MyView",
catalog_id="AccountID",
database_name="databaseName",
table_input=glue.CfnTable.TableInputProperty(
name="view_name",
parameters={
"presto_view" = "true"
"comment" = "Presto View"
},
table_type ="VIRTUAL_VIEW",
storage_descriptor=glue.CfnTable.StorageDescriptorProperty(
columns=type_casted_col_list,
),
view_original_text = f'/* Presto View: {b64_en_view_config} */"
view_expanded_text = "/* Presto View */"

This would create athena view which would be visible in the AWS Glue Catalog and can be queried through athena. You might face errors while querying if there is any mistake in the syntax.
Try this out and let me know in comments in case you face any errors, will be happy to help!

Advertisements

One thought on “Create AWS Athena View Using AWS CDK

Leave a comment