Notebooks
S
Snowflake
Ingest Public JSON

Ingest Public JSON

data-sciencenotebookmachine-learningsnowflake-demo-notebooksIngest Public JSONdata-engineeringPythonsql

How to Ingest JSON Data from Public Endpoint

This example demonstrates how you can download data from a public endpoint and transform it into a Snowpark Dataframe and save the results into a table in Snowflake.

Note: Running this notebook require that you have ACCOUNTADMIN or SECURITYADMIN roles to create new network rules.

[ ]
[ ]
CREATE or replace TABLE bike_riders (...

By default, Snowflake restricts network traffic from requests from public IP addresses. In order to access external data, we first need to create an external access integration to add data.seattle.gov as an allowed endpoint.

[ ]
[ ]

Next, we create a user-defined function (UDF) that allows users to connect outside of Snowflake and fetch the data from the remote endpoint. We attach the external access object that we created earlier to the UDF so that it has permission to access the allowed network. Read more about using external access integration in a UDF or procedure here.

The external function uses the requests library in Python to get the JSON response from the URL.

[ ]

Now we can call the external function on this URL, we see the JSON string returned as output:

[ ]

Next, we want to insert the JSON into the bike_riders table. We use Snowflake's PARSE_JSON function to process the data.

Furthermore, we use the :: operator to extract the value of the JSON field to the desired data type (STRING, NUMBER). Read more about how to work with semi-structured data in Snowflake here.

[ ]

Now that the table is loaded, we can use SQL to preview the data:

[ ]

Alternatively, we can also load this table into a Snowpark Dataframe to work with your data in Python.

[ ]
[ ]

We can also convert our Snowpark DataFrame to pandas and operate on it with pandas.

[ ]
[ ]

Now, we can visualize the TIMESTAMP column by plot a histogram distribution of hours.

[ ]

Conclusion

In this example, we demonstrated how you can create an external access integration and attach it to a UDF that loads data from a public endpoint. We also showed how you can load semi-structured JSON data into a Snowflake table and work with it using SQL or Python. To learn more about external network access to Snowflake, refer to the documentation here.