TOC > Usage overview > Using OGC API - Features with R and VBA
Tutorial: querying data with OGC API - Features using R and VBA programming languages
The OGC API - Features web services available in GeoMet-OGC-API offer the capability to query vector geospatial data and retrieve it in GeoJSON and CSV formats. The output file can then be used for display or additional data processing. This tutorial will show you how you can:
- Query and retrieve vector geospatial data from GeoMet-OGC-API with R and Excel's VBA
- Load the data inside the R code editor and display time series data with a simple graph
- Load the data inside an Excel sheet and display time series data with a simple graph
R example
Getting a simple piece of code working in R
is very simple using the plot() function. However, the costumization options are non-existent and as such we'll be making a plot that is a little more complex, but that really shows what you can do with our data and with R
:
library(ggplot2)
library(scales)
options(scipen=999)
aqhi_obs_realtime <- read.csv("https://api.weather.gc.ca/collections/aqhi-observations-realtime/items?sortby=observation_datetime&limit=100&location_name_fr=Ottawa&f=csv")
format_str="%Y-%m-%dT%H:%M:%OSZ"
obs_datetime <- as.POSIXct(aqhi_obs_realtime$observation_datetime, tz="GMT", format=format_str)
aqhi_obs_realtime$observation_datetime <- obs_datetime
p <- ggplot(aqhi_obs_realtime, aes(x=observation_datetime, y=aqhi)) +
geom_point(aes(fill=aqhi, size=aqhi), shape=21) +
scale_fill_viridis_c(guide = "legend") +
scale_size_continuous(range=c(1, 3)) +
geom_line(size=1) +
labs(title="AQHI as a function of time for Ottawa",
subtitle="From weather.gc.ca",
y="AQHI",
x="Observation datetime",
caption="Working with R example") +
scale_x_datetime(
breaks=seq(min(obs_datetime), max(obs_datetime), by="6 hours"),
date_labels="%m-%d %H:%M",
expand=c(0.005, 0))
print(p)
First things first, before we can run this code, we'll need to install the proper tools (of course skip steps if you already have them):
- Install RStudio
- Download the R language
- Download the appropriate version of RTools depending on the version of RStudio you have. This is necessary so that the libraries we will be adding to our code work properly.
We now have everything we need, so let's first open RStudio so we can move to the next steps! Once opened, go in the tab called Console
and type these two lines (press Enter
key after each):
install.packages("ggplot2")
install.packages("scales")
ggplot2 is a great plotting library that gives us lots of costumization options and scales is a library used with ggplot2 that helps with rescaling the final plot.
Next we'll go to File > New File > R Script
, which will open an empty untitled file. We will then copy and paste the code shown above inside it and we'll save the file (either by clicking on the Save
icon or by pressing Ctrl S
). You may name it however you like, just make sure the extension of the file is .R
(i.e. myfilename.R
). To run the entire code, simply click on the Source
button (the Run
button only runs lines you've highlighted inside your file). The result should look like this:
We will now explain the code we've just run line by line so that you understand what has been done:
The first two lines with library
are just to tell the code we'll be using these two libraries. The options(scipen=999) isn't used in this particular example, but it removes scientific notation if you ever need it.
Inside the read.csv() function, you put the desired url with whatever parameters you want to have. For example, here we want the aqhi-observations-realtime
data, we want the data to be sorted according to the observation_datetime
field, we want a limit
of 100 lines of data and we only want the data for Ottawa
. Finally, we just add &f=csv
at the end so that the data is in CSV format. It will thus load all the data from the CSV file inside the variable named X
as a dataset
inside R Studio.
The format_str
is used to tell the code how our dates are formatted so it knows how to interpret them. You can find more info on what each letter means here. The next line is so the code understands that the values in observation_datetime
aren't just text, they represent the date and time. We will thus use as.POSIXct() to achieve this goal. We will give it aqhi_obs_realtime$observation_datetime
, which means "all data inside the aqhi_obs_realtime
dataset under the observation_datetime
field". We'll also specify that the timezone is GMT
and we will tell it how the dates and times are formatted so that it can understand it, using the format_str
we've just defined. We then overwrite aqhi_obs_realtime$observation_datetime
with the datetimes formatted as datetimes instead of just as text.
Now we come to the ggplot function and all the formatting options that we've added to it:
- First the function itself. We need to pass it the dataset that we want to use to plot, then we need to pass it our aesthetic mappings aes, in which all we've done is tell it what columns we wish to be using. This line by itself will only show an empty graph since all we've done here is tell it what we'll be plotting, not how.
- Now let's tell it how to plot it! We'll use geom_point() to tell it to plot the data as points. Inside we've added some more aesthetic mappings, which will color the points according to the value inside
aqhi
and change the point's size according toaqhi
as well. You could've skipped the aesthetics and simply given it a color and a size, it's just there to show it is possible. - Problem now is the points vary a LOT with size, so we'll use scale_size_continuous() and tell it to have a minimum size of 1 and a maximum size of 3 for points. The other problem is that there are now two legends, one with sizes and one with color. To remedy that, we're using scale_fill_viridis_c to tell the code they're both of type
legend
, which will combine the two of them together. - After that, let's connect these points with a line while we're at it! We'll be using geom_line() for that purpose and give it a constant size.
- Next we'll add all the labels that we want all around the plot with labs.
- Finally, our x axis currently contains all of the dates with their full notation and we can't read a single thing on the x axis. As such, we'll format it using scale_x_datetime(). First we'll use the
breaks
argument to tell it to start at the first time and end at the last time (meaning put the first time label under the first point and don't go further than the last value). For that purpose we'll use the sequence generator seq(), which will return a list of all the labels that are between start and end and we'll use theby
argument to give it the desired interval. Next, we don't need to see every date as for example2022-01-01 00:00:00
. Instead, let's tell it using thedate_labels
argument that we only want01-01 00:00
with the date format%m-%d %H:%M
. ggplot also always keeps a buffer at the start and end of the plot that represent 5% of the entire graph on each side, so we'll reduce that to 0.5% by using theexpand
argument.
We are at the end of the script. All that's left is the print() line, which only tells RStudio to display the plot we've just created. Had you decided to paste all of these lines directly inside the console, you wouldn't need that print statement as within the console it is always implicit that you wish to see the result of what you're doing. Although it is a better practice to keep everything within a file and run that instead as you don't run the risk of losing your work.
Excel example
The goal here is to query and make the plot in Excel using only the Excel programming language: VBA. Before we start, something to note is that you will need to change the sheet name Sheet1
to the name of your sheet (unless it's already the same name). The reason is we are clearing the entire sheet and putting new data inside it, so this makes sure you don't have the wrong Excel sheet selected and clear everything that's in it.
Steps to follow to use VBA
in Excel:
- Enable the Developer tab: On the File tab, go to
Options > Customize Ribbon
. UnderCustomize the Ribbon
and underMain Tabs
, select theDeveloper
check box; - Go in the newly added
Developer
tab and click onVisual Basic
; - Right click on the Excel sheet in the
VBAProject
tab in the newly opened window, thenInsert > Module
; - Copy paste the code shown below inside the code Module (don't forget to change every
Sheet1
in the code if that's not the name of your sheet) and press theRun
button.
And that's it! You've just populated your Excel sheet with the data from aqhi-observations-realtime
and you've plotted a ScatterLineChart
using the data from the observation_datetime
and aqhi
fields. Here is the code that you need to accomplish this (code made with added comments to help users understand each step):
' Explicit -> The variables used must be defined with Dim before being used
Option Explicit
Sub MakeRequest()
Dim baseURL As String
Dim params As String
Dim reqURL As String
' URL for the request with parameters
baseURL = "https://api.weather.gc.ca/collections/aqhi-observations-realtime/items"
params = "sortby=observation_datetime&limit=100&location_name_fr=Ottawa"
' We concatenate into a single URL and we add "&f=csv" at the end to get the data in CSV format
reqURL = baseURL & "?" & params & "&f=csv"
Application.ScreenUpdating = False
' We clear the content of the cell in columns A to Z before adding the new data
ThisWorkbook.Worksheets("Sheet1").Range("A:Z").ClearContents
' We select the Excel sheet named "Sheet1"
Sheets("Sheet1").Select
Dim ConnString As String
' We send the request for the CSV file and we tell it to put the data inside "Sheet1"
ConnString = "TEXT;" & reqURL
With ThisWorkbook.Worksheets("Sheet1").QueryTables.Add(Connection:=ConnString, Destination:=Worksheets("Sheet1").Cells(1, 1))
.Name = "AQHI Observations Realtime"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = True
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.TextFileStartRow = 1
.TextFileParseType = xlDelimited
.TextFileCommaDelimiter = True
.Refresh
End With
' We will be making a LineChart of AQHI (aqhi) as a function of time (observation_datetime)
Dim WS As Worksheet
Dim ColxNum As Integer
Dim ColyNum As Integer
Dim ColxLetter As String
Dim ColyLetter As String
Dim lastRow As Long
Dim ValuesRange As String
Set WS = ActiveSheet
' We need to find the column number of "observation_datetime" and the corresponding column letter
ColxNum = Application.WorksheetFunction.Match("observation_datetime", WS.Rows(1), 0)
ColxLetter = Split(Cells(1, ColxNum).Address, "$")(1)
' We need to find the column number of "aqhi" and the corresponding column letter
ColyNum = Application.WorksheetFunction.Match("aqhi", WS.Rows(1), 0)
ColyLetter = Split(Cells(1, ColyNum).Address, "$")(1)
' We find the total number of lines for the data selection
lastRow = ActiveSheet.Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
' We select all the values from observation_datetime column and format it into a datetime format in the form "month-day hour:minute"
Dim c As Range
Dim ColDatetimeValues As String
ColDatetimeValues = ColxLetter & "1:" & ColxLetter & lastRow
WS.Range(ColDatetimeValues).Select
Selection.NumberFormat = "mm-dd hh:mm"
For Each c In Selection
c.Value = Replace(c.Value, "T", " ")
c.Value = Replace(c.Value, "Z", "")
Next c
' We format lines and columns for data selection as H1:H62,J1:J62 for the LineChart
ValuesRange = ColxLetter & "1:" & ColxLetter & lastRow & "," & ColyLetter & "1:" & ColyLetter & lastRow
' We select the values according to the formatted intervals
WS.Range(ValuesRange).Select
' We create a ScatterLine chart with the selected data
Dim myChart
Set myChart = WS.Shapes.AddChart2(227, xlXYScatterLines)
myChart.Select
' We add and format the labels
With myChart
.chart.SetElement msoElementChartTitleAboveChart
.chart.SetElement msoElementPrimaryCategoryGridLinesMinor
.chart.SetElement msoElementPrimaryCategoryAxisTitleBelowAxis
.chart.SetElement msoElementPrimaryValueAxisTitleRotated
.chart.ChartTitle.Text = "AQHI as a function of time for Ottawa"
.chart.Axes(Type:=xlCategory, AxisGroup:=xlPrimary).AxisTitle.Text = "Observation datetime"
.chart.Axes(Type:=xlCategory).TickLabels.Orientation = 45
.chart.Axes(Type:=xlCategory).MinimumScale = Range(ColxLetter & "2").Value
.chart.Axes(Type:=xlCategory).MaximumScale = Range(ColxLetter & lastRow).Value
.chart.Axes(Type:=xlValue, AxisGroup:=xlPrimary).AxisTitle.Text = "AQHI"
End With
End Sub
Here's the result after running the code: