
Exercise 1 - Your first dataset

In this section you are going to publish a vector dataset.

For this exercise, we will use a CSV dataset of free Wifi locations in Florence, kindly provided by opendata.comune.fi.it.

You can find this dataset in workshop/exercises/data/free-wifi-florence.csv.

This exercise consists of two key steps:

  • adapt the pygeoapi.config.yml to define this dataset as an OGC API - Features collection
  • ensure that pygeoapi can find and connect to the data file

We will use the docker-compose.yml file provided.

Verify the existing Docker Compose config

Before making any changes, we will make sure that the initial Docker Compose setup provided to you is actually working. Two files are relevant:

  • workshop/exercises/docker-compose.yml
  • pygeoapi/docker.pygeoapi.config

To test:

Test the workshop configuration

  1. In a terminal shell, navigate to the workshop/exercises folder (the one containing docker-compose.yml) and type:

docker-compose up
  2. Open http://localhost:5000 in your browser and verify that some collections are listed
  3. Stop the container by typing CTRL-C

Note

You may also run the Docker container in the background (detached) as follows:

docker-compose up -d
docker ps  # verify that the pygeoapi container is running
# visit http://localhost:5000 in your browser, verify some collections
docker logs --follow pygeoapi  # view logs
docker-compose stop

Publish first dataset

You are now ready to publish your first dataset.

Setting up the pygeoapi config file

  1. Open the file workshop/exercises/pygeoapi/pygeoapi.config.yml in your text editor
  2. Look for the commented config section starting with # START - EXERCISE 1 - Your First Collection
  3. Uncomment all lines until # END - EXERCISE 1 - Your First Collection

Make sure that the indentation aligns (hint: directly under # START ...)

The config section reads:

free_wifi_florence:
    type: collection
    title: Free WIFI Florence
    description: The dataset shows the location of the places in the Municipality of Florence where a free wireless internet connection service (Wifi) is available.
    keywords:
        - wifi
        - florence
    links:
        - type: text/csv
          rel: canonical
          title: data
          href: https://opendata.comune.fi.it/?q=metarepo/datasetinfo&id=fb5b7bac-bcb0-4326-9388-7e3f3d671d71
          hreflang: it-IT
    extents:
        spatial:
            bbox: [11, 43.6, 11.4, 43.9]
            crs: http://www.opengis.net/def/crs/OGC/1.3/CRS84
    providers:
        - type: feature
          name: CSV
          data: /data/free-wifi-florence.csv
          id_field: name-it
          geometry:
              x_field: lon
              y_field: lat

The most relevant part is the providers section. Here, we define a CSV provider whose data path points into the /data directory of the Docker container; we will mount that directory from the local file system in the next step. Because a CSV is not a spatial file, we explicitly configure pygeoapi so that the longitude and latitude (x and y) are read from the columns lon and lat in the CSV file. The id_field setting names the column (name-it) whose value identifies each feature.
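To make the column mapping concrete, here is a minimal Python sketch of how a row with lon and lat columns can become a GeoJSON-style point feature. It is an illustration only, not pygeoapi's actual CSV provider code; the sample rows and the function names are hypothetical, and only the column names name-it, lon and lat come from the configuration above.

```python
import csv
import io

# Hypothetical sample rows; only the column names are taken from the config.
SAMPLE_CSV = (
    "name-it,lon,lat\n"
    "Hypothetical hotspot A,11.25,43.77\n"
    "Hypothetical hotspot B,11.26,43.78\n"
)


def row_to_feature(row, id_field="name-it", x_field="lon", y_field="lat"):
    """Turn one CSV row (a dict of strings) into a GeoJSON-style Point feature."""
    # Everything except the coordinate columns is kept as a property.
    properties = {k: v for k, v in row.items() if k not in (x_field, y_field)}
    return {
        "type": "Feature",
        "id": row[id_field],
        "geometry": {
            "type": "Point",
            # GeoJSON coordinate order is [x, y], i.e. [longitude, latitude].
            "coordinates": [float(row[x_field]), float(row[y_field])],
        },
        "properties": properties,
    }


def csv_to_features(text):
    """Parse CSV text and convert every row to a feature."""
    return [row_to_feature(row) for row in csv.DictReader(io.StringIO(text))]


features = csv_to_features(SAMPLE_CSV)
print(features[0]["id"], features[0]["geometry"]["coordinates"])
```

Swapping x_field and y_field is a common mistake: points then end up far outside the bbox declared in the extents section.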

Tip

To learn more about the pygeoapi configuration syntax and conventions see the relevant chapter in the documentation.

Tip

pygeoapi includes numerous data providers which enable access to a variety of data formats. Via the OGR/GDAL plugin the number of supported formats is almost limitless. Consult the data provider page to learn how to set up a connection to your dataset of choice. You can always copy a relevant example configuration and place it in the resources section of the pygeoapi configuration file for your future project.

Making data available in the Docker container

As the Docker container (named pygeoapi) cannot directly access files on your local host system, we will use Docker volume mounts. These are defined in the docker-compose.yml file as follows:

Configure access to the data

  1. Open the file workshop/exercises/docker-compose.yml
  2. Look for the commented section # Exercise 1 -
  3. Uncomment that line - ./data:/data

The relevant lines read:

volumes:
    - ./pygeoapi/pygeoapi.config.yml:/pygeoapi/local.config.yml
    - ./data:/data # Exercise 1 - Ready to pull data from here

The local ./pygeoapi/pygeoapi.config.yml file was already mounted. Now we have also mounted (made available) the entire local directory ./data.
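The mapping between container paths and host paths can be sketched in Python as follows. This is a simplified illustration that assumes the short host:container volume syntax with relative host paths, as in the two lines above; the helper names are hypothetical and this is not how Docker itself resolves mounts.

```python
from pathlib import PurePosixPath

# The two volume entries from docker-compose.yml (short "host:container" syntax).
VOLUMES = [
    "./pygeoapi/pygeoapi.config.yml:/pygeoapi/local.config.yml",
    "./data:/data",
]


def host_path_for(container_path, volume_specs=VOLUMES):
    """Return the host path behind a container path, or None if it is not mounted."""
    for spec in volume_specs:
        # Ignore an optional third part such as ":ro".
        host, container = spec.split(":")[:2]
        try:
            relative = PurePosixPath(container_path).relative_to(container)
        except ValueError:
            continue  # container_path is not inside this mount
        return str(PurePosixPath(host) / relative)
    return None


# The path used in the provider's "data" setting:
print(host_path_for("/data/free-wifi-florence.csv"))
```

A path for which this returns None is not covered by any mount, so a provider pointing at it would not find a file from your host system.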

Test

Start with updated configuration

  1. Start by typing docker-compose up
  2. Observe logging output
  3. If no errors: open http://localhost:5000
  4. Look for the Free WIFI Florence collection
  5. Browse through the collection
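Besides browsing, the collection can also be queried programmatically. The sketch below builds the OGC API - Features items URL for the collection id free_wifi_florence (the key used in the config above) and defines a small fetch helper. The function names are hypothetical, and the fetch helper only works while the container is running on http://localhost:5000.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # as used in this exercise


def items_url(collection_id, limit=5, base_url=BASE_URL):
    """Build the /collections/{id}/items URL, asking for JSON output."""
    query = urlencode({"f": "json", "limit": limit})
    return f"{base_url}/collections/{collection_id}/items?{query}"


def fetch_features(collection_id, limit=5):
    """Fetch up to `limit` features; requires the pygeoapi container to be up."""
    with urlopen(items_url(collection_id, limit), timeout=10) as response:
        return json.load(response)["features"]


# Building the URL needs no running server:
print(items_url("free_wifi_florence"))
```

With the container running, calling fetch_features("free_wifi_florence") should return a list of GeoJSON features whose geometry coordinates come from the lon and lat columns.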

Debugging configuration errors

Occasionally you may run into errors. The most common causes are:

  • A file cannot be found, often because of a typo in the configuration
  • The format or structure of the spatial file is not fully supported
  • The port (5000) is already taken. Is a previous pygeoapi still running? If you change the port, consider that you also have to update the pygeoapi config file

Two parameters in the logging section of the configuration file help to diagnose these issues: set level to DEBUG and set logfile to the path of a log file.
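The "port already taken" case from the list above can be ruled out before starting the container. The following is a minimal sketch using only the Python standard library; the function name is hypothetical, and it assumes the service is published on the local host.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0


print("port 5000 in use:", port_in_use(5000))
```

If this prints True while you believe pygeoapi is stopped, another process (or a previous container) is still holding the port.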

Tip

On Docker, set the path of the logfile to the mounted folder, so you can easily access it from your host system. You can also view the console logs from your Docker container as follows:

docker logs --follow pygeoapi

Tip

Errors related to file paths typically happen on initial setup. However, they may also happen at unexpected moments, resulting in a broken service. Products such as GeoHealthCheck monitor service health and availability and send notifications when problems are detected. The OGC API - Features tests in GeoHealthCheck poll the availability of the service at regular intervals. Consult the GeoHealthCheck documentation for more information.