Open data in FME – case study

By August 12, 2020 FME

The amendment of the Land Surveying and Cartography Act and the entry into force of the “Anti-Crisis Shield 4.0” Act have made more open data available.

We use different kinds of data every day: statistical data, geographic data, and so on. They come from different sources and in various formats, each supported only by selected systems. Using such data can be problematic, because it sometimes requires submitting an access request and paying access fees. This usually applies to technical data used for geospatial analyses or GIS product development. Luckily, most of the data is shared under free, open licenses.

Where to find open data to download?

Some data has been available for free in Poland for a long time. It usually comes from sources such as local governments (voivodeships, provinces, counties), surveying and mapping services, or Statistics Poland. As mentioned before, the recent growth in the amount of open data is a result of changes in the law that aim to counteract the negative effects of the COVID-19 pandemic. Thanks to the “Anti-Crisis Shield”, orthophotomaps and control points of the basic geodetic network are now shared free of charge.

How to download open data?

Most open data services let you download data in the classic way. This means choosing a particular database and area of interest and then downloading a package in which the data can be saved in various formats. However, it is increasingly common to use the APIs of these services instead.

Both approaches have their pros and cons. The first method is worth using when the data is downloaded only once, for a particular area, and can be archived. The second method, using an API, is more universal. It lets you download data for different areas without having to save it on your computer.

Which programs allow you to work with open data?

It doesn’t matter which open data service you use, which download method you choose, or which format the data is saved in. The FME platform can load locally saved data in any format, and it can also download data directly from services through their shared APIs. In the next part of this article, we will show how to use FME to download open data directly, using orthophotomaps and land register data as examples.

Orthophotomaps – how to automatically download large amounts of data?

On the 24th of June, the Head Office of Geodesy and Cartography announced that everyone is now free to download basic geodetic network data and orthophotomaps without any limits. An orthophotomap is a cartographic image of a chosen area, made from photos that have been processed into a particular coordinate system. The photographs used to make an orthophotomap are taken from altitude, for example aerial photos or drone photos.

Let’s use raster data. Visit the Geoportal website and choose the “Data for download” and “Ortofotomapa” layers. You will have two options for data selection: by currentness (ortofotomapa wg aktualności) and by spatial resolution (ortofotomapa wg rozdzielczości). You will then see the index divided into map sheets, each labelled with its symbol. When you select a location, you see all the orthophotomaps available for that point, or more precisely for the “bbox” used to search for data.

Each result has a detailed technical description of the particular image and, most importantly, a link for downloading the data. The beginning of the URL is constant, but the rest of the path, which points to a specific repository, may vary.

The beginning of the URL and the map symbol shouldn’t cause any problems. However, because of the internal structure of the rest of the repository path, the exact download link is shown only after you select a particular image. Luckily, it’s easy to find the query that gathers all the data into one window. When you open the “Network” tab of the browser’s developer tools (F12 key), you can find the request sent to the GetFeatureInfo service of the National Geoportal.

The main parts of the query are the correct reference to the layers (the index of map sheets) and the spatial query, the “bbox”. It may look as if you only mark a point when selecting a sheet on Geoportal, but the algorithm transforms it into a rectangle or square and performs the spatial query on that. As a result, you get a window with the results.
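To make the point-to-rectangle step concrete, here is a minimal Python sketch (outside FME) of how such a request could be assembled. The endpoint and the layer name passed in are placeholders, not the real Geoportal values. The parameter names are the standard WMS 1.1.1 GetFeatureInfo ones, and the easting-first axis order is an assumption that should be checked against the request captured in the browser.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): replace it with the address of the
# sheet-index WMS copied from the browser's Network tab.
WMS_URL = "https://example.invalid/wms/orthophoto-index"


def point_to_bbox(easting, northing, half_size=1.0):
    """Return (min_e, min_n, max_e, max_n) of a square centred on the point."""
    return (easting - half_size, northing - half_size,
            easting + half_size, northing + half_size)


def get_feature_info_url(easting, northing, layers,
                         half_size=1.0, base_url=WMS_URL):
    """Build a GetFeatureInfo URL that asks about the centre pixel of the square."""
    bbox = point_to_bbox(easting, northing, half_size)
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetFeatureInfo",
        "LAYERS": layers,
        "QUERY_LAYERS": layers,
        "SRS": "EPSG:2180",
        "BBOX": ",".join(f"{value:.2f}" for value in bbox),
        # A 3x3 pixel image whose middle pixel (1, 1) is the clicked point.
        "WIDTH": 3,
        "HEIGHT": 3,
        "X": 1,
        "Y": 1,
        "INFO_FORMAT": "text/html",
    }
    return f"{base_url}?{urlencode(params)}"
```

Keeping the square small mimics a single click; a larger `half_size` would return every sheet that intersects a wider area.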

The functionality presented above can be reproduced in FME. This speeds up the automatic downloading of data for Poland using particular parameters, not only currentness and spatial resolution. The script we prepared for this purpose offers the following options:

  • Choosing your own spatial query, the so-called “bbox”, given as the coordinates of the query rectangle or square in the 1992 system (EPSG:2180) in the “y_min,x_min,y_max,x_max” format. If the value is left empty, the script falls back to the sample data described below.
  • Using the sample list of spatial queries, generated from the centroids of individual sheets in the division grids of DTM (digital terrain model) or orthophotomap sheets.
  • Choosing the orthophotomap’s year (currentness) and its resolution at the same time.
[Figure: translation parameter values]
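The bbox option described above can be sketched in a few lines of Python. This is only an illustration of the parameter handling, not the FME workspace itself: the function names and the default coordinates are made up, and only the “y_min,x_min,y_max,x_max” format comes from the script’s description.

```python
# Made-up sample query used when the user leaves the parameter empty.
DEFAULT_BBOXES = ["500000,250000,500100,250100"]


def parse_bbox(text):
    """Parse 'y_min,x_min,y_max,x_max' into four floats and check the ordering."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    y_min, x_min, y_max, x_max = (float(part) for part in parts)
    if y_min >= y_max or x_min >= x_max:
        raise ValueError("minimum values must be smaller than maximum values")
    return y_min, x_min, y_max, x_max


def resolve_bboxes(user_value, defaults=DEFAULT_BBOXES):
    """Use the user's bbox if one was given, otherwise the sample list."""
    if user_value and user_value.strip():
        return [parse_bbox(user_value)]
    return [parse_bbox(item) for item in defaults]
```

Validating the ordering early avoids sending a request that the service would answer with an empty result.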

Two HTTPCaller requests form the core of the script. The first one imitates the Geoportal selection query, and the second one downloads the data directly using a link from the opendata.geoportal.gov.pl repository.
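The same two-step pattern can be sketched in Python for readers who want to see the logic outside FME. It assumes that the first response contains plain download links pointing at opendata.geoportal.gov.pl; the exact structure of the real response is not reproduced here, and the function names are illustrative.

```python
import re
from urllib.request import urlopen

# Any link into the open data repository; stops at whitespace, quotes or tags.
LINK_PATTERN = re.compile(r"""https://opendata\.geoportal\.gov\.pl/[^\s"'<>]+""")


def http_get(url):
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=60) as response:
        return response.read()


def download_orthophotos(query_url, fetch=http_get):
    """Step 1: run the index query. Step 2: download every file it links to."""
    listing = fetch(query_url).decode("utf-8", errors="replace")
    links = sorted(set(LINK_PATTERN.findall(listing)))
    return {link: fetch(link) for link in links}
```

Passing `fetch` as a parameter keeps the function testable without network access. For large rasters you would stream each file to disk instead of holding it in memory.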

The query string itself is assembled with StringConcatenator:

[Figure: StringConcatenator settings]

After you receive the result, extract the desired attributes using regular expressions and then clean them up. StringSearcher and AttributeCreator, together with AttributeTrimmer, are the most useful transformers here.
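As a rough Python equivalent of the StringSearcher and AttributeTrimmer steps, the sketch below pulls `name:"value"` pairs out of a response fragment and trims the values. The `name:"value"` layout is an assumption made for illustration; adjust the pattern to whatever the real response looks like.

```python
import re

# Matches pairs such as  url:"https://..."  (assumed layout of the response).
FIELD_PATTERN = re.compile(r'(\w+)\s*:\s*"([^"]*)"')


def extract_attributes(fragment):
    """Return a dict of attribute names to whitespace-trimmed values."""
    return {name: value.strip() for name, value in FIELD_PATTERN.findall(fragment)}
```

Each dictionary produced this way corresponds to one feature with cleaned attributes, ready to be filtered or written out.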

Finally, the selected orthophotomaps are saved in TIFF format in the folder that contains the .fmw file, together with the full attribute description that you can find on Geoportal. The description is saved in CSV format.

[Figure: HTTPCaller settings]

The script offers many possibilities. After using it for a while, you will notice how easy it is to add further parameters, such as a specific sheet symbol, the type of color composition, or the exact number of the reported work. It’s also simple to download all available archival data from the “bbox” without filtering. All you have to do is delete the “Tester” responsible for the parametrization or modify the code. We encourage you to try downloading other data available from Geoportal, such as BDOT10k, in the same way.
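The role of the “Tester” can be illustrated with a small Python filter. The record keys (`year`, `resolution`) are hypothetical names chosen for this sketch; a filter that is left unset simply lets everything through, which corresponds to removing the Tester.

```python
def matches(record, year=None, resolution=None):
    """Return True if the record passes every filter that was actually set."""
    if year is not None and record.get("year") != year:
        return False
    if resolution is not None and record.get("resolution") != resolution:
        return False
    return True


def select(records, **filters):
    """Keep only the records that pass the given filters."""
    return [record for record in records if matches(record, **filters)]
```

Adding another criterion, such as a sheet symbol, means adding one more optional argument and one more comparison.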

Download the script with sample data here (FME 2020.0, Build 20252).

The Land and Property Register – can you automatically download data about land parcels?

At the time of writing this article, a fully open dataset for the Land and Property Register doesn’t exist yet. However, that doesn’t mean the data isn’t available. In this part we will show you how to integrate FME with:

  • The Land Parcel Location Service (ULDK)
  • The National Integration of the Land Register
  • The National Integration of Terrain Utilities

All the services mentioned above share and deliver data in a more or less open form. They have some limitations. For example, the Land Parcel Location Service forbids harvesting: after a certain number of queries sent from a single IP address, the service is blocked for several hours. However, a few thousand queries do not cause this problem. The same applies to the “GetFeatureInfo” function of the National Integration of the Land Register. The National Integration of the Land Register and the National Integration of Terrain Utilities allow you, among other things, to download a “block” of cadastral data. You can select several layers:

[Figure: ULDK 1]

The Land Parcel Location Service (ULDK) is a service that locates land parcels precisely. It can search by several criteria:

  • Precise parcel identifier
  • Parcel number and district
  • Coordinates

A query can return the geometry in WKB or WKT format together with descriptive attributes.

It’s easy to send such a query with HTTPCaller and download the desired data, so FME can work with these services too. The prepared script implements the following scenario. First, a “block” of data containing the land and property register is downloaded using WMS. Next, the parcels’ geometries are extracted using map algebra. Lastly, the centroids of the resulting parcels are used to query the Land Parcel Location Service (ULDK). This gives you precise parcel geometries with descriptive attributes such as TERYT records, the parcel number, or the precise location including the administrative division.

First, the study area is extracted from the National Register of Boundaries data and temporarily saved in the script’s folder.

[Figure: ULDK 3]

This type of data can be divided into buildings and parcels using map algebra methods. The result won’t be fully accurate, but it is good enough for querying the Land Parcel Location Service (ULDK). This is the result of the “CAN WE MAKE SOMETHING OUT OF THIS – MAP ALGEBRA” part of the script, together with the centroids:

[Figure: ULDK 4]

Finally, the centroid coordinates are sent to the Land Parcel Location Service (ULDK) using HTTPCaller:

[Figure: ULDK 5]
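For illustration, the centroid step and the request can be sketched in Python. The centroid uses the standard shoelace formula. The endpoint address, the `GetParcelByXY` request name, and the `result` field names reflect our understanding of the public ULDK interface and should be treated as assumptions to verify against the service documentation, as should the coordinate order and reference system.

```python
from urllib.parse import urlencode

# Assumed public endpoint of the service; verify before use.
ULDK_URL = "https://uldk.gugik.gov.pl/"


def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross  # twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3 * area2), cy / (3 * area2)


def uldk_url(x, y, result="geom_wkt,teryt,parcel", base_url=ULDK_URL):
    """Build a parcel-by-coordinates query (parameter names are assumptions)."""
    params = {"request": "GetParcelByXY", "xy": f"{x},{y}", "result": result}
    return base_url + "?" + urlencode(params, safe=",")
```

Note that the centroid of a concave parcel can fall outside the parcel itself, in which case the service would return a neighbouring parcel; a point guaranteed to lie inside the polygon is a safer choice for such shapes.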

Now the service’s response has to be split on its delimiters, and the geometries have to be extracted from the attribute. In the WKB case, remember to change the field encoding from UTF-8 to binary. The result matches the goal stated above: precise parcel geometries with descriptive attributes for the study area.
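A minimal Python sketch of this parsing step follows. It assumes a response whose first line is a status code (`0` for success), followed by one record per line with fields separated by `|`, and a geometry delivered as hexadecimal WKB, optionally prefixed with `SRID=...;`. These layout details are assumptions to check against real responses; the header layout read by `wkb_geometry_type` is the standard WKB one.

```python
import struct


def parse_response(text, fields):
    """Split the response into records; raise if the status line is not '0'."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "0":
        raise ValueError(f"service reported an error: {lines[:1]}")
    return [dict(zip(fields, line.split("|"))) for line in lines[1:]]


def decode_wkb_hex(value):
    """Turn hex text (optionally prefixed 'SRID=...;') into raw WKB bytes."""
    hex_part = value.split(";", 1)[-1]
    return bytes.fromhex(hex_part)


def wkb_geometry_type(wkb):
    """Read the geometry type code from the WKB header (3 means Polygon)."""
    byte_order = "<" if wkb[0] == 1 else ">"
    (code,) = struct.unpack(byte_order + "I", wkb[1:5])
    return code
```

The hex-to-bytes conversion is the Python counterpart of switching the field from UTF-8 text to binary before the geometry is built.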

The FME script is available here. We encourage you to try it out!

The future of open data

Further legal amendments on data sharing are just a matter of time. The way the data is shared isn’t a problem, since FME can use any API or service to download it. Even if data is shared only partially, it’s possible to work around the problem and get a satisfying result, as shown in the examples above.

Now it’s your turn!

If you want to use open data in your projects, but don’t know how to do it with FME – contact us! We will be glad to help you. Visit our FME website to get more details about data integration!