Building a data source
Introduction §
In this guide we will explain how to build a new data source, also known as a productizer, and how to publish it using the developer portal so it can be queried through the product gateway on the IOXIO Sandbox Dataspace. The steps are the same on all IOXIO Dataspaces, although some dataspaces might not have the particular data products mentioned here.
Prerequisites §
Choosing a data definition for your source §
You can use the Data definition viewer to find available data source definitions.
If there is no definition for the kind of data that you want to provide, you can create your own definition by following the How to create data definitions guide.
What are we building? §
In this guide we will build a productizer that provides an API matching the test/ioxio-dataspace-guides/Country/BasicInfo definition that we created in the How to create data definitions guide. As the definition is an OpenAPI spec, you might want to use some tool like the Swagger Editor to view it in a more human friendly format.
In this case it means we will create an API that accepts a POST request at the path /test/ioxio-dataspace-guides/Country/BasicInfo. It could for example be hosted at https://productizer.example.com/test/ioxio-dataspace-guides/Country/BasicInfo.
The POST request to that endpoint needs to accept a JSON payload, similar to this:
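```json
{
  "code": "FI"
}
```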
It needs to respond with a JSON payload that could look like this:
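For Finland, the response could look roughly like the following sketch; the exact fields are defined by the BasicCountryInfoResponse model in the definition, so the field names below are illustrative:

```json
{
  "code": "FI",
  "name": "Finland",
  "capital": {
    "name": "Helsinki"
  }
}
```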
If you already know how to build this using your own preferred tools and how to host it, you can skip ahead to the section where we add the data source using the developer portal.
Building a data source based on the example productizer §
We will use the FastAPI based example productizer as a starting point and just modify it to provide the country data instead. Feel free to fork the repository or just download the source code as an archive from GitHub to follow along.
General project setup and cleanup §
You most likely want to update the README.md to better describe your own data source.
If you intend to use Poetry to manage your Python dependencies, you should change the name of the project and the authors in pyproject.toml and run poetry install to install the dependencies, so you can then run the productizer locally with poetry run invoke dev. If you don't intend to use Poetry you can delete the file.
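For reference, the commands look like this when run from the project root:

```bash
# Install the dependencies and start the productizer locally
poetry install
poetry run invoke dev
```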
You might want to set up pre-commit for your project or get rid of the .pre-commit-config.yaml file.
If you don't intend to use Docker for your data source, you can also get rid of the Dockerfile, .dockerignore and the docker directory.
Adding models for the request and response §
Let's start by deleting the models related to the weather data (CurrentWeatherMetricRequest and CurrentWeatherMetricResponse) from app/models.py, as we won't need them.
Next, let's add the definitions for the request and response to app/models.py. We can directly copy the BasicCountryInfoRequest, Capital and BasicCountryInfoResponse classes from the final definition we created in the How to create data definitions guide. Note that we also need to add the necessary imports.
If you're building a data source for another definition, it's possible it was created using pydantic models, in which case you can retrieve those from within the /src directory in the definitions repository. If they are not available you will have to build them yourself. In that case the same guide can be handy, as well as the official pydantic documentation and the Request Body section of the FastAPI docs.
The app/models.py file would after these changes look like this:
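A minimal sketch of what it could contain, assuming the definition only carries a country code in the request and a few basic fields in the response (the actual field names, descriptions and examples come from the definition itself):

```python
from pydantic import BaseModel, Field


class BasicCountryInfoRequest(BaseModel):
    # Illustrative request model: a single country code field
    code: str = Field(
        ...,
        title="Code",
        description="ISO 3166-1 alpha-2 code for the country",
        example="FI",
    )


class Capital(BaseModel):
    # Illustrative nested model for the capital of the country
    name: str = Field(..., title="Name", description="The name of the capital", example="Helsinki")


class BasicCountryInfoResponse(BaseModel):
    # Illustrative response model; the real fields come from the definition
    code: str = Field(..., title="Code", description="ISO 3166-1 alpha-2 code for the country", example="FI")
    name: str = Field(..., title="Name", description="The name of the country", example="Finland")
    capital: Capital = Field(..., title="Capital", description="The capital of the country")
```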
Add logic to retrieve data §
We can delete the file app/openweathermap.py. You might find it a useful example in case your productizer will fetch data from some other system, but in our case we won't need it. Instead we will create the file app/datasource.py with the following content:
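A minimal sketch of what app/datasource.py could look like, with illustrative hardcoded values matching the example models above:

```python
from typing import Optional

# Hardcoded example data; in a real productizer this would be fetched from a
# database or some other backend system.
COUNTRY_DATA = {
    "FI": {"code": "FI", "name": "Finland", "capital": {"name": "Helsinki"}},
    "SE": {"code": "SE", "name": "Sweden", "capital": {"name": "Stockholm"}},
}


def get_data(code: str) -> Optional[dict]:
    """Return the basic info for a country, or None if the country is unknown."""
    return COUNTRY_DATA.get(code.upper())
```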
This contains just a hardcoded dictionary of different values and a simple wrapper to fetch them. In practice you would want to change this to actually somehow fetch or generate the necessary data. Note that you might also need to install some Python packages for connecting to and querying your database or other systems. The original implementation required an API_KEY and an API_ENDPOINT, which our data source does not require, so we can remove those from settings.py. Those can however be handy examples for your own implementation.
Adding the route §
Let's open the file app/routers/dataproduct.py. We start by emptying it to get rid of the old route. Then let's add our new route like this:
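A sketch of what the route could look like, assuming the router object is set up the same way as in the example productizer; the exact summary and description texts are up to you and your definition:

```python
from fastapi import APIRouter, HTTPException

from app.datasource import get_data
from app.models import BasicCountryInfoRequest, BasicCountryInfoResponse

router = APIRouter()


@router.post(
    "/test/ioxio-dataspace-guides/Country/BasicInfo",
    summary="Basic Country Info",
    description="Data Product for basic country info",
    response_model=BasicCountryInfoResponse,
)
async def basic_country_info(params: BasicCountryInfoRequest) -> BasicCountryInfoResponse:
    # Fetch the data for the requested country and fail with a 404 if it is unknown
    data = get_data(params.code)
    if not data:
        raise HTTPException(status_code=404, detail="Country not found")
    return BasicCountryInfoResponse(**data)
```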
We define the route for the path /test/ioxio-dataspace-guides/Country/BasicInfo, which matches the path of our definition and the path defined in the OpenAPI spec file. We also define which pydantic models we use for the request and the response, as well as some metadata for the route. We use our get_data function to retrieve the data, raising a 404 exception in case the country is not found.
Finally we wrap the data into a BasicCountryInfoResponse. We could just as well return the data as a dictionary and let FastAPI take care of the rest automatically using the definition in the response_model. The FastAPI documentation is really well written and describes topics such as the request body, response models and handling errors in great detail, so those resources are well worth a look if the explanation in this guide was too brief.
Deploy and host your productizer §
Deploy and host your productizer. For the next steps we will need the base URL at which it is responding, for example https://productizer.example.com/.
Register your data source in the Developer Portal §
Log in to the Developer Portal.
Ensure you have a group
In the menu navigate to My groups. If you don't yet have a group, create one. The name of the group should preferably relate to yourself, your company or the data source.
Add your data source
Use the menu to navigate to My data sources. Press the + ADD A DATASOURCE button. Under Data product definition, pick the definition you used for your productizer, and pick one of your groups in the group dropdown.
If you want to publish multiple data sources for the same definition using the same group, you can add a variant name to distinguish them from each other. Most likely you don't want to use this for your first data source.

In the Base URL field, enter the base URL at which your deployment is available. The help text shows where the API endpoint is expected to be available. The Base URL will not be displayed to other users and is only used by the product gateway to connect to the productizer; all other applications must connect through the product gateway.

For the purpose of this tutorial, let's select "private" for the visibility options. The filled in form would look like this:
Finally press the CREATE button to create the data source.
Test your data source §
You should now be able to test your own data source by querying it through the product gateway. From the list of your data sources, press EDIT next to the data source you just created. The developer portal will show you the address at which you can query your own data source as well as the X-Preview-Token necessary to use a data source until it has been published.
You can for example use the cURL command line tool to query it like this (make sure to replace the URL and X-Preview-Token with the ones shown to you in the developer portal, and change the data to match the expected payload of your own data source):
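A sketch of such a request, assuming the address and preview token have been copied from the developer portal:

```bash
# Replace the URL with the address shown in the developer portal for your data
# source, and the token with your own X-Preview-Token.
curl -X POST "https://gateway.example.com/test/ioxio-dataspace-guides/Country/BasicInfo" \
  -H "Content-Type: application/json" \
  -H "X-Preview-Token: <your-preview-token>" \
  -d '{"code": "FI"}'
```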
Note: as long as your data source is not published, you will need to use the X-Preview-Token header; once the data source is published the header should be left out.
If you make a request that causes the productizer to generate an unexpected response, like a 404 Not found message, the product gateway will respond with a 502 to indicate there was an error in the productizer.
Publish the data source §
When you've verified the data source works as intended, you can publish it by selecting it in the list, pressing the EDIT button, ticking the Published checkbox and pressing the UPDATE button to save the changes. When it's published, it will be listed to all users in the Available data sources section and it will no longer require an X-Preview-Token header to be queried.
Next steps §
If you created the data source definition under your own test/<your-own-name> namespace, you likely want to submit a pull request to copy it outside the test namespace and add a version, or copy it to the repository used for definitions in a production Dataspace. Note that you will also need to adjust your productizer to accept requests at the updated path, or ensure it accepts requests on multiple different paths.
When the definition has been published, you will need to add the data source once more, this time using the definition at the new path. If you want the data source to appear in the list of available data sources for other users of the Dataspace, also switch the radio button to Published.
You might want to clean up by submitting a pull request to remove the old data source definition, as well as deleting your old data source in the developer portal. Please note that this will make it impossible to query it, so make sure to update any applications that might be using it before deleting either one.
Due to this danger, deleting a data source is slightly hard to reach. Open the My data sources page in the developer portal and press OPEN next to the data source to reveal the EDIT button. Press it to go to the edit view, where you can press the DELETE button. You will still be prompted to confirm the deletion.