Our team belongs to the Management of Data, Information and Knowledge Group (MaDgIK) of the Department of Informatics and Telecommunications of the National and Kapodistrian University of Athens. Our work focuses on various aspects of knowledge harvesting, representation, reasoning and analytics and it is done under the supervision of Professor Manolis Koubarakis.
These web pages present our recent research and development activities. During the past few years, we have developed state-of-the-art technologies and tools for managing, processing, interlinking and visualizing big linked geospatial and temporal data. More specifically, we have worked on:
The above technological innovations have been deployed in many applications dealing with Earth Observation data e.g., data from the Sentinel satellites of the European Copernicus program.
Another recent contribution of our team is the representation and querying of legislation using ontologies and linked data. The system Nomothesi@ we have developed harvests Greek legislation from the National Printing Office and makes it available on the Web as linked data so it can be linked to other publicly available Greek open data to develop interesting applications for the public or the legal profession.
These efforts have been funded by the European research projects TELEIOS, LEO, Optique, Melodies, Big Data Europe, WDAqua, Copernicus App Lab and the research excellence grant SCARE from the Greek General Secretariat for Research and Technology. All our tools are state-of-the-art and we are constantly maintaining and enhancing them with new features. So stay tuned!
GeoTriples is a tool for transforming geospatial data from their original formats (e.g., shapefiles or spatially-enabled relational databases) into RDF.
Strabon is a spatiotemporal RDF store. You can use it to store linked geospatial data that changes over time and pose queries using two popular extensions of SPARQL (GeoSPARQL and stSPARQL). Strabon has been shown experimentally to be the most efficient spatiotemporal RDF store available today.
Do you want to get the most out of your relational data by combining them with linked geospatial data without converting them to RDF? No problem. Ontop-spatial can create virtual geospatial RDF graphs on top of your geospatial databases.
Silk is an open source framework for integrating heterogeneous data sources. We have extended the tool to allow the discovery of spatial and temporal links among datasets. This extra functionality is now part of the default distribution.
Sextant is a web-based and mobile-ready platform for visualizing, exploring and interacting with linked geospatial data. The core feature of Sextant is the ability to create thematic maps by combining geospatial and temporal information that exists in a number of heterogeneous data sources.
JedAI constitutes an open source, high scalability toolkit that offers out-of-the-box solutions for any data integration task, e.g., Record Linkage, Entity Resolution and Link Discovery. At its core lies a set of domain-independent, state-of-the-art techniques that apply to both RDF and relational data. These techniques rely on an approximate, schema-agnostic functionality based on (meta-)blocking for high scalability.
We would like to give to ordinary citizens, software developers and legal professionals the ability to have at their fingertips advanced ways of searching and understanding legislation. We showcase this by harvesting Greek legislation from the National Printing Office, making it available as linked data and interlinking with other open Greek data sets.
The change detection application is realized by two workflows. The change detection workflow uses satellite images to detect areas with changes in land cover or land use. The event detection workflow works in the opposite direction: it is triggered by news and social media information about important events.
The CORINE Land Cover dataset of year 2012 is provided by the European Environment Agency. The CORINE Land Cover project was initiated in 1985. Updates have been produced in 2000, 2006, and 2012. The dataset consists of an inventory of land cover for all the countries of Europe in 44 classes.
The Global Administrative Areas dataset contains information about the administrative boundaries of all areas in the world.
The Urban Atlas dataset provides pan-European comparable land use and land cover data for Functional Urban Areas, such as road network, services, utilities etc. It is a joint initiative of the European Commission Directorate-General for Regional and Urban Policy and the Directorate-General for Enterprise and Industry with the support of the European Space Agency and the European Environment Agency.
OpenStreetMap is a gazetteer that contains information about a wide variety of points of interest.
The EU-Hydro dataset is a photo-interpreted river network for the EEA39 countries derived from satellite imagery supplemented with ancillary data sources.
This dataset is provided by the Atmosphere Copernicus Service. We acquired it through the RAMANI OPeNDAP interface and have converted it into RDF. The dataset provides information for air quality, specifically observations for Nitrogen Dioxide (NO2), Ozone (O3) and UV emissions.
This global database of Leaf Area Indices (LAIs) is derived using input from the Moderate Resolution Imaging Spectroradiometer (MODIS) operational reflectance product. The LAI datasets were created by reprocessing the MODIS LAI products using a two-step integrated method.
You can consume Linked Data using HTTP requests. To get the results in a specific format, set the Accept header to the media type of the required format:
text/html (HTML table)
application/json OR application/geojson (GeoJSON)
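As a minimal sketch of this content negotiation, the Python snippet below builds an HTTP request whose Accept header selects one of the formats listed above. The endpoint URL and the short format names (`html`, `json`, `geojson`) are assumptions made for illustration; only the media types come from the list above.

```python
from urllib.request import Request

# Hypothetical endpoint, used only for illustration; replace it with the
# URL of the Linked Data resource you actually want to retrieve.
ENDPOINT = "http://example.org/linked-data/resource"

# Media types listed above, keyed by short names chosen for this sketch.
ACCEPT_BY_FORMAT = {
    "html": "text/html",                # HTML table
    "json": "application/json",         # GeoJSON
    "geojson": "application/geojson",   # GeoJSON
}


def build_request(url, result_format):
    """Return a GET request whose Accept header asks for result_format."""
    if result_format not in ACCEPT_BY_FORMAT:
        raise ValueError(f"unsupported result format: {result_format!r}")
    return Request(url, headers={"Accept": ACCEPT_BY_FORMAT[result_format]})


if __name__ == "__main__":
    request = build_request(ENDPOINT, "geojson")
    # Show the header that would be sent; no network call is made here.
    print(request.get_header("Accept"))
```

Passing the returned object to `urllib.request.urlopen` would send the request; that step is left out here because the endpoint above is a placeholder.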
We often present our work in tutorials that take place at international conferences. Some recent tutorials are the following:
Some particularly important and rich sources of open and free big geospatial data are the Earth observation programs of various countries, such as the Landsat program of the US and the Copernicus programme of the European Union. Earth observation data is a paradigmatic case of big data and the same is true for the information and knowledge extracted from it. Earth observation data (satellite images and in-situ data) and the information and knowledge extracted can be utilized in many applications with financial and environmental impact in areas such as emergency management, climate change, agriculture and security. This potential has not been fully realized up to now, because Earth observation data and the information extracted from it are “hidden” in various archives operated by NASA, ESA and national space agencies. Therefore, a user who would like to develop an application needs to search these archives, discover the needed data and information and integrate it into their application. In this tutorial we show how to “break these silos open” by publishing their data as RDF, enabling its discovery by modern search engines, interlinking it with other relevant data, and making it freely available on the Web to enable the easy development of geospatial applications. We present a complete data science pipeline that starts with Earth Observation datasets in various formats that are made freely available in the archives of space agencies like ESA and NASA, and ends with the deployment of an interactive visual application that uses Earth Observation data together with other collateral data (e.g., open government data, closed enterprise data, model data etc.) using linked data technologies. The tutorial will give in-depth coverage of the techniques, systems and applications of linked Earth observation data developed by the presenters in the last 8 years in the context of 5 European projects. Related work by other researchers will also be covered in depth.
Finally, open problems and directions for future research in this area will also be discussed.
This is a half-day tutorial and will be held on Friday, October 26th. A brief overview is provided below:
Part 1: Introduction. Satellite images. The Copernicus programme of the European Union. Copernicus data as a paradigmatic case of big data.
Part 2: Database techniques for satellite data. In this part of the tutorial, we will survey the state-of-the-art array DBMSs: MonetDB/SciQL, paradigm4/SciDB and rasdaman. We will concentrate on the capabilities and existing applications of these systems for processing remote sensing data.
Part 3: Knowledge discovery from satellite images. In this part of the tutorial, we cover remote sensing literature that studies pattern recognition and machine learning techniques for extracting knowledge (e.g., land cover classes) from satellite images.
Part 4: RDF and SPARQL extensions for geospatial and temporal data. In this part of the tutorial, we will first discuss data models and query languages for geospatial and temporal extensions of RDF concentrating on the data model stRDF and the query language stSPARQL developed by our group, the Open Geospatial Consortium (OGC) Standard GeoSPARQL and the extension of RDF for representing incomplete information RDFi. These spatiotemporal extensions of RDF can be used for encoding the knowledge extracted from satellite images using the techniques covered in Part 3 of the tutorial together with other collateral data (e.g., the administrative divisions of a certain country, OpenStreetMap data etc.).
Part 5: Spatiotemporal RDF stores. In this part of the tutorial, we will present Strabon, Ontop-spatial and their competitor systems (Parliament, uSeekM, GraphDB, AllegroGraph, Virtuoso, Stardog and Oracle Spatial and Graph 12c) and a recent functional and performance comparison of them using the benchmark Geographica. We will also discuss open problems such as how to scale these systems to big data and how to represent and query raster data in the linked data paradigm.
Part 6: Interlinking geospatial and temporal RDF data. In this part of the tutorial, we will discuss work on geospatial entity resolution and more recent work on the discovery of geospatial relations with systems such as Silk and Radon.
Part 7: Searching, browsing, exploring and visualizing remote sensing data and linked spatiotemporal data. In this part of the tutorial, we will first discuss remote sensing techniques for content-based retrieval from satellite image archives. We will also present our tool Sextant for visualizing linked spatiotemporal data, and its use in an environmental application from the project Copernicus App Lab.
Big linked spatiotemporal data tools to be covered in the tutorial.
Konstantina Bereta is a Research Associate in the Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens, and she holds a BSc. and MSc. from the same department. She is also a PhD candidate under the supervision of Prof. Manolis Koubarakis (expected date of graduation: Fall 2018). She has worked as a scientific programmer and research associate in several EU FP7 projects. Her research interests focus on the areas of spatiotemporal databases, Semantic Web and Cloud Computing.
Stefan Manegold is the lead of the Database Architectures group of CWI and a Professor at Leiden University. He is a nationally and internationally recognized expert in system-oriented database research. He is particularly known for his pioneering work on hardware-conscious database technology, and for disseminating his research via the open-source columnar analytical database management system MonetDB, which is widely used in academia and business. Dr. Manegold’s research is focused on bridging the gap between database architectures and demanding application areas, such as large-scale data analytics (Big Data), data intensive scientific discovery (eScience), and semantic web. His expertise comprises database architectures, query processing algorithms, and data management technology, with a particular focus on hardware-conscious algorithms and data structures, query optimization, scalability, performance, benchmarking and testing.
Manolis Koubarakis is a Professor in the Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens. He previously held positions at the Dept. of Electronic and Computer Engineering, Technical University of Crete (Assistant and Associate Professor), the Dept. of Informatics, University of Athens (Visiting Researcher), the Dept. of Computation, UMIST (now University of Manchester) (Lecturer) and the Dept. of Computing, Imperial College, London (Research Associate). He has published more than 180 papers that have been widely cited in the areas of Artificial Intelligence (especially Knowledge Representation), Databases, Semantic Web and Linked Data. In 2015, he was elected Fellow of the European Association of Artificial Intelligence (EurAI). He has served on the program committees of various international conferences and workshops, and he has organized various international events. He has attracted more than 6M Euros in funding from the European Commission, the Greek General Secretariat for Research and Technology, the European Space Agency and industry sources.
George Stamoulis is a Research Associate in the Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens, and a PhD candidate under the supervision of Prof. Koubarakis. He holds a BSc. and MSc. from the Department of Informatics and Telecommunications of the National and Kapodistrian University of Athens. His research interests focus on the areas of Semantic Web, Data Visualization and Integration and User Interfaces.
Begüm Demir is a Professor and Chair of the Remote Sensing Image Analysis (RSiM) group at the Faculty of Electrical Engineering and Computer Science, Technische Universität Berlin (TU Berlin), Germany. Before joining TU Berlin, she was an Assistant Professor at the Department of Computer Science and Information Engineering, University of Trento, Italy, from 2013 to 2017, and in 2017 she became an Associate Professor at the same department. Her main research interests include machine learning and big data management with applications to remote sensing image analysis. She was a recipient of an ERC Starting Grant with the project “BigEarth - Accurate and Scalable Processing of Big Data in Earth Observation” in 2017 and the IEEE Geoscience and Remote Sensing Society Early Career Award in 2018. She has been a senior member of IEEE since 2016.
The Web of data has recently been populated with linked geospatial data as various geospatial data sources have been transformed into RDF and added to the linked data cloud (e.g., Geonames, OpenStreetMap, CORINE land cover etc.). Therefore, it is important to study how to represent geospatial data in RDF and how to query it using SPARQL. Researchers from the areas of Semantic Web and Linked Data have studied these problems recently. The results of this research have been the development of geospatial extensions of RDF and SPARQL, and the implementation of geospatial RDF stores. In this tutorial, we present a comparative survey of current research in this area and point to directions for future work.
This is a half-day tutorial and was held on Saturday, October 21st, in the afternoon. A brief overview is provided below:
Part 1: Introduction
Part 2: Data models and query languages for linked geospatial data. We survey the recent geospatial extensions of RDF and SPARQL concentrating on the OGC standard GeoSPARQL and our own language stSPARQL. We also discuss proposals for geospatial Ontology Based Data Access (OBDA) with more emphasis on the OBDA framework of our system Ontop-spatial. Finally, we discuss the problem of querying incomplete geospatial information expressed using Semantic Web standards.
Part 3: Implemented Systems. In this part of the tutorial, we present systems for storing and querying linked geospatial data. We distinguish these implementations into two categories: geospatial RDF stores and geospatially-enabled OBDA systems. We will describe the architectures of the surveyed systems and we will compare them in terms of functionality and performance. Demos of Strabon and Ontop-spatial will also be given by the presenters.
Part 4: Open issues. This last part of this tutorial will be dedicated to the discussion of open issues. We will point out open problems in the area of data models and query languages, and we will discuss how we can improve the performance of GeoSPARQL query engines, both native and OBDA. For the latter problem, we will discuss how state-of-the-art approaches in the area of big geospatial data, and big RDF data query processing can be used to improve the performance of existing GeoSPARQL query engines.
1. Github repository with examples (software, datasets, etc.) that will be used in the tutorial.
Manolis Koubarakis is a Professor in the Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens. He is a EurAI fellow. He previously held positions at the Dept. of Electronic and Computer Engineering, Technical University of Crete (Assistant and Associate Professor), the Dept. of Informatics, University of Athens (Visiting Researcher), the Dept. of Computation, UMIST (now University of Manchester) (Lecturer) and the Dept. of Computing, Imperial College, London (Research Associate). He has published more than 170 papers that have been widely cited in the areas of Artificial Intelligence (especially Knowledge Representation), Databases, Semantic Web and Linked Data. He currently teaches the following university courses: Artificial Intelligence, Knowledge Technologies and Data Structures and Programming Techniques. His research has been funded by the European Commission, the Greek General Secretariat for Research and Technology and industry sources.
Konstantina Bereta is a research associate at the Management of Data, Information and Knowledge group, in the Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens. She is also a PhD candidate under the supervision of Prof. Manolis Koubarakis and holds a BSc. and MSc. from the same department. She has worked as a scientific programmer and research associate in several EU FP7 projects. Her research interests focus on the areas of spatiotemporal databases, Semantic Web and Cloud Computing. She is the developer of Ontop-spatial, a geospatial extension of the system Ontop, which is currently the most efficient OBDA solution for geospatial information.
On September 6, 2012 Manolis Koubarakis, Kostis Kyzirakos and Charalampos Nikolaou presented an invited tutorial on data modeling, querying and reasoning for linked geospatial data in the Reasoning Web Summer School which took place in Vienna.
The topics covered in the tutorial together with the relevant slides are given below:
1. Introduction [pdf]
2. Background in geospatial data modeling [pdf]
3. Geospatial data in RDF - stSPARQL [pdf]
4. Geospatial data in RDF - GeoSPARQL [pdf]
5. Implemented RDF Stores with geospatial support [pdf]
6. Geospatial information with description logics, OWL and rules [pdf]
7. Conclusions, questions, discussion [pdf]
There is also an accompanying tutorial paper:
For some more interesting tutorials given at the Reasoning Web Summer School see http://www.kr.tuwien.ac.at/events/rw2012.
In this tutorial we survey the state of the art in data models, query languages, implemented systems and applications of linked geospatial data. Many kinds of geospatial data are becoming available as linked datasets given the proliferation of geospatial information on the Web (e.g., Google and Bing maps, user-generated geospatial content etc.). The topic of the tutorial is related to all core research areas of the Semantic Web (e.g., semantic information extraction, data modeling and ontologies, querying, reasoning, implemented systems etc.) since there is often a need to re-consider existing core techniques when we deal with geospatial information. Thus, it is timely to train Semantic Web researchers, especially the ones that are in the early stages of their careers, on the state of the art of this area and invite them to contribute to it.
We have recently witnessed a proliferation of geospatial data on the Web. In addition to professionally-produced material being offered for free (e.g., Google or Bing maps), the public has also been encouraged to make geospatial content, including their geographical location, available online. The volume of such geospatial Web content is already big and constantly growing.
Semantic Web researchers and practitioners have also started to make geospatial data available as linked data (e.g., Ordnance Survey, Great Britain's national mapping agency, makes some of its geospatial data available as linked data (http://data.ordnancesurvey.co.uk/.html), the portal LinkedGeoData makes OpenStreetMap data available as RDF (http://linkedgeodata.org/), etc.). Since a lot of data useful to the wider public is geospatial (e.g., open government data), we expect this trend to continue in the near future.
In this tutorial we will present the state of the art in data models, query languages and implemented systems for linked geospatial data i.e., geospatial data expressed in RDF.
The tutorial is targeted towards Semantic Web researchers in the early stages of their career. The prerequisite is good knowledge of RDF and SPARQL and some knowledge of other Semantic Web technologies (OWL, RDF stores, linked data). Knowledge of geospatial technologies is not a prerequisite and will be covered in some depth.
15:30-16:00 Coffee Break
17:15 - 17:30 Demo of Strabon
Manolis Koubarakis is a Professor in the Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens. He has a degree in Mathematics from the University of Crete, an M.Sc. in Computer Science from the University of Toronto, and a Ph.D. in Computer Science from the National Technical University of Athens. He joined his current department in September 2005 as an Associate Professor and was promoted to Professor in April 2011. Before coming to Athens, he was an Assistant and Associate Professor in the Dept. of Electronic and Computer Engineering, Technical University of Crete, and a Lecturer in the Dept. of Computation, University of Manchester – Institute of Science and Technology (UMIST). Manolis has published more than 100 papers that have been widely cited in the areas of Artificial Intelligence (especially Knowledge Representation), Databases, Semantic Web and P2P Computing. His research has been financially supported by the European Commission (projects CHOROCHRONOS, DIET, BRIDGEMAP, Evergrow, OntoGrid, SemsorGrid4Env and TELEIOS), the Greek General Secretariat for Research and Technology and industry sources (Microsoft Research and British Telecommunications). He is currently co-ordinating project TELEIOS (http://www.earthobservatory.eu/) which is building an Earth Observatory using a combination of technologies based on semantics (geospatial extensions of RDF and SPARQL) and array extensions of SQL. Manolis has 16 years of teaching experience in academic institutions in Greece and the United Kingdom, and has given many talks at international conferences and workshops (some of them invited). He has served as Tutorial chair for ESWC 2011.
Kostis Kyzirakos is a researcher in the Department of Informatics and Telecommunications, University of Athens. He received his Diploma in Engineering from the School of Electrical and Computer Engineering, NTUA, Athens. He has participated in projects funded by the European Commission (Ontogrid, SemsorGrid4Env, TELEIOS) and the Greek General Secretariat for Research and Technology (P2P Techniques for Semantic Web Services). He is one of the main developers of the semantic geospatial DBMS Strabon that was developed in the context of the EU projects SemsorGrid4Env and TELEIOS. In the same context, he studied and proposed how to represent and query geospatial data in the Semantic Web, published various geospatial datasets as linked geospatial data and implemented applications combining these data with previously published linked geospatial data. His current research focuses on modeling and querying semantic spatio-temporal information on top of traditional DBMS. He has given a tutorial on building semantic sensor webs and applications at ESWC 2011.
Manos Karpathiotakis is a researcher in the Department of Informatics and Telecommunications, University of Athens. He received his Bachelor's degree and his Master of Science from the Department of Informatics and Telecommunications of the University of Athens. He has participated in projects funded by the European Commission (TELEIOS, SemsorGrid4Env) and he is one of the main developers of the semantic geospatial DBMS Strabon that was developed in the context of these projects. In the same context, he published various geospatial datasets as linked geospatial data and implemented applications combining these data with previously published linked geospatial data. His current research focuses on the overlapping areas of Geospatial Semantic Web, Semantic Sensor Web and Linked Data.
Research Assistant, PhD student
Research Assistant, PhD student
Research Assistant, PhD student
Research Assistant, MSc student
Research Assistant, MSc student
Research Assistant, PhD student
Research Assistant, MSc student
Research Assistant, MSc student
Research Assistant, MSc student