Research Projects

January 17, 2017

The SOM team is currently working on the following funded projects. Get in touch if you’re especially interested in any of them.

Open Data for All – RETOS Spanish National Project 2017-2020

The goal of the project is to make the promise of open data a reality by giving non-technical users tools they can use to find and compose the information they need.

More and more data becomes available online every day, coming from both the public sector and private sources. As an example, the European Data Portal registers over 400,000 public datasets online. Most of this data is available in some kind of (semi)structured format (XML, RDF, JSON, …) which, in theory, facilitates its consumption and combination. Indeed, the open data movement promises to bring to the fingertips of every citizen all the data they need, whether it is for planning their next trip or for government oversight.
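As a tiny illustration of what consuming such (semi)structured data looks like, the sketch below parses a made-up JSON dataset description (loosely inspired by DCAT-style catalog metadata; every name and value here is invented) and checks whether any of its resources is in a machine-readable format:

```python
import json

# A tiny, made-up dataset description, similar in spirit to the JSON
# metadata that open data portals expose for each dataset.
record = json.loads("""
{
  "title": "Air quality measurements",
  "publisher": "City of Barcelona",
  "resources": [
    {"url": "https://example.org/air.json", "mediaType": "application/json"}
  ]
}
""")

def machine_readable(rec):
    """True if at least one resource uses a (semi)structured media type."""
    structured = {"application/json", "application/xml", "application/rdf+xml"}
    return any(r.get("mediaType") in structured for r in rec.get("resources", []))

print(machine_readable(record))  # True for this sample
```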

Unfortunately, this is still far from reality. Our society is opening its data but not building the technology and infrastructure required to enable citizens to access and manipulate it. Only technical people have the skills to consume these heterogeneous data sources, while everyone else is forced to depend on third-party applications or companies.

This research project aims to change this. Our goal is to empower all citizens to exploit and benefit from open data, helping them become not only consumers but also creators of data that adds new value to our society. In this sense, the project will automatically infer a unified global schema of the knowledge available in open data sets and present that schema to the average citizen in a way she can easily browse and query to get the information she needs. This request will then be transparently translated into a combined sequence of accesses to the required data sources to retrieve, visualize and republish the data (if desired). When several data sources could be used (e.g. due to an overlap in the exposed data), quality aspects of the sources or even monetary costs (some sources may be only partially free) will be taken into account to provide an optimal solution.

To achieve this ambitious goal, the project will pursue the following key research contributions:

  • APIfication of data sources: (Web) APIs are becoming the de facto choice for publishing content online. We will unify access to all kinds of data sources via an API interface.
  • Schema discovery: Most sources won’t have any kind of formal description we could use to precisely understand what information the source provides. A systematic analysis of data samples will help us infer that schema, enriched with annotations on quality aspects (e.g. reliability, availability, etc.) to better characterize the data source.
  • Schema composition: Individual schemas will be matched and merged to create the global schema representing all available knowledge.
  • Citizen languages: Human-computer interaction techniques will be used to build a user-friendly language to express and visualize information requests on this global schema.
  • Query resolution: Each request will be translated into an optimal sequence of API calls on the underlying data sources to retrieve the data needed to respond to the request.
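To give a flavor of the schema discovery step above, here is a deliberately naive Python sketch (the samples and field names are invented) that infers a field-to-type mapping from a handful of JSON-like records; the actual project would also need to handle nesting, optional fields and the quality annotations mentioned above:

```python
def infer_schema(samples):
    """Infer a field -> list-of-type-names mapping from JSON-like records."""
    schema = {}
    for record in samples:
        for field, value in record.items():
            # Record every type observed for this field across the samples.
            schema.setdefault(field, set()).add(type(value).__name__)
    return {field: sorted(types) for field, types in schema.items()}

# Invented data samples; note that 'pm10' only appears in some records,
# which real schema discovery would flag as an optional field.
samples = [
    {"station": "BCN-01", "no2": 41.0},
    {"station": "BCN-02", "no2": 38.5, "pm10": 22.1},
]
print(infer_schema(samples))
# {'station': ['str'], 'no2': ['float'], 'pm10': ['float']}
```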

The results of this project will have a huge impact on our society by finally giving all citizens unrestricted access to the massive amounts of open data available online. This will also benefit data providers, which could reach a broader audience, and software companies, which will now have a simpler way to build new applications exploiting the links among a diversity of datasets. These benefits will be validated by means of case studies on open data sets provided by the city of Barcelona and the governments of Catalonia and the Canary Islands, implemented on top of an open source platform released by the project.


The following figure illustrates the proposed approach.


Open Data for All: global schema


Gamification for Modeling Tools – CEA – 2017

The goal of the project is to apply game design elements in non-game contexts (“gamification”) to engage more external developers and end-users on Papyrus.

Model Driven Engineering (MDE) has been widely recognized as a major advance in the design and development of complex systems that must respond to rapidly evolving target platforms and increasing functional complexity. With the emergence of large collaborative environments such as Eclipse projects, it becomes possible to capitalize on technologies to provide efficient tooling to support MDE activities.

Papyrus, a stable and powerful Open Source UML/SysML tool suite, goes in this direction. It helps MDE designers in the maintenance and evolution of systems by providing tooling support close to the process practices and concepts used in the application domain. Furthermore, being based on Eclipse, it benefits from worldwide visibility, reflected in the activity on the Eclipse forum and social media (e.g., YouTube, Twitter).

Despite these favorable conditions for becoming a very popular platform for both industrial and research activities around MDE, Papyrus has not yet managed to build a stable, large and self-motivated community of external developers and end-users who actively contribute to the advancement of the platform.

Following the success obtained on several platforms (Foursquare, Jenkins, Jira, Visual Studio, etc.), the objective of this project is to apply game design elements in non-game contexts (“gamification”) to engage more external developers and end-users on Papyrus. Depending on the level and type of activity performed by end-users and developers, we could offer rewards ranging from digital badges and physical goodies to trips and perhaps even internship or job offers.


Gamification for end-users

When it comes to using Papyrus for educational purposes or by basic end-users, the tool might appear too detailed and powerful. A recent initiative, Papyrus for Education, tackles this issue and provides a set of mechanisms to adapt the tool to the user’s needs and expertise. Building on this initiative, and to make the tool even more attractive for newcomers, the objective is to build a set of modeling-related games (based on quizzes and quick exercises) on top of Papyrus that can be used to self-assess, in a playful way, the user’s modeling knowledge and familiarity with the tool and the UML/SysML languages. We will also explore how these tests could be personalized by monitoring the user’s activity on the platform to detect which errors s/he makes and which parts of the platform s/he is not exploring.
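The personalization idea could be sketched as follows; this is a hypothetical illustration (the error categories and the threshold are invented, and real monitoring data would come from the Papyrus platform itself), showing how observed errors might drive which quiz topics are proposed to a user:

```python
# Hypothetical monitoring output: how many times the user made an error
# in each modeling area (categories and counts are made up).
observed_errors = {"associations": 5, "state machines": 1, "profiles": 3}

def topics_to_quiz(errors, threshold=2):
    """Suggest quiz topics for error categories seen at least `threshold` times."""
    return sorted(topic for topic, count in errors.items() if count >= threshold)

print(topics_to_quiz(observed_errors))  # ['associations', 'profiles']
```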

Gamification for developers

Monitoring and analyzing the activity of the community around a software project is paramount to assessing its health and resilience. This requires gathering data on the community’s activity across different online sites and platforms. In particular, for this project we will harvest data from the Papyrus tools (e.g., the Papyrus Git repository, GitHub, Bugzilla and the Eclipse forums). We will then define several metrics on this data to determine each individual’s activity level and his or her possible rewards based on the game levels defined.
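As a rough sketch of such metrics (all weights, thresholds and names below are assumptions for illustration, not the project’s actual game design), per-developer activity counts harvested from the various platforms could be aggregated into a score and mapped to a reward level:

```python
# Invented per-developer activity counts, as they might be harvested
# from Git, Bugzilla and the forums.
activity = {
    "alice": {"commits": 40, "bugs_triaged": 12, "forum_posts": 30},
    "bob": {"commits": 3, "bugs_triaged": 0, "forum_posts": 5},
}

# Assumed weights per activity type and score floors per reward level.
WEIGHTS = {"commits": 3, "bugs_triaged": 2, "forum_posts": 1}
LEVELS = [(150, "gold badge"), (50, "silver badge"), (0, "bronze badge")]

def reward(events):
    """Map a developer's weighted activity score to a reward level."""
    score = sum(WEIGHTS[kind] * count for kind, count in events.items())
    return next(label for floor, label in LEVELS if score >= floor)

print({dev: reward(events) for dev, events in activity.items()})
# {'alice': 'gold badge', 'bob': 'bronze badge'}
```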


MegaMart – ECSEL EU – 2017-2020

The goal of the project is to work on Megamodeling at Runtime to build a scalable model-based framework for continuous development and runtime validation of complex systems.

Productivity and quality are two of the major challenges in building, maintaining and evolving large, complex and business-critical software systems. In June 2012, Gartner released the results of a survey on the failure of software projects. The survey showed that 28% of large IT projects with budgets exceeding $1M fail. Among the reasons, functionality issues accounted for 22% of failures, late delivery for 28% and poor quality for 11%. The Standish Group CHAOS report for 2013 states that only 10% of large IT projects are delivered on time, on budget and with the required features and functions. In this global context, the European industry faces stiff competition. Electronic systems are becoming more and more complex and software-intensive, which calls for modern engineering practices to tackle advances in productivity and quality of these now cyber-physical systems. Model-Driven Engineering and related technologies promise significant productivity gains, which have been proven valid in several studies. However, these technologies need to be further developed to scale to real-life industrial projects and provide advantages at runtime.

The ultimate objective of enhancing productivity while reducing costs and ensuring quality in development, integration and maintenance can be achieved by techniques that integrate design and runtime aspects within system engineering methods incorporating existing engineering practices. Industrial-scale models, which are usually multi-disciplinary, span multiple teams, combine several product lines and typically include strong system quality requirements, can be exploited at runtime through advanced tracing and monitoring. This achieves a continuous system engineering cycle between design and runtime, ensuring the quality of the running system and obtaining valuable feedback from it that can be used to boost productivity and provide lessons learnt for future generations of the products.

The major challenge in the Model-Driven Engineering of critical software systems is the integration of design and runtime aspects. The system behaviour at runtime has to be matched against the design in order to fully understand critical situations, failures in design and deviations from requirements. Many methods and tools exist for tracing the execution and measuring runtime properties. However, these methods do not allow integration with system models – the most suitable level of abstraction for system engineers’ analysis and decision-making.
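The core of that integration problem can be illustrated with a minimal sketch (all structures and names below are assumptions, not the project’s actual design): runtime trace events are mapped back to design model elements, and events with no counterpart in the design are flagged as deviations to be reported at the model level:

```python
# Invented design-to-runtime mapping: each expected runtime event is
# linked to the model element it realizes.
design = {"op:start": "StateMachine.Init", "op:stop": "StateMachine.Shutdown"}

# Invented runtime trace; 'op:reset' has no counterpart in the design.
trace = ["op:start", "op:reset", "op:stop"]

def map_trace(trace, design):
    """Pair each trace event with its design element, or None for a deviation."""
    return [(event, design.get(event)) for event in trace]

deviations = [event for event, element in map_trace(trace, design) if element is None]
print(deviations)  # ['op:reset']
```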

VISION: We will exploit important features of:

  1. MARTE, SysML and others to express both system functional and non-functional properties;
  2. model-based verification and validation methods at design time and runtime;
  3. methods for model management / megamodelling;
  4. methods for traceability over large multi-disciplines models;
  5. methods for inference of system deviations and affected design elements;

in order to create a scalable framework for model-based continuous development and validation of large and complex industrial systems.
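The megamodelling and traceability ingredients of this vision can be sketched at their simplest (everything here is an illustrative assumption, not the project’s architecture): a megamodel is a registry of models plus typed links between them, which can then be queried, for instance, to find which design artifacts are affected by a runtime observation:

```python
# Invented megamodel: registered models and typed links between them.
models = {"req": "requirements.sysml", "design": "arch.uml", "run": "trace.log"}
links = [("run", "design", "observes"), ("design", "req", "refines")]

def affected_by(model, links):
    """Transitively collect models reachable from `model` via the links."""
    reached, frontier = set(), {model}
    while frontier:
        source = frontier.pop()
        for src, dst, _kind in links:
            if src == source and dst not in reached:
                reached.add(dst)
                frontier.add(dst)
    return reached

print(affected_by("run", links))  # {'design', 'req'}
```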