Model-Based Testing Community | Connecting the MBT world…



Community Tool Survey: We need your support!

Dear MBT Community,

Some weeks, or even months, ago we had a great idea: let us organize a Community Tool Survey. We are confident that this could become a great event. But we need your help!
We all know such events from other domains like constraint solving or model checking. Doing something similar in the MBT context is a great opportunity to shed some light on the different domains to which MBT can be applied, and on the different approaches and tool vendors that handle the challenges of these domains. However, tools are built with different domain backgrounds, and this should be kept in mind when organizing such a Community Tool Survey.

What we have in mind is a survey that essentially consists of three main steps:

  1. Define, within our community, a requirements specification for a problem to be solved.
  2. Let MBT tool vendors test this problem by using their test generators.
  3. Publish the results within the MBT Community for further discussion.

In the following, we present some of our questions and ideas about the survey and invite everyone to give us constructive feedback on them.

1) Where do the specifications of the tasks to be solved come from? Do we provide a specification on our own? Do we use well-known problem descriptions which are available out there?

2) What should the specifications of the tasks to be solved look like? Should we use concrete (formal) models or textual specifications? Since there are many input languages for the different tools, choosing just one modeling language would be unfair to the other n-1 tools. Also, not all languages support all features equally, like compositionality, nondeterminism, (a)synchronous communication, timing aspects, etc. We could also invite each participating tool vendor to submit one model from their domain. This would even out the chances.

3) Do we provide a system under test (SUT)? Doing so would allow us to measure metrics such as code or requirements coverage, or even the number of detected failures. This brings us to the next question:

4) How should the results of the test generation be evaluated? There are many well-known means such as model coverage, code coverage (if we have an SUT available), and mutation analysis on the code (again, we need an SUT) or on the model. We could also ask domain experts to create test cases manually based on the model and compare the automatically generated ones to the manually designed ones. As another indicator, we could look at efficiency (e.g. number of test cases, test case length). Or is such a comparison too far removed from reality? What about just presenting the results and analyzing the different approaches from different domains? Knowing which tools are suited for which domains would also be a good result.
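To make some of these indicators concrete, here is a minimal sketch of how model (transition) coverage, the efficiency numbers, and a mutation score could be computed for a generated test suite. The three-transition model, the function names, and the representation of a test case as a list of inputs are illustrative assumptions, not part of any proposed survey format.

```python
from typing import Dict, List, Set, Tuple

Transition = Tuple[str, str]  # (source state, input)

# Hypothetical toy model, for illustration only: (state, input) -> next state.
MODEL: Dict[Transition, str] = {
    ("idle", "join"): "joined",
    ("joined", "send"): "joined",
    ("joined", "leave"): "idle",
}


def covered_transitions(test_case: List[str], start: str = "idle") -> Set[Transition]:
    """Replay one test case (a list of inputs) on the model and collect the transitions taken."""
    state = start
    seen: Set[Transition] = set()
    for action in test_case:
        key = (state, action)
        if key not in MODEL:
            break  # input not allowed in this state: stop replaying this test case
        seen.add(key)
        state = MODEL[key]
    return seen


def suite_metrics(suite: List[List[str]]) -> Dict[str, float]:
    """Transition coverage plus two simple efficiency indicators for a whole suite."""
    covered: Set[Transition] = set()
    for test_case in suite:
        covered |= covered_transitions(test_case)
    total_steps = sum(len(test_case) for test_case in suite)
    return {
        "transition_coverage": len(covered) / len(MODEL),
        "test_cases": float(len(suite)),
        "avg_length": total_steps / len(suite) if suite else 0.0,
    }


def mutation_score(killed: int, total: int) -> float:
    """Fraction of mutants detected (killed) by the suite."""
    return killed / total if total else 0.0
```

For example, `suite_metrics([["join", "send"], ["join", "leave"]])` reports full transition coverage for this toy model with two test cases of length two. Numbers like these are easy to compute; whether they allow a fair comparison across domains is exactly the open question.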

We are looking forward to your input 🙂 Use the comment functionality of this blog post or simply write us an e-mail.

Also: Spread the word about our idea! You can do it by using Facebook, Twitter, LinkedIn, e-mail, etc. This project depends on your support and input. Receiving more feedback lets us refine the idea and avoid (common) pitfalls. In the end, the survey is done by the community for the community!

· · · · ·


  • Author comment by Lars Frantzen · January 14, 2012 at 10:07 am

    One testing-related case study is the Conference Protocol.
    Both specifications and an implementation are provided.
    Several tools have been used to test this protocol, e.g. TGV, TorX, Phact, and SpecExplorer. See Google Scholar for the papers. We might get some inspiration from there.


  • Dr. Christian Brandes · February 5, 2012 at 7:30 pm

    Thank you, fellas, for launching this project.

    The overall goal of this survey should be to give interested testers a good
    starting point for checking & comparing MBT approaches & tools (their
    different modeling concepts & generators etc.). That said, I have two other
    topics in mind: We need to convince testers that MBT offers great opportunities,
    for example by comparing the survey results with manually derived test cases
    for the given reference problem, and we need to get more and more testers
    into modeling – which means reading models first and later creating them for
    testing purposes.

    I think it is necessary to create several reference problems – covering different
    domains (business software, embedded software, …) and giving tool vendors
    the chance to demonstrate their strengths. And maybe there should be one simple
    initial problem – not necessarily the car configurator 😉 – because I have found
    that fundamental differences between MBT tools already appear when looking at
    simple and elementary examples. This leads to the crucial topic “acceptance
    of models and tool handling”: Is the test model readable and easy to understand?
    Can it be created intuitively within the tool’s model editor? These basic things.

    And the problems definitely should leave degrees of freedom to vendors,
    e.g. the choice of modeling notation. Each problem should provide an SUT, a spec
    and a test goal – nothing more. But it should demand several artifacts: a
    test model and the generated test cases, plus a video showing how both are
    created with the MBT tool – with this video being confirmed by the organizers
    to be replicable.

    Just my 2 cents,


  • Author comment by Stephan Weißleder · February 6, 2012 at 9:06 pm

    Hi all,

    in the survey, there is always the question of identifying which aspects are better supported by one MBT tool than by another.
    The problem is how to measure these things. We already proposed several approaches (coverage, mutation, etc.).

    But what about including the “human factor”?
    The idea is to get groups of students into this survey.
    For one thing, they do not favor any particular MBT tool, which seems to be a fair starting point.
    Furthermore, they can also act as referees.

    One example:
    We have two groups of students: A and B.
    We further separate both groups:
    Now, A is split up into three groups of students (each with a different MBT tool) that try to create models and automatically generate tests from them.
    The groups from A compete, and they are judged by the students in B.
    The criteria for this don’t even necessarily need to be identified by us beforehand …

    We might also consider having the prize for the winning groups from A and B funded, e.g., by the working group for the testing of object-oriented programs and model-based testing (AK TOOP/MBT) of the Gesellschaft für Informatik.


  • Bruno Legeard · February 11, 2012 at 9:20 am

    Dear All,

    Part of the debate about, and the difficulty of, defining an MBT tool survey comes from the fact that MBT tools are better compared with complex IDEs – Integrated Development Environments (in our case, specialized for test design) – than with tools like model checkers, proof tools or constraint solvers.

    For example, in SMT-COMP, the Satisfiability Modulo Theories Competition, this is clear: each tool competes in various categories (according to the theories it covers), and the solvers can be compared by the performance they show in achieving some proof.

    This is very different for MBT environments because the effectiveness (in terms of productivity or capability to detect faults in the SUT) highly depends on the capability of the user, for example in designing a good model but also in defining test objectives. Of course, one can evaluate the expressiveness of the input modeling language and the performance of the test generation engine, but ease-of-use characteristics are also very important. So, as explained earlier by Stephan, the human factor is important.

    Another important issue is that, like IDEs, MBT environments should be specialized for the targeted domain. An MBT environment for real-time embedded systems differs strongly from an MBT environment for enterprise application software (information systems). Many things differ: the nature of the SUT impacts the modeling language, the testing context, the integration of the MBT tool in the software development life cycle, etc.

    To summarize, to carry out an adequate survey of MBT environments, we should resolve:
    • The question of the knowledge and prerequisites required of the evaluators (to avoid biases);
    • The question of the targeted domain (embedded systems, enterprise application software, protocol software), which implies defining cases in each category.

    This is why most companies that want to evaluate MBT solutions use a pilot project approach, involving their own testing team on their own project.


  • Heinz-Jürgen Scherer · March 6, 2012 at 7:44 pm

    From my point of view, the survey should focus primarily on the business or technical domain and secondarily on the use cases covered by the MBT tool.
    E.g., we are working on the integration of model data and on tool interoperability. We are using business process models for MBT-driven integration and regression testing, especially in the ERP application context (SAP). Even if the source models can come from nearly any modeling tool, many requirements, conditions and formal aspects have to be fulfilled, which makes it hard to use any other tool for this domain and the given use cases.


  • Author comment by rbinder · March 9, 2012 at 5:05 pm

    I’ve been working on developing a benchmark for MBT tools. I think the TPC (Transaction Processing Performance Council) approach used to benchmark database systems could work.

    Essentially, the TPC defines generic requirements in a process similar to RFC or standards development. The members of the TPC are tool producers and major consumers.

    The producers develop a benchmark suite, then submit it to the TPC to be run under controlled conditions. The TPC independently validates the results and then publishes them.

    The TPC has evolved a family of database benchmarks over many years. They are accepted as valid by all producers and consumers.

    Some adaptations would be needed for MBT. Here are the essential elements I see.

    1) Define abstract requirements that could be implemented in any test model. This will be a challenge, but my recent work with Microsoft indicates that this can be done using a text document format similar to an RFC.

    2) Provide a reference implementation for the requirements. This would also define a baseline environment stack.

    3) Use mutation to seed the reference implementation with defects. I think it would be useful to have both a published set of mutations and a stochastic strategy to generate mutations, as part of the benchmark definition.

    4) Provide a common set of PCOs (abstract adapters) for the reference implementation.

    Tool providers would develop a test suite for the reference implementation and its PCOs. This would be run on the baseline reference implementation, the standard mutant version, and the stochastic mutant. Validated results would be published.
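As a rough illustration of the run described above, the following sketch executes one test suite against a baseline, a published set of mutants, and a stochastically generated mutant. Everything in it is a hypothetical stand-in: the bounded counter plays the role of the reference implementation, the plain callable plays the role of a PCO adapter, and the names and the seed-based mutation strategy are assumptions made only for this example.

```python
import random
from typing import Callable, Dict, List, Tuple

Sut = Callable[[int, str], int]   # stand-in PCO: (current value, operation) -> new value
TestCase = Tuple[List[str], int]  # (operation sequence, expected final value)


def make_counter(upper: int = 3, dec_works: bool = True) -> Sut:
    """Build a bounded counter; the defaults give the hypothetical reference behaviour."""
    def step(value: int, op: str) -> int:
        if op == "inc":
            return min(value + 1, upper)
        if op == "dec" and dec_works:
            return max(value - 1, 0)
        return value
    return step


REFERENCE: Sut = make_counter()

# Published mutants: a fixed, named set of seeded defects.
PUBLISHED_MUTANTS: Dict[str, Sut] = {
    "wrong_upper_bound": make_counter(upper=5),
    "dec_is_noop": make_counter(dec_works=False),
}


def stochastic_mutant(seed: int) -> Sut:
    """Generate a mutant from a seed so that the run stays reproducible."""
    rng = random.Random(seed)
    return make_counter(upper=rng.choice([1, 2, 4, 5]))  # never the correct bound 3


def run_suite(sut: Sut, suite: List[TestCase]) -> bool:
    """Return True if every test case passes on the given SUT."""
    for ops, expected in suite:
        value = 0
        for op in ops:
            value = sut(value, op)
        if value != expected:
            return False
    return True


def benchmark(suite: List[TestCase], seed: int) -> Dict[str, object]:
    """Run the suite on the baseline, the published mutants and one stochastic mutant."""
    return {
        "baseline_passes": run_suite(REFERENCE, suite),
        "published_killed": {
            name: not run_suite(mutant, suite)
            for name, mutant in PUBLISHED_MUTANTS.items()
        },
        "stochastic_killed": not run_suite(stochastic_mutant(seed), suite),
    }
```

A suite such as `[(["inc"] * 5, 3), (["inc", "dec"], 0)]` passes on the baseline and kills every mutant here, while a suite containing only `(["inc"], 1)` lets all of them survive. A real benchmark would of course need far richer mutation operators and adapters than this toy setup.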

    I asked for suggestions about reference implementations in the LinkedIn MBT forum several months ago, and did not get much interest. It is clear that reference implementations would be needed for at least transaction processing/human interface (typical business apps), embedded/reactive/real-time, and component/frameworks/libraries. There are of course many others.

    I think reference apps should be stable, widely used, and have a no-strings-attached source code license. The PCOs could be contributed as open source. The mutated code could be provided by volunteers. As the reference app will get a lot of very good free testing, I expect we might be able to get some cooperation from their core maintainers.

    Here are some possible sources for reference apps:

    Java Pet Store

    I’ve poked around to try to find suitable embedded apps (relatively simple and open source would be ideal), but haven’t located anything promising yet.

    There is a large collection of component libraries, for example:

    The most important task is finding some volunteers who agree on an approach, and getting some sponsors to bootstrap the process.


© by MBT Community 2011