May 19-22, 2013

Omni Mont-Royal Hotel, Montreal, Canada

Accepted Papers with Abstracts

  • Number of Full Papers Submitted: 75
  • Number of Full Papers Accepted: 29
  • Number of Work-in-Progress Papers Accepted: 11
  • Number of Invited Papers: 3
ACM SIGSIM Conference on Principles of Advanced Discrete Simulation (PADS)

Accelerating Optimistic HLA-based Simulations in Virtual Execution Environments

Zengxiang Li, Xiaorong Li, Ta Nguyen Binh Duong, Wentong Cai and Stephen Turner

High Level Architecture (HLA)-based simulations employing optimistic synchronization allow federates to process events and advance simulation time freely, at the risk of over-optimistic execution and rollbacks. In this paper, an adaptive resource provisioning system is proposed to accelerate optimistic HLA-based simulations in a Virtual Execution Environment (VEE). A performance monitor is introduced using a middleware approach to measure the performance of individual federates transparently to the simulation application. Based on the performance measurements, a resource manager distributes the available computational resources to the federates, making them advance simulation time at comparable speeds. Our proposed approach is evaluated using a real-world simulation model with various workload inputs and different parameter settings. The experimental results show that, compared with distributing resources evenly among federates, our proposed approach can accelerate the simulation execution significantly using the same amount of computational resources.
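The resource manager's equalization idea can be sketched as follows. The function name and the inverse-proportional heuristic are illustrative assumptions for exposition, not the paper's actual allocation policy:

```python
def allocate_resources(progress_rates, total_cpu):
    """Distribute CPU shares inversely proportional to each federate's
    measured simulation-time advance rate, so slower federates receive
    more resources and all federates advance at comparable speeds.
    (Illustrative heuristic, not the paper's exact policy.)"""
    inverse = [1.0 / max(rate, 1e-9) for rate in progress_rates]
    norm = sum(inverse)
    return [total_cpu * w / norm for w in inverse]
```

A federate advancing at half the rate of another would receive twice its CPU share, which is the intuition behind making all federates progress together and avoiding rollbacks caused by one federate racing ahead.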

A Marine Traffic Simulation System for Hub Ports

Shell Ying Huang, Wen Jing Hsu, Hui Fang and Tiancheng Song

Ensuring congestion-free marine traffic is crucial for hub ports around the world, which face increasing demand for marine transport. Therefore a capacity-assessment tool that models and simulates the navigational network, the traffic flows and the complex navigational behaviors of vessels is needed. The simulation model presented in this paper is unique in the unprecedented number of vessels to be handled, the scale and complexity of the waterway networks covered, and the degree of accuracy demanded of navigational behaviors. As such, none of the existing models and simulation tools is adequate for assessing waterway capacity. The model was calibrated based on detailed analysis of historical records and consultations with domain experts. Simulation results were further verified against additional radar data. The simulation system built has laid a useful foundation for planning future marine traffic for hub ports.

Approximate Parallel Simulation of Web Search Engines

Mauricio Marin, Veronica Gil Costa, Carolina Bonacic and Roberto Solar

Large scale Web search engines are complex and highly optimized systems devised to operate on dedicated clusters of processors. Any gain in performance, even a small one, is beneficial to economical operation given the large amount of hardware resources deployed in the respective data centers. Performance depends heavily on user behavior, which is characterized by unpredictable and drastic variations in trending topics and arrival rate intensity. In this context, discrete-event simulation is a powerful tool either to predict the performance of new optimizations introduced in search engine components or to evaluate different scenarios under which alternative component configurations are able to process demanding workloads. These simulators must be fast, memory efficient and parallel to cope with the execution of millions of events in a short running time on a few processors. In this paper we propose achieving this objective at the expense of performing approximate parallel simulation. The experimental evaluation shows that our approximate algorithms achieve good statistical agreement with results from sequential simulations of the same search engine realizations.

Hierarchical Interest Management for Distributed Virtual Environments

Ke Pan, Xueyan Tang, Wentong Cai, Suiping Zhou and Hanying Zheng

An Interest Management (IM) mechanism eliminates irrelevant status updates transmitted in Distributed Virtual Environments (DVEs). This paper proposes a new hierarchical IM mechanism for DVEs. The hierarchical mechanism divides the virtual world into multiple levels of cells and keeps the relationship between an entity and an Area-Of-Interest (AOI) at a particular cell level according to their relative position. As their relative position changes, the relationship level is updated accordingly. Compared with the traditional area-based and cell-based mechanisms, the proposed hierarchical mechanism significantly reduces the communication bandwidth consumption of IM and thus considerably improves the scalability of DVEs. In addition, the proposed mechanism also has a much lower computation cost than the traditional mechanisms and an acceptable storage requirement for its data structures.
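The level-selection idea can be sketched as below. The distance-based rule and all names are illustrative assumptions; the paper defines its own criterion for which cell level an entity/AOI pair is tracked at:

```python
import math

def relationship_level(entity_pos, aoi_center, aoi_radius, base_cell, num_levels):
    """Pick the cell level at which to track an entity/AOI pair: the
    farther the entity is from the AOI, the coarser (higher) the level,
    so distant pairs generate fewer position-update messages.
    (Illustrative rule, not the paper's exact level criterion.)"""
    dx = entity_pos[0] - aoi_center[0]
    dy = entity_pos[1] - aoi_center[1]
    dist = math.hypot(dx, dy)
    level = 0
    cell = base_cell
    # climb to coarser levels while the pair is far relative to cell size
    while level < num_levels - 1 and dist > aoi_radius + 2 * cell:
        cell *= 2
        level += 1
    return level
```

Nearby pairs stay at fine levels and are updated often; far-apart pairs are tracked coarsely, which is the source of the bandwidth savings over flat cell-based schemes.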

Software Test Automation using DEVSimPy Environment

Laurent Capocchi, Jean Francois Santucci and Thimothée Ville

The paper deals with test automation of GUI (Graphical User Interface) software using simulations. The development of GUI software requires a great amount of time and cost for its testing aspects. In order to facilitate and speed up the testing of such GUI software, an approach based on discrete-event modeling and simulation is proposed. Traditionally, GUI test automation approaches require the development of testing procedures that are tedious to carry out. The idea is to perform test automation of GUI software by integrating existing GUI testing environments within a DEVS (Discrete EVent system Specification) formalism framework called DEVSimPy. The proposed approach is validated on a real application: medical software which has to respect very strict formats defined by French governmental institutions.

A Time Management Optimization Framework for Large-Scale Distributed Hardware-In-The-Loop Simulation

Wei Dong

Large-scale distributed HIL (Hardware-In-The-Loop) simulation is an important and indispensable method for testing and verifying complex engineering systems. A necessary condition for realizing HIL simulation is that the speedup ratio of full-speed simulation must be greater than 1, and satisfying this condition becomes more and more difficult as the scale of simulation keeps increasing. Aiming at the problem of maximizing the speedup ratio, a time management optimization framework for large-scale distributed HIL simulation is proposed in this paper. Different from other work on performance optimization of HIL simulation, the problem here is focused on the simulation speedup ratio and is considered in the setting of analytic simulation, which means causality violations are intolerable. To this end, a new formal description framework for distributed simulation is given based on automata theory. The basic objective and conditions of distributed simulation are then formally reanalyzed, leading to the conclusion that the classical Local Causality Constraint for distributed simulation is only a sufficient condition rather than a sufficient and necessary one. Based on this, the optimization problem for the simulation speedup ratio is analyzed in depth and an overall strategy for it is given. Considering different conditions, two levels of optimization mechanisms, for time advancing and task partitioning respectively, are given. Finally, application and experimental results show the effectiveness of the proposed method.
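The speedup-ratio condition can be made concrete. This tiny sketch only illustrates the commonly used definition of the full-speed speedup ratio; the paper's own formalization is automata-based:

```python
def full_speed_speedup(simulated_seconds, wall_clock_seconds):
    """Speedup ratio of a full-speed run: simulated time advanced per
    unit of wall-clock time. HIL simulation requires this ratio to be
    greater than 1, since the hardware in the loop runs in real time
    and cannot wait for a slow simulator. (Definition for illustration
    only; not taken from the paper's formal framework.)"""
    return simulated_seconds / wall_clock_seconds
```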

Post-mortem Analysis of Emergent Behavior in Complex Simulation Models

Claudia Szabo and Yong Meng Teo

Analyzing and validating emergent behavior in component-based models is increasingly challenging as models grow in size and complexity. Despite increasing research interest, there is a lack of automated, formalized approaches to identify emergent behavior and its causes. As part of our integrated framework for understanding emergent behavior, we propose a post-mortem emergence analysis approach that identifies the causes of emergent behavior in terms of properties of the composed model and properties of the individual model components, and their interactions. In this paper, we detail the use of reconstructability analysis for post-mortem analysis of known emergent behavior. The two-step process first identifies model components that are most likely to have caused emergent behavior, and then analyzes their interaction. Our case study using small and large examples demonstrates the applicability of our approach.

An Expansion-aided Synchronous Conservative Time Management Algorithm on GPU

Wenjie Tang, Yiping Yao and Feng Zhu

The graphics processing unit (GPU) brings an opportunity to implement large scale simulations in an economical way. GPU performance relies on high parallelism, but a synchronous conservative time management algorithm for discrete event simulation will encounter scenarios with limited parallelism. This conflict leads to poor performance even when the application itself has high parallelism. To solve this problem, we propose an expansion-aided synchronous conservative time management algorithm. It uses runtime information to enlarge the time bound of “safe” events, and uses an expansion method to import “safe” events. By interleaving a series of expansions with event computation, more events can be assembled to be processed in parallel. Moreover, a simulated annealing algorithm is adopted to control the number of expansions. It helps achieve stable performance under different conditions by finding a balance between low parallelism and unnecessary expansions. Experiments demonstrate that the proposed algorithm can achieve up to a 30% performance improvement and obtain up to 60x speedup over CPU-based simulation.
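The expansion idea can be sketched as follows. Treating the dependency check as an externally supplied expansion count is a simplifying assumption; in the paper the number of expansions is tuned by simulated annealing at runtime:

```python
def collect_safe_events(event_times, lookahead, expansions):
    """Synchronous conservative window: events with timestamp below
    min(ts) + lookahead are safe to process in parallel. Each
    'expansion' step enlarges the bound by another lookahead increment,
    standing in for the runtime dependency information that justifies
    importing more safe events. (Simplified sketch of the mechanism.)"""
    if not event_times:
        return []
    bound = min(event_times) + lookahead
    for _ in range(expansions):
        bound += lookahead  # assumes runtime analysis permits the expansion
    return sorted(t for t in event_times if t < bound)
```

With zero expansions this is the plain conservative window; each extra expansion widens the batch handed to the GPU threads, which is where the added parallelism comes from.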

Supporting End-to-End Internet QoS for DDS-based Large-Scale Distributed Simulation

Akram Hakiri, Pascal Berthou, Slim Abdellatif, Michel Diaz and Thierry Gayraud

Supporting end-to-end quality-of-service (QoS) in large-scale distributed interactive simulations (DIS) is hard due to the heterogeneity and scale of communication networks, transient behavior, and the lack of mechanisms that holistically schedule different resources end-to-end. This paper aims to cope with these problems in the context of wide area network (WAN)-based DIS applications that use the OMG Data Distribution Service (DDS) QoS-enabled publish/subscribe middleware. First, we show the design and implementation of the QoS framework, which is a policy-driven architecture that shields DDS-based DIS applications from the details of network QoS mechanisms by specifying per-flow network QoS requirements, performing resource allocation and validation decisions (such as admission control), and enforcing per-flow network QoS at runtime. Second, we evaluate the capabilities of the framework in an experimental large-scale multi-domain environment. The evaluation of the architecture shows that the proposed QoS framework improves the delivery of DDS services over heterogeneous IP networks, and confirms its potential impact for providing network-level differentiated performance.

Towards Performance Evaluation of Conservative Distributed Discrete-Event Network Simulations Using Second-Order Simulation

Philipp Andelfinger and Hannes Hartenstein

Whether a given simulation model of a computer network will benefit from parallelization is difficult to determine in advance, complicated by the fact that hardware properties of the simulation execution environment can substantially affect the execution time of a given simulation. We describe SONSim, an approach to predict the execution time based on a simulation of an envisioned distributed network simulation (second-order simulation). SONSim takes into account both network model characteristics and hardware properties of the simulation execution environment. To show that a SONSim prototype is able to predict distributed performance with acceptable accuracy we study three reference network simulation models differing fundamentally in topology and levels of model detail – simple topologies comprised of interconnected subnetworks, peer-to-peer networks and wireless networks. We evaluate the performance predictions for multiple configurations by comparing predictions for the three reference network models to execution time measurements of distributed simulations on physical hardware using both Ethernet and InfiniBand interconnects. In addition, utilizing the freedom to vary simulation hardware and model parameters in the second-order simulation, we demonstrate how SONSim can be used to identify general model characteristics that determine distributed simulation performance.

Research and Application on Ontology-based Layered Cloud Simulation Service Description Framework

Tan Li, Baocun Hou, Xudong Chai and Bohu Li

A cloud simulation system improves the ability of current network-based M&S to deliver on-demand simulation and massive-user service. The sharing of multi-granularity resources and the dynamic establishment of simulation services in cloud simulation raise new challenges for the Simulation Service Description Framework (SSDF). To address these challenges, an ontology-based layered SSDF (OLSSDF) is proposed, comprising the layered architecture of cloud simulation services and the ontology semantics of each layer in the framework, defined and formalized in OWL-S. The OLSSDF was applied in the description of simulation services in a cloud simulation system prototype for aircraft. The primary research and application show that the OLSSDF, which describes cloud simulation services in both attribute semantics and model semantics, adapts well to various multi-granularity simulation resources and facilitates the intelligent discovery and automatic combination of simulation services in the cloud simulation mode.

Event Pool Structures for PDES on Many-Core Beowulf Clusters

Tom Dickman, Sounak Gupta and Philip Wilsey

Multi-core and many-core processing chips are becoming widespread and are now being widely integrated into Beowulf clusters. This poses a challenging problem for distributed simulation, as it becomes necessary to extend the algorithms to operate on a platform that includes both shared memory and distributed memory hardware. Furthermore, as the number of on-chip cores grows, so does the challenge of developing solutions without significant contention for shared data structures. This is especially true for the pending event list data structures, where multiple execution threads attempt to schedule the next event for execution. The problem is aggravated in parallel simulation, where event executions are generally fine-grained, quickly leading to non-trivial contention for the pending event list. This manuscript explores the design of the software architecture and several data structures to manage the pending event sets for execution in a Time Warp synchronized parallel simulation engine. The experiments specifically target multi-core and many-core Beowulf clusters containing 8-core to 48-core processors. These studies include a two-level structure for holding the pending event sets using three different data structures, namely: splay trees, the STL multiset, and ladder queues. Performance comparisons of the three data structures using two architectures for the pending event sets are presented.
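The general two-level idea can be sketched as below. The class and its merge policy are illustrative assumptions; the paper evaluates splay trees, the STL multiset and ladder queues as the sorted level, not a binary heap:

```python
import heapq
import threading

class TwoLevelEventPool:
    """Two-level pending event set: per-thread unsorted input buffers
    absorb insertions without contending on the shared sorted structure;
    a periodic merge moves buffered events into the global min-heap.
    (Sketch of the two-level structure; the paper's sorted level uses
    splay trees, std::multiset or ladder queues.)"""

    def __init__(self, num_threads):
        self.buffers = [[] for _ in range(num_threads)]
        self.sorted_events = []          # global min-heap keyed on timestamp
        self.lock = threading.Lock()

    def insert(self, thread_id, timestamp, event):
        # contention-free: each thread appends only to its own buffer
        self.buffers[thread_id].append((timestamp, event))

    def merge(self):
        with self.lock:
            for buf in self.buffers:
                for item in buf:
                    heapq.heappush(self.sorted_events, item)
                buf.clear()

    def next_event(self):
        with self.lock:
            return heapq.heappop(self.sorted_events) if self.sorted_events else None
```

The point of the split is that fine-grained event insertions, the common case in Time Warp, never touch the shared lock; only the batched merge and the dequeue do.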

Modeling Communication Software Execution for Accurate Simulation of Distributed Systems

Stein Kristiansen, Thomas Plagemann and Vera Goebel

Network simulation is commonly used to evaluate the performance of distributed systems, but these approaches do not account for the impact of protocol execution on the nodes, which may be significant. We propose a methodology to capture execution models from communication software running on real devices that can be integrated with discrete event network simulators to improve their accuracy. We provide a set of rules to instrument the software to obtain the events of importance, and present techniques to create executable models based on the obtained traces. To make the models scalable, processing stages are reduced to statistical distributions. When the resulting models are executed in a device model with a scheduler simulator, we are able to model the dynamics of multithreading and parallel execution. Our initial results from a proof-of-concept extension to Ns-3 show that our models are able to accurately model protocol execution on the Google Nexus One with low simulation overhead.

Leveraging Symbiotic Relationship Between Simulation and Emulation for Scalable Network Experimentation

Miguel Erazo and Jason Liu

A testbed capable of representing detailed operations of complex applications under diverse large-scale network conditions can be extremely helpful for investigating potential system design and implementation problems, and studying application performance issues, such as scalability and robustness, even before the applications are deployed in a real environment. We introduce a novel method that combines high-performance large-scale network simulation and high-fidelity network emulation, and thereby enables real instances of network applications and protocols to run in real operating environments, and be tested under large-scale simulated network settings. In our approach, network simulation and emulation form a symbiotic relationship, through which they are synchronized for an accurate representation of the large-scale traffic behavior. We introduce a model downscaling method, along with an efficient queuing model and a traffic reproduction technique, which can significantly reduce the synchronization overhead and improve computational efficiency, while maintaining the accuracy of the system. We validate our approach with extensive experiments via simulation and with a real-system prototype.

Semi-Automatic Extraction of Software Skeletons for Benchmarking Large-Scale Parallel Applications

Matthew Sottile, Amruth Dakshinamurthy, Gilbert Hendry and Damian Dechev

The design of high-performance computing architectures requires performance analysis of large-scale parallel applications to derive various parameters concerning hardware design and software development. The process of performance analysis and benchmarking an application can be done in several ways with varying degrees of fidelity. One of the most cost-effective ways is to do a coarse-grained study of large-scale parallel applications through the use of program skeletons. The concept of a "program skeleton" that we discuss in this paper is an abstracted program that is derived from a larger program where source code that is determined to be irrelevant is removed for the purposes of the skeleton. In this work, we develop a semi-automatic approach for extracting program skeletons based on compiler program analysis. We demonstrate correctness of our skeleton extraction process as well as show the performance speedup of using skeletons by comparing trace files derived from executing a large-scale parallel program and its program skeleton on the SST/macro simulator.

Formalization of Emergence in Multi-agent Systems

Yong Meng Teo, Ba Linh Luong and Claudia Szabo

Emergence is a distinguishing feature in systems, especially when complexity grows with the number of components, interactions and connectivity. There is immense interest in emergence, and a plethora of definitions from philosophy to the sciences. Despite this, there is a lack of consensus on the definition of emergence, and this hinders the development of a formal approach to understand and predict emergent behavior in multi-agent systems. This paper proposes a grammar-based approach to formalize and verify the existence and extent of emergence without prior knowledge or definition of emergent properties. Our approach is based on weak (basic) emergence that is both generated by and autonomous from the underlying agents. In contrast with current work, our approach has two main advantages. By focusing only on system interactions of interest and feasible combinations of individual agent behavior, state-space explosion is reduced. In formalizing emergence, our extended grammar is designed to model agents of diverse types, mobile agents and open systems. Theoretical and experimental studies using the boids model demonstrate the complexity of our formal approach.

GPU Accelerated Three-stage Execution Model for Event-Parallel Simulation

Xiaosong Li, Wentong Cai and Stephen Turner

This paper introduces the concept of event-parallel discrete event simulation (DES) and its implementation on the GPU platform. Inspired by the typical spatial-parallel and time-parallel DES approaches, the event-parallel approach on GPU uses each thread to process one of the N events, where N is the total number of events. By taking advantage of the high parallelism of GPU threads, this approach achieves greater speedup. The GPU architecture is adopted in the execution of the event-parallel approach so as to take advantage of the parallel processing capability provided by the massive number of GPU threads. A three-stage execution model composed of generating events, sorting events and processing events in parallel is proposed. This execution model achieves reasonable speedup. Compared with the event scheduling approach on CPU, we achieve up to 22.80x speedup in our case study.
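The three stages can be sketched on the CPU as follows. This is a sequential stand-in for the GPU execution model (one thread per event); the function names and callback signatures are illustrative assumptions:

```python
def event_parallel_run(n, generate, process):
    """Three-stage event-parallel model: (1) generate all N events,
    (2) sort them by timestamp, (3) process them in parallel -- here a
    sequential loop stands in for one GPU thread per event.
    (CPU sketch of the GPU execution model described above.)"""
    events = [generate(i) for i in range(n)]                 # stage 1: generate
    events.sort(key=lambda e: e[0])                          # stage 2: sort by timestamp
    return [process(ts, payload) for ts, payload in events]  # stage 3: process
```

On a GPU, stages 1 and 3 map naturally to one thread per event and stage 2 to a parallel sort primitive, which is where the massive thread-level parallelism is exploited.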

Designing Computational Steering Facilities for Distributed Agent Based Simulations

Gennaro Cordasco, Rosario De Chiara, Francesco Raia, Vittorio Scarano, Carmine Spagnuolo and Luca Vicidomini

Agent-Based Models (ABMs) are a class of models which, by simulating the behavior of multiple agents (i.e., independent actions, interactions and adaptation), aim to emulate and/or predict complex phenomena. One of the general features of ABM simulations is their experimental capacity, which requires a viable and reliable infrastructure to interact with a running simulation: monitoring its behaviour as it proceeds and applying changes to the configuration at run time (computational steering) in order to study “what if” scenarios. A common approach for improving the efficiency and the effectiveness of ABMs as a research tool is to distribute the overall computation over a number of machines, which makes the computational steering of the simulation particularly challenging. In this paper, we present the principles and the architecture design of the management and control infrastructure that is available in D-Mason, a framework for implementing distributed ABM simulations. Together with an efficient parallel distribution of the simulation tasks, D-Mason offers a number of facilities to support the computational steering of a simulation, i.e., monitoring and interacting with a running distributed simulation. The performance of this mechanism and its implementation are also briefly discussed.

Supporting Robust System Analysis with the Test Matrix Tool Framework

Edward Clarkson, Jennifer Hurt, Jason Zutty, Christopher Skeels, Brian Parise and Greg Rohling

We present the Test Matrix Tool (TMT) framework, a simulation-agnostic framework providing end-to-end support for robust analysis of complex systems. The need to execute a large number of simulations, often driven by Design of Experiments or similar methodologies, is common to many problem environments. TMT addresses key end-user needs by easing the specification, execution and analysis of simulation workloads in ways that are consistent across specific applications of the framework. The TMT design contributes modular specifications for key data communicated between and within the specification, execution and analysis components. Our TMT implementation is an instantiation of those formats, freely available for general use. TMT's data analysis component provides a variety of features (data filtering, comparison, transformation and visualization) for analytic tasks on any type of TMT-embedded model. We provide a brief case study as an example of its use in a real-world application.

TerraME HPA: Parallel Simulation of Multi-Agent Systems over SMPs

Saulo Cabral Silva, Tiago Garcia Carneiro, Joubert Castro Lima and Rodrigo Reis Pereira

Construction of prognoses about environmental changes demands simulations of massive multi-agent models. This work evaluates the hypothesis that the combined use of techniques such as annotation and bag of tasks can result in flexible and scalable platforms for multi-agent simulation. For this, the TerraME modeling platform was extended to run over SMP (Symmetric Multiprocessor) architectures and used in real case studies. While annotation allows modelers to implement different parallelization strategies without preventing models from running on sequential architectures, the bag of tasks provides load balancing over multiprocessors. The results demonstrate that 65% of linear speedup can be obtained for models with high dependence among tasks when 8 processors are used. Moreover, for models that have low data or control dependencies, around 90% of linear speedup can be obtained.

Can PDES Scale in Environments with Heterogeneous Delays?

Jingjing Wang, Ketan Bahulkar, Dmitry Ponomarev and Nael Abu-Ghazaleh

The performance and scalability of Parallel Discrete Event Simulation (PDES) is often limited by communication latencies and overheads. The emergence of multi-core processors and their expected evolution into many-cores offers the promise of low latency communication and tight memory integration between cores; these properties should significantly improve the performance of PDES in such environments. However, on clusters of multi-cores (CMs), the latency and processing overheads incurred when communicating between different machines (nodes) far outweigh those between cores on the same chip, especially when commodity networking fabrics and communication software are used. It is unclear if there is any benefit to the low latency among cores on the same node given that some communication links are significantly worse. In this study, we examine the performance of multithreaded implementation of PDES on CMs. We demonstrate that the inter-node communication costs impose a substantial bottleneck on PDES and demonstrate that without optimizations addressing these long latencies, multithreaded PDES does not significantly outperform the multiprocess version despite direct communication through shared memory on the individual nodes. We then propose three optimizations: message consolidation and routing, infrequent polling and latency-sensitive model partitioning. We show that with these optimizations in place, threaded implementation of PDES significantly outperforms process-based implementation even on CMs.
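The message consolidation optimization can be sketched as below. The function and its flush policy are illustrative assumptions; routing and the polling-frequency optimization are simplified away:

```python
def consolidate(outgoing, max_batch):
    """Message consolidation: group events bound for the same remote
    node into batches so the per-message network overhead is paid once
    per batch instead of once per event. `outgoing` is a list of
    (destination_node, event) pairs. (Sketch of the optimization's
    intent, not the paper's implementation.)"""
    by_node = {}
    for node, event in outgoing:
        by_node.setdefault(node, []).append(event)
    batches = []
    for node, events in by_node.items():
        # split each node's events into batches of at most max_batch
        for i in range(0, len(events), max_batch):
            batches.append((node, events[i:i + max_batch]))
    return batches
```

Since inter-node latency dominates on clusters of multi-cores, sending one consolidated message per destination amortizes the fixed per-message cost that would otherwise be paid for every fine-grained event.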

Interference Resilient PDES on Multi-core Systems: Towards Proportional Slowdown

Jingjing Wang, Nael Abu-Ghazaleh and Dmitry Ponomarev

Parallel Discrete Event Simulation (PDES) harnesses the power of parallel processing to improve the performance and capacity of simulation, supporting bigger models, in more detail and for more scenarios. PDES engines are typically designed and evaluated assuming a homogeneous parallel computing system that is dedicated to the simulation application. In this paper, we first show that the presence of interference from other users, even a single process in an arbitrarily large parallel environment, can lead to dramatic slowdown in the performance of the simulation. We define a new metric, which we call proportional slowdown, that represents the idealized target for graceful slowdown in the presence of interference. We identify some of the reasons why simulators fall far short of proportional slowdown. Based on these observations, we design alternative simulation scheduling and mapping algorithms that are better able to tolerate interference. More precisely, the most resilient simulators will allow dynamic mapping of simulation event execution to processing resources (a work pool model). However, this model has significant overhead and can substantially impact locality. Thus, we propose a locality-aware adaptive dynamic-mapping (LADM) algorithm for PDES on multi-core systems. LADM reduces the number of active threads in the presence of interference, avoiding having threads disabled due to context switching. We show that LADM can substantially reduce the impact of interference while maintaining memory locality, reducing the gap with proportional slowdown. LADM and similar techniques can also help in situations where there is load imbalance or processor heterogeneity.

Optimizing Parallel Simulation of Multicore Systems Using Domain-Specific Knowledge

Jun Wang, Zhenjiang Dong, Sudhakar Yalamanchili and George Riley

This paper presents two optimization techniques for the basic Null-message algorithm in the context of parallel simulation of multicore computer architectures. Unlike general, application-independent optimization methods, these are application-specific optimizations that make use of system properties of the simulation application. We demonstrate in two respects that domain-specific knowledge offers great potential for optimization. First, it allows us to send Null-messages much less eagerly, thus greatly reducing the number of Null-messages. Second, the internal state of the simulation application allows us to make a conservative forecast of future outgoing events, and by combining the forecasts from both sides of a link we can greatly improve the simulation lookahead. Compared with the basic Null-message algorithm, our optimizations greatly reduce the number of Null-messages and significantly increase simulation performance as a result.
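The lookahead improvement can be sketched as follows. The function is an illustrative assumption showing the direction of the optimization, not the paper's code:

```python
def null_message_time(local_clock, lookahead, forecast_next_event=None):
    """Lower-bound timestamp promised to a neighbor by a Null-message.
    The basic Chandy-Misra-Bryant bound is clock + lookahead; when
    domain-specific knowledge yields a conservative forecast of the
    next outgoing event, the promise can be raised to the later of the
    two, effectively improving lookahead. (Illustrative sketch only.)"""
    bound = local_clock + lookahead
    if forecast_next_event is not None:
        bound = max(bound, forecast_next_event)
    return bound
```

A larger promised bound lets the neighbor safely process more events before blocking, which is why better lookahead translates directly into fewer Null-messages and higher throughput.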

Dynamic Resolution in Distributed Cyber-Physical System Simulation

Dylan Pfeifer, Andreas Gerstlauer and Jonathan Valvano

Cyber-physical systems challenge distributed simulation techniques for reasons of the heterogeneous tools used to model system components at different levels of abstraction, each with potentially different notions of time. The SimConnect and SimTalk distributed cyber-physical system simulation tools meet the synchronization challenge of distributed simulation, but also offer dynamic resolution among coordinated simulators for tradeoffs in simulation speed versus accuracy. This paper discusses the dynamic resolution capabilities of SimConnect and SimTalk, and evaluates the tools in distributed simulation of a closed-loop motor control system. Results show selectable tradeoffs in speedup versus accuracy over non-dynamic coordination.

Reducing Simulation Costs in Embedded Simulation in Yard Crane Dispatching in Container Terminals

Shell Ying Huang and Xi Guo

Embedding simulation in optimization algorithms incurs computational costs. For NP-hard problems, the costs of the simulations embedded in the optimization algorithm are likely to be substantial. Yard crane (YC) dispatching is NP-hard, so it is very important to minimize simulation costs in YC dispatching algorithms. In the published optimization algorithm for YC dispatching, a simulation of YC operations for the entire (partial) sequence of YC jobs is carried out each time the tardiness of a (partial) sequence needs to be evaluated. In this paper we study two approaches to reducing the costs of these embedded simulations in the optimization algorithm. Experimental results show that one approach significantly reduces the computational time of the optimization algorithm. We also analyze why the other approach fails to reduce the computational time.

A Generic Adaptive Simulation Algorithm for Component-based Simulation Systems

Tobias Helms, Roland Ewald, Stefan Rybacki and Adelinde M. Uhrmacher

The state of a model may vary strongly during simulation, and with it the simulation's computational demands. Adapting the simulation algorithm to these demands at runtime can therefore improve overall performance. Although this is a general and cross-cutting concern, only a few simulation systems offer re-usable support for this kind of runtime adaptation. We present a flexible and generic mechanism for the runtime adaptation of component-based simulation algorithms. It encapsulates the simulation algorithms applicable to a given problem and employs reinforcement learning to explore the algorithms' suitability during a simulation run. We evaluate the approach by executing models from two modeling formalisms used in computational biology.
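One simple instantiation of the reinforcement-learning idea described above is an epsilon-greedy bandit that treats each candidate simulation algorithm as an arm and uses the negated runtime of the last simulation interval as reward. This is a hedged sketch of the general technique, not the paper's actual mechanism; all names are illustrative.

```python
import random

class EpsilonGreedySelector:
    """Hypothetical sketch: pick among interchangeable simulation algorithms
    at runtime, mostly exploiting the best-performing one so far while
    occasionally exploring the alternatives."""
    def __init__(self, algorithms, epsilon=0.1):
        self.algorithms = list(algorithms)
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.algorithms}
        self.values = {a: 0.0 for a in self.algorithms}  # running mean reward

    def choose(self):
        if random.random() < self.epsilon:
            return random.choice(self.algorithms)        # explore
        return max(self.algorithms, key=lambda a: self.values[a])  # exploit

    def update(self, algorithm, reward):
        # incremental running mean of the observed rewards (e.g. -runtime)
        self.counts[algorithm] += 1
        n = self.counts[algorithm]
        self.values[algorithm] += (reward - self.values[algorithm]) / n
```

The caller would invoke `choose()` before each simulation interval, measure the interval's runtime, and feed the negated runtime back via `update()`.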

Topological Computation of Activity Regions

Martin Potier, Antoine Spicher and Olivier Michel

Most of the frameworks and languages available in the field of modeling and simulation of dynamical systems focus on the specification of the state of the system and its transition function. Although we believe that this task has been elegantly solved by the design of the rule-based topological programming language MGS, an interesting challenge remains in the computation of the activity, and its topology, exhibited by their discrete event simulation. This additional information can help to optimize, analyze, and model complex systems. After a short introduction, we present the basics of MGS theory. Several examples are given to support our claim on the versatility of the topological approach chosen by MGS. Then, we introduce the notions of activity and activity tracking and show how they can be used to optimize the pattern matching process of MGS. The conclusion offers research directions opened by the computation of activity regions.

Hybrid Scheduling for Event-driven Simulation over Heterogeneous Computers

Bilel Ben Romdhanne, Mohamed Said Mosli Bouksiaa, Navid Nikaein and Christian Bonnet

In this work we propose a new scheduling approach designed from scratch to maximize heterogeneous computer usage and the event processing flow at the same time. The scheduler is built on three fundamental concepts which introduce a new vision of discrete event simulation: 1) events are clustered according to their potential time parallelism on one hand and to their potential process and data similarity on the other hand; 2) events' meta-data is enhanced with an additional descriptor which simplifies and accelerates the scheduling decision; 3) the simulation is hybrid time-event driven rather than time driven or event driven. The concretization of our approach, denoted the H-scheduler, uses several processes to manage the event flow. Furthermore, we propose a dynamic scheduling optimization which aims to further maximize the event throughput. The combination of these features allows the H-scheduler to provide the highest efficiency rate compared with the majority of GPU and CPU schedulers. In particular, it outperforms the default Cunetsim scheduler by 90% on average while maintaining backward compatibility with existing simulators.

Consistent and Efficient Output-Streams Management in Optimistic Simulation Platforms

Francesco Antonacci, Alessandro Pellegrini and Francesco Quaglia

Optimistic synchronization is considered an effective means for supporting efficient (distributed) parallel discrete event simulations. It relies on a speculative approach, where simulation events are executed regardless of their safety, and consistency is ensured via a rollback mechanism upon the a-posteriori detection of causal inconsistencies along the events' execution path. Interactions with the outside world (e.g., the generation of output streams) are a well-known problem for rollback-based systems, because the outside world may have no notion of rollback. In this context, approaches for allowing the simulation modeler to generate consistent output rely either on ad-hoc APIs (which must be provided by the underlying simulation kernel) or on the temporary suspension of processing activities in order to wait for the final outcome (commit/rollback) associated with a speculatively produced output. In this paper we present design indications and a reference implementation for an output-stream management subsystem which allows the simulation model writer to rely on standard output-generation libraries (e.g., stdio) within code blocks associated with event processing. Further, the subsystem ensures that the output produced within the parallel/distributed run is consistent, namely system-wide timestamp ordered. These features jointly provide the illusion of a classical (simple to deal with) sequential programming model, which spares the developer from being aware that the simulation program is run concurrently and speculatively. The results of an experimental assessment also show how the design/development optimizations we present impose minimal overhead on the actual simulation run, giving rise to a situation where the run is carried out with near-to-zero output management cost. At the same time, the delay for materializing the output stream (making it available for any type of audit activity) is shown to be fairly limited and constant, independently of whether the application exhibits an I/O-bound or a CPU-bound profile. Further, the whole output-stream management subsystem has been designed to provide scalability for I/O management on clusters of multi-core machines, which is achieved via a multi-level stream processing scheme.

Discrete Event Design Patterns

Maamar Hamri, Rabah Messouci and Claudia Frydman

In this paper we highlight techniques from software engineering for designing and coding the behaviors of objects. Historically, design patterns are well suited to structuring and archiving solutions to recurrent coding problems. In fact, design patterns offer interesting technical features: code readability, maintainability and safety, easy communication among designers, etc. Thus, DEVS designers may profit from this technique to design simulations. After a review of behavioral design patterns, we propose the state-event design pattern to design basic behaviors described with state machines. In this pattern we objectify events in addition to states. Then, we generalize this pattern to DEVS behaviors.

A Context-driven Approach to Scalable Human Activity Simulation

Jae Woong Lee, Sumi Helal, Yunsick Sung and Kyungeun Cho

As demands for human activity recognition technology increase, the simulation of human activities for training and testing purposes is becoming increasingly important. Traditional simulation, however, is based on an event-driven approach, which focuses on single sensor events and models within a single human activity. It requires a detailed description and processing of every low-level event that enters into an activity scenario. For many realistic and complex human scenarios, the event-driven approach burdens simulator users with the complicated low-level specifications required to configure and run the simulation. It also increases computational complexity and impedes scalable simulation. We propose a novel, context-driven approach to simulating human activities in smart spaces. In the proposed approach, vectors of sensors rather than single sensor events drive the simulation more quickly from one context to another. Abstracting the space state into contexts greatly simplifies the tasks and efforts of the simulation user in setting up and configuring the smart space and human activities. We present the context-driven simulation approach and show how it works. We present an architecture based on the Persim-3D system and provide a comparative performance study of the event- and context-driven simulation approaches.
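The core shift described above, from dispatching single sensor events to matching whole sensor vectors against known contexts, can be sketched generically as nearest-context classification. The context names and distance metric below are illustrative assumptions, not taken from Persim-3D.

```python
def match_context(sensor_vector, contexts):
    """Map one vector of sensor readings to the closest known context,
    instead of processing each low-level sensor event individually.
    `contexts` maps a (hypothetical) context name to its prototype vector."""
    def sq_dist(a, b):
        # squared Euclidean distance between two equal-length readings
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(contexts, key=lambda name: sq_dist(sensor_vector, contexts[name]))
```

A simulation step then only needs to decide which context the space has moved to, rather than enumerate every sensor firing along the way.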

Bayesian-based Scenario Generation Method for Human Activities

Yunsick Sung, Sumi Helal, Jae Woong Lee and Kyungeun Cho

Emerging smart space applications increasingly rely on capabilities for recognizing human activities. Activity recognition research is, however, challenged and slowed by the lack of data necessary for testing and validation. Collecting data through live-in trials in real-world deployments is often very expensive and complicated. Legitimate limitations on the use of human subjects also mean that much smaller datasets than desired can be collected. To address this challenge, we propose a scenario generation approach in which a small set of scenarios is used to generate new relevant and realistic scenarios, and hence increase the base of testing data needed for activity recognition validation. Unlike existing methods for generating scenarios, which usually focus on scenario structure and complexity, we propose a Bayesian-based approach that learns the stochastic characteristics of a small number of collected datasets to generate additional scenarios with similar characteristics. Our approach is prolific and can generate enormous datasets with a high degree of realism at affordable cost. The proposed approach is validated using a Viterbi-based algorithm and a real-dataset case study. The validation experiment confirms that the generated dataset has stochastic characteristics highly similar to those of the real dataset.
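The general flavor of learning stochastic characteristics from a small seed set and sampling new scenarios can be sketched with a first-order Markov chain over activities. This is a deliberately simplified stand-in for the paper's Bayesian approach; activity names are invented for illustration.

```python
import random
from collections import defaultdict

def learn_transitions(scenarios):
    """Estimate activity-to-activity transition probabilities from a small
    set of observed activity sequences (the seed scenarios)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in scenarios:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

def generate(transitions, start, length, rng=random.Random(0)):
    """Sample a new scenario whose transition statistics mirror the seeds."""
    seq = [start]
    while len(seq) < length and seq[-1] in transitions:
        nxt = transitions[seq[-1]]
        seq.append(rng.choices(list(nxt), weights=list(nxt.values()))[0])
    return seq
```

From a handful of seed sequences this yields arbitrarily many synthetic ones with matching pairwise transition frequencies, which is the cheap-data-amplification idea the abstract describes.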

CoAP-Mediated Hybrid Simulation and Visualisation Environment for Specknets

Diana Alexandra Crisan, Ion Emilian Radoi and Dk Arvind

This paper describes an integrated hybrid simulation environment in which real devices interact in real time with a discrete-event simulator and a 3-D visualisation engine. The framework, in which communication between the real and virtual worlds is mediated by CoAP, is a powerful tool for designers of Internet of Things (IoT) applications to assess design decisions ahead of deployment, based on realistic data from on-body sensors and the typical movement of people within built spaces. A motivating example is used to illustrate the capabilities of hybrid simulation: a multi-residence housing facility intended for elderly people, each wearing an on-body speck with one or more sensors that monitor their state (such as breathing, heart rate, and activity) and transmit this information via a mesh network of base stations to a central hub. The results demonstrate that design decisions on the choice of routing protocols can be made based on the real-time transmission of data that captures people's typical movement in a built environment and on real data from on-body devices.

Empirical Evaluation of Conservative and Optimistic Discrete Event Execution on Cloud and VM Platforms

Srikanth Yoginath and Kalyan Perumalla

Virtual machine (VM) technologies, especially those offered via Cloud platforms, present new dimensions with respect to performance and cost in executing parallel discrete event simulation (PDES) applications. Due to the introduction of overall cost as a metric, the choice of the highest-end computing configuration is no longer the most economical one. Moreover, runtime dynamics unique to VM platforms introduce new performance characteristics, and the variety of possible VM configurations gives rise to a range of choices for hosting a PDES run. Here, an empirical study of these issues is undertaken to guide an understanding of the dynamics, trends and trade-offs in executing PDES on VM/Cloud platforms. Performance results and cost measures are obtained from actual execution of a range of scenarios in two PDES benchmark applications on the Amazon cloud offerings and on a high-end VM host machine. The data reveals interesting insights into the new VM-PDES dynamics that come into play and also leads to counterintuitive guidelines with respect to choosing the best and second-best configurations when overall cost of execution is considered. In particular, it is found that choosing the highest-end VM configuration guarantees neither the best runtime nor the least cost. Interestingly, choosing a (suitably scaled) low-end VM configuration provides the least overall cost without adversely affecting the total runtime.

A Flexible Simulation Framework for Multicore Schedulers

Alex Aravind and Viswanathan Manickam

As multicore processors become the norm, parallel programming is expected to emerge as the mainstream software development approach. This new trend poses several challenges, including performance, power management, system utilization, and predictable response. Such demands are hard to meet without the cooperation of the hardware, the operating system, and the applications. In particular, efficient scheduling of cores to application threads is fundamentally important in assuring the above-mentioned characteristics. We believe the operating system has to take larger responsibility for ensuring efficient scheduling of threads to multicore processors. To study the performance of new scheduling algorithms for future multicore systems with hundreds and thousands of cores, we need a flexible scheduling simulation testbed. Designing such a multicore scheduling simulation testbed and illustrating its functionality by studying some well-known scheduling algorithms, such as those of Linux and Solaris, are the main contributions of this paper. The proposed scheduling simulation testbed is developed in Java and is expected to be released for public use.

Data Assimilation in Agent Based Simulation of Smart Environment

Minghao Wang and Xiaolin Hu

Location information about the occupants of a building can be applied in smart environment applications to help accomplish various tasks. Due to the nature of people's movement patterns, simulating the dynamics of occupancy in a building normally requires a bottom-up approach that aggregates the behaviors of each occupant. Agent-based simulation is useful for studying people's movement in such scenarios. These simulations provide valuable information about the movement patterns of people in building structures and thus help the design and development of egress strategies for emergency situations. However, traditional agent-based simulation is not dynamically data-driven and runs in an offline manner. As more and more buildings are equipped with sensing devices, it becomes possible to use the observation data regarding occupants' positions obtained from such devices to improve the simulation dynamically. In this paper, we propose a method to assimilate real-time sensor data into an agent-based simulation so as to obtain an accurate inference of people's location information from sensing devices to support energy control and emergency egress. We use a particle filter algorithm to estimate the occupants' locations in the framework and apply an alternative strategy to improve the performance of the estimation. This approach is general enough to be adapted to other types of agent-based models and sensor data.
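A particle filter assimilation cycle of the kind referenced above follows a standard predict/update/resample loop. The sketch below is a generic textbook version under simplifying assumptions (discrete locations, a caller-supplied motion model and sensor likelihood), not the paper's implementation.

```python
import random

def particle_filter_step(particles, move, likelihood, observation,
                         rng=random.Random(1)):
    """One data assimilation cycle for occupant location estimation."""
    # 1. predict: advance every particle with the agent-based motion model
    predicted = [move(p, rng) for p in particles]
    # 2. update: weight each particle by how well it explains the sensor reading
    weights = [likelihood(p, observation) for p in predicted]
    if sum(weights) == 0:
        weights = [1.0] * len(predicted)  # degenerate case: fall back to uniform
    # 3. resample: draw a new population proportional to the weights
    return rng.choices(predicted, weights=weights, k=len(predicted))
```

Run once per incoming sensor reading, the particle cloud concentrates on locations consistent with both the movement model and the observations.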

Parallel Stepwise Stochastic Simulation: Harnessing GPUs to Explore Possible Futures States of a Chromosome Folding Model Thanks to the Possible Futures Algorithm (PFA)

Jonathan Passerat-Palmbach, Jonathan Caux, Yannick Le Pennec, Romain Reuillon, Ivan Junier, François Kepes and David R.C. Hill

For the sake of software compatibility, simulations are sometimes parallelized without much code rewriting. While this approach can provide the expected speed-up, it is not a silver bullet. In particular, when the bottleneck of the simulation is impossible to parallelize without breaking the correctness of the model, a new model must be designed from scratch. We apply such reshaping to a chromosome folding model, in which the simulation can get stuck searching for the next valid state. The new parallel model computes several possible evolutions of the same state in parallel, thus increasing the probability of obtaining a valid state at each step. We call this technique the possible futures approach. The new model leverages GPUs in order to hide the latency induced by the computation overhead of possible futures. Thanks to this quite different design, compared with the initial sequential model, the acceptance rate of new states increases significantly without impacting the execution time. Moreover, the model proves increasingly efficient as the size of the simulated chromosome grows, whereas the initial model encounters more and more difficulties. These results were obtained using Fermi-architecture GPUs from NVIDIA, but the new model has recently shown improved performance on the cutting-edge Kepler-architecture K20 GPUs.
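The possible-futures idea can be sketched in a few lines: generate k candidate successor states for the current state and accept the first valid one. If a single candidate is valid with probability p, the probability of making progress per step rises from p to 1-(1-p)^k. This is a sequential stand-in for what the paper does on a GPU; the function names are illustrative.

```python
import random

def possible_futures_step(state, propose, is_valid, k, rng=random.Random(42)):
    """Compute k candidate successor states ('possible futures') for one
    state and return the first valid one, or None if all k are rejected.
    On a GPU each candidate would be evaluated by its own thread block;
    here they are evaluated sequentially for clarity."""
    candidates = [propose(state, rng) for _ in range(k)]
    for c in candidates:
        if is_valid(c):
            return c
    return None  # all k futures rejected; the caller retries the step
```

Compared with proposing one candidate at a time, the step rarely stalls, which is exactly the acceptance-rate improvement the abstract reports.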

Parallel Simulation of Software Defined Networks

Dong Jin and David Nicol

Existing network architectures fall short when handling networking trends, e.g., mobility, server virtualization, and cloud computing, as well as rapidly changing market requirements. Software-defined networking (SDN) is designed to transform network architectures by decoupling the control plane from the data plane. Intelligence is shifted to the logically centralized controller with direct programmability, and the underlying infrastructure is abstracted from applications. The wide adoption of SDN in the networking industry motivates the development of large-scale, high-fidelity testbeds for the evaluation of systems that incorporate SDN. We leverage our prior work on a hybrid network testbed with a parallel network simulator and a virtual-machine-based emulation system. In this paper, we extend the testbed to support OpenFlow-based SDN simulation and emulation, show how to exploit typical SDN controller behavior to deal with potential performance issues caused by the centralized controller in parallel discrete-event simulation, and investigate methods for improving model scalability, including an asynchronous synchronization algorithm for passive controllers and a two-level architecture for active controllers. The techniques not only improve simulation performance but are also valuable for designing scalable SDN controllers in real networks.

On the Parallel Simulation of Scale-Free Networks

Robert Pienta and Richard Fujimoto

Scale-free networks have received much attention in recent years due to their prevalence in many important applications such as social networks, biological systems, and the Internet. We consider the use of conservative parallel discrete event simulation techniques in network simulation applications involving scale-free networks. An analytical model is developed to study the parallelism available in simulations using a conservative time window synchronization algorithm. The performance of scale-free network simulations using two variants of the Chandy/Misra/Bryant synchronization algorithm is evaluated. These results demonstrate the importance of topology in the performance of synchronization protocols when developing parallel discrete event simulations involving scale-free networks, and highlight important challenges such as performance bottlenecks that must be addressed to achieve efficient parallel execution. These results suggest that new approaches to parallel simulation of scale-free networks may offer significant benefit.
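The topology property at issue here, a few heavily connected hubs among many low-degree nodes, is commonly produced by preferential attachment. The generic Barabási-Albert-style generator below (not the paper's code) makes it easy to see why hub nodes concentrate synchronization traffic in a conservative simulation.

```python
import random

def barabasi_albert(n, m, rng=random.Random(0)):
    """Grow a scale-free graph of n nodes: each new node attaches to m
    targets chosen with probability proportional to their current degree,
    so early nodes tend to become high-degree hubs."""
    edges = []
    targets = list(range(m))  # attachment points for the next new node
    repeated = []             # node ids, repeated once per incident edge
    for v in range(m, n):
        for t in set(targets):          # deduplicate repeated targets
            edges.append((v, t))
            repeated.extend([v, t])
        # degree-proportional sampling == uniform choice from `repeated`
        targets = [rng.choice(repeated) for _ in range(m)]
    return edges
```

An LP hosting a hub node must exchange synchronization messages with many neighbors, which is the bottleneck effect the abstract analyzes.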

Modeling and Simulation of Crowd using Cellular Discrete Event Systems Theory

Ronnie Farrell, Mohammad Moallemi, Sixuan Wang, Wang Xiang and Gabriel Wainer

In this paper, we discuss how the Cellular Discrete Event System Specification (Cell-DEVS) theory can be used in the modeling and simulation of crowds. We show that the efficient cell-update mechanism of Cell-DEVS allows for more efficient entity-based simulation of crowds than cellular automata. In addition, the formal interfacing mechanisms provided by this theory allow for the integration of other components, such as DEVS atomic processing components or visualization and building information modeling components, with the Cell-DEVS model. Finally, we describe in detail the design and development of several pedestrian models and present the results.

Simulation-based Verification of Hybrid Automata Stochastic Logic Formulas for Stochastic Symmetric Nets

Marco Beccuti, Elvio Gilberto Amparore, Susanna Donatelli, Benoît Barbot and Giuliana Franceschinis

The Hybrid Automata Stochastic Logic (HASL) has recently been defined as a flexible way to express classical performance measures as well as more complex, path-based ones (generically called "HASL formulas"). The considered paths are executions of Generalized Stochastic Petri Nets (GSPN), an extension of the basic Petri net formalism for defining discrete event stochastic processes. The computation of the HASL formulas for a GSPN model is delegated to the COSMOS tool, which applies simulation techniques to the formula computation. Stochastic Symmetric Nets (SSN) are a high-level Petri net formalism of the colored type, in which tokens can have an identity, and it is well known that colored Petri nets allow one to describe systems in a more compact and parametric form than basic (uncolored) Petri nets. In this paper we propose to extend HASL and COSMOS to support colors, so that performance formulas for SSN can be easily defined and evaluated. This requires a new definition of the logic, to ensure that colors are taken into account in a correct and useful manner, and a significant extension of the COSMOS tool.

Grand Challenges in Modeling and Simulation: Expanding Our Horizons

Simon Taylor, Osman Balci, Wentong Cai, Margaret Loper, David Nicol and George Riley

There continue to be many advances in the theory and practice of Modeling & Simulation (M&S). However, some issues can be considered Grand Challenges: issues whose solutions require significant focused effort across a community, sometimes with ground-breaking collaborations with new disciplines. In 2002 the first M&S Grand Challenges Workshop was held in Dagstuhl, Germany, in an attempt to focus efforts on key areas. In 2012 a new initiative was launched to continue these Grand Challenge efforts. In this third Grand Challenges panel, the members present their views on M&S Grand Challenges. Themes presented in this panel include Modeling and Simulation Methodology; Agent-based Modeling and Simulation; Modeling and Simulation in Systems Engineering; Cyber Systems Modeling; and Network Simulation.

Warp Speed: Executing Time Warp on 1,966,080 Cores

Peter D. Barnes Jr., Christopher D. Carothers, David R. Jefferson and Justin M. Lapre

Time Warp is a parallel discrete-event simulation synchronization protocol that automatically uncovers the available parallelism in a model through its error detection and rollback recovery mechanism. In this paper, we present the performance results of ROSS executing the Time Warp synchronization protocol using up to 7.8M MPI tasks on 1,966,080 cores of the Sequoia Blue Gene/Q supercomputer system. For the PHOLD benchmark model, we demonstrate the ability to process 33 trillion events in 65 seconds yielding a peak event-rate in excess of 504 billion events/second using 120 racks of Sequoia. This is by far the highest event-rate reported by any massively parallel simulation to date running the PHOLD benchmark. In terms of overall speedup, we report 97x performance improvement when scaling from 32,768 to 1,966,080 cores. This super-linear performance is attributed to significant cache performance improvements when running at peak scale. From these performance results, we devise a new, long range performance metric, called Warp Speed, which grows linearly with an exponential increase in the PHOLD event-rate. At present, we are now at Warp Speed 2.7. It will be nearly 150 years before we expect to reach Warp Speed 10.0.
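The rollback recovery mechanism at the heart of Time Warp can be sketched generically: each LP saves checkpoints as it speculates forward, and a straggler message (one timestamped earlier than the local virtual time) forces a rollback to the latest checkpoint preceding it. This is a minimal illustration of the protocol, not ROSS code; the LP state here is just an event counter.

```python
def timewarp_receive(lp, event_time):
    """Optimistically process one incoming event for an LP represented as a
    dict with keys lvt, state, rollbacks, and saved (checkpoint list).
    Assumes a checkpoint at time 0 always exists and timestamps are >= 0."""
    if event_time < lp["lvt"]:                        # straggler detected
        # discard speculated checkpoints newer than the straggler...
        lp["saved"] = [(t, s) for (t, s) in lp["saved"] if t < event_time]
        lp["lvt"], lp["state"] = lp["saved"][-1]      # ...and restore the last one
        lp["rollbacks"] += 1
    # process the event (the toy state just counts processed events)
    lp["state"] += 1
    lp["lvt"] = event_time
    lp["saved"].append((event_time, lp["state"]))
```

Processing events at times 5 and 10 and then receiving a straggler at time 7 rolls the LP back to its time-5 checkpoint before the straggler is (re)processed, leaving the event count causally consistent.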

