[ecoop-info] CFP: IEEE TARDIS2011 (1st International Workshop on fault Tolerant Architectures for Reliable Distributed Infrastructures and Services)

Parastoo Mohagheghi Parastoo.Mohagheghi at sintef.no
Tue Jun 14 15:41:59 CEST 2011

Dear Colleague,

We apologize if you receive this CfP multiple times!

1st International Workshop on fault Tolerant Architectures for Reliable Distributed Infrastructures and Services (TARDIS2011)
to be held at the
4th IEEE International Conference on Utility and Cloud Computing (UCC 2011)
5th-8th December in Melbourne, Australia

Important Dates

 Deadline for contribution submissions: 17 July 2011
 Acceptance notification: 30 August 2011
 Deadline for camera ready contributions: 25 September 2011


“Not letting the Sky fall down” [1,2,3]

[1] http://www.zdnet.co.uk/blogs/mapping-babel-10017967/aws-disrupted-by-us-east-coast-failure-10022283/
[2] http://justinsb.posterous.com/aws-down-why-the-sky-is-falling
[3] http://aws.amazon.com/message/65648/

Cloud computing has shifted the center of gravity of distributed application execution by exploiting virtualization at different layers and by adding a level of complexity to the scheduling problem. While Cloud computing can bring more flexibility to the design of applications, it also raises new research challenges. Compared with the traditional method of dedicating one server to a single application, consolidation through virtualization can boost resource utilization by aggregating workloads from separate machines onto a small number of servers. Workloads can now be executed in a dense environment using far fewer machines, in which the impact of faults can be vastly magnified. For example, a single hardware failure will affect all the virtual servers on that physical machine, and under dynamic workloads it may be difficult to distinguish real faults from normal system behavior.

Revisiting fault tolerance concepts is fundamental when provisioning is left to public Cloud infrastructures, where a budget must be met. Different strategies can be tailored to this end, from hybrid architectures to service distribution across cloud providers. Additionally, cloud providers typically establish Service Level Agreements (SLAs) with their customers and must enforce the agreed Quality of Service (QoS) in their infrastructures in an unreliable and highly dynamic environment.

Cloud computing is playing an increasingly important role in current distributed computing, which involves a wide community. The Cloud provides a scalable computational model in which users access services based on their requirements, without regard to where the services are hosted or how they are delivered: processing power, storage, network bandwidth or software usage can be provided as services over the Internet. Consequently, applications developed on such on-demand infrastructures can be built upon more flexible principles, making them more fault tolerant, more resilient and more dynamic. Although past research on fault tolerance in distributed systems has produced a wide collection of algorithms for fault detection, identification and correction, these concepts will have to be revisited in the context of Cloud computing.

Papers on all aspects of fault tolerance and reliability in private, public and hybrid Clouds are invited, and the call is open to all members of the Grid and Cluster Computing community.

Topics of interest include:


- Application-level (including workflows or any other problem solving environment), middleware-level, or virtual and physical resource-level fault tolerance techniques.
- Programming models for Cloud computing that include fault tolerance.
- Fault detection and identification techniques in Cloud computing.
- Fault diagnosis systems in Cloud computing.
- Fault tolerance recovery techniques in Cloud computing.
- Cloud computing fault taxonomies.
- Fault prediction techniques and models in Cloud computing.
- Fault tolerance in resource provision (SLA level and provision policies).
- The relationship between Quality of Service and fault tolerance in *aaS.
- Fault tolerance solutions from distributed computing environments other than the Cloud that would benefit this paradigm.

Submitted papers must represent original unpublished research that is not currently under review for any other workshop, conference or journal. Submissions must be made via EasyChair (http://www.easychair.org/conferences/?conf=tardis2011). Workshop papers must be written in English and must not exceed 6 pages in IEEE format. Additional pages may be purchased (in some circumstances) subject to approval of the proceedings chair.

Papers will be reviewed by the Program Committee, and authors of accepted papers will present their work at the workshop. Workshop proceedings will be published as part of the IEEE UCC 2011 proceedings.

Please note that at least one author of each accepted submission must attend the workshop and register for UCC 2011 (registration information will be provided on the UCC 2011 main Web site).


Workshop Chairs:

- Jose Luis Vazquez-Poletti (Universidad Complutense de Madrid, Spain)
- Rafael Tolosana-Calasanz (Universidad de Zaragoza, Spain)
- Jose Angel Bañares (Universidad de Zaragoza, Spain)

Program Committee:

- Rajkumar Buyya (UMelb, Australia)
- Agustin C. Caminero (UNED, Spain)
- Jesus Carretero (UC3M, Spain)
- Jinjun Chen (UTS, Australia)
- Jose Cunha (UNL, Portugal)
- Patrizio Dazzi (CNR, Italy)
- Ewa Deelman (USC, USA)
- Wolfgang Gentzsch (DEISA, Germany)
- Matti Hiltunen (AT&T, USA)
- Eduardo Huedo (UCM, Spain)
- Marco Lackovic (UniCal, Italy)
- Charles Loomis (LAL, France)
- Patrick Martin (QueensU, Canada)
- Parastoo Mohagheghi (SINTEF, Norway)
- Alberto Nuñez (UC3M, Spain)
- Dana Petcu (UVT, Romania)
- Radu Prodan (UIBK, Austria)
- Omer Rana (UCardiff, UK)
- Domenico Talia (UniCal, Italy)
- Ian Taylor (UCardiff, UK)
- Johan Tordsson (UMU, Sweden)

More to be confirmed!
