Large-scale information gathering is becoming increasingly common with the growing popularity of smartphones, GPS, social networks, and sensor networks. Services based on this real-time data are the logical next step. Service Oriented Stream Systems (SOSS) focus on one-time ad hoc queries as opposed to continuous queries. High availability is crucial in these services. However, the data replication used to achieve it has inherent costs, which are particularly burdensome for high-rate, often overloaded SOSS. To provide high availability while coping with system overload, we propose a mechanism called soft quorums.
Soft quorums incorporate a tuning knob that provides a trade-off between query result accuracy and performance. Thus, in essence, soft quorums simultaneously offer high availability and per-query load shedding as needed, in a system-wide optimal way. The parameter choices of soft quorums automatically adapt to dynamic data stream rates and query rates, and minimize the overall system load for a given accuracy requirement. We devise a recovery algorithm and study data quality after recovery. Finally, we conduct a comprehensive experimental study using two real-world datasets and some synthetic datasets.