Clustera: An Integrated Computation And Data Management System
File(s)
Date
2007Author
DeWitt, David J.
Robinson, Eric
Shankar, Srinath
Paulson, Erik
Naughton, Jeffrey
Royalty, Joshua
Krioukov, Andrew
Publisher
University of Wisconsin-Madison Department of Computer Sciences
Metadata
Show full item recordAbstract
This paper introduces Clustera, an integrated computation and data management system. In contrast to traditional cluster-management systems that target specific types of workloads, Clustera is designed for extensibility, enabling the system to be easily extended to handle a wide variety of job types ranging from computationally-intensive, long-running jobs with minimal I/O requirements to complex SQL queries over massive relational tables. Another unique feature of Clustera is the way in which the system architecture exploits modern software building blocks including application servers and relational database systems in order to realize important performance, scalability, portability and usability benefits. Finally, experimental evaluation suggests that Clustera has good scale-up properties for SQL processing, that Clustera delivers performance comparable to Hadoop for MapReduce processing and that Clustera can support higher job throughput rates than previously published results for the Condor and CondorJ2 batch computing systems.
Permanent Link
http://digital.library.wisc.edu/1793/60638Citation
TR1637