Computer Systems Laboratory - University of Louisville

Generalized Optimal Response Time Retrieval of Replicated Data from Storage Arrays

Paper

[TOS '13] PDF, BibTex

Disk Parameters

The speed of a disk is defined as the average time it takes to retrieve a single data block from that disk. Our experiments use the disk specifications given in Table 1.

**Table 1:** Disk Specifications
Producer	Model	Type	RPM	Speed (ms)
Seagate	Barracuda	HDD	7.2K	13.2
WD	Raptor	HDD	10K	8.3
Seagate	Cheetah	HDD	15K	6.1
OCZ	Vertex	SSD	-	0.5
Intel	X25-E	SSD	-	0.2

Experiments

Table 2 summarizes all of the experiments. Delay and initial load values are given in milliseconds. R(2,10,2) means that a value is chosen at random from the set {2, 4, 6, 8, 10}. If the system is homogeneous, the properties of the Cheetah disk are used for all disks in the system. If the system is heterogeneous, the disks are chosen randomly from the disk group indicated in the table. Disk groups can be HDDs, SSDs, or HDDs+SSDs.
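The parameter conventions above can be sketched in a few lines of Python. This is only an illustration, not the code used for the experiments: the function names `r` and `pick_disks`, the dictionary keys, and the assumption that every random choice is uniform are all ours. The speeds are copied from Table 1.

```python
import random

# Average single-block retrieval times in ms, copied from Table 1.
DISK_SPEED_MS = {
    "barracuda": 13.2,
    "raptor": 8.3,
    "cheetah": 6.1,
    "vertex": 0.5,
    "x25-e": 0.2,
}

# Disk groups as named in Table 2 (the keys are illustrative labels).
DISK_GROUPS = {
    "hdd": ["barracuda", "raptor", "cheetah"],
    "ssd": ["vertex", "x25-e"],
}
DISK_GROUPS["ssd+hdd"] = DISK_GROUPS["ssd"] + DISK_GROUPS["hdd"]


def r(start, stop, step, rng=random):
    """Pick one value from start, start+step, ..., stop (inclusive).

    Reads R(2,10,2) as (start, stop, step); uniform choice is an assumption.
    """
    return rng.choice(range(start, stop + 1, step))


def pick_disks(count, properties, group, rng=random):
    """Return the disk models for one site with `count` disks."""
    if properties == "homogeneous":
        # Homogeneous systems use the Cheetah disk everywhere.
        return ["cheetah"] * count
    # Heterogeneous systems draw each disk from the indicated group.
    return [rng.choice(DISK_GROUPS[group]) for _ in range(count)]
```

For example, `r(2, 10, 2)` yields one of 2, 4, 6, 8, or 10, and `pick_disks(4, "heterogeneous", "ssd+hdd")` draws four models from all five disks in Table 1.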

**Table 2:** Experiment Parameters
Experiment	Sites	Disk Properties	Site 1 Disks	Site 1 Delays (ms)	Site 1 Loads (ms)	Site 2 Disks	Site 2 Delays (ms)	Site 2 Loads (ms)
1	1	homogeneous	cheetah	0	0	-	-	-
2	1	heterogeneous	ssd	0	0	-	-	-
3	1	heterogeneous	hdd	0	0	-	-	-
4	1	heterogeneous	ssd+hdd	0	0	-	-	-
5	1	heterogeneous	ssd+hdd	R(2,10,2)	R(2,10,2)	-	-	-
6	2	homogeneous	cheetah	0	0	cheetah	0	0
7	2	homogeneous	cheetah	0	0	cheetah	0	20
8	2	homogeneous	cheetah	0	5	cheetah	0	15
9	2	homogeneous	cheetah	0	10	cheetah	0	10
10	2	homogeneous	cheetah	0	15	cheetah	0	5
11	2	homogeneous	cheetah	0	20	cheetah	0	0
12	2	homogeneous	cheetah	0	0	cheetah	20	0
13	2	homogeneous	cheetah	5	0	cheetah	15	0
14	2	homogeneous	cheetah	10	0	cheetah	10	0
15	2	homogeneous	cheetah	15	0	cheetah	5	0
16	2	homogeneous	cheetah	20	0	cheetah	0	0
17	2	heterogeneous	ssd	0	0	hdd	0	0
18	2	heterogeneous	hdd	0	0	ssd	0	0
19	2	heterogeneous	ssd+hdd	0	0	ssd+hdd	0	0
20	2	heterogeneous	ssd+hdd	R(2,10,2)	R(2,10,2)	ssd+hdd	R(2,10,2)	R(2,10,2)

Results

The results of the experiments defined in Table 2 are provided in Table 3.

**Table 3:** Experimental Results
Experiment	All Results
Experiment 1	PDF
Experiment 2	PDF
Experiment 3	PDF
Experiment 4	PDF
Experiment 5	PDF
Experiment 6	PDF
Experiment 7	PDF
Experiment 8	PDF
Experiment 9	PDF
Experiment 10	PDF
Experiment 11	PDF
Experiment 12	PDF
Experiment 13	PDF
Experiment 14	PDF
Experiment 15	PDF
Experiment 16	PDF
Experiment 17	PDF
Experiment 18	PDF
Experiment 19	PDF
Experiment 20	PDF

Contact

You can send an e-mail to this address for any questions.

Integrated Maximum Flow Algorithm for Optimal Response Time Retrieval of Replicated Data

Paper

[ICPP '12] PDF, BibTex

Disk Parameters

The speed of a disk is defined as the average time it takes to retrieve a single data block from that disk. Our experiments use the disk specifications given in Table 1.

**Table 1:** Disk Specifications
Producer	Model	Type	RPM	Speed (ms)
Seagate	Barracuda	HDD	7.2K	13.2
WD	Raptor	HDD	10K	8.3
Seagate	Cheetah	HDD	15K	6.1
OCZ	Vertex	SSD	-	0.5
Intel	X25-E	SSD	-	0.2

Experiments

**Table 2:** Experiment Parameters
Experiment	Sites	Disk Properties	Site 1 Disks	Site 1 Delays (ms)	Site 1 Loads (ms)	Site 2 Disks	Site 2 Delays (ms)	Site 2 Loads (ms)
1	2	homogeneous	cheetah	0	0	cheetah	0	0
2	2	heterogeneous	ssd	0	0	hdd	0	0
3	2	heterogeneous	hdd	0	0	ssd	0	0
4	2	heterogeneous	ssd+hdd	0	0	ssd+hdd	0	0
5	2	heterogeneous	ssd+hdd	R(2,10,2)	R(2,10,2)	ssd+hdd	R(2,10,2)	R(2,10,2)

Results

The results of the experiments defined in Table 2 are provided in Table 3.

**Table 3:** Experimental Results
Experiment	All Results
Experiment 1	PDF
Experiment 2	PDF
Experiment 3	PDF
Experiment 4	PDF
Experiment 5	PDF

Contact

You can send an e-mail to this address for any questions.

Equivalent Disk Allocations

Paper

[TPDS '12] PDF, Supplementary File, BibTex

Periodic Disk Allocations with Best Additive Error and Threshold

Efficient retrieval of a range query is challenging. Multi-disk architectures offer the opportunity to exploit I/O parallelism during retrieval. A common approach to efficient parallel I/O is to partition the data space into disjoint regions and allocate the data to multiple disks. When users issue a query, the data falling into disjoint partitions is retrieved in parallel from multiple disks. This technique is referred to as declustering: distributing data across multiple I/O devices so that queries can be served in parallel.

The additive error of a range query is the difference between its actual retrieval cost and the optimal retrieval cost. The additive error of a declustering scheme is the maximum additive error over all queries. The threshold of a declustering scheme is k if all spatial range queries with at most k buckets can be retrieved optimally. It is desirable to find declustering schemes with low additive error and high threshold. Periodic disk allocations yield good results; however, the number of periodic disk allocations is large, and finding the ones with the best additive error and threshold is not easy.
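As a small illustration of the additive error of a single query, the sketch below assumes that retrieval cost is counted in buckets read from the busiest disk, so that the optimal cost for a query of b buckets on N disks is ceil(b / N). The function name `additive_error` and its arguments are hypothetical and are not taken from the papers.

```python
from collections import Counter
from math import ceil


def additive_error(bucket_disks, num_disks):
    """Additive error of one non-empty query.

    bucket_disks: the disk id of every bucket the query touches.
    num_disks:    N, the number of disks in the system.
    """
    # Actual cost: buckets read from the busiest disk (assumed cost model).
    actual = max(Counter(bucket_disks).values())
    # Optimal cost: the buckets spread as evenly as possible over N disks.
    optimal = ceil(len(bucket_disks) / num_disks)
    return actual - optimal
```

A query whose four buckets sit on disks 0, 0, 1, and 2 of a four-disk system has actual cost 2 and optimal cost 1, so its additive error is 1.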

Here, we share our recent research findings by providing the periodic disk allocations that give the best additive error and threshold for 2-, 3-, and 4-dimensional databases.

Format of the Files

The first column is N, the number of disks in the system.
The second column is the best additive error or threshold for the N disks given in the first column.
The third column is the allocation that yields the best additive error or threshold.
We use the notation (a1, a2, ..., ad) for the d-dimensional disk allocation that stores bucket (i1, i2, ..., id) on disk (a1*i1 + a2*i2 + ... + ad*id) mod N.
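To make the allocation notation concrete, the following sketch maps a bucket to its disk and brute-forces the worst additive error over small range queries. It is an illustration under stated assumptions rather than the search procedure used to produce the files: the names `disk_of` and `worst_additive_error` and the `side` limit are ours, and cost is again counted as buckets on the busiest disk. Because shifting a query under a periodic allocation only adds a constant (mod N) to every disk id, it is enough to check queries anchored at the origin.

```python
from itertools import product
from math import ceil


def disk_of(bucket, coeffs, num_disks):
    """Disk for bucket (i1, ..., id) under allocation (a1, ..., ad)."""
    return sum(a * i for a, i in zip(coeffs, bucket)) % num_disks


def worst_additive_error(coeffs, num_disks, side):
    """Largest additive error over range queries with sides of at most `side`."""
    worst = 0
    # Enumerate every query shape; the position does not matter (see above).
    for shape in product(range(1, side + 1), repeat=len(coeffs)):
        counts = [0] * num_disks
        for bucket in product(*(range(s) for s in shape)):
            counts[disk_of(bucket, coeffs, num_disks)] += 1
        optimal = ceil(sum(counts) / num_disks)
        worst = max(worst, max(counts) - optimal)
    return worst
```

For instance, with N = 2 the allocation (1, 1) is a checkerboard and `worst_additive_error((1, 1), 2, 2)` returns 0, while (1, 0) places a whole column on one disk and `worst_additive_error((1, 0), 2, 2)` returns 1.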

Results

Dimensionality	Additive Error	Threshold
2 Dimensions	txt	txt
3 Dimensions	txt	txt
4 Dimensions	txt	txt

Contact

You can send an e-mail to this address for any questions.
