layout.py revision 1452
# CDDL HEADER START
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
# You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
# See the License for the specific language governing permissions
# and limitations under the License.
# When distributing Covered Code, include this CDDL HEADER in each
# file and include the License file at usr/src/OPENSOLARIS.LICENSE.
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
# CDDL HEADER END
# Copyright 2009 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
"""object to map content hashes to file paths

The Layout class hierarchy encapsulates bijective mappings between a hash
(or file name, since those are equivalent in our system) and a relative path
that describes where to place that file in the file system. This bijective
relation should hold when the union of all layouts is considered as a single
set of mappings. In practical terms, this means that only one layout may
potentially deposit a hash into any particular location. This is not a
difficult requirement to satisfy since each layout may append a unique
identifier to the file name or choose to carve out its own namespace at some
level of the directory hierarchy.

The V1Layout places each file into a single layer of 256 directories. A
fanout of 256 provides good performance compared to the other layouts
tested. It also allows over 8M files to be stored even with filesystems
which limit the number of files in a directory to 65k.

The V0Layout uses two layers of directories; the first has a fanout
of 256 while the second has a fanout of 16M. This layout has the problem
that for the sizes of images (on the order of 300-500k files) and repos (on
the order of 1M files), each second-level directory usually contains a single
file. This imposes a substantial penalty for removing or resyncing the
directories because a readdir(3C) must be done for each directory, and
readdir is two orders of magnitude slower than the open or read ZFS
operations, and one order of magnitude slower than ZFS remove. Reducing
the number of directories used to hold the downloaded files was a goal for
the next layout.

To evaluate a layout, it is necessary to measure the insertion time, the
removal time, and the time to open a random file. The insertion time
affects the publication speed. The removal time affects the time a client
may take to clear its download cache. The access time affects how quickly
a server can open a file to serve it. File sizes from 1 to 10M were used
to assess the scalability of the different layouts."""
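
# The capacity and sparseness figures quoted in the docstring can be checked
# with a little arithmetic. This is only a sketch: the names below are
# hypothetical, "65k" is read conservatively as 65000, and hashes are assumed
# to spread uniformly over the directories.

```python
import math

FIRST_LAYER = 16 ** 2       # 256 directories (two hex digits)
SECOND_LAYER = 16 ** 6      # 16,777,216 ("16M") directories (six hex digits)
PER_DIR_LIMIT = 65000       # assumed reading of the "65k" per-directory limit


def v1_capacity(per_dir_limit=PER_DIR_LIMIT):
    """Upper bound on files held by one layer of 256 directories."""
    return FIRST_LAYER * per_dir_limit


def expected_occupied(bins, files):
    """Expected number of non-empty bins after uniform random placement."""
    # Equivalent to bins * (1 - (1 - 1/bins) ** files), but computed with
    # log1p/expm1 so the result stays accurate when bins is very large.
    return -bins * math.expm1(files * math.log1p(-1.0 / bins))


# A V0-style layout has 256 * 16M possible second-level directories.
v0_leaf_dirs = FIRST_LAYER * SECOND_LAYER
repo_files = 1000000
files_per_leaf = repo_files / expected_occupied(v0_leaf_dirs, repo_files)
```

# Under these assumptions v1_capacity() comfortably exceeds the 8M quoted
# above, and files_per_leaf stays within a fraction of a percent of 1, which
# is why nearly every second-level directory ends up holding a single file.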
class Layout(object):
    """This class is the parent class to all layouts. It defines the
    interface which those subclasses must satisfy."""

    # The "def" line is absent from this listing; the name "lookup" is assumed.
    def lookup(self, hashval):
        """Return the path to the file with name "hashval"."""
        raise NotImplementedError
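
# One way the two layouts described in the module docstring could satisfy this
# interface is sketched below. The Sketch* class names, the method names
# "lookup" and "path_to_hash", and the exact slice boundaries are assumptions
# inferred from the stated fanouts (256 = two hex digits, 16M = six hex
# digits); the SHA-1 digest is used only to produce a sample hex name.

```python
import hashlib
import os


class SketchLayout(object):
    """Hypothetical stand-in for the Layout interface."""

    def lookup(self, hashval):
        raise NotImplementedError

    def path_to_hash(self, path):
        # The file name is the hash, so the inverse mapping is simply the
        # last path component.
        return os.path.basename(path)


class SketchV0Layout(SketchLayout):
    """Two directory layers: 2 hex digits (256), then 6 hex digits (16M)."""

    def lookup(self, hashval):
        return os.path.join(hashval[0:2], hashval[2:8], hashval)


class SketchV1Layout(SketchLayout):
    """One directory layer: the first 2 hex digits (256 directories)."""

    def lookup(self, hashval):
        return os.path.join(hashval[0:2], hashval)


example_hash = hashlib.sha1(b"example payload").hexdigest()
v0_path = SketchV0Layout().lookup(example_hash)
v1_path = SketchV1Layout().lookup(example_hash)
```

# In this sketch the two layouts never collide, because a V0-style path has
# three components and a V1-style path has two. That keeps the union of the
# mappings bijective, and path_to_hash() recovers the hash from either path.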
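def get_default_layouts():
    # The body is absent from this listing. Returning both layouts with the
    # preferred one first is an assumption.
    return [V1Layout(), V0Layout()]

def get_preferred_layout():
    return V1Layout()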
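
# The three measurements named in the module docstring (insertion, random
# open, removal) could be collected along the following lines. This is a
# rough sketch rather than the harness used for the original tests: the
# function names, the file count, the one-byte payload, and the V1-style
# mapping are all assumptions.

```python
import hashlib
import os
import random
import shutil
import tempfile
import time


def v1_lookup(hashval):
    # Assumed V1-style mapping: one layer keyed by the first two hex digits.
    return os.path.join(hashval[0:2], hashval)


def time_layout(lookup, count, root):
    """Return total seconds spent inserting, opening, and removing files."""
    names = [hashlib.sha1(str(i).encode("ascii")).hexdigest()
             for i in range(count)]
    timings = {}

    # Insertion: create each file at the location chosen by the layout.
    start = time.perf_counter()
    for name in names:
        path = os.path.join(root, lookup(name))
        parent = os.path.dirname(path)
        if not os.path.isdir(parent):
            os.makedirs(parent)
        with open(path, "wb") as f:
            f.write(b"x")
    timings["insert"] = time.perf_counter() - start

    # Access: open and read a random sample of at most 100 files.
    sample = random.sample(names, min(count, 100))
    start = time.perf_counter()
    for name in sample:
        with open(os.path.join(root, lookup(name)), "rb") as f:
            f.read()
    timings["open"] = time.perf_counter() - start

    # Removal: delete every top-level directory, as a cache clear would.
    start = time.perf_counter()
    for entry in os.listdir(root):
        shutil.rmtree(os.path.join(root, entry))
    timings["remove"] = time.perf_counter() - start
    return timings


def run_benchmark(count=200):
    root = tempfile.mkdtemp()
    try:
        return time_layout(v1_lookup, count, root)
    finally:
        shutil.rmtree(root, ignore_errors=True)


results = run_benchmark()
```

# Passing a different lookup function to time_layout() allows layouts to be
# compared on the same file set. Meaningful numbers would need far larger
# counts than the default used here, and the results depend on the underlying
# filesystem.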