catalog.py revision 409
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
# See the License for the specific language governing permissions
# and limitations under the License.
# When distributing Covered Code, include this CDDL HEADER in each
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
# Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
# Use is subject to license terms.

"""Interfaces and implementation for the Catalog object, as well as functions
that operate on lists of package FMRIs."""

"""A Catalog is the representation of the package FMRIs available to
this client or repository.  Both purposes utilize the same storage

The serialized structure of the repository is an unordered list of
available package versions, followed by an unordered list of
incorporation relationships between packages.  This latter section
allows the graph to be topologically sorted by the client.

S Last-Modified: [timespec]

XXX A authority mirror-uri ...

In order to improve the time to search the catalog, a cached list
of package names is kept in the catalog instance.  In an effort
to prevent the catalog from having to generate this list every time
it is constructed, the array that contains the names is pickled and
saved as pkg_names.pkl.

# XXX Mirroring records also need to be allowed from client
# configuration, and not just catalogs.

# XXX It would be nice to include available tags and package sizes,
# although this could also be calculated from the set of manifests.
# XXX Current code is O(N_packages) O(M_versions), should be
# O(1) O(M_versions), and possibly O(1) O(1).

# XXX Initial estimates suggest that the Catalog could be composed of
# 1e5 - 1e7 lines.  Catalogs across these magnitudes will need to be
# spread out into chunks, and may require a delta-oriented update

"""Create a catalog.  If the path supplied does not exist,
this will create the required directory structure.
Otherwise, if the directories are already in place, the
existing catalog is opened.  If pkg_root is specified
and no catalog is found at cat_root, the catalog will be
rebuilt.  authority names the authority that
is represented by this catalog."""

# We need to lock the search database against multiple
# simultaneous updates from separate threads closing
# publication transactions.

# Rebuild catalog, if we're the depot and it's necessary

# Load the list of pkg names.  If it doesn't exist, build a list
# of pkg names.  If the catalog gets rebuilt in build_catalog,
# add_fmri() will generate the list of package names instead.

"""Add a package, named by the fmri, to the catalog.
Throws an exception if an identical package is already
present.  Throws an exception if package has no version."""

"Unversioned FMRI not supported: %s" % fmri

# Callers should verify that the FMRI they're going to add is
# valid; however, this check is here in case they're

"Existing renames make adding FMRI %s invalid." \

"Package %s is already in the catalog" % \

# Add this pkg name to the list of package names

"""Perform any catalog transformations necessary if
prefix p is found in the catalog.  Previously, we didn't
know how to handle this prefix and now we do.  If we
need to transform the entry from server to client form,
make sure that happens here."""

"""Takes the list of in-memory attributes and returns
a list of strings, each string naming an attribute."""

s = "S %s: %s\n" % (k, v)
"""Helper method that takes the full path to the package
directory and the name of the manifest file, and returns an FMRI
constructed from the information in those components."""

"""If this version of the catalog knows about new prefixes,
check the on disk catalog to see if we can perform any
transformations based upon previously unknown catalog formats.

This routine will add a catalog attribute if it doesn't exist,
otherwise it checks this attribute against a hard-coded
version-specific tuple to see if new methods were added.

If new methods were added, it will call an additional routine
that updates the on-disk catalog, if necessary."""

# If a prefixes attribute doesn't exist, write one and get on

# Prefixes attribute does exist.  Check if it has changed.

# Nothing to do if prefixes haven't changed

# If known_prefixes contains a prefix not in pfx_set,
# add the prefix and perform a catalog transform.

# Write out updated prefixes list

"""Walk the on-disk package data and build (or rebuild) the
package catalog and search database."""

# XXX eschew os.walk in favor of another os.listdir here?

# XXX force a rebuild despite mtimes?

# XXX queue this and fork later?

# XXX force a rebuild despite mtimes?

# If the database doesn't exist, don't bother
# building the list; we'll just build it all.

# If we have no updates to make to the search database but it
# already exists, just make it available.  If we do have updates
# to make (including possibly building it from scratch), fork it
# off into another process; when that's done, we'll mark it

"Failed to open search database", \
"for writing: %s (errno=%s)" % \

"Failed to open search " + \
"database: %s (errno=%s)" % \

# If we are in a subthread already,
# the signal method will not work.

# On non-unix, where there is no convenient
# way to fork subprocesses, just update the

"""Handler method for the SIGCLD signal.  Checks to see if the
search database update child has finished, and enables searching
if it finished successfully, or logs an error if it didn't."""

"Failed to open search database", \
"for writing: %s (errno=%s)" % \
"Failed to open search " + \
"database: %s (errno=%s)" % \
# XXX This should be logged instead
print "ERROR building search database:"

# Since we're here explicitly to update
# the database, if we fail, there's

"Failed to open search database", \
"for writing: %s (errno=%s)" % \
"Failed to open search database", \
"for writing: %s (errno=%s)" % \
# XXX We should probably iterate over the catalog, for
# cases where manifests have stuck around, but have been
# moved to historical and removed from the catalog.

"""Update the search database with the FMRIs passed in via
'fmri_list'.  If 'fmri_list' is empty or None, then rebuild the
database from scratch.  'fmri_list' should be a list of tuples
where the first element is the full path to the package name in
pkg_root and the second element is the version string."""

# If we're in the process of updating the database in our
# separate process, defer this particular update until that's

# If we rebuilt the database from scratch ... XXX why would we

# Five digits of a base-62 number represent a little over 900 million.
# Assuming 1 million tokens used in a WOS build (current imports use
# just short of 500k, but we don't have all the l10n packages, and may
# not have all the search tokens we want) and keeping every nightly
# build gives us 2.5 years before we run out of token space.  We're
# likely to garbage collect manifests and rebuild the db before then.

# XXX We're eventually going to run into conflicts with real tokens
# here.  This is unlikely until we hit, say "alias", which is a ways
# off, but we should still look at solving this.

# XXX Do we want to log warnings as we approach index capacity?

"""Update the search database with the data from the manifest for
'fmri', which has been collected into 'search_dict'"""

# self.searchdb: token -> (type, fmri, action name, key value)

# Don't update the database if it already has this FMRI's

# XXX The database files are so damned huge (if
# holey) because we have zillions of copies of
# the full fmri strings.  We might want to
# indirect these as well.

"'%s' (s_ptr = %s) to search " \

"""Because of the size limitations of the underlying database
records, not only do we have to store pointers to the actual
search data, but once the pointer records fill up, we have to
chain those records up to spillover records.  This method adds
the pointer to the data to the end of the last link in the
chain, overflowing as necessary.  The search token is passed in
as 'token', and the pointer to the actual data which should be
returned is passed in as 'data_token'."""

# According to the ndbm man page, the total length of
# key and value must be less than 1024.  Seems like the
# actual value is 1018, probably due to some padding or
# accounting bytes or something.  The 2 is for the space
# separator and the plus-sign for the extension token.
# XXX The comparison should be against 1017, but that
# crashes in the if clause below trying to append the
# extension token.  Dunno why.

# If we're adding the first element in the next
# link of the chain, add the extension token to
# the end of this link, and put the token
# pointing to the data at the beginning of the

break # from while True; we're done

# If we find an extension token, start looking
# at the next chain link.

# If we get here, it's safe to append the data token to
# the current link, and get out.

"""Search through the search database for 'token'.  Return a list of
token type / fmri pairs."""

# For each indirect token in the search token's value,
# add its value to the return list.  If we see a chain
# token, switch to its value and continue.  If we fall
# out of the loop without seeing a chain token, we can

"""Iterate through the catalog, looking for packages matching
'pattern', based on the function in 'matcher' and the versioning
constraint described by 'constraint'.  If 'matcher' is None,
uses fmri subset matching as the default.  Returns a sorted list
of PkgFmri objects, newest versions first.
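The token-space estimate in the comments above (five base-62 digits, roughly 1 million tokens per build, one build per night) can be checked with a few lines of arithmetic. The sketch below also shows one way to render a counter as a fixed-width base-62 string. The alphabet ordering and the names `ALPHABET`, `WIDTH`, and `encode_base62` are assumptions made for illustration; the listing only says that the tokens are five base-62 digits.

```python
import string

# Assumed digit ordering: 0-9, then A-Z, then a-z (62 symbols in total).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
WIDTH = 5


def encode_base62(n, width=WIDTH):
    """Render a non-negative integer as a fixed-width base-62 string."""
    base = len(ALPHABET)
    if not 0 <= n < base ** width:
        raise ValueError("value %d does not fit in %d digits" % (n, width))
    digits = []
    for _ in range(width):
        n, rem = divmod(n, base)
        digits.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


CAPACITY = len(ALPHABET) ** WIDTH        # 62**5 == 916132832
TOKENS_PER_BUILD = 1000000               # the comment's working assumption
NIGHTLY_BUILDS = CAPACITY // TOKENS_PER_BUILD   # 916 builds
YEARS = NIGHTLY_BUILDS / 365.0           # about 2.5 years
```

With these assumptions the capacity is 916,132,832 tokens, which is the "little over 900 million" in the comment, and 916 nightly builds is about two and a half years.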
If 'counthash' is a dictionary, instead store the number of
matched fmris for each package name which was matched."""

# 'patterns' may be partially or fully decorated fmris; we want
# to extract their names and versions to match separately

# XXX "5.11" here needs to be saner

# Walk list of pkg names and patterns.  See if any of the
# patterns match known package names

"""A generator function that produces FMRIs as it
iterates over the contents of the catalog."""

# Handle old two-column catalog file, mostly in
# use on server.  If *this* doesn't work, we
# have a corrupt catalog.

"corrupt catalog entry for " \

"""Returns a list of RenameRecords where fmri is listed as the

# Don't bother doing this if no FMRI is present

# Load renamed packages, if needed

"""Returns a list of RenameRecords where fmri is listed as

# Don't bother doing this if no FMRI is present

# Load renamed packages, if needed

"""Given a list of pkg_names, return all of the FMRIs
that contain a pkg_name entry as a substring."""

# Handle old two-column catalog file, mostly in
# use on server.  If *this* doesn't work, we
# have a corrupt catalog.

"corrupt catalog entry for " \

"""Return the time at which the catalog was last modified."""

"""Load attributes from the catalog file into the in-memory

# convert npkgs to integer value

"""Read the catalog and build the array of fmri pkg names
that is contained within the catalog.  Returns a list
of strings of package names."""

# Handle old two-column catalog file, mostly in
# use on server.  If *this* doesn't work, we
# have a corrupt catalog.

"corrupt catalog entry in file " \

"""Pickle the list of package names in the catalog for faster

# Don't bother saving, if we don't have

"""Load pickled list of package names.  This function
may raise an IOError if the file doesn't exist.  Callers
should be sure to catch this exception and rebuild
the package names, if required."""

"""Load the catalog's rename records into self.renamed"""

"""Returns the number of packages in the catalog."""

"""Returns the URL of the catalog's origin."""

"""A static method that takes a file-like object and
a path.  This is the other half of catalog.send().  It
reads a stream as an incoming catalog and lays it down

# XXX Need to be able to handle old and new

# Write the authority's origin into our attributes

# Save a list of package names for easier searching

"""Record that the name of package oldname has been changed
to newname as of version vers.  Returns a timestamp
of when the catalog was modified and a RenamedPackage
object that describes the rename."""

# Check that the destination (new) package is already in the
# catalog.  Also check that the old package does not exist at
# the version that is being renamed.

"Destination FMRI %s must be in catalog" % \
"Src FMRI %s must not be in catalog" % \
# Load renamed packages, if needed

# Check that rename record isn't already in catalog

"Rename %s is already in the catalog" % rr

# Keep renames acyclic.  Check that the destination of this
# rename isn't the source of another rename.

"Can't rename %s.  Causes cycle in rename graph." \
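The acyclicity requirement in the comment above can be illustrated with a small graph walk. The comment describes a one-step check, rejecting a rename whose destination is already the source of another rename. The sketch below swaps in a more general check that follows the whole rename chain from the proposed destination and reports a cycle only if it leads back to the proposed source. The function name `creates_cycle` and the representation of renames as `(srcname, destname)` pairs are assumptions for illustration, not the module's data structures.

```python
def creates_cycle(renames, srcname, destname):
    """Return True if adding the rename srcname -> destname would close
    a loop in the existing rename graph.

    'renames' is an iterable of (srcname, destname) pairs.
    """
    successors = {}
    for src, dest in renames:
        successors.setdefault(src, set()).add(dest)

    # Depth-first walk from the proposed destination; reaching the
    # proposed source means the new edge would complete a cycle.
    seen = set()
    stack = [destname]
    while stack:
        name = stack.pop()
        if name == srcname:
            return True
        if name in seen:
            continue
        seen.add(name)
        stack.extend(successors.get(name, ()))
    return False


# Hypothetical package names.
existing = [("pkg-a", "pkg-b"), ("pkg-b", "pkg-c")]
closes_loop = creates_cycle(existing, "pkg-c", "pkg-a")    # True
extends_chain = creates_cycle(existing, "pkg-c", "pkg-d")  # False
```

The one-step check from the comment is stricter: it would also reject a hypothetical `pkg-x -> pkg-a` rename, which this walk accepts because the chain from `pkg-a` never reaches `pkg-x`.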
"""Returns true if fmri and pfmri are the same package because of a rename operation.""" """Returns true if fmri is a successor to pfmri by way of a rename operation.""" """Returns true if fmri is a predecessor to pfmri by """Returns a list of packages that are newer than fmri.""" """Returns a list of packages that are older than fmri.""" """Save attributes from the in-memory catalog to a file # This may get called in a situation where # the user does not have write access to the attrs """Send the contents of this catalog out to the filep specified as an argument.""" # Missing catalog is fine; other errors need to """Set time to timestamp if supplied by caller. Otherwise """Check that the fmri supplied as an argument would be valid to add to the catalog. This checks to make sure that from adding this FMRI.""" # In order to avoid a fine from the Department of Redundancy Department, # allow these methods to be invoked without explictly naming the Catalog class. # Prefixes that this catalog knows how to handle # Method used by Catalog and UpdateLog. Since UpdateLog needs to know # about Catalog, keep it in Catalog to avoid circular dependency problems. """Return an integer timestamp that can be used for comparisons.""" """Take timestamp ts in string isoformat, and convert it to a datetime # usec is not in the string if 0 """Iterate through the given list of PkgFmri objects, looking for packages matching 'pattern', based on the function in 'matcher' and the versioning constraint described by 'constraint'. If 'matcher' is None, uses fmri subset matching as the default. Returns a sorted list of PkgFmri objects, newest versions first. If 'counthash' is a dictionary, instead store the number of matched fmris for each package name which # 'pattern' may be a partially or fully decorated fmri; we want # to extract its name and version to match separately against # XXX "5.11" here needs to be saner """An in-memory representation of a rename object. 
This object records information about a package that has had its
name changed.

Renaming a package presents a number of challenges.  The packaging
system must still be able to recognize and decode dependencies on
packages with the old name.  In order for this to work correctly, the
rename record must contain both the old and new name of the package.
It is also undesirable to have a renamed package receive subsequent
versions.  However, it still should be possible to publish bugfixes to
the old package lineage.  This means that we must also record
versioning information at the time a package is renamed.

This versioning information allows us to determine which portions
of the version and namespace are allowed to add new versions.

If a package is re-named to the NULL package at a specific version,
this is equivalent to freezing the package.  No further updates to
the version history may be made under that name.  (NULL is never open)

The rename catalog format is as follows:

R <srcname> <srcversion> <destname> <destversion>

"""Create a RenamedPackage object.  Srcname is the original
name of the package, destname is the name this package
will take after the operation is successful.

Versionstr is the version at which this change takes place.  No
versions >= version of srcname will be permitted."""

"Must supply a source or destination version"

"""Implementing our own == function allows us to properly
check whether a rename object is in a list of renamed

"""Return an FMRI that represents the destination name and
version of the renamed package."""

"""Return an FMRI that represents the most recent version
of the package had it not been renamed."""
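The rename catalog line given in the RenamedPackage docstring above, `R <srcname> <srcversion> <destname> <destversion>`, can be sketched as a small format-and-parse pair. This is a minimal illustration, assuming whitespace-separated fields that contain no embedded spaces. `RenameRecord`, `format_rename`, `parse_rename`, and the sample names and versions are hypothetical, not the module's own class or data.

```python
from collections import namedtuple

# Field order follows the documented line format.
RenameRecord = namedtuple(
    "RenameRecord", "srcname srcversion destname destversion")


def format_rename(rec):
    """Serialize a record as an 'R ...' catalog line."""
    return "R %s %s %s %s\n" % tuple(rec)


def parse_rename(line):
    """Parse an 'R ...' catalog line back into a RenameRecord."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "R":
        raise ValueError("not a rename record: %r" % line)
    return RenameRecord(*fields[1:])


# Hypothetical package names and version strings.
rec = RenameRecord("old-name", "1.0", "new-name", "1.1")
line = format_rename(rec)
round_trip = parse_rename(line)
```

Because namedtuples compare by value, `round_trip == rec` holds, which is the kind of membership test the `==` docstring above says the real class needs in order to find a rename in a list.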