catalog.py revision 296
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
# See the License for the specific language governing permissions
# and limitations under the License.
# When distributing Covered Code, include this CDDL HEADER in each
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]

# Copyright 2008 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.

"""Interfaces and implementation for the Catalog object, as well as functions
that operate on lists of package FMRIs."""

"""A Catalog is the representation of the package FMRIs available to
this client or repository. Both purposes utilize the same storage

The serialized structure of the repository is an unordered list of
available package versions, followed by an unordered list of
incorporation relationships between packages. This latter section
allows the graph to be topologically sorted by the client.

S Last-Modified: [timespec]

XXX A authority mirror-uri ...

In order to improve the time to search the catalog, a cached list
of package names is kept in the catalog instance. In an effort
to prevent the catalog from having to generate this list every time
it is constructed, the array that contains the names is pickled and
saved as pkg_names.pkl.
"""

# XXX Mirroring records also need to be allowed from client
# configuration, and not just catalogs.

# XXX It would be nice to include available tags and package sizes,
# although this could also be calculated from the set of manifests.
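The package-name cache described in the class docstring could be sketched roughly as follows. This is a minimal illustration in modern Python, not the code of this file: the function names and the `rebuild` callback are hypothetical, and only the file name `pkg_names.pkl` and the "catch IOError and rebuild" convention come from the docstrings in this file. Skipping the save for an empty list is also an assumption.

```python
import os
import pickle


def save_pkg_names(cat_root, pkg_names):
    """Pickle the list of package names under cat_root (hypothetical helper)."""
    if not pkg_names:
        # Assumption: nothing worth caching when there are no names.
        return
    with open(os.path.join(cat_root, "pkg_names.pkl"), "wb") as f:
        pickle.dump(sorted(pkg_names), f)


def load_pkg_names(cat_root):
    """Load the pickled names; raises IOError if the cache file is missing."""
    with open(os.path.join(cat_root, "pkg_names.pkl"), "rb") as f:
        return pickle.load(f)


def get_pkg_names(cat_root, rebuild):
    """Return cached names, rebuilding and re-saving them if the cache is absent.

    'rebuild' is a caller-supplied function that scans the catalog and
    returns the list of package names (hypothetical).
    """
    try:
        return load_pkg_names(cat_root)
    except IOError:
        names = sorted(rebuild())
        save_pkg_names(cat_root, names)
        return names
```

The point of the design is that the expensive catalog scan only runs when the pickle is missing; every later construction of the catalog reads the short cached list instead.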
# XXX Current code is O(N_packages) O(M_versions), should be
# O(1) O(M_versions), and possibly O(1) O(1).

# XXX Initial estimates suggest that the Catalog could be composed of
# 1e5 - 1e7 lines. Catalogs across these magnitudes will need to be
# spread out into chunks, and may require a delta-oriented update

"""Create a catalog. If the path supplied does not exist,
this will create the required directory structure.
Otherwise, if the directories are already in place, the
existing catalog is opened. If pkg_root is specified
and no catalog is found at cat_root, the catalog will be
rebuilt. authority names the authority that
is represented by this catalog."""

# We need to lock the search database against multiple
# simultaneous updates from separate threads closing
# publication transactions.

# Rebuild catalog, if we're the depot and it's necessary

# Load the list of pkg names. If it doesn't exist, build a list
# of pkg names. If the catalog gets rebuilt in build_catalog,
# add_fmri() will generate the list of package names instead.

"""Add a package, named by the fmri, to the catalog.
Throws an exception if an identical package is already
present. Throws an exception if package has no version."""

"Unversioned FMRI not supported: %s" % fmri

# Callers should verify that the FMRI they're going to add is
# valid; however, this check is here in case they're

"Existing renames make adding FMRI %s invalid." \

"Package %s is already in the catalog" % \

# Add this pkg name to the list of package names

"""Perform any catalog transformations necessary if
prefix p is found in the catalog. Previously, we didn't
know how to handle this prefix and now we do. If we
need to transform the entry from server to client form,
make sure that happens here."""

"""Takes the list of in-memory attributes and returns
a list of strings, each string naming an attribute."""

s = "S %s: %s\n" % (k, v)
"""Helper method that takes the full path to the package
directory and the name of the manifest file, and returns an FMRI
constructed from the information in those components."""

"""If this version of the catalog knows about new prefixes,
check the on disk catalog to see if we can perform any
transformations based upon previously unknown catalog formats.
This routine will add a catalog attribute if it doesn't exist,
otherwise it checks this attribute against a hard-coded
version-specific tuple to see if new methods were added.
If new methods were added, it will call an additional routine
that updates the on-disk catalog, if necessary."""

# If a prefixes attribute doesn't exist, write one and get on

# Prefixes attribute does exist. Check if it has changed.

# Nothing to do if prefixes haven't changed

# If known_prefixes contains a prefix not in pfx_set,
# add the prefix and perform a catalog transform.

# Write out updated prefixes list

"""Walk the on-disk package data and build (or rebuild) the
package catalog and search database."""

# XXX eschew os.walk in favor of another os.listdir here?

# XXX force a rebuild despite mtimes?

# XXX queue this and fork later?

# XXX force a rebuild despite mtimes?

# If the database doesn't exist, don't bother
# building the list; we'll just build it all.

# If we have no updates to make to the search database but it
# already exists, just make it available. If we do have updates
# to make (including possibly building it from scratch), fork it
# off into another process; when that's done, we'll mark it

"Failed to open search database", \
"for writing: %s (errno=%s)" % \

"Failed to open search " + \
"database: %s (errno=%s)" % \

# if we are in a subthread already, the signal method

# on non-unix, where there is no convenient
# way to fork subprocesses, just update the

"""Handler method for the SIGCLD signal. Checks to see if the
search database update child has finished, and enables searching
if it finished successfully, or logs an error if it didn't."""

"Failed to open search database", \
"for writing: %s (errno=%s)" % \

"Failed to open search " + \
"database: %s (errno=%s)" % \

# XXX This should be logged instead
print "ERROR building search database:"

# Since we're here explicitly to update
# the database, if we fail, there's

"Failed to open search database", \
"for writing: %s (errno=%s)" % \

"Failed to open search database", \
"for writing: %s (errno=%s)" % \
# XXX We should probably iterate over the catalog, for
# cases where manifests have stuck around, but have been
# moved to historical and removed from the catalog.

"""Update the search database with the FMRIs passed in via
'fmri_list'. If 'fmri_list' is empty or None, then rebuild the
database from scratch. 'fmri_list' should be a list of tuples
where the first element is the full path to the package name in
pkg_root and the second element is the version string."""

# If we're in the process of updating the database in our
# separate process, and this particular update until that's

# If we rebuilt the database from scratch ... XXX why would we

# self.searchdb.close()

# Five digits of a base-62 number represents a little over 900 million.
# Assuming 1 million tokens used in a WOS build (current imports use
# just short of 500k, but we don't have all the l10n packages, and may
# not have all the search tokens we want) and keeping every nightly
# build gives us 2.5 years before we run out of token space. We're
# likely to garbage collect manifests and rebuild the db before then.

# XXX We're eventually going to run into conflicts with real tokens
# here. This is unlikely until we hit, say "alias", which is a ways
# off, but we should still look at solving this.

# XXX Do we want to log warnings as we approach index capacity?

"""Update the search database with the data from the manifest
for 'fmri', which has been collected into 'search_dict'"""

# self.searchdb: token -> (type, fmri, action)

# XXX search_dict doesn't have action info, but should

# Don't update the database if it already has this FMRI's

# XXX The database files are so damned huge (if
# holey) because we have zillions of copies of
# the full fmri strings.
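The token-space estimate in the base-62 comment above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the digit alphabet and the helper name `to_base62` are assumptions, since the encoding routine itself is not shown in this view.

```python
import string

# Assumed alphabet: 10 digits + 26 uppercase + 26 lowercase = 62 symbols.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
WIDTH = 5
CAPACITY = len(ALPHABET) ** WIDTH      # number of distinct five-digit tokens

TOKENS_PER_BUILD = 1000000             # figure taken from the comment
BUILDS = CAPACITY // TOKENS_PER_BUILD  # nightly builds that fit in the space
YEARS = BUILDS / 365.0                 # one build per night


def to_base62(n, width=WIDTH):
    """Encode a non-negative integer as a fixed-width base-62 string."""
    if not 0 <= n < len(ALPHABET) ** width:
        raise ValueError("value does not fit in %d base-62 digits" % width)
    digits = []
    for _ in range(width):
        n, rem = divmod(n, len(ALPHABET))
        digits.append(ALPHABET[rem])
    # Digits were produced least-significant first.
    return "".join(reversed(digits))
```

With these numbers, `CAPACITY` is 916,132,832 ("a little over 900 million"), which is 916 builds at one million tokens each, or about 2.5 years of nightly builds.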
# We might want to indirect these as well.

"""Because of the size limitations of the underlying database
records, not only do we have to store pointers to the actual
search data, but once the pointer records fill up, we have to
chain those records up to spillover records. This method adds
the pointer to the data to the end of the last link in the
chain, overflowing as necessary. The search token is passed in
as 'token', and the pointer to the actual data which should be
returned is passed in as 'data_token'."""

# According to the ndbm man page, the total length of
# key and value must be less than 1024. Seems like the
# actual value is 1018, probably due to some padding or
# accounting bytes or something. The 2 is for the space
# separator and the plus-sign for the extension token.

# XXX The comparison should be against 1017, but that
# crashes in the if clause below trying to append the
# extension token. Dunno why.

# If we're adding the first element in the next
# link of the chain, add the extension token to
# the end of this link, and put the token
# pointing to the data at the beginning of the

break # from while True; we're done

# If we find an extension token, start looking
# at the next chain link.

# If we get here, it's safe to append the data token to
# the current link, and get out.

"""Search through the search database for 'token'. Return a
list of token type / fmri pairs."""

# For each indirect token in the search token's value,
# add its value to the return list. If we see a chain
# token, switch to its value and continue.
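The chained spillover records described above can be modelled with a plain dictionary standing in for the ndbm file. This is a sketch under stated assumptions, not the method in this file: the `token#N` naming of extension links, the function names, and the small default-free `limit` parameter are all hypothetical; only the "space-separated data tokens, plus-sign extension token at the end of a full link" layout is taken from the comments. Data tokens are assumed to contain no spaces and not to start with `+`.

```python
MAX_RECORD = 1017  # usable key + value length suggested by the comments


def add_to_chain(db, token, data_token, limit=MAX_RECORD):
    """Append data_token to the last link of token's chain, spilling over
    into a new link when the current record is full."""
    key, depth = token, 0
    while True:
        entries = db.get(key, "").split()
        if entries and entries[-1].startswith("+"):
            # Extension token found: move on to the next link in the chain.
            key, depth = entries[-1][1:], depth + 1
            continue
        next_key = "%s#%d" % (token, depth + 1)  # hypothetical link name
        # Always keep room for a future " +next_key" so a full link can
        # still be extended (the "2" is the space and the plus sign).
        reserve = 2 + len(next_key)
        used = len(key) + len(" ".join(entries))
        needed = len(data_token) + (1 if entries else 0)
        if used + needed + reserve <= limit:
            entries.append(data_token)
            db[key] = " ".join(entries)
        else:
            entries.append("+" + next_key)
            db[key] = " ".join(entries)
            db[next_key] = data_token
        return


def search_chain(db, token):
    """Collect every data token for 'token', following extension tokens."""
    results, key = [], token
    while key is not None:
        entries = db.get(key, "").split()
        key = None
        for entry in entries:
            if entry.startswith("+"):
                key = entry[1:]        # switch to the next link
            else:
                results.append(entry)
    return results
```

Reserving space for the extension token on every append is one way to sidestep the off-by-one trouble the XXX comment mentions: a link is declared full while there is still room to write its pointer to the next link.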
# If we fall
# out of the loop without seeing a chain token, we can

"""Iterate through the catalog, looking for packages matching
'pattern', based on the function in 'matcher' and the versioning
constraint described by 'constraint'. If 'matcher' is None,
uses fmri subset matching as the default. Returns a sorted list
of PkgFmri objects, newest versions first. If 'counthash' is a
dictionary, instead store the number of matched fmris for each
package name which was matched."""

# 'patterns' may be partially or fully decorated fmris; we want
# to extract their names and versions to match separately

# XXX "5.11" here needs to be saner

# Walk list of pkg names and patterns. See if any of the
# patterns match known package names

"""A generator function that produces FMRIs as it
iterates over the contents of the catalog."""

# Handle old two-column catalog file, mostly in

"""Returns a list of RenameRecords where fmri is listed as the
destination package."""

# Don't bother doing this if no FMRI is present

# Load renamed packages, if needed

"""Returns a list of RenameRecords where fmri is listed as

# Don't bother doing this if no FMRI is present

# Load renamed packages, if needed

"""Given a list of pkg_names, return all of the FMRIs
that contain a pkg_name entry as a substring."""

# Handle old two-column catalog file, mostly in

"""Return the time at which the catalog was last modified."""

"""Load attributes from the catalog file into the in-memory
attributes dictionary"""

# convert npkgs to integer value

"""Read the catalog and build the array of fmri pkg names
that is contained within the catalog.
Returns a list of strings of package names."""

# Handle old two-column catalog file, mostly in

"""Pickle the list of package names in the catalog for faster

# Don't bother saving, if we don't have

"""Load pickled list of package names. This function
may raise an IOError if the file doesn't exist. Callers
should be sure to catch this exception and rebuild
the package names, if required."""

"""Load the catalog's rename records into self.renamed"""

"""Returns the number of packages in the catalog."""

"""A static method that takes a file-like object and
a path. This is the other half of catalog.send(). It
reads a stream as an incoming catalog and lays it down

# XXX Need to be able to handle old and new

# Save a list of package names for easier searching

"""Record that the name of package oldname has been changed
to newname as of version vers. Returns a timestamp
of when the catalog was modified and a RenamedPackage
object that describes the rename."""

# Check that the destination (new) package is already in the
# catalog. Also check that the old package does not exist at
# the version that is being renamed.

"Destination FMRI %s must be in catalog" % \

"Src FMRI %s must not be in catalog" % \

# Load renamed packages, if needed

# Check that rename record isn't already in catalog

"Rename %s is already in the catalog" % rr

# Keep renames acyclic. Check that the destination of this
# rename isn't the source of another rename.

"Can't rename %s. Causes cycle in rename graph." \
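The two rename checks described in the comments above (no duplicate record, and a destination that is not already the source of another rename) can be sketched as below. This is a simplified illustration, not the method in this file: renames are modelled as plain `(srcname, destname)` tuples, versions are ignored, and the function name `check_rename` is hypothetical.

```python
def check_rename(renames, srcname, destname):
    """Return True if srcname -> destname may be added to 'renames'.

    'renames' is a list of (srcname, destname) tuples already recorded.
    Raises ValueError with a message modelled on the ones in this file.
    """
    for old, new in renames:
        if (old, new) == (srcname, destname):
            raise ValueError("Rename %s -> %s is already in the catalog"
                             % (srcname, destname))
        if old == destname:
            # The destination has itself been renamed away, so pointing
            # at it could close a loop in the rename graph.
            raise ValueError("Can't rename %s. Causes cycle in rename graph."
                             % srcname)
    return True
```

Note that this rule is stricter than full cycle detection: it rejects any rename whose destination has already been renamed, whether or not a loop would actually form, which keeps the check to a single pass over the records.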
"""Returns true if fmri and pfmri are the same package because
of a rename operation."""

"""Returns true if fmri is a successor to pfmri by way
of a rename operation."""

"""Returns true if fmri is a predecessor to pfmri by

"""Returns a list of packages that are newer than fmri."""

"""Returns a list of packages that are older than fmri."""

"""Save attributes from the in-memory catalog to a file
specified by filenm."""

# This may get called in a situation where
# the user does not have write access to the attrs

"""Send the contents of this catalog out to the filep
specified as an argument."""

# Send attributes first.

# Missing catalog is fine; other errors need to be

"""Set time to timestamp if supplied by caller. Otherwise
use the system time."""

"""Check that the fmri supplied as an argument would be
valid to add to the catalog. This checks to make sure that
from adding this FMRI."""

# In order to avoid a fine from the Department of Redundancy Department,
# allow these methods to be invoked without explicitly naming the Catalog class.

# Prefixes that this catalog knows how to handle

# Method used by Catalog and UpdateLog. Since UpdateLog needs to know
# about Catalog, keep it in Catalog to avoid circular dependency problems.

"""Return an integer timestamp that can be used for comparisons."""

"""Take timestamp ts in string isoformat, and convert it to a datetime

# usec is not in the string if 0

"""Iterate through the given list of PkgFmri objects,
looking for packages matching 'pattern', based on the function
in 'matcher' and the versioning constraint described by
'constraint'. If 'matcher' is None, uses fmri subset matching
as the default. Returns a sorted list of PkgFmri objects,
newest versions first.
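The note that "usec is not in the string if 0" refers to a property of `datetime.isoformat()`: the fractional part is omitted when the microsecond field is zero, so a parser has to accept both shapes. A minimal sketch of such a conversion, written for modern Python and not taken from this file, might look like this:

```python
import datetime


def ts_to_datetime(ts):
    """Convert an isoformat timestamp string to a datetime object.

    isoformat() drops the ".ffffff" part when microseconds are zero,
    so choose the format string based on whether a dot is present.
    """
    if "." in ts:
        fmt = "%Y-%m-%dT%H:%M:%S.%f"
    else:
        fmt = "%Y-%m-%dT%H:%M:%S"
    return datetime.datetime.strptime(ts, fmt)
```

Round-tripping a value through `isoformat()` and back should return the original datetime in both cases.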
If 'counthash' is a dictionary, instead
store the number of matched fmris for each package name which

# 'pattern' may be a partially or fully decorated fmri; we want
# to extract its name and version to match separately against

# XXX "5.11" here needs to be saner

"""An in-memory representation of a rename object. This object records
information about a package that has had its name changed.
Renaming a package presents a number of challenges. The packaging
system must still be able to recognize and decode dependencies on
packages with the old name. In order for this to work correctly, the
rename record must contain both the old and new name of the package. It
is also undesirable to have a renamed package receive subsequent
versions. However, it still should be possible to publish bugfixes to
the old package lineage. This means that we must also record
versioning information at the time a package is renamed.
This versioning information allows us to determine which portions
of the version and namespace are allowed to add new versions.
If a package is re-named to the NULL package at a specific version,
this is equivalent to freezing the package. No further updates to
the version history may be made under that name. (NULL is never open)
The rename catalog format is as follows:
R <srcname> <srcversion> <destname> <destversion>

"""Create a RenamedPackage object. Srcname is the original
name of the package, destname is the name this package
will take after the operation is successful.
Versionstr is the version at which this change takes place.
No versions >= version of srcname will be permitted."""

"Must supply a source or destination version"

"""Implementing our own == function allows us to properly
check whether a rename object is in a list of renamed

"""Return an FMRI that represents the destination name and version of the renamed package."""

"""Return an FMRI that represents the most recent version of the package had it not been renamed."""
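To make the `R <srcname> <srcversion> <destname> <destversion>` record format and the custom equality concrete, here is a small sketch. It is not the class from this file: the class name `RenameRecordSketch`, the `from_line` constructor, and the choice to compare all four fields are assumptions, and the NULL-package case is deliberately left out.

```python
class RenameRecordSketch:
    """Hypothetical stand-in for the rename object described above."""

    def __init__(self, srcname, srcversion, destname, destversion):
        if not srcversion and not destversion:
            raise ValueError("Must supply a source or destination version")
        self.srcname = srcname
        self.srcversion = srcversion
        self.destname = destname
        self.destversion = destversion

    def _key(self):
        return (self.srcname, self.srcversion,
                self.destname, self.destversion)

    def __eq__(self, other):
        # Value equality is what makes "record in list_of_records" work;
        # the default identity comparison would never match a fresh object.
        if not isinstance(other, RenameRecordSketch):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())

    def __str__(self):
        return "R %s %s %s %s" % self._key()

    @classmethod
    def from_line(cls, line):
        """Parse one 'R ...' catalog line into a record."""
        fields = line.split()
        if len(fields) != 5 or fields[0] != "R":
            raise ValueError("not a rename record: %r" % line)
        return cls(*fields[1:])
```

A record parsed from a catalog line then compares equal to one built directly from the same four fields, which is what a duplicate check such as "Rename %s is already in the catalog" relies on.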