catalog.py revision 207
# The contents of this file are subject to the terms of the
# Common Development and Distribution License (the "License").
# You may not use this file except in compliance with the License.
# See the License for the specific language governing permissions
# and limitations under the License.
# When distributing Covered Code, include this CDDL HEADER in each
# If applicable, add the following below this CDDL HEADER, with the
# fields enclosed by brackets "[]" replaced with your own identifying
# information: Portions Copyright [yyyy] [name of copyright owner]
# Copyright 2007 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.

"""A Catalog is the representation of the package FMRIs available to
this client or repository. Both purposes utilize the same storage
The serialized structure of the repository is an unordered list of
available package versions, followed by an unordered list of
incorporation relationships between packages. This latter section
allows the graph to be topologically sorted by the client.
S Last-Modified: [timespec]
XXX A authority mirror-uri ...

# XXX Mirroring records also need to be allowed from client
# configuration, and not just catalogs.

# XXX It would be nice to include available tags and package sizes,
# although this could also be calculated from the set of manifests.

# XXX Current code is O(N_packages) O(M_versions), should be
# O(1) O(M_versions), and possibly O(1) O(1).

# XXX Initial estimates suggest that the Catalog could be composed of
# 1e5 - 1e7 lines. Catalogs across these magnitudes will need to be
# spread out into chunks, and may require a delta-oriented update

"""Create a catalog. If the path supplied does not exist,
this will create the required directory structure.
Otherwise, if the directories are already in place, the
existing catalog is opened. If pkg_root is specified
and no catalog is found at cat_root, the catalog will be
rebuilt. authority names the authority that
is represented by this catalog."""

# We need to lock the search database against multiple
# simultaneous updates from separate threads closing
# publication transactions.

"""Add a package, named by the fmri, to the catalog.
Throws an exception if an identical package is already
present. Throws an exception if package has no version."""

"Unversioned FMRI not supported: %s" % fmri

"Package %s is already in the catalog" % \

"""Takes the list of in-memory attributes and returns
a list of strings, each string naming an attribute."""

s = "S %s: %s\n" % (k, v)
"""Helper method that takes the full path to the package
directory and the name of the manifest file, and returns an FMRI
constructed from the information in those components."""

"""Walk the on-disk package data and build (or rebuild) the
package catalog and search database."""

# XXX eschew os.walk in favor of another os.listdir here?

# XXX force a rebuild despite mtimes?

# XXX queue this and fork later?

# XXX force a rebuild despite mtimes?

# If the database doesn't exist, don't bother
# building the list; we'll just build it all.

# If we have no updates to make to the search database but it
# already exists, just make it available. If we do have updates
# to make (including possibly building it from scratch), fork it
# off into another process; when that's done, we'll mark it

"""Handler method for the SIGCLD signal. Checks to see if the
search database update child has finished, and enables searching
if it finished successfully, or logs an error if it didn't."""

# XXX This should be logged instead
print "ERROR building search database:"

"""Update the search database with the FMRIs passed in via
'fmri_list'. If 'fmri_list' is empty or None, then rebuild the
database from scratch. 'fmri_list' should be a list of tuples
where the first element is the full path to the package name in
pkg_root and the second element is the version string."""

# If we're in the process of updating the database in our
# separate process, and this particular update until that's

# XXX We should probably iterate over the catalog, for
# cases where manifests have stuck around, but have been
# moved to historical and removed from the catalog.

# If we rebuilt the database from scratch ...
# XXX why would we

# self.searchdb.close()

# Five digits of a base-62 number represents a little over 900 million.
# Assuming 1 million tokens used in a WOS build (current imports use
# just short of 500k, but we don't have all the l10n packages, and may
# not have all the search tokens we want) and keeping every nightly
# build gives us 2.5 years before we run out of token space. We're
# likely to garbage collect manifests and rebuild the db before then.

# XXX We're eventually going to run into conflicts with real tokens
# here. This is unlikely until we hit, say "alias", which is a ways
# off, but we should still look at solving this.

# XXX Do we want to log warnings as we approach index capacity?

"""Update the search database with the data from the manifest
for 'fmri', which has been collected into 'search_dict'"""

# self.searchdb: token -> (type, fmri, action)
# XXX search_dict doesn't have action info, but should

# Don't update the database if it already has this FMRI's

# XXX The database files are so damned huge (if
# holey) because we have zillions of copies of
# the full fmri strings. We might want to
# indirect these as well.

"""Because of the size limitations of the underlying database
records, not only do we have to store pointers to the actual
search data, but once the pointer records fill up, we have to
chain those records up to spillover records. This method adds
the pointer to the data to the end of the last link in the
chain, overflowing as necessary. The search token is passed in
as 'token', and the pointer to the actual data which should be
returned is passed in as 'data_token'."""

# According to the ndbm man page, the total length of
# key and value must be less than 1024.
# Seems like the
# actual value is 1018, probably due to some padding or
# accounting bytes or something. The 2 is for the space
# separator and the plus-sign for the extension token.
# XXX The comparison should be against 1017, but that
# crashes in the if clause below trying to append the
# extension token. Dunno why.

# If we're adding the first element in the next
# link of the chain, add the extension token to
# the end of this link, and put the token
# pointing to the data at the beginning of the

break # from while True; we're done

# If we find an extension token, start looking
# at the next chain link.

# If we get here, it's safe to append the data token to
# the current link, and get out.

"""Search through the search database for 'token'. Return a
list of token type / fmri pairs."""

# For each indirect token in the search token's value,
# add its value to the return list. If we see a chain
# token, switch to its value and continue. If we fall
# out of the loop without seeing a chain token, we can

"""Iterate through the catalog, looking for packages matching
'pattern', based on the function in 'matcher' and the versioning
constraint described by 'constraint'. If 'matcher' is None,
uses fmri subset matching as the default. Returns a sorted list
of PkgFmri objects, newest versions first.
If 'counthash' is a
dictionary, instead store the number of matched fmris for each
package name which was matched."""

# 'pattern' may be a partially or fully decorated fmri; we want
# to extract its name and version to match separately against

# XXX "5.11" here needs to be saner

# Handle old two-column catalog file, mostly in

"""A generator function that produces FMRIs as it
iterates over the contents of the catalog."""

# Handle old two-column catalog file, mostly in

"""Load attributes from the catalog file into the in-memory
attributes dictionary"""

"""Returns the number of packages in the catalog."""

"""A class method that takes a file-like object and
a path. This is the other half of catalog.send(). It
reads a stream as an incoming catalog and lays it down

# XXX Need to be able to handle old and new

"""Save attributes from the in-memory catalog to a file
specified by filenm."""

"""Send the contents of this catalog out to the filep
specified as an argument."""

# Send attributes first.

# Missing catalog is fine; other errors need to be

# In order to avoid a fine from the Department of Redundancy Department,
# allow these methods to be invoked without explicitly naming the Catalog class.
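
# The spillover chaining described in the token-chain docstring and comments
# earlier in this listing can be sketched with a plain dictionary standing in
# for the ndbm file. This is an illustration under assumptions, not the code
# from this revision: the names chain_append and chain_lookup, the
# "+<token>.<n>" naming of extension keys, and the tiny default limit are all
# hypothetical. Data tokens are assumed never to start with "+". It is
# written for a current Python 3 interpreter.

```python
def chain_append(db, token, data_token, limit=32):
    """Append data_token to the last link of the chain starting at token.

    'limit' bounds len(key) + len(value) for every record. The listing
    reports roughly 1018 for ndbm; a small value makes the overflow visible.
    """
    key, link = token, 0
    while True:
        entries = db.get(key, "").split()
        if entries and entries[-1].startswith("+"):
            # Extension token found: look at the next link in the chain.
            key, link = entries[-1], link + 1
            continue
        next_key = "+%s.%d" % (token, link + 1)
        used = len(key) + len(" ".join(entries))
        # Keep enough room to add a separator and an extension token later.
        needed = used + 1 + len(data_token) + 1 + len(next_key)
        if entries and needed > limit:
            # This link is full: end it with the extension token and put
            # the data token at the beginning of the next link.
            db[key] = " ".join(entries + [next_key])
            db[next_key] = data_token
        else:
            # Safe to append the data token to the current link.
            db[key] = " ".join(entries + [data_token])
        return


def chain_lookup(db, token):
    """Collect every data token reachable from token, following the chain."""
    results, key = [], token
    while key is not None:
        entries = db.get(key, "").split()
        key = None
        for entry in entries:
            if entry.startswith("+"):
                key = entry  # chain token: continue with the next link
            else:
                results.append(entry)
    return results
```

# With limit=20, appending "t0" through "t9" under the token "ls" spreads the
# pointers over three records, and chain_lookup(db, "ls") returns them in
# insertion order. Reserving space for the extension token on every append is
# a design choice of this sketch; it avoids having to rewrite a full record
# when the chain grows.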