APACHE 2.x ROADMAP
==================
Last modified at [$Date$]


WORKS IN PROGRESS
-----------------
    * Source code should follow style guidelines.
      OK, we all agree pretty code is good.  Probably best to clean this
      up by hand immediately upon branching a 2.1 tree.
      Status: Justin volunteers to hand-edit the entire source tree ;)

      Justin says:
        Recall when the release plan for 2.0 was written:
            Absolute Enforcement of an "Apache Style" for code.
        Watch this slip into 3.0.

      David says:
        The style guide needs to be reviewed before this can be done.
        The current file is dated April 20th 1998!

        OtherBill offers:
          It's survived since '98 because it's well done :-)  Suggest we
          simply follow whatever is documented in styleguide.html as we
          branch the next tree.  Really sort of straightforward: if you
          dislike a bit within that doc, bring it up on the dev@httpd
          list prior to the next branch.

      So Bill sums up ... let's get the code cleaned up in CVS head.
      Remember, it just takes cvs diff -b (that is, --ignore-space-change)
      to see the code changes and ignore that cruft.  Get editing, Justin :)
    * Replace stat [deferred open] with open/fstat in directory_walk.
      Justin, Ian, OtherBill all interested in this.  Implies setting up
      the apr_file_t member in request_rec, and having all modules use
      that file, and allow the cleanup to close it [if it isn't a shared,
      cached file handle.]
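
      A minimal sketch of the open/fstat idea, written against plain
      POSIX calls rather than APR so that it stands alone.  The helper
      name open_and_stat is hypothetical; a real change would presumably
      go through the apr_file_t member mentioned above, and the owner of
      the handle (the request cleanup, in that design) would close it.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open the file first, then fstat() the open
 * descriptor, so the metadata and any later reads refer to the same
 * file object instead of a path that is resolved twice. */
static int open_and_stat(const char *path, struct stat *st)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;              /* errno was set by open() */
    }
    if (fstat(fd, st) != 0) {
        close(fd);              /* do not leak the descriptor on failure */
        return -1;
    }
    return fd;                  /* caller owns fd and must close() it */
}
```

      With this shape, one descriptor serves both the metadata lookup
      and the later content reads.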
    * The Async Apache Server implemented in terms of APR.
      [Bill Stoddard's pet project.]
      Message-ID: <008301c17d42$9b446970$01000100@sashimi> (dev@apr)

      OtherBill notes that this can proceed in two parts...

         Async accept, setup, and tear-down of the request
           e.g. dealing with the incoming request headers, prior to
           dispatching the request to a thread for processing.
           This doesn't need to wait for a 2.x/3.0 bump.

         Async delegation of the entire request processing chain
           Too many handlers use stack storage and presume it is
           available for the life of the request, so a complete
           async implementation would need to happen in a 3.0 release.

      Brian notes that async writes will provide a bigger
      scalability win than async reads for most servers.
      We may want to try a hybrid sync-read/async-write MPM
      as a next step.  This should be relatively easy to
      build: start with the current worker or leader/followers
      model, but hand off each response brigade to a "completion
      thread" that multiplexes writes on many connections, so
      that the worker thread doesn't have to wait around for
      the sendfile to complete.
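
      The write-multiplexing core of such a completion thread might
      look roughly like the sketch below.  It is only an illustration
      under several assumptions: it uses POSIX poll() and write()
      instead of APR pollsets, brigades and sendfile; the pending_write
      type, completion_loop name and MAX_CONNS limit are hypothetical;
      descriptors are expected to be non-blocking; and the hand-off
      queue between worker threads and the completion thread is left
      out entirely.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_CONNS 16            /* arbitrary limit for this sketch */

/* One response handed off by a worker: bytes still owed to fd. */
typedef struct {
    int fd;
    const char *buf;
    size_t len;                 /* total bytes in buf */
    size_t off;                 /* bytes already written */
} pending_write;

/* Drive all pending writes to completion from a single thread.
 * Returns 0 once every response is fully written, -1 on error. */
static int completion_loop(pending_write *pw, size_t n)
{
    struct pollfd fds[MAX_CONNS];

    if (n > MAX_CONNS) {
        return -1;
    }
    for (;;) {
        size_t active = 0;
        for (size_t i = 0; i < n; i++) {
            /* poll() ignores entries whose fd is negative */
            fds[i].fd = (pw[i].off < pw[i].len) ? pw[i].fd : -1;
            fds[i].events = POLLOUT;
            fds[i].revents = 0;
            if (fds[i].fd >= 0) {
                active++;
            }
        }
        if (active == 0) {
            return 0;           /* nothing left to write */
        }
        if (poll(fds, (nfds_t)n, -1) < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        for (size_t i = 0; i < n; i++) {
            if (fds[i].revents & (POLLERR | POLLHUP | POLLNVAL)) {
                return -1;
            }
            if (!(fds[i].revents & POLLOUT)) {
                continue;       /* this connection is not writable yet */
            }
            ssize_t w = write(pw[i].fd, pw[i].buf + pw[i].off,
                              pw[i].len - pw[i].off);
            if (w < 0) {
                if (errno == EAGAIN || errno == EWOULDBLOCK ||
                    errno == EINTR) {
                    continue;   /* retry on the next poll() round */
                }
                return -1;
            }
            pw[i].off += (size_t)w;
        }
    }
}
```

      The point of the design is that a slow client only delays its own
      entry in the array; the worker that produced the response has
      already gone back to reading requests.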

MAKING APACHE REPOSITORY-AGNOSTIC
(or: remove knowledge of the filesystem)

[ 2002/10/01: discussion in progress on items below; this isn't
  planned yet ]
    * dav_resource concept for an HTTP resource ("ap_resource")

    * r->filename, r->canonical_filename, r->finfo need to
      disappear.  All users need to use new APIs on the ap_resource
      (backwards compat: today, when this occurs with mod_dav and a
      custom backend, the above items refer to the topmost directory
      mapped by a location; e.g. docroot)

      Need to preserve a 'filename'-like string for mime-by-name
      sorts of operations.  But this only needs to be the name itself
      and not a full path.

      Justin: Can we leverage the path info, or do we not trust the
              user?

      gstein: well, it isn't the "path info", but the actual URI of
              the resource.  And of course we trust the user... that is
              the resource they requested.

              dav_resource->uri is the field you want.  path_info might
              still exist, but that portion might be related to the
              CGI concept of "path translated" or some other further
              resolution.

              To continue, I would suggest that "path translated" and
              having *any* path info is Badness.  It means that you did
              not fully resolve a resource for the given URI.  The
              "abs_path" in a URI identifies a resource, and that
              should get fully resolved.  None of this "resolve to
              <here> and then we have a magical second resolution
              (inside the CGI script)" or somesuch.

      Justin: Well, let's consider mod_mbox for a second.  It is sort of
              a virtual filesystem in its own right, as it introduces
              its own notion of a URI space, but it is intrinsically
              tied to the filesystem to do the lookups.  But, for the
              portion that isn't resolved on the file system, it has
              its own addressing scheme.  Do we need the ability to
              layer resolution?
    * The translate_name hook goes away

      Wrowe altogether disagrees.  translate_name today even operates
      on URIs ... this mechanism needs to be preserved.
    * The doc for map_to_storage is totally opaque to me.  It has
      something to do with filesystems, but it also talks about
      security and per_dir_config and other stuff.  I presume something
      needs to happen there -- at least better doc.

      Wrowe agrees and will write it up.
    * The directory_walk concept disappears.  All configuration is
      tagged to Locations.  The "mod_filesystem" module might have some
      internal concept of the same config appearing in multiple
      places, but that is handled internally rather than by Apache.

      Wrowe suggests this is wrong; instead it's private to filesystem
      requests, and is already invoked from map_to_storage, not the core
      handler.  <Directory > and <Files > blocks are preserved as-is,
      but <Directory > sections become specific to the filesystem handler
      alone.  Because alternate filesystem schemes could be loaded, this
      should be exposed, from the core, for other file-based stores to
      share.  Consider an archive store where the layers become
      <Directory path> -> <Archive store> -> <File name>

      Justin: How do we map Directory entries to Locations?
    * The "Location tree" is an in-memory representation of the URL
      namespace.  Nodes of the tree have configuration specific to that
      location in the namespace.

      Something like:

      typedef struct {
          const char *name;  /* name of this node relative to parent */

          struct ap_conf_vector_t *locn_config;

          apr_hash_t *children; /* NULL if no child configs */
      } ap_locn_node;

      The following config:

      <Location /server-status>
          SetHandler server-status
          Order deny,allow
          Deny from all
          Allow from 127.0.0.1
      </Location>

      Creates a node with name=="server-status", and the node is a
      child of the "/" node.  (hmm. node->name is redundant with the
      hash key; maybe drop node->name)

      In the config vector, mod_access has stored its Order, Deny, and
      Allow configs.  mod_core has stored the SetHandler.

      During the Location walk, we merge the config vectors normally.

      Note that an Alias simply associates a filesystem path (in
      mod_filesystem) with that Location in the tree.  Merging
      continues with child locations, but a merge is never done
      through filesystem locations.  Config on a specific subdir needs
      to be mapped back into the corresponding point in the Location
      tree for proper merging.
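
      To make the walk concrete, here is a simplified stand-alone
      sketch.  Everything in it is an assumption made for illustration:
      locn_node is a cut-down stand-in for ap_locn_node, a fixed child
      array replaces the apr_hash_t, a single int replaces the
      ap_conf_vector_t, and "merging" is reduced to "the deepest node
      with a config wins", whereas real merges are done per module.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 8          /* arbitrary limit for this sketch */

/* Cut-down stand-in for ap_locn_node (see assumptions above). */
typedef struct locn_node {
    const char *name;           /* name relative to the parent node */
    int has_config;             /* non-zero if this node carries config */
    int config;                 /* stand-in for the config vector */
    size_t nchildren;
    struct locn_node *children[MAX_CHILDREN];
} locn_node;

/* Look up one path segment among the children of a node. */
static locn_node *find_child(const locn_node *n, const char *seg, size_t len)
{
    for (size_t i = 0; i < n->nchildren; i++) {
        const char *cn = n->children[i]->name;
        if (strlen(cn) == len && strncmp(cn, seg, len) == 0) {
            return n->children[i];
        }
    }
    return NULL;
}

/* Walk the URI path from the root, merging config at each node that
 * matches.  Segments below the last matching node inherit its config. */
static int locn_walk(const locn_node *root, const char *uri)
{
    const locn_node *node = root;
    int merged = root->has_config ? root->config : 0;
    const char *p = uri;

    while (node != NULL && *p != '\0') {
        while (*p == '/') {
            p++;                /* skip separators */
        }
        size_t len = strcspn(p, "/");
        if (len == 0) {
            break;              /* trailing slash or end of path */
        }
        node = find_child(node, p, len);
        if (node != NULL && node->has_config) {
            merged = node->config;   /* "merge": deeper node overrides */
        }
        p += len;
    }
    return merged;
}
```

      With a root node holding one config value and a "server-status"
      child holding another, a walk of /server-status returns the
      child's value, while any URI with no matching child falls back to
      the root's.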
    * Config is parsed into a tree, as we did for the 2.0 timeframe,
      but that tree is just a representation of the config (for
      multiple runs and for in-memory manipulation and usage).  It is
      unrelated to the "Location tree".
    * Calls to apr_file_io functions generally need to be replaced
      with operations against the ap_resource.  For example, rather
      than calling apr_dir_open/read/close(), a caller uses
      resource->repos->get_children() or somesuch.

      Note that things like mod_dir, mod_autoindex, and mod_negotiation
      need to be converted to use these mechanisms so that their
      functions will work on logical repositories rather than just
      filesystems.
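
      One possible shape for that indirection is a table of function
      pointers per repository.  None of the types below exist in httpd;
      ap_resource, ap_repos_hooks, mem_dir and list_children are
      hypothetical names chosen for this sketch, and the get_children
      signature is an assumption.

```c
#include <stddef.h>

typedef struct ap_resource ap_resource;    /* hypothetical type */

/* Hypothetical per-repository operations table. */
typedef struct {
    /* Store up to max child names in names[]; return how many. */
    size_t (*get_children)(const ap_resource *res,
                           const char **names, size_t max);
} ap_repos_hooks;

struct ap_resource {
    const char *uri;                /* resource identity, not a file path */
    const ap_repos_hooks *repos;    /* backend that owns this resource */
    void *backend;                  /* backend-private data */
};

/* A trivial in-memory repository, standing in for a non-filesystem
 * store such as a DAV backend. */
typedef struct {
    const char *const *names;
    size_t count;
} mem_dir;

static size_t mem_get_children(const ap_resource *res,
                               const char **names, size_t max)
{
    const mem_dir *dir = res->backend;
    size_t n = dir->count < max ? dir->count : max;

    for (size_t i = 0; i < n; i++) {
        names[i] = dir->names[i];
    }
    return n;
}

static const ap_repos_hooks mem_repos = { mem_get_children };

/* Repository-neutral caller: roughly what an autoindex-style module
 * would do instead of apr_dir_open/read/close(). */
static size_t list_children(const ap_resource *res,
                            const char **names, size_t max)
{
    return res->repos->get_children(res, names, max);
}
```

      The caller never learns whether the children came from a
      directory, a database or an archive; only the hooks table knows.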
    * How do we handle CGI scripts?  Especially when the resource may
      not be backed by a file?  Ideally, we should be able to come up
      with some mechanism to allow CGIs to work in a
      repository-independent manner.

      - Writing the virtual data as a file and then executing it?
      - Can a shell be executed in a streamy manner?  (Portably?)
      - Have an 'execute_resource' hook/func that allows the
        repository to choose its manner - be it exec() or whatever.
        - Won't this approach lead to duplication of code?  Helper fns?

      gstein: PHP, Perl, and Python scripts are nominally executed by
              a filter inserted by mod_php/perl/python.  I'd suggest
              that shell/batch scripts are similar.

              But to ask further: what if it is an executable
              *program* rather than just a script?  Do we yank that out
              of the repository, drop it onto the filesystem, and run
              it?  eeewwwww...

              I'll vote -0.9 for CGIs as a filter.  Keep 'em handlers.

      Justin: So, do we give up executing CGIs from virtual repositories?
              That seems like a sad tradeoff to make.  I'd like to have
              my CGI scripts under DAV (SVN) control.
    * How do we handle overlaying of Location and Directory entries?
      Right now, we have a problem when /cgi-bin/ is ScriptAlias'd and
      mod_dav has control over /.  Some people believe that /cgi-bin/
      shouldn't be under DAV control, while others do believe it
      should be.  What's the right strategy?