[Camps-users] Upcoming development session and a proposed way forward

Thu Jan 15 03:41:32 UTC 2009

Ethan Rowe wrote:
> All,
> 
> 
> With that in mind, I'd like to put forward (yet another) proposal for a
> camp system design.  This is derived in part from the proposal I made
> almost a year ago (in February), except that it arguably simplifies
> things: instead of having "hosts" versus "services" versus "resources",
> we just have "resources" which encompass everything.
> 

Figures you would go in the opposite direction from what I already coded ;-).

> This does not have to be what we do.  We could look at all sorts of
> things, from starting out with a fundamentally new design, to unifying
> the camp commands to a single "camp <command> <options>" style, to
> improving the docs on the existing system, etc.
> 

See my other e-mail. I think there is enough there that building a 
unified command syntax will be the easy part; pulling all the other 
pieces together is more difficult.

> I'd love to hear some ideas on what matters, but I'd specifically like
> to hear feedback on the proposed design (to follow, in great detail).
> If we can have some rough consensus on what a next-generation camp
> system would look like, this group could start on it, understanding that
> we would start with a bite-sized chunk that we could really deliver in a
> relatively short time.
> 

I've posted some initial thoughts and code. For me, having the camp 
system installable from some sort of packaging is key, as is being able 
to put it just about anywhere and have camps live anywhere (i.e. not 
tied to the home directory, though that is a fine default). So that is 
the area I concentrated on initially. I like the option of distributing 
the system on CPAN because it is system agnostic (including being 
available on Windows, Mac OS X, etc.) but also allows other packages to 
be built (rpms, debs, etc.). I'd even volunteer tuits towards being 
the PAUSE maintainer so that no one else has to deal with it. 
(I've had an interest in it for a while anyway.)

> Please let me know your thoughts.  Thanks.
> - Ethan
> 
> BASIC MODEL
> 
> No separation of "hosts", "services", and "resources".  Just "resources".
> 
> A resource may be a host, or a service.  There's no need for
> distinguishing them.
> 

I'm not sure I fully see the need for "host" at all, but it may be that 
you are thinking more of encapsulation and hierarchy than is in camps 
v3. I was thinking of the host as just being a configuration parameter 
of a resource. Or maybe this is more related to distributed 
environments than my brain is ready to contemplate :-).

> One resource can depend on another.
> 
> One resource may contain another.
> 
> We use resource containment (object composition) to build operational
> relationships between resources.
> 
> We use (hopefully shallow) inheritance hierarchies to define families of
> related resource types (like a "database" resource family of which
> "Postgres" and "MySQL" are members).
> 

Yep, was thinking along these lines.

> INTERACTING WITH RESOURCES
> 
> 1. Low level interaction
> 
> A resource has two basic "command interfaces", by which I mean ways of
> issuing commands/requests to the thing represented by the resource (not
> "interfaces" in the sense of Java-esque interfaces that a class implements):
> * a "system" interface: this is the interface through which the camp
> system issues commands/requests to initialize and configure the resource
> in question (i.e. if you need to initialize a new Postgres cluster, you
> use the Postgres resource's system interface)
> * a "service" interface: this is the interface that issues
> commands/requests to the actual public-facing interface offered by the
> resource in question; once you've configured your Postgres resource
> through its system interface, you issue DDL commands through its service
> interface.
> 
> Each resource has a container attribute, which simply points back to the
> resource that contains the resource in question.  So, the local
> operating system resource is the container for the postgres resource.
> 
> By default, a resource's "system" interface simply passes arguments
> through to the resource container's "service" interface.  Thus, commands
> issued to the resource's system interface are formatted as commands
> issued against the container service.  We would probably want some
> convention so that adding in resource class-specific functionality (so
> all commands issued would be issued relative to some base
> script/executable/whatever) is easily done with minimal base/super-class
> interaction annoyances (no need to call $self->SUPER::foo, for instance,
> in Perl-speak).
> 
> The default resource is "localhost", the service and system interfaces
> of which are the same.  They simply format commands to go to the
> underlying operating system of the primary camp server.
> 
> A resource representing a remote host might use the "localhost"
> interfaces to format commands, but pipe those commands over an SSH
> tunnel, maintained within the resource instance.
> 
> A virtual machine resource would implement the SSH tunnel design,
> presumably, for its "service" interface, while the default value of
> using its container's "service" interface for its own "system" interface
> would make perfect sense.  So, for instance, "localhost" might be the
> container, meaning that the VM resource system interface is the same
> as the localhost service interface.  So, you are effectively issuing
> shell commands to localhost via the VM resource service interface when
> initializing the VM resource.
> 

I'm not sure I follow all of this, and it sounds like a lot of 
abstraction to try to achieve right away. I'm wondering if it makes more 
sense in a later iteration?
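
For reference, the delegation rule quoted above (a resource's "system" 
interface defaults to its container's "service" interface) can be 
sketched in a few lines of plain Perl. This is only an illustration: the 
package name Sketch::Resource and the method names service and 
system_iface are made up here, and the "commands" are just formatted 
strings rather than anything that gets executed.

```perl
use strict;
use warnings;

# Hypothetical base class; all names here are illustrative assumptions.
package Sketch::Resource;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, container => $args{container} }, $class;
}

# Default "service" interface: tag the command with the resource name so
# the delegation chain is visible. A real resource would run the command.
sub service {
    my ($self, @cmd) = @_;
    return join ' ', "[$self->{name}]", @cmd;
}

# Default "system" interface: pass the arguments through to the
# container's "service" interface. A resource without a container
# (i.e. "localhost") uses its own service interface, so both are the same.
sub system_iface {
    my ($self, @cmd) = @_;
    my $container = $self->{container} or return $self->service(@cmd);
    return $container->service(@cmd);
}

package main;

my $localhost = Sketch::Resource->new(name => 'localhost');
my $postgres  = Sketch::Resource->new(name => 'postgres', container => $localhost);

# Initialization goes through the container (localhost) ...
print $postgres->system_iface('initdb', '-D', 'pgdata'), "\n";
# ... while DDL goes to the postgres resource itself.
print $postgres->service('CREATE TABLE t (id int)'), "\n";
```

A remote-host or VM resource would override service (for example to send 
the command over SSH) and leave system_iface alone.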

> 2. High-level interaction
> 
> We might want resources to implement a few common methods like:
> * initialize
> * remove
> * start
> * stop
> * restart

To which I would add "refresh", based on the current implementation. But 
the list looks like a good start. I would probably also add "info" as a 
generic way to get just that: info. For example, whether a service is 
running or down, whether it is outdated, its configuration data, etc.

> 
> Logically, initialize and remove methods would be primarily concerned
> with the system interface.  Start, stop, restart would probably use the
> service interface.  For instance, an Apache resource would basically use
> file system operations (copying files, removing files) to build up and
> tear down an Apache resource instance.  But controlling the resource
> once it's initialized would all be done through the httpd command,
> around which the service interface would basically be a wrapper.
> 
> RESOURCE CONFIGURATION
> 
> Resources would have attributes that must be configured.  Attribute
> values should generally be calculated algorithmically.  The current camp
> system basically treats attribute calculation as a set of mathematical
> functions (that don't look terribly mathematical); for a given camp
> number X, each attribute has only one possible value f(X) (i.e. the camp
> number determines everything, from paths to port numbers to hostnames).
> 

Good description.
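
To make the f(X) idea concrete, here is a minimal sketch in which every 
attribute is a pure function of the camp number. The port bases, path 
layout, and hostname pattern are invented for the example and are not 
taken from the existing camp system.

```perl
use strict;
use warnings;

# Every value depends only on the camp number. The constants and naming
# patterns below are assumptions made up for this sketch.
sub camp_attributes {
    my ($camp_number) = @_;
    return {
        http_port => 9000 + $camp_number,
        db_port   => 8900 + $camp_number,
        base_path => "/home/camper/camp$camp_number",
        hostname  => "camp$camp_number.example.com",
    };
}

my $attr = camp_attributes(12);
printf "%s %d %s\n", $attr->{hostname}, $attr->{http_port}, $attr->{base_path};
```

Calling camp_attributes twice with the same number always gives the same 
answer, which is exactly the property that persistent attributes would 
relax.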

> However, it should be possible for attributes to be persistent, meaning
> that once derived they are permanently stored as part of a camp's
> configuration, rather than dynamically calculated for all time.
> 
> It must be possible to reinitialize or reconfigure a resource (or set of
> resources), and persistent attributes can be preserved rather than
> recalculated.  Control over what persistent attributes to blow away at
> re-initialization time must be built in from the beginning if this is to
> make sense.
> 
> Resource configuration, once determined, should be serialized down to
> some storage format like YAML, JSON, XML (!!!!!!!!!!!), etc., so that
> persistent attributes are preserved.
> 
> Furthermore, because the central camp system needs to know all
> attributes of the extant camps, this configuration information must be
> preserved in the central system.  However, to potentially allow for a
> distributed camp system where camps are spread across N nodes in a
> server cluster (yes, we have a camp system for which this would be
> highly relevant), and simply to allow local inspection of the
> configuration, each camp's resource configuration information should
> persist within the camp itself (i.e. in userland rather than base
> camp-systemland).
>

In this vein I have the system create a ".camp" directory inside of the 
camp itself for storing information of this kind. Currently I have it 
store the date+time the camp was created and the type, just as sample data.
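
A minimal sketch of that kind of per-camp storage, using only core 
modules. The file name config.json and the choice of JSON are assumptions 
for the example (the storage format is still an open question above), and 
a temporary directory stands in for a real camp.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);
use JSON::PP;

# Hypothetical layout: <camp>/.camp/config.json
sub config_file {
    my ($camp_dir) = @_;
    return File::Spec->catfile($camp_dir, '.camp', 'config.json');
}

# Serialize the configuration hash into the camp's .camp directory.
sub save_config {
    my ($camp_dir, $config) = @_;
    make_path(File::Spec->catdir($camp_dir, '.camp'));
    open my $fh, '>', config_file($camp_dir) or die "cannot write config: $!";
    print {$fh} JSON::PP->new->canonical->pretty->encode($config);
    close $fh or die "cannot close config: $!";
    return;
}

# Read it back; a camp without stored configuration yields an empty hash.
sub load_config {
    my ($camp_dir) = @_;
    my $file = config_file($camp_dir);
    return {} unless -e $file;
    open my $fh, '<', $file or die "cannot read config: $!";
    my $text = do { local $/; <$fh> };
    close $fh;
    return JSON::PP->new->decode($text);
}

my $camp_dir = tempdir(CLEANUP => 1);   # stand-in for a real camp directory
save_config($camp_dir, { type => 'demo', created => scalar(localtime) });
my $config = load_config($camp_dir);
print "type=$config->{type} created=$config->{created}\n";
```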

> This means having some configuration data scattered around, which is
> kind of a drag.  But it's a manageable drag.  Camp commands for
> manipulating configuration values (camp config set <name> <value>) could
> only be counted as successful if they appear to work centrally and
> locally, for instance.
>

Agreed.

> MAGICAL RESOURCES
> 
> The base resource would of course define basic behavior for all resources.
> 
> A "localhost" resource would exist by default.  It refers to the
> underlying operating system (and shell environment) for the central camp
> system itself.
> 
> I'd like to propose that individual camps be treated as resources as
> well.  This is a fairly new idea (in my mind, anyway) and may be
> completely ridiculous.  I'm just putting it out there.  We represent the
> camps themselves as resources, and then the attributes of the camps
> (i.e. numbers, owners, base paths, etc.) are managed like any other
> resource attribute.  Furthermore, the persistent attribute functionality
> lets us potentially have more command over what goes into a given camp.
>  Perhaps the default for the memcached resource is for camps to get a
> single memcached server node.  That would be fine for most cases,
> probably, unless you're the guy who needs to do hard-core memcached
> usage, testing, etc., in which case you really need multiple memcached
> servers for your camp.  So you can configure your camp resource to
> indicate a need for 5 memcached servers, say, which in turn affects how
> the memcached resources are configured when you reinitialize your camp.
>

This fits in with another desire I had for the new system, which is 
to allow a camp's contents to be pulled from multiple 
repos, possibly of different VCS types. In our world that would allow 
multiple camp projects to pull from a single IC resource, for example, 
which would make upgrades easier, etc.

> That raises the complexity of things considerably but I like it anyway.
>  Please act terribly surprised.
> 
> RESOURCE IDENTIFICATION
> 
> Each resource should have a friendly type-name.  Like "localhost",
> "postgres", "apache", "git", "svn", etc.
> 
> Any given resource in a camp system deployment should have a name
> attribute that can be explicitly set in the configuration, but the
> resource should default to its type-name.  This means we have a decent
> convention that would work well for relatively simple deployments for
> which there is only one sort of resource for each given layer of the
> stack (e.g. one Apache instance, one Postgres instance, one appserver
> instance, etc.).
> 
> If there is truly only one use of a particular resource type within a
> deployment, then the bare type-name can suffice to identify that
> resource.  However, names can be relative to container resources;
> resources representing remote hosts, for instance, could all contain a
> "postgres" resource:
>   - host1.postgres
>   - host2.postgres
>   - ...
> 
> Using a configuration-specified name only becomes important if you want
> to name things according to use-case-specific roles ("master_db" versus
> "slave_db", for instance), or if you need multiple instances of the same
> resource (an Apache instance that serves static content, and an Apache
> instance that functions as an appserver for php/mod_perl/mod_python).
>

We already talked a bit about this offline, but I think it is a nice 
setup. I think a specific resource of a given type should be able to be 
marked as "default", so that we can still refer to it by its bare 
resource type even when there are multiple named resources of that type.
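
Here is one way the lookup could work, as a sketch only. The registry 
layout, the "default" flag, and the resolve function are all hypothetical; 
the host and role names are borrowed from the examples quoted above.

```perl
use strict;
use warnings;

# Hypothetical registry: container name => { resource name => attributes }.
my %resources = (
    host1 => { postgres => { role => 'master_db', default => 1 } },
    host2 => { postgres => { role => 'slave_db' } },
);

# Resolve a container-relative name such as "host1.postgres" directly.
# Resolve a bare type-name such as "postgres" when exactly one resource
# has that name, or when exactly one of them is marked as the default.
sub resolve {
    my ($name) = @_;
    if ($name =~ /^([^.]+)\.([^.]+)$/) {
        my ($container, $resource) = ($1, $2);
        return undef unless exists $resources{$container};
        return undef unless exists $resources{$container}{$resource};
        return "$container.$resource";
    }
    my @matches = grep { exists $resources{$_}{$name} } sort keys %resources;
    return "$matches[0].$name" if @matches == 1;
    my @defaults = grep { $resources{$_}{$name}{default} } @matches;
    return @defaults == 1 ? "$defaults[0].$name" : undef;
}

print resolve('postgres'), "\n";         # the one marked as default
print resolve('host2.postgres'), "\n";   # explicit, container-relative name
```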

> RESOURCE DECLARATION
> 
> Naturally, we need a way to declare resources as existing/mattering, as
> having relationships with each other, etc.
>

Yep. I think we can look at just about any dependency handling in any 
package manager and get some hints.

> I need to think this one through, more (whereas nothing else defined
> above requires any further thinking-through whatsoever, obviously).
> However, a few things probably ought to guide us:
> * simplicity and clarity
> * ease of use
> * common-sense defaults/conventions that fit the simple/common case,
> easily extended/overridden for the less common cases.
> 
> If that sounds a little like Rails' "convention over configuration",
> there's probably a reason.
> 
> Here's an idea:
> * under any given camp type, there's a "resources" directory in which
> the resource definitions reside
> * any top-level resource (other than "localhost", which is the magical
> root resource) appears as a directory within this "resources" directory
> (e.g. <camp_type>/resources/postgres/;
> <camp_type>/resources/memcached/; <camp_type>/resources/django/; etc.,
> etc., etc.)
> * the name of the directory identifies the resource's name relative to
> its containing resource; in the above examples, that means you get
> top-level resources named "postgres", "memcached", and "django"
> * within each directory, some file exists that defines the resource
> configuration/behavior; it could be named the same for any resource
> (i.e. "resource", or "config", etc.), or it could be required to have a
> name that matches the resource name (i.e. same as the directory)
> * that file would be a Perl module.
> * Furthermore, it would, when parsed, default to subclassing the
> resource type of the same name as the resource being defined; so, if the
> resource is named "postgres", the camp system would look for a
> "postgres" resource definition class in its standard library of known
> resource types; if found, the "postgres" resource's configuration module
> would automatically subclass it.
> * Furthermore, the module can override this and explicitly specify what
> type of resource it is.  Hence we have convention and configuration.  Yay.
> * Still furthermore, the module would automatically be using Moose and
> whatever helper functions we want available for easy definition of
> attributes, etc.
> * A resource can have a subdirectory "resources" that contains still
> more resource definitions, establishing the container/contained
> relationship sensibly within the filesystem.
> * A resource has a subdirectory "templates" that contains template
> configuration files for rendering when installing resources into a camp.
>  This is like the <camp_type>/etc/ directory from the existing camp
> design, but cleanly separated by resource
>

Other than the storage of the template files this seems like a lot of 
extra work to traverse and read from the filesystem when it could be 
dropped into a single configuration file that has arbitrary depth 
(XML-esque) fairly easily.
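
To illustrate, the same containment could live in one nested structure 
(shown as a Perl hash here; it could just as well be loaded from a single 
YAML or XML file). The key names "resources" and "attributes" and the 
sample values are assumptions made up for this sketch.

```perl
use strict;
use warnings;

# One nested structure instead of a directory tree. Nesting under the
# "resources" key expresses the container/contained relationship.
my %config = (
    postgres => { attributes => { port_base => 8900 } },
    host1    => {
        resources => {
            postgres  => {},
            memcached => { attributes => { nodes => 5 } },
        },
    },
);

# Walk the tree to any depth and return container-relative names.
sub resource_names {
    my ($tree, $prefix) = @_;
    my @names;
    for my $name (sort keys %$tree) {
        my $full = defined($prefix) ? "$prefix.$name" : $name;
        push @names, $full, resource_names($tree->{$name}{resources} || {}, $full);
    }
    return @names;
}

print join("\n", resource_names(\%config)), "\n";
```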

> A master configuration file for the camp type should specify a listing
> of resources to include by default in a new camp.  Perhaps that's done
> in some nice declarative manner, like:
> 
>  default_resources(
>      qw(
>          postgres
>          apache
>          memcached
>          git
>          django
>      )
>  );
> 
> (Or something equivalent, perhaps in YAML.  Whatever).
> 
> When operating on a camp (setting up a new one, for instance), the camp
> system consults the command-line arguments to determine what resources
> to include; if not specified there, it uses the default.
> 
> From there, it loads up only the resource modules that are necessary.
> 

See Module::Pluggable::Object and autouse (5.10 only??).
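
Leaving the plugin mechanism aside, the selection rule itself (command 
line first, defaults otherwise) is small. A sketch, where the 
Camp::Resource:: namespace and the function names are hypothetical and 
nothing is actually loaded:

```perl
use strict;
use warnings;

# Defaults as declared for the camp type (list taken from the example above).
my @default_resources = qw(postgres apache memcached git django);

# Use the resources named on the command line if any, else the defaults.
sub resources_to_load {
    my (@requested) = @_;
    return @requested ? @requested : @default_resources;
}

# Map a resource name to a module name in a made-up namespace.
sub module_for {
    my ($name) = @_;
    return 'Camp::Resource::' . ucfirst lc $name;
}

for my $name (resources_to_load(@ARGV)) {
    my $module = module_for($name);
    # A real loader would require the module at this point; this sketch
    # only reports what it would load.
    print "would load $module\n";
}
```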

> All resources get instantiated as objects in memory with their
> configuration determined as a first pass, and then templates get
> rendered and resources installed/launched as a second pass.
> 
> BASE RESOURCE DEFINITIONS
> 
> It is basically implied by the above that resource types have a base
> definition within the camp system.
> 
> So, there would be one base "resource" module defining the stuff common
> to all resources.
> 
> From there, we would have subclasses of this "resource" module that
> implement specific resource types (again, "postgres" versus "mysql"
> versus "apache" versus "fabulous shiny pants" versus whatever else).
> This is standard inheritance.
> 
> We do not fret about getting too generic, at least at first.  I.e. no
> "Resource::Database" from which "Resource::Database::Postgres" and
> "Resource::Database::MySQL" derive.  We implement postgres and mysql
> independently, and if common things can obviously be factored out into
> some common ancestor, we do it.  But we shouldn't assume too many
> similarities between competing resource types despite them fulfilling
> the same role (if Postgres and MySQL can honestly be thought of as
> fulfilling the same role, which is a topic for another day).
> 
> Because the resource configuration modules in each camp type are Perl
> modules that subclass these things, we automatically get the opportunity
> for customization within our camps without having to hack the base type
> classes themselves.  Joy.  That ought to be a no-brainer, but those of
> us accustomed to working with some older appservers will likely find
> this most liberating.
> 
> I could go on but that's enough for now.
> 

I'll say :-). So who has some tuits?

-- 
Brian J. Miller
End Point Corp.
brian at endpoint.com