# ========== Version Info ===================
# $Rev: 1509 $
# $Author: patdunlavey $
# $Date: 2007-10-11 11:37:22 -0700 (Thu, 11 Oct 2007) $
# ===========================================

============================================================
AdvoKit design note (very preliminary)
============================================================

*** Elevator Speech

AdvoKit is a web application that allows a distributed,
hierarchical group of volunteers to work together towards a set
of common goals.  All voter information is captured in ways that
can be searched, sorted, and acted upon.  All goals are
measurably defined, so you know when you are reaching them.

AdvoKit is designed to record detailed voter information from a
variety of sources, including the voters themselves, and then
aggregate this information for the purpose of producing high-quality,
iteratively-refined lists of voters.  AdvoKit then provides the tools
to mobilize these voters, so you can achieve your goals.

Ding!  [elevator stopping noise]

*** Data Model

Briefly, the "voter" is the thing we're interested in, the "volunteer" is
the thing that can record and act on voter information, and everything
else has to do with capturing voter information or organizing volunteers.

There is no concept of a "volunteer" in the database.  All users are
assumed to be volunteers.  

* Voters

Voter information is primarily captured by the "voter" table, which has
upward of 100 columns.  It is hoped that the voter table is a superset of
the lion's share of popular/typical voterfile formats.

For information not native to our attempted one-size-fits-all voter table,
we provide two storage and management mechanisms: built-in custom fields
and extended custom fields.  In the voter table, the columns whose names
start with "custom_" are "built-in".  These fields can be renamed in the 
UI, and the names are stored in the votercustom table.

Extended custom field documentation TBA.  Basically, a copy of Willow
custom fields, but associated with voters, not tasks or jobs.

* Organization

As we mentioned, volunteers are organized hierarchically.  The
hierarchy is not direct.  Rather, AdvoKit provides a hierarchy of
nodes, and volunteers can be attached to these nodes in varying
capacities.

Top-level nodes are "campaigns", which roughly correspond to
"campaigns" in English.  So "Bob Jones for President" would be one
campaign, and "Veto Proposition #256" would be another.  Nothing
prevents Bob Jones from also adding a separate "Bob Jones for
President in 2020" campaign.

Second-level nodes are "operations".  A campaign may contain any number
of operations.  An operation has a specific start and end date, as well
as some voter relevancy parameters to use when generating contact sheets.

All other nodes are simply ... nodes.  

A node has the following attributes:

- Depth, which is the distance from the node to the top-level
campaign node.

- Parent, which is the parent node.

- Exercise, which is a set of questions and answers.

- Contact filter, which is a SQL where clause that is ANDed with all
parent node contact filters in order to winnow down the list of
possible voters available to that node.
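
As a minimal sketch of how the contact filters might combine, the
following walks from a node up to its campaign and ANDs the filters
together.  The Node class, the node names, and the "state" and
"precinct" columns are all hypothetical stand-ins; the real data lives
in the node table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a row in the node table."""
    name: str
    parent: Optional["Node"] = None
    contact_filter: str = ""  # SQL WHERE fragment; empty means no restriction

    @property
    def depth(self) -> int:
        # Distance from this node up to the top-level campaign node.
        return 0 if self.parent is None else self.parent.depth + 1


def effective_filter(node: Node) -> str:
    """AND this node's filter with the filters of all of its ancestors."""
    clauses = []
    current = node
    while current is not None:
        if current.contact_filter:
            clauses.append("(" + current.contact_filter + ")")
        current = current.parent
    clauses.reverse()  # campaign-level filter first
    return " AND ".join(clauses) if clauses else "1=1"


# Hypothetical hierarchy: campaign -> operation -> plain node.
campaign = Node("Bob Jones for President", contact_filter="state = 'OR'")
operation = Node("Spring canvass", parent=campaign)
precinct = Node("Precinct 12", parent=operation, contact_filter="precinct = '12'")

print(effective_filter(precinct))
```

Each fragment is parenthesized so that an OR inside one node's filter
cannot change the meaning of a filter on another node.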

Users associated with parent nodes (via personnodeposition) can access
any voter information associated with child nodes (via
voternodeposition).
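
The access rule can be sketched as an ancestor check.  This is only an
illustration: the node ids and the PARENT mapping are made up, and it
assumes that a user attached to the same node as the voter also has
access, which the rule above does not state explicitly.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical node ids mapped to their parent ids (None marks a campaign).
PARENT: Dict[int, Optional[int]] = {1: None, 2: 1, 3: 2, 4: 2, 5: 1}


def ancestors_and_self(node_id: int, parent: Dict[int, Optional[int]]) -> Set[int]:
    """Return the node id plus every id on the path up to the campaign."""
    chain = set()
    current: Optional[int] = node_id
    while current is not None:
        chain.add(current)
        current = parent[current]
    return chain


def can_access(person_nodes: Iterable[int], voter_nodes: Iterable[int],
               parent: Dict[int, Optional[int]] = PARENT) -> bool:
    """True if any of the person's nodes is at or above one of the voter's nodes."""
    person_set = set(person_nodes)
    return any(person_set & ancestors_and_self(v, parent) for v in voter_nodes)
```

Here person_nodes plays the role of the rows in personnodeposition for
one user, and voter_nodes the rows in voternodeposition for one voter.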

* Objectives

Actually, we call them "goals" in the database.  A goal is a free-form
text field.

Goals are achieved through "exercises".  AdvoKit provides the means to
create and configure a variety of exercises, which are typically
performed while interacting with voters in the early phases of a
campaign.

Operations need not have exercises associated with them.

* Exercises

An exercise is a set of questions and answers.  If a volunteer is
interacting with a voter in the context of a particular operation,
then the volunteer is expected to attempt to answer the questions,
either by asking the voter directly, or by inferring the answers from
the conversation (e.g. "I breathe CO2" means the voter is probably a
plant-based organism).  

Exercises have a "trickle" attribute, which indicates whether
they apply only to the node itself, or to all nodes below.
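
One possible reading of "trickle", sketched below: a node sees its own
exercise plus any exercise on an ancestor whose trickle flag is set.
The class and field names are hypothetical, and the real lookup would
presumably be done against the database rather than in memory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Exercise:
    name: str
    trickle: bool = False  # True: also applies to every node below


@dataclass
class HierarchyNode:
    name: str
    parent: Optional["HierarchyNode"] = None
    exercise: Optional[Exercise] = None


def applicable_exercises(node: HierarchyNode) -> List[str]:
    """Names of exercises that apply at this node, nearest first."""
    found = []
    if node.exercise is not None:
        found.append(node.exercise.name)  # a node's own exercise always applies
    ancestor = node.parent
    while ancestor is not None:
        if ancestor.exercise is not None and ancestor.exercise.trickle:
            found.append(ancestor.exercise.name)
        ancestor = ancestor.parent
    return found


# Hypothetical hierarchy with one trickling and two non-trickling exercises.
top = HierarchyNode("campaign", exercise=Exercise("party preference", trickle=True))
mid = HierarchyNode("operation", parent=top, exercise=Exercise("donation ask"))
leaf = HierarchyNode("precinct", parent=mid, exercise=Exercise("yard sign"))
```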

* Positions

A position is just a name and a set of tasks.  A position can be
associated with any number of nodes, meaning that at any point in time,
multiple nodes at various levels in the hierarchy may be tasked with
the same thing (e.g. call voter, ask about favorite foods, ask about
donations).

* Volunteers

A volunteer is a user in the system, a row in the person table.
Volunteers are mapped to position-node pairs, thus a volunteer may
hold multiple positions in the hierarchy.
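
To illustrate the mapping, here is a small sketch that resolves the
tasks a volunteer holds at each node.  The names, positions, and tasks
are invented for the example, and the tuple list only approximates
what the personnodeposition table would hold.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical positions: a name and a set of tasks.
POSITION_TASKS: Dict[str, Set[str]] = {
    "phone banker": {"call voter", "ask about donations"},
    "precinct captain": {"assign call lists"},
}

# Hypothetical (person, node, position) rows.
PERSON_NODE_POSITION: List[Tuple[str, str, str]] = [
    ("alice", "Precinct 12", "phone banker"),
    ("alice", "Spring canvass", "precinct captain"),
    ("bob", "Precinct 12", "phone banker"),
]


def tasks_by_node(person: str) -> Dict[str, Set[str]]:
    """Collect the tasks a person holds at each node, across all positions."""
    result: Dict[str, Set[str]] = {}
    for who, node, position in PERSON_NODE_POSITION:
        if who == person:
            result.setdefault(node, set()).update(POSITION_TASKS[position])
    return result
```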

*** Architecture

AdvoKit is built around a modified Willow framework.  The framework
itself will not be discussed here, only the modifications.

As mentioned earlier, AdvoKit is a distributed web application.  An
"installation" of AdvoKit is a set of instances -- one master
instance, and one or more slave instances.  There can only be one
master among related instances.

An instance may be both master and slave.

The choice of architecture is motivated by the need to distribute
the workload while simultaneously providing centralized reporting.

* Master instance 

The primary function of the master instance is to configure and update
slave instances with information common to ALL instances.  Such
information includes:

- Any lookup table (e.g. cat, county, imtype)
- Voter custom fields
- Campaigns, operations, goals, and exercises

Basically, anything that can be filtered or reported on must be
created and configured by the master instance.

* Slave instance

The primary function of slave instances is to record interactions with
voters.  Information unique to a slave instance includes:

- Voter records
- Values for custom fields
- Answers to exercises
- Positions, tasks, and task followups
- Volunteers

* Interaction between master and slave

Segmenting a database is easy enough.  The real problem is aggregating
the data for reporting purposes.  There are two aspects to this
problem:

- Pushing out master key values to slave instances, and
- Merging data created on slave instances

The basic mechanisms are as follows:

Tables containing data which is to be pushed out will be maintained
only from the master instance.  The configuration for the master
instance includes a list of remote database servers, along with their
connection information.  When viewing such data, the master instance
will query only the local database, but when inserting, updating, or
deleting the data, the master instance will attempt to perform the
operation on each remote database in turn.  If transactions are
enabled, then should there be a problem when updating a remote
database, all transactions will be rolled back.
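
A rough sketch of the push-out mechanism, using in-memory SQLite
databases as stand-ins for the master and the remote servers.  The
push_write and make_instance helpers and the cut-down county table are
assumptions made for illustration, not the actual implementation.

```python
import sqlite3


def push_write(connections, sql, params=()):
    """Run one write on every connection; commit all or roll back all.

    Committing each connection in turn is not a true two-phase commit,
    so a failure during the commit loop could still leave the instances
    out of step.
    """
    try:
        for conn in connections:
            conn.execute(sql, params)
    except sqlite3.Error:
        for conn in connections:
            conn.rollback()
        return False
    for conn in connections:
        conn.commit()
    return True


def make_instance():
    """Create a stand-in instance; every instance gets the same schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE county (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()
    return conn


master, slave_a, slave_b = make_instance(), make_instance(), make_instance()
instances = [master, slave_a, slave_b]

ok = push_write(instances, "INSERT INTO county (id, name) VALUES (?, ?)",
                (1, "Multnomah"))

# Simulate a slave that has drifted: it already holds id 2, so the
# next push fails there and is rolled back everywhere.
slave_b.execute("INSERT INTO county (id, name) VALUES (2, 'Lane')")
slave_b.commit()
failed = push_write(instances, "INSERT INTO county (id, name) VALUES (?, ?)",
                    (2, "Benton"))
```

The master's own connection is first in the list, so a failed push
leaves the local copy unchanged as well.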

Tables containing data which is specific to slave instances are marked
with an instance code.  This, combined with the primary key (always
"id") for the table, provides a means by which data can be aggregated
for reporting purposes.
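
For example, aggregation might key each row on the pair (instance
code, id), as in this sketch.  The "instance" column name and the
sample rows are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def aggregate(*row_sets: Iterable[dict]) -> Dict[Tuple[str, int], dict]:
    """Merge rows from several instances, keyed by (instance code, id)."""
    merged: Dict[Tuple[str, int], dict] = {}
    for rows in row_sets:
        for row in rows:
            key = (row["instance"], row["id"])
            if key in merged:
                raise ValueError("duplicate key: %r" % (key,))
            merged[key] = row
    return merged


# Both instances use id 1, but the instance code keeps the rows apart.
east = [{"instance": "east", "id": 1, "lastname": "Lovelace"}]
west = [
    {"instance": "west", "id": 1, "lastname": "Turing"},
    {"instance": "west", "id": 2, "lastname": "Hopper"},
]
report = aggregate(east, west)
```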

Mega-important note: for this to work, the database schema for all
instances MUST be identical.

Master-controlled tables:

answer
campaign
cat
county
customfield
customfieldsettings
customfieldvalue
department (not sure if I'll use this one)
exercise
exercisesection
goal
imtype
language
operation
organization (not sure if I'll use this one)
priority
qcat
question
secretquestion
sectionaction
sectiondisplay
section
sectionquestion (exercise sections, not user sections)
state (as in united states)
votercustom (for native custom field headings)


Slave-controlled tables -- these will be aggregated into reports.

voteranswer
votercustomfield
voterexercise
voterlanguage
voter
voternodeposition


Tables used in both, not necessarily shared:

attachment
filter
followup
followupdetail
image
node
nodeposition
person
personlanguage
personnodeposition
personsection
position
positionperson
preferences
settings
task
team (not sure if I'll use this one)
ustate
ver

*** Importing Voterfiles

Unfortunately, we lack foreknowledge of voterfile formats.  So this is
how it will work:

First, we will require that voterfiles be in CSV format, with the
first row being the field names.

We will provide a screen that contains all of the column names from the
voter table, followed by a description of each.  The user will enter
the name of the corresponding CSV field next to each of the
column names.  For native custom fields, an additional text input is
provided for the user to specify the display name for the field.

A file upload box will also be provided.  Thus, the user will upload
the file and map the fields in the same operation.  
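
The mapping step could look roughly like the sketch below, which takes
the uploaded CSV text and a map from voter-table column names to CSV
field names.  The function name and the "firstname" and "lastname"
columns are assumptions; only the "custom_" prefix comes from the
voter table description above.

```python
import csv
import io
from typing import Dict, List


def import_voterfile(csv_text: str, column_map: Dict[str, str]) -> List[dict]:
    """Translate CSV records into dicts keyed by voter-table column name.

    column_map maps a voter-table column to the CSV field that feeds it.
    CSV fields that are not mapped are ignored.
    """
    reader = csv.DictReader(io.StringIO(csv_text))  # first row = field names
    available = reader.fieldnames or []
    missing = [field for field in column_map.values() if field not in available]
    if missing:
        raise ValueError("CSV is missing mapped fields: " + ", ".join(missing))
    return [
        {column: record[field] for column, field in column_map.items()}
        for record in reader
    ]


SAMPLE = "First,Last,Party,Nickname\nAda,Lovelace,IND,Countess\nAlan,Turing,DEM,Prof\n"
COLUMN_MAP = {"firstname": "First", "lastname": "Last", "custom_1": "Nickname"}
# Display names for native custom fields, as would be kept in votercustom.
CUSTOM_HEADINGS = {"custom_1": "Nickname"}

voters = import_voterfile(SAMPLE, COLUMN_MAP)
```

Checking the mapped field names against the header row before reading
any records lets a mistyped field name be reported before anything is
inserted.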

For the first version, the above must be performed on a per-instance
basis.

Should additional fields be required at a later date, the user can
take advantage of the extended custom field mechanism.