StaticAna.tex revision 9fe9f9ae1487f67b98a9f3429c94084391746677
\documentclass{article}
\parindent 0pt
\parskip 5pt
\begin{document}
\title{The HasCASL language and its analysis}
\author{C. Maeder}
\maketitle
\section{Preliminaries}
see the documentation for HetCATS, Common, CASL
\section{HasCASL Parser}
The HasCASL parser reuses the token (in \texttt{HToken.hs}) and
\texttt{itemList} parser (in \texttt{ParseItem.hs}) from \texttt{CASL} as well
as ids and annotations from the \texttt{Common} directory.
The concrete syntax of HasCASL is given as plain text in
\texttt{Concrete-Syntax.txt}. The abstract syntax is given in
\texttt{As.hs}. The document \texttt{Abstract-Syntax.txt} is out-dated and was
intended as a human readable abstract syntax.
The data types in \texttt{As.hs} usually contain alternatives (Haskell
constructors) for different representation before and after the static
analysis, because an application usually requires a mixfix analysis that
depends on a global environment. Parsing alone only depends on its input
(and not on any global information).
\section{Static Analysis}
HasCASL entities are -- in hierarchical
order -- kinds, types and terms, where predicate types and formulae are
special cases of types and terms respectively.
Kinds are constructed from classes, types from type
constructors and terms from values, i.e. operations or predicates.
Classes, type constructors and values may be declared by giving them a name.
A type constructor must at least be declared with a kind and a value with a
type. (An implicit kind will always be the universe class \texttt{Type}.)
Usually, declarations are extended with further properties or even a unique
definition.
In opposition to the hierarchy a class can be defined as a downset of a type
and a subtype can be defined via a predicate.
\subsection{Classes}
The major representation of a class is given by an intersection class over
class names only. The universe class \texttt{Type} is simply an empty
intersection. Nested intersection classes are flattened already by the
parser. A trivial intersection class is a single class name.
\subsubsection*{Class declarations}
\begin{itemize}
\item A class name can be declared as a subclass of an intersection class. If
this intersection class is trivial then the superclass name is also regarded
as a new or repeated declaration.
A cyclic subclass relation is prohibited.
\item An undeclared class name may be defined as a synonym (or alias) for an
intersection class. For the purpose of comparing classes, synonyms are
expanded.
\item An undeclared class name may be defined as the downset of a type,
comprising all subtypes of the given type. (The variable name for the
subtype is ignored.)
The downset of a type is a subclass of the downset of a
supertype.
\item No superclasses can be declared for downsets or intersection classes
\end{itemize}
\subsubsection*{Remarks}
The class name of a downsets within an intersection class is not further
expanded, although one may have different classe names for the same downset.
Furthermore, type variables may be declared with an anonymous downset.
Class names are token ids, that is, simple ids with an optional compound list,
because types within downsets may also be compound ids.
Class declarations may be repeated, sub- and superclass declarations are
accumulated as long as they are non-cyclic.
Class definitions may be repeated, if they remain equal, otherwise it is an
error. Two downsets should be equal if the corresponding types are equal. (For
type equality type aliases must be expanded.) Currently class equality for
downsets is ignored.
All redeclarations issue a warning.
The optional list of basic-items and the ``\texttt{instance}''-tag for a class
are not handled yet.
\subsection{Kinds}
Kinds are constructed from classes and the builtin kind constructor ($\to$).
The application of a type constructor (\texttt{Pair : Type -> Type -> Type})
is thus written curried (\texttt{Pair Nat Nat}).
(Infix and outfix type constructors are currently only builtin.)
\begin{verbatim}
__ * __ : Type -> Type -> Type
[__] : Type -> Type
\end{verbatim}
In Haskell kinds are written using the star (*) instead of \texttt{Type}, but
in HasCASL any class names are legal to indicate a more specific
applicability.
Furthermore, classes in argument position may be marked as co- or contra
variant to indicate subtype relations of applied type constructors.
\begin{verbatim}
__ -> __ : Type- -> Type+ -> Type
\end{verbatim}
\subsection{Constructor classes}
A class describes a set of types. The universe class \texttt{Type} is the
class of all types.
A constructor class describes a set of type constructors
of a specific kind. Therefore classes (except downsets) can be declared with
additional kinds.
\begin{verbatim}
class Monad : Type -> Type
\end{verbatim}
The classes given within the kind of a class must not depend on the declared
class itself.
During kind inference, constructor classes are replaced by there kind. The kind
of a plain class is the universe class \texttt{Type}.
The kinds of intersections and super classes must be equal.
Constructor classes must not have a variance ($+$ or $-$) within a kind. (This
cannot be checked by the parser.)
\subsection{Types}
Builtin type constructors are 4 different function arrows (partiality combined
with continuity), products (in $\times$ notation) and the lazy
prefix type constructor ``$?$''.
Type constructor names are declared with a kind. These names may be mixfix
ids, but in order to avoid a mixfix analysis for types, they should be
simple ids. Instead the parser resolves types as follows:
\begin{itemize}
\item function arrows have lowest infix priority and associate to the right
\item next come products with (generalized) infix $\times$ notation
\item next comes lazy type construction
\end{itemize}
Everything else to the right of a $?$ is regarded as follows:
\begin{itemize}
\item A single token is taken as nullary type constructor name
\item empty braces or brackets are taken as an id (with two tokens) as above
\item braces or brackets with one argument refer to an unary outfix type
constructor $\{\_\_\}$ or $[\_\_]$.
\item braces or brackets with more arguments are considered illegal.
\item parenthesis may only contains a single argument. Tuple do not exist.
\item The empty tuple is illegal. Instead the builtin type
\texttt{Unit} should be used.
\end{itemize}
Mixfix type expressions with several components are simply regarded as
unparenthesized prefix applications. The first component must be the
type constructor.
Square brackets with several arguments could be taken as compound lists for a
preceding token. (Singleton compound lists are not supported, but regarded as
type construction, i.e. for the list type.)
Type names cannot be overloaded!
Type names may be redeclared, but must then have equal kinds. Only
class names in argument or result positions may be different to indicate
special class memberships of applications.
\begin{verbatim}
type __->__ : Cpo- -> Cpo+ -> Cpo
\end{verbatim}
\subsubsection*{General declarations}
Generally, there are two flavors of declarations for type constructors and
operations: value and pattern notation. In a value declaration a mixfix id is
separated from its kind or type by a colon whereas in a pattern declaration
arguments and mixfix components are intermingled and only the (optional)
result (kind or type) is written to the right of a colon.
If a result (kind or type) is omitted, a default is substituted. No kind
defaults to the \emph{universe} and no result type for an operation indicates
a predicate declaration (although the builtin \texttt{Pred} type constructor
is recommended). The keyword \texttt{pred} is only kept for compatibility with
CASL and not recommended.
For simplicity only the value notation -- that does have no parenthesis on the
left hand side -- is supported.
The pattern notation may be seen as a generalization of the value notation,
because the result may be of higher order, but even then the mixing of places
and parenthesized arguments should be forbidden.
The number of places must be equal or less to the number of arguments.
(For operations arguments of products are also counted. However, if a place is
given for a first product component, then places must follow for all
components of that product.)
\subsubsection*{Type declarations}
\begin{itemize}
\item There are simple value and pattern declarations.
\item Type patterns (with the implicit kind \texttt{Type})
may be declared to be subtypes of a given type. Only a simple type name as
supertype is regarded to be an additional declaration (for compatibility
with CASL).
(Confusingly, this does not mean that the result class is an anonymous
downset as within variable declarations.)
\item Type patterns may be declared to be isomorphic, this corresponds to a
trivial cyclic subtype relation.
\item A type pattern may be defined as predicate subtype via a
formula. For a repeated definition the equality of the formulae is
undecidable and not thoroughly checked (but left to a proof tool).
\item A type pattern may be a type alias. When comparing types these aliases
are replaced with there definition (with proper argument substitution).
\end{itemize}
\subsection{Variable declarations}
By variable declarations either type or value variables can be declared.
Type variables must be simple ids and there kind can either be an anonymous
downset (and using $<$ instead of a colon) or an intersection class (where
class names can also refer to a downset).
A wrongly spelled (or non-existing) class name causes the analysis to assume
that not a type but value variable is declared (most likely causing a type
error). Keeping class and type names disjoint is not further enforced.
\subsection{Data types}
Currently, monomorphic data type definitions are collected. Constructor ids
and selector ids within one alternative must be disjoint (although this is to
restrictive, because only the qualified names must be disjoint). For missing
selectors unique (non-CASL) ids are generated.
\subsection{Terms and formulae}
For a uniform mixfix- and type- analysis, terms and formulae are not
distinguished and all treated as terms. If the first argument of a mixfix
operation is a (unnested) product type, the individual component arguments may
be placed according to the place holders. Otherwise curried arguments will be
expected at place holder positions. Thus a classical infix operation may
either have a curried type or conform to the CASL notation with a product
argument. For a curried type, trailing place holders may be omitted, since the
unparenthesized prefix notation (always) means (unlike CASL) function
application.
Currently, class constraints and sub/supertype information is ignored during
type analysis, but typed terms (via \texttt{:}) are checked for unifying
types. (Laziness is ignored during unification.)
Type casts (via \texttt{as}) or type containment tests (via\texttt{in})
only check for a (unique) type and omit subtype reasoning (that may be done
later, possibly involving a proof tool).
(Literate notations, i.e. for lists, have not yet been added to the analysis.)
\subsubsection{Binding terms}
Binding terms are quantified formulae, lambda terms, case- and
let-expressions. Left-hand-side patterns add variable bindings that may
occur in right-hand-side expressions.
Currently, patterns are not analysed.
\subsubsection{Operation definitions}
Operations may be given via a defining term. Operation's arguments on the left
hand side of an equal sign are moved to the right hand side following a
lambda.
\subsubsection{Program equations}
Program equations correspond to Haskell equations. In order to resolve mixfix
patterns on the left hand side, the so called definition target, must be a
simple prefix identifier with arguments in parenthesis or must have been
declared previously (with place holders and a signature type).
Currently, program equations are not analysed due to the missing
pattern analysis.
\subsubsection{Formulae}
Formulae are simply treated as terms of a ``logical'' type. The fact that
terms may not contain arbitrary formulae must be ruled out later on.
Predicates are treated like (special) operations.
\section{Sources in \texttt{HetCATS/HasCASL}}
HasCASL only relies on stuff from \texttt{Common}.
\begin{description}
\item[Abstract-Syntax.txt] outdated
\item[As.hs] abstract syntax
\item[AsToLe.hs] the static analysis
\item[AsUtils.hs] helpers for \texttt{As}
\item[ClassAna.hs] analyse classes
\item[ClassDecl.hs] analyse class declarations
\item[Concrete-Syntax.txt] concrete syntax
\item[DataAna.hs] analyse data type definitions
\item[HToken.hs] extension of \texttt{Common.Token} for HasCASL
(further keywords)
\item[Le.hs] analysed local environment
\item[Logic\_HasCASL.hs] instance for \texttt{Logic.Logic}
\item[Merge.hs] consistently merge together repeated or extended items
\item[MixAna.hs] top-level mixfix-analysis
\item[MixParserState.hs] states containing types and argument terms during
combined mixfix- and type- analysis
\item[Morphism.hs] dummy morphisms for \texttt{Logic\_HasCASL}
\item[OpDecl.hs] analyse declarations
\item[ParseItem.hs] parsing basic (HasCASL) items
\item[ParseTerm.hs] parsing classes, types, terms and formulae
\item[PrintAs.hs] pretty printing the abstract syntax
\item[PrintLe.hs] pretty printing the analysed abstract syntax
\item[RunStaticAna.hs] wrapper for \texttt{Common.RunParsers}
\item[TypeAna.hs] analyse types
\item[TypeDecl.hs] analyse type declarations
\item[Unify.hs] unification of types
\item[hacapa.lhs] counterpart to \texttt{CASL/capa.lhs}
\end{description}
The binary produced by \texttt{ghc-call} is named \texttt{hacapa}.
The subdirectory \texttt{test} contains \texttt{*.hascasl} files with basic
HasCASL specifications (or test cases per lines). The binary \texttt{hacapa}
called with the option \texttt{analysis} produces for the standard input an
environment and diagnostic messages on standard output.
As for CASL the tests can also be run by \texttt{make check} or
\texttt{make output}.
\end{document}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: