AI Expert Newsletter
AI - The art and science of making computers do interesting
things that are not in their nature.
February 2005
New Downloads
Course notes and samples for an introductory Prolog course are now available
from the downloads section of www.ainewsletter.com.
The course includes a logic-base approach to a normal business application,
airline pricing and booking; numerous exercises in recursion and nested structures;
and a frame-based system with property list matching utilities that can be used
for ontologies and numerous other applications.
Dennis Merritt
Welcome to the February AI Expert.
As promised last month, I have the do-it-yourself
on Inductive Logic Programming, and another do-it-yourself
on the Copycat analogical-reasoning program, which explores
the emergent subcognitive behaviour that its authors
believe to be the crux of creativity. If it is, I believe
it must also be the essence of aesthetics. To complement these,
there are some quotes on creativity and emergence,
and a pointer to StarLogo, a language for teaching about
parallel and emergent behaviour. But first, some abnormally muscular robots.
From the number of times I've heard it reported, my first piece of news
has grabbed media attention as science rarely does. Perhaps that's
because it is claimed to use nanotechnology. In the
UK, what we mainly hear of nanotechnology is the dangers. Grey goo
will eat the Earth! Prince Charles
warns of risks! Tiny particles will rot our lungs! (This last may be true, but is also a
long way from the precision atomic-scale engineering
that nanotechnology is really about.)
As reported by
BBC News Online,
University of California researchers
have created robot "legs" less than a millimetre long, consisting of rat heart-muscle cells grown onto
silicon or plastic "bones". Nano-scale engineering controls where the cells
attach to one another as they grow, so that they form a muscle rather than some
random blob of tissue; and the skeleton too is precision-engineered, with tiny
hinges that allow it to move and bend.
A bit of Web searching turned up more information.
The research was done by
Carlo Montemagno
and colleagues at the
UCLA California NanoSystems Institute.
From this New Scientist
feature
and the papers I've linked at the end,
First self-assembled microrobots powered by muscle
and
Constructing Biological Motor Powered Nanomechanical Devices,
it seems the crucial advance is
in growing the muscles and attaching them to
their skeleton. Previous muscle-mechanical systems have used
muscle bundles extracted intact from animals; but here,
the researchers persuaded the muscles to assemble themselves.
The first step was to fabricate the
skeleton from a silicon wafer, forming a 50-micrometre wide arch. The
researchers coated this with
a special polymer into which finely spaced
patterns were etched, and a gold film deposited on top.
Then flesh was put onto these bones.
The silicon-polymer-gold combination was placed into a cell culture medium.
The polymer took up water, forming a gel in which the cells could
grow and differentiate,
lining up along the patterns to form complete muscle bundles attached to
the skeleton. And, as New Scientist
says, when the researchers looked into their microscopes,
they were amazed to see their musclebot crawling around.
Once we can build complete robots in this way,
they will pose bizarre challenges to
AI. How much autonomy will the muscle-drive
have? What's the most appropriate control architecture?
(That would make an interesting simulation project.)
Must we still worry about inverse joint kinematics,
or can we just hook up a handful of touch-sensory neurons and
let the system learn to walk for itself? Most importantly,
how do we wire in a sense of self-preservation? For,
as student fees increase...
I'm terribly sorry, Professor, but I was working
late last night, I'd missed lunch, and the vending
machine was out of order. I've eaten your robot.
news.bbc.co.uk/1/hi/sci/tech/4181197.stm -
'Living' robots powered by muscle. The BBC News feature.
news.bbc.co.uk/1/hi/sci/tech/3883749.stm -
Prince warns of science 'risks', also from the BBC.
www.newscientist.com/article.ns?id=dn4714 -
First robot moved by muscle power, from New Scientist.
http://www.sciencentral.com/articles/view.php3?article_id=218391960 -
ScienCentral's popular-science page
on biobots and molecular motors, including Montemagno's work.
Some nice links.
www.cnsi.ucla.edu/faculty/montemagno_c.html -
Carlo Montemagno's home page. The UCLA California NanoSystems Institute is at
www.cnsi.ucla.edu/mainpage.html.
www.spie.org/paper/FirstSelf.pdf -
First self-assembled microrobots powered by muscle, by
Jianzhong Xi, Jacob Schmidt, and Carlo Montemagno. Technical descriptions
of how the biobots were built; assumes knowledge of the nano-engineering techniques.
www.foresight.org/Conferences/MNT6/Papers/Montemagno/index.html -
Constructing Biological Motor Powered Nanomechanical Devices
by Carlo Montemagno, George Bachand, Scott Stelick, and Marlene Bachand. A
draft paper for the Sixth Foresight Conference on Molecular Nanotechnology.
Most of the paper is about molecular motors, but it also describes patterning techniques
similar to those used in guiding the biobots' muscle growth. Quite apart from that, it's
amazing to see serious plans to use single proteins as
motors. I love the authors' phrase
"Despite the superb performance of the F1-ATPase motor protein ..." - it
sounds like something from a review of motorbike engines.
A solitary ant, afield, cannot be considered to have much of anything on
his mind; indeed, with only a few neurons strung together by fibers, he can't
be imagined to have a mind at all, much less a thought. He is more like a
ganglion on legs. Four ants together, or ten, encircling a dead moth on
a path, begin to look more like an idea. They fumble and shove, gradually
moving the food toward the Hill, but as though by blind chance.
It is only when you watch the dense mass of thousands of ants,
crowded together around the Hill, blackening the ground, that
you begin to see the whole beast, and now you observe it thinking,
planning, calculating. It is an intelligence, a kind of live
computer, with crawling bits for its wits.
At a stage in the construction, twigs of a certain size are needed, and
all the members forage obsessively for twigs of just this size.
Later, when outer walls are to be finished, the size must change, and as
though given orders by telephone, all the workers shift the search to
the new twigs. If you disturb the arrangement of a part of the Hill,
hundreds of ants will set it vibrating, shifting, until it is put
right again. Distant sources of food are somehow sensed, and long lines,
like tentacles, reach out over the ground, up over walls, behind
boulders, to fetch it in.
From The Lives of a Cell by Lewis Thomas.
This can't be math I'm doing. It's fun and math ain't. So says one of the children
quoted in Seymour Papert's book Mindstorms. Papert,
co-founder with Marvin Minsky of the MIT Artificial
Intelligence Lab, invented Logo, the Basic-like language in which children
can write programs to control a computer-driven turtle, and hence learn about
mathematical concepts such as polygons and interior-angle sums, as
well as meta-cognitive concepts such as the need to debug one's learnt
knowledge. He developed this into the idea of educational microworlds,
a theme explored in Mindstorms.
I recently discovered a parellel implementation of Logo,
StarLogo, through
a mention in John Hiler's Weblog-related Weblog
Microcontent News.
He describes experiments with a
StarLogo ant simulator,
complete with food and pheremone trails. The behaviour
generated by the simulator is complex, but emerges from
four simple rules - and Hiler proposes that you could
model blogging behaviour with this, for example, by replacing
the rule "look for food" by "look for news".
Microcontent News appears to be no longer active, but the ant simulator
is still available. StarLogo was designed as a tool for teaching children, by
experts in teaching, and it's a simple way to demonstrate emergent behaviour, including
ideas from Artificial Life.
www.microcontentnews.com/entries/20021220-2589.htm -
Microcontent News page for December 20, 2002, John Hiler's entry on the ant simulator.
The simulator source is at
www.cs.ucc.ie/~dgb/courses/ai/notes/myants.slogo.
education.mit.edu/starlogo/ -
StarLogo.
web.media.mit.edu/~mres/ -
Mitchel Resnick, developer of StarLogo and author of
Turtles, Termites, and Traffic Jams: An Exploration in Massively Parallel Microworlds.
www.thepangburns.com/jesse/projects/ant_simulation.htm -
Another StarLogo ant simulator, by Jesse Pangburn.
www.papert.org/ -
Seymour Papert. Links to the LCSI company for constructivist educational
technology using Logo microworlds, the Logo Foundation, LEGO Mindstorms, and many
papers on topics such as school reform and how computers help children learn.
For fifteen days I strove to prove that there could not be any functions
like those I have since called Fuchsian functions. I was then very
ignorant; every day I seated myself at my work table, stayed an
hour or two, tried a great number of combinations and reached no
results. One evening, contrary to my custom, I drank black coffee
and could not sleep. Ideas rose in crowds; I felt them collide
until pairs interlocked, so to speak, making a stable combination.
By the next morning I had established the existence of a class of
Fuchsian functions, those which come from the hypergeometric series;
I had only to write out the results, which took but a few hours.
...
...
Permit me a rough comparison. Figure the future elements of our
combinations [full-fledged ideas] as something like the hooked
atoms of Epicurus. During the complete repose of the mind, these
atoms are motionless, they are, so to speak, hooked to the wall;
so this complete rest may be indefinitely prolonged without the
atoms meeting, and consequently without any combination
between them.
On the other hand, during a period of apparent rest and
unconscious work, certain of them are detached from the wall and
put in motion. They flash in every direction through the space
(I was about to say the room) where they are enclosed, as
would, for example, a swarm of gnats or, if you prefer a more
learned comparison, like the molecules of gas in the kinematic
theory of gases. Then their mutual impacts may produce new
combinations...
Now our will did not choose them at random; it pursued a
perfectly determined aim. The modified atoms are therefore
not any atoms whatsoever; they are those from which we might
reasonably expect the desired solution. Then the mobilised
atoms undergo impacts which make them enter into
combinations among themselves or with other atoms at rest which
they struck against in their course. Again I beg pardon, my comparison
is very rough, but I scarcely know how otherwise to make my
thoughts understood.
By Henri Poincaré; quoted in
The Psychology of Invention in the Mathematical Field
by Jaques Hadamard.
Strange though it may seem, nondeliberate yet nonaccidental slippage
permeates our mental processes, and is the very crux of fluid thought.
In the early 1980s, Douglas Hofstadter took over the Mathematical
Games column of Scientific American, and wrote a number of essays
under the heading Metamagical Themas, later collecting them into
a book published under the same name. He covered many topics, from the
arbitrariness of the genetic code to the threat of nuclear war; for me,
the most interesting are about creativity, where he mentions
ideas which he and colleagues at the
Fluid Analogy Research Group went on to implement in their programs.
The quote above is from one of Hofstadter's essays, Variations on
a Theme as the Crux of Creativity.
Scott Bolland has written a nifty applet reimplementation of
Hofstadter and Melanie Mitchell's
Copycat program; it's easy to run, and I'd like to use it
to introduce you to some of these ideas.
Consider the following questions:
What is to ijk as abd is to abc?
What is to iijjkk as abd is to abc?
What is to xyz as abd is to abc?
These are all simple letter-string problems. To the first, you'd
probably say the answer is ijl, since the c in
abc has gone to a d, and to "do the same thing" to the
k, we move it one letter forward too, making it an l.
The second problem is less straightforward, because iijjkk has no
immediately obvious mapping onto abc, making it
harder to see to which part of it we should "do the same thing" to as we did to
the c.
One possibility is to map the
final k onto the c and move it one letter forward,
giving us iijjkl. However, it seems more elegant
to map two letters each of one sequence onto one of the other, making the
final kk correspond to the d. Then we move both letters
forward, giving the answer iijjll.
This is what Hofstadter calls conceptual slippage. The kk isn't
actually the same kind of thing as the c, and we have to "slip"
our notion of it to perceive it as such.
People give a variety of answers to the third problem. One I, as well
as Hofstadter, find
particularly elegant, is wyz. Here, the a has been mapped to
the z and the c has been mapped to
the x, so that one sequence is reversed with respect to the
other. Then, to preserve symmetry, the relation of alphabetic successor between
c and d is also reversed, so that to find the answer to
the problem, "doing the same thing" to x takes its alphabetic
predecessor, converting it to w. The conceptual slippage here applies
more to the relation between letters than to the letters themselves as it
did in the previous example.
These letter-string puzzles are fun, but have they really anything to
do with the heavy-duty creative thought practised by an Einstein, a Shakespeare, or
a Turing? To try and justify Hofstadter's assertion that they do,
I created the following analogy problems:
What is to Prolog (or logic) as classes are to Java?
What is to reasoning by association as Boolean algebra is to reasoning by deduction?
What is to liquid methane as Terrestrial biochemistry is to water?
What is to integers as union is to sets?
What is to 5-8 as 1 is to 5-4?
What is to 10÷3 as 2 is to 10÷5?
What is to -9 as 3 (and -3) are to 9?
What is to groups as primes are to integers?
What is to homosexual love as marriage is to heterosexual?
What is to "mimsy" (or "wabe", or "borogroves", ...) as German is to English?
What is to line drawing as minor chords are to music?
Two in
computing; one (inspired by the Huygens probe) in xenobiology;
five in mathematics (one solved by Lenat's AM program,
four well-known in the history of the number system, one
finally resolved 22 years ago by
by group theorists); one for law and religion; one (translation of Carroll's Jabberwocky into
Der Jammerwoch)
in literary translation; and
one for visual art (answered by Paul Klee's Pedagogical Sketchbook).
So we can certainly couch a lot of problems as analogies. Whether an extension
of the mechanisms Hofstadter proposes is essential to solving them, and how much else
is necessary, I don't know. But it's interesting and perhaps important, and that's why
I'm introducing Copycat to you.
Incidentally, Hofstadter also proposes, more prosaically, that perceiving the right analogy is essential
for survival. Consider someone who argues that killing a person is the same as
breaking a window because both are nasty and can be done with a brick: such reasoning
is somehow missing the essence of the situation. Minds
so faulty will be weeded out by natural selection, leaving
behind those with a deeper perception.
Paint the letters of which the problem is made onto little
Velcro balls and put these into a jar. Shake hard. From time to
time, two related letter-balls will collide. They may be
related because one follows the other in the alphabet, or because one
stood next to the other in the problem, or even because they
are the same letter.
Now, imagine that as the letter-balls collide, they
sometimes stick and form bonds, the strength of the bond depending
on how strongly the letters are
related. An i-j bond is stronger than an i-k bond.
Both are much much stronger than an i-u bond, which, since
the i and the u are just too distant in
alphabetic order to be related, is almost nonexistent.
The shaking will break bonds as well as make them. But
the stronger bonds will persist for longer, forming
relatively more stable combinations.
Occasionally also, these pairs will themselves bond together.
An a-b may bond with another a-b through the
force of identity. But - and this is the crux of
conceptual slippage - it may also bond with a z-y. This
happens
because the a-b bond is an "alphabetic successor"
bond, and the z-y bond is an "alphabetic predecessor"
bond, and an "alphabetic successor"
bond is related to an "alphabetic predecessor"
bond by the force of oppositeness.
Now, we just watch the
jar. As molecules form, we shake less vigorously, and then
even less vigorously. The rate at which we
cease to shake must be carefully chosen. Big
molecules are almost complete solutions to the
problem and we don't want to
destroy them by shaking too violently. On the other hand,
we don't want to stop too
soon; if we do, the mixture will freeze before enough
combinations have been tested.
My description above was inspired by
Poincaré's quote on mathematical
creativity. Hofstadter cites it in his essay, and
it probably inspired some aspects of Copycat's implementation. In more
technical terms, Copycat does a
simulated annealing over structures formed by linking letters
by their possible relationships.
This cools at a rate dependent on the size
and number of structures formed, and implements a parallel breadth-first
search over the possible combinations of relationships between
elements of the analogy problem. Hofstadter calls this
a "parallel terraced scan".
Incidentally, my point about the difference between an
i-j bond versus an i-u bond reflects
a distinction Hofstadter draws between "roles" and "literals".
In the "What is to ijk as abd is to abc" problem, the c and d
can be perceived as related. It therefore makes
sense to imitate this relation - "do the same thing" -
to the k. But if the problem had been
"What is to ijk as abx is to abc", then
the c and x would be just too far apart.
A sensible answer would regard the x as fixed -
a literal - rather than filling a role, that of alphabetic successor.
So we would make the answer ijx. Shallow minds that
miss the essence of a problem may do so because they see literals
when they ought to see roles.
I highly recommend reading Analogies and Roles in Human and Machine Thinking,
another essay from Metamagical Themas,
to see theses ideas, including the
distinction between roles and literals, developed further. Copycat is
a highly subtle attempt to model how
we balance aesthetic forces such as
symmetry, uniformity, good substructure, and boundary strength.
Try designing a typeface, drawing a comic strip, or whatever
your favourite artistry is; feel them; and compare with
the little letter problems.
Bolland's Copycat is an applet, so needs no
messy installation, just a Java-enabled Web browser.
Point your browser at the applet,
www.psy.uq.edu.au/CogPsych/Copycat.
You should see a page starting "Copycat" and
"Please wait while the Copycat applet loads".
A grey applet rectangle will shortly appear above them, followed
by a separate pop-up containing the applet's user interface. This
is initially blank, before changing to display
its controls.
On a fast University line, this took my system less than a minute: very good
for a download from Queensland to Oxford, half-way round
the world.
Once the applet is running, you may want to expand the window, which
will also expand the contents, making it easier to read.
Be warned that when I changed my browser to another page, the
applet window went blank. Returning to the applet URL didn't
restore it, and I had to close my browser, then open a new one and start
right from the beginning.
I'm always sceptical when running new software; so many things
can go wrong. So let's see straight away
whether this one will work on an analogy problem. Click on the menu
bar to display the "File" menu, and select "New". This will pop up
a little "New problem" window, with three text fields, for "Initial
string", "Modified string", and "Target string". The fields should
contain the strings abc, abd, and ijk. Click the "OK" button
to enter this as the current problem.
The window will disappear, and your strings will reappear in the
main window, under the heading "Workspace". This holds the problem
that Copycat is going to think about next, and is roughly equivalent
to human long-term memory. We now tell Copycat to
begin. The grey bar to the right of the workspace contains
four control symbols, of which the second is a right-pointing triangle.
This stands for "Play". Click it, and Copycat will start its problem solving.
Within a few moments, the solution will appear in the bottom-right
box in the Workspace. Copycat is non-determininistic, which means that
it can come up with different answers from run to run. However, in
this case, the answer is very likely to be ijl. So Copycat has decided
that it is ijl which is to as ijk as
abc is to abd.
Once Copycat has displayed a solution, doing another run is easy. Click the
leftmost of the four buttons, the square, to reset. A question mark should
replace the previous answer. Then press the "Play" triangle again, and wait
as before.
Instead of pressing "Play", you can single-step through a problem by repeatedly
clicking the right-hand button, a triangle with a bar. Finally, pressing
the third button, "||", interrupts Copycat's thoughts. Press Play or
Single-Step to continue.
As Copycat runs, you will see the reading on the big red thermometer
gradually decrease. This
corresponds to the amount of shaking in
my how Copycat works analogy, i.e
the temperature in the simulated
annealing. In the main window, you see the letter strings, with links forming
and re-forming between them. These links are the "bonds" or relationships
which Copycat currently perceives. You will also
see a counter for "codelets". These are pattern recognisers
which look for the relationships - the equivalent
of bumping letter-balls together in the jar.
I said above that Copycat may think up different answers to the
same problem in different runs. We can investigate this non-determinism
by using group runs. Click "Run" on the menu bar, and select
"Group Run". This will pop up a "New group run" window, which will
invite you to type the name of the run. Any name will do - it's
just an arbitrary identifier. My window had to be expanded vertically
before I could see the field that holds the name.
Having typed a name, click OK, and select New from the File menu as
we did above. Enter a problem. A good thing to try is to leave the initial
string at abc and the modified string at abd,
but to make the target string be xyz, as in my section on
conceptual slippage.
After entering the problem, click OK in the problem window.
This time, the main "Workspace" window will remain blank, except for four
control buttons at its top right, to the left of the original
four buttons we have already used. Press the second, which is
another "Play" button. You have initiated a whole group of runs, and what
appears this time is a bar chart whose
bars grow to depict the frequency of each answer Copycat has found.
How well does it manage conceptual slippage in the
"What is to xyz as abd is to abc" problem?
If you got this far, you've managed to run Copycat, and I hope you're
intrigued by these ideas. I am therefore going, abruptly, to stop. As well
as his applet, Bolland has written an excellent
tutorial
on Copycat, which it would be silly for me to duplicate. For other
reading, see the links below, and the essays in
Metamagical Themas. And don't underestimate the subject's
importance. As Hofstadter says at the end
of Analogies and Roles in Human and Machine Thinking:
I feel confident that this tiny alphabetic world allows all of the key
features of analogy to make their appearences. In fact I would go further
and claim: Not only does the Copycat domain allow all the central
features of analogy to emerge, but they emerge in a more crystal-clear
way than in any other domain I've yet come across, precisely because
of its stripped-down-ness. Paradoxically, Copycat's conceptual
richness and beauty emanate directly from its apparent impoverishedness,
just as the richness of the "ideal gas" metaphor emanates from
its absolute simplicity. Time will tell if this
limb I am out on is solid.
www.psy.uq.edu.au/CogPsych/Copycat -
Scott Bolland's Copycat applet. His tutorial, which
assumes no previous knowledge, is at
www2.psy.uq.edu.au/CogPsych/Copycat/Tutorial/.
www.itee.uq.edu.au/~cogs2010/Schedule2003.html -
Bolland's timetable for a Queensland University
course on models in cognitive science. The entry for Week 10 links to
practical notes on Copycat, including exercises.
www.itee.uq.edu.au/events/seminars/archive/sem-0015.html -
Using Copycat for context-dependent visual object recognition. Abstract
of Bolland's Ph.D confirmation seminar on his object-recognition program VICOR,
which uses analogical matching to recognise the same object in apparently
unrelated visual data.
www.cigital.com/~gem/lspirit.html -
The Letter Spirit project. See also the
entry on Science, Simplification, and Special Somersaults at
www.ainewsletter.com/newsletters/aix_0501.htm#s
in the January Newsletter.
www.cs.pomona.edu/~marshall/metacat/ -
James Marshall's Metacat Home Page, with links to his
thesis and source code, and instructions on running the code. Metacat extends
Copycat by being able to introspect, recognising patterns in its
problem-solving processes as well as in its perceptions.
www.ulg.ac.be/cogsci/rfrench/elephants.pdf -
When coffee cups are like old elephants or Why representation modules don't make sense.
Preprint of Robert French's paper from
Proceedings of the 1997 International Conference on New
Trends in Cognitive Science, edited by A Riegler and M Peschl, Austrian Society for Cognitive Science. French
argues that we can't separate the creation of perceptual representations from their use, and hence
for programs and cognitive models that intertwine them as Copycat does.
www.ulg.ac.be/cogsci/rfrench/analogy.tics.pdf -
The Computational Modeling of Analogy-Making. Preprint of French's paper from
Trends in Cognitive Sciences, Volume 6, Number 5, 2002. A survey of previous programs and
models of analogical reasoning.
www.aaai.org/AITopics/html/analogy.html -
AAAI analogical reasoning page.
Mathematicians often cultivate a certain smug attitude of superiority
over the experimental sciences. There is, after all, a higher form
of truth in mathematics which is impervious to the whims of
time. A theorem proved 500 years ago is still true today, and while
such theorems may be elementary, we often find them useful (and
sometimes we even find them interesting). We still teach Newton to
our undergraduates; a chemist who reached that far back would
be teaching alchemy.
Yet mathematicians are rather miserable scientists; they have no sense
of empirical classification or experimental control. In technique,
mathematics and alchemy are virtually indistinguishable. Consider
the two. The alchemists mixed together whatever was on hand and
then observed the results which were, by the way, often quite ordinary.
Of course they could repeat the process with the new products
(again and again), but the further such experiments proceeded the
more confused and uninteresting was the product. Their experiments
were largely uncontrolled meanderings which only rarely
proveded a glimmer of revelation. And modern mathematics? The
alchemists would feel right at home.
From Mathematics and Alchemy, editorial by John Ewing in
The Mathematical Intelligencer Volume 3, Number 4, 1981.
Pervez Musharraf could one day be taking advice
from an Induced Logic Program. Inductive Logic
Programming is a form of symbolic machine learning which
learns, from examples expressed in Prolog or similar
logic-based notation, logical rules stating what
the examples have in common.
Oxford University Computing Lab is one of many centres researching into
ILP, and its Web site lists an abundance of
applications.
Amongst these - which include learning rules to identify
over-performing stocks,
diabetics at risk from kidney damage,
the best embryos for transfer in in vitro fertilisation,
and
the best mesh-resolution for finite-element mesh designs -
is Jamal Abdul-Nasir's work on
classifying questions
posed to the Speaker of the National Assembly of Pakistan.
Every day, about 100 parliamentary questions are submitted to
the Questions Branch of the National Assembly,
in the hope they will be put to ministers during
Question Hour. Some questions are not permitted -
such as those asking for publicly-available
information that members could find out elsewhere - so
the clerks of the Questions Branch have to sort the permissible from
the impermissible. This takes about 5 minutes per question.
That's not a lot, but because of the number of questions, any
computational assistance would be welcome. Abdul-Nasir's research used ILP
to learn permissible-vs-impermissible rules from a database of
previously-classified questions, and concluded that this
could save a substantial amount of time.
I can't imagine Tony Blair accepting advice from a computer program; but
this example, and the others listed above, show how versatile ILP can be.
Because of this versatility, the human-readability of its
output, and its ability to handle complicated structures such
as molecules and documents, ILP deserves to be better known. Several free implementations
now exist, and I'm going to use one of these to work
through some examples that you can try for yourself.
Before I launch into the downloads and demonstrations, I want to
show you one great advantage of ILP, the readability of its rules.
I'm going to illustrate this with an example from drug design.
First, a bit about the subject.
Pharmacology - the study of drugs and their effects -
is difficult. We may know that some compound lowers blood pressure, induces
anaesthesia, or lifts depression, but have little idea why. Perhaps
its effect was first discovered by accident, as
legend
has it happened with coffee when, one day, Kaldi the shepherd noticed his goats
jumping around in the field. After eating the red berries from the shiny dark-leaved shrub the goats
had been nibbling, Kaldi was jumping around too, and the rest is
history:
London coffee houses; Sartre and Hemingway scribbling away
in Les Deux Magots; take-away cappuccino ventis on
every street corner; and me bashing away at this feature
in the Excelsior,
Oxford.
To explain such effects, pharmacologists invented the "pharmacophore". An often-used analogy
is that of
lock and key.
Designed drugs, and other active molecules such as caffeine, affect enzymes by slotting into regions
of a particular size and shape, like a key fitting into a keyhole.
It's only the key's teeth whose shape is important;
it doesn't matter whether the head is oval or hexagonal, or
threaded onto a keyring. Similarly, only a small
part of the "ligand" - the drug or other molecule - interacts with the
enzyme. The rest is scaffolding. The part that does interact
needs to be a particular shape (though shape is a fuzzy
concept when you're as small as an atom), as well as a particular distribution
of electrons and charge. These properties are the
pharmacophore. The pharmacophore for an enzyme determine the ligands that
bind to it, as the tumblers in my bedroom lock determine the keys that
open it.
One class of drugs very important for treating high blood pressure
is the ACE-inhibitors.
These block the action of Angiotensin-Converting Enzyme or ACE, which
generates a protein that constricts blood vessels and raises blood pressure.
It's not surprising that, in their quest for better
treatments, pharmacologists have tried to determine the pharmacophore of the ACE-inhibitors.
And so has ILP. There is a paper,
Pharmacophore Discovery
using the Inductive Logic Programming System PROGOL,
which describe a blindfold trial where ILP was applied to ACE-inhibitors
for which earlier experimenters (not using machine learning)
had already worked out a pharmacophore.
The paper is, incidentally, a good introduction to both ILP and
drug discovery, requiring little knowledge of either.
As the authors say, this
trial was nearly as straightforward
as "(1) run Progol over the earlier experimenters' data and assumptions;
(2) compare the output with their results". It was a success. Progol
did rediscover the pharmacophore.
Moreover, its rediscovery was easy to read. Progol represented its
rule internally as a Prolog clause. But it can translate these into
a stylised English. Doing so here gave this output:
Molecule A is an ACE inhibitor if:
molecule A can bind to zinc at a site B, and
molecule A contains a hydrogen acceptor C, and
the distance between B and C is 7.9 +/ 1.0 Angstroms, and
molecule A contains a hydrogen acceptor D, and
the distance between B and D is 8.5 +/ 1.0 Angstroms, and
the distance between C and D is 2.1 +/ 1.0 Angstroms, and
molecule A contains a hydrogen acceptor E, and
the distance between B and E is 4.9 +/ 1.0 Angstroms, and
the distance between C and E is 3.1 +/ 1.0 Angstroms, and
the distance between D and E is 3.8 +/ 1.0 Angstroms.
I have a fondness for ILP in pharmacology,
through working on this myself with one of the
authors of the paper above. However, that's not the
only reason I chose it to illustrate ILP.
Molecules are messy, complex, structured. They branch, fold and tangle
in three dimensions. They are built from atoms which we must specify geometrically
(size), chemically (is it a hydrogen acceptor?), and in relation to one
another (bonding). Even without the rest, the geometry can be
daunting: those who work on protein folding know that you have to think
about the sequence in which the amino-acid building blocks are connected (primary structure),
and the way these sequences curve to form helices and strands (secondary
structure), and the way these helices and strands then coil and loop
to make the entire protein (tertiary structure). If ILP can
cope with all this, surely little else can frighten it.
As the Oxford
University
applications page
with which I introduced ILP shows, ILP is indeed
not limited to pharmacology: its applications range from medical screening
to learning about stocks and shares. Others will be found via my links below.
I should give two warnings. First, ILP still has shortcomings. Some of these
are discussed in the recent paper
ILP: A Short
Look Back and a Longer Look Forward.
Second, you must know some theory to use ILP correctly; otherwise,
its results may be subtly wrong without you realising.
Drug designers prefer large pharmacophores to small, because they
indicate that the molecules being tested have more in common,
a result which is less likely to occur by chance. In ILP terms, this means
the designers would prefer a large rule to a small one. ILP programs,
however, prefer short rules to large. This is because they're simpler and thus
more interesting as theories of what distinguishes examples of a concept
from non-examples. (Einstein would not have been so popular had
e equalled 0.0000000513 + mc2.000056). The authors of our
ACE paper had to correct for this mismatch; and had to know how ILP
worked in order to do so.
I'm now going to start the demonstrations. The first step is to obtain a suitable
Prolog. Aleph will run under SWI and YAP Prologs; I am using SWI, because that's
my Prolog of choice anyway. Prolog experts may wish to try porting [contact
Dennis for an Amzi! version
of Aleph]; otherwise, follow my instructions, and you should be able to run
the demos even if you don't know Prolog. So go to Jan Wielemaker's SWI Prolog
site at www.swi-prolog.org/, and then
to the "Download" link in the left-hand column. This brings you to the "SWI-Prolog
downloads" page; go to the "Development release" link and choose one of the
top three downloads, depending on your operating system. On mine, Windows XP,
the self-installing executable installs with no trouble: just press "Accept",
"Next" or "Finish" at the appropriate questions, as you would when installing
any other program.
If you've not used Prolog before, a bit of practice will
help. I shall give instructions in
Windowese; adapt them to your system as appropriate.
Installing SWI should have left a Prolog icon on your
desktop. Double-click it. This will bring up a white Prolog
window displaying something like:
Welcome to SWI-Prolog (Multi-threaded, Version 5.5.3)
Copyright (c) 1990-2005 University of Amsterdam.
[etc]
For help, use ?- help(Topic). or ?- apropos(Word).
1 ?-
Click in the window to get the cursor into it, and type
the query
1<2.
after the ?- prompt. Don't omit the dot at
the end, otherwise Prolog won't know the query is complete.
Hit RETURN. Prolog should reply
Yes
2 ?-
Prolog has interpreted your input as a query about the truth of
the statement "1<2". Since the statement is true,
it replied with a Yes.
Now type the query
1>2.
You should see output similar to the above, but with a No
instead of a Yes, indicating that the statement
is false.
If you got this far, your Prolog is working.
Should you wish to exit, you can close the window, but
you may as well leave it awaiting the first demonstration.
Before returning to the demo, I shall explain how to
download the Aleph ILP program
and the data to be learnt about, and then
show what the data means.
Now you need Aleph.
Aim your browser at Ashwin Srinivasan's Aleph site,
web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/.
This will get you the Aleph manual.
Aleph itself lives at
www.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/aleph.pl,
which is linked from the "How to obtain Aleph" section of the manual.
Download and save it, which
should give you an aleph.pl file.
Don't accept if your browser offers to execute the file:
it's probably been set up to think the .pl
extension denotes Perl.
The example datasets live at
www.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/misc/examples.zip.
This zip file contains a top level examples directory, with under it
a number of subdirectories
and a README with a one-line description of each. Each subdirectory
contains .b, .f and
.n files, the inputs to Aleph. There is usually also
a "typescript" file showing a run, and possibly
other files containing more information.
In the paths I specify to Prolog, I shall assume you've
unzipped the file so that its examples
directory lives in the same directory in which you've put Aleph.
The first demonstration is a Prolog version of Ryszard
Michalski's notorious trains challenge, posed to machine learning in 1974.
You are given data describing ten trains, and told that five are eastbound
and five are westbound. The challenge is to induce, from descriptions of
the trains, a rule that distinguishes those in the "westbound" set
from those in the "eastbound".
These labels are arbitrary, by the way - there are no destination
indicators on the trains, or anything else connected with directions,
and the challenge would work as well if you were told that
five trains were squamous and five were rugose.
The point is just to find a rule
that passes one set of trains and fails the others.
The reason this was such a challenge is that the trains
are fairly complex structured objects. They consist of
carriages - different trains have different numbers of carriages -
which can be short or long, have open or
closed roofs and differing numbers of wheels,
and carry stylised loads which may be rectangles, triangles, or
other shapes. The learning program must, therefore, be able to
represent this structure and work with it when constructing its generalisations.
At the time, such "relational" learning was a young subject,
research in which had not progressed further than Patrick Winston's
arch-learning
program, which learnt from examples of simple geometric structures
consisting of only three or four components each.
Let's now see how the trains look in Prolog.
Examine the train.f file in the trains
subdirectory of the example
datasets. The task which these
datasets pose is to find a rule which is true of
all the eastbound trains and false of all the westbound. That is,
we want Aleph to find a concept of which all the
eastbound trains are positive examples and all the westbound
are negative examples. The positive examples are in
train.f.
The first line of the file says that the
predicate eastbound is true of a train called east1. Said
more simply, train east1 is eastbound. Similarly, the
second line tells us that train east2 is eastbound. And so on:
eastbound(east1).
eastbound(east2).
eastbound(east3).
eastbound(east4).
eastbound(east5).
This is all very well, but so far, we only know these trains' names,
but nothing of their make-up.This is described in the train.b file.
Starting at line 58, we see:
short(car_12).
closed(car_12).
long(car_11).
long(car_13).
short(car_14).
open_car(car_11).
open_car(car_13).
open_car(car_14).
shape(car_11,rectangle).
shape(car_12,rectangle).
shape(car_13,rectangle).
shape(car_14,rectangle).
load(car_11,rectangle,3).
load(car_12,triangle,1).
load(car_13,hexagon,1).
load(car_14,circle,1).
wheels(car_11,2).
wheels(car_12,2).
wheels(car_13,3).
wheels(car_14,2).
has_car(east1,car_11).
has_car(east1,car_12).
has_car(east1,car_13).
has_car(east1,car_14).
These are facts about parts of trains. For example, the first line shown above
says that the predicate short is true of car_12 - that is,
carriage 12 is short. Similarly, carriage 12 is closed. Carriage 11 is long. And so on.
The predicates with two or more arguments, such as shape and load, work
the same way. Carriage 11 has a rectangular shape. Carriage 12 carries 1
hexagon. Carriage 14 has 2 wheels.
The final four lines above associate a train with its carriages. Although
we've told Prolog lots of stuff about the carriages' properties, we
haven't told it which train these belong to. So the final four lines do this, saying that the
train east1 has carriages car_11 to car_14; and
similarly for the other trains.
I haven't yet mentioned the trains.n file. This contains
"negative examples", i.e. descriptions of trains which are not eastbound.
It's always easier to learn the boundaries of a concept if you have examples
that lie outside them - as when little Johnny opens his copy of the Three Little Pigs
and says "Look Mummy! Nice doggie!", pointing at a toothily grinning wolf. ILP has
developed methods for doing without negative examples, but my two
demonstrations will use them.
Let's now return to Prolog, which we left at the end of
the section on Installing
and running Prolog. The example datasets
are nicely set up, and running them is easy.
Start Prolog again if you exited it. Then change its
current directory to wherever you put Aleph. Do so by
typing
cd('d:/aleph').
but with the appropriate path for your files.
The quotes round the path are single quotes,
and the directory separators are forward slashes, even in Windows.
Remember the final dot.
Then give the command
[aleph].
(The square brackets round the filename are an idiosyncratic
abbreviation for the file-loading command.) Your
Prolog should reply something along these lines:
3 ?- cd('d:/aleph').
Yes
4 ?- [aleph].
% library(pce) loaded into pce 0.05 sec, 170,024 bytes
% library(broadcast) compiled into broadcast 0.06 sec, 197,144 bytes
% library(time) compiled into time 0.00 sec, 2,956 bytes
A L E P H
Version 5
Last modified: Sun Oct 10 06:59:50 BST 2004
Manual: http://www.comlab.ox.ac.uk/oucl/groups/machlearn/Aleph/index.html
% aleph compiled 0.26 sec, 690,528 bytes
Yes
5 ?-
Now you can run Aleph. Assuming that your examples are under the Aleph
directory, change to their trainssubdirectory; then tell Aleph
to read the files and learn!
cd('examples/trains').
read_all(train).
induce.
The calls to read_all and induce
generate
a lot of output, ending with a list of the trial rules
Aleph generated, the best rule (i.e. the smallest that describes all the
positive examples but none of the negative),
and a summary of the number of examples that passed or failed.
This is what you should see:
eastbound(A) :-
has_car(A, B), short(B), closed(B).
This translates into English as:
For any train A, A is eastbound if
A has a carriage B, and
B is short, and
B is closed.
In Prolog, names starting with capital letters are logical variables,
which can stand for anything, as long as it's the same thing
each time the variable occurs in a single rule. The ":-" means "if",
and the comma means "and". My translation demonstrates how this works,
but if you're going to use ILP seriously, you'll probably
need to learn some Prolog.
To see how we could describe molecules to ILP, it's only
necessary to think of them as trains. Instead of
describing the properties of carriages, the
facts could equally well be about the properties of atoms. Add
new facts to describe the couplings - i.e. whether the atoms are
singly, doubly or triply bonded - and allow the train
to loop back on itself, and you have molecules.
The classification into eastbound versus westbound can, in the
simplest cases, be replaced by biologically-active versus
biologically-inactive, where we know from experimental data
which class each molecule goes in. The links at the end
point to more information on this and other applications.
The name behind the acronym ILP is "Inductive Logic Programming".
Not "Learning", not "Derivation", but "Programming".
That's because the rules ILP learns are, in fact, programs.
Aleph's eastbound trains rule
is a train-recognising program. What I am shall now do is
make Aleph learn a list-handling program. Even better, it will
be a recursive list-handling program.
Go to the recursion subdirectory of the Aleph
datasets, and examine mem.f. From its
extension, it contains the positive examples. Some of these are:
mem(1,[1]).
mem(0,[0,0]).
mem(0,[0,1]).
mem(2,[2,3]).
mem(3,[2,3]).
mem(3,[4,2,3]).
Like many other languages, Prolog has lists - structures for representing
sequences - built in. It writes them inside square brackets. Thus [1]
is a list with the single element 1, and [4,2,3] is a
list of three elements. Given this notation, we can interpret the facts as examples
of list membership: in each, the first argument is a member of the list given as second
argument.
Now compare mem.n, the negative examples:
mem(0,[1,2]).
mem(0,[1]).
mem(3,[]).
mem(2,[3]).
If we continue to interpret the predicate mem as list membership,
then all these facts are clearly false, since the first argument does not occur in
the second.
To run ILP over these, change Prolog's directory to the same
recursion subdirectory. The following should work, if you
left Prolog running after the trains demo:
cd('../recursion').
Then type:
read_all(mem).
induce.
The output will be similar to that for the trains: a short "I've read the files"
message from read_all, and then a long sequence of trial rules
from induce, followed by the one it considers best:
[theory]
[Rule 1] [Pos cover = 12 Neg cover = 0]
mem(A, B) :-
B=[A|C].
[Rule 2] [Pos cover = 10 Neg cover = 0]
mem(A, B) :-
B=[C|D], mem(A, D).
In this, the notation [ | ] is Prolog's
way of splitting a list into its first
element and the rest (its tail).
This time, Aleph needed two rules to describe the examples. Together, these
form a recursive definition of list membership, the first being the
base case and the second the recursive step:
For any item A and list B, A is a member of B if
B is equal to A followed by something else, C.
For any item A and list B, A is a member of B if
B is equal to something followed by some list which we'll call D, and
A is a member of D.
So here, Aleph has learnt a recursive program.
Both the trains and the recursive-programs datasets
were written in Prolog. However, although what goes into Aleph
has to be Prolog, our raw data does not, as long as we can translate it.
Since I hope you'll try Aleph on some real-world applications, I'll
show how we might translate data from Web forms and from Excel
into Prolog.
Suppose we want to train on data typed into
Web input forms. Each field in a form has its own name - the HTML "NAME" attribute.
When the form is submitted, the browser sends its contents
to the server as a sequence of name-value pairs, which, using
the utilities provided with Perl, Python and other scripting
languages, can be extracted and dumped into a database or
converted immediately to Prolog. Thus, data
on a user's finances might be converted to
age(u_173,50).
sex(u_173,f).
children(u_173,4).
owns_credit_card(u_173,visa).
owns_credit_card(u_173,amex).
annual_income(u_173,55000).
where the age, sex, etc. come from corresponding input fields, and
u_173 is a unique ID generated for that particular
submission. Compare the trains, where east1
could be regarded as an ID for a train.
Now suppose instead that we want to train on data entered into Excel -
a plausible choice for data entry because of its handy tabular layout and
ease of use. Recent Excels can save spreadsheets as XML,
which is easier to unscramble than a .xls file, and for which
there exists a range of parsers. Thus the spreadsheet
| age | sex | children | credit cards | annual income |
| 50 | f | 4 | visa,amex | 55000 |
| 34 | m | 0 | mastercard | 32500 |
could also be translated into the facts shown just above. I've experimented
with this using my spreadsheet compilation and analysis
tool Model Master to pull data from
Excel XML, and it works pretty well. Sometimes, Excel's
simpler comma-separated value format suffices, but XML means you needn't
avoid commas (or other separators) in strings, and it gives more information
about data types.
I hope these demonstrations showed how easy it now is to
use ILP. There are several other examples
provided with Aleph, and I do recommend
running them all and working right through
the Aleph manual.
This will give you a good feel for the kinds of
data ILP can learn from and the kinds of rule it can learn,
as well as showing how many variations exist on the
ILP algorithms. Information on how ILP works and
on its applications can be found via the links.
If you want to use ILP on a real-world problem,
read "On the appropriateness of Aleph" in the manual. There are many
ILP programs; Aleph is easy to install and run, and flexible enough
to implement a number of learning methods, but this flexibility makes
it slower than more specialised programs. Consider also whether
ILP itself is best for you. Srinivasan recommends
asking whether other techniques such as statistical programs, genetic algorithms,
neural networks, Bayesian nets, or tree-learners would do instead.
Each has its own pros and cons - speed, storage, number
and kind of training data, intelligibility of learnt knowledge, difficulty
of understanding the technique - and ILP isn't always the
best trade-off. But sometimes it is; and its successes on
biological molecules, with all their structural complexity, are a compelling
demonstration of its power.
www.swi-prolog.org/ - SWI Prolog.
www.ncc.up.pt/~vsc/Yap/ - Yap Prolog.
web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/ -
Ashwin Srinivasan's Aleph manual at Oxford University
Computing Lab. This is the contents page, and links to the rest of the manual.
The Aleph download is at www.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/aleph.pl,
and the examples are at
www.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/misc/examples.zip.
www.cs.wisc.edu/~dpage/cs731/mlj_page.ps -
Pharmacophore Discovery using the Inductive Logic Programming System PROGOL. Paper by
P Finn, S Muggleton, D Page, and A Srinivasan, from Special issue of Machine Learning on Applications and the
Knowledge Discovery Process, Kohavi and Provost (Guest Editors), Volume 30, Numbers 1 and 2, 1998.
www.doc.ic.ac.uk/~shm/progol.html -
The site for Progol, including downloads. As with Aleph, contact the author if
you want to use it commercially.
capra.iespana.es/capra/ingles/cafe/cafe.htm -
A Spanish goat site's story of The coffee and the goats. According to an
article by Canadian chemistry professor and
writer Joe Schwarcz, spent coffee grounds are excellent for removing
the smell of elephant urine. That article is no longer on the Web,
but - returning to pharmacology - Joe's archives at
www.tvo.org/yourhealth/joeschwarcz.html
describe the effects of an assortment of plants, from cabbage and
cranberries to chamomile and kava kava. Drug design sometimes begins from
observations of such effects.
www.newdrugdesign.com/Rachel_Theory_02.html -
The biochemistry of drugs, from the RACHEL Software Innovations site.
Introduction to pharmacophores and the way drugs bind to biological molecules.
www.itakura.toyo.ac.jp/~chiekon/papers/wflp01.pdf -
Inducing Differences among Documents using Aleph with Construction of
Background Knowledge. Paper by Chieko Nakabasami describing
preliminary research on Aleph and case-based reasoning
for learning to extract information about the content of scientific papers.
www.cs.wisc.edu/~dpage/ - David Page.
Contains links to his work on data mining and machine learning (especially ILP
and other techniques for learning about complicated structures) applied to bioinformatics,
chemoinformatics, and health sciences.
www.doc.ic.ac.uk/~shm/cbl.html -
Imperial College Computational Bioinformatics Laboratory. Has an overview for non-experts of ILP-based
knowledge discovery for pharmaceutical research,
including diagrams and a sample hypothesis for protein folding. Also describes Progol's impressive
performance in the National Toxicology Program's carcinogenicity prediction competition.
jmlr.csail.mit.edu/papers/volume4/page03a/page03a.pdf -
ILP: A Short Look Back and a Longer Look Forward. Paper by Page and Srinivasan from
Journal of Machine Learning Research, Volume 4, 2003. Looks at the most
challenging applications of ILP - bioinformatics and natural language - and how
its remaining shortcomings might be overcome.
web.comlab.ox.ac.uk/oucl/research/areas/machlearn/applications.html -
Oxford University Computing Lab pages on applications of ILP.
www-users.cs.york.ac.uk/~jc/teaching/GSLT/ilp-goth.html -
James Cussens's Day Course on inductive logic programming and natural-language processing
for the Graduate School in Language Technology, Göteborg,
with practicals on various examples with Aleph.
Solutions to exercises can be found at www.sics.se/~fredriko/courses/ml03/.
Other ILP links are available from James's home page at
www-users.cs.york.ac.uk/~jc/.
web.comlab.ox.ac.uk/oucl/research/areas/machlearn/Aleph/misc/basic.html -
Basic Ideas Relevant to ILP by Ashwin Srinivasan.
www.sciencemag.org/feature/data/compsci/machine_learning.shtml -
David Aha's very comprehensive Machine Learning Resources page. Includes
ILP resource sites, datasets for experimenting with machine learning, and
links for genetic algorithms and neural nets.
robotics.stanford.edu/people/nilsson/mlbook.html -
Online draft of Nils Nilsson's book on Machine Learning. Written in 1996, the notes are becoming outdated - there have
been many advances - but are still a good introduction to the important ideas.
www.aaai.org/AITopics/html/machine.html -
AAAI machine learning page.
www.aaai.org/Library/Magazine/Vol04/04-03/Papers/AIMag04-03-007.pdf -
Machine Learning: A Historical and Methodological Analysis by Jaime Carbonell, Ryszard Michalski and Tom Mitchell.
For historical perspective, this shows where machine learning was 20 years ago.
www.rci.rutgers.edu/~cfs/472_html/Learn/LearningToc.html -
An Early Machine Learning approach to Concept Learning. Charles Schmidt's explanation
of WInston's arch-learning program.
Past newsletters are available at either www.ddj.com
or www.ainewsletter.com.
As ever, interesting links and ideas for future issues are very
welcome. Feel free to contact either myself (below) or Jocelyn <popx@j-paine.org>
with comments, thoughts and suggestions.
Until next month,
Dennis Merritt
Copyright ©2004 Amzi! inc., CMP, and Jocelyn Paine. All Rights
Reserved
|