#36 - Apr 99 [Updated 2 May 99]
StuChat 36
Hi Folks,
Been a long time since I did this - actually took some time out
to write a StuChat
rather than just throwing one together. This one will cover: Evangelist,
fast square
root and arctan algorithms and finally a quick run down and some
software related to
Genetic Algorithms.
First ladies and gentlemen, a minute of silence for the death
of the EvangeList
which ceased operations on 16th April 1999. You've probably read
it elsewhere
by now but here's the closing message:
In the past two years Apple has experienced a stunning turnaround.
This
is due to many things including the steadfast loyalty of Apple's
customers--and EvangeListas are the most steadfast of the steadfast.
The original purpose of EvangeList was to counteract the
negative news
about Apple and Macintosh, and I believe that EvangeList has served
its
purpose--fantastically, as a matter of fact. So after discussing
what we
should do with EvangeList with the folks at Apple, we've decided
to
retire the list.
It is with fond memories I bid the Evangelist fairwell. Two
years
ago were indeed very dark times. Many people (myself included)
alluded to the notion
of the "Dark Side" slowly gaining ground and converts.
I must admit it isn't half as
much fun these days; knowing we were right (I am aware that i
may be getting a little
ahead of the situation here) and that we would climb out into
the light one day "soon."
These days it is rather different; we screamed for on-board
3d hardware and standards;
we screamed for better performance (though we always knew this
was a "given" considering
we were RISC and the darkside wasn't) and we screamed for memory
protection. Now we have
them (or very soon will have) has the fun gone?
Yes and no. Yes, because there's very little to moan about
at the moment; very little
to want. No because the fundamentals have changed very little
- PC's are still running
antiquated instruction sets; Macs are still Macs and the HW has
just standardised -
what's the difference betwen a 603 and a G3? Very little. A G4?
AltiVec which is already
a very well understood and anticipated technology. People still
expect Mac software to
be better than PC software and so it should be!
Square Root algorithm (with thanks to Prof. Ranko Bojanic)
Very often one needs fast trig functions and would be quite happy
to trade accuracy
for speed. A classic use for sqrt is when calculating the length
of a line (in 3d
space) viz:
sqrt(a^2+b^2+c^2)
Using Newton's method and selecting the starting point we can
get a very good
approximation after just two or three iterations - we really don't
need absolute
accuracy.
First compute R = a^2 + b^2 + c^2. Then
set x_0 = a+b+c and then compute
x_1 = 0.5*(x_0 + R/x_0)
x_2 = 0.5*(x_1 + R/x_1)
x_3 = 0.5*(x_2 + R/x_2)
etc for greater accuracy
For example, if we take 10,11,12 for a,b and c the result of
sqrt(a^2+b^2+c^2)
is 19.10.
In this example:
R is calculated as 365 (10^2+11^2+12^2) and x_0 as 33
Now the first iteration:
x_1 = 0.5*(x_0 + R/x_0)
R/x_0 = 365 / 33 = 11.06
+x_0 = 11.06 + 33 = 44.06
*0.5 = 22.03
x_1 computes as 22.03
On the second iteration:
x_2 = 0.5*(x_1 + R/x_1)
R/x_1 = 365 / 22.03 = 16.57
+x_1 = 16.57 + 22.03 = 38.6
*0.5 = 19.3
x_2 computes as 19.3
The third iteration:
x_3 = 0.5*(x_2 + R/x_2)
R/x_2 = 365 / 19.3 = 18.91
+x_2 = 18.91 + 19.3 = 38.21
*0.5 = 19.10
x_3 computes as 19.10
As you can see, within a few iterations the result is close enough - in this case three.
The error estimate after n iterations can be calculated with:
x_n - Sqrt(a^2+b^2+c^2) <
< 2 * Sqrt(a^2+b^2+c^2) * (0.5(Sqrt(3)-1))^(2^n)
Fast ATAN (see below for update re: more accurate atan)
Atan can be used to calculate the angle or heading to an object
on a 2d plane given
a right angle triangle as follows:
We know that:
sin(x)=opposite/hypotenuse
cos(x)=adjacent/hypotenuse
tan(x)=opposite/adjacent
When considering a 360 degrees "arc" if we split
it into quadrants then we can construct
right angle triangles. From a right angle triangle using the inverse
functions arcsin,
arccos and arctan we can calculate the angle given the length
of the sides of the triangle.
Now you can use Pythagorus to calculate the length of the hypotonuse
should you really want
to, but we already have the length of the adjacent and opposite
sides, so if we had a quick
arctan function we could use this:
angle=arctan(opposite/adjacent)
The problem with arctan is that it gets very non-linear above
an input of 1.0 which equates
to 45 degrees, or one half of a quadrant. What you can do is split
the quadrant in half
which means the greatest input to the arctan function will be
1.0 and the greatest angle
returned will be 45 degrees - you then need to correct the result
depending on what octant
the point lies in, but that's a simple addition.
The actual arctan function then becomes a simple table look
up. In this example
I have used a table with 21 integer values declared. The pseudo-code
routine
is accurate to about 2.5 degrees over a 45 degree arc.
**Takes a floating point input value called "in"
between 0 and 1,
**returns an integer result in degrees
if in<0 or in>1 error
index=convert_to_int(in*20)
return atans(index)
atans: dc.w 0,3,6,8,11,14,17,19,22,24,26,29,31,33,35,37,38,40,42,43,45
**end
I have essentially found what I believe is the best method
for the evaluation of the atan function. The computation of
the atan function as described in StuChat 36 is very ingenious,
but its accuracy is not very good.
Here is a very simple formula to compute the atan function:
atan(x) = x/(1+ 0.28*x^2) (|x|<=1)
(To get degrees, multiply the right hand side by 180/pi).
For example, let's compute atan(0.15). By your formula,
20*0.15 = 3. Hence, from the table,
atan(0,15) = 8 degrees.
By this formula,
atan(0.15) = 0.15/(1+0.28*(0.15)^2)
= 0.15/1.0063 =
= 0.149060916
or, in degrees,
atan(0.15) = 0.149060916 *(180/3.14159265)
= 8.54 degrees
With a calculator we find that
atan(0.15) = 8.53076561...degrees
Hence the error is less than 0.01 degrees!
For values of x > 1 use the formula
atan(x) = pi/2 - x/(x^2 + 0.28) (|x| >=1),
or, in degrees,
atan(x) = 90 - (x/(x^2 + 0.28))*(180/pi) |x|>=1.
For example, let's compute atan(20). We have
atan(20) = 90 - (20/400.28)*180/3.14159265 =
= 90 - 2.862785029
= 87.13721497 degrees
Your calculator gives
atan(20) = 87.13759477 degrees
Hence the error is less than 0.0004 degrees!
To summarize: you need only to have a formula for |x| <=
1.
For values |x| >= 1 you use the identity
atan(x) = pi/2 - atan(1/x).
>From Approximation Theory (C. Hastings, Jr., Note 143,
Math. Tables Aids. Comp 6, 68 (1953)) we know that
atan(x) = x/(1+0.28*x^2) + e(x)
where |e(x)| <= 0.005 for |x| <=1. (If you are
interested, I can send you info how to get even more
precise approximations of the atan(x) function)
I am sure that you will see immediately the advantages
of using this method from the programming point of view.
See StuChat 37 for more on Atan
(My sincere thanks to Ranko)
Genetic Algorithms
These are fun. The basic premise is as follows:
1. Generate a load of genes. A gene is simply a list of instructions.
2. 'Run' them.
3. Score them according to how well they performed the "task".
4. Pick the top 10% of genes and discard the remaining 90%.
5. Fill the pool with duplications of the top 10%.
6. Mate the genes - for example for each pair of genes, swap one
instruction from each.
7. Go to step 2.
A simple task might be to produce the number 10. The genes
consist of instructions
that are executed either in a virtual machine or on a real processor
(assuming suitable
safety mechanisms are in place to catch crash conditions). After
executing they can be
scored according to their result and length - a shorter gene scores
higher than a longer
gene and one that gets closer to 10 scores more than one that
doesn't.
Here's a very simple (but complete) example in 68K assm I wrote a long time ago:
*****************************************************************
*Simple test of Genetic Algorithm
*
*
*What:
*The goal of a gene in this case it to produce the result 10.
*a perfect gene will have an instruction length of 1 and produce
the result 10 when run
*this is impossible.
*Each gene can contain instructions as follows:
*0 - clear accumulator
*1 - inc accumulator
*2 - dec accumulator
*3 - shift left accumulator
*4 - shift right accumulator
*5 - nop
*
*The accumulator is always cleared before processing the genes
code.
*A gene is preceded by it's instruction length, score, result
and selected. Each instruction, each gene "entry", is
*a byte, thus a typical gene might
*look like 3,0,3,0,1,1,1
*length----^
*score-------^
*result--------^
*selected--------^ *Set to 1 by culler if deemed fit for next
generation (top 10%)
*
*Process
*1. Initialise n genes with random garbage (but the length byte
has to be correct
**maximum length allowed must be set!
*2 loop
*2a Process each gene, giving it a score depending on how close
its result is to 10,
*and how long it's instruction length is.
*
*Thus the lower the score, the better the gene (code contained
within it). The highest
*score is zero (impossible),
*a good one might be
* inc accum *1
* shift left *2
* shift left *4
* shift left *8
* inc accum *9
* inc accum *10
*Six instructions which would score ((10-10)+1)*6=6
**As it turns out this is what it produced
** inc accum *1
** shift left *2
** inc *3
** inc *4
** inc *5
** shift left *10!!!! (Quite frankly that's amazing, and it only
took 4 iterations!)
*Show the score
*2b pick the top 10% of genes and delete the remaining 90%
*2c. duplicate these to fill the remainder of the 90%
*2d. mate these genes with each other as follows.
* 1. Pick two random genes. Swap an instruction at random between
them.
**(To prevent stagnation, there's a 1 in thousand chance we'll
just stick a dec into
**the first instruction of one of them)
*3 do it all again until we decide to stop
number_of_genes: equ 300 *lets have this many
max_gene_length: equ 64 *max. number of bytes in each gene
max_instruction_length: equ max_gene_length-4
min_instruction_length: equ 3
max_opcode: equ 5 *instructions between 0 and 5 are valid
GA_main: entry
bsr init_mac
bsr ga_init
ga_loop:
bsr score_genes *run their code in VM and calc score
**find top 10 %
bsr cull
debug *examine genes here. 5 steps produces optimal program!!!!!!!!!!!!!!!
lea genes(pc),a3 *quick vis check
move.l the_highest_score(a5),d0
bsr mate *mate each gene with it's neighbour
bra ga_loop
rts_ "GA_main"
**this routine mates each gene with it's neighbour
**this involves:
*1. pick two genes at random
*2. swap one random instruction between them.
*
mate:
move.l #number_of_genes/2,d5 *counter
**loop
do_next_mate:
**pick next two genes at random
movem.l d1-d7/a0-a6,-(sp)
move.l #number_of_genes-1,d0
bsr get_rand
movem.l (sp)+,d1-d7/a0-a6
mulu.l #64,d0
lea genes(pc),a3
add.l d0,a3
movem.l d1-d7/a0-a6,-(sp)
move.l #number_of_genes-1,d0
bsr get_rand
movem.l (sp)+,d1-d7/a0-a6
mulu.l #64,d0
lea genes(pc),a4
add.l d0,a4
**We just swap one byte at a random position determined by the
shortest string
**find shortest
clr.l d0
clr.l d1
move.b (a3),d0
move.b (a4),d1
lea 4(a3),a3
lea 4(a4),a4 *to code
cmp.l d1,d0
ble.s first_is_shortest
move.l d1,d0
first_is_shortest:
movem.l d1-d7/a0-a6,-(sp)
bsr get_rand *<=d0
movem.l (sp)+,d1-d7/a0-a6
**just a tadge of randomness
movem.l d0/d2-d7/a0-a6,-(sp)
move.l #1000,d0
bsr get_rand
move.l d0,d1
movem.l (sp)+,d0/d2-d7/a0-a6
**if d1=500 then we make inst 1 in one gene a dec instruction
cmpi.w #500,d1
bne.s not_mutation
add.l d0,a3
add.l d0,a4
move.b #2,(a3) *dec instruction
bra.s done_mute
not_mutation:
**swap bytes
add.l d0,a3
add.l d0,a4
move.b (a3),d1
move.b (a4),(a3)
move.b d1,(a4) *swap just one instruction
done_mute:
*have a cigarette and do next
subq.l #1,d5
bne do_next_mate
rts_ "mate"
**walks the genes.
*first finds highest scorers
cull:
bsr mark_as_winner
lea genes(pc),a3 *the winners should now be marked
bsr move_winners_to_start *move em to the start of the table
**duplicate this 10% to other 90%
lea genes(pc),a3 *dest
lea genes(pc),a4 *copy from here
add.l #max_gene_length*(number_of_genes/10),a4 *to here
move.l #8,d0 *9 times..
replicate:
**copy (number_of_genes/10)*(max_gene_length/4) to a4 from a3
push a3 *save src
move.l #(number_of_genes/10)*(max_gene_length/4),d1
copy_10:
move.l (a3)+,(a4)+
subq.l #1,d1
bne.s copy_10
pop a3 *restore src
dbra d0,replicate
rts_ "cull"
move_winners_to_start:
lea genes(pc),a3 *dest
lea genes(pc),a4 *copy from here
move.l #number_of_genes-1,d7 *cntr
mwts_loop:
tst.b 3(a4)
beq.s not_a_winner
**Copy max_gene_length bytes from a4 to a3
moveq #max_gene_length/4-1,d0
move.l a4,a2 *save src ptr
**Clear winner byte
clr.b 3(a4)
cts:
move.l (A4)+,(a3)+
dbra d0,cts
move.l a2,a4 *restore src ptr
not_a_winner:
lea 64(a4),a4
dbra d7,mwts_loop
rts_ "move_winners_to_start"
**This routine runs through genes looking for winniers (top 10%
scorers)
mark_as_winner:
**we need to find the highest 10% of scorers, or 1/10 of number_of_genes
clr.l d6 *score index
move.l #number_of_genes/10,d5 *counter
move.l #0xff,the_highest_score(a5)
*work our way through looking for genes with scores that match
score index until
*we have marked number_of_genes/10
mark_loop:
clr.l d7 *gene index
ml_at_score:
move.l d7,d0
lea genes(pc),a3
mulu.l #max_gene_length,d0
add.l d0,a3 *to gene
cmp.b 1(a3),d6 *cmp score to score
bne.s no_match
clr.l d2
move.b 1(a3),d2 *the score
cmp.l the_highest_score(a5),d2
bge.s not_highest
move.l d2,the_highest_score(a5)
not_highest:
**mark gene as selected
move.b #1,3(a3)
subq.l #1,d5 *dec needed counter
no_match:
addq.l #1,d7
cmp.l #number_of_genes,d7
bne.s ml_at_score *carry on looking for results that match acceptable
inc.l d6 *dec acceptable score
tst.l d5
bgt.s mark_loop
rts_ "mark_as_winner"
score_genes:
clr.l current_gene(a5) *the one were working on
move.l current_gene(a5),d7
**
do_score:
move.l d7,d0
bsr run_gene *execute it's code to get result
addq.l #1,d7
cmp.l #number_of_genes,d7
bne.s do_score
rts_ "score_genes"
run_gene:
lea genes(pc),a3
mulu.l #max_gene_length,d0
add.l d0,a3 *to gene
move.l a3,a4 *save gene start for scoring
clr.l d4 *accum
clr.l d3
move.b (a3),d3 *instruction counter
move.l d3,d6 *save length
subq.l #1,d3
addq.l #4,a3 *to first instruction
process:
clr.l d0
move.b (a3)+,d0
bsr execute
dbra d3,process
**result in d4
andi.l #0x7fff,d4 *keep is reasonable
*calc score
*score is 0 if result =10 else score=>0>255
*as follows
*if result=10 then score=0+length
*else
*find diff between 10 and result and add length
**if result -ve, make +ve
clr.l d0 *Score
move.b d4,2(a4) *Save result
cmpi.l #10,d4
beq.s is_10
*get diff between result and 10
bge.s is_pos
moveq #10,d3
sub.l d4,d3
move.l d3,d0
bge.s is_10 *>0
neg.l d0
bra.s is_10
is_pos:
sub.l #10,d4
move.l d4,d0 *diff
is_10: add.l d6,d0 *plus length
cmpi.l #0xff,d0
ble.s score_ceil_ok
moveq #-1,d0
score_ceil_ok:
move.b d0,1(a4) *save score
rts_ "run_gene"
execute: *inst in d0 as int32
*0 - clear accumulator
*1 - inc accumulator
*2 - dec accumulator
*3 - shift left accumulator
*4 - shift right accumulator
*5 - nop
tst.l d0
beq.s clear
cmpi.l #1,d0
beq.s inc
cmpi.l #2,d0
beq.s dec
cmpi.l #3,d0
beq.s left
cmpi.l #4,d0
beq.s right
cmpi.l #5,d0
beq.s nop
debug *illegal instruction!!!!!!!
clear:
clr.l d4
bra.s end_proc
inc: addq.l #1,d4
bra.s end_proc
dec: subq.l #1,d4
bra.s end_proc
left: lsl.l #1,d4
bra.s end_proc
right: lsr.l #1,d4
bra.s end_proc
nop:
end_proc:
rts_ "execute"
***********************************************************************************
ga_init:
clr.l current_gene(a5) *the one were working on
move.l current_gene(a5),d7
do_init:
move.l d7,d0
bsr init_gene
addq.l #1,d7
cmp.l #number_of_genes,d7
bne.s do_init
rts_ "GA_init"
init_gene:
lea genes(pc),a3
mulu.l #max_gene_length,d0
add.l d0,a3 *to gene
**get number of instructions for this gene
move.l #max_instruction_length,d0
bsr get_rand2 *number between 2 and d0
move.b d0,(a3)+ *length
clr.b (a3)+ *Score
clr.b (a3)+ *result
clr.b (a3)+ *pad byte
subq.l #1,d0
bge ok
debug
ok:
cmpi.l #max_gene_length-1,d0
ble.s ok1
debug
ok1:
move.l d0,d6 *loop
set_inst:
move.l #max_opcode,d0
bsr get_rand
move.b d0,(a3)+
dbra d6,set_inst
rts_ "init_gene"
get_rand2:
push d0
OSRandom d0
pop d1 *input max value
andi.l #$ffff,d0
move.l d1,d3 *range
sub.l #min_instruction_length,d3 *now
mulu.l d0,d3 *(rndm*range)
lsr.l #8,d3
lsr.l #8,d3 *div 65535
move.l d3,d0
addq.l #min_instruction_length,d0
rts_ "get_rand2"
get_rand:
push d0
OSRandom d0
pop d1 *input max value
andi.l #$ffff,d0
move.l d1,d3 *range
mulu.l d0,d3 *(rndm*range)
lsr.l #8,d3
lsr.l #8,d3 *div 65535
move.l d3,d0
rts_ "get_rand"
******data
genes: ds.b number_of_genes*max_gene_length ;Space for genes
**GA_Globals
current_gene: globoff.l 1
the_highest_score: globoff.l 1
the_highest_addr: globoff.l 1
extern init_mac
Unfortunately there is no interface so you have to examine the
genes with MacsBug after
each iteration but hopefully it's food for thought. There are
seven more of these
examples so maybe I'll post the next in the next StuChat should
anybody be interested.
I have posted a Fantasm 6 project for this example available on
FTP
as a Stuffit(tm) 5 archive as GA1.sit.hqx.
Till the next time,
Code on!
©Lightsoft 99