In this "issue":
Development
Well, this week I've spent quite a bit of time talking to various PowerMac developers. The level of expertise is slowly rising as people work out their own little systems of programming without a stack, and how to actually get maximum speed out of the chip. It's just like the clock has been turned back 10 years, when people were getting their heads around the likes of 6502's and Z80's. Just proves that it doesn't matter how powerful the machine, it's never powerful enough! I think it's going to be scary when we see the first machine code PPC games!
One of the conversations I had centered around hardware acceleration, and how you should make available a CopyBits option (in a game), as on some machines CopyBits is hardware accelerated and will easily outperform the most devious of direct screen write algorithms. At the same time, not all machines have accelerated CopyBits, and so you have to write to the lowest common denominator, which does mean devious graphics routines. Where this is leading is that we were to include various CLUT (Color Look Up Table) routines in the PPC video library, but then decided that it was probably a waste of time because whatever we wrote would not be suitable for your situation.
For example, if we wrote a general purpose CLUT rotate, it would be too slow because you may only want to rotate colours 32-64 and so a custom rotate will be far quicker. The last thing you want is a slow routine trying to work out what it is you want. Ok, so it makes coffee for you, but it doesn't do the job as quickly as it could.
I'll show you some example PPC CLUT code in the interlude (Fully working small demoette).
The other thing of concern to quite a few people at the moment is the amount of time it can take to change the CLUT. We've talked to quite a few people about this, and the general opinion is that it can take up to one whole video frame for the SetEntries call to come back to you - Eeek! This is borne out in our own code, where we can see the speed of the rotates varying quite considerably. Why? Well it seems that most video hardware will wait for the VBL (Vertical BLanking) time before actually changing the CLUT. This of course is a perfectly natural thing for video hardware to do, as it simply may not have the time to transfer 2k of data whilst scanning the monitor at a high refresh rate, whereas it does nothing whilst the electron beam is returning to the top left of the monitor tube and so can happily update its internal hardware palette. The other reason is that you will get no visual interference on the screen when the CLUT is changed, as during the VBL the electron beam is blanked, and so draws nothing.
This poses a big problem if you need to rotate colours whilst other action is happening.
Why would you want to change colours whilst doing other things? Millions of reasons - fading backdrops or just selected areas of graphics, animation, general fast graphics frigs etc - it's always been a time honoured way of making things look more busy than they really are.
So we have a real problem in that we can't do colour rotates and other things at the same time - or can we? As a matter of fact, I don't think I've seen it done on the Mac.
Thus we have our second puzzle, as posted on the puzzle page. I'll give you one of the ways we thought of doing it in a few weeks. In the meantime, if you have a way of doing this, send it in and we'll post it up.
One of the examples that never got sent out with PowerFantasm (it will be included in the 406 distribution) is how to create a Transition Vector with PF. As you may know, PF does not create TVs in its TOC when you define a code pointer with toc_routine. It creates half a TV - the code pointer.
A Transition Vector is two words - the first is the code pointer, the second is the code pointer's TOC value.
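As a rough illustration of that layout (in Python rather than PowerFantasm, purely to show the bytes), a TV can be modelled as two big-endian 32 bit words. The addresses are made-up values and `make_transition_vector` is a hypothetical helper name, not something from PF or the Toolbox.

```python
import struct

def make_transition_vector(code_ptr: int, toc_ptr: int) -> bytes:
    """Pack a transition vector: code pointer first, then its TOC value."""
    # Two 32 bit words, big-endian as on the PowerPC.
    return struct.pack(">II", code_ptr, toc_ptr)

# Made-up addresses, for illustration only.
tv = make_transition_vector(0x00123400, 0x00456000)
code_ptr, toc_ptr = struct.unpack(">II", tv)
```

The whole vector is 8 bytes, with the code pointer at offset 0 and the TOC value at offset 4.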
You generally need a TV when you wish to create a Universal
Proc Pointer. These are needed whenever you want to use a callback proc,
for example in an async sound driver or controls.
As we now know what a transition vector is, we can create one
quite easily, normally in the BSS section, by getting the relocated
code pointer, and the current rtoc value. The following code sums
it up:
        includeh general_usage.def
        import  CallUniversalProc       *Note - not defined in headers
*************************************************************************
*Example of setting up an Universal Proc Pointer for PowerFantasm V4.xx *
*©Lightsoft 1995.                                                       *
*                                                                       *
*************************************************************************
**
**Theory of operation:
**Step 1
**Because PowerFantasm does not create transition vectors in its TOC (Because it wastes
**valuable space) we create a transition vector for the routine to be called in the BSS.
**Step 2
**We then pass that TV to NewRoutineDescriptor, which will return an UPP which we can save
**away. Note that all your UPP's should be calculated as early as possible in your application's boot
**up stage, to prevent heap fragmentation.
*************************************************************************
**Note, for your edification, we've used the quicker startup and exit forms of code, rather
**than the longer (but more educational) macros (Startup and tidy_up).
bss:    reg     r30
**********Start up
        ENTRY
Startup:
        mflr    r0
        stw     r0,8(sp)                *Store link register on the stack
        stmw    r10,-88(sp)             *Save r10-r31
        stwu    sp,-64+88(sp)           *Skip over the stack space
        lwz     r30,(rtoc)              *Load global data (bss) pointer (first entry in TOC)
        stw     r2,20(sp)               *Save RTOC;
**********Call Macsbug to examine
;       Xcall   Debugger
**Step 1
**Because Fantasm does not create transition vectors in the TOC, we create our own in the
**BSS.
**Set up transition vector
        la      r10,test_tv(`bss)       *TV here in bss
        lwz     r4,[t]my_test(rtoc)     *Actual pointer to code
        stw     r4,(r10)                *into transition vector
        stw     rtoc,4(r10)             *followed by our toc
**Step 2
**get a routine descriptor for our test upp.
        la      r3,test_tv(`bss)        *pointer to tv to code we want to execute
        lwz     r4,my_test_info(rtoc)   *info record for code
        li      r5,1                    *isa=ppc
        Xcall   NewRoutineDescriptor
        cmpwi   r3,0                    *Check if call failed
        beq     Error                   *NewRoutineDescriptor failed.
        stw     r3,test_upp(`bss)       *save the upp
**End of actual set up, we can now test...
***********The acid test - call my_test through the upp
        lwz     r3,test_upp(`bss)
        lwz     r4,my_test_info(rtoc)
        Xcall   CallUniversalProc
***********Quit app
Error:                                  *On error, quit.
Exit:
        lwz     r0,64+8+88(sp)          *Get saved link register
        mtlr    r0
        addi    sp,sp,64+88             *Reset stack pointer
        lmw     r10,-88(sp)             *Restore r10-r31
        blr
*************************************************************************
**This is our universal headered ppc test routine
my_test:        toc_routine             *so we get a relocated pointer to this code entry point
        mflr    r29
        Xcall   Debugger                *so we can check we get here
        nop                             *Do what you have to do.
        nop
        nop
        mtlr    r29
        blr                             *End of routine
*************************************************************************
*DATA
*************************************************************************
*Initialised data
*
**procinfo data for my_test
my_test_info:
        dc.w    0                       *procinfo 0=no parameters, see IM PPC sys s/w
        dc.b    0                       *resvd
        dc.b    1                       *ppc (68k=0)
        dc.h    4                       *Routine flags 4=native + 2=needs init + 1=offset
*************************************************************************
*Uninitialised data into BSS section
test_tv:        rs.w    2               *Transition vector for routine.
test_upp:       rs.w    1               *save our upp here
**
This little snippet should be assembled in the Stand Alone mode.
It sets up a transition vector to the routine "my_test",
then calls NewRoutineDescriptor to create a UPP, and finally,
calls CallUniversalProc just as a test.
General
Well, that low end platform I was talking about last week finally has a name - "LERP" or Low End Reference Platform. Come on guys, are you serious or what? Hello?
I see now that the cloning operation is proceeding smoothly,
what with the likes of the big Japanese and Taiwanese corporations
getting a slice. Rumor has it that LG Electronics, or Goldstar
as they are more commonly known, think they'll have a box out
by September, and both I.B.M. (God bless 'em) and Motorola are
sub-licensing thick and fast. This is good.
However, I did have a rather silly thought about all this: It
is a well known fact that some people buy a Mac because it is
a top of the range item. It exudes quality and looks expensive
- to some, a status symbol. I do hope we don't end up in a situation
where we have people scorning other Mac users simply because they
have a "Goldstar Spesh" rather than a "real"
Mac. It is possible, especially if the cheaper clones won't do
what a real Mac does - for example there could be a lot of corners
cut in the licensing area of things like fonts and internal technologies.
What if a clone has problems running QT? What if it won't take
an ADB keyboard? I think I'm talking complete, unadulterated cr*p
here, so...
On a different note, I received a Macsbug dump this week from
406, and couldn't help noticing the hard disk was called "MacintoshHD"
- how can you have a hard disk called that? James - get it sorted
- it's a Mac :-)
But, this started me looking into what people call their Macs
and associated devices - a quick check round here revealed the
following:
"Victoria", "Annabel", "Connie",
"Susan", "LightWorld" - soon to be changed
apparently, "The Beast" and of course, "Elsie".
The server under the stairs (and out of view) has not got a name,
because it is a PeeCee, and as such simply isn't worthy.
Why are we allowed to name Macs? So they can become ours. The
same way you can change a great deal of the interface, to make
the Mac yours, you can also give it a name. Is this silly?
Nah! It's great.
Interlude
Ok, so now onto what can cause problems
for some Mac developers - fades. You want one of those cool screen
fades for your latest and greatest?
We'll talk about fading 256 colour screens, as that's just about
the de facto standard for games these days. We've uploaded a complete
example project to the downloads section, but it is not a supported
Lightsoft project. It is just example code that shows you how
to do fades and rotates. It is native only, and will run on any
PowerMac that has a 256 indexed mode video driver (that is - 8
bit or 256 colours). The code therein has not been edited or commented
to any great extreme - it is "as is". You can download it (20k) from the downloads section.
Now then, how to fade? Well, most Macs support what's called an "indexed" video mode - that is where a colour is translated to an index, which is used to address a table, which contains the best match for that colour (as 16 bit RGB values). This means that the Mac can supply a rough approximation for the colour you want. Colours on the Mac are normally defined as red, green and blue values. Each value is 16 bits, and you can set the foreground drawing colour with RGBForeColor and the background drawing colour with RGBBackColor. When the Mac is actually refreshing the screen, it looks at the pixel in VRAM, uses its value to index into the table, and sends the contents of the table to the monitor.
The table which is indexed is called the Color Look Up Table, or CLUT. Thus when in this mode, if we can change the CLUT periodically, we can alter colours on screen dynamically without writing any pixels. From this we can deduce that each entry in the CLUT is 3 halfs, right - Red, Green and Blue values? Wrong, each entry is actually four halfs, which makes a great deal of sense from an addressing point of view! The values we want to alter are the last three halfs - these are the Red, Green and Blue values for this index.
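To make the four-half layout concrete, here is a small sketch in Python (not Mac code) that packs and reads entries of this shape. It assumes the first half is the entry's value field and the remaining three are red, green and blue, all big-endian; `pack_entry` and `unpack_rgb` are hypothetical names used only for this illustration.

```python
import struct

ENTRY_FORMAT = ">HHHH"  # value, red, green, blue: four big-endian halfs
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)

def pack_entry(value, red, green, blue):
    """Build one CLUT entry as raw bytes."""
    return struct.pack(ENTRY_FORMAT, value, red, green, blue)

def unpack_rgb(table, index):
    """Return (red, green, blue) for entry `index`, skipping the first half."""
    offset = index * ENTRY_SIZE  # a power-of-two entry size makes this a shift
    _value, red, green, blue = struct.unpack_from(ENTRY_FORMAT, table, offset)
    return red, green, blue

# Two example entries: white, then an arbitrary colour.
table = pack_entry(0, 0xFFFF, 0xFFFF, 0xFFFF) + pack_entry(1, 1000, 500, 2000)
full_table_bytes = 256 * ENTRY_SIZE
```

At 8 bytes per entry, a full 256 entry table comes to 2048 bytes, which is the 2k of data the video hardware has to take on board when the CLUT changes.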
Fading
The simplest form of fade, is to fade all the CLUT entries to
zero. To do this we need to determine the difference between the
starting value and zero (?! I know, but this theory helps later)
then divide this difference by the number of steps we want to
fade by. For example if we want a slow, smooth fade, we could
fade over 300 steps. If we want a quick fade, we can fade over
say 10 or 12 steps. For example if entry 0 has the values r=1000,
g=500, b=2000 and we want to fade over 100 steps, we divide the
red, green and blue values by 100, which gives 10,5,20. These
values can then be subtracted from this clut entry to fade to
zero.
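The arithmetic above can be sketched in a few lines of Python (an illustration only, not the routine from the example project). Entries are modelled as plain (r, g, b) tuples, and `calc_fade_values` is a hypothetical name.

```python
def calc_fade_values(clut, steps):
    """Per-step amount to subtract from each (r, g, b) entry when fading to zero."""
    # Integer division, so any remainder is simply dropped.
    return [(r // steps, g // steps, b // steps) for (r, g, b) in clut]

clut = [(1000, 500, 2000)]
fade_values = calc_fade_values(clut, 100)
```

For the entry in the text this gives 10, 5 and 20. Note that integer division throws away any remainder, so values that do not divide evenly will not land exactly on zero after the last step.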
As I'm sure you can see, we need an array to hold the fade values in, and each entry in this array is 3 halfs - the fade values for the RGB components of the CLUT entry. We also need an array to hold the original clut in, so we may modify it.
Once we have calculated these fade values, it is simply a matter of looping around for the number of steps, subtracting the fade values from each CLUT entry until the whole CLUT has been modified. When every entry in the clut has been modified we can call SetEntries to send the new CLUT to the video hardware, then go back and subtract the differences again. When the loop is finished, the screen will be dark.
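The loop just described might look something like this in Python. It is a sketch under a few assumptions: entries are (r, g, b) tuples, `set_entries` is a stand-in callback for the real SetEntries call, and `run_fade_down` is a hypothetical name. One choice of my own, not taken from the example project: the last step forces every value to zero, so leftovers from the integer division cannot keep the screen from going fully dark.

```python
def run_fade_down(clut, steps, set_entries):
    """Fade every entry to zero over `steps` steps, sending each new table out."""
    fade_values = [(r // steps, g // steps, b // steps) for (r, g, b) in clut]
    work = [list(entry) for entry in clut]   # working copy of the original CLUT
    for step in range(1, steps + 1):
        for entry, fade in zip(work, fade_values):
            for c in range(3):
                if step == steps:
                    entry[c] = 0             # last step: snap to black
                else:
                    entry[c] -= fade[c]
        # Only send the table once every entry has been modified.
        set_entries([tuple(e) for e in work])
    return [tuple(e) for e in work]

# Usage: collect the tables instead of sending them to video hardware.
frames = []
final = run_fade_down([(1000, 500, 2000), (65535, 0, 7)], 100, frames.append)
```

Here `frames` ends up holding one table per step, and `final` is all zeros.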
A variation of fading to zero is fading to a colour. Again we use an array of fade values, but this time the fade values are calculated not from the original colours minus zero, but the original colours minus the new colour, so we can get positive or negative fade values. Then, as per the previous fade, we loop round modifying the CLUT entries until the requisite number of fade steps have completed.
The last fade incorporated into the example code is a "fade to clut" routine. This works similarly to the above, except this time, when calculating the fade values, we have to find the difference between two CLUT entries and store these as the fade values.
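Both variations can be covered by one sketch, since fading to a colour is just fading to a CLUT in which every entry is that colour. The Python below is illustrative only: entries are (r, g, b) tuples, `set_entries` stands in for SetEntries, and `fade_to_clut` here is my own sketch rather than the routine of the same name in fade.s. The fade values are signed (start minus target), truncated towards zero so a step never overshoots, and the last step snaps to the target exactly.

```python
def fade_to_clut(start, target, steps, set_entries):
    """Fade from `start` to `target` over `steps` steps using signed fade values."""
    # Fade value = (original - new) / steps, truncated towards zero.
    deltas = [tuple(int((s - t) / steps) for s, t in zip(a, b))
              for a, b in zip(start, target)]
    work = [list(entry) for entry in start]
    for step in range(1, steps + 1):
        for i, entry in enumerate(work):
            for c in range(3):
                if step == steps:
                    entry[c] = target[i][c]   # last step: land on the target
                else:
                    entry[c] -= deltas[i][c]  # negative deltas brighten
        set_entries([tuple(e) for e in work])
    return [tuple(e) for e in work]

start = [(1000, 500, 2000), (0, 0, 0)]
target = [(0, 1500, 2000), (3000, 3000, 3000)]
frames = []
final = fade_to_clut(start, target, 10, frames.append)

# Fading to a single colour: every target entry is the same.
grey = [(0x8000, 0x8000, 0x8000)] * len(start)
to_grey = fade_to_clut(start, grey, 10, lambda table: None)
```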
When working with CLUTs, it is very important to get the Port
correct, otherwise you may copy/set the wrong colours! Remember,
there is only one CLUT - in the video hardware. In reality, when
we are getting a CLUT for modification, we are reading a PMTable
out of a Port - the current Port, so make sure it is the port
you want (unless of course you are synthesising your own CLUT).
Rotating
A simple procedure whereby the whole CLUT is rotated either left
or right, more commonly called forwards or backwards. Included
in the example code is a simple rotate which rotates all the entries
left one position. These rotates can be seen in the vertical,
horizontal and circle examples of the code. To rotate down by
one, you copy the first entry, then shift the next 255 entries
down one, then paste the first entry into position 255.
You may like to rotate by more than one position, or just rotate a section of the CLUT, say colours 32-64 thus animating only part of the colour table - play with it, it's good fun!
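A rotate over either the whole table or just a section can be sketched like this in Python. It is an illustration only: the table is a plain list, the range is inclusive at both ends, and `rotate_left` is a hypothetical name, not the rotate_clut routine from the example code.

```python
def rotate_left(clut, first=0, last=None):
    """Rotate entries first..last (inclusive) one position towards index `first`."""
    if last is None:
        last = len(clut) - 1
    saved = clut[first]                          # copy the first entry
    clut[first:last] = clut[first + 1:last + 1]  # shift the rest down one
    clut[last] = saved                           # paste it into the last slot
    return clut

whole = rotate_left(list(range(8)))        # rotate everything
part = rotate_left(list(range(8)), 2, 5)   # rotate only entries 2-5
```

Entries outside the chosen range are left alone, which is what lets you animate one band of colours while the rest of the screen stays put.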
The routines
The file "fade.s" contains most of the routines we've talked about previously.
fade_down, fade_to_clut and fade_to_colour are the main routines here.
copy_active_clut gets memory for the clut, then copies the PMTable of the active port and puts the handle to this clut in "current_clut_h".
get_current_pm_table returns a pointer in r3 pointing to the PMTable of the active port - used by copy_active_clut, but usable in its own right. get_clut is also used by copy_active_clut and copies the PMTable pointed to by r3 to the clut in "current_clut_h".
load_clut is a resource loader that loads a clut from the resource fork. Takes the clut ID in r3, and returns with the handle to the resource in r3.
calc_fades, calc_fades_clut and calc_fades_colour are the routines that calculate the fade arrays for the three different forms of fade.
run_fade is the actual fade routine that will fade a clut. Needs the clut_copy and fade_values arrays correctly set up. run_fade_add is the same routine, but instead of subtracting the fade_values from the clut_copy, adds them - this works better when fading to a clut.
splat_clut takes a new clut in r3 and calls SetEntries to set the new clut.
copy_clut is a general routine that copies the clut in r3 to the memory pointed to by r4.
rotate_clut is a simple rotate routine that will rotate the clut in r3 to the left one position.
The file cf_utilities contains various window and drawing routines that I won't document here as they're just standard routines that call system functions.
"clut_fade_main" is naturally enough the top level
of the program. The theory of operation follows this path:
That's it! Ah, just one final thing - does anybody have a cgi script that will accept a form and post the fields on as email? If so, can we have a copy as we haven't got a clue!
Code on!
Stu.
©Lightsoft 1996. Unauthorised reproduction prohibited.