3. The Year of the Snake

Developing Demo Effects #3
In 1988, something like 1200 crack intros, intros, and demos were made! Two very typical effects for this year follow, with one thing in common: the sine wave.

Hello, Bob!

Short for Blitter OBject, this was not really anything new - it had already been, well, referenced in the Hardware Reference Manual. When you coded a scroller, for example, you often arranged it so you had nothing behind it to worry about, but in games, a "bob routine" that would restore the background (and thereby escape the limitations of the sprite chip) was a must! Bobs emerged in demos, perhaps from coders aspiring to become game coders. And they weren't as "aspiring" as all that! Quite a few games in 1988 lacked proper, efficient hardware bob routines.
This snippet shows a so-called cookie-cut blit. On entry, a0 points to the bob graphic, a1 to the screen, and a2 to the contour (called the mask) used to mask out the background. (The accompanying bob graphic is converted with the interleaved option, and the contour is a filled circle duplicated in all bitplanes, also interleaved. If you draw your own, you can create a contour in Deluxe Paint by cutting to brush, selecting the highest color number in the palette, and pressing F2. This will set all bits in all bitplanes to 1 inside the contour to create the mask.)
DrawBob:					;a0-a2=bob/screen/mask, d0-d2=x,y, blit size
    *--- calc shift ---*
	moveq #15,d3				;AND with 15
	and.w d0,d3
	add.w d3,d3				;multiply by 4 to look up longword
	add.w d3,d3				;BLTCON values
    *--- calc screen addr ---*
	asr.w #4,d0				;divide by 16
	add.w d0,d0				;horizontal byte offset
	muls #bwid,d1				;vertical line offset, table possible
	add.w d0,d1
	lea (a1,d1.w),a3
    *--- last:poke only nec. regs ---*
	move.l XtblCookie(PC,d3.w),BLTCON0(a6)	;shifts+minterm
	move.l a3,BLTCPTH(a6)			;C dest
	move.l a0,BLTBPTH(a6)			;B bob graphic (movem.l possible for these 3)
	move.l a2,BLTAPTH(a6)			;A mask
	move.l a3,BLTDPTH(a6)			;D dest
	move.w d2,BLTSIZE(a6)			;blit size
	rts					;a3=screen address of bob

XtblCookie:					;look up BLTCON0+1 from X-shift
	dc.l $0fca0000,$1fca1000,$2fca2000,$3fca3000
	dc.l $4fca4000,$5fca5000,$6fca6000,$7fca7000
	dc.l $8fca8000,$9fca9000,$afcaa000,$bfcab000
	dc.l $cfcac000,$dfcad000,$efcae000,$ffcaf000
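To make the table values less opaque, here is a minimal sketch in Python (not Amiga code) that models what the routine asks the Blitter to do. It evaluates a minterm byte bit by bit, shows that $CA is the cookie-cut function D = (A AND B) OR (NOT A AND C), rebuilds an XtblCookie entry from the shift value, and mirrors the shift and byte-offset arithmetic of DrawBob. All function names are hypothetical, and the bwid value of 40 in the usage note is an assumed screen width in bytes.

```python
def apply_minterm(minterm, a, b, c, width=16):
    """Evaluate an 8-bit Blitter minterm bit by bit on three source words."""
    d = 0
    for bit in range(width):
        # The three source bits select one of the eight minterm bits.
        index = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        d |= ((minterm >> index) & 1) << bit
    return d


def cookie_cut(mask, bob, background):
    """D = (A AND B) OR (NOT A AND C), with A=mask, B=bob, C=background."""
    return (mask & bob) | (~mask & background & 0xFFFF)


def bltcon_entry(shift):
    """Build one BLTCON0+BLTCON1 longword like those in XtblCookie."""
    # A shift in bits 31-28, channels+minterm $0fca, B shift in bits 15-12.
    return (shift << 28) | (0x0FCA << 16) | (shift << 12)


def bob_position(x, y, bwid):
    """Mirror DrawBob: pixel shift within the word, and byte offset on screen."""
    shift = x & 15
    offset = (x >> 4) * 2 + y * bwid
    return shift, offset
```

For example, bob_position(35, 10, 40) gives a shift of 3 and a byte offset of 404, and bltcon_entry(3) gives $3fca3000, the fourth table entry.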

Enter the Sine Wave

Coders discovered that the sine wave was very versatile. It could be used to create circles and other smooth curves, making movements in a demo less stiff and mechanical than simple straight-line ones. It could be used to give the illusion of acceleration or gravity, breathing life into on-screen movements. Coders were very inventive in applying the sine wave to create many kinds of interesting effects that would otherwise have looked dull.
The sine function takes a lot of time for a CPU (without a floating-point unit) to calculate.
Certainly, you could use the math libraries provided with AmigaDOS to calculate this, but that required learning about floating-point formats and providing the libraries on the disk on which the demo was spread. It's also non-trivial to calculate sine to good precision using integer arithmetic in assembler.
A simple and ubiquitous short-cut was to use a high-level language to calculate the sine wave you needed and save it as a binary file, which you would then include in your demo and use as a look-up table.
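As a rough illustration of that short-cut, here is a minimal sketch in Python (a modern stand-in, not what anyone used in 1988). It generates one full sine period and packs it as big-endian signed 16-bit words, the 68000's native byte order. The table length of 1024 entries matches the #1023*2 mask in the snake loop further down; the amplitude of 127 and all function names are assumptions.

```python
import math
import struct


def make_sine_table(entries=1024, amplitude=127):
    """Return one full sine period as a list of signed integers."""
    return [round(amplitude * math.sin(2 * math.pi * i / entries))
            for i in range(entries)]


def pack_table(table):
    """Pack the values as big-endian signed 16-bit words."""
    return struct.pack(">%dh" % len(table), *table)


def save_table(path, table):
    """Write the packed table to a binary file for inclusion in the demo."""
    with open(path, "wb") as f:
        f.write(pack_table(table))
```

Calling save_table("sine.bin", make_sine_table()) would write 2048 bytes, which the assembler source could then pull in with a binary-include directive such as INCBIN. The file name is only an example.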

Figure 1: SineWriter by Duane "Tachyon" McDonnell, generating our sine wave values

The Bob Snake

To generate the typical "bob snake" of 1988 you would have two pointers traversing this table at individual speeds, scale the values individually, and add these two offset-and-scaled waves together to get an interesting wave for the X-axis. You would do the same with two other sets of parameters for the Y-axis. The following "loop" renders the classic double-sine bob snake effect using the DrawBob routine above.
In fact, some demos from this time had these parameters available for the demo viewer to play around with to change the waves. To do this, change the WaveSteps parameters.

	REPT MaxBobs
	move.w (a5,d4.w),d0		;x
	move.w (a5,d5.w),d1		;y
	add.w d1,d1
	add.w (a5,d6.w),d0		;x
	add.w (a5,d7.w),d1		;y

	asr.w #1,d0			;scaled add for variation
	asr.w #2,d1			;feel free to scale with muls

	bsr.w DrawBob
	move.l a3,(a4)+			;save bob's screen addr
    *--- step to next bob ---*
	lea WaveSteps(PC),a3
	add.w (a3)+,d4
	add.w (a3)+,d5
	add.w (a3)+,d6
	add.w (a3)+,d7
	move.w #1023*2,d3		;keep within curve size, used below
	and.w d3,d4
	and.w d3,d5
	and.w d3,d6
	and.w d3,d7
	ENDR

The code saves the modified screen regions to a list, so that only those regions need to be cleared instead of a much slower full-screen clear. (For a high number of bobs using few bitplanes, a full screen clear will be more efficient.)
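The arithmetic of the snake loop can be modelled outside the Amiga as well. The following Python sketch is only an approximation of it, under stated assumptions: it works with table indices rather than the byte offsets held in d4-d7, the amplitude of 100 is arbitrary, and the function names are hypothetical. The start and steps tuples are ordered like d4, d5, d6, d7 and the WaveSteps words.

```python
import math

TABLE_SIZE = 1024  # entries, matching the #1023*2 word-offset mask


def make_table(amplitude=100):
    """One sine period; the amplitude is an assumed value."""
    return [round(amplitude * math.sin(2 * math.pi * i / TABLE_SIZE))
            for i in range(TABLE_SIZE)]


def snake_positions(table, start, steps, count):
    """Return (x, y) for each bob, stepping four table indices per bob."""
    p = list(start)
    out = []
    for _ in range(count):
        # x = two waves added, then halved (asr.w #1)
        x = (table[p[0]] + table[p[2]]) >> 1
        # y = first wave doubled plus second wave, then quartered (asr.w #2)
        y = (2 * table[p[1]] + table[p[3]]) >> 2
        out.append((x, y))
        # step to next bob and wrap around the table
        p = [(i + s) % TABLE_SIZE for i, s in zip(p, steps)]
    return out
```

Trying different step values, for instance snake_positions(make_table(), (0, 0, 0, 0), (8, 12, 20, 4), 16), is a quick way to preview how the WaveSteps parameters change the shape of the snake.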

Figure 2: Our bob snake. Ball graphic by Fredrik "illmidus" Gustafsson

The Sine Scroller

Around this time, we started to see demos trying to push the limits. How many bobs can I draw "in one frame", i.e. before the framerate starts to stutter? Previously, there had been only simple effects with no real thought of "filling up the rastertime" and making it more impressive. Now, competition set in, and if your demo caused frame overrun, you'd better optimize! (This optimization is also necessary for sine scrollers, which consist of hundreds of small blits.)

Figure 3: Bat Sine by Exodus, featuring a 1 pixel sine scroller
A few made sine scrollers with the CPU, necessarily resulting in a horizontal 8px resolution - moving columns of bytes up and down. To go beyond this and still not get frame stutter, you had to use the Blitter.

The "Bimmer"?

Did you know that Jay Miner, the father of the Amiga's original chip set, called it The Bimmer? He insisted on this bit image manipulator acronym because it did so much more than the block image transfers of a Blitter. As we shall see.

Now, depending on design decisions, such as other effects running, or showing a big 32-color logo, full 1px resolution was perhaps not possible. Decreasing the font height, and deciding that the sine scroller should be the main effect - and optimizing the loop that blitted the thin font-character-slices - would eventually make 1px resolution sine scrollers possible.
A fast sine scroller would not copy slices from the font, but let the text scroll into a rectangular buffer as usual, and THEN copy the slices to different Y positions on the screen, in order to distribute the character lookups. In our example, the source bitmap (with the secret message!) represents this scroll buffer.
Optimizations include eliminating as many re-sets of Blitter registers as possible in the inner loop, making every 16th blit a copy-blit instead of an OR-blit, and expanding the scroll buffer so that the blank padding space above and below would clear the background; another trick to avoid a slow full-screen clear.
    *--- init for Sine ---*
	bsr.w WaitBlitter
	move.w #$ffff,BLTALWM(a6)	;for 1-word blits, masks are ANDed together.
	move.w #bpl-2,BLTBMOD(a6)	;B modulo, screen
	move.w #fontbpl-2,BLTAMOD(a6)	;A modulo, scrollbuffer font bitplane width
	move.w #bpl-2,BLTDMOD(a6)	;D modulo, screen
	clr.w BLTCON1(a6)		;BLTCON0 varies, this doesn't.

	lea ScrollBuffer+paddingh*fontbpl,a0
	lea 120*bwid(a2),a2		;dest addr ptr (increasing in x)
	move.w #fonth*64+1,d4		;normal blit size

    *--- sine scroller loop ---*

	moveq #w/16-1,d7
.wordl:
	lea SliceMasks(PC),a5		;easy table to save 16 x ror.w in-loop

	lea -paddingh*fontbpl(a0),a3
	lea -paddingh*bwid(a2),a1	;copy an extra high slice=clear around.
	add.w (a4)+,a1
	move.w #$09f0,BLTCON0(a6)	;first pixel slice of 16:copy operation
	move.w (a5)+,BLTAFWM(a6)	;slice mask
	move.l a3,BLTAPTH(a6)		;ScrollBuffer source
	move.l a1,BLTDPTH(a6)		;Screen destination
	move.w #(fonth+paddingh*2)*64+1,BLTSIZE(a6)

	move.w #$0dfc,BLTCON0(a6)	;then, OR

	REPT 15				;...15 slices to the screen.

	move.l a2,a1
	add.w (a4)+,a1
	move.w (a5)+,BLTAFWM(a6)
	move.l a1,BLTBPTH(a6)		;Screen source for OR
	move.l a0,BLTAPTH(a6)
	move.l a1,BLTDPTH(a6)
	move.w d4,BLTSIZE(a6)
	ENDR

	addq.w #2,a0
	addq.w #2,a2
	dbf d7,.wordl

SliceMasks:				;single bits being shifted rightward.
	dc.w $8000,$4000,$2000,$1000
	dc.w $800,$400,$200,$100
	dc.w $80,$40,$20,$10
	dc.w $8,$4,$2,$1
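To illustrate what the sixteen masked blits per word achieve, here is a simplified Python model, not a translation of the routine. It takes one 16-pixel-wide column of the scroll buffer and scatters its sixteen 1-pixel slices to individual Y offsets. Unlike the real loop, it starts from a cleared destination and ORs every slice, whereas the routine above makes the first slice a taller copy blit so that the padding clears the old image. Function and parameter names are hypothetical.

```python
def slice_masks():
    """The sixteen single-bit masks of SliceMasks, leftmost pixel first."""
    return [0x8000 >> i for i in range(16)]


def sine_slice_word(column, y_offsets, screen_height):
    """Scatter one 16-pixel-wide column of words to per-pixel Y offsets.

    column        -- list of 16-bit words, one per font row
    y_offsets     -- sixteen row offsets, one per pixel slice
    screen_height -- number of rows in the destination column
    Returns the destination column as a list of words.
    """
    dest = [0] * screen_height
    for mask, y in zip(slice_masks(), y_offsets):
        for row, word in enumerate(column):
            # Keep only this slice's pixel and OR it into the destination.
            dest[y + row] |= word & mask
    return dest
```

With all sixteen offsets equal, the column arrives unchanged; with offsets taken from the sine table, each pixel column lands at its own height, which is the whole effect.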

These optimizations together allow a full-screen, 25px high sine scroller, and there is rastertime to spare.
Note that to show the principle and general optimization techniques, I've chosen the second fastest sine scroller technique. Know that there is also the C2P-shift type sine scroller - and that the source here can be adapted to handle several bit-slices in a single blit by widening the masks, too, which optimizes it substantially. Try it!
Both sources are based on the usual startup code, for the first time utilizing its double-buffering. Buffering is required for heavy full-screen effects, in order to not show flickery artifacts (as the raster sweeps down the screen displaying the buffer currently being drawn into).