# The Kolakoski Sequence in BASIC

To me, the argument about structure vs. unstructured programming languages and the general bashing of BASIC has missed an important element.

BASIC takes some of its approach from FORTRAN (which was short for FORmula TRANslator). One thing I have always enjoyed was trying to solve various math puzzles in BASIC. In the 1980s, BASIC was for fun and many business applications. You could also write a lot of games and puzzles in BASIC (as David Ahl proved). As for the early video games, a lot of the more advanced ones were likely to be written in assembly language.

Recently, the Kolakoski Sequence caught my imagination and I wanted to write a simple program to construct the series.

Details on the sequence can be found various places, but the simplified version paraphrased from Wikipedia is, “The Kolakoski sequence is an infinite sequence of symbols {1,2} that is its own run-length encoded value. It was initially named after the William Kolakoski (1944–97) who discussed it in 1965, but subsequent research has revealed that it first appeared in a paper by Rufus Oldenburger in 1939.

Run-length encoding simply means that the value represent how long or how many instances of a particular value lasts or exist in a series.

So let’s look at the start of the sequence:

1,2,2,1,1,2,1,2,2,1,2,2,1,1,2,1,1,2,2,1,…

Now for some simple run-length encoding, how many 1s in a row, how many 2s in a row, how many 1s (for a second time) in a row, and how many 2s (for a second time) in a row? That would be:

1,2,2,1,…

When we run-length encode the first six digits of the sequence, they yield the first four digits of the sequence.

We can “unencode” our four digit series by saying the first number represents how many times 1 is repeated, the second number represents how many times 2 is repeated, the third digit goes back to the number of times 1 is repeated, and the fourth digit goes back to how many time 2 is repeated. So our 1,2,2,1,… becomes:

1,2,2,1,1,2,…

Using this general approach we can write a simple program that will “unencode” or “expand” the series starting with nothing more than the first two digits. Below is a simple BASIC program (written for the Brandy BASIC interpreter, but fairly easy to port to any other vintage version of BASIC):

`10 REM CALCULATE KOLAKOSKI SEQUENCE`
`20 REM`
`30 REM BY JAMES MCCLANAHAN, W4JBM`
`40 REM ORIGINAL VERSION MAY 2018`
`50 REM`
`100 S%=50 `
`110 DIM I%(S%+5)`
`120 I%(1)=1:I%(2)=2`
`130 N1%=2:N2%=1`
`140 P1%=2:P2%=2`
`150 FOR X%=1 TO I%(P1%)`
`160 I%(P2%+X%-1)=N1%`
`170 NEXT X%`
`180 PRINT P1%;" - ";`
`190 FOR X%=1 TO P1%`
`200 PRINT RIGHT\$(STR\$(I%(X%)),1);`
`210 NEXT X%`
`220 PRINT:PRINT P2%;" - ";`
`230 FOR X%=1 TO P2%`
`240 PRINT RIGHT\$(STR\$(I%(X%)),1);`
`250 NEXT X%`
`260 PRINT:PRINT`
`270 SWAP N1%,N2%`
`280 P2%=P2%+I%(P1%)`
`290 P1%=P1%+1`
`300 IF P2%<S% THEN GOTO 150`
`399 STOP`

The value of 50 in line 100 shows how many digits maximum we want to calculate the series out to. (We may fall a few digits short and we may actually expand the sequence a few digits further. I’ve included simple logic to deal with that. I’ve also used some common string functions to format the sequence.)

At some point, I may take one of my older memory mapped video systems and use the display memory as the storage area for the sequence. But for now, I’ve accomplished basically (pun intended) what I set out to do: Demonstrate how the sequence can be calculated to an arbitrary length (limited, in our case, by computer memory or storage) given just the first two values of the sequence and the rules used to develop it.

Nothing fancy, but the kind of things you can spend an hour with BASIC playing around and learn a bit from. For me, it’s better than the crossword puzzle.

73,
Jim W4JBM

“With a schematic in one hand, a soldering iron in the other, and a puzzled look on his face…”