It’s been a long slog, but the quest is over. I now have a functional-enough GW-BASIC to C# code translator. Using adventure.bas as a guide, I implemented enough features to produce a working version of it reimagined in C#.
The first step was to consume my GWParse library and make the ExampleProgram tests pass again. After a few initial design changes, I went straight to work on each incremental feature. String variables, numeric variables, string arrays (and DIM statements), numeric arrays, CLS, multi-statement lines, 2-D numeric arrays, all sorts of expressions, PRINT, GOSUB/RETURN, RUN, END, INPUT, IF/THEN, LEN, READ/DATA, FOR/NEXT, MID$, LEFT$, negation, and (not too much) more.
There were a few interesting computer-sciency things I hit along the way. For instance, my approach to building subroutines involved looking at start lines (indicated by GOSUB) and end lines (indicated by RETURN). The trick is that, while subroutines will always have exactly one start line (at least in a reasonable program), they could have multiple RETURNs. To handle this, I iterated over an ordered list of end lines and paired each with the corresponding start, as long as the start was not greater (in which case I would pair it with the immediately preceding start instead).
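A minimal sketch of that pairing step, assuming the GOSUB target lines and RETURN lines have already been collected into two ascending lists. The class name `SubroutinePairing` and the tuple shape are made up for illustration and are not taken from the translator itself.

```csharp
using System;
using System.Collections.Generic;

public static class SubroutinePairing
{
    // Pairs each RETURN line with the closest GOSUB target at or before it.
    // Assumption: both lists are sorted in ascending line-number order.
    public static List<(int Start, int End)> Pair(
        IReadOnlyList<int> starts, IReadOnlyList<int> ends)
    {
        var pairs = new List<(int Start, int End)>();
        int s = 0;
        foreach (int end in ends)
        {
            // Move forward only while the next start is not greater than this end;
            // otherwise stay on the immediately preceding start.
            while (s + 1 < starts.Count && starts[s + 1] <= end)
            {
                s++;
            }
            if (starts.Count > 0 && starts[s] <= end)
            {
                pairs.Add((starts[s], end));
            }
        }
        return pairs;
    }
}
```

With starts `{ 1000, 2000 }` and ends `{ 1050, 1090, 2100 }`, the subroutine at 1000 gets both of the first two RETURNs and the one at 2000 gets the last.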
Another complication was how IF/THEN and FOR/NEXT statements would not always appear as the first statement on a line. For example, the line
10 CLS : IF A=1 THEN GOTO 20
should be translated into something like
CLS(); if (A_n == 1) { goto L20; }
whereas my original naïve approach pushed the CLS() inside the if block. The solution involved putting a marker token (I cheated and used null) when IF was encountered, creating a split point between the statements preceding and the statements inside the IF. (The same basic approach worked for FOR/NEXT as well.)
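The split-point idea can be sketched roughly as follows. This is only an illustration: plain strings stand in for the Roslyn syntax nodes the real translator would build, the names `IfSplit`, `SplitAtMarker`, and `EmitIf` are hypothetical, and only a single IF per line is handled.

```csharp
using System;
using System.Collections.Generic;

public static class IfSplit
{
    // Splits a flat statement list at the first null marker.
    // Statements before the marker run unconditionally;
    // statements after it belong inside the if block.
    public static (List<string> Before, List<string> Body) SplitAtMarker(
        IReadOnlyList<string> statements)
    {
        var before = new List<string>();
        var body = new List<string>();
        bool seenMarker = false;
        foreach (string statement in statements)
        {
            if (statement == null && !seenMarker)
            {
                seenMarker = true; // the null marks where the IF began
                continue;
            }
            (seenMarker ? body : before).Add(statement);
        }
        return (before, body);
    }

    // Emits the preceding statements first, then the if block with its body.
    public static string EmitIf(string condition, IReadOnlyList<string> statements)
    {
        var (before, body) = SplitAtMarker(statements);
        string prefix = before.Count > 0 ? string.Join(" ", before) + " " : "";
        return prefix + "if (" + condition + ") { " + string.Join(" ", body) + " }";
    }
}
```

Calling `IfSplit.EmitIf("A_n == 1", new[] { "CLS();", null, "goto L20;" })` keeps `CLS();` outside the block and puts only `goto L20;` inside it.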
With all the above implemented, a valid C# program results from the translation of adventure.bas! That being said, the code is quite obviously machine-generated and has some idiosyncrasies. Take this passage for example:
610 FL=0 : FOR I=0 TO NO-1
620 IF R=(OB(I) AND 127) THEN PRINT " ";OB$(I) : FL=1
630 NEXT I
The C# translation comes out as follows:
FL_n = 0;
I_n = 0;
while ((I_n) <= ((NO_n) - (1)))
{
    if ((((R_n.CompareTo(((int)(OB_na[(int)(I_n)])) & 127)) == 0) ? -1 : 0) != 0)
    {
        PRINT(" " + OB_sa[(int)(I_n)]);
        FL_n = (1);
    }
    I_n = ((I_n) + (1));
}
Because of the way Booleans are handled in GW-BASIC (essentially as 16-bit integers), I used some rather verbose CompareTo ternary constructions to make sure intermediate and final expression results were valid. It is also somewhat more difficult to generate for loops using Roslyn, so I used while loops everywhere.
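To make the Boolean convention concrete: in GW-BASIC a comparison yields -1 for true and 0 for false, and AND works bitwise on those integers. The helper below is a hand-written sketch of that convention, not the generated code; the class `BasicBool` and its members are invented for illustration, and it assumes numeric variables are held as `double`.

```csharp
using System;

public static class BasicBool
{
    // GW-BASIC truth values: -1 (all bits set) for true, 0 for false.
    public const int True = -1;
    public const int False = 0;

    // Equality test that yields a BASIC-style integer instead of a C# bool.
    public static int Eq(double left, double right) =>
        (left.CompareTo(right) == 0) ? True : False;

    // BASIC's AND is bitwise, which doubles as logical AND on -1/0 values.
    public static int And(int left, int right) => left & right;

    // Converts a BASIC integer result into a C# condition (nonzero is true).
    public static bool IsTrue(int value) => value != 0;
}
```

The condition from line 620 would then read something like `BasicBool.IsTrue(BasicBool.Eq(r, ob & 127))`, which is the same shape as the `CompareTo(...) == 0 ? -1 : 0) != 0` chain in the generated output, just with names attached.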
There are a few things I decided not to fix. Since DATA/READ statements in GW-BASIC are not imperatively executed, I used a DATA queue to hold all values read and did not emit code for them. This has the side effect of leaving orphaned comment lines such as in
26100 DATA AN OLD DIARY,DIA,1 : REM OBJECT #0
The end of line comment // OBJECT #0 appears near the end of the program while the data queuing statements land well above in an unrelated area. A similar confusing fate exists for comments appearing right before subroutine lines. Perfect is the enemy of good, as they say.
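For readers unfamiliar with the DATA queue idea, here is one way it could look. This is a guess at the general shape rather than the translator's actual runtime: the `DataQueue` class and its method names are hypothetical, and values are stored as strings and parsed on demand.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public sealed class DataQueue
{
    private readonly Queue<string> _values = new Queue<string>();

    // Called once up front for every DATA statement, in line-number order.
    public void Enqueue(params string[] values)
    {
        foreach (string value in values)
        {
            _values.Enqueue(value);
        }
    }

    // READ into a string variable: take the next value as-is.
    public string ReadString() => _values.Dequeue();

    // READ into a numeric variable: take the next value and parse it.
    public double ReadNumber() =>
        double.Parse(_values.Dequeue(), CultureInfo.InvariantCulture);
}
```

The DATA line above would become a single `Enqueue("AN OLD DIARY", "DIA", "1")` call, and each READ in the program would pull the next value off the front, wherever the original DATA statement happened to sit in the listing.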
As a final note, I actually discovered a few bugs in the original adventure.bas program listing as a result of translating it. It turns out the DROP routine never worked and there was a minor errant GOTO resulting in unreachable code.
So… what to do now with a machine-translated C# program? Why, the ultimate legacy refactoring project, of course! Stay tuned.