Wednesday, August 15, 2012

Parallax Propeller + COSMAC 1802: the Saga Continues...

Monday morning I posted about hooking up a Parallax Propeller QuickStart Board to an 1802 Microprocessor. My hopes are to make the Propeller behave as a RAM for the 1802, primarily to be able to display a bitmap of that segment of memory as video. The idea is to make it compatible software-wise with the 1861 Pixie chip (the 1802's native video adapter), which is presently unavailable.

Since then I've spent more time on this project. A lot of that time got tied up in side issues related to tying a Propeller to an 1802, as well as some time spent dealing with the test equipment I want to use to keep an eye on the circuit and see what it's up to.

First, here's how it looks tonight:

Propeller QuickStart hooked up to an old CDP1802 microprocessor.
The second stage of the Propeller/1802 mashup circuit. Click for full size.
I've got a closer relationship between the two processors now. The Propeller is actually doing something. Before, the 1802 was getting power and a fixed clock signal off the QuickStart board, but the Propeller was not itself involved.

Clocking the 1802
One thing I put a fair bit of time into (too much, that is) is using the Propeller to control the clock of the 1802. There are a few reasons I think this would be nice:

The clock can be varied in software depending on the user's desire and the individual 1802's capabilities.
At some later point, the Prop can have a greater degree of control over the 1802's operation in general, with the ability to stop it, start it, and even single-step it.

On my first stab at this a couple of days ago, I just threw a repeat loop into the Prop's code to pulse an output line. I didn't have very good control of the rate, I didn't have the constants set right, and it didn't work right off the bat so I decided to drop it and just use another clock.

Let's Try a Timer, Because...That's What They're For
This time I took time to read up on using the counter/timers on the Prop. Each cog has two of them (CTRA and CTRB), and I played around with one for a while before feeding its output to the 1802's clock input. After a while I was able to get the results I wanted, and was able to fine-tune that bit of the code.

Then I hooked it up to the 1802 with a frequency of 1.25MHz...well, I should say I hooked it up to a 4049, passed it through a pair of inverters to square it up a bit and buffer it. Then I passed the signal to the 1802. Which ran just fine.
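
For reference, the arithmetic behind using a counter as a numerically controlled oscillator can be sketched roughly as follows. This assumes the counter runs in NCO mode, where a 32-bit phase register accumulates a frequency value once per system clock and the output pin follows its top bit, and it assumes an 80MHz system clock. It's written in Python rather than Spin, and the function names are made up for illustration.

```python
# Sketch only: frequency-register arithmetic for a 32-bit NCO.
# Assumes an 80 MHz system clock; all names here are illustrative.

SYS_CLOCK_HZ = 80_000_000

def nco_frq_value(target_hz, sys_clock_hz=SYS_CLOCK_HZ):
    """Value added to the 32-bit phase accumulator on every system clock."""
    # Rounded integer form of target_hz * 2**32 / sys_clock_hz
    return ((target_hz << 32) + sys_clock_hz // 2) // sys_clock_hz

def nco_actual_hz(frq_value, sys_clock_hz=SYS_CLOCK_HZ):
    """Output frequency that a given accumulator increment produces."""
    return frq_value * sys_clock_hz / 2**32

frq = nco_frq_value(1_250_000)
print(hex(frq), nco_actual_hz(frq))
```

Changing the target frequency is then just a matter of changing one constant, which is what makes sweeping the 1802's clock from software so convenient.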

At this point I still had the data bus of the 1802 hard wired to a $C4, which is a NOP instruction for the 1802. That way I can watch LEDs on the address bus and see that the chip is running through its address space. The pulse rate of the high order LED gives me an immediate idea of how fast the 1802 is getting clocked, too.

Then I started playing with the constant for the counter's frequency to see what speeds this 1802 chip would be good at with a 5V Vdd and 3.3V Vcc (the 1802 can use split supply voltages for its core and I/O. Not bad for 1976, eh?)

Overclocking! 4.8MHz Baby! Yeah, uh, Megahertz.
This one ran up to 3.6MHz without a problem. I didn't run it up until it stopped; the other day I think I had it running at about 4MHz when I was playing with the repeat loop. Above 3.6MHz it seemed to start running hot, so I stepped it back down again.

Above 3.6MHz it was noticeably warm after a while, and seemed to be getting warmer. Between 3.2 and 3.6MHz it was warm, but maintaining its temperature just fine. At 3.2MHz it was solid as a rock and the heat was barely detectable to a calibrated fingertip. If I can get it to run at this speed reliably with the Propeller providing video then popular Elf software like Tiny Basic, Forth, CHIP-8, and the programs written in them are going to rock on this system.

Edit: I've run it up to 4.807MHz now, and it didn't seem all that hot this time. It really shouldn't have any problem with heat at these voltages, since it normally runs at 5MHz at higher voltages. And now it doesn't seem to have a problem. When I tried to take it over 4.8MHz it ran sometimes but not others, or hung occasionally. So this is the limit for stable operation at this voltage for this chip (an RCA CDP1802CE.)

Getting Data to the 1802

OK, so back to the problem of making the Propeller look like a RAM. Frankly, when I did my first look at the timing involved, I figured this would be completely trivial to implement. Push in data and address bus wires, connect up /MRD (memory read signal) from the 1802 to the Prop, crank a little Spin code, and I'd have the Prop acting like a ROM.

Put in /MWR and a little more code, BAM, the Prop is a RAM.

Hook up TPA to catch the high order address byte, tweak the code, and the Prop is ready to map more than 256 bytes. All I'd have to do is add the video code then figure out how much Prop RAM was available then expand to fit.

Well, it wasn't that easy. If it had been, I'd be playing with the video now.

I tried to leap forward at first. I plugged in the address and data bus wires, plugged in /MRD, and added some code to my clock driver program. It had a 256 byte array that it set up and initialized to $C4 in every position (NOP again, but this time it would be a soft NOP rather than a hard-wired one.) Then I wrote the program to wait for the Propeller I/O line I'd selected for /MRD to go low. Once that happened, it would get the address off the address bus, look up the appropriate element in the RAM array in memory, and put it out on the data bus until /MRD came back up again.
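
The logic of that first attempt can be modelled in a few lines. This is only a software sketch in Python, not the Spin code that ran on the Prop: pin sampling is replaced by plain function arguments, and the function name is an assumption made for illustration.

```python
# Sketch only: a software model of the read-only "RAM" described above.
# Real pin I/O is replaced by function arguments; names are illustrative.

NOP = 0xC4  # 1802 NOP opcode

ram = bytearray([NOP] * 256)  # 256-byte array, every position a NOP

def service_bus(mrd_n, address):
    """Return the byte to drive on the data bus, or None to leave it undriven.

    mrd_n is the level of the active-low /MRD line (0 means a read is requested).
    """
    if mrd_n == 0:
        return ram[address & 0xFF]  # only the low 8 address bits are used
    return None

print(hex(service_bus(0, 0x12)), service_bus(1, 0x12))
```

The model says nothing about timing, which is exactly where the real thing ran into trouble.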

Easy-peasy, right?

Well, it didn't work.

Oscilloscope signal trace of one of the data lines and /MRD on the Propeller/1802 mashup.
Top: Data from Propeller pretending to be a memory. Bottom: Memory Read Request. Result: Memory Too Slow! (Click for full size.)

When I was doing my timing calcs, I was looking at the 1802's leisurely timing: it doesn't mind waiting on the order of a microsecond for its memory to come back when it's running at its normal speed (about 1.78MHz.) Hey, an 80MHz processor could be running a version of interpreted BASIC written in LISP and keep up with that, right?

I guess not. At least not the way I did it.

I clocked the 1802 at 1.25MHz for the test. This is an easy, slow speed that the Propeller can generate easily. I figured it'd work, I'd push up the frequency to 2.5MHz, check timings on the scope, and discover that I couldn't make the 1802 go too fast for the Prop.
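
Here's the rough budget in numbers. The sketch assumes an 1802 machine cycle is 8 clock periods, the Propeller runs at 80MHz, and a typical Propeller assembly instruction takes 4 system clocks. It deliberately ignores Spin interpreter overhead and hub access waits, so it's a best case, not a prediction.

```python
# Sketch only: best-case timing budget for serving one 1802 machine cycle.
# Assumptions: 8 clocks per 1802 machine cycle, 80 MHz Propeller,
# 4 system clocks per Propeller assembly instruction.

PROP_CLOCK_HZ = 80_000_000
CLOCKS_PER_1802_MACHINE_CYCLE = 8
CLOCKS_PER_PROP_INSTRUCTION = 4

def machine_cycle_ns(cpu_hz):
    """Length of one 1802 machine cycle in nanoseconds."""
    return CLOCKS_PER_1802_MACHINE_CYCLE * 1e9 / cpu_hz

def prop_instructions_per_machine_cycle(cpu_hz):
    """How many Propeller assembly instructions fit in one 1802 machine cycle."""
    prop_instruction_ns = CLOCKS_PER_PROP_INSTRUCTION * 1e9 / PROP_CLOCK_HZ
    return machine_cycle_ns(cpu_hz) / prop_instruction_ns

for hz in (1_250_000, 2_500_000):
    print(hz, machine_cycle_ns(hz), prop_instructions_per_machine_cycle(hz))
```

Only part of each machine cycle is actually available for the memory to respond, so the usable share is smaller than these totals.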

Unfortunately the trace above shows different.

I Can Do Lots of Things Wrong, All at Once
I'm sure this is the result of something I'm doing improperly. I can think of several possibilities already:

I'm using WAITPNE and WAITPEQ to respond to the pin change. Perhaps, because they do some power-saving activity on the cog on which they're invoked, they just aren't meant to respond this quickly. Maybe if I just go to an active polling loop I'll be OK?

I'm using Spin, an interpreted language. Perhaps I need to just insert a bit of Propeller assembly to tighten up the timing?

Perhaps there's some option or configuration thing I've not done?

Maybe this cog has to do something active with the timer/counter (I'm using the same cog as the one that's running the 1802's clock.) Perhaps if I move the RAM functions to another cog I'll be fine?

Those are just what comes off the top of my head. I'm far from being an expert on the Propeller, and I haven't picked up the books on it yet, just the free manual and datasheet downloads. But things like this create an opportunity to learn.

I also had a few other things I needed to figure out on the way, minor little things like the order you mention the output pins in when writing your code statements (I was getting $23 out instead of $C4 initially. Yeah, oops.) That and the whole thing is a rather fragile lash-up right now. I'm going to go to solder as soon as I get the RAM moving data to and from the 1802. But not before, because I hate rework, and if I don't test it on a breadboard first, there'll be rework. And this lets me do a few things like change which lines I'm using for address and data easily. I moved the data lines to P16..23 so that I can watch the data on the QuickStart board's LEDs, like a little front panel. Before, I had the address lines here, but I've already got LEDs on the breadboard showing me the address line states. So a quick change of a few constants in software, and I get a data display with no re-soldering.
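
The $23-for-$C4 mixup is what you'd expect if the eight data bits come out in reversed order. A quick check in Python (the helper name is just for illustration):

```python
# Sketch only: show that reversing the bit order of $C4 gives $23.

def reverse_bits8(value):
    """Return the 8-bit value with its bit order mirrored (bit 0 <-> bit 7, etc.)."""
    result = 0
    for bit in range(8):
        if value & (1 << bit):
            result |= 1 << (7 - bit)
    return result

print(hex(reverse_bits8(0xC4)))  # prints 0x23
```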

Slightly Off-Task Tasks
I also took some time out to hunt down a line cord for the new frequency counter I got at the Ham Radio Club night before last, which came without one (found one, after searching high and low. It's an oddball, not an HP cord.) And I checked out the two portable NonLinear Systems oscilloscopes I bought, too (one works, one doesn't, as expected.) The frequency counter was very nice to have while I was figuring out how to use the Prop's counter/timer as a numerically controlled oscillator, so the time was well spent even if it did eat into my Propeller-as-RAM time.

Looking Forward to Round 3
I'll be back at this before the end of the week (I'd like to tomorrow, but it'll be a busy day so I may not be able to.) If you have any suggestions or salient experience, your comments or emails would be much appreciated!

Edit:

I've had a look at the Propeller docs. It looks like the timing of hub instructions is my problem. I may just insert WAIT states for the 1802 and see what that does.

Monday, August 13, 2012

Parallax Propeller + Cosmac Elf = ?

I've started working on one implementation of an idea I've had for a while...

There's this neat old 70's computer system called the COSMAC Elf. It's like a lot of the microprocessor trainer systems of the time, but it's got some unique abilities that make it a bit more interesting to build, and expand on, than some of the others.

Video output being one of the biggies.

Step One
Before I go further, here's a look at an early step in implementing my idea:
An 1802 CPU on a breadboard connected to a Propeller QuickStart board for power and clock.
A first step: Power and Clock from the QS Board to the 1802. (Click for bigger image.)

Here I've gotten to the point of getting clock and two levels of power from the QuickStart board to the 1802 CPU. First, I had a crystal oscillator circuit providing a clock to the 1802. Only 5V power came from the Vin on the QuickStart board.

Then, I split the voltages. The crystal oscillator, the inverter for the clock signal, and Vdd for the 1802 were split off with 5V power, and the rest of the circuit was put on the 3.3V power from the QuickStart board. At this point, I'd been running the 1802 at 1MHz, slow enough I could easily watch the LEDs on the address lines changing as it ran.

Then I moved up to a 2MHz oscillator. The 1802 was still good with that, with its Vdd at 5V and Vcc at 3.3V.

Then I tried to get fancy.

I took an output off the Prop and tried to use a repeat-wait structure to clock the 1802. It worked, up to a point. But I got to where I was unsure of my actual frequency, and the 1802 stopped running at a slower speed than I expected (I thought.) In fact, I was getting too clever and messing myself up. Looking back, I was probably somewhere above 4MHz when the 1802 refused to run any more!

After a while I realized that, and just decided to put the crystal oscillator back in.

Then I took another look, and decided that a 5MHz XI off of the QS board could be used as a clock base. A 2.5MHz clock would be fine (actually, anything from 1.76MHz on up would be fine.) Most Elf computers run somewhere around 1.76 to 1.79MHz to accommodate the clock for the video IC they use. Getting at least that speed is pretty much a must for me to feel like this project is going where I want it to. But getting a faster clock would be even better, as we'll see.

First I dropped in a CMOS part--a 4013--to act as a divider for the 5MHz clock to drop it to 2.5MHz for the 1802. I forgot that at 5V the 4013 only really works up to 4MHz on its input. So that turned out to be a waste of time.

Then I dug out a small supply of 74AC74 ICs, which work fine at 5MHz and well above. It worked fine, dividing the 5MHz down to 2.5MHz. In fact, just to be a little conservative, I used both flip flops to divide the clock down to 1.25MHz, then ran that to the inverter I'd had the crystal oscillator going to.

That worked, so I tried the 2.5MHz output. At first the 1802 wouldn't run. I pushed the Reset switch, and noticed the clock took off when I bumped the Ground wire. Once the ground wire was back in its place securely, the 1802 ran just fine at 2.5MHz.

Then I bypassed the 4049 inverter, since the signal from the 74AC74 is plenty strong enough to drive the 1802 by itself.

That was step one. Time for a break before step two.

The Plan
So why hook up an old CPU to a fast modern CPU like the Propeller?

Because of a problem with getting chips for the old Elf computer.

People still like building the old Elf computer. It's a complete computer system that can be built in an afternoon if you've got all the parts and tools at hand (I built my first in about four hours.) It's a computer that pretty well exposes all of its parts to examination, so it's easy to learn how it works, and to understand all the bits of the system.

The video IC, called the CDP1861 Pixie chip, is one of the simplest video ICs ever made. It's basically some timing and control signals wrapped around a shift register that works with the DMA mode of the 1802 to produce a really nifty one chip video interface.

It's not exactly workstation graphics, being monochrome with resolutions ranging from 64 x 32 to 64 x 128. But it does the job. People program the system using this quality of video. And you thought the Vic-20 was low-res!

Well, the problem is that in the past few years Pixie chips have pretty well become Unobtanium (a term that goes way back before the movie Avatar, by the way.) In other words, you can't get 'em. There have been a couple of less than optimal replacements (from the perspective of new Elf builders who have to make them up themselves rather than just buy one pre-made.)

I'm trying to come up with something slightly less suboptimal. And solve a problem that the Pixie chip has.


The Third Cycle...of DOOM

The Pixie's problem is that its timing only deals well with 1802 programs that use instructions that take two machine cycles or less to complete. There are two groups of instructions, the Long Branches and Long Skips, that take 3 cycles. They create jitter in the video by throwing its timing off.

The Elf's video is basically just a straight bitmap of a chunk of memory on the screen (the lowest resolution, 64 x 32, is a map of the Elf's base 256 Bytes of memory straight to video.) So, if some other circuit could just read a relatively new, fast RAM in the time when the 1802 isn't reading it, then the other circuit could just pull the data then run it to video, and leave the Elf none the wiser.
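
To make the "straight bitmap" idea concrete, here's a small Python sketch that turns 256 bytes into a 64 x 32 picture, 8 bytes per row. Putting the most significant bit of each byte on the left is an assumption on my part, and the names are illustrative.

```python
# Sketch only: render a 256-byte block of memory as a 64 x 32 bitmap.
# Assumption: within each byte, the most significant bit is the leftmost pixel.

WIDTH, HEIGHT = 64, 32
BYTES_PER_ROW = WIDTH // 8  # 8 bytes make one 64-pixel row

def render(memory):
    """Return HEIGHT strings of WIDTH characters, '#' for a set bit, '.' for clear."""
    rows = []
    for y in range(HEIGHT):
        row_bytes = memory[y * BYTES_PER_ROW:(y + 1) * BYTES_PER_ROW]
        bits = "".join(f"{b:08b}" for b in row_bytes)
        rows.append(bits.replace("1", "#").replace("0", "."))
    return rows

memory = bytearray(256)
memory[0] = 0x80    # top-left pixel
memory[255] = 0x01  # bottom-right pixel
for line in render(memory):
    print(line)
```

Anything the 1802 writes into that block of memory shows up on screen with no further effort from the 1802.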

That way all the Elf has to do is manipulate the data in the section of memory on the screen. And a 3-cycle instruction won't cause any problems. And the 1802 gets about 40% of its time back, during which it would otherwise have been doing DMA of that memory data to the Pixie chip's shift register.

So:
Video that doesn't need the unobtainable Pixie,
No 3-cycle instruction timing issues,
No loss of 40% processing time to video DMA, and
Able to run at higher clock speeds.

Sounds like a winner, right?

Implementation Details

Next was how to actually implement it. I've looked at several ways, with various advantages and disadvantages.

Using faster RAMs was a first building block I looked at. For example, a static RAM pulled out of a 486 motherboard's cache RAM would be more than fast enough. Both 20nS and 15nS are readily available. Plenty fast to grab a byte once every 1802 instruction cycle for the external video system.

Then came my initial thought: maybe use an Atmel AVR microcontroller to do the grabbing, put the data into its internal RAM, and use that as a frame buffer for some bit-bashed monochrome video. No big deal.

No big deal if you've already got the ability to program AVRs or are prepared to supply them preprogrammed. Still, not a bad solution. Just not likely to be as popular as I'd like because of the hurdle of programming the chip. The idea wasn't just considered with an AVR; any of a number of uC families could work. But they had the same problem.

Another idea was to build a circuit from random logic. Not as appealing, with my schedule, but if I could be the pioneer on this and put something together then anyone could order the parts and start wiring. It would probably add anywhere from a third to double the work of assembling an Elf computer. Again, not perfect, but possible.

Then I thought about taking the AVR idea and putting in a Propeller board. The advantage here is that, rather than getting a bare microcontroller and having to get the infrastructure to program it to do the One Job that that user may ever use it for, the board itself is all the infrastructure needed. (Yes, Arduino occurred to me, too.)

A download on a computer, a USB cable of the correct type (I'm using a cell phone charging cable), and you're in business. Even if the user never does anything with a Propeller again (what an unfortunate thought!) then they wouldn't be out anything but a bit of their time to get the "part" they need programmed for the job.

And it's less time than hand-wiring random logic on a perf board, no matter how you look at it.


It Just Keeps Getting Better

So, I started with the idea that the speedy (80MHz) Propeller would have no problem sneaking in and reading bytes out of the Elf's RAM during those long, lazy ~200nS slack periods. Then it could put the data into an internal frame buffer.

And, what hey!, the Propeller has built-in video. I could make it so that the final Propeller program puts out Composite baseband, Composite broadcast, and VGA all at once. What a deal! The user doesn't even have to pick between different programs based on what sort of video they want right now.

Then came the next idea. Replace a chunk of the Elf's RAM entirely. Let the Propeller be a chunk of RAM.

It can't replace all the possible RAM of an Elf in its present version. It's only got 32K of RAM in it, and it needs some of that for any applications it's running.

But, it can replace a chunk of the Elf's RAM. Enough for Pixie-quality video. And with some lines used for control, the video resolution and frame buffer location within that memory can be changed. I don't know yet, but it seems like mapping somewhere from 2K to 4K wouldn't be that difficult.
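
A quick check of how much memory the frame buffer needs at the Pixie resolutions mentioned earlier, assuming one bit per pixel. The 64 x 64 entry is included as an assumed in-between mode.

```python
# Sketch only: frame buffer size for 1-bit-per-pixel video modes.

def frame_buffer_bytes(width, height):
    """Bytes needed to hold width x height monochrome pixels."""
    return width * height // 8

for w, h in ((64, 32), (64, 64), (64, 128)):
    print(f"{w} x {h}: {frame_buffer_bytes(w, h)} bytes")
```

Even the largest of these fits comfortably inside a 2K to 4K window.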


Pixie Compatibility

At first, I'm going to concentrate on a limited memory map, and on replicating the basic Pixie mode of 64 across by 32 high. In spite of the fact that there won't be any actual DMA transfer needed, it should be able to display video from the basic Pixie video programs like the iconic Starship Enterprise video from 1977 without modification. They'll just run faster as a result of no DMA overhead. I think.

Then, looking at the exact details of how an expanded Elf uses the other modes (if it does; I've never done so myself) will let me work out how to expand the Propeller's memory map area and respond to the Elf's control signals to do that.

So, if I can do what the Pixie does without requiring DMA, I'm already getting a system that wins back the roughly 40% of its time it used to spend on video DMA, even if I don't move up the clock speed from ~1.79MHz. (The 1802 was the original overclocker's chip, way back in the 70's, but that's another story.) If I can increase the clock speed even more, with no effect on the video (since a hard-clocked Pixie chip isn't there any more), then I've got a system that'll run such things as Chip-8 and Tiny Basic that much faster than an original Elf with a Pixie.

If the 2.5MHz setup turns out to work (I see no obstacles at present, but that just means I haven't run into them yet), then I'm getting a system that should run about 2.3x faster than the stock Elf. It'll still be no speed demon (that's not the point), but it'll be nicer to use.
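
One way to reproduce the 2.3x figure, assuming the stock Elf loses about 40% of its time to video DMA and runs at about 1.79MHz. This is a back-of-the-envelope sketch, and the variable names are illustrative.

```python
# Sketch only: combined speedup from a faster clock and no video DMA.
# Assumptions: stock clock ~1.79 MHz, ~40% of stock CPU time lost to DMA.

STOCK_CLOCK_MHZ = 1.79
NEW_CLOCK_MHZ = 2.5
DMA_FRACTION = 0.40

clock_gain = NEW_CLOCK_MHZ / STOCK_CLOCK_MHZ  # gain from the faster clock alone
dma_gain = 1 / (1 - DMA_FRACTION)             # gain from getting the DMA time back
speedup = clock_gain * dma_gain

print(round(clock_gain, 2), round(dma_gain, 2), round(speedup, 2))
```

Note that getting 40% of the time back is worth more than a 40% speedup on its own, since the work that used to fit in 60% of the time now has all of it.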

Next Steps, Baby Steps

I'm going to write a program for the Prop to make it pretend to be a RAM for the 1802. It will be a simple 256 byte memory. That avoids the multiplexed signals for the address. It'll (hopefully) receive and store data bytes from the 1802, and deliver data from its store on request.

The first pass is going to be really, really simple. I'm going to set up an array, put in a short program to blink an LED on the 1802's 'Q' output, and set it up to respond to the 1802's memory read requests. No writes required. If that works, then I'll add write capability and put in a program to test that (again using Q as an output.)
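
As a sketch of what that array might hold, here's a minimal memory image in Python: NOPs everywhere, with a tiny Q-toggling loop at the bottom. The layout and names are my own illustration, not final code. With no delay loop, Q will toggle far too fast to see on an LED, so this version is really one for the scope.

```python
# Sketch only: a 256-byte memory image with a minimal Q-toggling loop.
# 1802 opcodes: NOP = $C4, REQ = $7A, SEQ = $7B, BR = $30.

NOP, REQ, SEQ, BR = 0xC4, 0x7A, 0x7B, 0x30

def build_image(size=256):
    """Return a memory image filled with NOPs, with the blink loop at address 0."""
    image = bytearray([NOP] * size)
    program = bytes([
        SEQ,       # 00: set Q
        REQ,       # 01: reset Q
        BR, 0x00,  # 02: branch back to address 00
    ])
    image[:len(program)] = program
    return image

image = build_image()
print(image[:8].hex())
```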

If that works, I'll proceed to give the 1802 some more sophisticated output and input capability and see where it goes from there. But best not to plan in too detailed a fashion too far ahead until the immediate problems are solved.