There is a free market argument that can be made about how Apple gets to design its own ecosystem and, if it is so restrictive, people will be more hesitant to buy an iPhone since they can get more choices with an Android phone. I get that. But I think it is unfortunate so much of our life coalesces around devices which are so restrictive compared to those which came before.
Recall Apple’s “digital hub” strategy. The Mac would not only connect to hardware like digital cameras and music players; the software Apple made for it would empower people to do something great with those photos and videos and their music.
The iPhone repositioned that in two ways. First, the introduction of iCloud was a way to “demote” the Mac to a device at an equivalent level to everything else. Second, and just as importantly, is how it converged all that third-party hardware into a single device: it is the digital camera, the camcorder, and the music player. As a result, its hub-iness comes mostly in the form of software. If a developer can assume the existence of particular hardware components, they have extraordinary latitude to build on top of that. However, because Apple exercises control over this software ecosystem, it limits its breadth.
Like the Mac of 2001, it is also a hub for accessories — these days, things like headphones and smartwatches. Apple happens to make examples of both. You can still connect third-party devices — but they are limited.
I want to set expectations accordingly. We will build a good app for iOS, but be prepared – there is no way for us to support all the functionality that Apple Watch has access to. It’s impossible for a 3rd party smartwatch to send text messages, or perform actions on notifications (like dismissing, muting, replying) and many, many other things.
Even if you believe Apple is doing this not out of anticompetitive verve, but instead for reasons of privacy, security, API support, and any number of other qualities, it still sucks. What it means is that Apple is mostly competing against itself, particularly in smartwatches. (Third-party Bluetooth headphones, like the ones I have, mostly work fine.)
The European Commission announced guidance today for improving third-party connectivity with iOS. Apple is, of course, miserable about this. I am curious to see the real-world results, particularly as the more dire predictions of permitting third-party app distribution have — shockingly — not materialized.
Imagine how much more interesting this ecosystem could be if there were substantial support across “host” platforms.
LED-based festive decorations are a fascinating subject for exploration of ingenuity in low-cost electronics. New products appear every year and often very surprising technology approaches are used to achieve some differentiation while adding minimal cost.
This year, there wasn’t any fancy new controller, but I was surprised how much the cost of simple light strings was reduced. The LED string above includes a small box with batteries and came in a set of ten for less than $2 shipped, so <$0.20 each. While I may have benefitted from promotional pricing, it is also clear that quite some work went into making the product cheap.
The string is constructed in the same way as one I had analyzed earlier: it uses phosphor-converted blue LEDs that are soldered to two insulated wires and covered with an epoxy blob. In contrast to the earlier device, they seem to have switched from copper to cheaper steel wires.
The interesting part is in the control box. It comes with three button cells, a small PCB, and a tactile button that turns the string on and cycles through different modes of flashing and constant light.
Curiously, there is nothing on the PCB except the button and a device that looks like an LED. Also, note how some “redundant” joints have simply been left unsoldered.
Closer inspection reveals that the “LED” is actually a very small integrated circuit packaged in an LED package. The four pins are connected to the push button, the cathode of the LED string, and the power supply pins. I didn’t measure the die size exactly, but I estimate that it is smaller than 0.3×0.2 mm² ≈ 0.06 mm².
What is the purpose of packaging an IC in an LED package? Most likely, the company that made the light string is also packaging their own LEDs, and they saved costs by also packaging the IC themselves—in a package type they had available.
I characterized the current-voltage behavior of the IC supply pins with the LED string connected. The LED string started to emit light at around 2.7 V, which is consistent with the forward voltage of blue LEDs. The current increased proportionally to the voltage, which suggests that there is no current limit or constant-current sink in the IC – it’s simply a switch with some series resistance.
Left: LED string in “constantly on” mode. Right: Flashing
Using an oscilloscope, I found that the string is modulated with an on-off ratio of 3:1 at a frequency of ~1.2 kHz. The image above shows the voltage at the cathode; the anode is connected to the positive supply. The modulation is most likely there to limit the average current.
All in all, it is rather surprising to see an ASIC being used when it barely does more than flash the LED string. It would have been nice to see a constant-current source to stabilize the light levels over the lifetime of the battery, and maybe more interesting light effects. But I guess that would have increased the cost of the ASIC too much, at which point an ultra-low-cost microcontroller may have been cheaper. This almost calls for a transplant of an MCU into this device…
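If one were to attempt that transplant, the firmware would be trivial. Here is a minimal sketch of what the replacement could look like (AVR register names and the pin choice are my assumptions for illustration; only the ~1.2 kHz, 3:1 modulation is taken from the measurement above):

#define F_CPU 1000000UL
#include <avr/io.h>
#include <util/delay.h>

int main(void) {
    DDRB |= _BV(PB0);           // GPIO acts as low-side switch on the cathode
    for (;;) {
        PORTB &= ~_BV(PB0);     // cathode low: string on
        _delay_us(625);         // ~3/4 of the ~833 us period (1.2 kHz)
        PORTB |= _BV(PB0);      // cathode high: string off
        _delay_us(208);         // ~1/4 of the period
    }
}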
Apple sells two external displays, the Pro Display XDR and the Studio Display, but neither has received hardware upgrades in years. In fact, the Pro Display XDR is nearly five years old, having been released all the way back in December 2019.
This is not surprising, since Apple has historically taken a long time to update its displays. I don’t think the panels necessarily need to be updated. But it’s disappointing because the Studio Display has well-documented camera problems and power issues. I had high hopes that, coming from Apple, it would be reliable as a USB hub, but I end up directly connecting as many storage devices as possible to the meager ports on my MacBook Pro.
Displays are a product category conducive to infrequent updates. The plentiful problems I have been reading about with the Studio Display, in particular, worry me. Most sound like software problems, but that is no consolation. Apple’s software quality has been insufficiently great for years and, so, it does not surprise me that a display running iOS is not as reliable as a display that does not use an entire mobile operating system.
Buoyed by the surprisingly good performance of neural networks with quantization-aware training on the CH32V003, I wondered how far this can be pushed. How much can we compress a neural network while still achieving good test accuracy on the MNIST dataset? When it comes to absolutely low-end microcontrollers, there is hardly a more compelling target than the Padauk 8-bit microcontrollers. These are microcontrollers optimized for the simplest and lowest-cost applications there are. The smallest device of the portfolio, the PMS150C, sports 1024 words of 13-bit one-time-programmable memory and 64 bytes of RAM, more than an order of magnitude smaller than the CH32V003. In addition, it has a proprietary accumulator-based 8-bit architecture, as opposed to a much more powerful RISC-V instruction set.
Is it possible to implement an MNIST inference engine that can classify handwritten digits on a PMS150C as well?
On the CH32V003 I used MNIST samples that were downscaled from 28×28 to 16×16, so that every sample takes 256 bytes of storage. This is quite acceptable when 16 KB of flash are available, but with only 1 kword of ROM, it is too much. Therefore I started by downscaling the dataset to 8×8 pixels.
The image above shows a few samples from the dataset at both resolutions. At 16×16 it is still easy to discriminate different numbers. At 8×8 it is still possible to guess most numbers, but a lot of information is lost.
Surprisingly, it is still possible to train a machine learning model to recognize even these very low resolution numbers with impressive accuracy. It’s important to remember that the test dataset contains 10,000 images that the model does not see during training. The only way for a very small model to recognize these images accurately is to identify common patterns; the model capacity is too limited to “remember” complete digits. I trained a number of different network configurations to understand the trade-off between network memory footprint and achievable accuracy.
Parameter Exploration
The plot above shows the result of my hyperparameter exploration experiments, comparing models with different configurations of weights and quantization levels from 1 to 4 bits for input images of 8×8 and 16×16. The smallest models had to be trained without data augmentation, as they would not converge otherwise.
Again, there is a clear relationship between test accuracy and the memory footprint of the network. Increasing the memory footprint improves accuracy up to a certain point. For 16×16, around 99% accuracy can be achieved at the upper end, while around 98.5% is achieved for 8×8 test samples. This is still quite impressive, considering the significant loss of information for 8×8.
For small models, 8×8 achieves better accuracy than 16×16. The reason for this is that the size of the first layer dominates in small models, and this size is proportional to the number of input pixels, which drops from 256 to 64, a factor of 4, for 8×8 inputs.
Surprisingly, it is possible to achieve over 90% test accuracy even on models as small as half a kilobyte. This means that it would fit into the code memory of the microcontroller! Now that the general feasibility has been established, I needed to tweak things further to accommodate the limitations of the MCU.
Training the Target Model
Since the RAM is limited to 64 bytes, the model structure had to use a minimum number of latent parameters during inference. I found that it was possible to use layers as narrow as 16. This reduces the buffer size during inference to only 32 bytes, 16 bytes each for one input buffer and one output buffer, leaving 32 bytes for other variables. The 8×8 input pattern is directly read from the ROM.
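To illustrate the buffering scheme, here is a sketch in C (the names and the layer count are illustrative, not the actual code): the first layer reads the pixels directly from ROM, and every subsequent layer ping-pongs between the two 16-byte activation buffers.

#include <stdint.h>

extern const uint8_t input_rom[64];                  // 8x8 sample stored in ROM
void fc_first(const uint8_t *rom_in, uint8_t *out);  // 64 inputs -> 16 outputs
void fc_hidden(uint8_t layer, const uint8_t *in, uint8_t *out); // 16 -> 16

static uint8_t buf_a[16], buf_b[16];  // 32 bytes of activation memory in total

void infer(void) {
    uint8_t *in = buf_a, *out = buf_b;
    fc_first(input_rom, in);          // first layer reads directly from ROM
    for (uint8_t l = 1; l < 3; l++) { // "3" layers is an illustrative count
        fc_hidden(l, in, out);
        uint8_t *t = in; in = out; out = t;  // swap input and output buffer
    }
    // 'in' now points to the activations of the last layer (class scores)
}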
Furthermore, I used 2-bit weights with irregular spacing of (-2, -1, 1, 2) to allow for a simplified implementation of the inference code. I also skipped layer normalization and instead used a constant shift to rescale activations. These changes slightly reduced accuracy. The resulting model structure is shown below.
All things considered, I ended up with a model with 90.07% accuracy and a total of 3392 bits (0.414 kilobytes) in 1696 weights, as shown in the log below. The panel on the right displays the first layer weights of the trained model, which directly mask features in the test images. In contrast to the higher accuracy models, each channel seems to combine many features at once, and no discernible patterns can be seen.
Implementation on the Microcontroller
In the first iteration, I used a slightly larger variant of the Padauk Microcontrollers, the PFS154. This device has twice the ROM and RAM and can be reflashed, which tremendously simplifies software development. The C versions of the inference code, including the debug output, worked almost out of the box. Below, you can see the predictions and labels, including the last layer output.
Squeezing everything down to fit into the smaller PMS150C was a different matter. One major issue when programming these devices in C is that every function call consumes RAM for the return stack and function parameters. This is unavoidable because the architecture has only a single register (the accumulator), so all other operations must occur in RAM.
To solve this, I flattened the inference code and implemented the inner loop in assembly to optimize variable usage. The inner loop for memory-to-memory inference of one layer is shown below. The two-bit weight is multiplied with a four-bit activation in the accumulator and then added to a 16-bit register. The multiplication requires only four instructions (t0sn, sl, t0sn, neg), thanks to the powerful bit-manipulation instructions of the architecture. The sign-extending addition (add, addc, sl, subc) also consists of four instructions, demonstrating the limitations of 8-bit architectures.
void fc_innerloop_mem(uint8_t loops) {
    sum = 0;
    do {
        weightChunk = *weightidx++;
        __asm
            idxm a, _activations_idx
            inc _activations_idx+0

            t0sn _weightChunk, #6
            sl a                ; if (weightChunk & 0x40) in = in+in;
            t0sn _weightChunk, #7
            neg a               ; if (weightChunk & 0x80) in = -in;

            add _sum+0, a
            addc _sum+1
            sl a
            subc _sum+1

            ... 3x more ...
        __endasm;
    } while (--loops);

    int8_t sum8 = ((uint16_t)sum) >> 3;  // Normalization
    sum8 = sum8 < 0 ? 0 : sum8;          // ReLU
    *output++ = sum8;
}
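For illustration, this is what one multiply-accumulate step of the assembly above computes, expressed in plain C (a reference rendering, not the code that runs on the device): bit 6 of the weight chunk doubles the activation and bit 7 negates it, yielding exactly the weight set (-2, -1, 1, 2) described earlier.

// One MAC step: 'w' holds the current 2-bit weight in bits 6 and 7.
int16_t mac_step(int16_t sum, uint8_t w, uint8_t activation) {
    int16_t v = activation;     // 4-bit activation, 0..15
    if (w & 0x40) v += v;       // "sl a": weight magnitude 2
    if (w & 0x80) v = -v;       // "neg a": negative weight
    return sum + v;             // "add/addc/sl/subc": sign-extending add
}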
In the end, I managed to fit the entire inference code into 1 kword of memory and reduced SRAM usage to 59 bytes, as seen below. (Note that the output from SDCC assumes 2 bytes per instruction word, while a word is actually only 13 bits.)
Success! Unfortunately, there was no ROM space left for the soft UART to output debug information. However, based on the verification on the PFS154, I trust that the code works, and since I don’t have any specific application in mind, I left it at that stage.
Summary
It is indeed possible to implement MNIST inference with good accuracy using one of the cheapest and simplest microcontrollers on the market. A lot of memory footprint and processing overhead is usually spent on implementing flexible inference engines that can accommodate a wide range of operators and model structures. Cutting this overhead away and reducing the functionality to its core allows for astonishing simplification at this very low end.
This hack demonstrates that there truly is no fundamental lower limit to applying machine learning and edge inference. However, the feasibility of implementing useful applications at this level is somewhat doubtful.
Nilay Patel, of the Verge, interviewed Hanneke Faber, CEO of Logitech, for the Decoder podcast.
NP […] You sell me the keyboard once. It’s got Options Plus. It has an AI button. I push the button, and someone has to make sure the software still works. Someone probably has to pay ChatGPT for access to the service. Where is that going to come from? Are you baking that into the margin of the keyboard or the mouse?
HF Absolutely. We’re baking that in, and I’m not particularly worried about that. What I’m actually hoping is that this will contribute to the longevity of our products, that we’ll have more premium products but products that last longer because they’re superior and because we can continue to update them over time. And again, I talked about doubling the business and reducing the carbon footprint by half. The longevity piece is really important.
I’m very intrigued. The other day, in Ireland, in our innovation center there, one of our team members showed me a forever mouse with the comparison to a watch. This is a nice watch, not a super expensive watch, but I’m not planning to throw that watch away ever. So why would I be throwing my mouse or my keyboard away if it’s a fantastic-quality, well-designed, software-enabled mouse. The forever mouse is one of the things that we’d like to get to.
Faber goes on to say this is a mouse with always-updated software, “heavier” — which I interpreted as more durable — and something which could provide other services. In response to Patel’s hypothetical of paying $200 one time, Faber said the “business model obviously is the challenge there”, and floats solving that through either a subscription model or inventing new products which get buyers to upgrade.
The part of this which is getting some attention is the idea of a subscription model for a mouse which is, to be fair, stupid. But the part which I was surprised by is the implication that longevity is not a priority for business model reasons. I am not always keen to ascribe these things to planned obsolescence, yet this interview sure looks like Faber is outright saying Logitech does not design products with the intention of them lasting for what at least seems like “forever”.
To be fair, I have not bought anything from Logitech in a long time, and I do not remember when I last did. I believe its cable may have terminated in a PS/2 plug. I switched to a trackpad on my desk long ago. When I bought my Magic Trackpad in 2015, I assumed I would not have to replace it for at least a decade; nine years later, I have not even thought about getting a new one. Even if its built-in battery dies — its sole weakness — I think I will be able to keep using it in wired mode.
But then I went on Wikipedia to double-check the release date of the second-generation Magic Trackpad, and I scrolled to the “Reception” section. Both generations were criticized as being too expensive at $70 for the first version, and $130 for the second. But both price tags seem like a good deal for a quality product. Things should be built with the intention they will last a long time, and a $200 mouse is a fine option if it is durable and could be repaired if something breaks.
I know this is something which compromises business models built on repeat business from the same customers, whether that means replacing a broken product or a monthly recurring charge. But it is rare for a CEO to say so in such clear terms. I appreciate the honesty, but I am repelled by the idea.
Yesterday I took the M1 MacBook Pro to my local Apple-authorized service provider that I’ve been going to for many years, who performed all of the work on my Intel MacBook Pro, including the battery replacements and a Staingate screen replacement. This is a third-party shop, not an Apple Store. To my utter shock, they told me that they couldn’t replace the battery in-house, because starting with the Apple silicon transition, Apple now requires that the MacBook Pro be mailed in to Apple for battery replacement! What. The. Hell.
The battery in my 14-inch MacBook Pro seems to be doing okay, with 89% capacity remaining after nearly two years of use. But I hope to use it for as long as I did my MacBook Air — about ten years — and I swapped its battery twice. This spooked me. So I called my local third-party repair place and asked them about replacing the battery. They told me they could change it in the store with same-day turnaround for $350, about the same as what Apple charges, using official parts. It is unclear to me if an Apple Store could replace the battery in-store or would need to send it out, but every Mac service I have had from my local Apple Store has required me to leave my computer with them for several days.
The situation likely varies by geography. Apple’s Self Service Repair program is not available in Canada, which means a battery swap has to be done either by a technician, or using unofficial parts. If you are concerned about this, I recommend contacting your local shops and seeing what their policies are like.
In a recent interview with Marques Brownlee, John Ternus, Apple’s head of hardware engineering, compared ease of repair and long-term durability:
On an iPhone, on any phone, a battery is something […] that’s gonna need to be replaced, right? Batteries wear out. But as we’ve been making iPhones for a long time, in the early days, one of the most common types of failures was water ingress, right? Where you drop it in a pool, or you spill your drink on it, and the unit fails. And so we’ve been making strides over all those years to get better and better and better in terms of minimizing those failures.
This is a fair argument. While Apple has not — to my knowledge — acknowledged any improvements to liquid resistance on MacBook Pros, I spilled half a glass of water across mine in November, and it suffered no damage whatsoever. Ternus’ point is that Apple’s solution for preventing liquid damage to all components, including the battery, compromised the ease of repairing an iPhone, but the company saw it as a reasonable trade-off.
But it is also a bit of a red herring for two reasons. The first is that Apple actually made recent iPhone models more repairable without reducing water or dust resistance, indicating this compromise is not exactly as simple as Ternus implies. It is possible to have easier repairs and better durability.
The second reason is because batteries eventually need replacing on all devices. They are a consumable good with a finite — though not always predictable — lifespan, most often shorter than the actual lifetime usability of the product. The only reason I do not use my AirPods any more is because the battery in each bud lasts less than twenty minutes; everything else is functional. If there is any repair which should be straightforward and doable without replacing unrelated components or the entire device, it is the battery.
[Apple vice president of iPad and Mac product marketing Tom Boger] remained firm: iPads are for touch, Macs are not. “MacOS is for a very different paradigm of computing,” he said. He explained that many customers have both types of devices and think of the iPad as a way to “extend” work from a Mac. Apple’s Continuity easily allows you to work across devices, he said.
So there you have it, Apple wants you to buy…both? If you pick one, you live with the trade-offs. I did ask Boger if Apple would ever change its mind on the touch-screen situation.
“Oh, I can’t say we never change our mind,” he said. One can only hope.
This is fair, and if you were forced to use a touch screen Mac on a vertical screen with no keyboard or mouse to help, then sure, I believe that would be a tiring experience as well. What I find frustrating about this idea is that it lacks imagination. I get the impression that people who hate the idea of touch on Macs can only imagine the current laptops with a digitizer in the screen detecting touch. It’s kind of ironic, but this is exactly the sort of thinking that Apple so rarely does. As we often say, Apple doesn’t add technology for the sake of technology, they add features users will enjoy.
Apple has never pretended the iPad is a tablet Mac. As I wrote several years ago, it has been rebuilding desktop features for a touch-first environment: multitasking, multiwindowing, support for external pointing devices, a file browser, a Dock, and so on. This is an impressive array of features which reference and reinterpret longtime Mac features while respecting the iPad’s character.
But something is missing for some number of people. Developers and users complain annually about the frustrations they experience with iPadOS. A video from Quinn Nelson illustrates how tricky the platform is. One of the great fears of iPad users is that increasing its capability will necessarily entail increasing its complexity. But the iPad is already complicated in ways that it should not be. There is nothing about the way multiwindowing works which requires it to be rule-based and complicated in the way Stage Manager often is.
Perhaps a solution is to treat the iPad as only modestly evolved from its uniwindow roots with hardware differentiated mostly by niceness. I disagree; Apple does too. The company clearly wants it to be so much more. It made a capable version of Final Cut Pro for iPad models which use the same processor as its Macs, but it makes you watch the progress bar as it exports a video because it cannot complete the task in the background.
iPadOS may have been built up from its touchscreen roots but, let us not forget, it is also built up from smartphone roots — and the goals and objectives of smartphone and tablet users can be very different.
What if it really did make more sense for an iPad to run MacOS, even if that is only some models and only some of the time? What if the best version of the Mac is one which is convertible to a tablet that you can draw on? What if the most capable version of an iPad is one which can behave like a Mac when you need it? None of this would be simple or easy. But I have to wonder: has what Apple has been adding for fourteen years produced a system which remains as simple and easy to use as it promises for its most dedicated iPad customers?
The CH32V203 is a 32-bit RISC-V microcontroller. In the product portfolio of WCH it is the next step up from the CH32V003, sporting a much higher clock rate of 144 MHz and a more powerful RISC-V core with the RV32IMAC instruction set architecture. The CH32V203 is also extremely affordable, starting at around 0.40 USD (in the >100 unit bracket), depending on configuration.
An interesting remark on Twitter piqued my interest: supposedly, the listed flash memory size refers only to the fraction that can be accessed with zero wait states, while the total flash size is as much as 224 KB. The datasheet indeed has a footnote claiming the same. In addition, the RB variant offers the option to reconfigure between RAM and flash, which is rather odd, considering that writing to flash is usually much slower than writing to RAM.
The 224 KB number is also mentioned in the memory map. Besides the code flash, there is a 28 KB boot section and additional configurable space. 224 KB + 28 KB + 4 KB = 256 KB, which suggests that the total available flash is 256 KB, remapped to different locations of the memory map.
All of these are red flags for an architecture where a separate NOR flash die is used to store the code and the main CPU core has a small SRAM that is used as a cache. This configuration was pioneered by GigaDevice and is also famously used by the more recent ESP32 and RP2040, although the latter two use an external NOR flash device.
Flash memory is quite different from normal CMOS devices, as it requires a special gate stack, isolation, and much higher voltages. Therefore, integrating flash memory into a CMOS logic die usually requires extra process steps. The added complexity increases when going to smaller technology nodes. Separating both dies offers the option of using a high-density logic process (for example 45 nm) and pairing it with a low-cost off-the-shelf NOR flash die.
Decapsulation and Die Images
To confirm my suspicions, I decapsulated a CH32V203C8T6 sample, shown above. I heated the package to drive out the resin and then carefully broke the now-brittle package apart. After removing just the lead frame, we can already clearly see that it contains two dies.
The small die is around 0.5 mm² in area. I wasn’t able to completely remove the remaining filler, but we can see that it is an IC with a small number of pads, consistent with a serial flash die.
The microcontroller die came out really well. Unfortunately, the photos below are severely limited by my low-cost USB microscope. I hope Zeptobars or others will come up with nicer images at some point.
The die size of ~1.8 mm² is surprisingly small. In fact, it is even smaller than the die of the CH32V003, which is ~2.0 mm² according to Zeptobars’ die shot. Besides moving the flash off-chip, the CH32V203 was most likely also fabricated in a much smaller CMOS technology node than the V003.
Summary
It was quite surprising to find a two-die configuration in such a low-cost device. But obviously, it explains the oddities in the device specification, and it also explains why a 144 MHz core clock is possible in this device without wait states.
What are the repercussions?
Amazingly, it seems that, instead of only the 32 KB of flash listed for the smallest device, a total of 224 KB can be used for code and data storage. The datasheet mentions a special “flash enhanced read mode” that can apparently be used to execute code from the extended flash space. It’s not entirely clear what the impact on speed is, but that’s certainly an area for exploration.
I also expect this MCU to be highly overclockable, similar to the RP2040.
The WS2812 has been around for a decade and remains highly popular, alongside its numerous clones. The protocol and fundamental features of the device have only undergone minimal changes during that time.
However, during the last few years a new technology dubbed “Gen2 ARGB” has emerged for RGB illumination in PCs, backed by the biggest motherboard manufacturers in Taiwan. This extension of the WS2812 protocol allows connecting multiple strings in parallel to the same controller, in addition to diagnostic read-out of the LED string.
Not too much is known about the protocol and the supporting LEDs. However, recently some LEDs that support a subset of the Gen2 functionality became available as the “SK6112”.
I finally got around to summarizing the information I compiled during the last two years. You can find the full documentation on GitHub, linked here.
Years ago I spent some time analyzing candle-flicker LEDs that contain an integrated circuit to mimic the flickering nature of real candles. Artificial candles have evolved quite a bit since then, now including magnetically actuated “flames” for an even better candle emulation. However, at the low end, there are still simple candles with candle-flicker LEDs to emulate tea lights.
I was recently tipped off to an upgraded variant that includes a timer which turns the candle off after it has been active for 6 h and turns it on again 18 h later. For example, if you turn it on at 7 pm, it stays active until 1 am and deactivates itself until 7 pm the next day. Seems quite useful, actually. The question is: how is it implemented? I bought a couple of these tea lights and took a closer look.
Nothing special on the outside. This is a typical LED tea light with a CR2032 battery and a switch.
On the inside there is not much – a single 5 mm LED and a black plastic part for the switch. Amazingly, the switch simply moves one of the LED legs so that it touches the battery. No additional metal parts are required beyond the LED. As previously, there is an IC integrated together with a small LED die in the LED package.
Looking top-down through the lens with a microscope, we can see the dies from the top. What is curious about the IC is that it is rather large, has plenty of unused pads (3 out of 8 used), and seems to have relatively small structures. There are regular rectangular areas that look like memory, a large area in the center with small, random-looking structures resembling synthesized logic, and some parts that look like hand-crafted analog circuitry. Could this be a microcontroller?
Interestingly, the positions of the used pads also look quite familiar.
The pad positions correspond exactly to those of the PIC12F508/9: VDD/VSS are bonded for the power supply, and GP0 connects to the LED. This pinout has been adopted by the ubiquitous low-cost 8-bit OTP controllers that can be found in every cheap piece of Chinese electronics nowadays.
Quite curious: it appears that instead of designing another ASIC with candle-flicker functionality and an accurate 24 h timer, they simply used an OTP microcontroller and molded it into the LED. I am fairly certain that this is not an original Microchip controller, but it likely is one of the many PIC derivatives that cost around a cent per die.
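Just as a thought experiment, the firmware for such a part could be as simple as the sketch below. It is entirely hypothetical and only mirrors the externally observed behavior (6 h of flickering, then 18 h off), with time counted in PWM periods; the helper functions stand in for whatever the real part does.

#include <stdint.h>
#include <stdlib.h>

#define PWM_HZ      125                     // PWM frequency, as measured below
#define ON_PERIODS  (6UL * 3600 * PWM_HZ)   // 6 hours on
#define OFF_PERIODS (18UL * 3600 * PWM_HZ)  // 18 hours off

void led_pwm(uint8_t duty);   // assumed hardware helper
void wait_period(void);       // assumed helper: waits one ~8 ms PWM period

int main(void) {
    for (;;) {
        for (uint32_t t = 0; t < ON_PERIODS; t++) {
            led_pwm(128 + (rand() & 0x3F)); // stand-in for the flicker algorithm
            wait_period();
        }
        for (uint32_t t = 0; t < OFF_PERIODS; t++) {
            led_pwm(0);                     // "off", but the core keeps running
            wait_period();
        }
    }
}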
Electrical characterization
For some quick electrical characterization, I connected the LED in series with a 220 Ω resistor to measure the current transients. This allows for some insight into the internal operation. We can see that the LED is driven in PWM mode at a frequency of around 125 Hz (left picture).
When synchronizing to the rising edge of the PWM signal, we can see the current transients caused by the logic on the IC. Whenever a logic gate switches, it causes a small increase in current. We can see that similar patterns repeat at an interval of 1 µs. This suggests that the main clock of the MCU is 1 MHz. Each cycle looks slightly different, which is indicative of a program with varying instructions being executed.
Sleep mode
To gain more insight, I measured the LED after it had been on for more than 6 h and had entered sleep mode. Naturally, the PWM signal from the LED disappeared, but the current transients from the MCU remained the same, suggesting that it still operates at 1 MHz.
Integrating over the waveform allows calculating the average current consumption. The average voltage across the resistor was 53 mV, and thus the average current is 53 mV / 220 Ω ≈ 240 µA.
Can we improve on this?
This is a rather high current consumption. Employing an MCU with a proper sleep mode would bring this down significantly. For example, the PFS154 allows for around 1 µA of idle current, the ATtiny402 even a bit less.
Given a current consumption of 240 µA, a CR2032 with a capacity of 220 mAh would last around 220 mAh / 0.240 mA ≈ 917 h, or 38 days.
However, during the 6 h it is active, a current of several mA is drawn from the battery. Assuming an average current of 2 mA, the battery would theoretically last 220 mAh / 2 mA = 110 h. In reality, this high current draw will reduce its capacity significantly. Assuming 150 mAh of usable capacity for a low-cost battery, we end up with around 75 h of active operating time.
Now let’s assume we can reduce the idle current consumption from 240 µA to 2 µA (18 h of off-time per day), while the active current consumption stays the same (2 mA for 6 h):
a) Daily battery draw of the current MCU: 6 h × 2 mA + 18 h × 240 µA ≈ 16.3 mAh
b) Optimized MCU: 6 h × 2 mA + 18 h × 2 µA ≈ 12 mAh
Implementing a proper power-down mode would therefore extend the operating life from 9.2 days to 12.5 days (based on the 150 mAh of usable capacity) – quite a significant improvement. The main lever is the active consumption, though.
Summary
In the year 2023, it appears that investing development costs in a candle-flicker ASIC is no longer the most economical option. Instead, ultra-inexpensive 8-bit OTP microcontrollers seem to be taking over low-cost electronics everywhere.
Is it possible to improve on this candle-LED implementation? It seems so, but this may be for another project.
The RGB curtain predictably turns into a mess of wires when not used according to instructions.
As should be obvious from this blog, I am somewhat drawn to clever and minimalistic implementations of consumer electronics. Sometimes quite a bit of ingenuity goes into making something “cheap”. The festive season is a boon to that, as we are bestowed with the latest innovations in animated RGB Christmas lights. I was obviously intrigued when I learned from a comment on GitHub about a new type of RGB light chain that is controlled using only the power lines. I managed to score a similar product to analyze it.
The product I found is shown below. It is a remote-controlled RGB curtain. There are many similar products out there. What is special about this one is that there are groups of LEDs with individual color control, allowing not only global color settings but also animated color effects. The control groups are randomly distributed across the curtain.
Remote controlled RGB curtain (vendor image)
The same type of LED also seems to be used in different products, like “rope light” for outside use. A common indication for this LED type seems to be the type of remote control used, which has both color and animation options (see above).
There seems to be an earlier version of similar LEDs (thanks to Harald for the link) that allows changing the global color setting in a similar scheme, but without the addressability.
Physical analysis
Let’s first take a quick look at the controller. The entire device is USB-powered. There is a single 8-pin microcontroller with a 32.768 kHz quartz crystal – possibly to enable reliable timing (there is a timer option on the remote control) and low-power operation when the curtain is turned off. The pinout of the MCU seems to follow the PIC12F50x scheme, which is also used by many similar devices (e.g. Padauk, Holtek, MDT). The marking “MF2523E” is unfamiliar, though, and I was not able to identify the controller. Luckily, this is not necessary to analyze the operation. There are two power MOSFETs, which are obviously used to control the LED string. Only two lines connect to the entire string, named L- (GND) and L+.
All 100 (up to 300 in larger versions) LEDs are connected to the same two lines. These types of strings are known as “copper string lights” and you can see how they are made here (Thanks to Harald from µC.net for the link!). It’s obvious that it is easier to change the LED than the string manufacturing process, so any improvement that does not require additional wires (or even a daisy chain connection like WS2812) is much easier to introduce.
Close-up images of a single LED are shown above. We can clearly see that there is a small integrated circuit in every light source, along with three very tiny LED chips.
Trying to break the LED apart to get a better look at the IC surface was not successful, as the package always delaminated between the carrier (the tiny PCB on the left) and the chips (still embedded in the epoxy diffusor on the right). What can be deduced, however, is that the IC is approximately 0.4 × 0.6 mm ≈ 0.24 mm² in area. That is actually around the size of a more complex WS2812 controller IC.
LED Characterization
Hooking up the LEDs directly to a power supply caused them to turn on white. Curiously, there does not seem to be any kind of constant-current source in the LEDs. The current changes directly in proportion to the applied voltage, as shown below. The internal resistance is around 35 Ω.
This obviously simplifies the IC a lot, since it basically only has to provide a switch instead of a current source like in the WS2812. It also appears that this allows regulating the overall current consumption of the LED chain from the string controller by changing the string voltage and control settings. The overall current consumption of the curtain is between 300 and 450 mA, right up to the maximum allowable power draw of USB 2.0. Maybe this seemingly “low-quality” solution is a bit more clever than it looks at first glance. There is a danger of droop, of course, if too much voltage is lost over the length of the string.
How Is It Controlled?
Luckily, with only two wires involved, analyzing the protocol is not that complex. I simply hooked up one channel of my oscilloscope to the end of the LED string and recorded what happened when I changed the color settings using the remote control.
The scope image above shows the entire control signal sequence when setting all LEDs to “red”. Some initial observations:
The string is controlled by pulling the entire string voltage to ground for short durations of time (“pulses”). This is very simple to implement, but requires the LED controller to retain information without external power for a short time.
We can directly read from the string voltage whether LEDs are turned on or off.
The first half of the sequence obviously turns all LEDs off (indeed, the string flickers when changing color settings), while the second half of the sequence turns all LEDs on with the desired color setting.
Some more experimentation revealed that the communication is based on messages consisting of an address field and a data field. The data transmission is initiated with a single pulse. The count of the following pulses indicates the value being transmitted, using a simple linear encoding (which seems to be similar to what ChayD observed in his string, so possibly the devices are indeed the same). No binary encoding is used.
Address and data field are separated by a short pause. A longer pause indicates that the message is complete and changes to the LED settings are latched after a certain time has passed.
My findings are summarized in the diagram above. The signal timing seems to be derived from the minimum cycle timing of the 32.768 kHz crystal connected to the microcontroller, as one clock cycle equals ~31 µs (1/32768 Hz ≈ 30.5 µs). Possibly the pulse timing can be shortened a bit, but then one also has to consider that the LED string is basically a huge antenna…
Address field   Function
0               Unused / no function
1 … 6           Addresses one of six LED subgroups (zones); writes the data field value into the RGB latch.
7               Addresses all LEDs at once (broadcast); adds the data field value to the RGB latch content.

RGB latch value   RGB encoding
0 (000)           LEDs off (black)
1 (001)           Red
2 (010)           Green
3 (011)           Yellow (red + green)
4 (100)           Blue
5 (101)           Magenta (red + blue)
6 (110)           Cyan (green + blue)
7 (111)           White
The address field can take values between 1 and 7. A total of six different zones can be addressed with addresses 1 to 6. The data that can be transmitted to the LED is fairly limited. It is only possible to turn the red, green and blue channels on or off, realizing 7 primary color combinations and “off”. Any kind of intermediate color gradient has to be generated by quickly changing between color settings.
To aid this, there is a special function when the address is set to 7. In this mode, all zones are addressed at the same time. But instead of writing the content of the data field to the RGB latch, it is added to it. This allows, for example, changing between neighbouring colors in all zones at once, reducing communication overhead.
This feature is extensively used. The trace above sets the string color to “yellow”. Instead of just encoding it as the RGB value “011”, the string is rapidly toggled between green and red by issuing the commands “7,1” and “7,7” alternately. The reason for this is possibly to reduce brightness and total current consumption. Similar approaches can be used for fading between colors and dimming.
Obviously, the options for this are limited by the protocol speed. A single command can take up to 1.6 ms, meaning that complex control schemes including PWM will quickly reduce the maximum attainable refresh rate, leading to visible flicker and “rainbowing”.
It appears that all the light effects in the controller are specifically built around these limitations, e.g. by only fading a single zone at a time and using the broadcast command if all zones need to be changed.
Software Implementation
Implementing the control scheme in software is fairly simple. Below you can find code to send out a message on an AVR. The code can be easily ported to anything else. A more efficient implementation would most likely use the UART or SPI to send out codes.
The string is directly connected to a GPIO. Keep in mind that this is at the same time the power supply for the LEDs, so it only works with very short strings. For longer strings an additional power switch, possibly in push-pull configuration (e.g. MOSFET), is required.
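Here is the basic driver (a reconstruction matching the protocol described above; PB0 as the string pin and an 8 MHz clock are my assumptions). sendpulse() pulls the string low for one base time of 31 µs, and sendcmd() sends the address field, the inter-field pause, the data field, and the end-of-message pause:

#define F_CPU 8000000UL   // assumed clock for the delay macros
#include <avr/io.h>
#include <util/delay.h>

// One "low" pulse: pull the whole string to ground briefly.
void sendpulse(void) {
    PORTB &= ~_BV(PB0);
    _delay_us(31);
    PORTB |= _BV(PB0);
    _delay_us(31);
}

// One message: (address+1) pulses, a short pause, (data+1) pulses,
// then a longer pause after which the LEDs latch the new setting.
void sendcmd(uint8_t address, uint8_t cmd) {
    for (uint8_t i = 0; i < address + 1; i++) sendpulse();
    _delay_us(47);            // inter-field pause (~1.5 base times)
    for (uint8_t i = 0; i < cmd + 1; i++) sendpulse();
    _delay_us(31 * 5);        // end-of-message pause
}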
It seems to be perfectly possible to control the string without an elaborate reset sequence. Nevertheless, you can find details about the reset sequence and a software implementation below. Its purpose seems to be to really make sure that all LEDs are turned off. It involves sending everything twice, plus a sequence of longer pulses with a non-obvious purpose.
// Emulation of reset sequence
void resetstring(void) {
    PORTB &= ~_BV(PB0);                  // long power-off period
    _delay_ms(3.28);
    PORTB |= _BV(PB0);
    _delay_us(280);

    for (uint8_t i = 0; i < 36; i++) {   // on-off sequence, purpose unknown
        PORTB &= ~_BV(PB0);
        _delay_us(135);
        PORTB |= _BV(PB0);
        _delay_us(135);
    }
    _delay_us(540);

    // Turn everything off twice. Some LEDs indeed seem to react only to
    // the second cycle; not sure whether there is a systematic reason.
    for (uint8_t i = 7; i > 0; i--) {
        sendcmd(i, 0);
    }
    for (uint8_t i = 7; i > 0; i--) {
        sendcmd(i, 0);
    }
}
Pulse Timing And Optical Measurements
Update: To understand the receiver mechanism a bit more and deduce limits for pulse timing I spent some effort on additional measurements. I used a photodiode to measure the optical output of the LEDs.
An exemplary measurement is shown above. Here I am measuring the light output of one LED while I first turn off all groups and then turn them on again (right side). The upper trace shows the intensity of optical output. We can see that the LED is being turned off and on. Not surprisingly, it is also off during “low” pulses since no external power is available. Since the pulses are relatively short this is not visible to the eye.
Taking a closer look at the exact timing of the update reveals that around 65µs pass after the last pulse in the data field until the LED setting is updated. This is an internally generated delay in the LED that is used to detect the end of the data and address field.
To my surprise, I noticed that this delay value is actually dependent on the pulse timing. The timeout delay time is exactly twice as long as the previous “high” period, the time between the last two “low” pulses.
This is shown schematically in the parametrised timing diagram above.
An internal timer measures the duration of the “high” period and replicates it after the next low pulse. Since no clock signal is visible in the supply voltage, we can safely assume that this is implemented with an analog timer – most likely a capacitor-based integrator that is charged and discharged at different rates. I believe two alternating timers are needed to implement the full functionality: one measures the “on” time, while the other generates the timeout. Note that the timer is only active when power is available. Counting the pulses is most likely done using an edge detector in static CMOS logic with very low standby power that can be fed from a small on-chip capacitor.
The variable timeout is actually a very clever feature, since it allows adjusting the timing over a very wide range. I was able to control the LEDs using pulse widths as low as 7 µs, a significant speed-up over the 31 µs used by the original controller. This design also makes the IC insensitive to process variation, as everything can be implemented using ratiometric component sizing. No trimming is required.
See below for an updated driver function with variable pulse time setting.
#define basetime_us 10
#define frameidle_us (basetime_us * 5)   // cover worst case when data is zero

void sendpulse(void);   // forward declaration, defined below

void sendcmd(uint8_t address, uint8_t cmd) {
    for (uint8_t i = 0; i < address + 1; i++) {
        sendpulse();
    }
    _delay_us((basetime_us * 3) / 2);
    for (uint8_t i = 0; i < cmd + 1; i++) {
        sendpulse();
    }
    _delay_us(frameidle_us);
}

// Send a single pulse
void sendpulse(void) {
    PORTB &= ~_BV(PB0);
    _delay_us(basetime_us);
    PORTB |= _BV(PB0);
    _delay_us(basetime_us);
}
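As a usage example, this mimics the “yellow” trick from the trace above (assuming, as the trace suggests, that the broadcast add wraps the 3-bit latch around on overflow):

// Flicker-generated "yellow": set all zones to red, then rapidly
// alternate the whole string between red (1) and green (2).
void show_yellow(void) {
    for (uint8_t z = 1; z <= 6; z++) sendcmd(z, 1);  // all latches = red
    for (;;) {
        sendcmd(7, 1);   // broadcast add 1: red -> green
        sendcmd(7, 7);   // broadcast add 7: green wraps back to red
    }
}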
Conclusions
All in all, this is a really clever way to achieve a fairly broad range of control without introducing any additional data signals and while absolutely minimizing the circuit overhead per light source. Of course, this is far from what a WS2812 and clones allow.
Extrapolating from the past means that we should see more of these LEDs at decreasing cost, and who knows what kind of upgrades the next Christmas season will bring.
There seem to be quite a few ways to take this scheme further, for example by finding a more efficient encoding of the data, or by storing further states in the LEDs to enable finer-grained control of fading and dimming. Might be an interesting topic to tinker with…
What would it take to build an addressable LED like the WS2812 (aka Neopixel) using only discrete transistors? Time for a small “1960 style logic meets modern application” technology fusion project.
The Objective
What exactly do we want to build? The diagram above shows how a system with our design would be set up. We have a microcontroller with a single data output line. Each “Pixel” module has a data input and a data output that can be used to connect many devices together by “daisy chaining”.
This is basically how the WS2812 works. To simplify things a bit, I had to make some concessions compared to the original WS2812:
Each Pixel controls only a single LED that can be either turned on or off, instead of using pulse-width modulation to allow grayscale (this can be implemented on the controller).
Since only one bit of information is needed to turn the LED on or off, each LED will only accept a single bit of data.
The LED will be immediately updated upon receipt of data, instead of latching only during “reset”.
We don’t implement signal retiming of the data output. The data input is buffered and directly forwarded to the output. This will lead to degradation of the signal timing after a while, but it is sufficient to control a few LEDs in cascade.
The protocol is shown above. “LED off” is encoded as a short pulse, “LED on” as a long pulse. After the first LED has accepted the data from the first pulse, any subsequent pulses are forwarded to the next device, and so on. This allows programming a chain of TransistorPixels with a train of pulses. If 20 µs pass without any pulse, all devices reset and are ready to accept new data.
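A controller-side driver for this protocol is almost trivial. The sketch below is hypothetical (pulse polarity and the exact short/long widths are my assumptions; only the one-pulse-per-Pixel scheme and the 20 µs reset come from the description above):

#define F_CPU 8000000UL
#include <avr/io.h>
#include <util/delay.h>

static void send_bit(uint8_t on) {
    PORTB |= _BV(PB0);                        // start of pulse
    if (on) _delay_us(8); else _delay_us(2);  // long = on, short = off
    PORTB &= ~_BV(PB0);                       // end of pulse
    _delay_us(4);                             // gap, well below the 20 us reset
}

// The first pulse sets the first Pixel, the second pulse the next one, etc.
void write_chain(const uint8_t *led_on, uint8_t count) {
    for (uint8_t i = 0; i < count; i++) send_bit(led_on[i]);
    _delay_us(25);                            // >20 us of silence: chain resets
}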
Top Level Architecture
There are many ways to implement the desired functionality. A straightforward but complex way would be to use a clocked logic design. But of course, this is also a challenge in minimalism. Since we are using discrete transistors, we can utilize all kinds of analog circuit tricks. I chose clockless logic with asynchronous timing elements. The choices in this design may seem obvious in hindsight, but quite some thought went into them.
The schematic above shows the top level architecture of the TransistorPixel. There are three main blocks:
State Generator: This block decides whether the data on the input belongs to this Pixel or whether it is to be forwarded to the next one, and signals this with a state signal on its output. Its input is the data input.
Data Director: This block is a basic multiplexer that, depending on the state signal, directs the data on the input either to the protocol decoder or to the data output.
Protocol Decoder and Data Latch: This block receives the input signals that belong to this Pixel and turns the LED on or off depending on the encoded state. There is just a single input. (A small behavioral model of how these blocks interact follows below.)
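To make the division of labor concrete, here is a small behavioral model in C. This is a sketch, not the circuit: timing is reduced to discrete "pulse" and "reset" events, and all names (pixel_t, chain_pulse, etc.) are my own invention.

#include <stdbool.h>
#include <stdio.h>

// One Pixel, reduced to its blocks.
typedef struct {
    bool forwarding; // State Generator: false = receiving, true = forwarding
    bool led;        // Decoder/latch output
} pixel_t;

// Feed one pulse into the chain. "is_long" models whether the pulse
// outlasts the ~0.6 us reference of the decoder monoflop.
void chain_pulse(pixel_t *p, int n, bool is_long) {
    for (int i = 0; i < n; i++) {
        if (!p[i].forwarding) {     // State Generator: first pulse is ours
            p[i].led = is_long;     // Decoder/latch: long = on, short = off
            p[i].forwarding = true; // subsequent pulses go to the output
            return;                 // Data Director: pulse consumed here
        }
        // Data Director: already forwarding, pass pulse to the next device
    }
}

// >20 us without a pulse: all state generators fall back to receiving.
void chain_reset(pixel_t *p, int n) {
    for (int i = 0; i < n; i++) p[i].forwarding = false;
}

int main(void) {
    pixel_t chain[3] = {{0}};
    bool frame[3] = {true, false, true};   // on, off, on
    for (int i = 0; i < 3; i++) chain_pulse(chain, 3, frame[i]);
    chain_reset(chain, 3);
    for (int i = 0; i < 3; i++)
        printf("LED %d: %s\n", i, chain[i].led ? "on" : "off");
}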
The basic logic style for the circuit is Resistor-Transistor Logic (RTL). This was the very first transistor-based logic style and was, for example, used in the CDC 6600 supercomputer. A benefit of this logic style is that it is very simple and therefore well suited to small discrete logic designs. There are some drawbacks, though, which led to numerous other logic styles being developed in later years.
The entire design was first implemented and simulated in LTSpice. You can download the design files from the Hackaday.io project page.
Simulation results from the top-level testbench are shown above. You can see the input and output signals of three Pixels and the state of the associated LEDs. Observe how each stage of the chain removes one pulse from the pulse train, uses it to turn its LED on or off, and forwards the remaining pulses to the next device. The gap of 20 µs between the two trains of pulses is sufficient to reset the receivers so the cycle can start anew.
Below is a photo of the final design implementation. Each block is clearly delineated on the PCB.
Let’s review how the individual blocks are implemented.
The Data Director
The data director is a simple multiplexer that consists of two inverters and two NOR gates. The gate symbols, along with their respective circuit implementations, are shown below. Since two of each are needed, a total of 6 transistors and 10 resistors is used. The beige component is a capacitor that was added for decoupling. I used relatively high base resistances (4.7 kOhm) and collector resistances (2.2 and 4.7 kOhm) to keep current consumption moderate. The collector resistance has to be adjusted according to the fan-out.
One important aspect is the choice of transistors. Since this is not a super-fast design, I chose the PMBT3904, a low-cost switching transistor. I also tried a Chinese clone (the CJ MMBT3904), but encountered issues due to its long base charge storage time.
While the design of the data director is straightforward, I encountered some issues with pulse deformation during data forwarding. Since RTL operates its transistors in saturation, the output delay for a low-high transition on the input of a gate is much longer than for a high-low transition. This can increase or decrease the pulse length and deteriorates the signal during data forwarding. I solved this by ensuring that the signal passes through two identical inverters in series during data forwarding. This ensures symmetrical timing on rising and falling edges and reduces pulse deformation sufficiently.
The State Generator
The purpose of the state generator is to switch the device to the forwarding state after the first pulse has arrived, and back to the receiving state when no signal has arrived for 20 µs.
This requirement is met with a retriggerable monoflop that triggers on a falling edge.
The circuit of the monoflop is shown above. The inverter transforms the falling edge into a rising edge. The rising edge is filtered by C4 and R4, which form a high-pass and trigger Q7. When Q7 turns on, it discharges C5, which turns Q8 off, and the output of the monoflop is pulled high. C5 is slowly recharged through R6 and turns Q8 back on after around 20 µs.
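As a back-of-the-envelope estimate of my own (assuming a 5 V supply and a ~0.6 V turn-on threshold for Q8): the base of Q8 charges toward Vcc through R6, so the timeout is roughly t ≈ R6·C5·ln(Vcc/(Vcc − 0.6 V)) ≈ 0.13·R6·C5. An R6·C5 product of around 150 µs would therefore give the 20 µs reset time.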
You can see the output of the state generator above. When the output is high, all input signals are directed to the output (“datain2”); when it is low, the input is directed to the receiver.
Decoder and Latch
The decoder and latch unit consists of a non-retriggerable monoflop and a latch formed from two NOR gates. The monoflop triggers on the rising edge of the input signal and generates an output pulse of approximately 0.6 µs that is fed into the latch. The length of this pulse is independent of the length of the input pulse. The other input of the latch directly receives the data signal from the input. When both inputs of the latch are low, the latch remembers the last state of the inputs. It can therefore discriminate which signal went low first: the reference timing pulse from the monoflop or the pulse from the data input. This effectively allows distinguishing a pulse that is longer than the reference from one that is shorter. The output of the latch is fed into an additional inverter that serves as a driver for the LED.
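The discrimination mechanism can be illustrated with a tiny step-by-step model of the NOR latch (again a sketch of my own, with made-up time units: “ref” stands for the 0.6 µs monoflop pulse, “data” for the incoming data pulse):

#include <stdbool.h>
#include <stdio.h>

// Simulate the cross-coupled NOR latch for one pulse and return its
// output after both inputs have gone low again.
bool decode(int data_len, int ref_len) {
    bool q = false, nq = true;
    for (int t = 0; t < 10; t++) {           // arbitrary time steps
        bool ref  = (t < ref_len);
        bool data = (t < data_len);
        for (int k = 0; k < 3; k++) {        // let the NOR pair settle
            q  = !(ref  || nq);
            nq = !(data || q);
        }
    }
    return q;   // whichever input stayed high longer decided the state
}

int main(void) {
    printf("short pulse (2 < ref 6): LED %s\n", decode(2, 6) ? "on" : "off");
    printf("long pulse  (9 > ref 6): LED %s\n", decode(9, 6) ? "on" : "off");
}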
The circuit of the non-retriggerable monoflop is shown above. A positive edge on the input of Q2 pulls the collector of Q2 and Q3 down, and hence also the base of Q4 through the capacitor. This pulls the output high and turns Q3 on. As long as Q3 is on, further pulses on the input are ignored. The monoflop only turns off once C3 has been recharged via R9. Some more info here. One interesting aspect of this circuit is that the base of Q4 is pulled to a negative potential due to capacitive coupling. The stored saturation charge of Q4 is therefore removed much more quickly, and the output can switch to high with very little delay.
Real-Life Performance
After verifying the design in LTSpice, I created a PCB and used an assembly service to manufacture the boards and pick and place the parts. The interesting part is, of course, how the actual circuit performs.
A first step was to investigate the timing behavior of the logic gates. The screenshots above show the timing of one of the input inverters based on the PMBT3904. The yellow trace corresponds to the input (driven from an MCU GPIO) and the lower channel to the output. We can see that the propagation delay is 18 ns for a L->H transition and 162 ns for a H->L transition. This differs considerably from the model I used in LTspice; the main reason is that the model used a very pessimistic value for the storage time. After adjusting the storage-time parameter (“TR”) to 380 ns, I was able to replicate the behavior of the inverter in LTspice as well. The story is a bit different with the Chinese clone transistors (CJ MMBT3904): they have a much longer storage time and hence a H->L delay of 283 ns, which leads to marginal timing. However, it turns out that the circuit still somewhat works even with the CJ transistors.
Shown above is the input-output relationship when a signal is forwarded through a “Pixel”. The output pulse has much slower edges, owing to the sluggish response of the RTL gates. However, pulses above 250 ns are still faithfully reproduced, which is sufficient to guarantee proper operation.
This screenshot shows the behavior of the LED output in relation to input pulses of different lengths. The upper trace corresponds to the state of the LED (ON/OFF). We can see that a very short pulse of 62.5 ns will not turn the LED off and is simply ignored. Starting from 125 ns the LED is properly turned off, and pulses of 625 ns and above turn the LED on – as expected.
Summary
Not much to say – this was a fun little design that allowed me to explore ancient logic styles in a manageable and purposeful way (ok, at least somewhat…). An interesting challenge is to reduce the number of transistors used. Can you do better? Remember that it has to work in a real circuit, so idealistic assumptions about the behavior of the transistors may not be sufficient.