Okay, this one took too long, and it’s huge as a result of that. Here we go.
Machine Learning Stuff
There’s so much of this lately, isn’t there? It’s tough to get away from. And I’m so curious about so much of it. Let’s start with a polarizing opinion.
Algorithmic Disgorgement
Since machine learning is such a salty damn topic, and I feel strongly about it, I want to begin with a very clear belief of mine: I think OpenAI, Midjourney, etc., have engaged in absurd amounts of theft of intellectual property. I think they ought to be penalized with algorithmic disgorgement and made to rebuild it all from scratch, ethically sourcing everything along the way.
If this means they fail, they fail.
Mamba: The Easy Way
Jack Cook writes at length about Mamba, a new sequence-modeling architecture built on state space models instead of the usual transformer attention. I don’t understand it, I can barely even look at the math without my brain falling off, but I’m curious about it.
The Era of 1-Bit LLMs
This is a paper that describes a way to build LLMs whose weights can only take the values -1, 0, and 1 (roughly 1.58 bits each, despite the 1-bit branding) instead of 16-bit floating point numbers. Then it goes on to talk about how matrix multiplication mostly collapses into additions and subtractions instead of multiplications. Do I get it? Hell no! Again, no. But the implications struck me just the same:
- This means less storage footprint for an LLM.
- It also means less of a processing footprint.
- Together, that means we can make them quicker, tinier, and more performant.
The paper goes on to say they’ve benchmarked the output and capabilities of the 1-bit model and observed performance comparable to the full-precision baseline.
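For a sense of why that matters, here’s a toy sketch of my own (this is not the paper’s actual kernel, which also quantizes activations and does a pile of other clever things): once the weights can only be -1, 0, or 1, a matrix-vector product stops needing multiplications at all.

```rust
// Toy illustration (mine, not the paper's): with ternary weights, a
// matrix-vector product is just adds, subtracts, and skips.

fn ternary_matvec(weights: &[Vec<i8>], input: &[f32]) -> Vec<f32> {
    weights
        .iter()
        .map(|row| {
            row.iter().zip(input).fold(0.0, |acc, (&w, &x)| match w {
                1 => acc + x,  // weight of 1: add the activation
                -1 => acc - x, // weight of -1: subtract it
                _ => acc,      // weight of 0: skip it entirely
            })
        })
        .collect()
}

fn main() {
    // A 2x3 ternary weight matrix applied to a 3-element activation vector.
    let weights = vec![vec![1, 0, -1], vec![-1, 1, 1]];
    let input = [0.5, -2.0, 3.0];
    println!("{:?}", ternary_matvec(&weights, &input)); // [-2.5, 0.5]
}
```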
The implication
I’ve long believed that we’re looking at LLMs in a pre-utility state. They’re too big, too difficult to train, too experimental, and frankly much of the code I’ve seen powering them is absolutely a hastily-wrought tangled mess. Understandably! But that means we’re looking at legacy systems in the making. The coming days will involve a lot of clean-up.
If we have 1-Bit LLMs, we’ve got a chance to do our first major optimization of performance and miniaturization. That’s going to position us to make tinier models and have them collaborate more. Complex interlocking models in toolkits we can rapidly work with and get feedback from, that’s the future of this kind of computer program.
Finally, in order to embrace 1-bit LLMs we’d have to retrain everything from scratch anyway, since these models are trained ternary from the start rather than quantized after the fact, and that’s most of the work of algorithmic disgorgement right there. So we can enforce ethics at the retraining step, or at the very least, if anyone does unethical shit now… well, we know they did it willingly.
Training LLMs From Scratch as a Startup
This is a walkthrough of one org’s journey training a bunch of LLMs on the cheap with rented cloud hardware. This is important because right now the training moat for this stuff is considerable. Only megacorporate sponsors really need apply for some of it. So any escape strategy or local-first training approach is very welcome.
You can now train a 70b model from home
In the same vein as the previous entry, another startup with a DIY journey and strategy that they’re willing to share. This one needs serious GPUs, and those aren’t cheap, but the cost still puts the approach well within reach of a small technology company.
AICI
This is a controller interface for LLMs: it lets you run your own logic against the token stream to constrain and steer what the model generates. This is a big one, too. Again, in the vein of making machine learning models production-grade and useful at the household level.
It’s important to bear in mind that your models have no real ability to know whether they’re correct. It’s actually not possible for a generative system to understand things to begin with, much less understand that it’s right or wrong. Any proximity to truth or fact is coincidence hedged by statistical models that can never actually reach 100%.
It is categorically not possible for LLMs to guarantee factual accuracy. So you can’t rely on them, right?
Well, the problem there is that we will rely on them just ‘cause we want to. For most of us, I guess, we just don’t care, right?
One thing the LLM industry doesn’t want to talk about is just how intractable this issue is. There’s no resolving it with machine learning, not presently. What could resolve it, though? Well, formal proofing mechanisms of some kind.
In order to rein in an LLM, you can build an entire second system that provides you with formal proofing of the subject matter. Every single output from the LLM can be checked against this formal proof, and then flunked until you get accuracy.
Or in other words, for LLMs to function, you’d probably need to build traditional software for it and let it leverage that as a skeletal system, a framework from which it will more accurately feign comprehension and thought.
The problem with doing that is twofold:
- You’re still building up traditional software. Gosh! I thought LLMs were supposed to lessen our tech worker burden, not amplify it. Who’d have thought that Bay Area capitalists were lying yet again?
- If you’re flunking LLM output you’re going to be rerunning a lot of processes. So everything gets slower and more expensive.
AICI is one way, though not the only way, to answer problem number 2, at least. It lets you integrate these proofing systems into the token stream as the model generates it, so it’s not quite as expensive as a full flunk and rerun.
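To make that concrete, here’s a hand-wavy sketch of the general shape of the idea. This is not AICI’s actual API, and the checker here is a silly stand-in for whatever grammar, schema, or proof system you’d really use; the point is just that vetting happens per token instead of per whole response.

```rust
// Sketch of per-token constrained generation (not AICI's API): check each
// candidate token against a validator and only accept tokens it allows.

// Stand-in trait for whatever formal checker / grammar / proof system you have.
trait Checker {
    fn allows(&self, output_so_far: &str, candidate: &str) -> bool;
}

// Toy checker: only accept tokens made of digits, so the "model" can only
// ever emit a number.
struct DigitsOnly;
impl Checker for DigitsOnly {
    fn allows(&self, _so_far: &str, candidate: &str) -> bool {
        candidate.chars().all(|c| c.is_ascii_digit())
    }
}

// Stand-in for a model's ranked next-token candidates.
fn fake_candidates() -> Vec<&'static str> {
    vec!["about", "42", "7", "maybe"]
}

fn main() {
    let checker = DigitsOnly;
    let mut output = String::new();
    for _step in 0..3 {
        // Take the highest-ranked candidate the checker allows, skip the rest.
        if let Some(tok) = fake_candidates()
            .into_iter()
            .find(|t| checker.allows(&output, t))
        {
            output.push_str(tok);
        }
    }
    println!("constrained output: {output}"); // "424242"
}
```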
Rust + Machine Learning
It’s not just Rust and it’s not just machine learning! It’s a delightful melange of the two.
Kalosm
Kalosm is a rust-based kit for doing machine learning work (running models, training, etc) in a way that can be embedded in applications. This means you can do it in your own program, and deploy it wherever you like, and you don’t have to do silly stuff like hitch your wagon to OpenAI’s API servers, then cross your fingers and hope they won’t just come eat your milkshake if you make something cool they want.
Tensorflow Rust Guide
This is a tutorial on how to use TensorFlow, a neural network toolkit, with Rust bindings. Not a whole lot more to say about this one, except that Holy Shit, my browser blocked 72 tracking scripts on this page. Reader beware, I guess.
Rust Roundup
Plenty of these, too! No real salty rants about rust or anything. Sorry. I guess.
Rust Tooling
This is just kind of a top 8 list of Rust tools. I’d like to come back to this when I get elbow deep in my rust side project again. Some cool standouts:
- cargo machete, which helps prune unused deps from manifests.
- cargo nextest, which is a nicer test runner than the stock one.
- cargo flamegraph, which profiles your program and renders the results as a flamegraph.
- cargo audit, which checks your dependency tree against known security advisories.
JCO
This is a JavaScript toolchain for working with WebAssembly Components. The component model is a pretty sweet specification of interoperability guidelines for anything that can compile to a Wasm component, WASI interfaces included. In other words, components let you be more language agnostic with your tooling, and JCO helps you use them from JavaScript.
Smaller rust bites
These are each still pretty important or interesting, but I can summarize them fast-like, so they’re going in a single list.
- Lessons Learned From Building a Distributed System in Rust - A collection of experiences and reflections on building a distributed system in rust. The ups and downs, and opinions of the author in retrospect.
- Matching and Iterators in Rust - I’m curious if there’s any tricks in here that I don’t know yet. Iterators have been hard for me in the past. I’d like to be better with them.
- Practical Guide to Error Handling in Rust - I don’t feel like I can ever read enough about how to do this sort of thing cleanly. I have opinions but I want to evolve them (see the sketch just after this list).
- Building an Async Runtime With MIO - I’ve been using Tokio and a bespoke Tower-based handler system to pass async functions around in my pet project, and it winds up that there’s easy and hard ways to do this. I need to learn more.
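Since error handling came up: here’s a minimal sketch of the baseline pattern I keep coming back to, using only std. This is my own toy, not code from the article: one error enum, a From impl so ? does the conversions, and Display/Error so it plays nice with everything else.

```rust
// A plain-std error handling sketch: one enum per module, From conversions,
// and `?` everywhere else.

use std::fmt;
use std::num::ParseIntError;

#[derive(Debug)]
enum ConfigError {
    Missing(&'static str),
    BadPort(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::Missing(key) => write!(f, "missing config key: {key}"),
            ConfigError::BadPort(e) => write!(f, "port is not a number: {e}"),
        }
    }
}

impl std::error::Error for ConfigError {}

// This From impl is what lets `?` convert a ParseIntError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadPort(e)
    }
}

fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let raw = raw.ok_or(ConfigError::Missing("port"))?;
    Ok(raw.trim().parse()?)
}

fn main() {
    println!("{:?}", parse_port(Some("8080"))); // Ok(8080)
    println!("{:?}", parse_port(Some("nope"))); // Err(BadPort(..))
    println!("{:?}", parse_port(None));         // Err(Missing("port"))
}
```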
General Software
Not specific to any language, but to the craft in general.
Pipeline-oriented Programming
Scott Wlaschin kicks ass in a lot of ways, but this specific article really stands the test of time. This and Parse, Don’t Validate fit in the same mold, really. The notion is that if you pursue certain design approaches, your code can be deterministic even if your IO is not. Pipelines are one approach, and here Scott demonstrates them in several different languages.
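Here’s a tiny Rust rendition of the idea as I understand it (my framing, not Scott’s code): parse untrusted input at the edge, then run it through a pipeline of pure functions that behave the same way every single time.

```rust
// Parse at the edge, keep the middle pure and deterministic.

#[derive(Debug)]
struct Order {
    quantity: u32,
    unit_price_cents: u64,
}

// Edge: parse, don't validate. Turn raw input into a typed value or fail loudly.
fn parse_order(raw: &str) -> Result<Order, String> {
    let mut parts = raw.split(',');
    let quantity = parts.next().ok_or("missing quantity")?.trim()
        .parse().map_err(|e| format!("bad quantity: {e}"))?;
    let unit_price_cents = parts.next().ok_or("missing price")?.trim()
        .parse().map_err(|e| format!("bad price: {e}"))?;
    Ok(Order { quantity, unit_price_cents })
}

// Middle: pure functions. Same input, same output, every time.
fn subtotal(order: &Order) -> u64 {
    order.quantity as u64 * order.unit_price_cents
}

fn apply_bulk_discount(cents: u64, quantity: u32) -> u64 {
    if quantity >= 10 { cents * 90 / 100 } else { cents }
}

fn main() {
    // IO at the edges, pipeline in the middle.
    match parse_order("12, 250") {
        Ok(order) => {
            let total = apply_bulk_discount(subtotal(&order), order.quantity);
            println!("total: {total} cents"); // total: 2700 cents
        }
        Err(e) => eprintln!("rejected: {e}"),
    }
}
```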
Wide Events
This is a set of learnings about large-scale observability from a former Facebook engineer. Facebook has a one-of-a-kind observability stack, I hear, and some of the learnings from that stack are in this post. I’m not sure what Wide Events are at the moment, but I’m definitely curious! Especially as I need more observability in my day job.
Building a Fly.io-like Scheduler, pt 2
This is written in Golang, but has some architectural potential in any language I think. Scheduling resources will be important for me for my side project, as it has a stack provisioning feature set that I don’t want to be stupid about building. This might help as reference!
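Just to anchor my own thinking, here’s the smallest possible version of the scheduling problem in Rust. This is my sketch, nothing to do with the article’s Go code: filter out nodes that can’t fit the request, score what’s left, and place the workload on the winner.

```rust
// A toy placement scheduler: filter by fit, score, place.

struct Node {
    name: &'static str,
    free_cpu_millis: u32,
    free_mem_mb: u32,
}

struct Request {
    cpu_millis: u32,
    mem_mb: u32,
}

fn schedule<'a>(nodes: &'a mut [Node], req: &Request) -> Option<&'a mut Node> {
    nodes
        .iter_mut()
        // Only nodes that can actually fit the request are candidates.
        .filter(|n| n.free_cpu_millis >= req.cpu_millis && n.free_mem_mb >= req.mem_mb)
        // "Most free memory" is a stand-in for whatever scoring policy you really want.
        .max_by_key(|n| n.free_mem_mb)
        .map(|node| {
            node.free_cpu_millis -= req.cpu_millis;
            node.free_mem_mb -= req.mem_mb;
            node
        })
}

fn main() {
    let mut nodes = vec![
        Node { name: "a", free_cpu_millis: 2000, free_mem_mb: 4096 },
        Node { name: "b", free_cpu_millis: 500, free_mem_mb: 8192 },
    ];
    let req = Request { cpu_millis: 1000, mem_mb: 2048 };
    match schedule(&mut nodes, &req) {
        Some(node) => println!("placed on {}", node.name),
        None => println!("no capacity"),
    }
}
```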
Code Toys
I’m calling them toys in the title but these all seem seriously cool in their own way. I’m guessing they’re also fun to play around with, though.
PGLite
The Electric SQL team is pretty kick ass, and their data tooling is really bleeding edge while building on some seriously battle-tested stuff, in this case Postgres.
This tool lets you embed Postgres in a web app or a file, much like you would SQLite. It abstracts all the details of persistence away from you, just letting you access the SQL interface directly.
Readability
This takes a webpage and makes it nicer to read. Pretty simple, right? Well, it could be useful for other kinds of tools as well. For example, rendering embedded webpages, or RSS links, stuff like that.
Teable
I love Airtable so much, but I wind up wishing I had full-on access to the data. If I did, I could do some mean shit with it, you know? Let my users mess with Airtable-style tables instead of building interfaces for them, or maybe embed little DBs in my web projects, like I would a WYSIWYG editor. This thing looks like it does just that! And maybe it works with PGLite?
JSON Canvas
Speaking of cool interfaces to use in a webpage, this comes from the Obsidian team: an open file format for infinite canvas data. An editable infinite canvas in a web app can save to this universal format, which is just a file you can keep anywhere. Maybe this could be another block type in the pet project, too?
Conclusion
That was so many tabs. Oh my god. Hopefully my next round won’t be as bad. Until next time!