Copilot for Everything: Training Your AI Replacement One Keystroke at a Time
Our employers have all the data they need to train AI models to replace us. How long will it be until this actually happens?
Between 2020 and 2021 I worked full-time at Google. Although my original plan was to work from Google headquarters in Mountain View, California, a global pandemic meant that my year-long job ended up being fully remote from start to finish. I received my laptop in the mail before my first day and sent it back in a box when I left the company.
During this stint I never set foot on the Google campus. Every contribution I made was digital: a series of keypresses, touchpad movements, and mouse clicks made from my official Google laptop. I also provided audio and video inputs via my laptop’s camera and microphone during video meetings. But no one ever saw me in person.
It only recently dawned on me that my actions may have been recorded. Of course, certain things I already knew *were* recorded: all the lines of code I submitted to the Google monorepository, for example, are probably still there. I’d imagine my emails are stored somewhere too, as well as the notes I wrote in Google Docs. But what about the rest of my actions on the company computer?
It’s entirely possible that my entire daily work process was documented. In theory, my employer had a right to collect every input I provided on my company computer: every mouse click, every keypress. And they could have sent it all back to store in some data warehouse. This is much richer data than simply the lines of code I produced: these are *behavioral traces* that capture how I solve problems from start to finish.
As I’ll explain later in the post, the thing that scares me about the existence of this data is that it seems well within the capabilities of current technology to train a model that can replicate *me*, in some sense. I’m calling it a *Copilot for Everything*: an assistant that can auto-complete entire tasks for me, based on the actions I’ve taken at work in the past. And this feels like such an economically useful tool that it would be crazy for it *not* to happen in the next few years.
I don’t intend to pick on Google specifically, and I don’t know whether they do this or even what their policies are; I’m just trying to use my own experience in the corporate world to speculate on what I imagine will be a much bigger issue in the future. With today’s technology, it is totally plausible that a corporation might train a large AI model to mimic the digital work of an employee (even after that employee has left the company). Is it ethical for my former employer to use my prior work output to create a digital version of me? Is it ethical for someone’s *current* employer to do this?
Automating remote work with supervised learning
Behavioral traces reduce the problem of automating work output to a well-defined supervised learning problem. The model’s inputs would be everything the computer presents to me: the pixels on the screen, maybe the audio. The outputs are the “actions” I took on my computer: keystrokes, mouse movements, clicks.
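To make this concrete, here’s a rough sketch of what a single recorded timestep might look like. The field names are mine, purely for illustration; I have no idea what any real logging schema looks like.

```python
# One step of a behavioral trace: what the computer showed me, and what I did.
# Every field name here is hypothetical.
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class WorkTimestep:
    screen: np.ndarray            # RGB screenshot, e.g. shape (height, width, 3)
    audio: Optional[np.ndarray]   # microphone/speaker buffer for this slice of time, if any
    action: str                   # what I did, e.g. "key:a", "click:512,304", "scroll:-3"
    timestamp_ms: int             # when it happened


# A full trace is just a long sequence of these records, one per action.
trace: list[WorkTimestep] = []
```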
At a large enough scale, it seems entirely feasible to train a large model that can predict actions from computer inputs. This is doable with the same technology that powers today’s large language models: supervised learning to predict the next action from what’s on screen, and transformers, a type of neural network that excels at learning from large amounts of data.
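To show how mundane the machinery would be, here’s a rough PyTorch sketch of that setup. It assumes the screenshots have already been encoded into embeddings and the actions into a discrete vocabulary; every name and hyperparameter below is invented for illustration, not taken from any real system.

```python
# A sketch of supervised next-action prediction with a transformer.
import torch
import torch.nn as nn

NUM_ACTIONS = 10_000   # hypothetical size of a discretized action vocabulary
EMBED_DIM = 512        # hypothetical embedding width


class NextActionPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        # Assume each screen frame has already been encoded into an EMBED_DIM vector
        # (e.g. by a frozen vision encoder); that step is omitted here.
        layer = nn.TransformerEncoderLayer(d_model=EMBED_DIM, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=6)
        self.action_head = nn.Linear(EMBED_DIM, NUM_ACTIONS)

    def forward(self, screen_embeddings):  # (batch, time, EMBED_DIM)
        t = screen_embeddings.size(1)
        causal_mask = nn.Transformer.generate_square_subsequent_mask(t)
        hidden = self.encoder(screen_embeddings, mask=causal_mask)
        return self.action_head(hidden)     # (batch, time, NUM_ACTIONS)


model = NextActionPredictor()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)


def train_step(screen_embeddings, action_ids):
    """One supervised update: predict the action taken at each timestep."""
    logits = model(screen_embeddings)                       # (batch, time, NUM_ACTIONS)
    loss = nn.functional.cross_entropy(logits.flatten(0, 1), action_ids.flatten())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```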
From the company’s perspective, the process would be simple. The employer records you doing your work for some amount of time, then trains a model to replicate your output on your most monotonous and boring tasks. Or perhaps the company takes all the data it has and trains a model on the aggregate of *all* employees’ outputs. That would be closer to what we’ve seen work well in other domains, like vision and language, where training on more data is usually the right answer.
Can imperfect models of us improve our productivity?
So that tells us how a company could train a model to mimic employees’ actions. But how do we actually use these models? The resulting artifact wouldn’t be a drop-in replacement for an employee: instead it’d be a probabilistic model that tells us how likely a given action is, given some input. First, these models have issues. Neural networks are easily tricked by adversarial inputs and generally don’t perform well if the inputs are too far out-of-distribution. Neither of these is a problem for humans.
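In notation (the symbols are mine, just to pin the idea down), the artifact is a conditional distribution over the next action given everything observed so far:

$$
p_\theta\big(a_t \mid o_1, a_1, \ldots, a_{t-1}, o_t\big)
$$

where the $o_i$ are the computer’s observations (screen frames, audio) and the $a_i$ are the recorded actions. Training would simply maximize $\sum_t \log p_\theta(a_t \mid \cdot)$ over the stored traces.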
Second, how do we sample from it? The obvious answer is to be “greedy” and take the most likely action at every step. This would work well in very low-entropy situations. For example, performing a task that’s identical to one you’ve performed many times would be easy, and a good model could do it perfectly with no supervision. And if the model has also learned from *other* employees’ actions, rote tasks that someone else has done many times would be easy too.
But what about higher-entropy situations? Under higher uncertainty, the model would output a flatter, less useful distribution over potential actions. Here, greedy action-sampling could lead to “exposure bias”: the model does something slightly off, ends up in a situation it’s never seen before, and goes off the rails.
Even good models will sometimes end up in situations of significant uncertainty. What I’d imagine is that we’ll end up with a solution similar to what has worked in self-driving cars: the computer can ‘drive’ by itself until it encounters something completely unexpected, at which point a human will intervene. Once the novel scenario is resolved and the task again resembles something that the computer knows how to do, the model can resume operation.
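Here’s a toy sketch of what that uncertainty-gated handoff could look like, reusing the model from the earlier snippet; the entropy threshold is arbitrary and everything here is illustrative.

```python
# Act greedily while the model is confident; hand control back to a human
# when the predicted action distribution gets too flat.
import torch

ENTROPY_THRESHOLD = 2.0  # nats; an arbitrary cutoff for "too uncertain"


def next_action_or_handoff(model, screen_embeddings):
    """Assumes a batch of one trace: screen_embeddings has shape (1, time, EMBED_DIM)."""
    with torch.no_grad():
        logits = model(screen_embeddings)[:, -1, :]   # distribution over the next action
        probs = torch.softmax(logits, dim=-1)
        entropy = -(probs * torch.log(probs + 1e-9)).sum(dim=-1)

    if entropy.item() > ENTROPY_THRESHOLD:
        return None                      # too uncertain: pause and ask the human to take over
    return int(probs.argmax(dim=-1))     # low entropy: take the greedy action
```

In practice you’d want something smarter than raw entropy, but the shape of the loop is the same as in driver-assist systems: autonomy until uncertainty, then a person.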
It’s hard to imagine what this might look like from a user interface perspective, but I’ll suggest two basic options: “fast-forward” style and “foreman” style.
In the fast-forward setup, the assistant would perform certain tasks for you very quickly while you wait. In theory the model could operate the computer at 100x speed; if input latency isn’t a limiting factor, it could complete tasks in the blink of an eye. Maybe a button on your screen would light up when the computer detects you’re about to perform a task the assistant “understands”; clicking the button would fast-forward through that task at 100x speed.
In the foreman setup, one human would oversee *many* AIs working in parallel. This is only possible if the models can do most of their work without any intervention. One person would be responsible for the success of several AI assistants and would jump in to help whenever a task reaches a certain level of uncertainty. This is similar to Zoox’s remote operating stations, where human operators guide cars out of unexpected situations from hundreds of miles away.
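As a toy sketch, with invented names and a made-up confidence rule rather than a description of any real system, the foreman loop might look something like this:

```python
# "Foreman" mode: one person supervises many assistants at once,
# and only the steps the model is unsure about reach the human.
import queue
import random


class Assistant:
    """Stand-in for one model instance working on one task."""

    def __init__(self, name: str):
        self.name = name

    def propose_next_step(self) -> tuple[str, float]:
        # In reality: run the model and return (action, confidence).
        return "some_action", random.random()


def foreman_pass(assistants: list[Assistant], confidence_threshold: float = 0.8) -> queue.Queue:
    """One sweep over the fleet: confident assistants proceed, uncertain ones get escalated."""
    escalations: queue.Queue = queue.Queue()
    for assistant in assistants:
        action, confidence = assistant.propose_next_step()
        if confidence >= confidence_threshold:
            print(f"{assistant.name}: executing '{action}' autonomously")
        else:
            escalations.put(assistant)   # the human foreman handles this one
    return escalations


foreman_pass([Assistant(f"assistant-{i}") for i in range(10)])
```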
What does this mean for us?
Even if none of the exact futures I’ve imagined here play out, note the underlying theme: **progress increases the average entropy of digital work**. This principle has nothing to do with neural networks. As we develop better tools, we’re able to do repetitive tasks more quickly. Better abstractions reduce the amount of input it takes for us to generate the same amount of output. One way to achieve this, the one discussed here, is to directly model employee behavior with big neural networks.
The reality is that every remote worker’s job is repetitive at times. And your value comes from how you handle the *least* repetitive things you do: reacting to novel situations, adapting to change. New tools will reduce the amount of repetition in your work. Our day-to-day jobs will become *less* predictable as our most mundane and monotonous tasks are modeled away.
Given all this, how can you make yourself indispensable? The answer is by fighting entropy. If the most efficient company is the one whose employees are doing the least repetitive things at all times, then the most productive employee is the one who’s the least modelable. Focus on the things that are hard to gather training data for.