An image generated with Adobe Firefly, responding to the prompt "an author training a generative language model on their own corpus of writing". It shows a young woman with dark hair and light skin, sitting in front of her laptop. She is surrounded by books, and behind her is a projection of the code she is writing, which is illegible.

To each author, a mimetic AI model

Jenny Hedley

Imagine a future where writers control their own small language models (SLMs) trained on select high-quality data including their creative works. Unlike resource-guzzling, copyright-be-damned large language models (LLMs), compact models can be run locally on minimal hardware[1], keeping privacy intact while conserving resources.

That future is already here for the tech-curious who are willing to accept coding assistance from LLMs. Even as I suggest such purposeful use of LLMs, I feel reproached by my dear colleague Beau Windon, who wrote ‘Is it Time to Bully Generative AI Users?’. Am I the LLM user/‘slopsucker’/‘botlicker’ who, according to Windon, treats ‘the water-guzzling, stolen hard work of workers as a plaything’? I carry on, for research’s sake, under the weight of these concerns.

I am not looking for a surrogate to take over the writing processes that I so adore. My experiments with technology, including machine learning, are designed to augment rather than replace my creative practice. I write because of an overwhelming impulse to contextualise my experiences within literary and sociopolitical fields; technology offers ways of expanding the known. Training an SLM to speak in my voice and style—what I think of as digital mimesis—offers new entry points into my research interests around archives and the multiple self. A self-mimetic model becomes a kind of living archive: queryable, a laboratory for testing authorial identity and versioned selves.

It is astonishingly quick and easy to train an open-source SLM. With only ten lines of code in my Mac’s Terminal, I installed the necessary tools (Homebrew) and languages (Python), set up my project folder, created and activated a virtual environment within which I could train Microsoft’s Phi-2 base model, and installed libraries of open-source tools to do the heavy lifting for me.[2] My first JenAI model was trained on approximately 280,000 of my own words and responsive within twenty-four hours. It was bland, sounding more like an encyclopaedia than like me. It did, however, argue for the human embrace of AI:

What if [humanity’s] greatest strengths lie beyond the scope of our physicality? Perhaps we shouldn’t view AI agents simply as extensions of ourselves but rather as tools that empower us to achieve feats far beyond our biological limitations.
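For readers curious about what those Terminal steps amount to, here is a minimal sketch of the environment setup, written with Python’s standard library rather than shell commands. It is an illustration, not my exact script: the folder name ‘jenai’ is hypothetical, and the package list mirrors footnote [2].

```python
# A sketch (not the author's exact script) of the environment setup
# described above, using only Python's standard library. The folder
# name "jenai" is hypothetical; the packages mirror footnote [2].
import venv
from pathlib import Path

project = Path("jenai")
project.mkdir(exist_ok=True)

# Equivalent to `python3 -m venv .venv` in the Terminal.
venv.create(project / ".venv", with_pip=True)

# The open-source libraries that do the heavy lifting (footnote [2]).
# Once the environment is activated, they would be installed with:
#   pip install torch transformers datasets accelerate peft bitsandbytes
packages = ["torch", "transformers", "datasets",
            "accelerate", "peft", "bitsandbytes"]
print(f"environment created; {len(packages)} libraries to install")
```

Everything beyond this point — the model, the data handling, the training loop — is delegated to those six libraries.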

When I share my passion for digital writing with undergraduate creative writers, I derive satisfaction from breaking down the barriers and internalised biases which prevent people from taking advantage of the tools at hand. Coding, today, is for everyone. I am not a coding expert but an enthusiastic researcher conducting iterative training of self-mimetic models. In keeping with the open-source ethos of the founders of the World Wide Web Consortium, my code for this project is available for curious readers to inspect, reuse and build upon.

The open-source ethos is not an abstract allegiance to web history; it allows writers to reclaim agency. Author-controlled models are viable precisely because of the open-source ecosystems that underpin contemporary machine learning. Without shared repositories, transparent architectures and community documentation, the mimetic AI model would remain the preserve of corporations like the heavily AI-invested ‘MAG-7’ tech giants.[3] To train compact models locally is to participate in a culture of shared tooling which ultimately allows one to loosen the bonds of extractive capitalism.

I have no ethical qualms about using LLMs to write the Python code for model training; such code is freely available from the Hugging Face community and in GitHub repositories anyhow. Without AI assistants such as Claude (which I used for this experiment, though almost any LLM can serve as a coding assistant), the step-by-step processes and troubleshooting would remain the domain of programming experts.

Style and voice remain difficult for LLMs to emulate, and the claim that LLMs flatten style is not just anecdotal. Stylometry examines the fingerprints of authorial style—that is, the ‘linguistic signature detectable through the frequency with which particular words are chosen’, as well as ‘sentence length, syntactical structures, or punctuation patterns’ (O’Sullivan 2). In a stylometric study of human versus AI creative writing responding to particular prompts, James O’Sullivan found that LLMs displayed ‘uniform stylistic patterns’ which were clearly distinguishable from ‘the richer stylistic diversity characteristic of human creativity’ (4). Fine-tuning SLMs for the specific purpose of emulating one’s own authorial style might close this gap between human and AI.
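To make the idea of a stylistic fingerprint concrete, here is a toy sketch in Python’s standard library (emphatically not O’Sullivan’s methodology) that computes the kinds of surface features stylometry works with: word frequencies, average sentence length and punctuation counts.

```python
# A toy stylometric fingerprint: word frequencies, average sentence
# length and punctuation counts. A deliberately crude illustration,
# not a research-grade stylometry pipeline.
import re
from collections import Counter

def fingerprint(text):
    """Return a few surface-level stylometric features for a passage."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "top_words": Counter(words).most_common(3),
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "punctuation": Counter(c for c in text if c in ",;:-"),
    }

sample = "I write because I must. I write, and the archive writes back."
print(fingerprint(sample))
```

Real stylometric studies use far richer feature sets and statistical comparison across many texts, but even features this crude begin to separate one writer’s habits from another’s.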

Ethical AI for writers depends upon transparent, inspectable architectures—as opposed to the notoriously black-box LLMs. For my first JenAI model, I chose to replicate and fine-tune Phi-2, which has 2.7 billion parameters, a fraction of the scale of contemporary LLMs. Microsoft’s Phi-2 was trained on a mixture of filtered educational-quality web data, ‘textbook quality’ data and synthetic datasets aimed at teaching ‘common sense reasoning and general knowledge’ (Javaheripi et al.). Although I chose to work with Phi-2 for this experiment because it hadn’t been trained on pirated books in the LibGen dataset, in future I plan to work with the Allen Institute for AI’s (AI2) OLMo model, which offers complete transparency about its training datasets. My concern was that AI2’s 1B-parameter model would be too lightweight for creative writing purposes, while the 7B-parameter model could overload my hardware. Fine-tuning involves taking a pretrained language model and training it further on a smaller, specialised dataset to shape its behaviour for a particular purpose; Low-Rank Adaptation, or LoRA, is one parameter-efficient fine-tuning technique proven to streamline the process. I trained JenAI using LoRA, which is ‘both storage- and compute-efficient’ (Hu et al. 2).
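A back-of-the-envelope sketch shows why LoRA is ‘both storage- and compute-efficient’: instead of updating a full weight matrix, LoRA freezes it and trains two small low-rank factors. The numbers below are illustrative examples, not a claim about Phi-2’s exact layer shapes.

```python
# Back-of-the-envelope illustration of LoRA's parameter savings
# (Hu et al.). LoRA freezes a pretrained weight matrix W (d x d) and
# learns the update as (alpha / r) * B @ A, where B is d x r and A is
# r x d, so only the two small factors are trained. The sizes below
# are illustrative examples, not Phi-2's actual layer dimensions.
d = 2560   # hidden dimension of one layer (example value)
r = 8      # LoRA rank, chosen to be much smaller than d

full_params = d * d          # trainable parameters in full fine-tuning
lora_params = d * r + r * d  # trainable parameters in B and A under LoRA

print(f"full fine-tune: {full_params:,} parameters per weight matrix")
print(f"LoRA (r={r}):   {lora_params:,} parameters per weight matrix")
print(f"that is {full_params // lora_params}x fewer parameters to train")
```

Multiplied across every adapted layer of a 2.7-billion-parameter model, this difference is what makes fine-tuning on a laptop feasible at all.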

My first model sounded less than human due to both its machineness and the mistakes I made when training it, such as not cleaning my training datasets properly. I acted on lessons learned in subsequent iterations of JenAI, which I will cover in my next post. If writers learn to train models on their own work, the politics of generative AI shift from extraction to consent. Rather than be swept up in the shifting tides of our data-hungry tech overlords, we can claim agency over the terms of our human–AI interactions, keeping as trade secrets the unpublished datasets which emerge from our creative and philosophical labour.

[1] I trained and ran my SLM on a 2021 M1 MacBook Pro with 16GB of RAM, running macOS 13.1.

[2] As reflected in Step 1 of my instructions for training an SLM, I installed libraries including PyTorch, which runs the maths and training on my computer; Hugging Face’s ‘transformers’ toolkit, which loads my model, while ‘datasets’ handles the data and ‘accelerate’ speeds up training; ‘peft’, which enables LoRA, saving memory and time by fine-tuning only a slice of the model; and ‘bitsandbytes’, which shrinks the model so that it runs in less memory.

[3] The Magnificent Seven, or Mag-7, comprises seven high-flying US tech stocks that have driven significant market gains: Alphabet (Google), Amazon, Apple, Meta, Microsoft, Nvidia and Tesla.

 

Works cited

Hu, Edward J., et al. ‘LoRA: Low-Rank Adaptation of Large Language Models.’ arXiv:2106.09685, arXiv, 16 Oct. 2021, https://doi.org/10.48550/arXiv.2106.09685.

Javaheripi, Mojan, et al. ‘Phi-2: The Surprising Power of Small Language Models.’ Microsoft Research, 12 Dec. 2023, https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/.

O’Sullivan, James. ‘Stylometric Comparisons of Human versus AI-Generated Creative Writing.’ Humanities and Social Sciences Communications, vol. 12, no. 1, Nov. 2025, article 1708, https://doi.org/10.1057/s41599-025-05986-3.

Windon, Beau. ‘Is It Time to Bully Generative AI Users?’ Some Tasty Mash BeauTato, Substack newsletter, 19 Dec. 2025, https://mashbeautato.substack.com/p/is-it-time-to-bully-generative-ai.

 

The series

Part 1: https://southerlylitmag.com.au/down-with-copyright-infringing-llms-long-live-the-small-language-model/

Part 2: https://southerlylitmag.com.au/to-each-author-a-mimetic-ai-model/

Part 3: https://southerlylitmag.com.au/generative-ai-model-training-why-do-you-sound-like-me-because-i-am-you/

Part 4: https://southerlylitmag.com.au/archival-bots-my-mother-my-model-for-language/

 

About the author

Jenny Hedley is a neurodivergent writer, digital artist, literary critic, teacher and third-year PhD candidate at RMIT whose research spans personal archives, autotheory, experimental nonfiction, digital and creative-critical writing. Links to her works can be found on jennyhedley.github.io. She lives on unceded Boon Wurrung land with her son.

@ jennyisanauthor@gmail.com

 

About the artwork

An image the author generated with student access to Adobe Firefly. The prompt used was “illustration of an author training a generative language model on their own corpus of writing, to create a self-mimetic model”. Adobe Firefly is trained on licensed images and is a more ethical option for image generation than models trained on pirated or otherwise stolen images.
