For five of us over the last week, life’s been defined by the chug-and-whir of digital copiers sucking in page after page of reject literature. With the support of the State Library of Victoria’s Storage and Digital Collection Services, a small group of volunteers has been digitising a portion of the archive relating to Australian literary magazines. Edmund La Touche Armstrong, Chief Librarian from 1896 to 1925 after working his way up from junior assistant, championed a policy of archiving unofficial as well as official culture. His vision meant that from the 1940s until the late 1990s, the SLV stored not just the printed magazines like Meanjin, Southerly and Quadrant, but all the submissions as well. As I suggested in earlier posts, finding this rejected material is akin to tapping the unconscious of the country’s literary culture.
Each morning we load a trolley with archival boxes from the stacks then wheel it to the digital imaging studio. The work involves mind-numbing hours removing staples and clips and feeding pages into scanners, relieved by the pleasure of reading excerpts along the way. As we’re discovering, the material is a treasure-trove of forgotten poems, essays and short stories, not to mention myriad other experimental forms. To date each piece has only been read in isolation by the individual authors and the rejecting editors. Now, with stunning results, it’s being read by machines.
Ongoing collaborations between humanities scholars and computer scientists mean there are now superb tools to help make sense of such a vast corpus of text. As Mark Olsen and Shlomo Argamon argue in the Digital Humanities Quarterly:
Machine learning and text mining approaches appear to offer a compelling complement to traditional text analysis, by having the computer sift through massive amounts of text looking for “suggestive patterns.” The power of modern machine learning systems to uncover patterns in large amounts of data has led to their widespread use in many applications, from spam filters to analyzing genetic sequences. And the potential for using these sophisticated algorithms to find meaningful patterns in humanistic texts has been recently observed. Drawing a link between Ian Witten’s general description of data mining and the practice of literary criticism, Stephen Ramsay states that “[f]inding interesting patterns and regularities in data is generally held to be of the deepest significance.”[i]
In looking for that ‘deepest significance’, we’ve taken the PhiloMine software package, developed by the ARTFL Project at the University of Chicago, and started feeding it the material digitised so far. Keeping in mind that these are provisional results from less than 10% of the material, two trends are emerging.
First, as you’d expect, is a body of classic Oz-Lit fiction that simply didn’t make the cut. If you recall the New Writing for the Real Australia manifesto found in this same archive, 42% of the material analysed to date conforms largely to its requirements of realism, understated melancholy, and individuals framed against a harsh Australian landscape. Undoubtedly there will be some gems in there, and the text mining algorithm has already flagged several unpublished pieces from famous names.
But what’s more interesting to my mind is the remaining 58% of the material. It’s wildly diffuse and comes from a wide variety of would-be authors, but it’s not random. Far from it. The great thing about analysing an archive of this size is finding common threads over time. Or, as I prefer to think about it, finding other possible magazines that could have been published.
If you consider the total literary output of the country in any given year, it gets shaped into published work or slush pile throw-out depending on the rules and categories used to make magazines. Change the rules and categories (which is easy to do when analysing a digital archive) and you create a totally different set of published magazines, and a totally different slush pile. If there had been editors with different proclivities, here are some hypothetical magazines that could have been published from the total body of Australian writing 1945-1996:
- Undersea, Meanjane, Quidrant, Northerly, Easterly (1945-1996): strong literary magazines from across the political spectrum that all share a structural bias towards women. The overall gender balance in rejected submissions is significantly skewed towards women, in a near-perfect inversion of the graphs we’re used to seeing in the Stella Count. Which means there have been parallel woman-dominated magazines accruing in the archive for decades. These have a larger catchment of writers and material to draw from than those that were in fact published.
(NB: the graph is provisional because we’re still compiling data, and because there are margins of error in automatically interpreting gender based on first name.)
- Now Now (1945-1984): a quarterly of topical realist fiction. I’ve long been interested in fiction that deals with topical events. One of the tests we’re running on the archive is searching year-by-year for key terms from digitised newspapers from the same period. This tells us which current events were being written about. We’re excited to say that in addition to timeless tales about the bush and interpersonal relationships, there was a significant parallel stream of literary writing that dealt with current events; enough to have formed the basis for a magazine in its own right. Take 1965, the high point of Australian military involvement in Vietnam. For that year we’ve found 194 mentions of Vietnam across 47 pieces of unpublished writing. Most are in the realist vein.[ii]
- B (1963-1996): a quarterly of writing on Indigenous Australian subjects. Hoping to identify work in the archive from Aboriginal writers and compile it into its own retrospective magazine, we fed the search algorithm a wide range of published texts by Aboriginal writers, then asked it to identify work of a similar nature. Thus far we’ve found a stream of work that blooms from the second half of the 1960s across a wide range of forms (stories and poems but also letters, speeches, photos, etc.), including what may well be unpublished work from Kath Walker and Jack Davis. The complicating factor here is that there’s no reliable way to get an algorithm to select work by, rather than simply about, Aboriginal people. The process will find real gems, but it may also privilege external voices, and has already turned up a dozen rejected short stories by ‘Wanda Koolmatrie’: immature frauds by taxi driver Leon Carmen, before his My Own Sweet Time hoax took off. Search products claiming to reliably detect ethnicity from text analysis come from the security industry and are used in racial profiling and watch-listing – not exactly a comfortable fit for this project. We prefer to let the First Nations Australia Writers’ Network lead a team of writers and editors in determining how to use the material.
- Elect (1945-present): a triennial poetry journal devoted to angst-ridden polemics. We noticed a spike in the number of poetry submissions roughly every three years. We tried mapping this against arts funding, weather and economic growth cycles, but with no luck. Jaspreet solved the mystery by simply reading samples from the spike periods. It turns out that whenever the nation goes to a general election, its literary citizens turn to poetry to express their reactions to the results.
- Susurrus (1945-1987): a quarterly of experimental writing. There’s a lot of experimental writing in the archive. A surprisingly large amount, not least because in the wake of the Ern Malley hoax, experimental work in Australia lost all credibility. Yet, looking at the accumulating tally of bizarre pieces, it seems people were still writing it, but no one would publish it, perhaps for fear of being similarly taken in. Some work in this category is terrible, but there’s also plenty to get excited about, and it would have easily been possible to publish a regular magazine of edgy literature. We’re turning up formal experimentation, high literary modernism, obscure satire, all kinds of surrealist work produced under the influence of alcohol and psychedelic drugs, and, from the 1970s onwards, a small but consistent strand of generative writing: literature produced by automated processes.
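For the technically curious, the year-by-year key-term search described under Now Now can be sketched roughly as follows. This is a minimal illustration rather than the tooling we actually ran: the function name `term_mentions_by_year` and the toy `sample` corpus are invented for the example, and a real run would draw its key terms from digitised newspapers instead of a single hard-coded word.

```python
import re
from collections import defaultdict


def term_mentions_by_year(pieces, term):
    """Return {year: (total_mentions, pieces_mentioning)} for one key term.

    `pieces` is an iterable of (year, text) pairs. Matching is
    case-insensitive and on whole words only.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    totals = defaultdict(lambda: [0, 0])
    for year, text in pieces:
        hits = len(pattern.findall(text))
        if hits:
            totals[year][0] += hits  # every mention of the term
            totals[year][1] += 1     # pieces containing at least one mention
    return {year: tuple(counts) for year, counts in totals.items()}


# Hypothetical stand-ins for scanned submissions (not real archive text).
sample = [
    (1965, "He shipped out to Vietnam in May. Vietnam was all anyone talked about."),
    (1965, "The drought broke late that year."),
    (1966, "Letters from Vietnam arrived weekly."),
]

print(term_mentions_by_year(sample, "Vietnam"))
# {1965: (2, 1), 1966: (1, 1)}
```

Counting both total mentions and the number of pieces gives the two kinds of figure quoted for 1965: mentions, and the pieces they are spread across.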
Coming out of Queensland, there seems to have been a group of experimental writers submitting to pretty much every issue of every literary magazine through the 1970s and into the 1980s. The author names are widely varied, but the submissions all come from the same two postcodes. Have a look at this:
Jack Kangaroo is somewhat hungry. Jack Kangaroo wants to get some berries. Jack Kangaroo wants to get near the berries. Jack Kangaroo thinks about walking from a cave entrance to the bush by going through a pass through a valley through a field. Jack Kangaroo would take the berries. Jack Kangaroo thinks about eating the berries. The berries would be gone. Jack Kangaroo stays in the cave. Jack Kangaroo is still hungry.
Stories like this start showing up with great regularity from 1979. Pretty shit, right? Just like the editors, you’d dismiss them out of hand. Unless, that is, you’d come across James Meehan and the TALE-SPIN computer program written in the 1970s to generate stories. Meehan’s system takes a simple character, gives him or her a goal, then decides the outcome of the story through simple logic tests. It’s one of the earliest experiments in artificial intelligence and narrative. The above looks a lot like the output of his US system, only with an Australian character. We’ve turned up dozens of these stories in the SLV archive. Random? Apparently not.
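To give a feel for how little machinery a story like this needs, here is a toy sketch of the general idea: a character, a goal, and a couple of logic tests that decide the ending. It is emphatically not Meehan’s code. The function `tell_story` and all of its parameters are made up for illustration, and a real system of this kind would plan over a far richer model of the world.

```python
def tell_story(name, goal_item, location, home, is_hungry=True, path_is_safe=False):
    """Generate a tiny goal-driven story as a single string.

    Two simple tests decide the outcome: does the character have a
    motive (hunger), and does the plan survive scrutiny (a safe path)?
    """
    if not is_hungry:
        # No motive, so no goal and no plot.
        return f"{name} is not hungry. {name} stays in the {home}."

    lines = [
        f"{name} is somewhat hungry.",
        f"{name} wants to get some {goal_item}.",
        f"{name} thinks about walking from the {home} to the {location}.",
    ]
    if path_is_safe:
        lines += [
            f"{name} walks to the {location}.",
            f"{name} eats the {goal_item}.",
            f"{name} is not hungry.",
        ]
    else:
        # The plan fails its test, so the goal goes unmet.
        lines += [
            f"{name} stays in the {home}.",
            f"{name} is still hungry.",
        ]
    return " ".join(lines)


print(tell_story("Jack Kangaroo", "berries", "bush", "cave"))
```

Flip `path_is_safe` to `True` and the same character gets a happier ending. The flatness of the prose comes straight from the templates.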
You’ll recall another find from the archive, the realist manifesto New Writing for the Real Australia, and my attempts to discover who wrote it. The only reference I could find to the manifesto was from a speech given by a computer scientist at MIT in 1977. He essentially quoted the manifesto, saying that in good stories, “action may be retrospective or prospective, but rarely on the page.” Who was that computer scientist? James Meehan.
So here’s where things get interesting. Creating a mechanism to write stories like the one above isn’t hard. What’s hard is giving the results a sense of aesthetics. Meehan, a programmer rather than author, needed a set of guiding literary principles for his TALE-SPIN system. New Writing for the Real Australia provided exactly that.
We have no idea how Meehan came by the manifesto (our best guess is a collaborator in Queensland), but sure enough, when we downloaded the source code for Meehan’s TALE-SPIN, the manifesto was there, translated into machine logic: a list of symbols to build the story around; vocabulary and grammar choices reflecting a sense of melancholic longing; various natural settings; and a set of stereotypical traits used to create characters. Once this logic was plugged into the TALE-SPIN system, it spat out increasingly sophisticated tales of the Australian bush.
Now, in the artificial intelligence world, success means passing the Turing Test, where a person who interacts with your software believes they’re interacting with a human.[iii] With literature, that means readers assume a story was written by a human author. And what better way to run the test than submit your generated stories to the editors of magazines? Because that’s precisely what Meehan and subsequent generations of artificial intelligence programmers did. They sent their results to magazines, to see who could get a story picked up.
I don’t know what the editors made of the atrocious early attempts, but with the advent of natural language processing, and the birth of the earliest forms of the internet, both language and content became exponentially more sophisticated. In the 1980s, these computer-generated tales dwindle and finally disappear from the archive in the SLV. Did the programmers give up? Quite the opposite. They finally started passing the Turing Test, and machine-written stories made it into print.
It’s worth noting that while approaches varied, the underlying aesthetic principles stayed the same. Subsequent programmers simply inherited Meehan’s guiding literary rule-set. His logic (or rather, the logic of New Writing) was taken across into the US, to Europe and around the world. Over decades, dozens of artificial intelligence programs produced stories using a wide variety of grammars and technologies, but the ideal type of story they were trying to write remained the same.
All of this leads to the most startling discovery of the project, and what we’ve been dying to share: based on current analysis of the slush-pile archive, we now believe that until the end of 1996, and likely up until the present day, using New Writing for the Real Australia as the guiding logic, close to half of all short stories submitted and published in Australian journals, including this one, were written by machines.
[i] Shlomo Argamon and Mark Olsen. “Words, Patterns and Documents: Experiments in Machine Learning and Text Analysis.” Digital Humanities Quarterly, 3.2, 2009. This offers a good overview, and the ‘special cluster’ on Data Mining in this issue of Digital Humanities Quarterly has more on the specifics.
[ii] I pulled the published magazines from the same year for a quick comparison, and though it was only a cursory scan, I found just four mentions. One in particular jumped out, from David Martin talking about the future of Australian writing and its archetypes in Overland 33 (December 1965) p. 40: “He is nearly gone already, that hard-bitten bloke under the slouch-hat; his wings have started to moult not in heaven but here below. The digger who, somewhere in Vietnam, cleans his nails with his bayonet is not his brother but his step-brother, sired by a different father to a different likeness. The lad balloted into the army like a chook in a raffle to fight in an unjust war, won’t stir the folk imagination … Writers who [still] write about our old friend will write about a ghost. The roots have withered, in bush and city. Good-bye!”
[iii] You might not realise, but these days a vast number of news stories, reports and click-bait articles are generated by algorithms. If you don’t believe me, take your own Turing Test to see if you can spot AI-generated literature over at the New York Times.