Toki Pona Net Forum

Full Version: Treasure hunt for missing forum texts
Anyone want to help with the treasure hunt for forum texts from June 2011 to Dec 2012 in the Google cache? If the forums are gone for good, then as soon as the Google cache clears, those texts will be lost forever. I don't have a copy and doesn't have a copy. I do have a copy of the forums as of about June 2011.

To help, go to Google and enter a search like this:

Texts of a paragraph or more are generally the most interesting for corpus work, i.e. when you are trying to see if anyone has ever used such and such a construction.
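
For the "has anyone ever used such and such a construction" kind of question, a minimal sketch of a corpus search might look like the following. The tiny `CORPUS` dictionary, the `find_construction` helper, and the naive sentence splitting are all assumptions made for illustration; real texts would come from whatever forum posts get recovered.

```python
import re

# Hypothetical mini-corpus keyed by source label; stands in for recovered posts.
CORPUS = {
    "post-1": "mi moku e kili. sina pona tawa mi.",
    "post-2": "jan li toki e ni: mi wile moku.",
}

def find_construction(corpus, pattern):
    """Return (source, sentence) pairs whose sentence matches the regex."""
    regex = re.compile(pattern)
    hits = []
    for source, text in corpus.items():
        # Naive split: break after sentence-final punctuation followed by space.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if regex.search(sentence):
                hits.append((source, sentence))
    return hits

# Has anyone ever written "wile moku"?
hits = find_construction(CORPUS, r"\bwile moku\b")
for source, sentence in hits:
    print(source, "|", sentence)
```

Keeping the source label next to each hit makes it easy to quote the text with attribution later.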

If you find anything, post it here, preferably with attribution. It's legit, since everything on the forum was CC licensed.
The forums seem to be back at the old URL. Stuff from last month up to the disappearance is there, running back into 2011, and the old site is there as well.
Can I help in any way with this process? Or is the corpus essentially made?
Contact me
jan Pije wrote a bunch of websites that went up and down; they were the forerunners of the current toki pona lesson plan. He also created some translations of children's books by scanning them and swapping out the English for toki pona text. Those are mostly missing, but could exist somewhere on the web. The old jan Pije stuff exists on but is a pain to search.

The other best way you can help is to write texts and release them into the public domain or under CC attribution. That allows other people to use them to move toki pona forward, either by editing them for a future toki pona anthology or just as another data point for amateur linguists looking to see "has anyone ever said it *that* way?"
And when you publish, pick a web site or two that is likely to stay up for a few decades, like Blogger (but who knew that LiveJournal would see hard times or that GeoCities would go down altogether?).

The closest thing that exists to a corpus is mine, see here: We don't have a canonical corpus because jan Sonja actually didn't write that much text (at least not a lot that survives; for all I know she wrote a book's worth on IRC, almost all of which is lost). So jan Pije's texts and the stuff that self-appointed toki ponists consider to be good toki pona stand in for one (e.g. many of the group-edited longer texts from the forum are very good, and if they said it *that* way, well, they are probably right).
Thank you, I will be looking through this.
I found the English-Toki Pona Dictionary on searching through the older This is a great source, I think, for usages (there are many example sentences).

I've spoken with the creator of the "Live Search Toki Pona Dictionary" and he said the following about the code he made for that:

Quote: Hi, yes, there is an HTTP request (AJAX) that connects to a PHP script here. If you are interested to see how it was done, you can download the source and do what you want with it…

What I want to do is use that same code to fill out the bidirectional dictionary, even sourcing it the way jan Sonja's Esperanto dictionary was sourced (when/if possible). The way the code goes, even if the site goes down, a .txt file of all the entries will exist.
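
As a rough sketch of that idea (not the Live Search dictionary's actual code, which isn't shown in this thread), a plain-text file of entries could be loaded into two lookup tables, one per direction. The one-entry-per-line `word: gloss, gloss` layout and the names `parse_entries` and `build_bidirectional` are assumptions for illustration only; the real .txt file may be laid out differently.

```python
from collections import defaultdict

# Hypothetical sample in an assumed "word: gloss, gloss" format;
# the real .txt file's layout is not described in the thread.
SAMPLE = """\
moku: food, eat
telo: water, liquid
pona: good, simple
"""

def parse_entries(text):
    """Yield (toki_pona_word, [english glosses]) for each well-formed line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        word, _, glosses = line.partition(":")
        yield word.strip(), [g.strip() for g in glosses.split(",") if g.strip()]

def build_bidirectional(text):
    """Return (tp_to_en, en_to_tp) dictionaries built from the entry text."""
    tp_to_en = {}
    en_to_tp = defaultdict(list)
    for word, glosses in parse_entries(text):
        tp_to_en[word] = glosses
        for gloss in glosses:
            en_to_tp[gloss].append(word)  # reverse index: English -> toki pona
    return tp_to_en, dict(en_to_tp)

tp_to_en, en_to_tp = build_bidirectional(SAMPLE)
print(tp_to_en["moku"])
print(en_to_tp["water"])
```

Because both tables are rebuilt from the same text file, the file stays the single source of truth, and a saved copy is enough to reconstruct the dictionary even if the site disappears.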