More on dealing with unknown languages

Another installment on the mystery of the mystery languages.

First of all, I was right, and so was caelestis at (or le?) sauvage noble: the mystery language is Romansh. It is interesting to look at the differences between our approaches. Caelestis writes in his comment section:

For the record, I should state that all I went on was the MP3, the exercise having been billed as a quiz. I just played it looped and transcribed and transcribed.

Well, that’s what I did, too, except that I googled first and only started transcribing after settling on Romansh. But then he goes on:

I also decided not to transcribe too narrowly, opting instead to approximate orthography, on the principle that hints about historical spelling might also betray something of the text’s language’s history.

I, on the other hand, took it as an exercise in using IPA, playing around with the values of the symbols and trying to get them right. Of course, I was also keen on actually understanding the recording, so I did think about the sense (adjective + noun combinations were particularly easy to identify, pi might mean more (plus in French); I neglected to look for possible occurrences of the conjugated forms of be, which would have been logical to do).

Focussing on the sound and the intricacies of IPA led to a few difficulties. For example, the speaker says religiun(s) several times, but sometimes the stressed syllable comes out as [dʒun] (with a voiced postalveolar fricative), sometimes in my ears more like [djun] (palatal approximant) or, between the two, [dʝun] (palatal fricative). Should I have made a distinction between the occurrences? I decided against a transcription that narrow.

One of those light bulbs that go on in one’s mind came after I wrote the last entry: Grischun is the canton of Graubünden, Grisons in French. (I’m really bad at French geographic names for places outside France that I already know the German and/or English version of. Imagine the consternation of the French friend who once told me he liked Aix-la-Chapelle, when I replied I had no idea of what place he was referring to.)

As far as understanding the recording is concerned, caelestis’ transcription has, in my opinion, the edge. I understood two more snippets from reading his. Other meanings came to me later. For example, the passage un model che corresponde miglier a lur habilitats (in an approximation of what the spelling might be) obviously means a model that better corresponds to their capacities. So the sentences before that should contain the antecedent of their, presumably referring to pupils. But where and what is it? It took me a thorough look through this page in Romansh dealing with school questions (actually, with the very issue of the place of the language in education) to find out that it is scolasts (for boys) or scolastas (for girls). Which in turn points to a tentative acuire as the verb of the third sentence. For the rest of it, caelestis is quite helpful: some aspects of the school system “come into question” (similar to the German idiomatic expression which could be calqued as put something into question). And so forth.

The bits on which we don’t agree (l/m/n? o/u?) don’t seem very important, and may be very hard to get “right” anyway: the language is called Rumantsch by its speakers and Romansh in English, most of the time, anyway. The presence or absence of the t illustrates a similar difficulty: saying [nʃ] is hard without an intruding stop/plosive ([t]). Is this stop part of “how the language is pronounced” or just of “how a particular speaker sometimes pronounces it”? It may be one or the other, depending on the particular language. But since it was unknown in the first place …

The transcriptions that the unique and in many ways excellent speech accent archive proposes aren’t totally uncontroversial either, however instructive they may be. There’s rarely a distinction between clear and dark l, and most pronunciations of the English w are transcribed as [w], even though there are quite a few instances of [v] and [β] in their samples.

The standard variety Rumantsch Grischun appears to be the equivalent of Hochdeutsch in German: a normalized compromise agreed upon for teaching purposes, to have a unified written language and, in the case of Romansh, to keep the language alive and legitimize it, but one that hardly anyone speaks in its pure form. It is, with tudestg, franzos and talian, one of the four “national languages” in Switzerland. (Calling French franzos makes a German speaker smile with amused embarrassment since this sounds vaguely insulting in German.) In German, it is perfectly acceptable if the phonetic features of one’s region of origin’s dialect shine through even in the most formal speech situations. Romansh apparently has five distinct dialects, and I agree with the Debian geeks (er, and thanks for providing the OS that runs my computer!), and Mark Liberman’s justification, that the speaker’s dialect is Surmiran. Any French reader who has made it to this point of this post will easily recognize the text used in the dialect samples. Yes, I know you prefer the version that every French school kid learns by heart.

I have not made any headway to speak of with Welsh. Those conjugation tables of bod need learning by heart, and as long as I’m hampered by a very vague understanding of the pronunciation, I hesitate. For not too long, hopefully.

Edit: I originally was wrong about the official status of Romansh in Switzerland. Corrected now. Thanks, Steph!

3 comment(s) for 'More on dealing with unknown languages'

  1. (Comment, 2004-10-26 11:46 )

    I’m no linguist but I could guess read some of the Romansh from knowing Italian. Do you know which neighbouring language is closest to it?

  2. (Comment, 2004-10-27 17:48 )

    Since it belongs to the group of Rhaetian languages, the closest would be Ladin and Friulian. Ladin gave me this feeling of “I’ve already heard this language” since I twice spent a few weeks in the Ladin-speaking Alto Adige (Dolomite region in northern Italy). Ethnologue has a family tree of this branch, which suggests that of the major Romance languages, French and Provençal should be closest to Romansh, then a number of languages spoken in northern Italy (Emiliano-Romagnolo, Ligurian, Lombard, Piemontese and Venetian). This may be right, but doesn’t tell us anything about the pronunciation, which for French would blur matters considerably. What is closest genetically doesn’t necessarily have the closest “phenotype”, if I may borrow terms from biology.

  3. (Pingback, 2006-02-07 22:20 )

    […] Mark Liberman at Language Log has posted a second transcribe-and-guess-the-language quiz. I believe most readers of this blog are interested in this sort of question, so you probably know this already. As one of those who got the first one right, I couldn’t resist of course. (More seriously, though, it’s an excellent exercise.) […]