Wikipedia talk:WikiProject Languages/Archive 12

This is an archive of past discussions on Wikipedia:WikiProject Languages. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.


Amami and Kunigami

I propose to:

Rename "Amami language" (no glottocode) to "Northern Amami languages".
Rename (1) "Kunigami language" (no glottocode) to "Okinoerabu–Yoron-Northern Okinawan languages" and (2) "Northern Okinawan language" (kuni1268) to "Kunigami language".

In short, my solution is to give preference to Ethnologue, glottolog and other linguistic literature over the UNESCO Atlas of the World's Languages in Danger.

As I reviewed at Amami–Okinawan languages, there are numerous theories on classification of the languages in the Amami and Okinawa Islands. What is worse, names given to proposed taxa are far from standardized. As for Amami, there are at least 4 distinct entities, three of which have their own articles:

Northern Amami Ōshima (glottocode: nort2935) is spoken on the northern half of Amami Ōshima (a large island).
Northern and Southern Amami Ōshima (sout2954) form Amami Ōshima (oshi1235), for which we do not have an article.
Amami Ōshima, Tokunoshima (toku1246) and possibly Kikai (kika1239), in other words, the northern half of the Amami Islands (a group of islands), form Northern Amami (no glottocode), for which we have an article under the misleading title of Amami language.
Northern Amami and Southern Amami (Okinoerabu (okin1246) and Yoron (yoro1243) and possibly Kikai) form Amami (amam1245), for which we have an article under the title of Amami languages.

The problem is that we currently give undue weight to the UNESCO Atlas of the World's Languages in Danger, which gives the name of "Amami language" to the third entity (no glottocode):

UNESCO only represents one of numerous theories on classification, the one proposed by Uemura (1972) and Karimata (2000).
The third entity (no glottocode) does not have a standardized name. Uemura (1972) and Karimata (2000) do not use the name of "Amami languages". The misleading name was invented by UNESCO. This is the main cause of confusion.
Moreover, recent publications such as Pellard (2009) and Heinrich et al. (2015) dismiss the classification UNESCO adopted. They re-evaluate an older theory.

I have two possible solutions to the problem.

Merge the third entity into the fourth entity. Both are higher-level taxa and all we can say about them is classification. But I don't like this option. I think the article-per-entity policy makes things easier to understand.
Rename the third entity (no glottocode) to "Northern Amami languages" per Nakamoto (1990), or "Amami–Tokunoshima languages" per Karimata (2000). I prefer the former title.

As for Kunigami, we have two distinct entities.

Kunigami (kuni1268) is placed under the title of Northern Okinawan language because we give undue weight to UNESCO.
The title of Kunigami language is currently occupied by another entity (no glottocode). Calling this supergroup Kunigami is UNESCO's invention. Moreover, the existence of this taxon is dismissed by recent publications.
The solution is to rename the entity (no glottocode) to "Okinoerabu–Yoron-Northern Okinawan languages" per Karimata (2000) to give way to Kunigami (kuni1268).

Unfortunately, I don't have time to continue the discussion as my short vacation is almost over. But as a rogue is gone, there will be no disruption. --Nanshu (talk) 06:57, 16 August 2015 (UTC)

@Nanshu: the only disruption comes from when you make your sweeping changes to Wikipedia and expect no one to challenge you due to your near-divine wisdom.

I firmly oppose your first proposition. You've literally copy-pasted whole articles with minor changes to meet your agenda. I'm sure your next suggestion would be to rename the articles as "dialect clusters".

Your push for glottolog is laughable considering that it calls your "Amami–Okinawan" classification "Northern Ryukyuan". If you're willing to relent and give up Northern Okinawan language, why not that name change too? Besides, the UNESCO classifications are also based on Nakasone.

Kunigami is under "Northern Okinawan" not because of UNESCO, but because you created a content fork and put it there yourself. Now you may be right that the current article is about a "supergroup", but Nakasone also included Yoron, Kikai, etc as part of Kunigami. Your solution of introducing a taxon made up and supported by a single scholar is also laughable, especially considering there is no glottocode for it.

So here are my counter-solutions:

rename Amami–Okinawan languages to Northern Ryukyuan languages.
redirect Northern Okinawan language to Kunigami language.
expand/rewrite all of the articles falling under Amami languages.
rename Yoron language, Kikai language, etc to Yoron dialect, Kikai dialect, etc or classify them as Okinawan languages.

Do respond soon. ミーラー強斗武 (StG88ぬ会話) 08:55, 17 August 2015 (UTC)

Rather than just saying "Move X to Y", etc., etc., please post here a simple outline of how you see the finished language classification and compare it to reliable sources. That will make it a lot easier for non-specialists to look at what you have and what you are proposing. This is obviously controversial, so the easier you can make your presentation(s) for the rest of us, the better. --Taivo (talk) 13:28, 17 August 2015 (UTC)

@Taivo: here are my proposed classifications, in order of preference:

Ryukyuan languages
- Northern Ryukyuan languages
  - Amami language
  - Okinawan languages
    - Kunigami language
    - Okinawan language
- Southern Ryukyuan languages
  - Miyako language
  - Yaeyama language
  - Yonaguni language

Ryukyuan languages
- Northern Ryukyuan languages
  - Amami languages
    - Northern Amami language
    - Southern Amami language
  - Okinawan languages
    - Kunigami language
      - Kikai dialect
      - Toku dialect
      - Okinoerabu dialect
      - Yoron dialect
    - Okinawan language
- Southern Ryukyuan languages
  - Miyako language
  - Yaeyama language
  - Yonaguni language

Ryukyuan languages
- Northern Ryukyuan languages
  - Amami languages
    - Northern Amami language
    - Southern Amami language
  - Okinawan languages
    - Kikai language
    - Toku language
    - Okinoerabu language
    - Yoron language
    - Kunigami language
    - Okinawan language
- Southern Ryukyuan languages
  - Miyako language
  - Yaeyama language
  - Yonaguni language

The first is my preferred classification, the way it was before Nanshu created almost a dozen copy/paste articles. The second is a compromise with Nanshu, working with the existence of his articles but still favorable to my preference. I think the second classification is the best solution. The last classification is a further compromise with Nanshu and a better solution than his proposed "Okinoerabu-Yoron-Northern Okinawan languages" taxon. Besides, Kikai dialect/language shouldn't be grouped with Amami language because it is linguistically closer to Kunigami language. I will add reliable sources later today. ミーラー強斗武 (StG88ぬ会話) 15:43, 17 August 2015 (UTC)

Which articles are the "copy/paste" ones? If they offer no new content, but are just placeholders for names, their existence is questionable and they might be eligible for speedy deletion. --Taivo (talk) 16:10, 17 August 2015 (UTC)

@Taivo: Northern Amami Ōshima language, Southern Amami Ōshima language, Kikai language, Yoron language, Okinoerabu language, Tokunoshima language, Amami language, Amami–Okinawan languages, Amami languages, Northern Okinawan language. I'll add more or remove them after I recheck them (sorry if some of them had been reverted/redirected already). ミーラー強斗武 (StG88ぬ会話) 16:33, 17 August 2015 (UTC)

@Taivo: The latest and most up-to-date source on the Ryukyuan languages, Handbook of the Ryukyuan Languages by Patrick Heinrich et al (2015), has a classification chart on page 15 very similar to my first proposed classification (the chart places Yaeyama and Yonaguni in a supergroup). It also uses "Northern Ryukyuan" as opposed to "Amami–Okinawan". Here is a link to the book in Google books. ミーラー強斗武 (StG88ぬ会話) 23:50, 18 August 2015 (UTC)

Taivo, this proposal is not about choosing one classification. This is clearly against our NPOV policy. The problems and solutions are actually simple if you are not fooled by User:Sturmgewehr88's attempt to dodge the questions.

First of all, as I reviewed at Amami–Okinawan languages, there are numerous theories on classification of the languages in the Amami and Okinawa Islands. In our terminology, there are multiple POVs. It is clear what we should do in such a case: representing all of the significant views, fairly, proportionately, and, as far as possible, without bias. This is the very point Sturmgewehr88 is trying to obscure. He is trying to push one POV and kill anything else.

Before going into the controversial part of the story, we need to review what is not controversial. We are talking about taxonomy. Near-top and near-bottom taxa are relatively stable. Near-bottom taxa are:

Northern Amami Ōshima (nort2935, ryn)
Southern Amami Ōshima (sout2954, ams)
Tokunoshima (toku1246, tkn)
Kikai (kika1239, kzg)
Okinoerabu (okin1246, okn)
Yoron (yoro1243, yox)
Kunigami (kuni1268, xug)
Central Okinawan (cent2126, ryu)

As you can see at Kikai language and Okinoerabu language, there are non-negligible variations within these taxa, but they are well established. They can be found even in the old Okinawa-go jiten (1963). Also note that they are given ISO 639-3 codes by Ethnologue.

The near-top taxon is

Amami–Okinawan languages (nort3255) aka Northern Ryukyuan

Again, this taxon can be found even in the old Okinawa-go jiten (1963) and in Ethnologue's classification.

Interestingly, Sturmgewehr88 and Ryulong (now blocked indefinitely) tried hard to delete articles for these taxa, and they failed:

You can also see that Sturmgewehr88 has weird notions of WP:SYNTH, WP:NOR and WP:CFORK, that are very different from ours. Do not take these labels seriously.

This month I created missing pieces: Northern Amami Ōshima (nort2935, ryn), Southern Amami Ōshima (sout2954, ams) and Kikai (kika1239, kzg).

Next, what is controversial? All intermediate taxa are more or less controversial.

Amami (amam1245)
Northern Amami (no glottocode)
Okinoerabu–Yoron-Northern Okinawan (no glottocode)
Okinawan (okin1244)

These are products of comparative linguistics starting from the 1970s. If we choose some taxa, others will be displaced. For example, if we follow the two-subdivision hypothesis, which was recently re-evaluated by Pellard (2009), Okinoerabu–Yoron-Northern Okinawan (no glottocode) becomes an invalid taxon.

What should we do to maintain NPOV? There are two options:

Keep articles for intermediate taxa even if they are invalid in some theories.
Merge controversial taxa into higher-level ones (i.e., Amami–Okinawan languages (nort3255)).

I prefer the former option. But if you really wish to choose the latter, I will not object strongly. A top priority is given to NPOV.

Now you can see why Sturmgewehr88's personal classifications above are meaningless and rather harmful. By posting multiple classifications, he gives a false impression that he seeks compromise. Don't be fooled. Sturmgewehr88's true objective is to kill POVs he doesn't like. Specifically, he is trying to push a minority view presented by UNESCO.

Because his objective is clearly against NPOV, he is trying to obfuscate the situation. For example, he still refuses to accept the simple, clear fact that there are two distinct taxa for Kunigami:

Northern Okinawan (kuni1268), a well-established taxon. Kunigami usually refers to this taxon.
Okinoerabu–Yoron-Northern Okinawan (no glottocode), a controversial taxon. The name of Kunigami is only given by UNESCO.

He made blind reverts[1][2] to keep you away from reality. He made groundless accusations against me, such as content fork and copy/paste, to dodge the point.

Now go back to my first post in this thread. We must adhere to NPOV, but we sometimes need to choose one among several for technical reasons. Article titles are such a case, and we have policies on this. My proposals are a variant of the non-neutral-but-common-names policy.

The basic policy is to give preference to the two-subdivision hypothesis, in consideration of latest publications in which Thomas Pellard, a vocal critic of UNESCO's classification, re-evaluates the two-subdivision hypothesis.
We implement this policy by not using common names reserved for other taxa in the two-subdivision hypothesis:
1. In the two-subdivision hypothesis, Amami refers to Amami (amam1245). Fortunately, UNESCO's "Amami language" (no glottocode) is still a valid taxon under the two-subdivision hypothesis but should be referred to by Northern Amami per Nakamoto (1990).
2. In the two-subdivision hypothesis, Kunigami refers to Northern Okinawan (kuni1268). UNESCO's Kunigami (no glottocode) is an invalid taxon under the two-subdivision hypothesis. It should be referred to by Okinoerabu–Yoron-Northern Okinawan per Karimata (2000).

These names are more descriptive too.

Northern Amami (no glottocode) is spoken in the northern half of the Amami Islands, not in the whole island group.
Okinoerabu–Yoron-Northern Okinawan (no glottocode) is spoken not only in the northern portion of Okinawa Island (aka Kunigami) but on Okinoerabu and Yoron Islands.

To summarize, I'm discussing how to implement NPOV while Sturmgewehr88 is trying to push a minority POV. Nanshu (talk) 14:28, 19 August 2015 (UTC)

You're accusing me of POV pushing? You're one to talk. You've cherry-picked bits of reliable sources that support your position and ignored what contradicts you, and then synthesized them to create your own classifications. And you created Northern Okinawan language, which is literally just your version of Kunigami language, a textbook example of a content fork. So maybe your definition of these policies is one thing, but it is definitely not what the policies actually say.

As for Nanshu's claim of me trying to suppress the other classifications, he obviously doesn't understand. The infobox for languages doesn't support more than one classification, let alone three. The same goes for the article titles. I'm not at all objecting to mentioning the various classification theories within the article; however, my above proposal is for the classification used for infoboxes and article titles on Wikipedia, which is supported by the majority of reliable sources. Also, most reliable sources don't recognize Kikai, Tokunoshima, Okinoerabu, and Yoron as independent languages, and Heinrich et al (2015) are more inclined to recognize dialects of Yaeyama as independent.

As for the AfDs, per WP:FORK the Northern Okinawan language article rightfully needed to be deleted. Amami–Okinawan languages, in retrospect, should be renamed instead of deleted. The other AfDs were only supported by Ryulong.

I wouldn't oppose Nanshu's edits if it weren't for his policy violations, and the fact that he keeps degrading these articles to nothing more than arguments about the classification of "dialect clusters". He also repeatedly removes the native name of these languages and claims that the speakers have no self identity (you know, because 国頭口 Yanbarukutuuba was just made up by someone on Wikipedia). ミーラー強斗武 (StG88ぬ会話) 00:56, 20 August 2015 (UTC)

Break

My problem is that the organization is completely incoherent. Ignoring Kunigami for now, we have

Amami
- N. Amami
- S. Amami

Currently, we have both Amami language and Amami languages, also Northern Amami Ōshima language, Southern Amami Ōshima language. Now, the Amami languages are a dialect of the Amami language, and N. and S. Amami are dialects of the Amami languages but not of the Amami language. The Amami language family is equivalent to the northern Amami dialect. Or so it seems. What I'd like to see is a list of uncontroversial clades followed by their ISO or glotto codes and links to their articles.

Here's the classification in Pellard (2009), which was adopted by glottolog:

Northern Ryukyuan (8)
- Amami (6)
  - Kikai [kzg]
  - Nuclear Amami (4)
    - Okinoerabu-Tokunoshima (2)
      - Oki-no-Erabu [okn]
      - Toku-no-Shima [tkn]
    - Oshima (2)
      - Northern Amami-Oshima [ryn]
      - Southern Amami-Oshima [ams]
  - Yoron [yox]
- Okinawa (2)
  - Central Okinawan [ryu]
  - Kunigami [xug]

Which of these should we have articles on? We don't need an article for every lect that has an ISO code, nor every clade that has a glotto code, nor do we need to use the ISO or glottolog names or use their classifications. — kwami (talk) 00:50, 21 August 2015 (UTC)
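For anyone who wants to check the tree above mechanically, here is a minimal Python sketch. The nested-dict layout and the function names are assumptions made for illustration only; the clade names and ISO 639-3 codes are copied from the listing. It recomputes the per-clade language counts shown in parentheses.

```python
# Clades are dicts; languages are ISO 639-3 code strings.
# Names and codes are copied from the tree quoted above; the
# nested-dict layout itself is only an illustrative assumption.
TREE = {
    "Northern Ryukyuan": {
        "Amami": {
            "Kikai": "kzg",
            "Nuclear Amami": {
                "Okinoerabu-Tokunoshima": {
                    "Oki-no-Erabu": "okn",
                    "Toku-no-Shima": "tkn",
                },
                "Oshima": {
                    "Northern Amami-Oshima": "ryn",
                    "Southern Amami-Oshima": "ams",
                },
            },
            "Yoron": "yox",
        },
        "Okinawa": {
            "Central Okinawan": "ryu",
            "Kunigami": "xug",
        },
    }
}


def leaf_codes(node):
    """Return the ISO 639-3 codes of every language under a node."""
    if isinstance(node, str):
        return [node]
    codes = []
    for child in node.values():
        codes.extend(leaf_codes(child))
    return codes


def clade_sizes(node, sizes=None):
    """Map each clade name to the number of languages it contains."""
    if sizes is None:
        sizes = {}
    for name, child in node.items():
        if isinstance(child, dict):
            sizes[name] = len(leaf_codes(child))
            clade_sizes(child, sizes)
    return sizes


SIZES = clade_sizes(TREE)
```

Under this layout, trying a different classification only means rearranging the dictionary; the language-level entries (the ones with ISO codes) stay the same, which is roughly the point about intermediate nodes being the unstable part.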

Well, I agree with what Kwamikagami said. ミーラー強斗武 (StG88ぬ会話) 02:10, 21 August 2015 (UTC)

@Kwamikagami: so you're endorsing the Pellard classification, correct? If so, (and by "we don't need an article for every [dia]lect") which lects do you believe should or should not have an article? ミーラー強斗武 (StG88ぬ会話) 00:24, 22 August 2015 (UTC)

Sorry if I intrude in the discussion. We cannot endorse any classification as correct. The problem arises when we have to deal with the structure of Wikipedia articles, as noted: do we need one article for each languoid (be it a language, dialect, clade, group, lect, sociolect, doculect, etc.)? Well, maybe not an article each, but at least they should all be mentioned somewhere. Still, the problem remains how to structure articles, infoboxes and categories. I have no immediate answer for that, but it clearly creates conflicts. --SynConlanger (talk) 11:38, 22 August 2015 (UTC)

@SynConlanger: maybe we can't "endorse" any one classification, but we can only use one to organize/name these articles. Nanshu has been cherrypicking or making up his own taxa to create his own classification for Wikipedia to abide by, then he guts the articles and leaves them with 80% of the content being a rant about how no one agrees on classification as if the only significance of these "dialect clusters" is this "debate". Just look at the edit history for Amami language or Kunigami language. So, although you have no immediate answer, what would you possibly propose? ミーラー強斗武 (StG88ぬ会話) 16:56, 22 August 2015 (UTC)

I'm with you in that. Regarding the infoboxes, one drastic solution would be to just give the highest level linguistic family. A moderate solution could be to give the lowest universally accepted classification level from the top: in some cases it could be just the top level. The problem with the second is that if you miss the one source that doesn't accept what the other 50 sources do, well, that's an issue. The same solutions could be applied to categories. As for the article content, well, as you have already said we report all the classifications proposed to date.

The problem with infoboxes and categories arises from when Wikipedia was born and someone decided to provide a detailed classification there, probably without knowing that classification is highly problematic in most cases: not only which branch a languoid (in Cysouw and Good's sense) belongs to, but also how languoids could be nested (sorry for the terminology, but it's easier to keep track of what we're talking about). We are not obliged to give that kind of classification at the infobox and category level. We simply cannot accept one classification over the other based on acceptance rate among linguists (how could we even assess that? Moreover, the rate may and does change over time).

Which articles should we have? All of them, with a sensible redirect structure. See for example Mawayana and the Mapidian: classification is problematic, existence is problematic. According to some sources they are the same language but nonetheless we have both. According to some, Mawayana is unclassified, but according to others it is not: so what happened? Well, nothing... because I didn't make any change so far regarding that. 😄 The same goes with Romance languages of Italy (plus the useless debate language/dialect). But it would be nice to discuss the general problem with you all, as we are doing. What are your impressions? --SynConlanger (talk) 18:05, 22 August 2015 (UTC)

(edit conflict) Personally, I'd only have articles on clades above the language level that are clearly valid, or are well-known in the lit, or have been reconstructed. When we have articles on nodes in a particular classification, we run into the question of whether we should delete those articles if we change classifications. A lot of wasted work. Something like Southern Ryukyuan languages, for example: Is there any info there that wouldn't be more convenient in the main Ryukyuan languages article? I suppose it's no problem as long as it is universally accepted, or nearly so, or if "Southern Ryukyuan" is s.t. a reader would often come by in the lit, but I wonder if the intermediate nodes of Northern Ryukyuan deserve articles when sources don't agree on which languages are Okinawan and which are Amami. I might be responsible for Amami language, as I wasn't going to bother with a stub for every ISO language just because it had an ISO code. The new N Oshima and S Oshima articles could be merged into Amami, or Amami could be deleted. I doubt we need both.

Re. SynConlanger's comments on the info box, I've often omitted unimportant intermediate clades and left established branches, except for the clades immediately above the language in question. Usually we've chosen a particular classification as best supported, in which case we give its details, but when we can't do that we can be agnostic by omitting questionable superior nodes, providing one with a question mark, or even providing two (X or Y?). — kwami (talk) 18:11, 22 August 2015 (UTC)

What is "above language level"? That's the problem: sources may be unclear. But I agree that in some cases (to be discussed on a single basis) we could do without an article about a branch. Or we could redirect the name to an article that states that classification is problematic and mentions that branch. I think it is not really relevant if there is a separate article or not: as long as there is that bit of info stated somewhere. The problem, again, resides in infoboxes and categories. Choosing to omit intermediate levels relies on the editor and that could lead to NNPOVs, IMHO. As for which of the two (or more) conflicting branches that can give the title to the article, I think we could initially choose the one that would have more content and give the other as a redirect or similar: we could always move it later if a proposed classification becomes obsolete. I see no problem with that. However, I would rather have a separate article stub if having a redirect would create problems (and vice versa). --SynConlanger (talk) 18:28, 22 August 2015 (UTC)

P.S.: as you say, we cannot use naming inventions by UNESCO (or whichever non- or near- linguistic association—I was just reading the "Interpreting Ethnologue data" section on this project, very well formulated. I think it applies in general to any world-wide classification initiatives.) 😊 --SynConlanger (talk) 18:37, 22 August 2015 (UTC)

Yes, I threw that in to counter people who claim Ethn. as the end-all RS. It's mostly from discussions I've had with other linguists, as well as problems I've encountered using it on WP. I'm glad you don't think it's too POV. — kwami (talk) 20:36, 22 August 2015 (UTC)

I've made a start at cleaning things up, and also changing faux IPA into real IPA. The real question I see at this point is whether we wish to keep Amami language (presumably reducing its scope to Ōshima) or forego it for the two new articles. Don't much care, but for the latter we should probably move it to Northern Amami Ōshima language to preserve the page history. (We can get an admin to merge the histories.) — kwami (talk) 20:24, 22 August 2015 (UTC)

Also, we need a good phonetic description of what the glottalized stops are. Are they ejectives? — kwami (talk) 20:31, 22 August 2015 (UTC)

BTW, just found a very nice volume:

Heinrich, Miyara, & Shimoji eds. (2015) Handbook of the Ryukyuan Languages: History, Structure, and Use. Walter de Gruyter.

Yes, I linked to the Handbook in Google books above. ミーラー強斗武 (StG88ぬ会話) 21:39, 22 August 2015 (UTC)

I debated whether Ōshima should be one article or two. There was a lot of overlap, and the Northern and Southern articles were partly defined in opposition to the other and used many of the same sources, so it made more sense to me to have a unified article. But revert me (and please clean them up!) if you decide they should be split. As with other articles, I trimmed the classification section down to what was relevant per CONTENTFORK. — kwami (talk) 00:35, 23 August 2015 (UTC)

@Kwamikagami: I'm glad we're making progress. What about moving Amami–Okinawan languages to Northern Ryukyuan languages? I requested the page move already, per WP:COMMONNAME after browsing the sources. The only post-1970s English source to use "Amami Okinawan" is Ethnologue. On the Amami–Okinawan article itself, the only source besides Ethnologue to use "Amami Okinawan" is a Japanese source that uses 奄美沖縄方言群 (Amami Okinawa dialect family), but then gives the English translation as "Northern Ryukyu dialects". Most of these articles only discuss the Japanese classification theories and refer to these languages as dialects and need to be rewritten or added upon. ミーラー強斗武 (StG88ぬ会話) 04:26, 23 August 2015 (UTC)

Fine by me. And IMO we can keep S. Ryukyuan for balance even though it has practically no unique content. — kwami (talk) 15:25, 23 August 2015 (UTC)

At Amami language, does anyone know where the village of "Ōshama" is? Is that just a typo for "Ōshima"? — kwami (talk) 15:58, 23 August 2015 (UTC)

@Kwamikagami: ~~My gut tells me it's a typo. He typed it twice, but he refers to it once as "Ōshama (Southern)", which means it's probably a typo.~~ It's definitely a village somewhere in Southern Amami Ōshima. ミーラー強斗武 (StG88ぬ会話) 18:58, 23 August 2015 (UTC)

Looks like there hasn't been any controversy to the changes. As for the question I missed above about endorsing Pellard (2009), I wasn't. I chose it as an example because it was convenient. Glottolog chose it, which likely means that the editors over there thought it was the best classification available (at least in a language they could read; that might not include Japanese). Ethnologue doesn't say where their classification comes from, though one of you can probably identify it. I have no problem using a different classification (if we have 2ary sources verifying that it's respected), and our articles don't follow blindly: lects between Amami and Okinawa currently list both as their parent node in the info box, for example, and a few others have question marks. That's from my clean-up efforts. If I've misread the lit, please fix. — kwami (talk)

@Kwamikagami: I think the lack of controversy is more due to Nanshu having not edited in a week than a lack of a challenge. I'll be honestly surprised if Nanshu doesn't throw a fit over these changes.

Anyhow, Ethnologue probably uses Karimata (2000), but I'd have to double check. Which classification did you use for those infoboxes? ミーラー強斗武 (StG88ぬ会話) 03:04, 26 August 2015 (UTC)

Originally I used whatever we happened to be using on the Ryukyuan languages at the time. I changed things a bit to reflect the different classifications we now have there (like whether there's a third branch in N.Ryukyu), though honestly I couldn't see much difference between many of them except for the names. — kwami (talk) 04:45, 26 August 2015 (UTC)

Ah, per Martin (1970), describing S. Ōshima but apparently for other N. Ryukyuan as well, the "glottalized" stops are actually just plain unaspirated, whereas the "plain" stops are aspirated. I assume the "glottalized" nasals and approximants are actually glottalized. — kwami (talk) 01:35, 27 August 2015 (UTC)

Bhojpuri language § Politeness

Bhojpuri language § Politeness gives several contradictory values for the number of levels of politeness. See Talk:Bhojpuri language § Politeness for details. --Thnidu (talk) 02:13, 27 August 2015 (UTC)

Modifying the IPA edit tool

You might have noticed that the IPA edit tool under your edit window has changed. We've gotten rid of those inconvenient carrier letters now that MediaWiki coding can accommodate bare diacritics. But I'm wondering if we might want to go a bit further, and mark off consonants, vowels, etc. into labeled subsections for easier access. (Also, should we bother to keep ɧ and ɶ?) Chime in at MediaWiki_talk:Edittools#Request_for_comment if you have an opinion. — kwami (talk) 23:28, 25 August 2015 (UTC)

FYI, this is what I'm proposing. Beside labeling subsections, it groups all the spacing diacritics together and all the combining diacritics together (but with tone set apart), adds {{IPA link}}, sub's the pipe so it works inside {{IPA}}, and restores the rhotic diacritic which got lost somehow:

'IPA': 'consonants: t̪d̪ʈɖɟɡɢʡʔ ɸβθðʃʒɕʑʂʐçʝɣχʁħʕʜʢɦɧ ɱɳɲŋɴ ʋɹɻɥɰʍ ʙⱱɾɽʀ ɫɬɮɺɭʎʟ ɓɗʄɠʛ ʘǀǃǂǁ vowels: ɨʉɯ ɪʏʊ øɘɵɤ ə ɚ ɛœɜɝɞʌɔ æɶɐɑɒ spacing_diacritics: ʼˀˤᵝᵊᶢˠʰʱʲˡⁿᵑʷᶣ˞ ˈˌːˑ‿˕˔ combining_diacritics: ̚ ̪ ̺ ̻ ̼ ̬ ̥ ̊ ̞ ̝ ̘ ̙ ̽ ̟ ̠ ̈ ̤ ̹ ̜ ̩ ̆ ̯ ̃ ̰ ͡ ͜ tone_&_prosody: ̋ ́ ̄ ̀ ̏ ̌  ̂  ᷄  ᷅  ᷇  ᷆  ᷈  ᷉  ˥ ˦ ˧ ˨ ˩ ꜛ ꜜ {\{!}} ‖ ↗ ↘ extIPA: ͈ ͉ ͎ ̣ ̍ ͊ ᷽ ̫ ͇ ˭  {\{angle.bracket|+}} {\{IPA|+}} {\{IPA.link|+}}',

— kwami (talk) 18:50, 26 August 2015 (UTC)

+1 --SynConlanger (talk) 18:56, 26 August 2015 (UTC)

P.S. S.o. made a suggestion to mix up the stops and fricatives for s.t. like:

ɸβθð t̪d̪ʈɖ ʃʒʂʐɕʑ ɟɡɢ çʝɣχʁ ʡʔ ħʕʜʢɦɧ

— kwami (talk) 01:33, 27 August 2015 (UTC)

ɧ was never anything more than a joke anyway. But why get rid of ɶ? Please {{Ping}} me to discuss. --Thnidu (talk) 01:51, 27 August 2015 (UTC)

@Thnidu: There was an admin suggestion that we cut the number of symbols. But cutting these has already been objected to, and wouldn't make much difference to the length anyway. — kwami (talk) 05:54, 27 August 2015 (UTC)

Scanian is back

Comment needed at Scanian dialect. There's an edit war on deleting the population with various spurious arguments. The single reasonable argument is that Ethnologue did not base their estimate on a reliable source. Of course, that could be true for many of our articles, and not just ones based on Ethnologue. — kwami (talk) 23:45, 1 September 2015 (UTC)

Requested move at Sakha language

"Sakha language" > "Yakut language". — kwami (talk) 18:42, 21 September 2015 (UTC)

Chichewa tones

Please help get this new article into shape, it has several issue tags. Roger (Dodger67) (talk) 07:27, 7 October 2015 (UTC)

Most of the issues have now been dealt with. The only tag now left is that Dodger67 feels that it 'may contain an excessive amount of intricate detail that may only interest a specific audience'. As the original creator of this article and following his advice to consult WikiProject Languages, I would welcome it if any linguist reading this would offer advice on whether it is too detailed (since it seems to me that it is no more detailed than a number of other articles on languages - and also it isn't the main page on the language) and if so, where it should be shortened. Perhaps you could put the advice on the Talk page of the article. Kanjuzi (talk) 11:22, 7 October 2015 (UTC)

Indo-European peoples

Please join this category discussion about Indo-European peoples. Marcocapelle (talk) 08:52, 17 October 2015 (UTC)

Goidelic languages listed at Requested moves

A requested move discussion has been initiated for Goidelic languages to be moved to Gaelic languages. This page is of interest to this WikiProject and interested members may want to participate in the discussion here. — AjaxSmack 02:09, 24 October 2015 (UTC)

Altaic and infoboxes discussion

FYI

– Pointer to relevant discussion elsewhere.

Please see Wikipedia talk:WikiProject Linguistics#Altaic and infoboxes; I think that the content of that discussion is actually more tied to dispute on this project's talk page than that one. And why are these separate projects to begin with? — SMcCandlish ☺ ☏ ¢ ≽^ʌⱷ҅_ᴥⱷ^ʌ≼ 03:53, 24 October 2015 (UTC)

Merge discussions

FYI

– Pointer to relevant discussions elsewhere.

Proposal to merge two articles at new title, Language extinction: Please see Talk:Language death#Proposed merger with Extinct language. — SMcCandlish ☺ ☏ ¢ ≽^ʌⱷ҅_ᴥⱷ^ʌ≼ 03:49, 24 October 2015 (UTC)

Proposal to move glottophagy material mostly out of Language death (and certainly out of its lead) and integrate it into the lead at Language shift and expand upon it there: Please see Talk:Language shift#Glottophagy. — SMcCandlish ☺ ☏ ¢ ≽^ʌⱷ҅_ᴥⱷ^ʌ≼ 04:28, 24 October 2015 (UTC)

Glottocodes & ISO 639-codes

Hello

I am currently working on ISO 639-3 codes and Glottocodes on the French Wikipedia. I got a list from Glottolog of 7800 languages with their Glottocodes and ISO 639-3 codes. I think it could be useful for wp.en to have this data, which could also be used on Wikidata. I have put the CSV, XLSX and ODS (better presentation) files on Mega if you want to download them. If you don't trust these files, the sortable table and CSV are also on my subpages: ISO3 codes+Glottocodes table and ISO3 codes+Glottocodes CSV. The list was put together with the help of Robert Forkel from Glottolog, using the data available as JSON from http://glottolog.org/resourcemap.json?rsc=language.

Regards, Ѕÿϰדα×₮ɘɼɾ๏ʁ ^{You talkin' to me?} 11:22, 26 October 2015 (UTC)

There is one thing that might interfere. There are Glottolog codes for languages that don't have ISO 639-3 codes and vice versa. --Taivo (talk) 14:40, 26 October 2015 (UTC)

This doesn't interfere with this list: these 7800+ languages have both codes. I know that I'm not likely to be trusted or even respected by some people here as I do not contribute much on this Wikipedia compared to the one in French. I just did this to give some help; you can do what you want with it. Ѕÿϰדα×₮ɘɼɾ๏ʁ ^{You talkin' to me?} 16:35, 26 October 2015 (UTC) — Preceding unsigned comment added by SyntaxTerror (talk • contribs)
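The point raised above (some languages have only one of the two codes, while the shared list keeps only those with both) can be illustrated with a minimal Python sketch. The column names `glottocode`, `iso639_3` and `name`, and the sample rows, are hypothetical placeholders, not the layout or contents of the actual CSV file.

```python
import csv
import io

# Hypothetical sample in an assumed three-column layout; the real file may differ.
SAMPLE_CSV = """glottocode,iso639_3,name
aaaa1111,aaa,Example A
bbbb2222,,Example B (Glottocode only)
,ccc,Example C (ISO 639-3 only)
dddd4444,ddd,Example D
"""

def rows_with_both_codes(csv_text):
    """Return only the rows that carry both a Glottocode and an ISO 639-3 code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row["glottocode"].strip() and row["iso639_3"].strip()
    ]

def iso_to_glottocode(rows):
    """Build an ISO 639-3 -> Glottocode lookup from the filtered rows."""
    return {row["iso639_3"]: row["glottocode"] for row in rows}

both = rows_with_both_codes(SAMPLE_CSV)
lookup = iso_to_glottocode(both)
```

With a filter of this kind, entries that lack either code simply drop out, which is why the mismatch mentioned above would not affect a list built this way.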

Selau

Hallo, at Halia language "Selau" appears in the infobox in the "altname" field, though isn't mentioned in the text, and Selau language redirects there too. The two refs both call it a dialect of Halia. In a new stub at Selau, Papua New Guinea there's some confused, unsourced, wording about the language. Could a linguist have a look at it and perhaps sort it out? Thanks. PamD 15:00, 27 October 2015 (UTC)

Category:Nordic languages

Category:Nordic languages, which you created, has been nominated for possible deletion, merging, or renaming. If you would like to participate in the discussion, you are invited to add your comments at the category's entry on the Categories for discussion page. Thank you. LjL (talk) 16:14, 25 November 2015 (UTC)

Category:Germanic languages

Please see this category discussion. Marcocapelle (talk) 21:32, 25 November 2015 (UTC)

RfC: What should the language infobox display when editors have not found any speaker figures?

There is a RfC at Template talk:Infobox language#RfC: What should the language infobox display when editors have not found any speaker figures?:

In the case that editors have looked for speaker figures, but have not found any, they can set the parameter speakers of Template:Infobox language to ?. This currently causes the infobox to display “Native speakers (no data)”. There are two questions:

Should we display something in this case, or should we display nothing?
If we should display something, then what should it say?

Thanks for your comments. --mach 🙈🙉🙊 09:39, 29 November 2015 (UTC)

Comment - open-ended RfC questions ("what should we..." instead of yes/no questions) are often a bad idea, because everyone comes up with something slightly different, and it's very hard to establish consensus. The first question is fine, but perhaps you could come up with a few explicit alternatives (ideally including the ones that you do not favor but were proposed by other editors) for the second. LjL (talk) 16:24, 29 November 2015 (UTC)
Comment - It is also very bad manners to not notify the principal participants in the original discussion. --Taivo (talk) 18:14, 29 November 2015 (UTC)

Categories for languages of Spain

I've recently spotted some changes to languages spoken in Spain that I found strange, made by User:Marcocapelle. We have Castilian languages becoming a part of Extremaduran language. I don't believe that's typical linguistic classification...? Incredibly, Spanish is no longer a language of Spain. I understand that that's included in parent categories, but surely a reader expects Spanish to be directly listed in Category:Languages of Spain, also according to the guidelines at WP:EPONYMOUS. Likewise, Extremaduran is no longer a language of Spain, even though as we saw above, Spanish (aka Castilian) has become a dialect of Extremaduran. Same with Galician. The only relevant category that is left to Aragonese is the Aragonese language itself. Evidently the reader isn't interested in linking it to Iberian or Romance languages or anything.

There may be more of this that I missed. May I have a rationale, from the editor making the changes or other, for these rather counter-intuitive re-categorizations? LjL (talk) 20:35, 22 October 2015 (UTC)

Most edits simply follow one of the possibilities that are mentioned in the guidelines (see Wikipedia:Categorization#Categorizing_pages). With Castilian languages I can well imagine that you have some doubts. In theory one could create a Category:Castilian languages and then parent Category:Spanish language and Category:Extremaduran language to it - but it would remain a very small category so that's not really desirable. The reason I've classified the article Castilian languages in Category:Spanish language and Category:Extremaduran language is obviously not because Castilian is a dialect of Extremaduran, but because the article Castilian languages is (also) about the Extremaduran language. Categories are intended to group articles about a topic, after all. Marcocapelle (talk) 20:49, 22 October 2015 (UTC)

Still, Extremaduran language has its own article, which can be and is made to belong to Category:Extremaduran language (although I really believe it would be very beneficial, for it and similarly other articles, to keep it directly in Category:Languages of Spain too); Castilian languages links to it but is not about it, in fact it's in a sense its "parent", but with this categorization, you're making it its "child".

As to the rest, I'll let others give their opinion at this point. I think having Category:Languages of Spain and possibly more in most of those articles is simply common sense and useful to the reader. Articles aren't necessarily meant to have just one category when it's useful to have several. LjL (talk) 21:23, 22 October 2015 (UTC)

The category "Castilian languages" is used virtually nowhere outside of Ethnologue—the article itself says so. That gives us three different meanings of "Castilian", instead of the already confusing two. One of the "Castilian languages" is Spanish, alias Castilian, and one dialect of Spanish is Castilian. Is Ethnologue so influential that its idiosyncratic categories must be adopted by Wikipedia? Is it too late to delete the article? Kotabatubara (talk) 23:34, 8 December 2015 (UTC)

RfC: Should we continue recommending the sign ⟨ɵ⟩?

There is an RfC at Help talk:IPA for English#RfC: Should we continue recommending the sign ⟨ɵ⟩?:

We are currently recommending that the sign ⟨ɵ⟩ be used for a reduced vowel diaphoneme that can correspond either to the phoneme /oʊ/ or the phoneme /ə/, for example in the word omission. Should we continue to do so?
In the IPA, the sign ⟨ɵ⟩ represents the close-mid central rounded vowel. Our use of ⟨ɵ⟩ is based on Bolinger, Dwight (1986), Intonation and Its Parts, Stanford University Press, pp. 347–360. Bolinger proposed not to analyze the reduced vowels as mere versions of the full vowels, but as a special set consisting of three vowels: The “fronted” Willie vowel /ɨ/, the “central” Willa vowel /ə/, and the “backed” willow vowel /ɵ/. Bolinger’s use is slightly different from ours (1) because Bolinger’s /ɵ/ is not a diaphoneme, but a phoneme; (2) because there are words such as willow or lasso or the second part of the MOUTH diphthong where Bolinger would use /ɵ/, but we would not; and (3) because Bolinger’s analysis allows for an alternation between /ɵ/ and /ə/ in words such as canopy, Indonesia, allophonic, or composition.
The English Wikipedia is the only major dictionary that has adopted Bolinger’s sign ⟨ɵ⟩ in a broad phonemic IPA transcription scheme (for an overview of other dictionaries’ broad phonemic IPA transcription schemes, see Help:IPA conventions for English#Reduced vowels).
If you are interested in the edit history of ⟨ɵ⟩ on Help:IPA for English, you may want to check out User:J. 'mach' wust/sandbox#Edit history of ⟨ɵ⟩ (with diff links etc.).

Thanks for your comments. --mach 🙈🙉🙊 12:42, 16 December 2015 (UTC)

Romance languages

Would experts in the field of classification of Romance languages be willing to join this discussion? Marcocapelle (talk) 16:34, 2 January 2016 (UTC)

Rutgers University Class Project

As part of a project for the undergraduate course 01:013:305 "Languages in Peril," at Rutgers, the State University of New Jersey, I am asking my students to adopt one of the orphaned profiles among the 3,392 profiles at the Endangered Languages Project, and aggregate information about the language for presentation here and at EndangeredLanguages.com. The students will identify the languages they have adopted by pasting an "Under construction/Rutgers" tag at the top of the stub representing it. The actual work on these pages will not occur until later in the semester (likely April), and the students will receive some training in editing Wikipedia before they begin work on their page. I'd very much like to ask your forbearance while they develop their pages. Many thanks! Chuck Haberl (talk) 04:20, 18 January 2016 (UTC)

Time for more language family template colors?

Although this probably has been brought up before and would be rather difficult, is it time to come up with new color-coding for some of the language family templates? Especially when it comes to the Indigenous languages of the Americas, the vast majority of these language families are lumped together under the same color, which in my opinion really does not reflect the vast diversity that there is here. I know it would be rather difficult both to come up with a new color to be used and to decide which language families deserve a distinct template color, but I think it should still be proposed. Thanks Inter&anthro (talk) 16:41, 18 January 2016 (UTC)

Another cringeworthy thing about the coloration: they have a colour for Altaic (an actively-criticized, dying horse proposal) in the first place. It looks awful and POV-shovey seeing disparate stuff like Turkish, Japanese, Korean, and Mongolian squashed under one colour. Hill Crest's WikiLaser! (BOOM!) 21:10, 21 January 2016 (UTC)

I would definitely support doing away with the Altaic color. Currently "Altaic" isn't listed in the "Language Family/Linguistic Classification" parameter of any of the infoboxes, yet they all (except Ainu) still appear in the Altaic-colored and -labeled infobox. It is POV as Hillcrest suggests. Each should have their own color or at least not be in an "Altaic" infobox. Issues related to Altaic as they relate to each family can be covered in the text of the articles. The Indigenous languages of the Americas issue is a bit more hairy. It would really depend on the wording of the proposal. Many of the larger well-documented and accepted families, including Uto-Aztecan languages, Algic languages, Na-Dene languages and Eskimo–Aleut languages, have their own color already. Perhaps a few others, such as Mayan languages should as well, but our article lists 56 families and isolates in North America alone (138 for South America/Caribbean!). That would be a lot of colors or, rather, subtle variation of colors. Some color differences are already so subtle (to my red-green colorblind eyes) that I have to see them side-by-side to be sure they are indeed different colors.--William Thweatt ^Talk^Contribs 23:56, 21 January 2016 (UTC)

I am not sure how significant the infobox colors are - I doubt the casual reader notices at all.·maunus · snunɐɯ· 00:33, 22 January 2016 (UTC)

(ec) I fully support breaking up Altaic--no question. But Native America and New Guinea offer similar problems--far too many separate families to assign a different color to each. --Taivo (talk) 00:35, 22 January 2016 (UTC)

We could also just have continental colors instead: "Languages of Africa", "Languages of South America", "Languages of Australia" etc.·maunus · snunɐɯ· 00:44, 22 January 2016 (UTC)

@Maunus: I would be against the continental colors idea; it could possibly be interpreted as scientific racism, as in the old days when the languages of Africa and the Americas were regarded as primitive and all lumped together. In no way am I trying to say that I think you or others think that way, but it could be interpreted that way. Inter&anthro (talk) 03:36, 23 January 2016 (UTC)

I definitely agree that the Altaic one should be broken up, the hypothesis was debunked long ago. Inter&anthro (talk) 03:36, 23 January 2016 (UTC)

@TaivoLinguist: I wasn't suggesting a different color for every language family just for the major ones. For example perhaps Mayan languages and Carib languages could be given a slightly different color from the other languages of the Americas, as well as Pama–Nyungan languages could be a slightly different color than the rest of the Australian Aboriginal languages. It would be difficult to change and I acknowledge that, but it seems weird that we have all these different colors for the language families of Europe and parts of Asia and the rest we just lump together on geographic basis. Inter&anthro (talk) 03:36, 23 January 2016 (UTC)

Such an interpretation would only occur if readers were expecting the color to express linguistic family relations, and I don't think anyone other than the regulars of this project actually interprets them that way. I really don't see any usefulness in marking families with slightly different colors from each other - it really takes away the small possibility of the color coding being informative. Let's just have one color for all the language infoboxes.·maunus · snunɐɯ· 03:45, 23 January 2016 (UTC)

User:Maunus has an excellent point. The only people who care about (or even notice) the different colors are the editors here. 99.9% of all readers look up one language and they might notice that there is a colorful border to the infobox, but they won't notice (or care) that there is a different color on another language infobox because they aren't looking at other languages. Using color to code language families is utterly obscure unless one has a copy of the "language quilt" on their personal user page (as I have an outdated one). It's really just an exercise in microtriviality. One nice color to border all language infoboxes should be sufficient. --Taivo (talk) 04:44, 23 January 2016 (UTC)

@Maunus: & @TaivoLinguist: I disagree. While yeah, the info box colors might be more aesthetically pleasing than factual, just because the majority of people using Wikipedia aren't interested in linguistics &/or languages doesn't alone mean they shouldn't be included. I think the different colors add another dimension of how rich and diverse languages are, and having all the info boxes a dull grey color, as is becoming the norm on ethnic group infoboxes, would take away from that. While yeah, info box colors for languages might be a little out of the norm, per WP:BOLD & Wikipedia:PAPER we have the freedom to expand and be creative. Inter&anthro (talk) 05:18, 23 January 2016 (UTC)

But with 200 some-odd separate and distinct language families in the world, assigning a separate, distinct color to each is simply impractical. A hybrid solution to User:Maunus' proposal for continental colors might be a useful compromise: Each continent (NA, SA, Europe, Asia, Africa, Oceania) gets a separate section of the color range assigned (NA = reds, SA = yellows, Europe = blues, etc.). Then each separate family within that continent gets a color within that range. One could then quickly see a red and say, "Ah, North America", or a blue and say "Ah, Europe", but then notice a dark blue and say, "Yes, Indo-European" or a light blue and say, "Yes, Uralic". --Taivo (talk) 00:24, 24 January 2016 (UTC)

@TaivoLinguist: I never meant to imply that every language family should get a different color, just the major ones in the Americas, but I'm against the areal solution since the Indo-Aryan languages or Austronesian languages for example are spoken on many different continents. In reality the system we have now probably is the best with some minor adjustments (like getting rid of the Altaic group) - I kinda look like a hypocritical dumbass now haha. Inter&anthro (talk) 01:25, 24 January 2016 (UTC)

I'm sure you meant Indo-European and not Indo-Aryan, which is a subset of the former. But for the Americas, we end up with the question of how to define "major language group". Is Iroquoian a "major group" because it includes Cherokee, Mohawk, and other well-known tribes? But it only has about 10 attested languages. Chapacuran has more languages, but is virtually unknown (it has sometimes even been omitted from lists of South American language families by accident). Without a bright, clear line, "major language groups" is meaningless. But we can at least agree to eliminate the "special status" of Altaic. --Taivo (talk) 02:16, 24 January 2016 (UTC)

@TaivoLinguist: I 100% agree with you that letting users decide what languages are notable enough etc would violate WP:POV, but I think we can all also agree there are some language families that are clearly notable, such as the top 10 or so let's say, which could be distinct from the rest, and the rest classified by area, which is roughly what we have now. Inter&anthro (talk) 02:25, 24 January 2016 (UTC)
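The hybrid scheme floated earlier in this thread (a hue range per continent, then a distinct shade per family within that range) can be sketched roughly as follows. This is only an illustration in Python: the hue values, the continent list and the function name `family_color` are assumptions made for the example, not an existing template palette.

```python
import colorsys

# Assumed hue (in degrees) per continent; the values are illustrative only.
CONTINENT_HUE = {
    "North America": 0,    # reds
    "South America": 60,   # yellows
    "Europe": 240,         # blues
}

def family_color(continent, index, count):
    """Return a hex color: hue fixed by continent, lightness varied per family.

    index is the family's position (0-based) among count families
    assigned to the same continent.
    """
    hue = CONTINENT_HUE[continent] / 360.0
    # Spread lightness between 0.35 (dark) and 0.85 (light) across the families.
    if count > 1:
        lightness = 0.35 + 0.5 * index / (count - 1)
    else:
        lightness = 0.6
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )

# Two families on the same continent share a hue but differ in shade.
dark_blue = family_color("Europe", 0, 2)
light_blue = family_color("Europe", 1, 2)
```

As several comments above note, the number of usable shades per hue is small, so a scheme like this would likely stop being distinguishable well before every family on a continent had its own color.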

Adding the UNESCO Atlas of the World's Languages in Danger to Wikidata

Hi all

I'm involved in adding all languages listed in the UNESCO Atlas of the World's Languages in Danger to Wikidata using Mix n' Match; just scroll down the list to find the correct catalogue. I hope you can help. There are instructions on how to use Mix n' Match in Manual mode here. Creating this catalogue will make it much easier for people to see which articles relating to endangered languages are still needed.

Thanks

John Cummings (talk) 09:31, 26 January 2016 (UTC)

Pages for native names of languages

A quick look at the various pages titled with the native name of various languages will prove their targets to be inconsistently placed. Generally, they fall into one of these categories:

Redirects to the article about that language, e.g. Čeština, Deutsch, Italiano, Português; this is currently the case for most such pages, including transcriptions such as Hangugeo, Nihongo, and Zhongwen
Redirects to the disambiguation page of its English name, e.g. Español, Français, Russkiy; it seems that a discussion over at Nederlands is in favour of such a redirect
Are disambiguation pages themselves, e.g. English, Magyar, Norsk, Suomi

Is this type of naming acceptable, or should we derive a convention for the nature of such pages? <<< SOME GADGET GEEK >>> (talk) 19:19, 31 January 2016 (UTC)

Ethnologue is going Paywall: Should we deprecate the use of Ethnologue data?

Ethnologue is becoming a (quite expensive) subscription service. This means two things: That their content will not be as easily accessible for us and the world, and that they may eventually become tired of the fact that we basically provide a good deal of their content for free. Eventually this may result in some kind of copyright issue (though I am unsure if they can copyright basic facts and prohibit others from citing them for free). I would suggest that in response we begin deprecating the use of Ethnologue data, gradually working to rely on other sources for speaker data and demographics. Other paper sources written by specialists in specific languages are often more accurate anyway. By shifting away from our reliance on Ethnologue data, we will neither have the problem (even if it is only a moral problem) of providing their data, nor the problem of providing Ethnologue with free publicity for their subscription service. Arguments and suggestions welcome.·maunus · snunɐɯ· 00:26, 29 January 2016 (UTC)

The discussion was started at Template talk:Infobox language#RfC: What should we do now that Ethnologue has put up a paywall?. I'm pretty sure it isn't a formal RfC. In any case, this project page is probably the best location to host the centralized discussion anyway.--William Thweatt ^Talk^Contribs 00:40, 29 January 2016 (UTC)

Oh, sorry about that. I hadn't seen that as I don't watch that template. I have added my opinion there as well. And yes I think we would get broader participation in an RfC if it were located here.·maunus · snunɐɯ· 00:49, 29 January 2016 (UTC)

And, no, they can't copyright basic facts. We're even carefully attributing them, especially when they're making claims rather than reporting raw data. Anyway, WP:V doesn't require that sources be free. That said, there's no point in keeping them in the infobox if people normally can't actually verify the material themselves. — SMcCandlish ☺ ☏ ¢ ≽^ʌⱷ҅_ᴥⱷ^ʌ≼ 23:25, 2 February 2016 (UTC)

WikiProject English Language

It's rather shocking that this project is missing, though its absence explains why we have so few articles about our language (most of the linguistic material on English is buried as sections, when we're lucky, in general linguistics articles), and why so much of what we do have is riddled with Victorian prescriptivist PoV-pushing and original research. Properly sourced, neutral, linguistically descriptive, detailed, encyclopedic coverage of the language is one of en.wp's most obvious topical gaps. The scope is general, including everything from the Great English Vowel Shift to Quotation marks in English to English as a global language, and various missing articles like major styles of English writing.

I've drafted the wikiproject outline at User:SMcCandlish/WikiProject English Language, including some "Goals" and "Scope" points. Please "pre-sign" as a participant so that it already has a number of supporters (7+ would be nice) when I take it to Wikipedia:WikiProject Council/Proposals in the next day or so. I've not yet created a to-do list for it or other resources (mainly so I don't have to move them later after the proposal goes through).

As far as I can determine, this project is missing because several times in the past, various camps of prescriptivists have tried to create something called "WikiProject English" to PoV-push their version of "correct" English on Wikipedia, and had it deleted at WP:MFD. This proposal would be the diametric opposite of such WP:FRINGE / WP:SOAPBOX campaigning. — SMcCandlish ☺ ☏ ¢ ≽^ʌⱷ҅_ᴥⱷ^ʌ≼ 23:52, 2 February 2016 (UTC)

Maybe I'm misreading the comments, but this proposal doesn't sound very NPOV. Yes, prescriptivism is a POV, but so is descriptivism. Both are valid and there are respectable people on both sides of the fence; in fact, as our article Linguistic description says in the lede, they're not opposing approaches but complementary. You seem to be using "prescriptivists" as a swear word. Any such project would need to include input from both POVs.--William Thweatt ^Talk^Contribs 01:32, 3 February 2016 (UTC)

Descriptivism isn't a POV; it is a scientific stance and it encompasses many theoretical POVs, whereas prescriptivism is a political stance that also encompasses a number of POVs. Prescriptivism is pretty much irrelevant for encyclopedic purposes, at least outside of the Manual of Style. There is no relevant prescriptivist view on the great vowel shift or English relative clause formation etc. In articles about punctuation of course they will have to describe which usage is considered correct according to different style guides.·maunus · snunɐɯ· 02:24, 3 February 2016 (UTC)

ISO code for Mentonasc dialect

There is currently no ISO code for the Mentonasc dialect.

https://en.wikipedia.org/wiki/Mentonasc_dialect — Preceding unsigned comment added by 68.197.140.14 (talk) 03:27, 9 February 2016 (UTC)

Hello, that is expected: it is a dialect of Occitan, so the ISO 639 code to use is oc / oci. Regards. --82.225.233.169 (talk) 13:43, 9 February 2016 (UTC)

I didn't see I was not logged in. So I'm the author of the answer above. Feel free to contact me to discuss that topic. --— J. F. B. ^{(me´n parlar)} 13:44, 9 February 2016 (UTC)

question about Wagiman language, a GA

Wagiman language is listed as a current GA, but has been tagged with Template:More footnotes since 2014. I'm not sure if it was just a drive-by tagging or if somebody really had an issue with the article. It looks like it was tagged because, from the Grammar section all the way to the end, there are only four inline cites. But right under the "Grammar" subheading, it says "All grammatical information from Wilson, S. (1999)^[14] unless otherwise noted." Is that sufficient for a GA? (not a rhetorical question; I really don't know) If so, can we remove the "More footnotes" tag?--William Thweatt ^Talk^Contribs 07:45, 13 February 2016 (UTC)

It is a very old GA; it wouldn't qualify as GA under the current standard. It does need more inline footnotes.·maunus · snunɐɯ· 13:30, 13 February 2016 (UTC)

Hijra Farsi

I've just come across the stub Hijra Farsi – it's pretty minimal, but seems legit. I'm not very knowledgeable in how we treat individual languages, especially secret languages; can anyone help? Kwamikagami? --Florian Blaschke (talk) 02:14, 13 February 2016 (UTC)

Sounds interesting, but I'd want to see what makes it a code separate from Farsi. Sorry, no time now, but we have articles on other codes like this. Should be a category for them. Sorry, almost out of internet range. — kwami (talk) 06:18, 13 February 2016 (UTC)

The category is Category:Cant languages. Upon looking for sources, I've found it is indeed an interesting language, although probably not one I'll be working on. I did, however, list a decent paper I found and a news article in a "Further reading" section in case anybody would like to read up and use them for some expansion.--William Thweatt ^Talk^Contribs 07:12, 13 February 2016 (UTC)

Thanks, folks. Note that Hijra Farsi appears to be based on Urdu, not Persian. --Florian Blaschke (talk) 17:43, 15 February 2016 (UTC)

Actually, I cannot find that statement in the source given.·maunus · snunɐɯ· 17:55, 15 February 2016 (UTC)

Well, it says "Hindi", presumably meant in the narrower sense:

The latter variety, structurally consistent with Hindi yet unintelligible to Hindi speakers, is characterized by distinctive intonational patterns and an extensive alternative lexicon. Although Koti/Hijra Farsi is unrelated to Persian Farsi, its speakers conceptualize it as the language of the Mughals, employing it in the construction of an historically authentic sexual identity.

And the article says "Hindustani", so it's OK. It's definitely not a variety of Persian, anyway, but Indic. --Florian Blaschke (talk) 22:39, 15 February 2016 (UTC)

Inconsistent classification of Tai Yo language

Tai Yo language says alternate names are Tai Do, Tai Mene, and Tai Nyaw, all of which redirect to Tai Yo. That's fine, but:

The Tai Yo article lists it as a Southwestern Tai language under Chiang Saen languages
The Southwestern Tai languages article also lists it (as "Nyaw") as a Southwestern Tai language but grouped with Lao-Phutai, not Chiang Saen
Even more confusingly, the Northern Tai languages article prominently lists it (as "Tai Mene") as Northern Tai! (this guy, on pg 12, seems to agree)

I know the classification of the Tai languages still isn't completely worked out, but Southwestern and Northern differ sufficiently that there should be agreement on whether a language is Southwestern (closer to Thai/Lao) or Northern (closer to Saek and Zhuang). I haven't been able to find any references other than the book above which cites Chamberlain. Does anybody else have access to a definitive reference so we can sort out our articles? If not, are there any suggestions regarding how we should fix the inconsistencies in our articles?--William Thweatt ^Talk^Contribs 10:10, 17 February 2016 (UTC)

Mayan languages

Hi all, I am in the process of updating and improving Mayan languages, an old FA that has not been maintained and which is at risk of eventually being delisted if it is not brought up to current FA standards. Your input and peer commentary will be welcome and helpful in the process.·maunus · snunɐɯ· 18:22, 26 February 2016 (UTC)

Displaying ISO code of a language for its dialects

@Kwamikagami: reverted one of my edits on a dialect of ASL. I had added the ISO code for ASL to the template, but then he explained that WP only provided the ISO code for a variety if the code referred exclusively to that variety. That makes sense to me, but I'd also point out that librarians, archivists, etc. who are wondering what ISO code to use in their metadata may consult Wikipedia when looking for the appropriate ISO code, so providing no code isn't helpful to them. According to the ISO standard, the appropriate ISO code for a dialect of a language is the language's code. So, it would be of service to people to provide that in WP articles, although of course (as noted before) distinguishing this case from the one in which a code refers exclusively to a particular dialect. What I'd like to suggest is that in these cases the Infobox should report "included in [___]", with a link taking people to the appropriate ISO 639-3 page. (That wording is preferable to something like "dialect of [___]", partly because of the potentially contentious word "dialect" and partly because, in some cases, a variety might be included--incorrectly--in some ISO code when it isn't a dialect of it--it's just that no one has gotten around to requesting a separate code.) A similar wording could be used for historical antecedents of languages with 639-3 codes, e.g. "historical variety of [___]". As a practical matter, Kwami suggested that if iso= has a code and isoexception=dialect/historical, that combination is what would trigger the alternate wording. Comments? AlbertBickford (talk) 22:13, 8 March 2016 (UTC)

The “included in” proposal seems very sensible to me. --mach 🙈🙉🙊 08:01, 11 March 2016 (UTC)
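The parameter logic proposed above (a code in iso= combined with isoexception=dialect or isoexception=historical triggers the alternate wording) can be sketched as below. This is only an illustration in Python, not the template's actual code: the function name `iso_label` is hypothetical, and only the parameter names and the two wordings are taken from the proposal.

```python
def iso_label(iso, isoexception=None):
    """Return the text for the infobox ISO 639-3 row, or None if no code is set.

    Hypothetical helper: mirrors the proposed combination of the
    iso= and isoexception= parameters.
    """
    if not iso:
        # No code given: the row is not shown at all.
        return None
    if isoexception == "dialect":
        # The code belongs to a broader language that includes this variety.
        return "included in {}".format(iso)
    if isoexception == "historical":
        # The code belongs to the modern language descended from this variety.
        return "historical variety of {}".format(iso)
    # Default case: the code refers exclusively to this variety.
    return iso

label = iso_label("ase", "dialect")  # "ase" is the ISO 639-3 code for ASL
```

Keeping the plain code as the default means existing articles would render unchanged; only articles that explicitly set isoexception would get the new wording.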

Heritage languages

Not sure this is in the scope of this project, but in researching heritage language, I noticed an emerging set of articles relating to heritage languages in Canada (see for instance Heritage Languages in Toronto), with less attention to this rubric in other contexts. This raised in my mind a question about whether and how this topic might be expanded (for example, Chinese language in the United States is in some respects a [or several] heritage language[s]). Should there be a Category:Heritage languages? Some general guidelines about use of "heritage language" in titles? Part of the problem is that use of this term is dependent on context. TIA for any thoughts.--A12n (talk) 15:09, 14 March 2016 (UTC)

Social justice warrior - move discussion

Notifying this WikiProject talk page as article is relevant to the topic

There is a move discussion ongoing related to this WikiProject.

Article = Social justice warrior
Move discussion at Talk:Social_justice_warrior#Requested_move_6_April_2016.

Feel free to comment however you wish.

Thank you,

— Cirt (talk) 02:35, 6 April 2016 (UTC)

We need a new article on the "English languages"

The old one was moved to "Anglic languages" and then, under the guise of a "merge" to History of English, an editor simply deleted the entirety of its content (0 edits occurred to the new article at the time of the "merge") and replaced it with a bad redirect. I fixed that to at least point at the parent article (Anglo-Frisian languages).

If English is the single and only member of the "English languages" then the redirect should point to English language (not the history article). If there are several members, then there should be some article developed that discusses and links to them. — Llywelyn II 03:22, 25 April 2016 (UTC)

I think an article on World Englishes and one on Old English dialects would make sense.·maunus · snunɐɯ· 16:24, 25 April 2016 (UTC)

Category:Indo-European-speaking peoples

Category:Indo-European peoples was deleted, but it is present in 49 other wikis. So I propose to create a similar category with a more neutral name, Category:Indo-European-speaking peoples. The discussion is at Wikipedia:Categories_for_discussion/Log/2016_April_28#Category:Indo-European-speaking_peoples. Cathry (talk) 14:59, 29 April 2016 (UTC)

The largest section in the Modern Hebrew article describes the language as "non-semitic"

The largest section in the Modern Hebrew article describes the language as "non-semitic". This view is WP:Fringe, as confirmed by reliable sources, yet this view currently occupies the largest space in the article, going into extreme detail including a table for individual opinions, while everything else is presented at a broad/high level. In my opinion, and in the opinion of the majority of editors on the talk page, this is WP:Undue. Over the past year, six editors have expressed their view that the section should be removed or minimized, while only two have supported it. Despite this consensus, the section remains in the article in its current state, likely due to the slow nature of the subject. Any editors wishing to contribute are welcome. Drsmoo (talk) 17:17, 17 May 2016 (UTC)

Accents in language titles

Hi folks. Is there guidance on whether or not the article title for a language should be the latinized name (e.g. Xaracuu language) vs. the accented name (Xârâcùù language)? I looked in Wikipedia:Naming conventions (languages) but did not find anything dispositive. I suspect (based on other examples) the practice is not to latinize the name but I'd like some confirmation one way or the other. Thanks! Adam (Wiki Ed) (talk) 17:10, 21 June 2016 (UTC)

I see some suggestion here that combining marks are to be avoided (though I don't know if the title includes them or they're on a keyboard somewhere). Adam (Wiki Ed) (talk) 17:39, 21 June 2016 (UTC)

Maybe someone more knowledgeable should be able to point us to some guidelines somewhere, but until that happens here are my two pennies. I don't think there can be universally applicable guidelines about non-ASCII symbols in language names (and accents are just a subset of these). I think it's best if consensus about the title of a given language is reached on a case-by-case basis, as attitudes vary between (and occasionally within) countries and broad cultural areas. I think in any case, the wikiproject for the country the language is spoken in might be the most relevant. I can think of a couple of linguistic/geographic areas where special symbols are commonplace. In the Salishan languages of the US Northwest (see 1 and 2), none of the special symbols from the language names seem to have made it into the article titles. Maybe for all of them there just happens to be a common name in English and it's predictably a simple one. In another area: Category:languages of Brazil, the acute accent is ubiquitous in language names, but this is the case because it's ordinarily used in the dominant Portuguese language, which has probably been the direct source of the established names in English for these languages.

Do you have any specific examples in mind? Uanfala (talk) 15:16, 22 June 2016 (UTC)

The specific question came from a student working on Xaracuu language, which has used the unaccented title since the article was created in 2011, though the text of the article has (near as I can tell) always used Xârâcùù. Looking at Template:Languages of New Caledonia it appears that most of the southern language titles use accents in titles where present in the language. Adam (Wiki Ed) (talk) 16:04, 22 June 2016 (UTC)

Xârâcùù seems to be overwhelmingly more prevalent in both French and English sources ([3] [4]). I've started a requested move, see Talk:Xaracuu language#Requested move 22 June 2016 (I'm prevented from moving it straight away by the extra edit in the history of Xârâcùù language). Uanfala (talk) 16:37, 22 June 2016 (UTC)

Thanks for gophering. I can move it on Protonk. Adam (Wiki Ed) (talk) 19:45, 22 June 2016 (UTC)

Linking of Open Access publications about linguistics

Open Access publications about a particular topic are a useful addition to articles as they are available to people outside of academia as well. I have held that conviction for a long time, but now I work for Language Science Press, which happens to produce Open Access monographs. This means on the one hand that I am very well informed about new open access books; on the other hand, it means I have a WP:COI.

I have added some of these monographs to articles where I was sure that they were relevant (Grammars of Yakkha, Mauwake, Pite Saami); for others I have suggested inclusion on the relevant talk pages. Most of the smaller languages receive few edits and might not even have anybody watching them to whom I could suggest inclusion.

I am not very happy about this state of affairs. Technically, I am violating policies about conflicts of interest and paid contribution. I still think that for the coverage of linguistics, the inclusion of these books is useful, so I ignored all rules.

I would appreciate discussion about this issue and would be happy if someone could suggest a good course of action.

For the record and FWIW, my former job was the creation of Glottolog. This might or might not lend me some credibility.

Jasy jatere (talk) 18:48, 23 June 2016 (UTC)

It looks to me like you are behaving properly: editing judiciously and being upfront about your potential conflict. If you plan to continue adding LSP citations or links, you might want to disclose the relationship on your user page. But I don't see any advantage in demanding strict adherence to the letter of the law. Cnilep (talk) 01:18, 27 June 2016 (UTC)

ISO 639 redirects & project tagging

I see there are ~8662 redirects (#Rs) of the form ISO 639:[a-z]{2,3} (like ISO 639:aa, etc.) which are missing a talk page, and so missing the {{WikiProject Languages}} banner and the corresponding talk-page #R (like Talk:ISO 639:aa to Talk:Afar language, etc.). Is there any desire by WP:LANG to tag these existing #Rs and to create talk-page #Rs? I did something very similar to this at WP:AST with our plethora of minor planet #Rs and can do the same here, if there's interest. ~ Tom.Reding (talk ⋅dgaf) 17:52, 29 June 2016 (UTC)
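As a rough sketch of the title matching described above, the pattern ISO 639:[a-z]{2,3} and the talk-page pairing (Talk:ISO 639:aa to Talk:Afar language) could be expressed in Python as follows. The function names are hypothetical and this is not the actual bot code; it assumes the redirect titles and their article targets are already available as plain strings.

```python
import re
from typing import Optional, Tuple

# Pattern quoted in the comment above: "ISO 639:" followed by 2 or 3 lowercase letters.
ISO_REDIRECT = re.compile(r"ISO 639:[a-z]{2,3}")


def talk_redirect(title: str, article_target: str) -> Optional[Tuple[str, str]]:
    """Return (talk page title, talk page redirect target) for an ISO 639 redirect.

    Returns None when the title does not match the pattern exactly.
    Hypothetical helper for illustration only.
    """
    if ISO_REDIRECT.fullmatch(title) is None:
        return None
    return ("Talk:" + title, "Talk:" + article_target)


print(talk_redirect("ISO 639:aa", "Afar language"))
print(talk_redirect("ISO 639:abcd", "Example language"))  # too long, not matched
```

Using `fullmatch` rather than `search` keeps longer or differently cased titles out of the batch, so only the two- and three-letter code redirects would be tagged.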

Please verify

Old Štokavian Xx236 (talk) 11:12, 30 June 2016 (UTC)

Thanks for pointing this out. I've redirected it to Shtokavian. – Uanfala (talk) 11:27, 30 June 2016 (UTC)

Using UNESCO open license text to create Wikipedia articles about endangered languages and language groups

Hi all

I'm currently working with UNESCO to help find ways to make their content more useful for Wikipedia. I'm developing a way for text from UNESCO publications to be easily usable on Wikipedia; please see here for more details and instructions.

I think a very useful publication for WikiProject Languages would be the Atlas of the World's Languages in Danger, which provides an overview of endangered languages within each region. Perhaps the descriptions could be used to create Wikipedia articles for endangered languages within each area and/or endangered languages within language groups?

Please let me know what you think and if you need any more information. I'm also currently indexing all the languages listed in the world atlas into Wikidata, which would provide an overview of which languages are not yet covered on Wikipedia. I'm currently doing a project to create Wikipedia articles from official descriptions of Biosphere Reserves; here is a map of all the Biosphere Reserves in the world without English language Wikipedia articles, generated live by Wikidata. Something similar could possibly be created for languages.

Many thanks

John Cummings (talk) 20:22, 17 July 2016 (UTC)

That's a great project! You might want to also post at Wikipedia Talk:WikiProject Endangered languages (which has admittedly been rather quiet lately). Uanfala (talk) 22:34, 17 July 2016 (UTC)

Thanks very much @Uanfala:, I will do that now. --John Cummings (talk) 15:41, 18 July 2016 (UTC)

Establishments/disestablishment categories

Should languages be organized with establishments/disestablishment categories? It would likely be vague, by centuries. They aren't created but a page like Meroitic language would be included in something like Category:Languages attested in the 3rd-century BC, Category:3rd-century BC establishments in Africa, Languages extinct in the 4th-century and finally Category:4th-century disestablishments in Africa. -- Ricky81682 (talk) 00:52, 24 July 2016 (UTC)

If at all, we should use these categories only for constructed languages where the year of first publication can be verified, and for extinct languages where the death year of the last speaker has been recorded. De728631 (talk) 01:05, 24 July 2016 (UTC)

That's kind of limiting, isn't it? It'd basically be 20th century with specific years, wouldn't it? If reliably sourced linguists can give an estimate on both the start and end period (within a century), why not include it? -- Ricky81682 (talk) 20:47, 24 July 2016 (UTC)

That's a big, somewhat dubious if. It's difficult and more-or-less arbitrary to date the establishment of a natural language. For example, Middle English is generally thought of as having arisen in the 11th century, but that is because that's the date of the Norman Conquest. I think it was Ed Finegan who pointed out that there never was a moment when Middle English speaking children could not understand their Old English speaking grandparents and vice-versa, so even that vague dating is something of an abstraction. It's virtually impossible to pinpoint when almost any natural language diverged from an ancestral form. Given that fact, I'm not convinced of the utility of categorizing articles in terms that are necessarily vague and somewhat arbitrary. Cnilep (talk) 01:40, 25 July 2016 (UTC)

Absolutely agree with Cnilep: languages don't normally have beginnings that are pinpointable within any degree of vagueness. But there are small groups of exceptions: artificial languages, pidgins, maybe mixed languages and independently arisen sign languages (like Nicaraguan Sign Language). At any rate, I don't think this is what the OP's proposal is about. It's about having categories for languages that have been attested or extinct since a certain point in time. This would be a helpful category, wouldn't it? It's another matter if such categories will ultimately be placed within the subcategories of Category:Establishments by time, but it's worth pointing out that "establishment" seems to have a very broad meaning here: for example Category:5th-millennium BC establishments includes archaeological cultures. Uanfala (talk) 08:53, 25 July 2016 (UTC)
Yeah, I suggested "attested" and "extinct" (which probably isn't the best grammatically) as opposed to "established"/"disestablished" because those two only really work for things that were created. I think it would be interesting to have a category of all languages that were attested worldwide around, say, the 3rd century AD. Again, this is something that there are reliable sources about, and since Template:Infobox language uses "era" and "extinct" it's not like the information isn't out there. If it's disputed, that's one issue, but it's just a question of whether the categorization seems useful. -- Ricky81682 (talk) 20:53, 25 July 2016 (UTC)

List of languages by time of extinction might be helpful if you haven't come across it yet. Given the sheer number of languages that have been going extinct in the last century, and the relative specificity of recent dates, I think it might be a good idea to have Languages extinct in the 20th-century broken up by decades. Uanfala (talk) 21:33, 25 July 2016 (UTC)

Ok, that's good. The size of mergers and splitting is always a WP:SMALLCAT debate that can happen at WP:CFD in the future. It's an ebb and flow but it seems like the idea is at least understandable. -- Ricky81682 (talk) 04:35, 26 July 2016 (UTC)

@Ricky81682: I see that now Classical Arabic has both Category:Languages attested from the 4th century and Category:4th-century establishments. The latter already has the former as a subcategory, so I'm not really seeing the point of it here. Uanfala (talk) 21:58, 27 July 2016 (UTC)

@Uanfala: Normally, it's by continent or country but I didn't review it in detail (probably Asia I think). Yana language for example is in both Category:Languages extinct in the 1910s and Category:1916 disestablishments in California. The second one, as you get into 1910s, etc., is an interesting way to look at it. The first not so much because it's empty right now. It's the same dual categorization that is done for banks, organizations, political parties and others. -- Ricky81682 (talk) 22:48, 27 July 2016 (UTC)

@Ricky81682: Language-specific categories like Category:Languages attested in the 3rd-century BC are squarely uncontroversial but I'm wondering if it isn't a good idea to seek input from more editors and wait for consensus to develop before applying the broader ones like Category:3rd-century BC establishments in Africa, because they might cause confusion as they only apply narrowly to the written tradition of the language and not any other aspect of it. Uanfala (talk) 07:24, 28 July 2016 (UTC)

Is there a specific language there I'm confusing or something? I see that Proto languages by definition do not actually have attestation and I think Meroitic language is implying that the languages' usage is the traditional attestation time period but I think I'll organize something more here to have maybe some guideline. Would it be better to develop something here and then incorporate it into say Wikipedia:WikiProject Languages/Template as a sort of local consensus on how the categories should apply? -- Ricky81682 (talk) 07:32, 28 July 2016 (UTC)

You aren't confusing anything, I just wanted to see some more input by other editors on the applicability of the broader "establishment" categories. Uanfala (talk) 08:26, 28 July 2016 (UTC)

Ok, I revised Wikipedia:WikiProject Languages/Template to suggest including those categories. It could probably use some revising. -- Ricky81682 (talk) 17:09, 29 July 2016 (UTC)

@Uanfala: In the alternative, Esperanto uses Category:1887 introductions but Category:Introductions by century only really goes back to the 14th century and is for television series debuts and products. -- Ricky81682 (talk) 20:31, 30 July 2016 (UTC)

That's interesting. The "introductions" categories seem to also contain inventions, so including Esperanto there makes sense. I think the establishment categories are more suitable for languages (an obvious, and orthogonal to the matter, exception would be including English in Category:17th-century introductions in America, but I don't see any geographical subdivisions there). Anyway, I remain a bit suspicious about seeing language articles placed directly in categories that contain the word "establishment" in their names. Uanfala (talk) 22:15, 30 July 2016 (UTC)

Uanfala, RFC is below. -- Ricky81682 (talk) 03:19, 3 August 2016 (UTC)

I agree with Uanfala, but my reservations go beyond being "a bit suspicious". I'm firmly against the use of "establishment" and "disestablishment" categories for natural languages. With the exception of the very few situations mentioned by Uanfala above, natural languages aren't ever "established", they evolve slowly over generations. The oft-quoted statement that Cnilep paraphrased above applies here: there is normally no point in any language where younger generations and older generations are unable to understand each other. A language isn't established, but rather it is defined. When we assign labels, such as Middle English to English after 1066, for example, it is then Middle English only by definition. So to then say that Middle English should be considered an 11th century establishment is circular reasoning. Also, by the time a language is written down (i.e. "attested") it has already evolved and been spoken for an unknown number of generations. To say that the date of attestation is the date the language was "established" is just wrong and misleading. As for the "disestablishment" categories, I'm not in favor of these even for extinct languages. "Disestablishment" implies an affirmative action (such as closing a business or dissolving a state). A language becomes extinct because the last speaker dies, not because of any action taken towards the language.--William Thweatt ^Talk^Contribs 05:47, 3 August 2016 (UTC)

Is that really any different than when countries sort of start? Besides, it's not like, other than constructed languages, you are going to get a definitive period of like "established in 1925 BC" rather than "established during the 2nd century BC". We have evidence from specific dates, that's obvious, but I'm presuming that, like archaeologists, there is a guess to when the actual language developed separate from when the writings exist. Still the language is extinct on that date either because the last speaker has died or because all speakers have converted to a newer language. I mean it is said that Middle English became modern English because someone who used to speak Middle English died and thus no one did any longer (I know, it's transitional so no literal person exists) which occurred vaguely during the 11th century. I don't expect this to be defined to the year (the 3rd-millennium BC ones are obviously going to be vague) but is it useful you think to know which languages and other things were developing together in the 3rd-millennium BC in Africa, so to speak? It seems like a useful and accurate categorization. -- Ricky81682 (talk) 19:26, 3 August 2016 (UTC)

Is Gyani Maiya notable?

Is Gyani Maiya, one of the last speakers of the Kusunda language, notable? We have a lot of articles on the last speakers of a language (see Category:Last known speakers of a language). See more information here. Thanks :) Inter&anthro (talk) 03:51, 10 August 2016 (UTC)

Inter&anthro I'd review Wikipedia:Articles for deletion/Boa Sr. and how Wikipedia:Articles for deletion/Roscinda Nolasquez is going. -- Ricky81682 (talk) 05:10, 10 August 2016 (UTC)

They both closed as keep, and there's a suggestion that last speakers of a language, covered as such in RS, are inherently notable or at least notable by default absent a strong showing to the contrary. Some extended debate in one of them is interesting for reminding (rather strenuously) that even if WP:GNG was not met, the information in a stub on such a person should be merged into the article on the language. — SMcCandlish ☺ ☏ ¢ ≽^ʌⱷ҅_ᴥⱷ^ʌ≼ 11:02, 13 August 2016 (UTC)

Ozark English → Appalachian English?

I know that this discussion (Talk:Southern_American_English#Merger from "Ozark English"?) has been open for a very long time, but it seems to me that very little actual discussion is occurring. Can we please have more voices/opinions on whether the measly stub Ozark English could be incorporated as a section of Appalachian English, due to its being a subset of this variety? So far, two have opposed, but when I try to continue discussions with them or counter their arguments... silence. I also recently found some new information to bolster my argument, but there have been no responses. Obviously I'm in favor of the merger, but I'm fine if it's shot down so long as people actually carry on a true dialogue. Please agree/disagree/comment/etc. Thanks! Wolfdog (talk) 15:11, 19 August 2016 (UTC)