Talk:Usage share of operating systems: Difference between revisions
Both AT Internet and StatOwl ignore mobile clients in their reports (well, in fact AT Internet notices the existence of iOS but doesn't consider Android worth counting, nor as a Linux "variant"; StatOwl just ignores them). That makes the rest of the values inflated, so comparing the numbers from these two sources with the rest isn't a fair comparison. Thus, I propose for us to just stop taking into account both these sources, until they start reporting (or taking into account in their reports) the existence of mobile web clients. [[Special:Contributions/195.23.131.230|195.23.131.230]] ([[User talk:195.23.131.230|talk]]) 15:58, 12 April 2011 (UTC)
:AFAICS AT Internet includes Android and a number of things under 'other'. This is perfectly OK for our purposes. StatOwl is more of a problem because they just take desktop OSes with above 0.1% share and expand the numbers so they add up to 100%. This is inconsistent with the rest of our table and there's no easy way of fixing it. So I think we should keep AT but I've no objection to removing StatOwl if that's where the consensus is.--[[User:Harumphy|Harumphy]] ([[User talk:Harumphy|talk]]) 07:17, 13 April 2011 (UTC)
::AT Internet: The fact that AT Internet puts under "other" things that we don't makes our "other" data erroneous, because what fits in that column differs between AT Internet and us. The only ways we can be correct about the data we're dealing with are either removing AT Internet as a source, or putting things like Android under other as well, like they do. So we actually have three different choices: 1) being wrong (as we are now), 2) removing one source (and thus reducing the accuracy of the data we're presenting), or 3) putting Android under other, which I honestly don't like, since Android is technically Linux, so the numbers of "Linux" would be "some Linux", which would cause confusion... [[Special:Contributions/195.23.92.1|195.23.92.1]] ([[User talk:195.23.92.1|talk]]) 16:07, 8 August 2011 (UTC)
::StatOwl - I vote for removing StatOwl, since the fact that they don't have an "other" makes their data meaningful only in comparison between those OSs they have stats on. It might be interesting data, but it simply doesn't fit what we're trying to represent in this table. [[Special:Contributions/195.23.92.1|195.23.92.1]] ([[User talk:195.23.92.1|talk]]) 16:07, 8 August 2011 (UTC)
:I oppose removing StatOwl - they are a valid, reliable and verifiable source. We could add a note explaining their methodology if more explanation is needed, but we should not remove this source. [[User:Wikiolap|Wikiolap]] ([[User talk:Wikiolap|talk]]) 17:34, 13 April 2011 (UTC)
::They are reliable and verifiable, yes, but they're not measuring the same thing we're representing in that table. They represent the share between a list of OSes, while we're representing the share between all OSes (thus the "other" column). They don't give us enough data (an other column, for instance) to even find out what's the real percentage of those OSes they're representing, so their numbers, while interesting, simply don't have enough info to fit in our table. Putting them there, as they are nowadays, just adds known-yet-unmeasurable error into the table... [[Special:Contributions/195.23.92.1|195.23.92.1]] ([[User talk:195.23.92.1|talk]]) 16:07, 8 August 2011 (UTC)
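The rescaling problem raised in this thread can be illustrated with a minimal sketch. All figures below are hypothetical and chosen only for illustration; the helper name `renormalize` and its `threshold` parameter are assumptions modelled on the description above (keep OSes above 0.1% and expand them to 100%), not StatOwl's actual procedure.

```python
def renormalize(shares, threshold=0.1):
    """Keep entries above `threshold` percent and rescale them to sum to 100."""
    kept = {name: value for name, value in shares.items() if value > threshold}
    total = sum(kept.values())
    return {name: value * 100.0 / total for name, value in kept.items()}


# Hypothetical whole-market shares in percent (not taken from any cited source).
full_market = {"Windows": 85.0, "Mac OS X": 6.0, "Linux": 1.0, "Mobile and other": 8.0}

# A desktop-only report simply never sees the "Mobile and other" bucket.
desktop_only = {name: value for name, value in full_market.items()
                if name != "Mobile and other"}

rescaled = renormalize(desktop_only)

for name, value in rescaled.items():
    print(f"{name}: {full_market[name]:.2f}% of all clients, "
          f"{value:.2f}% after rescaling")
```

Every rescaled figure comes out higher than its whole-market counterpart, and without an "other" value there is no way to undo the rescaling from the published numbers alone.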
== Linux table headings ==
Revision as of 16:07, 8 August 2011
Linux share: Caitlyn Martin's blog piece
Is this [1] a credible secondary source? It seems to me to be an exercise in wishful thinking. It seems to be clutching at straws. As a Linux enthusiast myself I've tried to follow her argument but it doesn't stack up IMHO. She says "The best estimate for present sales is around 8%", but she doesn't cite a source for this estimate, and in any case, present sales is a very different thing from the total installed base, bought over several years, that makes up usage share.
I can quite accept that the web client stats under-measure Linux a bit, mainly because Linux users are relatively security and privacy conscious and thus more likely to disable javascript, install adblock etc., all things which reduce counting on the third-party stats sites. It's interesting that Wikimedia's figures, based on server log files and thus immune to this hazard, show a somewhat higher figure (1.57%) than most of the others.--Harumphy (talk) 14:00, 4 December 2010 (UTC)
- Agreed, but still not 8%. I tried to work that 8% figure in somewhere too, but the jump from 'installed user-base' to 'current sales' seemed too sharp for a short addition to existing text. The only way would be to devote a whole couple of sentences to it somewhere, and I'm not sure if she is notable enough for that. O'Reilly is a good source, but I'm not sure of her status to be speaking for them. --Nigelj (talk) 15:10, 4 December 2010 (UTC)
- The only cited source there is quote from Steve Ballmer where he says that internal Microsoft research showed Linux and MacOS shares comparable. We already have this source covered. The blog doesn't seem notable enough to include in the article.Wikiolap (talk) 20:03, 4 December 2010 (UTC)
Should we remove the Wikimedia web client statistics?
The article currently states: "All of these sources monitor a substantial number of web sites. Statistics that relate to a single web site are excluded." To a large extent, this is not true for Wikimedia, of which Wikipedia alone is by far their most trafficked web site (although one that most English-language Web users have visited).
Also note that the Wikimedia report is based on the total number of HTTP requests rather than the number of unique clients (as determined using cookies). We need to consider the merits of the two approaches and which is more accurate. The Wikimedia report could easily be biased toward those operating systems used by those who access Wikipedia more often (although the others could be influenced by how much of each browser's user base regularly clears cookies). On these two principles, should we exclude the Wikimedia statistics? PleaseStand (talk) 00:48, 14 December 2010 (UTC)
- Wikimedia's stats cover 60-odd sites within the Wikimedia family.[2] While this is much less 'substantial' than many of the other sources, it's much greater than 'one', the avoidance of which (specifically w3schools) was the original purpose of that sentence. (From time to time we get people trying to add w3schools' stats to the table, or suggesting that we should on this discussion page. Often they seem to be unaware that that site's stats are for its own site only, and that that site is aimed at web developers - a highly atypical readership with a much more diverse set of web clients than the general web-using population.) Also, the Wikipedias are very high-traffic sites. The English one disproportionately so, granted, but there are similar regional/linguistic skews in many of the other stats sources too. So I wouldn't exclude Wikimedia stats on the grounds that they monitor an insubstantial number of sites.
- AFAIK there's no evidence to suggest that certain operating systems are used by those who access WP more often. Just as there's no evidence that certain OS's are used more by those who clear cookies, block scripts, use adblock etc. I imagine many of us have our suspicions in this regard, but no actual evidence. And if we had such evidence, the magnitude of the biases they introduce may be no larger than many of the other biases we already know about and to which all the sources are prone. So I don't think there's a case for excluding Wikimedia stats here, either. Harumphy (talk) 11:05, 14 December 2010 (UTC)
Northern Ontario Jacob12190 (talk) 11:06, 14 December 2010 (UTC)
Web client table tweaks - January 2011
I propose to make a couple of minor tweaks to the table when the December figures come out in the new year, unless there are objections here first:
- For the Clicky desktop/mobile 'in lieu' split, take the mean of the Net Market Share and Statcounter figures instead of just using Statcounter. (This will probably have the effect of reducing Clicky's mobile share from around 4.1% to around 3.6%.) The footnote will explain what has been done.
- Android is rising rapidly and within a few months may overtake what we currently call 'mainstream' Linux. I propose to change the "mainstream" sub-heading to "desktop distros.". Harumphy (talk) 14:56, 29 December 2010 (UTC)
- 2) Oppose. There are several other mobile Linux distributions such as Maemo currently included within Mainstream Linux.1exec1 (talk) 00:24, 31 December 2010 (UTC)
- Fair enough. I've done #1 but not #2 in today's update.Harumphy (talk) 11:09, 1 January 2011 (UTC)
Should we remove AT Internet Institute from web client stats?
We're seeing constant changes in the data, month by month (summary of each month here). With lack of frequent updates by ATII, we're ending up with less accurate results... 195.23.92.1 (talk) 18:18, 7 January 2011 (UTC)
- Support 1exec1 (talk) 00:55, 8 January 2011 (UTC)
- Be consistent - if we are to remove sources which don't update every month - then remove all of them, i.e. including Wikipedia one. If Wikipedia stays then ATII should also stay. Wikiolap (talk) 17:12, 8 January 2011 (UTC)
- Remove - ATII is consistently slow at updating their stats. Remove for January's stats, unless they update. As for Wikimedia, we have a little more control over it. I've emailed the person updating the stats in the past, I think we just need to have him set up something more automated, because I think he has to manually run his scripts. Or talk to someone else that has access to the logs and can give them to us. Jdm64 (talk) 18:57, 8 January 2011 (UTC)
- Keep. Last time we discussed this (see archive) we decided to keep stats for up to 12 months. It used to say as much in the web client section. (Somebody, unaware of the discussion and seeing that all the stats at the time were more recent, changed the "12 months" to "few months". I've just changed that back.) If 12 months is too long a period, then we should reduce that period. Whatever we do, we should apply the same time limit to all the sources.Harumphy (talk) 19:21, 8 January 2011 (UTC)
- I, the guy who raised the issue in the first place, agree with this item, then. I didn't know, and couldn't find out anywhere, that the discussion was held in the past and that "12 months" was decided. Since it was, let's just abide by the decision. 195.23.92.1 (talk) 14:09, 10 January 2011 (UTC)
- Hmm, I haven't found any discussion in the archives. The problem is that the software market evolves very rapidly. Most of the time initial adoption of some product increases exponentially, by allowing 12 month delay we face data errors of more than 10% [3]. For example now AT Institute data is different from the median/mean data in Windows columns by huge margins (W7 - ~7%, Vista - ~4%, XP - ~11%). If we reduce the allowed delay to, say 6 months or less, we can lower the possible data errors more than two times.1exec1 (talk) 12:10, 21 January 2011 (UTC)
- The 12 month precedent came up in Talk:Usage_share_of_operating_systems/Archive_1#web_clients_summary_table, in a brief discussion about what to do with OneStat data from Dec 8, 2008 that was getting on for one year old at the time. Jdm64 suggested "less than one year old" and I agreed - nobody else took part in the discussion. We removed OneStat on Dec 9, 2009. For a long time afterwards the article mentioned 12 months, which I recently reinstated.
- As far as the time limit goes, I don't think it matters about AT being 'out of date' because (a) it's an encyclopedia, not a news site, so up-to-the-minute topicality is not essential, and (b) our choice of median rather than mean does a good job of excluding outlier figures. Harumphy (talk) 15:04, 21 January 2011 (UTC)
Median Windows Numbers
I've been playing with the Median numbers, which I had intended to quote in an article I was writing. I'm not quoting them. The numbers do NOT add up. No matter what I've tried, I cannot get those numbers to make any sense, and since there is no explanation of the calculations used to determine the Median, the only conclusion I can draw is that the numbers were either invented, or are in error. So instead of reporting your numbers, I'm reporting them, and my conclusion that this article is in error. If you want to look at the article, which is my prediction for where OS usage shares will be in 2012, it will be at: http://madhatter.ca
One other point - Netbooks should be included as Notebooks, and Tablets should be in a separate category, due to the form factor. Tablets have more in common with phones than they do with notebooks.
UrbanTerrorist (talk) 20:01, 8 January 2011 (UTC)
- The medians are calculated just like any other median, surely? In each column, the median is the middle value of the group. Thus the median of 1, 2 and 999 is 2. (Where there is an even number, it's the mean of the middle two, so the median of 1, 2, 3 and 999 would be 2.5.) Are you saying that our table doesn't do this? If not, what precisely do you think does not add up? Harumphy (talk) 20:08, 8 January 2011 (UTC)
- Are they calculated like any other median? Who knows? There is no explanation as to the method(s) used, and no reference to an explanation. In effect we are given numbers, and told to believe them, which is against the policies of Wikipedia. Either provide an explanation, or remove the median figures.
- The Median label links to Median article which explains how medians are calculated. Do you consider this is not enough ? Do you propose adding a footnote with more details on it ? Wikiolap (talk) 18:51, 11 January 2011 (UTC)
- The term 'median' has a precise meaning and there's only one way of calculating it. Anyone who wants to can calculate the median herself and get the same figure. The problem here is not the article, but your inability to click on the link to the median page to find out what it means. Harumphy (talk) 13:58, 17 January 2011 (UTC)
- A link to the explanation would make things clearer. UrbanTerrorist (talk) 07:51, 27 January 2011 (UTC)
- And the very common problem of confusing median with mean, or believing they share some properties that they do not, like adding up to 100% (under certain circumstances). The different properties are not necessarily easy to understand, and the article is quite heavy for somebody not used to mathematical theory. --LPfi (talk) 10:49, 18 January 2011 (UTC)
- Agreed. Which is why an explanation, or a link to an explanation is needed. UrbanTerrorist (talk) 07:51, 27 January 2011 (UTC)
- I should mention that I'm using the numbers for some articles that I'm writing, and I want to make sure that they are as accurate as possible, thus the questions. And yes, I link back to the source.
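As an illustration of the points made in this thread, here is a minimal sketch of per-column medians. The source names and percentages are hypothetical, invented so that each row sums to exactly 100%; they are not the figures in the article's table. The sketch uses Python's standard `statistics` module.

```python
from statistics import mean, median

# The two worked examples from the discussion above.
print(median([1, 2, 999]))     # odd count: the middle value
print(median([1, 2, 3, 999]))  # even count: mean of the two middle values

# Hypothetical sources (rows) and OS shares in percent (columns).
# Each row sums to 100.
table = {
    "Source A": {"Windows": 90.0, "Mac": 6.0, "Linux": 1.0, "Other": 3.0},
    "Source B": {"Windows": 88.0, "Mac": 5.0, "Linux": 2.0, "Other": 5.0},
    "Source C": {"Windows": 80.0, "Mac": 9.0, "Linux": 1.5, "Other": 9.5},
}

columns = ["Windows", "Mac", "Linux", "Other"]
column_medians = {c: median(row[c] for row in table.values()) for c in columns}
column_means = {c: mean(row[c] for row in table.values()) for c in columns}

print("medians:", column_medians, "sum =", sum(column_medians.values()))
print("means:  ", column_means, "sum =", sum(column_means.values()))
```

With these made-up rows the column means sum to 100% while the column medians do not, even though every individual source does. That is a property of medians in general, not a sign that the figures were miscalculated.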
What gives with server market share BY REVENUE.
Revenue doesn't measure server market share, it measures how much money the supplier of one type of server rakes in. It also measures how much purchasers have had to pay for servers of that type ... both of these are the same number. Because some types of server cost far more than others, the metric skews the perception of market share towards which servers cost the most. These are perceived as having a greater market share, even though they may have relatively small numbers.
Furthermore, there are far more purchasers of servers than there are suppliers. Ergo, from the overwhelmingly predominant perspective, it is better to label this number as COST rather than revenue. Far more people would see it as a cost as opposed to those who see it as revenue.
So, I will keep trying to change the word "Revenue" within the server market share section to read "Cost" instead, until it sticks, because this wording gives the proper perspective on it from the vast majority viewpoint.
Alternatively, one could remove the "Revenue" metrics entirely, because they simply do not show server market share as they purport to.—Preceding unsigned comment added by 118.210.63.179 (talk) 10:46, 9 January 2011
- Please sign your comments - otherwise we won't know who said what. Please see Wikipedia:Signatures.
- Before you 'keep trying to change the word "Revenue" etc.' please read Wikipedia:Edit_warring.
- As you've said, revenue and cost refer to the same number. It's the same sum of money seen from the two sides of the deal. The sources we're citing measure sales, not purchases. So revenue is the more accurate term in this context. Harumphy (talk) 13:59, 9 January 2011 (UTC)
- Why include market share by revenue figures at all? They're useful for stock investors (i.e. which OS is generating the most revenue from a given market), but they will deceive casual readers expecting to learn about the "Usage share of operating systems on servers". Wallers (talk) 14:16, 9 January 2011 (UTC)
- IDC and Gartner are well known sources in the server industry. They report in revenue probably because business people are more interested in the money than in the number of units -- it's what they're used to. Also it's easier to measure because you can't just check what OS a server is running like you can with desktop web browsers. A server could be hosting several virtual servers (upwards of 15) and each virtual server can have its own IP address. So without detailed investigation, you'd see 15 servers when in reality there's only one real server. Hence, the real number of servers (as opposed to number of installs of an OS) is more closely correlated to revenue. Although, it gets complicated because the licensing of Linux servers can be free if the company goes with a distro without 24/7 support (like Debian or CentOS) or it could be costly if paying for a full subscription of RHEL. So, even though Linux is low on revenue, it's still really high in actual usage. Jdm64 (talk) 18:20, 9 January 2011 (UTC)
- Jdm64 wrote it exactly right. There are different metrics to measure market share, and they indeed measure different things. Market share by units is interesting to know which server OS is most popular. Market share by revenue is interesting to see which server OS vendor is making the most money. So there is no contradiction - both are interesting, both are useful, just for different purposes. Wikiolap (talk) 18:11, 10 January 2011 (UTC)
- "Making the most money" also means "getting the most money out of people for less cost to yourself". "Revenue" is a word with positive connotations in people's minds, whereas "Cost" has negative connotations. "Revenue" and "cost" are the same number ... what is revenue to sellers of servers is cost to buyers of servers. Since there are far more buyers than sellers, it is in the best interests of more people to show market share from the perspective of buyers rather than sellers (that is, to show it as cost rather than revenue). A casual reader might see the OS with the highest revenue and without thinking about it associate that positive term with "the best choice", when in fact it is the most costly to him/her. From this perspective, citing statistics for sale value (price) of servers, labelling it with the positive-sounding term revenue, and claiming that this shows "market share" is doing a severe disservice to most people. In fact, it comes perilously close to free advertising for one company, which I would have thought goes against Wikipedia policy.—Preceding unsigned comment added by 118.210.63.179 (talk) 14:08, 11 January 2011
- I think you are stretching the definition of the advertising a bit :) Your thinking about positive vs. negative association is interesting, but it is your opinion only. In encyclopedia we cite verifiable and reliable sources - and both Gartner and IDC qualify as such. They label the metric Revenue, and we must respect that.Wikiolap (talk) 18:54, 11 January 2011 (UTC)
IDC also report units, so there's no reason to use revenue, and I've changed the numbers to the unit rather than revenue figures. For the Gartner figures, I checked the source, and they appear to be unit figures as well, not revenue figures.Shalineth (talk) 06:36, 10 February 2011 (UTC)
- Can you show where in the source the reported percentages are referred to as units? In the table that we cite, the column headers say Revenue. Wikiolap (talk) 20:53, 10 February 2011 (UTC)
- The Gartner source is a three-year-old Reuters article. The text in the article reads:
- According to research firm Gartner, the Windows share of global server shipments gained a percentage point to 66.8 percent in 2007 from a year earlier. Open-source Linux's share fell by a percentage point to 23.2 percent last year and Unix dropped to 6.8 percent in 2007 from 8.1 percent in 2006.
- Note that this refers to the share of global server shipments, i.e. units, not to the share of global server revenue. The figures are also similar to IDC unit figures from about the same time, but very different from IDC revenue figures. It was only in 2005/6 or so that Windows servers overtook Unix servers in terms of revenue, but Windows has been ahead of Unix in unit shipments since the 1990s.
- The IDC figures in the table are indeed revenue for server hardware (not revenue for operating systems), but it is a methodological error to use this as an indicator of server OS 'usage share'. What possible sense is there in saying that one server costing €20 000 and running one instance of AIX contributes 40 times as much to the AIX 'usage share' as one server costing €500 and running one instance of Linux or Windows?
- I had provided a source for IDC unit shipments and corrected the table to include them, but this was reverted.
- In a comparison of server revenue or server profitability, prices matter. If HP, for example, are selling a lot of €50 000 servers and Dell are selling a lot of €1 000 servers, that has a huge impact on their respective results. If each server runs only one copy of an OS, however, the 'usage share' is 1 for each server. The idea that multiplying server operating system units by the cost of the hardware the OS runs on somehow represents 'usage share' is completely nonsensical. Shalineth (talk) 16:28, 12 February 2011 (UTC)
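The gap between unit share and revenue share described above can be sketched with a small calculation. The two prices (€20 000 for an AIX server, €500 for a Linux server) come from the comment above; the unit counts and the helper name `shares` are invented purely for illustration and do not come from IDC or Gartner.

```python
def shares(amounts):
    """Convert absolute amounts into percentage shares of their total."""
    total = sum(amounts.values())
    return {name: 100.0 * value / total for name, value in amounts.items()}


# Prices taken from the discussion; unit counts are hypothetical.
price_eur = {"AIX": 20_000, "Linux": 500}
units_sold = {"AIX": 1, "Linux": 40}

# Hardware revenue per OS is units multiplied by price.
revenue_eur = {name: units_sold[name] * price_eur[name] for name in units_sold}

unit_share = shares(units_sold)
revenue_share = shares(revenue_eur)

for name in units_sold:
    print(f"{name}: {unit_share[name]:.1f}% of units, "
          f"{revenue_share[name]:.1f}% of revenue")
```

In this made-up fleet a single expensive server carries the same revenue weight as forty cheap ones, so the two metrics give very different pictures of the same installed base.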
Overview section
Recently somebody added the overview table, and various editors including me attempted to knock it into shape. I don't think we've been very successful. I can't see where many of the figures come from. The web client medians don't (and shouldn't be expected to) add up to 100% and so don't constitute 'share' anyway. Should we delete this section, or can it be improved? Harumphy (talk) 08:55, 17 January 2011 (UTC)
- I'd lean towards deleting it. Many of the fields are blank because some of the selected OSs come from disjoint usage (i.e. mainframe and smartphones). Although, the one thing going for the table is the quick summary. I'd propose that key stats from the table be included in the opening paragraph. Something to elaborate on the current opening, but with some actual numbers. Jdm64 (talk) 10:32, 17 January 2011 (UTC)
- Does anyone want to speak in the overview section's defence? If not I'll delete it in a couple of days from now. As far as putting figures in the opening paragraph goes, ideally that would be done in a way that doesn't need updating every month. Harumphy (talk) 09:12, 18 January 2011 (UTC)
- I vote to delete it, the idea of this overview table never appealed to me, and it doesn't look like it is helping the article. Wikiolap (talk) 17:26, 18 January 2011 (UTC)
- Now deleted as agreed. Harumphy (talk) 10:14, 21 January 2011 (UTC)
Tablets
Tablets need to be moved to their own section. They have nothing in common with Netbooks, though they may replace them in some sales. Netbooks should be combined with Notebooks, the difference between them is artificial.
UrbanTerrorist (talk) 07:55, 27 January 2011 (UTC)
- Yes. A netbook is just a small laptop/notebook. I've changed the sub-heading "Netbooks and Tablets" to just "Netbooks". There's very little info on tablets as a category so far. The main one is the iPad which runs iOS and this gets covered anyway. Do all tablets speak mobile, or are some of them WiFi only?--Harumphy (talk) 09:20, 27 January 2011 (UTC)
Font sizes
I've reverted 1exec1's changes to a couple of tables, in which the font size was fixed at 85% of normal.
Please, if font sizes look too big on your computer, it doesn't mean they look too big on everyone else's. The web is not a wysiwyg medium. You can adjust your browser's normal font size to suit your preferences. I have adjusted mine, and I don't want you reducing the font size on my computer (and everyone else's) just because it looks better on yours!
Besides, from a graphic design point of view it looked awful. --Harumphy (talk) 09:37, 9 February 2011 (UTC)
Possibly dubious server share claims based on websites
Are there any authoritative sources suggesting that scanning public websites is a reasonable way to estimate server market share? It seems rather dubious to me. For one thing, Linux is well known as a good OS for running web servers (the LAMP stack), so a sample of web servers may not be representative of servers in general, but rather biased towards Linux.
Another problem is that a single server OS can host a large number of small websites, whereas a large website may require several servers, especially if it makes heavy use of SSL. If website characteristics differ across sites, then an estimate of server market share based on the number of websites would be biased towards the system most favoured by smaller, less complex sites. Netcraft report a 50 per cent Windows share for SSL websites (http://news.netcraft.com/ssl-survey/), compared with a 25 per cent share for non-SSL websites, which suggests that estimates based on websites may indeed be biased towards Linux.
Unless an authoritative source suggests that counting the number of public websites using a particular server OS is a valid way of estimating server OS market share, this looks like original research, and I suggest it be deleted. A separate section on web server OS share might be reasonable. Shalineth (talk) 06:55, 10 February 2011 (UTC)
- Netcraft is a reliable and verifiable source, and we properly reference it. They choose to analyze OS share of web servers, and we accurately mention it. Hence this is not original research. Some readers may or may not agree with their methodology, but it is really not up to us to make judgments whether or not we like it. As an encyclopedia we report citing reliable and verifiable sources. Wikiolap (talk) 20:51, 10 February 2011 (UTC)
- Netcraft present website statistics, not server OS market share. You're confusing two different things, and that's where the original research lies. It's like looking at valid statistics for flights out of a particular airport and claiming that represents airliner market share.Shalineth (talk) 11:13, 12 February 2011 (UTC)
Marketshare Servers based on websites is totally misleading
The current server share language is complete nonsense. It somehow asserts that web servers are a good indicator of server share. That is utter nonsense. The vast majority of servers are not web servers.
There have been edits made to temper the uncited claims in this section, but they have been reverted.
It's clear that this section is merely a +POV apology for certain products. It's clear that wikipedia is being used as a promotional tool for certain products. —Preceding unsigned comment added by 173.206.8.177 (talk) 18:59, 11 February 2011 (UTC)
- There are two issues here:
- 1. The explanations in the Server section are indeed unreferenced, and this invites people to add even more unreferenced material. I am OK with completely removing the text, but the past experience shows that someone will add it again anyway. The better approach is to find reliable and verifiable references for that portion (something I wasn't able to easily find myself).
- 2. Methodology of measuring market share. We as encyclopedia should not pass judgement on whether some methodologies are "complete and utter nonsense" or not. Some people claim that measuring revenue is nonsense, some claim that measuring web servers is nonsense. Everybody is free to form their own opinions - but we as encyclopedia just report on what our sources say - Gartner, IDC, Netcraft etc.
- Re: #2; "We as encyclopedia should not pass judgement on whether some methodologies are "complete and utter nonsense" or not." I agree. But in this context, presenting only publicly available webservers -- in a conversation about Server OS Marketshare **is**.
- The discussion re: web server marketshare is irrelevant here, in this context. All the uncited material from the above paragraph and the data in the "method units (web)" table should be removed. It is intentionally misleading and utter nonsense to compare oranges to bananas in this way.
- 173.206.8.177 (talk) 20:29, 11 February 2011 (UTC)
- Could you tell us why exactly you want to remove web units data? 1exec1 (talk) 22:56, 11 February 2011 (UTC)
- We have had, at various times, three different methods of counting servers in this section: unit sales/revenue/web servers. All three have strengths and weaknesses - there is no clear-cut right and wrong here. We should just report all three, perhaps in three separate tables, pointing out the strengths and weaknesses of the methods too.--Harumphy (talk) 00:00, 12 February 2011 (UTC)
- Reporting all three in separate tables sounds like a good first step. Based on the source, however, the Gartner figures are units, not revenue, so that's wrong to start with (I corrected the mistake, but someone reverted it with no explanation). Reporting server market share in both units and revenue would be the best idea.
- Website share should be split out into a separate category, since it's a completely different issue from server OS market share. The text is also horrible, and should probably be deleted. I made some minor improvements to make it less POV, but those were reverted too.
- If you oppose splitting the website figures into a separate category, can you point to an authoritative source that claims Netcraft's website survey has anything at all to do with server market share?
- Overall, it's obvious someone is abusing the article to promote a particular POV. I'm not really interested enough to bother with it, but maybe someone with more time on their hands can correct this. If not, I suppose it'll be another case where the reputation of WikiPedia is damaged by a zealot pushing a particular POV and reverting any corrections or attempts to make the text NPOV. Shalineth (talk) 11:32, 12 February 2011 (UTC)
- Web servers are a subset of servers-in-general, so I think the best thing to do would be to do units and revenue in two tables, then add a sub-heading "Web servers" with the third table in that new sub-section.--Harumphy (talk) 13:45, 12 February 2011 (UTC)
- Yes, certainly, but web sites are not the same as web servers. I imagine it takes an enormous number of servers to run www.facebook.com, for example, whereas even a small server could run hundreds of very simple sites. The fact that web servers are a subset of total servers is a minor problem. The bigger problem is that there is no one to one correspondence between web sites and web servers, much less between web sites and either servers generally or server OS installations. This means that, barring authoritative evidence to the contrary, web site numbers cannot be considered valid estimators of even web server OS market share, much less overall server OS market share. Shalineth (talk) 14:50, 12 February 2011 (UTC)
- 3. The definition of "server" is broad. Web server market share might be estimated via the web while giving numbers for File server market share (down to NAS) via web is a challenge. --95.117.233.197 (talk) 13:59, 12 February 2011 (UTC)
- 4. The conjecture that IDC or Gartner figures substantially underestimate Linux or open source servers is logically unsound. As documented here, IDC unit figures for server shipments include Windows, Linux, Unix and other. Servers sold with no operating system would thus fall into the other category. However such servers make up only about 0.3% of the total (for Q1 2010). This implies two things:
- 1. The Windows and Unix market shares, 75.3% and 3.6% respectively in Q1 2010, are minimum market share levels, and do not overstate market shares for shipped servers.
- 2. The Linux market share is not substantially understated. Even if Linux is installed on every single server that didn't ship with either Windows or Unix, its Q1 2010 market share would only increase from 20.8% to 21.1%.
- In light of the above, I suggest that the unsupported conjecture that IDC numbers understate open source server operating systems be deleted from the article, unless authoritative evidence to the contrary is provided. Shalineth (talk) 14:50, 12 February 2011 (UTC)
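The arithmetic behind point 4 can be checked with a short sketch. It uses only the Q1 2010 IDC unit shares quoted above; treating the remainder as the "other" category, and the variable names, are assumptions made for illustration.

```python
# Q1 2010 IDC server unit shares as quoted in the comment above (percent).
shares = {"Windows": 75.3, "Linux": 20.8, "Unix": 3.6}

# Assumption: whatever is not Windows, Linux or Unix is the "other" category,
# which would include servers shipped with no operating system.
other = round(100.0 - sum(shares.values()), 1)

# Upper bound for Linux if every "other" server ended up running Linux.
linux_upper_bound = round(shares["Linux"] + other, 1)

print(f"other: {other}%")
print(f"Linux upper bound: {linux_upper_bound}%")
```

Under these figures the bound moves Linux by only the size of the "other" category, which is the point being made about shipped servers. It says nothing about operating systems replaced after shipment.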
Suggestions for correcting server market share section
I propose the following corrections to the server market share section:
- Remove unsupported text claiming that IDC/Gartner figures understate open source OS share.
- Remove irrelevant web site share figures for possible inclusion in a separate section on website OS shares.
- Correct labelling of Gartner unit figures, which are currently mislabelled as revenue figures.
- Replace methodologically incorrect IDC server hardware revenue figures with methodologically correct server unit figures.
I probably shan't have time to check the page before next weekend. Comments appreciated. Shalineth (talk) 16:35, 12 February 2011 (UTC)
- 1 - I support removing all unreferenced claims.
- 2 - measuring market share of web sites is a valid method that at least 3 different sources use (Netcraft, securityspace, w3tech) - we should not remove legitimate reliable and verifiable sources. We already have some text which tries to clarify difference between methodologies. Maybe this text could be improved, but it should not have unreferenced claims either (see #1)
- 3 - Gartner reports revenue. The source is reliable and verifiable, but not public - the report itself costs money. I had access to it a couple of years ago, I will try to get access again and verify that it is indeed revenue.
- 4 - IDC reported market share by revenue, and it is a perfectly valid methodology (IDC is a reliable and verifiable source). I used to have an additional line in the table for IDC numbers by unit, but it was removed by other editors. I will be happy to add it back.
- Remove unreferenced claims. Report web site share in a new section, separate from the server section. Report both units and revenue in separate tables with correct labelling, even if there's only one cited source.--Harumphy (talk) 11:08, 13 February 2011 (UTC)
- If we separate website and server-share reports we had better not include the website share at all. I suggest reordering the current table so that sources reporting website share are grouped together. We can also introduce one more column that says which method was used to acquire the statistics. Also see my answer below. 1exec1 (talk) 15:20, 13 February 2011 (UTC)
- I disagree. The article already has a separate section for 'Web clients', which is distinct from the sections for 'Desktop and laptop computers', 'Netbooks' and 'Mobile devices'. The consistent approach for servers would be to have a section for 'Web sites', which is distinct from 'Servers'. Shalineth (talk) 21:09, 21 February 2011 (UTC)
- 2. Measuring market share of web sites is a valid way of measuring web site share. This is an article about server OS usage. Is there an authoritative source claiming that measuring web site share is a valid way of measuring either web server share or web server OS share? If not, I suggest it belongs in its own section (or perhaps own article) -- an article about web site market share, as opposed to (web) server OS market share. Again, I must stress, these are not synonymous. It is a severe methodological error to assume they are. Shalineth (talk) 12:19, 13 February 2011 (UTC)
- 3,4. Gartner and IDC report both revenue and units, although not all reports contain both measures. Revenue is a valid measure for market share, which can be defined in terms of either revenue or units. This article is about usage share, which implies units. Second, the revenue figure is for servers, not server OSes. That would be fine in an article about server hardware market share, but this is an article about server OS usage share. Again, the figures are absolutely valid, but they're being used incorrectly in this article. Shalineth (talk) 12:19, 13 February 2011 (UTC)
- @Shalineth: Website share is a proxy for the actual server OS usage share in the same way as inspecting user agent strings is a proxy for desktop OS market share. If you consider them not appropriate, then sources reporting server market/unit share are not appropriate reference points either, as they report the current sales, not the share of already deployed servers.
- In conclusion all sources used in the article are biased in one way or another. Since we are only presenting and commenting on the data, not interpreting it, all sources must have the same credibility, unless there is a strong reason not to do so. 1exec1 (talk) 15:20, 13 February 2011 (UTC)
- This is true. Sales by hardware units and sales by hardware revenue are also proxies for OS usage share. None of the three methods correlates directly with OS usage share, but all three are of interest nevertheless. We should just report what the sources say, accompanied by a concise summary of the strengths and weaknesses of each method. It is for the reader to decide how much credence to give to each method, not us. --Harumphy (talk) 09:57, 14 February 2011 (UTC)
- @ 1exec1
- It isn't quite the same thing, since there's usually a 1:1 mapping of web clients to client OSes. For web servers, a single server can run a huge number of websites, and at the other extreme, some websites require large server farms. All this means that the approximation is much closer on the client side. In any case, I think it's perfectly reasonable to include web site OS share, as long as it's properly labelled as 'Web site OS usage' and not conflated by original research with 'server OS usage'.
- The same applies to 'server OS unit shipments' and 'server hardware revenue'. It's fine to include them both, as long as it's made very clear what they are, and 'server hardware revenue' isn't mislabelled as 'server OS revenue' or 'server OS usage'. What actually brought this article to my attention in the first place was confused comments by Linux advocates who thought 'revenue' in this article referred to software vendor revenue, not to server hardware revenue, and were going on about how most users don't pay for Linux so revenue figures are invalid, etc. The section on server OSes is very unclear about these things, and looks like a clear case of misrepresentation of data (not necessarily intentional -- though the unreferenced comments suggest it is). The data are valid, but are being misused. Shalineth (talk) 21:09, 21 February 2011 (UTC)
- This sounds like a consensus to me - we keep the valid data in the article, but relabel it to disambiguate what it actually means. I would support this effort. Wikiolap (talk) 23:51, 21 February 2011 (UTC)
- It sounds like consensus to me too.--Harumphy (talk) 13:34, 27 February 2011 (UTC)
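The distinction drawn in this thread between web site share and server share can be illustrated with a small sketch. All figures and OS names below are hypothetical, chosen only to show how the two measures can diverge; they are not taken from Netcraft, IDC, Gartner or any other source.

```python
# Hypothetical deployments: (operating system, websites hosted, servers used).
deployments = [
    ("OS A", 300, 1),   # one small server hosting hundreds of simple sites
    ("OS B", 1, 50),    # one large site spread across a server farm
]


def share_by(rows, column):
    """Return each OS's percentage share of the given column (1=sites, 2=servers)."""
    totals = {}
    for row in rows:
        totals[row[0]] = totals.get(row[0], 0) + row[column]
    grand_total = sum(totals.values())
    return {name: 100.0 * count / grand_total for name, count in totals.items()}


site_share = share_by(deployments, 1)
server_share = share_by(deployments, 2)

for name in site_share:
    print(f"{name}: {site_share[name]:.1f}% of sites, "
          f"{server_share[name]:.1f}% of servers")
```

With these made-up numbers the OS that dominates by site count is a small minority by server count, which is why the two kinds of figure need separate labels.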
Time limit for out-of-date sources
There was some discussion earlier in Talk:Usage_share_of_operating_systems#Should_we_remove_AT_Internet_Institute_from_web_client_stats.3F. I think it's fair to say there's a consensus that we should apply the same time limit, whatever that limit is, to all the sources. At the moment it's 12 months. Someone suggested we should reduce it to 6 months. (If we did that then ATII would get removed on 1st April if they haven't updated by then, because they last updated on 31/9/2010.) So, should we cut the time limit to 6 months?--Harumphy (talk) 13:34, 27 February 2011 (UTC)
- I think yes. The previous discussion stalled because Wikipedia doesn't update either. As that problem has since been solved, I know of no reason to keep a single old source that skews the data. 1exec1 (talk) 17:11, 27 February 2011 (UTC)
- 6 months seems reasonable to me. Jdm64 (talk) 02:21, 28 February 2011 (UTC)
- FYI ATII has just updated. They must have heard us!--Harumphy (talk) 16:02, 1 March 2011 (UTC)
- Yes, and I'm extremely disappointed with them. As you can see in the more detailed PDF, they treat Android as the "Google Operating System" and not as Linux, which gives inaccurate data for this table... 89.181.106.123 (talk) 00:29, 2 March 2011 (UTC)
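The proposed time limit amounts to a simple date check, sketched below. The function names are hypothetical, and the last-update date is an assumption: the thread gives "31/9/2010", which is not a valid calendar date, so 30 September 2010 is used here purely for illustration.

```python
from datetime import date


def whole_months_between(earlier: date, later: date) -> int:
    """Count complete calendar months elapsed from earlier to later."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month has not been completed yet
    return months


def is_stale(last_update: date, today: date, limit_months: int) -> bool:
    """A source is stale once limit_months complete months have passed."""
    return whole_months_between(last_update, today) >= limit_months


# Assumed last-update date (see the note above about the date in the thread).
last_update = date(2010, 9, 30)

print(is_stale(last_update, date(2011, 4, 1), limit_months=6))
print(is_stale(last_update, date(2011, 4, 1), limit_months=12))
```

Whatever limit is chosen, applying one function to every source keeps the rule consistent across the table.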
Mobile Devices Citation
Caption on image currently reads "Share of 2010 Q4 smartphone sales to end users by operating system, according to Gartner", followed by a citation.
The numbers in the pie chart are not contained within the cited article. The cited article was written on 19 May 2010, and reports on 2010 Q1 numbers.
Caption should be revised to cite an article containing the numbers used on the pie chart, or the pie chart should be changed to reflect the numbers in the cited article. Mismatches are bad, mmmkay?
64.113.8.130 (talk) 22:55, 4 April 2011 (UTC)
Web clients - remove sources
Both AT Internet and StatOwl ignore mobile clients in their reports (well, in fact AT Internet notices the existence of iOS but doesn't consider Android worth counting, nor as a Linux "variant", StatOwl just ignores them). That makes the rest of the values inflated, so comparing the numbers from these two sources with the rest isn't a fair comparison. Thus, I propose for us to just stop taking into account both these sources, until they start reporting (or taking into account in their reports) the existence of mobile web clients. 195.23.131.230 (talk) 15:58, 12 April 2011 (UTC)
- AFAICS AT Internet includes Android and a number of things under 'other'. This is perfectly OK for our purposes. StatOwl is more of a problem because they just take desktop OSes with above 0.1% share and expand the numbers so they add up to 100%. This is inconsistent with the rest of our table and there's no easy way of fixing it. So I think we should keep AT but I've no objection to removing StatOwl if that's where the consensus is.--Harumphy (talk) 07:17, 13 April 2011 (UTC)
- AT Internet: Because AT Internet puts things under "other" that we don't, our "other" column does not mean the same thing for AT Internet as it does for the other sources, so the table is erroneous. The only ways to be correct about the data we're dealing with are either removing AT Internet as a source, or putting things like Android under "other" as well, like they do. So we actually have three choices: 1) being wrong (as we are now), 2) removing one source (and thus reducing the accuracy of the data we're presenting), or 3) putting Android under "other", which I honestly don't like, since Android is technically Linux, so the numbers for "Linux" would be "some Linux", which would cause confusion... 195.23.92.1 (talk) 16:07, 8 August 2011 (UTC)
- StatOwl - I vote on removing StatOwl, since the fact that they don't have an "other" makes their data meaningful only in comparison between those OSs they have stats on. It might be interesting data, but it simply doesn't fit on what we're trying to represent in this table. 195.23.92.1 (talk) 16:07, 8 August 2011 (UTC)
- I oppose removing StatOwl - they are a valid, reliable and verifiable source. We could add a note explaining their methodology if more explanation is needed, but we should not remove this source. Wikiolap (talk) 17:34, 13 April 2011 (UTC)
- They are reliable and verifiable, yes, but they're not measuring the same thing we're representing in that table. They represent the share between a list of OSes, while we're representing the share between all OSes (thus the "other" column). They don't give us enough data (an other column, for instance) to even find out what's the real percentage of those OSes they're representing, so their numbers, while interesting, simply don't have enough info to fit in our table. Putting them there, as they are nowadays, just adds known-yet-unmeasurable error into the table... 195.23.92.1 (talk) 16:07, 8 August 2011 (UTC)
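The effect described in this thread, where a source keeps only some OSes and expands their numbers to add up to 100%, can be sketched as follows. The share figures are hypothetical and are not StatOwl or AT Internet data; the function name is also an assumption.

```python
# Hypothetical full-market shares (percent), including mobile and "other".
full_market = {
    "Windows": 85.0,
    "Mac OS X": 6.0,
    "Linux": 1.0,
    "iOS": 4.0,
    "Android": 2.0,
    "Other": 2.0,
}


def renormalise(shares, keep):
    """Keep only the named OSes and rescale them so they sum to 100%."""
    subset = {name: shares[name] for name in keep}
    total = sum(subset.values())
    return {name: 100.0 * value / total for name, value in subset.items()}


desktop_only = renormalise(full_market, ["Windows", "Mac OS X", "Linux"])

for name, value in desktop_only.items():
    print(f"{name}: {full_market[name]:.1f}% of all clients, "
          f"{value:.1f}% after rescaling")
```

Every retained OS ends up with a larger figure than its share of all clients, and without an "other" value the original percentages cannot be recovered from the rescaled ones.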
Linux table headings
For clarity and consistency between sections, I suggest we change the top-level heading in both the web client and mobile device tables from "Linux" and "Linux based" respectively to "Linux kernel based", and change the second-level heading in the web client table from "mainstream" to "Linux". --Harumphy (talk) 19:34, 11 May 2011 (UTC)
- I don't think that's the best solution. For one thing, "Linux kernel based" is a long title. Second, I think it would be confusing. What's the difference between "Linux" and "Linux kernel based"? I understand what you're trying to say, but would others? I think it's fine how it is, or possibly, "Linux" as the top heading (or "Linux based") and then sub-headings of "GNU/Linux" and "Android/Linux". Jdm64 (talk) 22:14, 11 May 2011 (UTC)
- Linux has two meanings: (1) the Linux kernel, and (2) the family of operating systems based around it, which are largely binary compatible with each other and traditionally known as Linux distributions. Then there is Android, which uses a forked Linux kernel, is binary incompatible with Linux distributions and has a stack sitting on the kernel which is very different from anything else. The only thing that Android has in common with Linux distributions is the kernel, and that is a heavily modified, incompatible derivative. I am aiming to better reflect the two meanings, and to deal with the fact that within a couple of months or so it looks as though Android will be more mainstream than the stuff we currently call "mainstream". As far as length goes, "Linux kernel based" will fit without expanding column width. (I've tried it.) I don't think we should use GNU/Linux or Android/Linux as they really are too long, don't reflect what the sources say and do not aid understanding at all. --Harumphy (talk) 08:05, 12 May 2011 (UTC)
- I am more confused by "Linux kernel based" vs "Linux based" as they may be understood as synonyms and anything Linux based is certainly Linux kernel based. We must of course use terminology that reflects what the sources are talking about, but isn't most Linux except Android indeed GNU/Linux (which is not longer than "Linux based")? If there is significant use of other Linuces (affecting the decimal points we are writing out) simply "Android" and "Other Linux" should do. --LPfi (talk) 11:51, 12 May 2011 (UTC)
[section break]
Just to be clear, I'm suggesting this:
| Linux kernel based | |
|---|---|
| Linux | Android |
The top line is an umbrella heading that accurately reflects the only thing that Linux distributions and Android have in common: some sort of Linux kernel. In the second line, Linux means what it is most commonly understood to mean - a Linux distribution. In this I'm taking the view that Android is *not* a Linux distribution in the conventional sense because it has so little in common with Debian, Ubuntu, Fedora, RHEL, SuSE etc. All of the stats sources except Wikimedia separate Linux and Android in this way. --Harumphy (talk) 12:40, 12 May 2011 (UTC)
- Like LPfi said, anything Linux based is surely Linux kernel based; this is like how Linux is a Unix-like OS. Your headings look redundant, especially to somebody that doesn't know about Linux; and it doesn't make somebody want to learn what the distinction is. I think the layout below clearly shows the distinction between normal Linux and Android. "Linux based" is a link to "Linux kernel". "GNU/Linux" could be 2 separate links to GNU and Linux or one link to Linux Distribution. How is that not simple and clear? Jdm64 (talk) 20:22, 12 May 2011 (UTC)
| Linux Based | |
|---|---|
| GNU/Linux | Android |
- The phrase "Linux based" is no more informative than just "Linux", because it doesn't make clear which of the two things called Linux forms the base. It could mean either just the kernel or the kernel plus the stuff that makes a Linux distribution. So, to answer your question, it's not simple and clear because it's ambiguous. Sure, the kernel's always there, even in Android, but the other stuff isn't. By excluding the word kernel, it doesn't make it clear that Android is based on only the kernel and not the other stuff. The 'umbrella' heading should reflect what the things under it have in common. They have only one thing in common: the kernel. That is why the k-word is the key to comprehension here. --Harumphy (talk) 23:38, 12 May 2011 (UTC)
- Ok, fine, include kernel. But that still doesn't remove the confusion about "Linux kernel based" and "Linux". It should be "GNU/Linux" to show how Linux kernel based is different than Linux. Jdm64 (talk) 01:24, 13 May 2011 (UTC)
- Fair enough. Thanks. I'll settle for that. --Harumphy (talk) 07:13, 13 May 2011 (UTC)
I believe Android should be reclassified as a mobile device. See my comments there. hhhobbit (talk) 14:24, 5 June 2011 (UTC)
Count Amazon Kindle?
Amazon Kindle was reported to likely break 8 million units sold last year. http://www.slashgear.com/amazon-likely-to-break-8-million-kindle-units-sold-this-year-21120580/
With quite a few media being sold: http://news.cnet.com/amazon-kindle-books-outselling-all-print-books/8301-17938_105-20064302-1.html Better data is likely available. Seems these are significant numbers. --89.12.7.116 (talk) 20:57, 26 May 2011 (UTC)
- This page is about usage share of operating systems, not devices. The OS that the Kindle uses is Linux, so if we were to add it, it would only be a small side note that the Kindle runs Linux. I think it's more appropriate that the information be added to Linux-based devices. Jdm64 (talk) 00:16, 27 May 2011 (UTC)
I have written this about thirty times and each time started over. I would like to do that again right now. Saying Kindle is Linux is like saying Mac iOS is OS-X, or OS-X is FreeBSD. Mac OS-X uses launchd to start everything. Except for a few things that init starts, init is basically something that all other processes have as their parent if they lose their immediate parent. launchd does not work the same way. Is OS-X's launchd the same thing as init in Unix / Linux? No. The same thing is occurring with these mobile OS. One mobile OS has the distinction of being derived from nothing but being its own little entity from the start - Blackberry. All the other mobile OS are diverging so far away from what they were derived from that the code base is becoming meaningless. iOS really is that different from OS-X. But each OS is really not just the kernel. It is all of the things that go together including the hardware that make up that system. Unless you want to have a separate category for each of these mobile OS I suggest you lump them all together with the category mobile OS. They have more similarities with each other than they do with what they were derived from. Apple has joined Windows in having malware that self installs now with no password required on Macintosh OS-X as long as the user account you are using has administrator privileges. It has the promise of continuing that way unless Apple finally wises up and begins requiring a password for software installs for all OS-X users. May I humbly suggest these malware problems are making a lot of people mobile OS only users? But you have been caught napping. Apple sold more iPhone and iPad systems in the last two quarters than they did OS-X. The malware problems with the predominant desktop systems combined with Twitter and other things are making many current desktop OS systems dinosaurs. So I suggest you have a separate mobile OS category with maybe a break down showing what each was derived from.
But the malware problems of the predominant desktop systems are rapidly making mobile OS the tour de force of the future. Would I have predicted that two short years ago? No. I was also caught napping. It is rapidly progressing toward a future where many people will be mobile OS only users, storing their data in the cloud (data storage repositories) and printing to new printers that use BlueTooth. Any general mobile OS that doesn't make provisions to share the data that was created on it with a different general mobile OS from another vendor will rapidly become a relic of the past. IMHO, your current classification scheme was what was there in the past and what we have now is becoming increasingly incongruent with what you have. You are missing what has been happening with these mobile devices. Mobile OS are rapidly becoming the OS of the future. The fact that 8 million Kindle units have been sold indicates that things are changing. Did we have eight million new installs of desktop Linux systems last year? No. Your percentages are woefully out of date, but mostly because your categorization is wrong. Kindle is not Linux. iOS is not Macintosh / OS-X. They are now separate entities with very little similarity to what they were derived from. hhhobbit (talk) 02:57, 6 June 2011 (UTC)
- The problem is I still don't know where the data would fit on this page given the current sections. I'm not saying the information is unimportant, just not suited for this page. This page is still about OSs, and the OS of the Kindle is Linux kernel based. It's just not a traditional desktop distribution. Similarly iOS is based on the Darwin OS, just like MacOSX. Jdm64 (talk) 20:41, 6 June 2011 (UTC)