
Talk:ASCII

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Dustek (talk | contribs) at 23:37, 17 June 2020 (Incomprehensible without a working knowledge of the subject: new section). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


ASCII is a former featured article. Please see the links under Article milestones below for its original nomination page (for older articles, check the nomination archive) and why it was removed.

Article milestones
Date | Process | Result
January 19, 2004 | Refreshing brilliant prose | Kept
December 30, 2005 | Featured article review | Kept
May 10, 2008 | Featured article review | Demoted

Current status: Former featured article

ZX Spectrum chr$ 96

The Sinclair ZX Spectrum uses chr$ 96 as the pound sign (£). 85.149.83.125 (talk) —Preceding undated comment added 15:44, 18 June 2018 (UTC)

Yes - the ZX Spectrum character set page says

It is based on ASCII-1967 but the characters ^, ` and DEL are replaced with ↑, £ and ©. It also differs in its use of the C0 control codes other than the common BS and CR, and it makes use of the 128 high-bit characters beyond the ASCII range.

so perhaps ASCII#8-bit codes should not say "the more common ASCII-1967, such as found on the ZX Spectrum computer", but instead say the ZX Spectrum character set was a modified version of ASCII-1967 (it's not based on ISO 646, because DEL is part of ISO 646 but the Spectrum code uses it as a printable character). Guy Harris (talk) 19:01, 18 June 2018 (UTC)
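To make the "modified version of ASCII-1967" point concrete, here is a minimal sketch covering only the three printable-range substitutions quoted above. The function name `zx_spectrum_char` and the override table are hypothetical, and the sketch deliberately ignores the control codes and the 128 high-bit characters.

```python
# Substitutions taken from the quoted ZX Spectrum character set description:
# ^ (0x5E) -> ↑, ` (0x60) -> £, DEL (0x7F) -> ©. Everything else in the
# range 0x20-0x7F is assumed here to match ASCII-1967.
ZX_OVERRIDES = {0x5E: "↑", 0x60: "£", 0x7F: "©"}


def zx_spectrum_char(code: int) -> str:
    """Return the glyph for a code in 0x20-0x7F (hypothetical helper)."""
    if not 0x20 <= code <= 0x7F:
        raise ValueError("only the range 0x20-0x7F is modelled in this sketch")
    return ZX_OVERRIDES.get(code, chr(code))


if __name__ == "__main__":
    # chr$ 96 is the case raised at the top of this section.
    print(96, zx_spectrum_char(96))
```

Note that 0x7F maps to a printable glyph here, which is the reason given above for not describing the set as ISO 646.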

Discussion on ASCII table

I see you reverted it back to hideous. Well, I tried. I would like to explain some of my reasons for the change, responding to your edit summary point by point:

"Good faith, but I very much prefer the old table:"

"Standard layout (used almost everywhere else as well),"

I was intending to change all the other instances after trying this out, so this one would be "used almost everywhere else as well". This layout is already used for every Unicode code page entry, so it may actually be more widely used, even though it is in the end less visible.

"color grouping"

One of my primary goals was to eliminate this bullshit. It is wrong in every other table for any non-ASCII character, it does not convey any usable information (I think everybody knows the difference between digits and letters), and it destroys the ability to use colors and legends to indicate more useful information such as variances!

"indication of variances"

I fully intended to support this with colors, though in this case I thought it quite redundant with the information in the above tables, so I left it out. Also, I think the variance indicators should be reserved for character sets that a user may actually have a tiny chance of encountering; pre-67 ASCII just does not exist anywhere in the world.

"more and directly readable codes"

The Unicode code points are consistently wrong in other tables (due to well-meaning editors changing them to the code point), so putting them in the tooltip with a clear U+ prefix and name would help a lot. Text also allows non-Unicode characters to be described.

"(no tooltips, which don't show at all over here, and would require a mouse anyway)"

Yes, there is little if anything that can be done. If you want more information about each character, the only possibility is to make a big vertical table, one character per line. If that is what you think should be done, do that instead (the ASCII page already has such a table).

Please send some constructive criticism, or state that in no way will you consider the removal of information from the boxes an improvement. Spitzak (talk) 17:29, 17 July 2018 (UTC)

For my answer, see below.
--Matthiaspaul (talk) 18:46, 30 September 2018 (UTC)

New smaller table

Well that got reverted with the following comment:

"Good faith, but I very much prefer the old table: Standard layout (used almost everywhere else as well), color grouping, indication of variances, more and directly readable codes (no tooltips, which don't show at all over here, and would require a mouse anyway)"

IMHO the current table is absolutely hideous and has been for a long time.

The color grouping is stupid busy-work, is wrong for many non-ASCII characters (many of which don't fall in any of the categories), and does not convey any information of any use to anybody. Freeing colors to indicate variants and other interesting information helps a lot (sorry I deleted the examples in this table, which he complained about, but I felt the information was already in the huge redundant tables above; also, pre-67 ASCII just does not exist in the world at all now, which is somewhat different from the variants indicated in other tables).

The extra information is confusing to users, bloats the table, and makes it hard to see the letters. The Unicode code points need at least a U+ prefix, but putting them in the tooltip and adding the character name helps a lot in making it clear these numbers are Unicode and not the index of the table entry (well-meaning editors keep screwing this up, in fact). Also, text allows non-Unicode points to be described. It is true that the tooltips do not show on tablets and phones; I don't see anything that can be done about that. If you want them visible, you should use a big vertical table, one row per character (like the earlier ASCII tables). IMHO it is far harder to locate a character in these, but I can see an argument for replacing all these tables with those vertical ones. But the current hybrid is not working; change it one way or another, please!

For reference here is my sample table (note this is not intended to be used unchanged in any article, as it fills in the control characters and has some bogus coloration to show what it would look like). Check the tooltip for 'A' to see a possible way to show the decimal index:

ISO-8859-1
_0 _1 _2 _3 _4 _5 _6 _7 _8 _9 _A _B _C _D _E _F
0_ NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
1_ DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
2_ SP ! " # $ % & ' ( ) * + , - . /
3_ 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4_ @ A B C D E F G H I J K L M N O
5_ P Q R S T U V W X Y Z [ \ ] ^ _
6_ ` a b c d e f g h i j k l m n o
7_ p q r s t u v w x y z { | } ~ DEL
8_ PAD HOP BPH NBH IND NEL SSA ESA HTS HTJ VTS PLD PLU RI SS2 SS3
9_ DCS PU1 PU2 STS CCH MW SPA EPA SOS SGC SCI CSI ST OSC PM APC
A_ NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
B_ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
C_ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
D_ Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
E_ à á â ã ä å æ ç è é ê ë ì í î ï
F_ ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ


PLEASE post some opinions on this! I will leave it but I am not happy with blind reversion without any constructive criticism. If you have constructive changes insert them here (possibly only a few rows of the table).

Spitzak (talk) 17:57, 17 July 2018 (UTC)
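To illustrate the kind of tooltip text proposed above (a U+ prefix, the Unicode character name, and the decimal index), here is a rough sketch for ISO-8859-1 cells. The function name `cell_tooltip` and the exact wording of the string are assumptions for illustration only; this is not the format of any existing table template.

```python
import unicodedata


def cell_tooltip(byte_value: int) -> str:
    """Build hypothetical tooltip text for one ISO-8859-1 table cell."""
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("ISO-8859-1 is a single-byte character set")
    # ISO-8859-1 bytes map one-to-one onto U+0000..U+00FF.
    ch = bytes([byte_value]).decode("latin-1")
    # Control characters have no Unicode name, so fall back to a placeholder.
    name = unicodedata.name(ch, "(control character, no Unicode name)")
    return f"U+{ord(ch):04X} {name} (decimal {byte_value})"


if __name__ == "__main__":
    print(cell_tooltip(0x41))
    print(cell_tooltip(0xA3))
```

The explicit "U+" and "decimal" labels are what separate the code point from the table index, which is the confusion described in this post.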

One idea: does anybody know how to make it so clicking on a table cell pops up some kind of dialog box with formatting and clickable links, similar to how the references work? That would allow the information to be visible on tablets, and the ability to format it with line breaks at least would help a huge amount. Although it would not work here, does anybody know how to get the information into the link popup, while still keeping the very nice preview of the linked-to page? Spitzak (talk) 18:09, 17 July 2018 (UTC)
I will think about it as well, Spitzak, but right now I'm not aware of such a possibility. In either case, please note that the problem is not restricted to tablets and smartphones, but extends to normal desktop environments as well (even with a mouse). In general, I think the point of having the table is to have at least all the glyphs and codes visible at the same time for easy direct comparison, that is, without having to move some "cell pointer" first or open/close sub-menus. Also, I can't see how information in tooltips or submenus would print. Printing a table of only the glyphs appears to be pretty much pointless to me. --Matthiaspaul (talk) 21:40, 17 July 2018 (UTC)
(edit conflict) You made a very bold change and got reverted per WP:BRD. Nobody likes being reverted. That's why I am very reluctant to revert other editors' contributions, but in this case I saw no way forward without first restoring the long established "status quo" - after all, your proposed changes were undiscussed. Please don't change the table in other articles (as you already did in the ISO/IEC 8859-1 article) unless there would be broad consensus for it.
Also, it is counter-productive to use foul language. As a constructive editor I certainly didn't revert you blindly or lightheartedly, but gave a number of reasons in the edit summary.
The current table layout is used not only here but in almost all character set and codepage related articles all over Wikipedia. This alone is already a strong reason to keep it for consistency (not ruling out the possibility that it can be further improved; however, removing information or making information difficult to access is certainly not an improvement).
The color-grouping and boxing is used to indicate groups of glyphs or highlight multiple meanings of the same code point even in the same character set, or to indicate differences between revisions or closely related character sets (the actual usage depends on the article, and should be explained there). This is sometimes very useful to see patterns, or at least helps to easily spot code points which may need special attention. Ideally, this is also discussed in the text, but the information could be easily overlooked if only provided there.
The various codes given in the table cells are vital information. Tooltips can't be used as a substitute for accessibility reasons: They would show only when hovering over the glyphs with a mouse - it is impossible to see code patterns this way. Also, they don't show in many scenarios, including normal desktop configurations. This problem is not restricted to tablets and phones, or to users without a mouse. For example, my browser (Firefox) does not show them on a normal desktop PC - if anything, it shows the names of the linked-to articles, which is not helpful at all.
I take it that you want to provide the Unicode names as well. This is something I support in general (if we'd find a good way to provide this info).
As this is extra info I would not object if you would manage to put that info into tooltips without changing the other layout or removing the directly readable info from the table cells. Alternatively (and probably better for accessibility reasons), I also would not object adding a separate list of the 128 or 256 codepoints with the Unicode names (and other info) to all character set articles - that would even allow to provide more detailed comments. However, many years ago we already had such Unicode conversion lists in many character set articles and some users found them too long. Theoretically, the lists could be put on sub-pages but that would violate our naming conventions.
Regarding historical information, in an encyclopedia the details of the older ASCII variants are just as relevant as the latest version. I already didn't like it when you removed some info a while ago.
--Matthiaspaul (talk) 21:20, 17 July 2018 (UTC)

Okay, how about some ideas to remove the parts that I most object to. Here are a few that could be done independently:

  • Remove the decimal numbers, which are constantly getting confused as a decimal version of the Unicode code point or vice-versa
  • Add "U+" to the start of the Unicode numbers
  • Remove the 100% width and as much padding as possible, and use minimum row/column sizes in ems to try to keep the table square instead
  • Put the Unicode name and other information that is not visible now into the title (i.e. the tooltip)
  • Make a new set of colors based on Unicode character classes so they can be applied automatically (I still think these colors are an absolute and complete waste of time, and just there as puffery to try to make the table look fancier than it really is). Right now they are almost always wrong for non-ASCII, and there is no "no category" color, which forces the tables to put incorrect colors on things.
  • Make the default size the larger one used for the letters, so that you use small to make the control characters and headers, rather than the other way around.

Also, can you explain what information I removed? Spitzak (talk) 22:14, 17 July 2018 (UTC)
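As a sketch of what "colors based on Unicode character classes" could mean in practice, the following groups a character by the first letter of its Unicode general category and falls back to an explicit "no category" group. The group names and the `color_group` helper are hypothetical; no actual template colors are implied.

```python
import unicodedata

# Hypothetical grouping keyed on the major class of the Unicode general
# category (the first letter of values such as "Lu", "Nd", "Sc", "Cc").
GROUPS = {
    "L": "letter",
    "N": "number",
    "P": "punctuation",
    "S": "symbol",
    "Z": "separator",
    "C": "control or format",
}


def color_group(ch: str) -> str:
    """Return the hypothetical color group for a single character."""
    major_class = unicodedata.category(ch)[0]
    # Anything outside the mapping (e.g. combining marks) gets the
    # explicit fallback instead of a misleading color.
    return GROUPS.get(major_class, "no category")


if __name__ == "__main__":
    for ch in ["A", "é", "7", "£", "!", "\x07"]:
        print(repr(ch), color_group(ch))
```

A grouping like this does not by itself separate ASCII letters from international letters; that would need an additional rule on the code point range.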

Spitzak: Remove the decimal numbers, which are constantly getting confused as a decimal version of the Unicode code point or vice-versa
As explained many times now, the decimal index is vital information in a character set table. Removing it from the cells makes the table useless. You can't expect readers to calculate the index through some formula.
Spitzak: Add "U+" to the start of the Unicode numbers
Personally, I very much like the "U+" notation, but it would make the table even larger, so I think a four-digit value with leading zeros is good enough. Some years ago, some tables actually included the "U+", but it was removed by other editors. Perhaps the table lede macro should be improved to explain this better.
Spitzak: Remove the 100% width, as much padding as possible, and use minimum row/column sizes in ems to try to keep table square instead
I don't think the table needs to be square, however it would be nice to have all rows and columns of the same size respectively, but only if this can be achieved without sacrificing information. You can propose alternative layouts on the talk pages, but not in live articles.
Spitzak: Put Unicode name and other information that is not visible now into the title (ie tooltip)
This has been answered as well further above already: If the cell macros could be expanded to include the Unicode names without removing any of the other information, this would be fine. I would also support routing most character glyph links through redirects carrying the Unicode name (BTW some of your redirect removals weren't a good idea, as it makes reverse lookup more difficult, and also WP:NOTBROKEN). The general point regarding tooltips is that not all browsers show this information, so it can be used only for optional information not for the core information that needs to be visible without hovering over it with a mouse etc. The decimal index (and in some cases also the octal index), the character glyph and the corresponding Unicode code (if it exists) are not optional and must be visible all the time.
Spitzak: Make a new set of colors based on Unicode character classes so they can be applied automatically (I still think these colors are an absolute and complete waste of time, and just there as puffery to try to make the table look fancier than it really is).
The current color grouping isn't perfect, but your new color group proposal isn't either. For example, standard ASCII letters and international letters should have different colors.
Spitzak: Make the default size the larger one used for the letters, so that you use small to make the control characters and headers, rather than the other way around.
Symbolic control code names (NUL, LF, etc.) should be displayed in a smaller size than normal glyphs. IIRC, this has been the case (with some minor exceptions caused by special cases in a few tables) before you started to force in your changes all over the place.
--Matthiaspaul (talk) 18:46, 30 September 2018 (UTC)

Latest changes to the table

As this page is more visible, I have restored my changes, but this time in several steps, and I have also avoided changing links unnecessarily (the previous link changes were due to copying the links from another page). The decimal numbers have not been deleted; they are in an unused template argument. I hope that this will actually lead to a post by somebody other than me or Matthiaspaul in favor of or opposed to the changes. Currently there is a post in favor on IBM 3270, but that person also posted in favor of removing the Unicode (which I am in favor of, but such a drastic change can be done as a further step).

Please see ISO-8859-1 for an example of a possible way to show the decimal numbers, though IMHO the ability to calculate 16*y+x can be assumed of anybody who can use this information. Spitzak (talk) 16:27, 28 September 2018 (UTC)
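For readers following the 16*y+x remark, this is the arithmetic in question, sketched for a table with 16 characters per row. The function names are hypothetical; `row` is the left-hand header digit (y) and `column` the top header digit (x).

```python
def cell_index(row: int, column: int) -> int:
    """Decimal index of the cell at hex row `row` and hex column `column`."""
    if not (0 <= row <= 0xF and 0 <= column <= 0xF):
        raise ValueError("row and column must each be a single hex digit")
    return 16 * row + column


def cell_position(index: int) -> tuple[int, int]:
    """Inverse lookup: (row, column) for a decimal index in 0..255."""
    if not 0 <= index <= 0xFF:
        raise ValueError("index must be in 0..255")
    return divmod(index, 16)


if __name__ == "__main__":
    # 'A' sits in row 4_, column _1 of the table above.
    print(cell_index(0x4, 0x1))
    print(cell_position(ord("A")))
```

Whether readers should be expected to do this calculation themselves is exactly the point in dispute in this thread; the sketch only shows what the calculation is.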

For computer experts like us it is easy enough to calculate the hex index from the top row and left column as the tables are aligned with 16 characters per line, however Wikipedia should be accessible also for laymen. The decimal index is vital information (actually, it is among the most basic information a character set table needs to provide) and removing it reduces the utility value of the table significantly - you can't expect readers to calculate the index through some formula.
People are perfectly capable of hitting PageUp and seeing the HUGE table showing the same information right above! Spitzak (talk) 19:46, 15 October 2018 (UTC)
We have a consistent table format for character set tables used (almost) all over Wikipedia. This has been stable for many years so before you apply your changes to the articles, you need to find broad community consensus for it, as you have been told many times in various places already. Your repeated attempts to force your changes into dozens of articles is not covered by WP:BRD, it's edit-warring causing large-scale disruption. No good at all.
What I can think of to better distinguish the index from the value would be cells split into two areas, possibly even using different colors. The top area would be reserved for the glyph and Unicode or ASCII code, the bottom area for the index. Ideally the index area would only occupy the lower right corner of a cell, but I don't think diagonal split lines are possible in HTML tables.
Regarding your IBM 3270 comment above, this applied to removing some faulty Unicode codes in that specific table, not removing the Unicode altogether (which would be a really silly idea, driving the utility value of the table to near zero).
No, obviously not. For your instruction I will cut and paste Peter Flass's comment here, it is not about the control characters:

"I think having the ASCII values in the character set table is more than confusing. I haven't been following the various changes and undos, but I think mixing ASCII and EBCDIC in one table is a bad idea(tm). There is a 3270 ASCII character set, and I think if people want to see ASCII there should be a separate ASCII table. Peter Flass (talk) 12:18 pm, 26 September 2018, Wednesday (19 days ago) (UTC−7)"

Note he is referring to the Unicode numbers when talking about "ASCII"
--Matthiaspaul (talk) 18:46, 30 September 2018 (UTC)
I strongly agree w/ Matthiaspaul; quickly looking up decimal (*and* hex) codes for ascii characters should be about the highest priority for the article's main-image. (I do like this image, but as a historical document, and am fine w/ it occurring later in the article.)
not-just-yeti (talk) 19:08, 8 October 2018 (UTC)
Nobody is suggesting changing the info box picture. Please check again what is being discussed. Spitzak (talk) 19:34, 15 October 2018 (UTC)

As you insist on ignoring any comments other than mine and yours, here is another from the IBM 3270 page:

"User:Matthiaspaul claimed that table 1 of IBM 3270 was useless without the decimal numbers and reverted an edit by User:Spitzak; the table is perfectly useful without the decimal numbers and, in fact, the decimal numbers are useless; anybody programming for the 3270 uses hexadecimal or symbolic references to control characters. Shmuel (Seymour J.) Metz Username:Chatul (talk) 11:42 am, 25 September 2018, Tuesday (20 days ago) (UTC−7)"

LEM

The table of control characters shows LEM as an alias for ETB, but nowhere is there any discussion of what LEM is. I've checked the talk archives, and nada there as well. Nor is there anything at ETB. There is an expansion of the initialism as "Logical End of Media" available from http://ascii-table.com, but no further explanation; it would appear to be intended to signify the data end of a tape, whether paper or magnetic, rather than the physical end. 192.31.106.36 (talk) 11:49, 14 October 2018 (UTC)

"glass TTY"

At least in the PDP-10 community, the name "glass TTY" did not mean any display terminal; it was a derogatory name for a display that did not support moving the cursor to an arbitrary position, overwriting characters, and deleting characters. Such terminals were viewed as no better than a Teletype, since they couldn't support Emacs. --Briankharvey (talk) 19:37, 30 December 2018 (UTC)

Base64 ?

Maybe give a little nod toward Base64, noting that it was developed to pass through all the old encodings, since you could mostly be sure of A-Z, a-z and 0-9. — Preceding unsigned comment added by 94.232.167.4 (talk) 10:05, 27 February 2020 (UTC)
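As a small illustration of the point about Base64, the sketch below encodes every possible byte value with Python's standard `base64` module and checks that the output stays within a small, fixed alphabet. The helper name `encode_for_transport` is hypothetical. Besides A-Z, a-z and 0-9, the standard alphabet also uses "+", "/" and "=" for padding.

```python
import base64
import string

# The standard Base64 alphabet: letters, digits, "+", "/", plus "=" padding.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def encode_for_transport(data: bytes) -> str:
    """Encode arbitrary bytes as Base64 text (hypothetical helper name)."""
    return base64.b64encode(data).decode("ascii")


if __name__ == "__main__":
    # Every byte value 0..255, including ones most old encodings mangle.
    encoded = encode_for_transport(bytes(range(256)))
    print(set(encoded) <= BASE64_ALPHABET)
    print(base64.b64decode(encoded) == bytes(range(256)))
```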

Incomprehensible without a working knowledge of the subject

What. Who. Why. When.

Instantly dives into detail.

What is this? In simple English.

Then who.

Why.

When.

And then technobabble about line output between different types of printer and the very specific details you think are important but actually make this opaque gibberish. Dustek (talk) 23:37, 17 June 2020 (UTC)