Talk:Null (SQL)
The good article status of this article is being reassessed by the community to determine whether the article meets the good article criteria. Please add comments to the reassessment page. Date: 20:17, 19 December 2023 (UTC) |
This is the talk page for discussing improvements to the Null (SQL) article. This is not a forum for general discussion of the article's subject. |
Article policies
|
Find sources: Google (books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL |
Archives: 1Auto-archiving period: 12 months |
Null (SQL) has been listed as one of the Engineering and technology good articles under the good article criteria. If you can improve it further, please do so. If it no longer meets these criteria, you can reassess it. | ||||||||||||||||||||||
|
This article is rated GA-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects: | ||||||||||||||||||
|
To-do list for Null (SQL):
|
I think this is duplicate page
Why has this page been written? In SQL, null has exactly the same meaning / behavior it has in ALL programming languages. The only thing that COULD (but not in my opinion) justify this page, would have been some discussion about UNKNOWN pseudo-value. But there is nothing like that, so this page is misleading for someone who doesn't know that null is. — Preceding unsigned comment added by 151.21.9.214 (talk) 14:37, 5 October 2011 (UTC)
Amazing and amusing contradictions
CJ Date says the SQL 3VL logic is none of those usually studied. John Grant says it's Kleene's K3. Thalheim and Schewe say it's Łukasiewicz L3. They can't all be right. What's more amusing is that they disagree about such basic facts. Tijfo098 (talk) 19:44, 7 November 2012 (UTC)
- Actually, this mystery has been solved, cf [1] p. 9. Basically it's both K3 and L3 because they differ only in their table for implication, but SQL has no such operand. And in this sense, SQL's is also an incompletely specified logic. So all of them (Date, Grant and Thalheim & Schewe) are correct! Tijfo098 (talk) 16:25, 8 November 2012 (UTC)
Claimed inconsistency of boolean variables
The catch is that in implementations which distinguish boolean NULL from Unknown, like in that of Microsoft, an expression like TRUE OR NULL (e.g. "(1 <> 2) OR NULL") fails the type check, so you can't speak of its result, because it's a parsing error. Unless someone can produce a reliable source for this so-called inconsistency, I'll delete the claim. Tijfo098 (talk) 15:31, 8 November 2012 (UTC)
- I've deleted it. Tijfo098 (talk) 16:17, 10 November 2012 (UTC)
Nulls are Equal ?
The chapter "When two nulls are equal: ..." seems to be outdated. Actually it reads: "Because SQL:2003 defines all Null markers as being unequal to one another...". The content of the chapter is not incorrect, but the actual standard uses more elegant expressions - with a slightly different semantic. SQL:2011 defines in 4.1.5 Properties of distinct the rule "Two null values are not distinct.". Is this only a problem of linguistic representation within the wiki-article or is it a progress in the SQL standard, which is not reflected by the article? As far as I know, the semantic of GROUP BY and other related language elements hasn't changed from 2003 to 2011. --Kelti (talk) 19:57, 11 May 2014 (UTC)
Where are nulls explicitly used?
I think this article needs to illustrate a real-world example where nulls are explicitly used, so we can see their use (if a use exists). I used SQL a little in work for 8 months, never needed to assign a NULL value to any field (but it was only 8 months of light SQL work after all).
If there are legit uses of explicit nulls, then we should probably NOT get rid of them. If we did, then anyone who needed them would just add an extra boolean field called NULL after every normal field, and thus use them again. The difference would be, they would have to implement the logic themselves and thus reinvent the wheel. If I'm not mistaken, I believe there are flags in a table that you can set to ensure that no field can ever be NULL, and if you really fear/hate NULLS you can do things safely this way.
Also, the "Closed World" argument was good, but in the real world, sometimes certain things are unknown. Again, a real-world example is needed and unfortunately I can't think of one myself due to my limited SQL experience. DrZygote214 (talk) 00:33, 18 April 2015 (UTC)
- Regarding "I believe there are flags in a table that you can set to ensure that no field can ever be NULL", see the section Check constraints and foreign keys in the article, starting at "In order to constrain a column to reject Nulls..." For example, to specify that a user registered with a website must have a handle, the following can be used in the user table definition:
username varchar(255) not null unique
- They are also important in the event that a record has missing fields. Instead of recording default values for such fields, the field is marked as "null", which basically signifies it is empty. (See also Nulls in the Oracle Database SQL Reference.) AS noted in the criticism section of this article. a null value signifies that there is no value assigned for that field, not that the field has a value of 0 (zero) or some such. Mindmatrix 00:58, 18 April 2015 (UTC)
- In real life "I don't know yet" is often the answer to a question. Even a Yes/No question has 3 possible answers (not 2): "Yes" (I know the fact and it is yes), "No" (I know the fact and it is no), and "I don't know yet" (I don't know the fact yet because it is to be determined). And with numbers, the same is true. For example, for each day, enter the weather data, such what the high temperature was. Let's say that in 2013, there were 2 days, July 23 and July 24, when no thermometer reading was recorded. Should you just say that the high temperature that day was zero degrees? Damn, that's funny, the average temperature in July 2013 was strangely cool! No, it wasn't—you just didn't know how to design your database.
This is a basic aspect of information science, and information technology should be designed to handle it, which is why database design and database theory have the null concept. Another way to say it is that a value of zero or no is not the same thing as the lack of a value. In other words, zero and no are information, whereas null is the lack of information, or, more precisely, a piece of meta-information that informs about a lack of information (or metadata about a lack of data). Going months or years without having to consider that distinction is a reflection of the application—that is, of the particular project being worked on: it was evidently a context in which every relevant piece of information was already known at the time of data entry. But in other applications (other projects), one may not be so lucky. It is quite true that a field property can be set to "required" (which means "no nulls"). But whether any particular database is reasonably able to use (or ought not to use) that property on any particular field is determined by the subject matter at hand. See the "July 23 and July 24" example. It should not be about each person's "fear of nulls" or reinvention of the null concept. The concept is already well known.
Related: The critique of the null concept did not really belong in this article's lede. It belongs in the article body. The only people who can properly understand its importance or context are people who have high database theory and database design knowledge. Computer science PhD types. Not me, not the average reader of Wikipedia. As evidenced by the question posed in this talk thread, that is not the average user of this Wikipedia article, and the discussion as currently placed and written in the lede was probably confusing more people than it is helping. I moved it down. First tell a database 101 student what null is and what it's good for. Discuss the advanced philosophical critique later, in a section that 101 students will skip over. — ¾-10 17:35, 18 April 2015 (UTC)
- In real life "I don't know yet" is often the answer to a question. Even a Yes/No question has 3 possible answers (not 2): "Yes" (I know the fact and it is yes), "No" (I know the fact and it is no), and "I don't know yet" (I don't know the fact yet because it is to be determined). And with numbers, the same is true. For example, for each day, enter the weather data, such what the high temperature was. Let's say that in 2013, there were 2 days, July 23 and July 24, when no thermometer reading was recorded. Should you just say that the high temperature that day was zero degrees? Damn, that's funny, the average temperature in July 2013 was strangely cool! No, it wasn't—you just didn't know how to design your database.
- I was going to add, but forgot, the acknowledgment that RDBMSs do have binary field data type (for 0/1, Yes/No, On/Off, True/False), which cannot give 3-value logic including null, but along with that fact comes the corollary that during database design, if that data type is chosen at all, it really should be chosen only for data where it is highly likely (and seems likely to remain likely) that at the time of data entry the person entering the data will always already know what the fact is. Otherwise the designer should use an integer data type that allows 3 options. An example of a website that explores this point is http://allenbrowne.com/noyesno.html. — ¾-10 23:29, 19 April 2015 (UTC)
The part
The section “Analysis of SQL Null missing-value semantics” seems to provide private Nonsens:
The “groundbreaking work of T. Imielinski and W. Lipski (1984)” was and is completely unknown to the database community.
The proposition “and if is its lifting to a construct intended to represent missing information” has no truth value, thus no content. — Preceding unsigned comment added by 94.219.112.101 (talk) 17:47, 13 May 2015 (UTC)
What is the binary representation of Null?
What exactly does a SQL engine put "on disk" when it writes Null into a column? Example, if I have a column of type INTEGER, all values between binary "00000000 00000000 00000000 00000000" and "11111111 11111111 11111111 1111111" can be stored, each representing an integer number. If this column also allows "Null" as a value, what exactly is written onto the disk? There's nothing left that can mean "Null", because any bit sequence already stands for some INTEGER value. --194.231.113.66 (talk) 13:29, 17 July 2015 (UTC)
- The field in the database is not assigned a value in the case of NULL; it is empty. (This is the typical case, though some DB may be implemented differently.) See also the discussion Where are nulls explicitly used? above. Mindmatrix
- I would hazard that this is implementation specific. Even a given SQL implementation may have multiple engines (e.g. Maria has, for example innodb) which may be implemented in multiple architectures.
- All the best: Rich Farmbrough, 21:36, 21 November 2016 (UTC).
NULL, Null, or null
I would suggest that "null" is used for the general concept and "NULL" the SQL implementation. While a lot of the SQL I write uses Null or null, it seems that listing a table with a null field will show NULL. This makes it the "natural" representation in SQL. Unless anyone has a better idea? All the best: Rich Farmbrough, 21:39, 21 November 2016 (UTC).
- The ISO SQL specification (ISO/IEC 9075-2:2016(E)) uses upper case for the
NULL
keyword, and lower case for the null value, but never Null. Fjerdingen (talk) 20:46, 8 June 2017 (UTC)
NULL
Several examples, some said to be ANSI/ISO SQL compliant, have syntax with = NULL
- which is syntactically invalid. Those examples need to be corrected! Fjerdingen (talk) 20:21, 31 May 2017 (UTC)
- In according to the SQL standard as well as to our SQL wikibook the expression
col_x = NULL
is syntactically correct and leads to the boolean valueUNKNOWN
. As long as someone thinks in the usual two-value-logic of many programming languages, this behaviour may be confusing. In opposite to such languages SQL allows the NULL marker as a legal 'content' of columns and defines rules how to evaluate expressions with the literal NULL (or NULL markers in the column of a row) to true/false/unknown. If someone searches rows containing the NULL marker, the expressioncol_x IS NULL
shall be used: it leads toTRUE
for all rows containing the NULL marker. --Kelti (talk) 08:10, 2 June 2017 (UTC)- When it comes to confirming ISO/ANSI SQL syntax compliance I use the Mimer SQL-2003 Validator (http://developer.mimer.com/validator/parser200x/index.tml). If you run the query
SELECT * FROM T WHERE C = NULL
the result is "syntax error: NULL correction: <identifier>". However, this is the SQL-2003 standard, and there are later revisions of the ISO/ANSI SQL standard. Can you give me a reference that the syntax later has become valid? Fjerdingen (talk) 19:33, 5 June 2017 (UTC)- a) The syntax of the standard did not change over time. Since the very first days "= null" was syntactically correct - and often missleads people. But Mimer seems to change their behaviour. Their SQL-92 validator http://developer.mimer.se/validator/parser92/index.tml#parser gives a different answer to the example. I interpret the Mimer answer in their 200x parser as a hint because this query NEVER returns any row.
- b) Please read the standard. SQL:2011, part-2, page 33 says "... any comparison involving the null value or an Unknown truth value will return an Unknown result.". Kelti (talk) 08:06, 6 June 2017 (UTC)
- It seems like we are discussing two different things here. I am talking about the
NULL
keyword, the syntactical element. Your SQL-2011 reference is about the Null value. Of course the Null value can be used when doing a comparison. However, this doesn't mean theNULL
keyword is syntactically allowed.
- It seems like we are discussing two different things here. I am talking about the
- When it comes to confirming ISO/ANSI SQL syntax compliance I use the Mimer SQL-2003 Validator (http://developer.mimer.com/validator/parser200x/index.tml). If you run the query
- However, the syntax has been changed. As you pointed out,
= NULL
was syntactically correct in SQL-92. But between SQL-92 and SQL-99, the ISO SQL standardization committee worked hard on cleaning up old sins. One of those things were the= NULL
syntax - which was removed! (Once in a while things get deprecated and removed from the SQL standard, e.g.select null from table
, ordinal position in ORDER BY clause, joins without theSELECT
keyword etc.)Fjerdingen (talk) 19:28, 7 June 2017 (UTC)
- However, the syntax has been changed. As you pointed out,
- Your distinction between the
NULL
keyword and anull value
in a column (regarding to the SQL syntax) is new to me. SQL:2011, part-1, page 14 says: "... Every data type includes a special value, called the null value, sometimes denoted by the keyword NULL." Isn't this a statement that both are interchangeable? Kelti (talk) 07:42, 8 June 2017 (UTC)
- Your distinction between the
- The
NULL
keyword is a <null specification> according to the ISO SQL specification (<null specification> ::= NULL
). (And the value of a <null specification> is the null value.) The <null specification> is not a part of the <row value predicand>, which is the argument to a comparison (<comparison predicate>). Below is some copy and paste from the SQL 2016 spec (ISO/IEC 9075-2:2016(E)), sorry for the ruined formatting:
- The
8.2 <comparison predicate> ::= <row value predicand> <comparison predicate part 2> <comparison predicate part 2> ::= <comp op> <row value predicand> <comp op> ::= <equals operator> | <not equals operator> | <less than operator> | <greater than operator> | <less than or equals operator> | <greater than or equals operator>
7.2 <row value predicand> ::= <row value special case> | <row value constructor predicand> <row value special case> ::= <nonparenthesized value expression primary>
6.3 <nonparenthesized value expression primary> ::= <unsigned value specification> | <column reference> | <set function specification> | <window function> | <nested window function> | <scalar subquery> | <case expression> | <cast specification> | <field reference> | <subtype treatment> | <method invocation> | <static method invocation> | <new specification> | <attribute or method reference> | <reference resolution> | <collection value constructor> | <array element reference> | <multiset element reference> | <next value expression> | <routine invocation> | <row pattern navigation operation> | <JSON value function>
6.4 <unsigned value specification> ::= <unsigned literal> | <general value specification> <general value specification> ::= <host parameter specification> | <SQL parameter reference> | <dynamic parameter specification> | <embedded variable specification> | <current collation specification> | CURRENT_CATALOG | CURRENT_DEFAULT_TRANSFORM_GROUP | CURRENT_PATH | CURRENT_ROLE | CURRENT_SCHEMA | CURRENT_TRANSFORM_GROUP_FOR_TYPE <path-resolved user-defined type name> | CURRENT_USER | SESSION_USER | SYSTEM_USER | USER | VALUE
5.3 ( <literal> Function: Specify a non-null value.) <unsigned literal> ::= <unsigned numeric literal> | <general literal> <general literal> ::= <character string literal> | <national character string literal> | <Unicode character string literal> | <binary string literal> | <datetime literal> | <interval literal> | <boolean literal>
7.1 <row value constructor predicand> ::= <common value expression> | <boolean predicand> | <explicit row value constructor>
6.28 <common value expression> ::= <numeric value expression> | <string value expression> | <datetime value expression> | <interval value expression> | <user-defined type value expression> | <reference value expression> | <collection value expression>
- No <null specification> found above for the <comparison predicate>!
- Instead the <null predicate> is used:
8.8 <null predicate> Function Specify a test for a null value. Format <null predicate> ::= <row value predicand> <null predicate part 2> <null predicate part 2> ::= IS [ NOT ] NULL
- When it comes to assignment, e.g.
UPDATE
, the <null specification> can be used (as a part of <contextually typed value specification>, but not of <value expression> - which includes literals etc.):
- When it comes to assignment, e.g.
14.15 <update source> ::= <value expression> | <contextually typed value specification>
6.5 <contextually typed value specification> ::= <implicitly typed value specification> | <default specification> <implicitly typed value specification> ::= <null specification> | <empty specification> <null specification> ::= NULL
<value expression> ::= <common value expression> | <boolean value expression> | <row value expression> <common value expression> ::= <numeric value expression> | <string value expression> | <datetime value expression> | <interval value expression> | <user-defined type value expression> | <reference value expression> | <collection value expression>
- It's possible that I have missed to include some parts of interest. But as you have the SQL-2011 specification available, you can perhaps check it out by yourself, or ask me to do it.
- Conclusion: the
NULL
keyword is not interchangeable with the null value. TheNULL
keyword can not syntactically be an argument to a comparison.
- Conclusion: the
- I know that many, but not all, dbms implementations cheat and allow the
= NULL
syntax. That's why I made my first edit (Revision as of 20:26, 31 May 2017, which was reverted.) Fjerdingen (talk) 20:40, 8 June 2017 (UTC)
- I know that many, but not all, dbms implementations cheat and allow the
@Fjerdingen: You have conviced me that "= NULL" is illegal in the sense of the SQL standard. I have changed the article. (Maybe that I will do some more changes in the near future.) Thank's for your investigation. --Kelti (talk) 16:18, 20 June 2017 (UTC)
GA Reassessment
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
- Article (edit | visual edit | history) · Article talk (edit | history) · Watch • • GAN review not found
- Result: Delisted. ~~ AirshipJungleman29 (talk) 13:07, 8 January 2024 (UTC)
This 2007 listing contains significant uncited material, including whole subsections, violating GA criterion 2b). ~~ AirshipJungleman29 (talk) 20:17, 19 December 2023 (UTC)
- I could probably look into this over the coming weeks. Sohom (talk) 09:19, 21 December 2023 (UTC)
- Hi Sohom Datta, do you still intend to work on this? No worries if not. ~~ AirshipJungleman29 (talk) 18:08, 30 December 2023 (UTC)
- I do intend to, got sidetracked by Site isolation :) Sohom (talk) 19:49, 30 December 2023 (UTC)
- On second thoughts, I'm probably not gonna be working on this. A lot of the text wrt to the article is from 2008, and seems to be non-trivial to source to a source that was online in 2008 and is still online today, while keeping WP:CIRCULAR at bay :( Sohom (talk) 12:24, 8 January 2024 (UTC)
- Sorry to hear that. ~~ AirshipJungleman29 (talk) 13:07, 8 January 2024 (UTC)
- On second thoughts, I'm probably not gonna be working on this. A lot of the text wrt to the article is from 2008, and seems to be non-trivial to source to a source that was online in 2008 and is still online today, while keeping WP:CIRCULAR at bay :( Sohom (talk) 12:24, 8 January 2024 (UTC)
- I do intend to, got sidetracked by Site isolation :) Sohom (talk) 19:49, 30 December 2023 (UTC)
- Hi Sohom Datta, do you still intend to work on this? No worries if not. ~~ AirshipJungleman29 (talk) 18:08, 30 December 2023 (UTC)