Talk:Human genome: Difference between revisions
Revision as of 22:40, 27 July 2022
Human genome was one of the Natural sciences good articles, but it has been removed from the list. There are suggestions below for improving the article to meet the good article criteria. Once these issues have been addressed, the article can be renominated. Editors may also seek a reassessment of the decision if they believe there was a mistake.
Current status: Delisted good article
This article has not yet been rated on Wikipedia's content assessment scale. It is of interest to the following WikiProjects:
Please add the quality rating to the {{WikiProject banner shell}} template instead of this project banner. See WP:PIQA for details.
This article has been mentioned by a media organization:
This page has archives. Sections older than 90 days may be automatically archived by Lowercase sigmabot III when more than 5 sections are present.
quality
According to the URL below, a paper from top experts in a top journal (i.e., highly authoritative) says that there are many gaps (unsequenced regions) in the human genome. IMO, the lack of attention paid to these gaps is somewhat misleading for the general public; e.g., when scientists use the word "complete", it means, per the dictionary, that we have a genome with no gaps and no missing sequence, yet this is empirically false. http://www.nature.com/nature/journal/vaop/ncurrent/full/nature13907.html
Publication Date.
OK: this article states that the initial draft was published on the 12th February, 2001.
However … ?
The publication date is given as the 15th February in the On This Day article for February 15th.
Is it possible to get the wrong one corrected?
Thanks
Cuddy2977 (talk) 14:33, 3 February 2019 (UTC)
- Yes--the date given in this article was that of the press release, which preceded actual publication. I've fixed it. Jbening (talk) 16:36, 3 February 2019 (UTC)
- My own view is that the precise day is unnecessary detail. Agricolae (talk) 21:01, 3 February 2019 (UTC)
Reversion
I undid this edit for a couple of reasons:
- Saying genomes were sequenced for $200 per person is misleading, because as the cited article explains, the company did that at a loss, so the true cost is still >$200.
- We can't say that estimates of the number of human genes went from 100,000 to 20,000 then up to >46,000, as the first two numbers are for protein-coding genes, whereas >46,000 includes RNA genes.
I apologise for the hasty wholesale reversion, which undid some good changes along with the disputed bits. Adrian J. Hunter(talk•contribs) 05:31, 31 May 2019 (UTC)
- Regarding these edits, I think the material on 'complete' genome sequencing needs a more nuanced approach. The human genome is usually spoken of as complete, even though there is some repetitive DNA and other hard-to-get-at regions that have not been fully spanned. You have to go pretty deep in the weeds to find a source that discusses the distinction between 'complete' and complete. I think more confusion than precision is likely if we insist on a perspective that, while linguistically accurate, differs from common usage not only in the popular scientific press, which routinely reports the 'complete' (really meaning 'sequenced with high fold coverage') genome sequencing of all kinds of species, but also in the scientific research community itself, where it is spoken of more like 'complete (not really complete, but you know what we mean)' than incomplete. As written it seems almost to be an expression of a pet peeve about near-universal misuse of 'complete' when talking about genomes. I think it would be better to refer to it as complete while explaining that this doesn't really mean absolutely complete, than to state that it is not complete, nor are any sequenced genomes, in contrast to the way it is always reported and spoken of. Agricolae (talk) 13:18, 1 June 2019 (UTC)
- It's probably worth keeping a statement about completeness in the lede; most non-specialists I've talked with on the subject are surprised to hear that there are "degrees of completeness", even within the particular genome being sequenced. Their assumption is generally that we have a computer file which contains the entirety of binary DNA information, with no gaps. The general reaction to the probabilistic and overlap coverage description is "But that's not complete!" Tarl N. (discuss) 19:24, 1 June 2019 (UTC)
- I am not saying we shouldn't address completeness - it is a distinction worth making, but it needs to be nuanced. What I am not comfortable with is the recent edit. The original text was:
- "Completion of the Human Genome Project Sequence was published in 2004.[1] The human genome was the first of all vertebrates to be completely sequenced. As of 2012, thousands of human genomes have been completely sequenced, . . . "
- The editor was correct that this is problematic because it doesn't make clear what is meant by 'complete', that it doesn't mean the same as literally complete. However the new text:
- " . . . a more polished version published in 2004.[1] As of mid-2019, no human genome (nor any vertebrate genome) has been completely sequenced. "
- This goes the other direction. It was referred to then and afterwards as 'complete', even though everyone involved knew full well that this did not mean literally complete. This usage continues with the reporting of the 'complete' genome of the white tiger, even though it had more than a thousand gaps, and the 'complete' genome of the mandarin orange, even though it too had missing sequence. Simply put, 'complete', when talking about genomes, does not mean the same thing as it does in ordinary usage: 'complete genome sequence' is a term of art referring to sequencing the genome with high-fold coverage, even though some small regions of repetitive and 'unsequenceable' DNA remain unknown. We could even give a couple of sentences explicitly explaining the nature of the 'missing' sequence - I remember seeing a few years back an 'update' in Science or Nature with an overview of what types of DNA were still missing and interviewing a group using PacBio sequencing to try to close some of the gaps, so I know there is citable material out there to write such a summary. One way or another, I think we want to get across the point that it is referred to as complete (though it technically isn't), rather than just saying that no genome has ever been completely sequenced. Agricolae (talk) 20:16, 1 June 2019 (UTC)
- Oops, we already do describe this, just somewhere else in the article. I just think the recent edit went too far and obscures meaning rather than clarifying it. Agricolae (talk) 20:18, 1 June 2019 (UTC)
- OK, so I returned to the earlier text then modified to make the status explicit. As always, linguistic improvement is encouraged but I would rather we avoid the 'no genome has ever been completely sequenced' phrasing which, while literally true, doesn't match with typical usage. Agricolae (talk) 15:16, 2 June 2019 (UTC)
Current list of human protein-coding genes
I know we have sub-articles on this with partial lists by chromosome, but there's now a complete list in the pages below in the event anyone is interested. I unfortunately couldn't add more information columns to those wikitables since I was running right up against the page size limit on both pages and wanted to split the list across as few pages as possible. Seppi333 (Insert 2¢) 19:28, 2 November 2019 (UTC)
- Wikipedia:WikiProject Molecular Biology/Molecular and Cell Biology/Human protein-coding genes1
- Wikipedia:WikiProject Molecular Biology/Molecular and Cell Biology/Human protein-coding genes2
July 2020 NIH X chromosome result
I'd like to add this to the article.
Should I?
And where?
- Not sure this is really the best place for it - the result is really more about the process of completing the sequence than it is about the genome itself. I would suggest: Human Genome Project#State of completion (and it would be better to cite the Nature paper than the press releases). Agricolae (talk) 21:07, 19 July 2020 (UTC)
Number of genes?
The article says
- As genome sequence quality and the methods for identifying protein-coding genes improved,[9] the count of recognized protein-coding genes dropped to 19,000-20,000.[12] However, a fuller understanding of the role played by sequences that do not encode proteins, but instead express regulatory RNA, has raised the total number of genes to at least 46,831,[13] plus another 2300 micro-RNA genes.[14]
but also says
- The haploid human genome (23 chromosomes) is about 3 billion base pairs long and contains around 30,000 genes.[29]
Which is it, or are both numbers, properly understood, correct? —WWoods (talk) 18:54, 14 November 2020 (UTC)
- It depends on what you count as a gene, and how long ago the analysis was done. The 46k number comes from an article with the sub-headline: "The new estimate is based on a broader definition of just what a gene is". The second is more vague and generic, and it is unclear whether it is rejecting the new definition of a gene offered by the other analysis, or whether this is simply 'old data': though the page has been updated as recently as this August, this particular datum may be of older vintage. I think we need to see what sources from the past year and a half are saying about the gene count. Without that, the 46,831 number just represents one paper's conclusion that the definition of a gene should be changed, and what number that would produce, but has the field accepted this adjustment? Even if they have followed this thinking, the number is overly precise given that it results from making a whole lot of individual calls over whether each site is a gene or not. I think we should either present it less precisely as 'more than 46800' and likewise not present this redefinition of the gene as the recent 'new understanding' if it is just one paper's position - i.e. we should use conditional language, 'if the view of a gene is expanded, . . . ' - or else we should use more descriptive language to refer to the precise number, 'a recent analysis arguing the definition of a gene should be expanded concluded . . . ', so it is clear this is a single analysis with a specific set of assumptions. Agricolae (talk) 19:20, 14 November 2020 (UTC)
Junk DNA
90% of the human genome is junk but "junk" is only mentioned once in the article. That needs to be fixed. Genome42 (talk) 20:40, 27 July 2022 (UTC)
- Yeah, it needs to be removed. 'Junk' DNA is a concept defined by what it isn't, and is no longer considered a useful distinction given the diverse types of functional (in some cases critically important) and non-functional DNA that is included in this catch-all term. I have removed it. Agricolae (talk) 22:40, 27 July 2022 (UTC)