Content analysis: Difference between revisions
{{Short description|Research method for studying documents and communication artifacts}} |
|||
{{expert needed|sociology|ex2=Media|date=April 2008}} |
|||
{{Sociology}} |
||
{{Research}} |
|||
'''Content analysis''' is the study of [[document]]s and communication artifacts, which might be texts of various formats, pictures, audio or video. Social scientists use content analysis to examine patterns in communication in a replicable and systematic manner.<ref>{{Cite book|title=Business research methods|first=Alan|last=Bryman|date=2011|publisher=Oxford University Press|last2=Bell|first2=Emma|isbn=9780199583409|edition=3rd|location=Cambridge|oclc=746155102}}</ref> One of the key advantages of using content analysis to analyse social phenomena is its non-invasive nature, in contrast to simulating social experiences or collecting survey answers.
||
Practices and philosophies of content analysis vary between academic disciplines. They all involve systematic reading or observation of [[text (literary theory)|texts]] or artifacts which are [[Coding (social sciences)|assigned labels (sometimes called codes)]] to indicate the presence of interesting, [[semantics|meaningful]] pieces of content.<ref>{{cite book|last=Hodder|first=I.|title=The interpretation of documents and material culture|year=1994|publisher=Sage|location=Thousand Oaks etc.|isbn=978-0761926870|page=155|url=https://books.google.com/books?id=CdCGek5KJ_QC&q=hodder+mute+evidence&pg=PA155}}</ref><ref name="Tipaldo 2014 42">{{cite book|url=https://www.mulino.it/isbn/9788815248329|title=L'analisi del contenuto e i mass media|last=Tipaldo|first=G.|publisher=Il Mulino|year=2014|isbn=978-88-15-24832-9|location=Bologna, IT|pages=42}}</ref> By systematically labeling the content of a set of [[text (literary theory)|texts]], researchers can analyse patterns of content [[Quantitative research|quantitatively]] using [[Statistics|statistical methods]], or use [[Qualitative research|qualitative]] methods to analyse meanings of content within [[text (literary theory)|texts]]. |
||
Computers are increasingly used in content analysis to automate the labeling (or coding) of documents. Simple computational techniques can provide descriptive data such as word frequencies and document lengths. [[Machine learning]] classifiers can greatly increase the number of texts that can be labeled, but the scientific utility of doing so is a matter of debate. Further, numerous computer-aided text analysis (CATA) computer programs are available that analyze text for predetermined linguistic, semantic, and psychological characteristics.<ref name="Neuendorf2016" /> |
||
== Goals == |
||
The simplest and most objective form of content analysis considers unambiguous characteristics of the text such as [[word frequencies]], the page area taken by a newspaper column, or the duration of a [[radio]] or [[television]] program. Analysis of simple word frequencies is limited because the meaning of a word depends on surrounding text. [[Key Word in Context|Key Word In Context]] (KWIC) routines address this by placing words in their textual context. This helps resolve ambiguities such as those introduced by [[synonym]]s and [[homonym]]s. |
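The contrast between bare word frequencies and Key Word In Context output can be sketched in a few lines of Python. This is a minimal illustration, not a standard KWIC implementation: the function names, the simple regular-expression tokenizer, the three-word context window, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for every occurrence."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Hypothetical sample text containing a homonym.
sample = "The bank raised interest rates. We sat on the river bank at dusk."

# The frequency count alone cannot tell the two senses of "bank" apart.
print(word_frequencies(sample)["bank"])

# The KWIC lines show each occurrence with its neighbouring words.
for left, word, right in kwic(sample, "bank"):
    print(f"{left:>20} [{word}] {right}")
```

The frequency count treats both occurrences of "bank" as the same item, while the context lines give a human coder enough surrounding text to separate the financial sense from the riverside one.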
||
A further step in analysis is the distinction between dictionary-based (quantitative) approaches and qualitative approaches. Dictionary-based approaches set up a list of categories derived from the frequency list of words and control the distribution of words and their respective categories over the texts. While methods in quantitative content analysis in this way transform observations of found categories into quantitative statistical data, the qualitative content analysis focuses more on the intentionality and its implications. There are strong parallels between qualitative content analysis and [[thematic analysis]].<ref>{{Cite journal|last1=Vaismoradi|first1=Mojtaba|last2=Turunen|first2=Hannele|last3=Bondas|first3=Terese|date=2013-09-01|title=Content analysis and thematic analysis: Implications for conducting a qualitative descriptive study|journal=Nursing & Health Sciences|language=en|volume=15|issue=3|pages=398–405|doi=10.1111/nhs.12048|pmid=23480423|s2cid=10881485 |issn=1442-2018|doi-access=free}}</ref> |
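A dictionary-based approach of the kind described above can be sketched as follows. The category names, the indicator words, and the two sample documents are hypothetical and chosen only for illustration; real coding dictionaries are considerably larger and are usually pretested.

```python
import re

# Hypothetical coding dictionary: each category maps to its indicator words.
CODING_DICTIONARY = {
    "economy": {"tax", "taxes", "jobs", "inflation"},
    "security": {"police", "crime", "border"},
}


def code_text(text, dictionary=CODING_DICTIONARY):
    """Count, for each category, how many tokens match its indicator words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = {category: 0 for category in dictionary}
    for token in tokens:
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    return counts


# Hypothetical documents to be coded.
documents = {
    "speech_a": "Lower taxes create jobs, and jobs reduce crime.",
    "speech_b": "More police at the border.",
}

# One row of category counts per document, ready for statistical analysis.
coded = {name: code_text(text) for name, text in documents.items()}
for name, counts in coded.items():
    print(name, counts)
```

Each document is reduced to a row of category counts, which is the step that turns observed categories into quantitative data. The sketch matches exact word forms only, so it inherits the ambiguity problems of simple word counts noted earlier.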
||
== Qualitative and quantitative content analysis == |
||
Quantitative content analysis highlights frequency counts and statistical analysis of these coded frequencies.<ref name=":03">{{Cite journal|last=Kracauer|first=Siegfried|date=1952|title=The Challenge of Qualitative Content Analysis|journal=Public Opinion Quarterly|volume=16|issue=4, Special Issue on International Communications Research|pages=631|doi=10.1086/266427|issn=0033-362X}}</ref> Additionally, quantitative content analysis begins with a framed hypothesis with coding decided on before the analysis begins. These coding categories are strictly relevant to the researcher's hypothesis. Quantitative analysis also takes a deductive approach.<ref name=":12">{{Cite journal|last1=White|first1=Marilyn Domas|last2=Marsh|first2=Emily E.|date=2006|title=Content Analysis: A Flexible Methodology|journal=Library Trends|volume=55|issue=1|pages=22–45|doi=10.1353/lib.2006.0053|issn=1559-0682|hdl=2142/3670|s2cid=6342233|hdl-access=free}}</ref> Examples of content-analytical variables and constructs can be found, for example, in the open-access database [https://www.hope.uzh.ch/doca DOCA]. This database compiles, systematizes, and evaluates relevant content-analytical variables of communication and political science research areas and topics. |
||
[[Siegfried Kracauer]] provides a critique of quantitative analysis, asserting that it oversimplifies complex communications in order to be more reliable. On the other hand, qualitative analysis deals with the intricacies of latent interpretations, whereas quantitative has a focus on manifest meanings. He also acknowledges an "overlap" of qualitative and quantitative content analysis.<ref name=":03"/> Patterns are looked at more closely in qualitative analysis, and based on the latent meanings that the researcher may find, the course of the research could be changed. It is inductive and begins with open research questions, as opposed to a hypothesis.<ref name=":12"/> |
||
== Codebooks == |
|||
The data collection instrument used in content analysis is the codebook or coding scheme. In qualitative content analysis the codebook is constructed and improved ''during'' coding, while in quantitative content analysis the codebook needs to be developed and pretested for reliability and validity ''before'' coding.<ref name="Neuendorf2016" /> The codebook includes detailed instructions for human coders plus clear definitions of the respective concepts or variables to be coded plus the assigned values. |
|||
According to current standards of good scientific practice, each content analysis study should provide its codebook in the appendix or as supplementary material so that the [[reproducibility]] of the study is ensured. On the [https://osf.io Open Science Framework] (OSF) server of the [[Center for Open Science]], many codebooks from content analysis studies are freely available by searching for "codebook".
|||
Furthermore, the ''Database of Variables for Content Analysis'' (DOCA) provides an open access archive of pretested variables and established codebooks for content analyses.<ref>{{Cite web |last1=Oehmer-Pedrazzi |first1=Franziska |last2=Kessler |first2=Sabrina |last3=Humprecht |first3=Edda |last4=Sommer |first4=Katharina |last5=Castro Herrero |first5=Laia |date=2022 |title=DOCA - Database of Categories for Content Analysis |url=https://www.hope.uzh.ch/doca |issn=2673-8597}}</ref> Measures from the archive can be adopted in future studies to ensure the use of high-quality and comparable instruments. DOCA covers, among others, measures for the content analysis of fictional media and entertainment (e.g., measures for sexualization in video games<ref>{{Cite journal |last1=Wulf |first1=Tim |last2=Possler |first2=Daniel |last3=Breuer |first3=Johannes |date=2021 |title=Sexualization (Video Games) |url=https://www.hope.uzh.ch/doca/article/view/2654 |journal=DOCA - Database of Variables for Content Analysis |language=en |doi=10.34778/3e |s2cid=233683109 |issn=2673-8597|doi-access=free }}</ref>), of user-generated media content (e.g., measures for online hate speech<ref>{{Cite journal |last=Esau |first=Katharina |date=2021 |title=Hate speech (Hate Speech/Incivility) |url=https://www.hope.uzh.ch/doca/article/view/5a |journal=DOCA - Database of Variables for Content Analysis |language=en |doi=10.34778/5a |s2cid=235551271 |issn=2673-8597|doi-access=free }}</ref>), and of news media and journalism (e.g., measures for stock photo use in press reporting on child sexual abuse,<ref>{{Cite journal |last1=Döring |first1=Nicola |last2=Walter |first2=Roberto |date=2022 |title=Iconography of Child Sexual Abuse in the News (Justice and Crime Reporting) |url=https://www.hope.uzh.ch/doca/article/view/3621 |journal=DOCA - Database of Variables for Content Analysis |language=en |doi=10.34778/2zu |s2cid=248329276 |issn=2673-8597|doi-access=free }}</ref> and measures of personalization in 
election campaign coverage<ref>{{Cite journal |last=Leidecker-Sandmann |first=Melanie |date=2021 |title=Personalization (Election Campaign Coverage) |url=https://www.hope.uzh.ch/doca/article/view/2g |journal=DOCA - Database of Variables for Content Analysis |language=en |doi=10.34778/2g |s2cid=235520184 |issn=2673-8597|doi-access=free }}</ref>). |
|||
== Computational tools == |
||
With the rise of common computing facilities like PCs, computer-based methods of analysis are growing in popularity.<ref>Pfeiffer, Silvia, Stefan Fischer, and Wolfgang Effelsberg. "[https://ub-madoc.bib.uni-mannheim.de/794/1/TR-96-008.pdf Automatic audio content analysis]." Technical Reports 96 (1996).</ref><ref>Grimmer, Justin, and Brandon M. Stewart. "[https://web.stanford.edu/~jgrimmer/Text14/tc5.pdf Text as data: The promise and pitfalls of automatic content analysis methods for political texts]." Political analysis 21.3 (2013): 267-297.</ref><ref>Nasukawa, Tetsuya, and Jeonghee Yi. "[http://nactem.ac.uk/files/workshop06/TAKMIandSA2006.pdf Sentiment analysis: Capturing favorability using natural language processing]." Proceedings of the 2nd international conference on Knowledge capture. ACM, 2003.</ref> Answers to open-ended questions, newspaper articles, political party manifestos, medical records or systematic observations in experiments can all be subject to systematic analysis of textual data.
|||
When the contents of communication are available as machine-readable texts, the input is analyzed for frequencies and coded into categories from which inferences are built.
||
Computer-assisted analysis can help with large electronic data sets by saving time and eliminating the need for multiple human coders to establish inter-coder reliability. However, human coders can still be employed for content analysis, as they are often better able to pick out nuanced and latent meanings in text. One study found that human coders were able to evaluate a broader range and make inferences based on latent meanings.<ref>{{Cite journal|last=Conway|first=Mike|date=March 2006|title=The Subjective Precision of Computers: A Methodological Comparison with Human Coding in Content Analysis|journal=Journalism & Mass Communication Quarterly|volume=83|issue=1|pages=186–200|doi=10.1177/107769900608300112|s2cid=143292050|issn=1077-6990}}</ref>
||
== Reliability and Validity ==
||
Robert Weber notes: "To make valid inferences from the text, it is important that the classification procedure be reliable in the sense of being consistent: Different people should code the same text in the same way".<ref>{{cite book|last=Weber|first=Robert Philip|title=Basic Content Analysis|url=https://archive.org/details/basiccontentanal00webe|url-access=limited|year=1990|publisher=Sage|location=Newbury Park, CA|isbn=9780803938632|page=[https://archive.org/details/basiccontentanal00webe/page/n14 12]|edition=2nd}}</ref> Validity, inter-coder reliability and intra-coder reliability have been the subject of intense methodological research for many years.<ref name="Krippendorff2004" />
||
Neuendorf suggests that when human coders are used in content analysis at least two independent coders should be used. [[Reliability (statistics)|Reliability]] of human coding is often measured using a statistical measure of ''inter-coder reliability'' or "the amount of agreement or correspondence among two or more coders".<ref name="Neuendorf2016">{{cite book|author=Kimberly A. Neuendorf|title=The Content Analysis Guidebook|url=https://books.google.com/books?id=nMA5DQAAQBAJ|date=30 May 2016|publisher=SAGE|isbn=978-1-4129-7947-4}}</ref> Lacy and Riffe identify the measurement of inter-coder reliability as a strength of quantitative content analysis, arguing that, if content analysts do not measure inter-coder reliability, their data are no more reliable than the subjective impressions of a single reader.<ref>{{Cite journal|last1=Lacy|first1=Stephen R|last2=Riffe|first2=Daniel|date=1993|title=Sins of Omission and Commission in Mass Communication Quantitative Research|journal=Journalism & Mass Communication Quarterly|volume=70|issue=1|pages=126–132|doi=10.1177/107769909307000114|s2cid=144076335}}</ref> |
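As an illustration of what a statistical measure of inter-coder agreement can look like, the sketch below computes simple percent agreement and Cohen's kappa for two coders. The text above does not name a specific coefficient, so the choice of Cohen's kappa is an assumption made for the example, and the two lists of codes are invented. Kappa corrects the observed agreement <math>p_o</math> for the agreement <math>p_e</math> expected by chance: <math>\kappa = (p_o - p_e) / (1 - p_e)</math>.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of units to which both coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of units")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders assigning nominal codes."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: product of the coders' marginal proportions per code.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        # Both coders used a single identical code for every unit.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two independent coders to ten units.
coder_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
coder_b = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg", "pos", "neg"]

print(round(percent_agreement(coder_a, coder_b), 2))
print(round(cohens_kappa(coder_a, coder_b), 2))
```

In this invented example the coders agree on eight of ten units, but half of that agreement would be expected by chance given how often each coder uses each code, so kappa is noticeably lower than the raw agreement rate.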
||
According to today's reporting standards, quantitative content analyses should be published with complete codebooks and for all variables or measures in the codebook the appropriate inter-coder or [[inter-rater reliability]] coefficients should be reported based on empirical pre-tests.<ref name="Neuendorf2016" /><ref name=":1" /><ref>{{Cite journal |last1=Oleinik |first1=Anton |last2=Popova |first2=Irina |last3=Kirdina |first3=Svetlana |last4=Shatalova |first4=Tatyana |date=2014 |title=On the choice of measures of reliability and validity in the content-analysis of texts |url=https://doi.org/10.1007/s11135-013-9919-0 |journal=Quality & Quantity |language=en |volume=48 |issue=5 |pages=2703–2718 |doi=10.1007/s11135-013-9919-0 |s2cid=144174429 |issn=1573-7845}}</ref> Furthermore, the [[Validity (statistics)|validity]] of all variables or measures in the codebook must be ensured. This can be achieved through the use of established measures that have proven their validity in earlier studies. Also, the [[Validity (statistics)#Content validity|content validity]] of the measures can be checked by experts from the field who scrutinize and then approve or correct coding instructions, definitions and examples in the codebook. |
|||
=== Kinds of text === |
||
# [[written text]], such as books and papers
# oral text, such as speech and theatrical performance
# iconic text, such as drawings, paintings, and icons
# audio-visual text, such as TV programs, movies, and videos
# [[hypertext]]s, which are texts found on the Internet
||
=== History === |
||
Content analysis is research using the categorization and classification of speech, written text, interviews, images, or other forms of communication. In its beginnings, using the first newspapers at the end of the 19th century, analysis was done manually by measuring the number of columns given a subject. The approach can also be traced back to a university student studying patterns in Shakespeare's literature in 1893.<ref>{{Cite journal|last=Sumpter|first=Randall S.|date=July 2001|title=News about News|journal=Journalism History|volume=27|issue=2|pages=64–72|doi=10.1080/00947679.2001.12062572|s2cid=140499059|issn=0094-7679}}</ref> |
|||
Over the years, content analysis has been applied to a variety of scopes. [[Hermeneutics]] and [[philology]] have long used content analysis to interpret sacred and profane texts and, in many cases, to attribute texts' [[authorship]] and [[authentication|authenticity]].<ref name="Tipaldo 2014 42"/><ref name="Krippendorff2004">{{cite book|last=Krippendorff|first=Klaus|title=Content Analysis: An Introduction to Its Methodology|year=2004|publisher=Sage|location=Thousand Oaks, CA|isbn=9780761915454|pages=413|url=https://books.google.com/books?id=q657o3M3C8cC|edition=2nd}}</ref> |
||
In recent times, particularly with the advent of [[mass communication]], content analysis has been increasingly used to analyze and understand media content and media logic.
||
The political scientist [[Harold Lasswell]] formulated the core questions of content analysis in its early-mid 20th-century mainstream version: "Who says what, to whom, why, to what extent and with what effect?".<ref>{{cite book |last1=Lasswell |first1=Harold |editor1-last=Bryson |editor1-first=L. |title=The Communication of Ideas |date=1948 |publisher=Harper and Row |location=New York |page=216 |url=http://sipa.jlu.edu.cn/__local/E/39/71/4CE63D3C04A10B5795F0108EBE6_A7BC17AA_34AAE.pdf |chapter=The Structure and Function of Communication in Society}}</ref> The strong emphasis on a quantitative approach initiated by Lasswell was carried forward by another "father" of content analysis, [[Bernard Berelson]], who proposed a definition of content analysis that is emblematic of this point of view: "a research technique for the objective, systematic and quantitative description of the manifest content of communication".<ref name="Berelson1952">{{cite book|last=Berelson| first=B.|title=Content Analysis in Communication Research|year=1952|publisher=Free Press|location=Glencoe|pages=18}}</ref>
||
Quantitative content analysis has enjoyed a renewed popularity in recent years thanks to technological advances and fruitful application in mass communication and personal communication research. Content analysis of textual [[big data]] produced by [[new media]], particularly [[social media]] and [[mobile devices]], has become popular. These approaches take a simplified view of language that ignores the complexity of [[semiosis]], the process by which meaning is formed out of language. Quantitative content analysts have been criticized for limiting the scope of content analysis to simple counting, and for applying the measurement methodologies of the natural sciences without reflecting critically on their appropriateness to social science.<ref name=":0">{{Cite book|title=Content Analysis: An Introduction to Its Methodology|url=https://archive.org/details/contentanalysisi00krip_916|url-access=limited|last=Krippendorff|first=Klaus|publisher=Sage|year=2004|isbn=978-0-7619-1544-7|location=California|pages=[https://archive.org/details/contentanalysisi00krip_916/page/n101 87]–89}}</ref> Conversely, qualitative content analysts have been criticized for being insufficiently systematic and too impressionistic.<ref name=":0" /> Krippendorff argues that quantitative and qualitative approaches to content analysis tend to overlap, and that there can be no generalisable conclusion as to which approach is superior.<ref name=":0" />
||
Content analysis can also be described as studying [[Trace evidence|traces]], which are documents from past times, and artifacts, which are non-linguistic documents. Texts are understood to be produced by communication processes in a broad sense of that phrase, often gaining meaning through [[Abductive reasoning|abduction]].<ref name="Tipaldo 2014 42"/><ref>{{cite journal |last1=Timmermans |first1=Stefan |last2=Tavory |first2=Iddo |title=Theory Construction in Qualitative Research |journal=Sociological Theory |date=2012 |volume=30 |issue=3 |pages=167–186 |doi=10.1177/0735275112457914 |s2cid=145177394 |url=http://grap.ulb.ac.be/wp-content/uploads/Timmermans-and-Tavory_Abductive-Analysis.pdf |access-date=2018-12-09 |archive-date=2019-08-19 |archive-url=https://web.archive.org/web/20190819065302/http://grap.ulb.ac.be/wp-content/uploads/Timmermans-and-Tavory_Abductive-Analysis.pdf |url-status=dead }}</ref>
||
=== Latent and manifest content === |
||
Manifest content is readily understandable at its face value. Its meaning is direct. Latent content is not as overt, and requires interpretation to uncover the meaning or implication.<ref>{{Cite book|last1=Jang-Hwan Lee|last2=Young-Gul Kim|last3=Sung-Ho Yu|title=Proceedings of the 34th Annual Hawaii International Conference on System Sciences |chapter=Stage model for knowledge management |year=2001|page=10|publisher=IEEE Comput. Soc|doi=10.1109/hicss.2001.927103|isbn=0-7695-0981-9|s2cid=34182315}}</ref> |
||
==Uses== |
||
Holsti groups fifteen uses of content analysis into three basic [[Categorisation|categories]]:<ref name=Holsti1969>{{cite book|last=Holsti|first=Ole R.|title=Content Analysis for the Social Sciences and Humanities|year=1969|publisher=Addison-Wesley|location=Reading, MA|pages=14–93|id=(Table 2-1, page 26)}}</ref> |
||
* make [[inference]]s about the antecedents of a [[communication]] |
* make [[inference]]s about the antecedents of a [[communication]] |
||
*describe and make inferences about characteristics of a communication |
*describe and make inferences about characteristics of a communication |
||
Line 71: | Line 82: | ||
The following table shows fifteen uses of content analysis in terms of their general purpose, element of the communication paradigm to which they apply, and the general question they are intended to answer. |
The following table shows fifteen uses of content analysis in terms of their general purpose, element of the communication paradigm to which they apply, and the general question they are intended to answer. |
||
{| class="wikitable |
{| class="wikitable" |
||
⚫ | |||
|- |
|||
⚫ | |||
|- |
|- |
||
! Purpose |
|||
! Element |
|||
! Question |
|||
! Use |
|||
| align=center| '''Use''' |
|||
|- |
|- |
||
| rowspan=2| Make inferences about the antecedents of communications |
| rowspan=2| Make inferences about the antecedents of communications |
||
Line 112: | Line 122: | ||
| |
| |
||
*Relate known characteristics of audiences to messages produced for them |
*Relate known characteristics of audiences to messages produced for them |
||
*Describe |
*Describe patterns of communication |
||
|- |
|- |
||
| Make inferences about the consequences of communications |
| Make inferences about the consequences of communications |
||
Line 126: | Line 136: | ||
|} |
|} |
||
As a counterpoint, there are limits to the scope of use for the procedures that characterize content analysis. In particular, if access to the goal of analysis can be obtained by direct means without material interference, then direct measurement techniques yield better data.<ref>{{cite book|last=Holsti|first=Ole R.|title=Content Analysis for the Social Sciences and Humanities|year=1969|publisher=Addison-Wesley|location=Reading, MA|pages= |
As a counterpoint, there are limits to the scope of use for the procedures that characterize content analysis. In particular, if access to the goal of analysis can be obtained by direct means without material interference, then direct measurement techniques yield better data.<ref>{{cite book|last=Holsti|first=Ole R.|title=Content Analysis for the Social Sciences and Humanities|year=1969|publisher=Addison-Wesley|location=Reading, MA|pages=15–16}}</ref> Thus, while content analysis attempts to quantifiably describe ''communications'' whose features are primarily categorical——limited usually to a nominal or ordinal scale——via selected conceptual units (the ''unitization'') which are assigned values (the ''categorization'') for ''enumeration'' while monitoring ''intercoder reliability'', if instead the target quantity manifestly is already directly measurable——typically on an interval or ratio scale——especially a continuous physical quantity, then such targets usually are not listed among those needing the "subjective" selections and formulations of content analysis.<ref>{{cite book|last=Holsti|first=Ole R.|title=Content Analysis for the Social Sciences and Humanities|year=1969|publisher=Addison-Wesley|location=Reading, MA}}</ref><ref>{{cite book|last=Neuendorf|first=Kimberly A.|title=The Content Analysis Guidebook|year=2002|publisher=Sage|location=Thousand Oaks, CA|isbn=0761919783|pages=52–54|id=(On content analysis's descriptive role)}}</ref><ref>{{cite book|last=Agresti|first=Alan|title=Categorical Data Analysis|edition=2nd|year=2002|publisher=Wiley|location=Hoboken, NJ|isbn=0471360937|pages=2–4|id=(On the meanings of "categorical" and other measurement scales)}}</ref><ref>{{cite book|last=Delfico|first=Joseph F.|title=Content Analysis: A Methodology for Structuring and Analyzing Written Material|year=1996|publisher=United States General Accounting Office|location=Washington, DC|url=https://www.gao.gov/products/PEMD-10.3.1|id=(Linked to a 
PDF)|pages=19–21}}</ref><ref>{{cite book|last=Delfico|first=Joseph F.|title=Content Analysis: A Methodology for Structuring and Analyzing Written Material|year=1996|publisher=United States General Accounting Office|location=Washington, DC|id=(ASCII transcription; Chapter 3:1.1, on uses according to scale type, and Appendix III, on intercoder reliability)|url=https://www.govinfo.gov/content/pkg/GAOREPORTS-PEMD-10-3-1/html/GAOREPORTS-PEMD-10-3-1.htm}}</ref><ref>{{Cite journal|last1=Carney|first1=T[homas] F[rancis]|date=1971|title=Content Analysis: A Review Essay|journal=Historical Methods Newsletter|language=en|volume=4|issue=2|pages=52–61|doi=10.1080/00182494.1971.10593939 |url=https://doi.org/10.1080/00182494.1971.10593939|id=(On content analysis's quantitative nature, unitization and categorization, and descriptive role)}}</ref><ref name=":1">{{cite book|last=Krippendorff|first=Klaus|title=Content Analysis: An Introduction to Its Methodology|edition=2nd|year=2004|publisher=Sage|location=Thousand Oaks, CA|isbn=0761915451|url=https://usu.instructure.com/files/70315935/download?download_frd=1&verifier=kPCeVgRYVJ8UK2gEQNbehYHbiKYBNjWMFleh6j5G|pages=(passim)|id=(On content analysis's quantitative nature, unitization and categorization, and uses by scale type)}}</ref><ref>{{cite book|last1=Hall|first1=Calvin S.|last2=Van de Castle|first2=Robert L.|title=The Content Analysis of Dreams|year=1966|publisher=Appleton-Century-Crofts|location=New York|pages=1–16|id=(Chapter 1, "The Methodology of Content Analysis," on the quantitative nature and uses of content analysis, and quoting "subjective" from page 12)}}</ref> For example (from mixed research and clinical application), as medical images ''communicate'' diagnostic features to physicians, [[neuroimaging]]'s [[stroke]] (infarct) volume scale called ASPECTS is ''unitized'' as 10 qualitatively delineated (unequal) brain regions in the [[middle cerebral artery]] territory, which it ''categorizes'' as being at least partly 
versus not at all infarcted in order to ''enumerate'' the latter, with published series often assessing ''intercoder reliability'' by [[Cohen's kappa]]. The foregoing ''italicized operations'' impose the uncredited ''form'' of content analysis onto an estimation of infarct extent, which instead is easily enough and more accurately measured as a volume directly on the images.<ref>{{Cite journal|last=Suss|first=Richard A.|date=2020|title=ASPECTS, The Mismeasure of Stroke: A Metrological Investigation|journal=OSF Preprints|doi=10.31219/osf.io/c4tkp |s2cid=242764761 |language=en|url=https://doi.org/10.31219/osf.io/c4tkp|id=(§3, §6, and §7 for the nature of, risks of, and alternative to ASPECTS, and page 76 for comparison to content analysis)}}</ref><ref>{{Cite journal|last1=Suss|first1=Richard A.|last2=Pinho|first2=Marco C.|date=2020|title=ASPECTS Distorts Infarct Volume Measurement|journal=American Journal of Neuroradiology|language=en|volume=41|issue=5|page=E28|doi=10.3174/ajnr.A6485 |pmid=32241774 |pmc=7228155 |s2cid=214767536 |url=https://doi.org/10.3174/ajnr.A6485}}</ref> ("Accuracy ... is the highest form of reliability."<ref>{{cite book|last=Weber|first=Robert Philip|title=Basic Content Analysis|edition=2nd|year=1990|publisher=Sage|location=Newbury Park, CA|isbn=0803938632|page=17}}</ref>) The concomitant clinical assessment, however, by the [[National Institutes of Health Stroke Scale]] (NIHSS) or the [[modified Rankin Scale]] (mRS), retains the necessary form of content analysis. Recognizing potential limits of content analysis across the contents of language and images alike, [[Klaus Krippendorff]] affirms that "comprehen[sion] ... may ... not conform at all to the process of classification and/or counting by which most content analyses proceed,"<ref>{{Cite journal|last=Krippendorff|first=Klaus|date=1974|title=Review of Thomas F. 
Carney, ''Content Analysis: A Technique for Systematic Inference from Communications''|journal=University of Pennsylvania Scholarly Commons, Annenberg School of Communication Departmental Papers|language=en|url=https://repository.upenn.edu/cgi/viewcontent.cgi?article=1538&context=asc_papers|id=(Quote from 4th page, unnumbered)}}</ref> suggesting that content analysis might materially distort a message. |
||
== |
== Developing the initial coding scheme == |
||
The process of the initial coding scheme or approach to coding is contingent on the particular content analysis approach selected. Through a directed content analysis, the scholars draft a preliminary coding scheme from pre-existing theory or assumptions. While with the conventional content analysis approach, the initial coding scheme developed from the data. |
The process of the initial coding scheme or approach to coding is contingent on the particular content analysis approach selected. Through a directed content analysis, the scholars draft a preliminary coding scheme from pre-existing theory or assumptions. While with the conventional content analysis approach, the initial coding scheme developed from the data. |
||
=== |
=== Conventional process of coding === |
||
With either approach above, |
With either approach above, researchers may immerse themselves into the data to obtain an overall picture. A consistent and clear unit of coding is vital, with the choices ranging from a single word to several paragraphs and from texts to iconic symbols. Lastly, researchers construct the relationships between codes by sorting out them within specific categories or themes.<ref>{{Cite book|url=http://methods.sagepub.com/reference/the-sage-encyclopedia-of-educational-research-measurement-and-evaluation|title=Content Analysis|publisher=Sage|year=2018 |doi=10.4135/9781506326139 |access-date=December 16, 2019 |last1=Frey |first1=Bruce B. |isbn=9781506326153 |s2cid=4110403 }}</ref> |
||
==See also== |
==See also== |
||
Line 141: | Line 151: | ||
* [[Transition words]] |
* [[Transition words]] |
||
* [[Video content analysis]] |
* [[Video content analysis]] |
||
* [[Grounded theory]] |
|||
==References== |
==References== |
||
Line 146: | Line 157: | ||
==Further reading== |
==Further reading== |
||
* {{cite journal | last1 = Graneheim | first1 = Ulla Hällgren | last2 = Lundman | first2 = Berit | year = 2004 | title = Qualitative content analysis in nursing research: concepts, procedures and measures to achieve trustworthiness | journal = Nurse Education Today | volume = 24 | issue = 2| pages = 105–112 | doi=10.1016/j.nedt.2003.10.001| pmid = 14769454 }} |
* {{cite journal | last1 = Graneheim | first1 = Ulla Hällgren | last2 = Lundman | first2 = Berit | year = 2004 | title = Qualitative content analysis in nursing research: concepts, procedures and measures to achieve trustworthiness | journal = Nurse Education Today | volume = 24 | issue = 2| pages = 105–112 | doi=10.1016/j.nedt.2003.10.001| pmid = 14769454 | s2cid = 17354453 }} |
||
* Budge, |
* {{cite book|date=2001|editor-first=Ian|editor-last=Budge|location=Oxford, UK|publisher=Oxford University Press|title=Mapping Policy Preferences. Estimates for Parties, Electors and Governments 1945-1998}}<!-- auto-translated by Module:CS1 translator --> |
||
* |
* {{cite book|date=2008|editor-first=Klaus|editor-first2=Mary Angela|editor-last=Krippendorff|editor-last2=Bock|location=Thousand Oaks, CA|publisher=Sage|title=The Content Analysis Reader.}}<!-- auto-translated by Module:CS1 translator --> |
||
* |
* {{cite book|date=2017|edition=2nd|first=Kimberly|last=Neuendorf|location=Thousand Oaks, CA|publisher=Sage|title=The Content Analysis Guidebook}}<!-- auto-translated by Module:CS1 translator --> |
||
* Roberts, |
* {{cite book|date=1997|editor-first=Carl|editor-last=Roberts|location=Mahwah, NJ|publisher=Lawrence Erlbaum|title=Text Analysis for the Social Sciences: Methods for Drawing Inferences from Texts and Transcripts.}}<!-- auto-translated by Module:CS1 translator --> |
||
* |
* {{cite book|date=2005|edition=8th|first=Roger|first2=Joseph|last=Wimmer|last2=Dominick|location=Belmont, CA|publisher=Wadsworth|title=Mass Media Research: An Introduction}}<!-- auto-translated by Module:CS1 translator --> |
||
{{Prone to spam|date=October 2017}} |
{{Prone to spam|date=October 2017}} |
||
{{Z148}} |
|||
{{Psychology}} |
{{Psychology}} |
Latest revision as of 17:59, 1 November 2024
Content analysis is the study of documents and communication artifacts, which might be texts of various formats, pictures, audio or video. Social scientists use content analysis to examine patterns in communication in a replicable and systematic manner.[1] One of the key advantages of using content analysis to analyse social phenomena is its non-invasive nature, in contrast to simulating social experiences or collecting survey answers.
Practices and philosophies of content analysis vary between academic disciplines. They all involve systematic reading or observation of texts or artifacts which are assigned labels (sometimes called codes) to indicate the presence of interesting, meaningful pieces of content.[2][3] By systematically labeling the content of a set of texts, researchers can analyse patterns of content quantitatively using statistical methods, or use qualitative methods to analyse meanings of content within texts.
Computers are increasingly used in content analysis to automate the labeling (or coding) of documents. Simple computational techniques can provide descriptive data such as word frequencies and document lengths. Machine learning classifiers can greatly increase the number of texts that can be labeled, but the scientific utility of doing so is a matter of debate. Further, numerous computer-aided text analysis (CATA) computer programs are available that analyze text for predetermined linguistic, semantic, and psychological characteristics.[4]
Goals
Content analysis is best understood as a broad family of techniques. Effective researchers choose techniques that best help them answer their substantive questions. That said, according to Klaus Krippendorff, six questions must be addressed in every content analysis:[5]
- Which data are analyzed?
- How are the data defined?
- From what population are data drawn?
- What is the relevant context?
- What are the boundaries of the analysis?
- What is to be measured?
The simplest and most objective form of content analysis considers unambiguous characteristics of the text such as word frequencies, the page area taken by a newspaper column, or the duration of a radio or television program. Analysis of simple word frequencies is limited because the meaning of a word depends on surrounding text. Key Word In Context (KWIC) routines address this by placing words in their textual context. This helps resolve ambiguities such as those introduced by synonyms and homonyms.
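The word-frequency and KWIC ideas above can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: the function names (`tokenize`, `word_frequencies`, `kwic`), the regular-expression tokenizer, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for every occurrence."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Hypothetical sample text containing a homonym ("bank").
sample = (
    "The bank raised interest rates. "
    "We walked along the river bank at dusk."
)
freqs = word_frequencies(sample)
lines = kwic(sample, "bank", window=2)
for left, word, right in lines:
    print(f"{left:>20} [{word}] {right}")
```

The frequency count alone reports two occurrences of "bank"; only the context windows show that the two occurrences carry different senses, which is the ambiguity KWIC routines are meant to expose.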
A further step in analysis is the distinction between dictionary-based (quantitative) approaches and qualitative approaches. Dictionary-based approaches set up a list of categories derived from the frequency list of words and control the distribution of words and their respective categories over the texts. While methods in quantitative content analysis in this way transform observations of found categories into quantitative statistical data, the qualitative content analysis focuses more on the intentionality and its implications. There are strong parallels between qualitative content analysis and thematic analysis.[6]
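A dictionary-based approach of the kind described above might look roughly like the following sketch. The category dictionary, the document names, and the helper `code_text` are hypothetical; a real study would take its categories and word lists from a pretested codebook.

```python
import re

# Hypothetical category dictionary, for illustration only.
CATEGORY_DICTIONARY = {
    "economy": {"tax", "taxes", "jobs", "inflation"},
    "health": {"hospital", "hospitals", "doctor", "vaccine"},
}


def code_text(text, dictionary):
    """Count, per category, how many tokens of the text match its word list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {category: 0 for category in dictionary}
    for token in tokens:
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    return counts


# Hypothetical documents to be coded.
documents = {
    "speech_a": "Lower taxes create jobs, and jobs reduce inflation.",
    "speech_b": "Every hospital needs one more doctor.",
}
coded = {name: code_text(text, CATEGORY_DICTIONARY) for name, text in documents.items()}
for name, counts in coded.items():
    print(name, counts)
```

The resulting per-document category counts are the kind of quantitative data that can then be passed to statistical analysis. Exact word matching is a deliberate simplification here; it ignores context, which is the limitation of frequency-based methods noted earlier.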
Qualitative and quantitative content analysis
Quantitative content analysis highlights frequency counts and statistical analysis of these coded frequencies.[7] Additionally, quantitative content analysis begins with a framed hypothesis with coding decided on before the analysis begins. These coding categories are strictly relevant to the researcher's hypothesis. Quantitative analysis also takes a deductive approach.[8] Examples of content-analytical variables and constructs can be found in the open-access database DOCA. This database compiles, systematizes, and evaluates relevant content-analytical variables of communication and political science research areas and topics.
Siegfried Kracauer provides a critique of quantitative analysis, asserting that it oversimplifies complex communications in order to be more reliable. On the other hand, qualitative analysis deals with the intricacies of latent interpretations, whereas quantitative has a focus on manifest meanings. He also acknowledges an "overlap" of qualitative and quantitative content analysis.[7] Patterns are looked at more closely in qualitative analysis, and based on the latent meanings that the researcher may find, the course of the research could be changed. It is inductive and begins with open research questions, as opposed to a hypothesis.[8]
Codebooks
The data collection instrument used in content analysis is the codebook or coding scheme. In qualitative content analysis the codebook is constructed and improved during coding, while in quantitative content analysis the codebook needs to be developed and pretested for reliability and validity before coding.[4] The codebook includes detailed instructions for human coders plus clear definitions of the respective concepts or variables to be coded plus the assigned values.
According to current standards of good scientific practice, each content analysis study should provide its codebook in the appendix or as supplementary material so that reproducibility of the study is ensured. On the Open Science Framework (OSF) server of the Center for Open Science, many codebooks of content analysis studies are freely available via a search for "codebook".
Furthermore, the Database of Variables for Content Analysis (DOCA) provides an open access archive of pretested variables and established codebooks for content analyses.[9] Measures from the archive can be adopted in future studies to ensure the use of high-quality and comparable instruments. DOCA covers, among others, measures for the content analysis of fictional media and entertainment (e.g., measures for sexualization in video games[10]), of user-generated media content (e.g., measures for online hate speech[11]), and of news media and journalism (e.g., measures for stock photo use in press reporting on child sexual abuse,[12] and measures of personalization in election campaign coverage[13]).
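As a rough illustration of the codebook structure described in this section (variables with definitions and assigned values), the following sketch represents a codebook as plain Python data and checks a coder's record against it. The `Variable` class, the `tone` variable, its values, and the `validate` helper are all hypothetical and do not reproduce any published codebook or DOCA entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str
    definition: str  # instruction text shown to human coders
    values: dict     # maps each assigned value (code) to its label


# Hypothetical one-variable codebook, for illustration only.
CODEBOOK = {
    "tone": Variable(
        name="tone",
        definition="Overall evaluative tone of the article toward its main subject.",
        values={0: "negative", 1: "neutral", 2: "positive"},
    ),
}


def validate(record, codebook):
    """Return a list of problems found in one coder's record."""
    problems = []
    for name, variable in codebook.items():
        if name not in record:
            problems.append(f"missing variable: {name}")
        elif record[name] not in variable.values:
            problems.append(f"invalid value for {name}: {record[name]!r}")
    return problems


ok = validate({"tone": 2}, CODEBOOK)
bad = validate({"tone": 7}, CODEBOOK)
print(ok, bad)
```

Keeping the definitions and permitted values in one machine-readable place makes it easier to publish the codebook as supplementary material and to catch coding errors before analysis.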
Computational tools
With the rise of common computing facilities like PCs, computer-based methods of analysis are growing in popularity.[14][15][16] Answers to open ended questions, newspaper articles, political party manifestos, medical records or systematic observations in experiments can all be subject to systematic analysis of textual data.
When the contents of communication are available as machine-readable texts, the input is analyzed for frequencies and coded into categories to build up inferences.
Computer-assisted analysis can help with large, electronic data sets by saving time and eliminating the need for multiple human coders to establish inter-coder reliability. However, human coders can still be employed for content analysis, as they are often more able to pick out nuanced and latent meanings in text. A study found that human coders were able to evaluate a broader range and make inferences based on latent meanings.[17]
Reliability and Validity
Robert Weber notes: "To make valid inferences from the text, it is important that the classification procedure be reliable in the sense of being consistent: Different people should code the same text in the same way".[18] Validity, inter-coder reliability and intra-coder reliability have been the subject of intense methodological research for many years.[5] Neuendorf suggests that when human coders are used in content analysis at least two independent coders should be used. Reliability of human coding is often measured using a statistical measure of inter-coder reliability or "the amount of agreement or correspondence among two or more coders".[4] Lacy and Riffe identify the measurement of inter-coder reliability as a strength of quantitative content analysis, arguing that, if content analysts do not measure inter-coder reliability, their data are no more reliable than the subjective impressions of a single reader.[19]
According to today's reporting standards, quantitative content analyses should be published with complete codebooks and for all variables or measures in the codebook the appropriate inter-coder or inter-rater reliability coefficients should be reported based on empirical pre-tests.[4][20][21] Furthermore, the validity of all variables or measures in the codebook must be ensured. This can be achieved through the use of established measures that have proven their validity in earlier studies. Also, the content validity of the measures can be checked by experts from the field who scrutinize and then approve or correct coding instructions, definitions and examples in the codebook.
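To make the idea of an inter-coder reliability coefficient concrete, the sketch below computes simple percent agreement and Cohen's kappa for two coders who labeled the same ten units. Cohen's kappa is only one of several coefficients in use, and this section does not prescribe a particular statistic; the coder data and function names are invented for the example.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of units on which both coders assigned the same code."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of units")
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: product of the coders' marginal proportions per code.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two independent coders to ten units.
coder_a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "pos"]
coder_b = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos", "pos", "pos"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"agreement={agreement:.2f} kappa={kappa:.3f}")
```

Here the coders agree on 8 of 10 units, but because each uses "pos" 60% of the time, 52% agreement would be expected by chance, so kappa is noticeably lower than raw agreement. This gap is the reason chance-corrected coefficients are generally reported instead of raw agreement alone.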
Kinds of text
There are five types of texts in content analysis:
- written text, such as books and papers
- oral text, such as speech and theatrical performance
- iconic text, such as drawings, paintings, and icons
- audio-visual text, such as TV programs, movies, and videos
- hypertexts, which are texts found on the Internet
History
Content analysis is research using the categorization and classification of speech, written text, interviews, images, or other forms of communication. In its beginnings, using the first newspapers at the end of the 19th century, analysis was done manually by measuring the number of columns given a subject. The approach can also be traced back to a university student studying patterns in Shakespeare's literature in 1893.[22]
Over the years, content analysis has been applied to a variety of scopes. Hermeneutics and philology have long used content analysis to interpret sacred and profane texts and, in many cases, to attribute texts' authorship and authenticity.[3][5]
In recent times, particularly with the advent of mass communication, content analysis has seen increasing use as a way to analyze and understand media content and media logic in depth. The political scientist Harold Lasswell formulated the core questions of content analysis in its early-mid 20th-century mainstream version: "Who says what, to whom, why, to what extent and with what effect?".[23] The strong emphasis on a quantitative approach initiated by Lasswell was carried through by another "father" of content analysis, Bernard Berelson, who proposed a definition of content analysis which, from this point of view, is emblematic: "a research technique for the objective, systematic and quantitative description of the manifest content of communication".[24]
Quantitative content analysis has enjoyed a renewed popularity in recent years thanks to technological advances and fruitful application in mass communication and personal communication research. Content analysis of textual big data produced by new media, particularly social media and mobile devices, has become popular. These approaches take a simplified view of language that ignores the complexity of semiosis, the process by which meaning is formed out of language. Quantitative content analysts have been criticized for limiting the scope of content analysis to simple counting, and for applying the measurement methodologies of the natural sciences without reflecting critically on their appropriateness to social science.[25] Conversely, qualitative content analysts have been criticized for being insufficiently systematic and too impressionistic.[25] Krippendorff argues that quantitative and qualitative approaches to content analysis tend to overlap, and that there can be no generalisable conclusion as to which approach is superior.[25]
Content analysis can also be described as studying traces, which are documents from past times, and artifacts, which are non-linguistic documents. Texts are understood to be produced by communication processes in a broad sense of that phrase, often gaining meaning through abduction.[3][26]
Latent and manifest content
Manifest content is readily understandable at its face value. Its meaning is direct. Latent content is not as overt, and requires interpretation to uncover the meaning or implication.[27]
Uses
Holsti groups fifteen uses of content analysis into three basic categories:[28]
- make inferences about the antecedents of a communication
- describe and make inferences about characteristics of a communication
- make inferences about the effects of a communication.
He also places these uses into the context of the basic communication paradigm.
The following table shows fifteen uses of content analysis in terms of their general purpose, element of the communication paradigm to which they apply, and the general question they are intended to answer.
Purpose | Element | Question | Use |
---|---|---|---|
Make inferences about the antecedents of communications | Source | Who? | |
Make inferences about the antecedents of communications | Encoding process | Why? | |
Describe & make inferences about the characteristics of communications | Channel | How? | |
Describe & make inferences about the characteristics of communications | Message | What? | |
Describe & make inferences about the characteristics of communications | Recipient | To whom? | |
Make inferences about the consequences of communications | Decoding process | With what effect? | |
Note. Purpose, communication element, & question from Holsti.[28] Uses primarily from Berelson[29] as adapted by Holsti.[28]
As a counterpoint, there are limits to the scope of use for the procedures that characterize content analysis. In particular, if access to the goal of analysis can be obtained by direct means without material interference, then direct measurement techniques yield better data.[30] Content analysis attempts to quantifiably describe communications whose features are primarily categorical (usually limited to a nominal or ordinal scale). It does so via selected conceptual units (the unitization), which are assigned values (the categorization) for enumeration, while intercoder reliability is monitored. If the target quantity is instead already directly measurable, typically on an interval or ratio scale and especially as a continuous physical quantity, then it is usually not listed among the targets needing the "subjective" selections and formulations of content analysis.[31][32][33][34][35][36][20][37] For example (from mixed research and clinical application), as medical images communicate diagnostic features to physicians, neuroimaging's stroke (infarct) volume scale called ASPECTS is unitized as 10 qualitatively delineated (unequal) brain regions in the middle cerebral artery territory, which it categorizes as being at least partly versus not at all infarcted in order to enumerate the latter, with published series often assessing intercoder reliability by Cohen's kappa. The foregoing operations (unitization, categorization, enumeration, and the monitoring of intercoder reliability) impose the uncredited form of content analysis onto an estimation of infarct extent, which instead is easily enough and more accurately measured as a volume directly on the images.[38][39] ("Accuracy ... is the highest form of reliability."[40]) The concomitant clinical assessment, however, by the National Institutes of Health Stroke Scale (NIHSS) or the modified Rankin Scale (mRS), retains the necessary form of content analysis. Recognizing potential limits of content analysis across the contents of language and images alike, Klaus Krippendorff affirms that "comprehen[sion] ... may ... not conform at all to the process of classification and/or counting by which most content analyses proceed,"[41] suggesting that content analysis might materially distort a message.
Developing the initial coding scheme
The process for developing the initial coding scheme depends on the particular content analysis approach selected. In a directed content analysis, scholars draft a preliminary coding scheme from pre-existing theory or assumptions, whereas in the conventional content analysis approach the initial coding scheme is developed from the data.
Conventional process of coding
With either approach above, researchers may immerse themselves in the data to obtain an overall picture. A consistent and clear unit of coding is vital, with the choices ranging from a single word to several paragraphs and from texts to iconic symbols. Lastly, researchers construct the relationships between codes by sorting them into specific categories or themes.[42]
See also
- Donald Wayne Foster
- Hermeneutics
- Text mining
- The Polish Peasant in Europe and America
- Transition words
- Video content analysis
- Grounded theory
References
- ^ Bryman, Alan; Bell, Emma (2011). Business research methods (3rd ed.). Cambridge: Oxford University Press. ISBN 9780199583409. OCLC 746155102.
- ^ Hodder, I. (1994). The interpretation of documents and material culture. Thousand Oaks etc.: Sage. p. 155. ISBN 978-0761926870.
- ^ a b c Tipaldo, G. (2014). L'analisi del contenuto e i mass media. Bologna, IT: Il Mulino. p. 42. ISBN 978-88-15-24832-9.
- ^ a b c d Kimberly A. Neuendorf (30 May 2016). The Content Analysis Guidebook. SAGE. ISBN 978-1-4129-7947-4.
- ^ a b c Krippendorff, Klaus (2004). Content Analysis: An Introduction to Its Methodology (2nd ed.). Thousand Oaks, CA: Sage. p. 413. ISBN 9780761915454.
- ^ Vaismoradi, Mojtaba; Turunen, Hannele; Bondas, Terese (2013-09-01). "Content analysis and thematic analysis: Implications for conducting a qualitative descriptive study". Nursing & Health Sciences. 15 (3): 398–405. doi:10.1111/nhs.12048. ISSN 1442-2018. PMID 23480423. S2CID 10881485.
- ^ a b Kracauer, Siegfried (1952). "The Challenge of Qualitative Content Analysis". Public Opinion Quarterly. 16 (4, Special Issue on International Communications Research): 631. doi:10.1086/266427. ISSN 0033-362X.
- ^ a b White, Marilyn Domas; Marsh, Emily E. (2006). "Content Analysis: A Flexible Methodology". Library Trends. 55 (1): 22–45. doi:10.1353/lib.2006.0053. hdl:2142/3670. ISSN 1559-0682. S2CID 6342233.
- ^ Oehmer-Pedrazzi, Franziska; Kessler, Sabrina; Humprecht, Edda; Sommer, Katharina; Castro Herrero, Laia (2022). "DOCA - Database of Categories for Content Analysis". ISSN 2673-8597.
- ^ Wulf, Tim; Possler, Daniel; Breuer, Johannes (2021). "Sexualization (Video Games)". DOCA - Database of Variables for Content Analysis. doi:10.34778/3e. ISSN 2673-8597. S2CID 233683109.
- ^ Esau, Katharina (2021). "Hate speech (Hate Speech/Incivility)". DOCA - Database of Variables for Content Analysis. doi:10.34778/5a. ISSN 2673-8597. S2CID 235551271.
- ^ Döring, Nicola; Walter, Roberto (2022). "Iconography of Child Sexual Abuse in the News (Justice and Crime Reporting)". DOCA - Database of Variables for Content Analysis. doi:10.34778/2zu. ISSN 2673-8597. S2CID 248329276.
- ^ Leidecker-Sandmann, Melanie (2021). "Personalization (Election Campaign Coverage)". DOCA - Database of Variables for Content Analysis. doi:10.34778/2g. ISSN 2673-8597. S2CID 235520184.
- ^ Pfeiffer, Silvia, Stefan Fischer, and Wolfgang Effelsberg. "Automatic audio content analysis." Technical Reports 96 (1996).
- ^ Grimmer, Justin, and Brandon M. Stewart. "Text as data: The promise and pitfalls of automatic content analysis methods for political texts." Political analysis 21.3 (2013): 267-297.
- ^ Nasukawa, Tetsuya, and Jeonghee Yi. "Sentiment analysis: Capturing favorability using natural language processing." Proceedings of the 2nd international conference on Knowledge capture. ACM, 2003.
- ^ Conway, Mike (March 2006). "The Subjective Precision of Computers: A Methodological Comparison with Human Coding in Content Analysis". Journalism & Mass Communication Quarterly. 83 (1): 186–200. doi:10.1177/107769900608300112. ISSN 1077-6990. S2CID 143292050.
- ^ Weber, Robert Philip (1990). Basic Content Analysis (2nd ed.). Newbury Park, CA: Sage. p. 12. ISBN 9780803938632.
- ^ Lacy, Stephen R; Riffe, Daniel (1993). "Sins of Omission and Commission in Mass Communication Quantitative Research". Journalism & Mass Communication Quarterly. 70 (1): 126–132. doi:10.1177/107769909307000114. S2CID 144076335.
- ^ a b Krippendorff, Klaus (2004). Content Analysis: An Introduction to Its Methodology (2nd ed.). Thousand Oaks, CA: Sage. pp. (passim). ISBN 0761915451. (On content analysis's quantitative nature, unitization and categorization, and uses by scale type).
- ^ Oleinik, Anton; Popova, Irina; Kirdina, Svetlana; Shatalova, Tatyana (2014). "On the choice of measures of reliability and validity in the content-analysis of texts". Quality & Quantity. 48 (5): 2703–2718. doi:10.1007/s11135-013-9919-0. ISSN 1573-7845. S2CID 144174429.
- ^ Sumpter, Randall S. (July 2001). "News about News". Journalism History. 27 (2): 64–72. doi:10.1080/00947679.2001.12062572. ISSN 0094-7679. S2CID 140499059.
- ^ Lasswell, Harold (1948). "The Structure and Function of Communication in Society". In Bryson, L. (ed.). The Communication of Ideas (PDF). New York: Harper and Row. p. 216.
- ^ Berelson, B. (1952). Content Analysis in Communication Research. Glencoe: Free Press. p. 18.
- ^ a b c Krippendorff, Klaus (2004). Content Analysis: An Introduction to Its Methodology. California: Sage. pp. 87–89. ISBN 978-0-7619-1544-7.
- ^ Timmermans, Stefan; Tavory, Iddo (2012). "Theory Construction in Qualitative Research" (PDF). Sociological Theory. 30 (3): 167–186. doi:10.1177/0735275112457914. S2CID 145177394. Archived from the original (PDF) on 2019-08-19. Retrieved 2018-12-09.
- ^ Jang-Hwan Lee; Young-Gul Kim; Sung-Ho Yu (2001). "Stage model for knowledge management". Proceedings of the 34th Annual Hawaii International Conference on System Sciences. IEEE Comput. Soc. p. 10. doi:10.1109/hicss.2001.927103. ISBN 0-7695-0981-9. S2CID 34182315.
- ^ a b c Holsti, Ole R. (1969). Content Analysis for the Social Sciences and Humanities. Reading, MA: Addison-Wesley. pp. 14–93. (Table 2-1, page 26).
- ^ Berelson, Bernard (1952). Content Analysis in Communication Research. Glencoe, Ill: Free Press.
- ^ Holsti, Ole R. (1969). Content Analysis for the Social Sciences and Humanities. Reading, MA: Addison-Wesley. pp. 15–16.
- ^ Holsti, Ole R. (1969). Content Analysis for the Social Sciences and Humanities. Reading, MA: Addison-Wesley.
- ^ Neuendorf, Kimberly A. (2002). The Content Analysis Guidebook. Thousand Oaks, CA: Sage. pp. 52–54. ISBN 0761919783. (On content analysis's descriptive role).
- ^ Agresti, Alan (2002). Categorical Data Analysis (2nd ed.). Hoboken, NJ: Wiley. pp. 2–4. ISBN 0471360937. (On the meanings of "categorical" and other measurement scales).
- ^ Delfico, Joseph F. (1996). Content Analysis: A Methodology for Structuring and Analyzing Written Material. Washington, DC: United States General Accounting Office. pp. 19–21.
- ^ Delfico, Joseph F. (1996). Content Analysis: A Methodology for Structuring and Analyzing Written Material. Washington, DC: United States General Accounting Office. (ASCII transcription; Chapter 3:1.1, on uses according to scale type, and Appendix III, on intercoder reliability).
- ^ Carney, T[homas] F[rancis] (1971). "Content Analysis: A Review Essay". Historical Methods Newsletter. 4 (2): 52–61. doi:10.1080/00182494.1971.10593939. (On content analysis's quantitative nature, unitization and categorization, and descriptive role).
- ^ Hall, Calvin S.; Van de Castle, Robert L. (1966). The Content Analysis of Dreams. New York: Appleton-Century-Crofts. pp. 1–16. (Chapter 1, "The Methodology of Content Analysis," on the quantitative nature and uses of content analysis, and quoting "subjective" from page 12).
- ^ Suss, Richard A. (2020). "ASPECTS, The Mismeasure of Stroke: A Metrological Investigation". OSF Preprints. doi:10.31219/osf.io/c4tkp. S2CID 242764761. (§3, §6, and §7 for the nature of, risks of, and alternative to ASPECTS, and page 76 for comparison to content analysis).
- ^ Suss, Richard A.; Pinho, Marco C. (2020). "ASPECTS Distorts Infarct Volume Measurement". American Journal of Neuroradiology. 41 (5): E28. doi:10.3174/ajnr.A6485. PMC 7228155. PMID 32241774. S2CID 214767536.
- ^ Weber, Robert Philip (1990). Basic Content Analysis (2nd ed.). Newbury Park, CA: Sage. p. 17. ISBN 0803938632.
- ^ Krippendorff, Klaus (1974). "Review of Thomas F. Carney, Content Analysis: A Technique for Systematic Inference from Communications". University of Pennsylvania Scholarly Commons, Annenberg School of Communication Departmental Papers. (Quote from 4th page, unnumbered).
- ^ Frey, Bruce B. (2018). Content Analysis. Sage. doi:10.4135/9781506326139. ISBN 9781506326153. S2CID 4110403. Retrieved December 16, 2019.
Further reading
- Graneheim, Ulla Hällgren; Lundman, Berit (2004). "Qualitative content analysis in nursing research: concepts, procedures and measures to achieve trustworthiness". Nurse Education Today. 24 (2): 105–112. doi:10.1016/j.nedt.2003.10.001. PMID 14769454. S2CID 17354453.
- Budge, Ian, ed. (2001). Mapping Policy Preferences. Estimates for Parties, Electors and Governments 1945-1998. Oxford, UK: Oxford University Press.
- Krippendorff, Klaus; Bock, Mary Angela, eds. (2008). The Content Analysis Reader. Thousand Oaks, CA: Sage.
- Neuendorf, Kimberly (2017). The Content Analysis Guidebook (2nd ed.). Thousand Oaks, CA: Sage.
- Roberts, Carl, ed. (1997). Text Analysis for the Social Sciences: Methods for Drawing Inferences from Texts and Transcripts. Mahwah, NJ: Lawrence Erlbaum.
- Wimmer, Roger; Dominick, Joseph (2005). Mass Media Research: An Introduction (8th ed.). Belmont, CA: Wadsworth.