Culture, corpora and semantics: Methodological issues in using elicited and corpus data for cultural comparison
BIANCHI, Francesca
2012-01-01
Abstract
Culture, corpora and semantics is a methodological investigation into the use of elicited data and Web data in the analysis of cultural specificities starting from semantic elements. After discussing several theoretical and analytical approaches to culture in linguistics, anthropology, psychology, and marketing research, the book applies a specifically developed method of analysis and cross-cultural comparison to elicited data on chocolate and wine, gathered through free sentence-completion and sentence-writing tests administered to English and Italian respondents. The results are discussed within the framework of cultural systems theories and used as a control reference for further methodological investigations. In particular, the elicited data are qualitatively and quantitatively compared to non-elicited sentences on chocolate and wine from general Web corpora in English and Italian. Furthermore, to avoid the complex and time-consuming process of manually coding a large dataset, some alternative routes are tested, based on the creation of sub-corpora using sampling procedures and on the analysis of a limited number of the most frequent words in the dataset’s wordlist. An automatic semantic tagger is also tested on the elicited data, in order to assess the extent of its possible application in cultural analysis. Comparisons between the Web corpora and the elicited data suggest that large general Web corpora can be considered representative of the cultural associations of a node word and could thus be used in cultural analysis or in exploratory marketing research. Finally, in the light of the results of the various methodological tests, the work discusses general issues, such as the relationship between word frequency and cultural relevance, and tagset granularity.
The analysis of two British English words – chocolate and wine – and their denotationally comparable terms in Italian (cioccolato/a, cioccolatino/i, and vino/i) provides the opportunity to test different types of data, sampling procedures, coding methods, and a set of cultural theories in the identification of the cultural associations of those terms. As the subtitle of the book clarifies, the goal of the present work is methodological, namely the development of a viable corpus linguistics method for distinguishing the cultural associations of a given word from personal mental associations. To this end, an interdisciplinary approach was adopted. The theoretical framework draws on several disciplines that study culture through language, though from different perspectives, namely corpus linguistics, cultural studies, marketing, anthropology and psychology, with a focus on the shared elements relevant to the goal of the present research. This was considered necessary in order to make the method applicable outside linguistics. However, the book presents a piece of linguistic research and addresses a prospective audience of linguists. The work accomplishes two main goals. First, from a cultural perspective, it selects a cultural framework – cultural systems theories – that lends itself to computational semantic analysis, and develops a computational procedure for distinguishing the mental associations anchored in culture from those which are not. Second, from a methodological perspective, the quantitative comparisons performed between the entire datasets (both elicited and Web-based) on the one hand, and smaller samples of the data on the other, show, in this particular context, to what extent findings based on smaller data samples are generalisable to the whole dataset from which the samples are drawn, thus adding useful information to our general knowledge in corpus linguistics.
In sum, this book makes a foray into a multidisciplinary approach to the study of corpora, culture and semantics and provides researchers involved in (cross-)cultural analysis with theoretical as well as practical ideas for a user-friendly corpus analysis of cultural associations.