NorDiaSyn is one of several subprojects belonging to the joint Nordic project Scandinavian Dialect Syntax (ScanDiaSyn). The ScanDiaSyn network explores syntactic variation in the Scandinavian dialects, and simultaneously collects large amounts of material from speakers across the Scandinavian dialect continuum. The speech material is made available for research through the Nordic Dialect Corpus and Nordic Syntax Database.
The Norwegian Dialect Corpus and Syntax Database is the Norwegian part of the Nordic Dialect Corpus and Syntax Database. The corpus and the database are developed in collaboration with our partners in ScanDiaSyn and the Nordic Center of Excellence in Microcomparative Syntax (NORMS). The Norwegian Dialect Corpus is the part of the corpus that contains recordings of Norwegian speech. The Nordic Syntax Database contains data from questionnaires designed to map the grammatical variation in Nordic dialects.
The Norwegian Dialect Corpus contains more than 2 million words from Norwegian dialects. Version 1.0 of the corpus contains recordings from over 100 different measurement points in Norway, evenly distributed across the country. (See map of the measurement points.) These data were collected between 2006 and 2010. The Norwegian Dialect Corpus also includes recordings from the Dialect Archive at the Department of Linguistics and Scandinavian Studies at the University of Oslo. The transcriptions from the Dialect Archive are financed by Norsk ordbok 2014 (NO2014). The recordings in the Dialect Archive were made at different locations in Norway in the 1960s and 1970s. Read more about the contents of the corpus under the tab About the Data Collection.
Through the search interface Glossa (also developed by the Text Laboratory, see the Tools tab at the top of the page) you can search the corpus by word, grammatical affix, grammatical category, etc. The search results are displayed as concordances that are linked directly to audio and video.
The data collection in NorDiaSyn was led by Janne Bondi Johannessen, at the Text Laboratory, UiO, in close cooperation with Tor A. Åfarli, NTNU, and Øystein A. Vangsnes, UiT. The technical development is done by the Text Laboratory. For further information, see the Project Info tab.
Refer to the corpus as follows:
Johannessen, Janne Bondi, Joel Priestley, Kristin Hagen, Tor Anders Åfarli, and Øystein Alexander Vangsnes. 2009. "The Nordic Dialect Corpus - an Advanced Research Tool". In Jokinen, Kristiina and Eckhard Bick (eds.): Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. NEALT Proceedings Series Volume 4.