Nordic Dialect Corpus and Syntax Database

The corpus and database were initiated under the umbrella of the ScanDiaSyn research network and the Nordic Centre of Excellence NORMS. The technical solutions are provided by the Text Laboratory. Both language resources are intended for research and education.

• Nordic Atlas of Language Structures (NALS) Journal has been published
• Nordic Dialect Corpus: Expanded Icelandic part - 20 new informants from 6 places
• Nordic Dialect Corpus: New Search Interface Documentation including User Manual

Nordic Dialect Corpus
The Nordic Dialect Corpus is a corpus of spoken Norwegian, Swedish, Danish, Faroese, Icelandic and Övdalian. It consists of spontaneous speech data from dialects of the North Germanic languages across all of the Nordic countries. The linguistic data in the corpus come from a variety of sources, both old and new (see Data Collection). The corpus contains about 2.8 million words from conversations and interviews with dialect speakers. It is transcribed and linked to audio and video, has a map function, and can be searched in a large variety of ways. Although the corpus was built for Nordic syntax research, it is a general-purpose resource (in effect a Norwegian Dialect Corpus, a Swedish Dialect Corpus, and so on) that can be used in a wide range of research areas, such as phonology, morphology and lexicography.

You can search a small demo corpus here. (User name: guest, password: guest).

How to refer to the corpus: Johannessen, Janne Bondi, Joel Priestley, Kristin Hagen, Tor Anders Åfarli, and Øystein Alexander Vangsnes. 2009. The Nordic Dialect Corpus - an Advanced Research Tool. In Jokinen, Kristiina and Eckhard Bick (eds.): Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. NEALT Proceedings Series Volume 4. (Read the paper)
Read also: Johannessen, Janne Bondi, Øystein Alexander Vangsnes, Joel Priestley, and Kristin Hagen. 2014. A multilingual speech corpus of North-Germanic languages. In: Spoken Corpora and Linguistic Studies. John Benjamins Publishing Company, pp. 69-83. (Read the paper)

Nordic Syntax Database
The database consists of judgments by 924 Nordic dialect speakers from 207 places on a list of sentences that illustrate various syntactic phenomena. Many of the speakers appear in both the database and the corpus. The sentences have been graded, and on the basis of these grades dialect maps can be generated and isoglosses drawn. The judgments can be sorted and filtered in many ways: by place, by age or sex of the informants, or by type of syntactic phenomenon.
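
To make the filtering and mapping idea concrete, here is a minimal Python sketch. It is not the database's actual interface or schema: the `Judgment` record, the function names, the 1 to 5 grade scale, and the sample rows (with placeholder places and phenomena) are all assumptions made for illustration. It filters judgments by informant attributes and averages the grades per place, which is the kind of per-place value a dialect map could plot.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Judgment:
    """One informant's grade for one test sentence (hypothetical schema)."""
    place: str       # placeholder place label, not a real location
    age: int
    sex: str         # "F" or "M"
    phenomenon: str  # placeholder phenomenon label
    grade: int       # assumed scale: 1 (rejected) to 5 (fully accepted)


def filter_judgments(
    judgments: Iterable[Judgment],
    *,
    phenomenon: Optional[str] = None,
    sex: Optional[str] = None,
    min_age: Optional[int] = None,
    max_age: Optional[int] = None,
) -> List[Judgment]:
    """Keep only the judgments that match every criterion that was given."""
    selected = []
    for j in judgments:
        if phenomenon is not None and j.phenomenon != phenomenon:
            continue
        if sex is not None and j.sex != sex:
            continue
        if min_age is not None and j.age < min_age:
            continue
        if max_age is not None and j.age > max_age:
            continue
        selected.append(j)
    return selected


def mean_grade_by_place(judgments: Iterable[Judgment]) -> Dict[str, float]:
    """Average the grades per place; a map layer could colour places by this."""
    grades_by_place: Dict[str, List[int]] = {}
    for j in judgments:
        grades_by_place.setdefault(j.place, []).append(j.grade)
    return {place: mean(grades) for place, grades in sorted(grades_by_place.items())}


# Invented sample rows, for illustration only.
SAMPLE = [
    Judgment("PlaceA", 25, "F", "phenomenon_x", 5),
    Judgment("PlaceA", 70, "M", "phenomenon_x", 3),
    Judgment("PlaceB", 30, "M", "phenomenon_x", 1),
    Judgment("PlaceB", 65, "F", "phenomenon_y", 4),
]

if __name__ == "__main__":
    selected = filter_judgments(SAMPLE, phenomenon="phenomenon_x")
    print(mean_grade_by_place(selected))
```

Under these assumptions, an isogloss would run between neighbouring places whose average grades fall on opposite sides of a chosen acceptance threshold.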

How to refer to the database: Lindstad, Arne Martinus; Nøklestad, Anders; Johannessen, Janne Bondi; Vangsnes, Øystein Alexander. 2009. The Nordic Dialect Database: Mapping Microsyntactic Variation in the Scandinavian Languages. In Jokinen, Kristiina and Eckhard Bick (eds.): NEALT Proceedings Series Volume 4. (Read the paper)