CMDI 1.1. Metadata
Header
MdCreator: Kristin Hagen
MdCreationDate: 2021-04-09
MdSelfLink:
MdProfile: clarin.eu:cr1:p_1407745711925
MdCollectionDisplayName: Clarino - Textlab
Resources
ResourceProxyList:
ResourceProxy [id=‘nordic-dialect-corpus-lp’]:
ResourceType [mimetype=‘’]: LandingPage
ResourceRef: http://www.tekstlab.uio.no/nota/scandiasyn/index.html
ResourceProxy [id=‘ndc-corpus’]:
ResourceType [mimetype=‘’]: Resource
ResourceRef: https://tekstlab.uio.no/glossa2/ndc2
ResourceProxy [id=‘ndc-transcriptions’]:
ResourceType [mimetype=‘’]: Resource
ResourceRef: http://www.tekstlab.uio.no/scandiasyn/download.html
JournalFileProxyList:
ResourceRelationList:
ResourceRelation:
RelationType: transcriptions
Res1 [ref=‘ndc-corpus’]:
Res2 [ref=‘ndc-transcriptions’]:
IsPartOfList:
Components
corpusProfile:
resourceCommonInfo [ComponentId=‘clarin.eu:cr1:c_1396012485126’]:
resourceType [ref=‘ndc-transcriptions’]: corpus
identificationInfo [ComponentId=‘clarin.eu:cr1:c_1396012485125’]:
resourceName [xml:lang=‘en’]: Nordic Dialect Corpus - downloadable transcriptions
resourceName [xml:lang=‘nb’]: Nordisk dialektkorpus - nedlastbare transkripsjoner
description [xml:lang=‘en’]: Nordic Dialect Corpus v. 4.0 is a corpus of Norwegian, Swedish, Danish, Faroese, Icelandic and Övdalian spoken language. It consists of spontaneous speech data from dialects of the North Germanic languages across all of the Nordic countries. The linguistic data come from a variety of sources (see homepage, Data Collection) and were recorded between 1998 and 2015. The corpus contains more than 2.75 million words from conversations and interviews with dialect speakers.

The downloadable version of the corpus contains all transcriptions in the corpus, in both txt and html format. The Norwegian and Övdalian transcriptions are available in two versions: one phonetic and one orthographic. The other transcriptions are orthographic only.
resourceShortName [xml:lang=‘en’]: NDC - downloadable transcriptions
url: http://www.tekstlab.uio.no/scandiasyn/download.html
PID: http://hdl.handle.net/11538/0000-0005-E7C7-6
distributionInfo [ComponentId=‘clarin.eu:cr1:c_1396012485124’] [ref=‘ndc-transcriptions’]:
licenceInfo [ComponentId=‘clarin.eu:cr1:c_1396012485158’]:
userCategory: Public
distributionAccessMedium: downloadable
downloadLocation: http://www.tekstlab.uio.no/scandiasyn/download.html
executionLocation: http://www.tekstlab.uio.no/nota/scandiasyn/
licence [ComponentId=‘clarin.eu:cr1:c_1447674760330’]:
licenceFamily: Creative Commons (CC)
licenceName: Creative_Commons-BY-NC-SA (CC-BY-NC-SA)
licenceURL: http://creativecommons.org/licenses/by-nc-sa/4.0/
conditionsOfUse: BY
conditionsOfUse: NC
conditionsOfUse: SA
nonStandardConditionsOfUse: The corpus has audio and video recordings classified as personal data. In agreement with NSD, the Data Protection Official in Norway, the video and audio files are accessible only through Glossa, a search and post-processing tool developed by the Text Laboratory.

Please note that every individual researcher is responsible for treating the participants in the corpus with respect and sincerity. Furthermore, the participants must be kept anonymous in every published paper or other output.
licensor:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/english/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
distributionRightsHolder:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/english/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
contact:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
metadataInfo [ComponentId=‘clarin.eu:cr1:c_1407745711922’]:
metadataCreationDate: 2015-02-03
metadataLastDateUpdated: 2021-04-16
metadataCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Hagen
givenName: Kristin
sex: female
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
validationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711923’]:
validated: true
validationType: content
validationMode: manual
validationModeDetails: The transcriptions are proofread against the audio files. The national projects NorDiaSyn, DanDiaSyn and SweDiaSyn have proofread their own transcriptions; see homepage, Transcription.
validationExtent: full
resourceDocumentationInfo [ComponentId=‘clarin.eu:cr1:c_1355150532301’]:
documentationStructured [ComponentId=‘clarin.eu:cr1:c_1361876010648’]:
role: documentation
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: other
title: Nordic Dialect Corpus and Syntax Database
author: The Text Laboratory
year: 2013
url: http://www.tekstlab.uio.no/nota/scandiasyn/
documentLanguageId: en
documentationStructured [ComponentId=‘clarin.eu:cr1:c_1361876010648’]:
role: documentation
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: manual
title: The Nordic Dialect Corpus - Search Interface Documentation
author: Eirik Olsen
year: 2014
url: http://www.tekstlab.uio.no/nota/scandiasyn/help/
documentLanguageId: en
documentationStructured [ComponentId=‘clarin.eu:cr1:c_1361876010648’]:
role: documentation
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: book
title [xml:lang=‘nb’]: Om artiklene i denne boka og Nordisk dialektkorpus
editor: Janne Bondi Johannessen and Kristin Hagen
year: 2014
publisher: Novus forlag
bookTitle: Språk i Norge og nabolanda. Ny forskning om talespråk.
ISBN: 978-82-7099-795-4
documentLanguageName: Norwegian bokmål
documentLanguageId: nb
resourceCreationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711921’]:
creationStartDate: 2005-01-01
creationEndDate: 2019-09-30
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘en’]: Scandinavian Dialect Syntax
projectShortName: ScanDiaSyn
url: http://websim.arkivert.uit.no/scandiasyn/scandiasyn/index.html%3fcolapsemenu=colapsemenu
url: http://www.tekstlab.uio.no/nota/scandiasyn/index.html
fundingType: other
funder: http://websim.arkivert.uit.no/scandiasyn/scandiasyn/29
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘nb’]: NorDiaSyn - Norsk dialektsyntaks
projectName: Nordiasyn - Norwegian Dialect Syntax
projectShortName: Nordiasyn
url: http://www.tekstlab.uio.no/nota/NorDiaSyn/index.html
url: http://www.tekstlab.uio.no/nota/NorDiaSyn/english/index.html
fundingType: nationalFunds
funder: The Research Council of Norway
fundingCountry: Norway
projectStartDate: 2009-01-01
projectEndDate: 2013-12-31
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘nb’]: For the funding of the national projects in Norway, Sweden, Denmark, Iceland and the Faroe Islands, see under National Projects: http://www.tekstlab.uio.no/nota/scandiasyn/dialect_data_collection.html
url: http://www.tekstlab.uio.no/nota/scandiasyn/dialect_data_collection.html
fundingType: nationalFunds
corpusInfo [ComponentId=‘clarin.eu:cr1:c_1407745711878’]:
corpusType [ref=‘ndc-transcriptions’]: Written Corpus
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: text
corpusTextInfo [ComponentId=‘clarin.eu:cr1:c_1396012485188’]:
textFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477072’]:
mimeType: Downloadable transcriptions in txt and html format
sizePerTextFormat [ComponentId=‘clarin.eu:cr1:c_1447674760342’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 2 754 289
sizeUnit: tokens
characterEncodingInfo [ComponentId=‘clarin.eu:cr1:c_1447674760355’]:
characterEncoding: utf-8
corpusPartGeneralInfo [ComponentId=‘clarin.eu:cr1:c_1407745711882’]:
personSourceSetInfo [ComponentId=‘clarin.eu:cr1:c_1360931019775’]:
numberOfPersons: 737
ageOfPersons: teenager
ageOfPersons: adult
ageOfPersons: elderly
ageRangeStart: 11
ageRangeEnd: 94
sexOfPersons: mixed
originOfPersons: native
dialectAccentOfPersons: Dialects from Norway, Sweden, Denmark, the Faroe Islands, Iceland and Älvdalen.
geographicDistributionOfPersons: Norway, Sweden, Denmark, the Faroe Islands, Iceland and Älvdalen
lingualityInfo [ComponentId=‘clarin.eu:cr1:c_1355150532313’]:
lingualityType: multilingual
multilingualityType: other
multilingualityTypeDetails: Interviews and conversations in 5 Scandinavian languages.
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: nb
languageName: Norwegian Bokmål (the orthographic transcriptions)
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 1 997 920
sizeUnit: tokens
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 111 places in Norway, 438 informants
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: sv
languageName: Swedish (Övdalian included)
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 376 868, of which 14 798 are Övdalian
sizeUnit: tokens
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 44 places in Sweden, 150 informants; 17 informants from 7 places are Övdalian.
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: da
languageName: Danish
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 220 360
sizeUnit: tokens
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 15 places in Denmark, 81 informants
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: is
languageName: Icelandic
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 94 338
sizeUnit: tokens
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 8 places in Iceland, 48 informants
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: fo
languageName: Faroese
sizePerLanguage [ComponentId=‘clarin.eu:cr1:c_1447674760349’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 64 803
sizeUnit: tokens
languageVarietyInfo [ComponentId=‘clarin.eu:cr1:c_1428388179422’]:
languageVarietyType: dialect
languageVarietyName: Dialects from 5 places in the Faroe Islands, 20 informants
modalityInfo [ComponentId=‘clarin.eu:cr1:c_1447674760356’]:
modalityType: spokenLanguage
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 2 754 289
sizeUnit: tokens
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: speechAnnotation-phoneticTranscription
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: Norwegian and Övdalian have phonetic transcriptions, see http://www.tekstlab.uio.no/nota/scandiasyn/transcription.html
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: speechAnnotation-orthographicTranscription
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: All languages are orthographically transcribed, see http://www.tekstlab.uio.no/nota/scandiasyn/transcription.html
annotationTool [ComponentId=‘clarin.eu:cr1:c_1355150532326’]:
targetResourceNameURI: Transcriber (http://trans.sourceforge.net/en/presentation.php); ELAN (https://tla.mpi.nl/tools/tla-tools/elan/)
annotationTool [ComponentId=‘clarin.eu:cr1:c_1355150532326’]:
targetResourceNameURI: For Norwegian and Övdalian: https://www.hf.uio.no/iln/english/about/organization/text-laboratory/services/oslo-transliterator/index.html
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: speechGenre
genre: informal
unstandardisedGenre: conversations
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: speechGenre
genre: semi formal
unstandardisedGenre: interviews
timeCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760358’]:
timeCoverage: 1998 - 2015
geographicCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760357’]:
geographicCoverage: 183 places in Norway, Sweden, Denmark, the Faroe Islands, Iceland and Älvdalen
recordingInfo [ComponentId=‘clarin.eu:cr1:c_1426673949970’]:
recordingEnvironment: office
recordingEnvironment: closedPublicPlace
recordingEnvironment: conferenceRoom
recordingEnvironment: lectureRoom
recordingEnvironment: other