CMDI 1.1. Metadata
Header
MdCreator: Kristin Hagen
MdCreationDate: 2018-09-26
MdSelfLink:
MdProfile: clarin.eu:cr1:p_1407745711925
MdCollectionDisplayName: Clarino - Textlab
Resources
ResourceProxyList:
ResourceProxy [id=‘nota-oslo-lp’]:
ResourceType [mimetype=‘’]: LandingPage
ResourceRef: http://www.tekstlab.uio.no/nota/oslo/english.html
JournalFileProxyList:
ResourceRelationList:
IsPartOfList:
Components
corpusProfile:
resourceCommonInfo [ComponentId=‘clarin.eu:cr1:c_1396012485126’]:
resourceType: corpus
identificationInfo [ComponentId=‘clarin.eu:cr1:c_1396012485125’]:
resourceName [xml:lang=‘nb’]: LIA-korpuset for norske dialekter
resourceName [xml:lang=‘en’]: The LIA Corpus for Norwegian Dialects
description [xml:lang=‘en’]: The LIA corpus is a speech corpus of old recordings (1950 - 1990) from four Norwegian universities: NTNU, UoB, UoO and UoT. The recordings were made mainly for dialect and onomastic research, and the interviews and conversations are typically about old trades such as agriculture, fisheries, logging and life at the summer farm. Other topics are weaving, knitting, baking and dialects. The recordings are semi-formal or informal and often take place in an informant’s home.
The first version of the corpus has 1.5 million tokens and 620 speakers from 120 places in Norway.
The corpus is morphologically tagged.
description [xml:lang=‘nb’]: LIA-korpuset er et talespråkskorpus med gamle opptak (1950 - 1990) fra fire norske universitet: NTNU, UiB, UiO og UiT. Opptakene er gjort for dialektforskning og navneforskning, og handler ofte om landbruk, skogbruk, fiske, livet på setra og gamle håndverkstradisjoner. Som regel er opptakene gjort i private hjem, og intervjuene og samtalene er ganske uformelle.

Den første versjonen av korpuset inneholder 1.5 millioner tokens og 620 talere fra 120 steder i Norge. Korpuset er morfologisk tagget.
resourceShortName [xml:lang=‘en’]: The LIA Corpus
resourceShortName [xml:lang=‘nb’]: LIA-korpuset
url: http://tekstlab.uio.no/LIA/norsk/index.html
distributionInfo [ComponentId=‘clarin.eu:cr1:c_1396012485124’]:
licenceInfo [ComponentId=‘clarin.eu:cr1:c_1396012485158’]:
userCategory: Academic
distributionAccessMedium: accessibleThroughInterface
executionLocation: http://tekstlab.uio.no/LIA/norsk/index.html
licence [ComponentId=‘clarin.eu:cr1:c_1447674760330’]:
licenceFamily: CLARIN
licenceName: CLARIN_ACA-NC-LOC-PRIV-ND-*
licenceURL: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&PRIV=1&NORED=1&ND=1
conditionsOfUse: *
conditionsOfUse: BY
conditionsOfUse: ID
conditionsOfUse: LOC
conditionsOfUse: NC
conditionsOfUse: ND
conditionsOfUse: NORED
conditionsOfUse: PRIV
nonStandardConditionsOfUse: The corpus has audio and video recordings classified as personal data. In agreement with NSD, the Data Protection Official in Norway, the corpus is accessible only through Glossa, a search and post-processing tool developed by the Text Laboratory.
The audio excerpts provided by the search interface cannot be shown in public unless you have an agreement with the Text Laboratory.
Please note that every individual researcher is responsible for treating the participants in the corpus with respect and sincerity. Furthermore, the participants must be kept anonymous in every published paper or other output.
licensor:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
distributionRightsHolder:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
contact:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
metadataInfo [ComponentId=‘clarin.eu:cr1:c_1407745711922’]:
metadataCreationDate: 2018-09-26
metadataLastDateUpdated: 2018-11-19
metadataCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Hagen
givenName: Kristin
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: kristin.hagen@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
versionInfo [ComponentId=‘clarin.eu:cr1:c_1430905751648’]:
version: First version (2018)
validationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711923’]:
validated: true
validationType: content
validationMode: manual
validationModeDetails: The transcriptions are proofread against the audio files.
validationExtent: partial
validator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The LIA project
organizationShortName: LIA
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
resourceDocumentationInfo [ComponentId=‘clarin.eu:cr1:c_1355150532301’]:
documentationStructured [ComponentId=‘clarin.eu:cr1:c_1361876010648’]:
role: documentation
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: other
title: Heimesida til LIA-korpuset for norske dialekter
url: http://tekstlab.uio.no/LIA/norsk/index.html
resourceCreationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711921’]:
creationStartDate: 2014-04-01
creationEndDate: 2018-06-30
resourceCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The LIA project
(Project participants and employees in the LIA project)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://tekstlab.uio.no/LIA/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName [xml:lang=‘nb’]: LIA (Language Infrastructure made Accessible)
projectShortName: LIA
projectID: 22 59 41
url: http://tekstlab.uio.no/LIA/
url: https://www.hf.uio.no/iln/english/research/projects/language-infrastructure-made-accessible/index.html
fundingType: nationalFunds
funder: The Research Council of Norway
fundingCountry: Norway
projectStartDate: 2014-01-04
projectEndDate: 2019-12-31
corpusInfo [ComponentId=‘clarin.eu:cr1:c_1407745711878’]:
corpusType: Multimodal Corpus
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: text
corpusTextInfo [ComponentId=‘clarin.eu:cr1:c_1396012485188’]:
textFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477072’]:
mimeType: txt
sizePerTextFormat [ComponentId=‘clarin.eu:cr1:c_1447674760342’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 1 511 245
sizeUnit: tokens
characterEncodingInfo [ComponentId=‘clarin.eu:cr1:c_1447674760355’]:
characterEncoding: utf-8
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: audio
corpusAudioInfo [ComponentId=‘clarin.eu:cr1:c_1404130561236’]:
audioSizeInfo [ComponentId=‘clarin.eu:cr1:c_1360230992160’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: Approx. 12.3 GB
sizeUnit: gb
settingInfo [ComponentId=‘clarin.eu:cr1:c_1360230992162’]:
naturality: spontaneous
conversationalType: dialogue
audience: few
interactivity: overlapping
interaction: Semi-formal or informal interviews with one or more interviewers. Often the recordings are more like conversations. The recordings are mostly from people’s homes.
audioFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477070’]:
mimeType: wav and mp3
recordingQuality: medium
compressionInfo [ComponentId=‘clarin.eu:cr1:c_1360230992165’]:
compression: true
compressionName: mp3
corpusPartGeneralInfo [ComponentId=‘clarin.eu:cr1:c_1407745711882’]:
personSourceSetInfo [ComponentId=‘clarin.eu:cr1:c_1360931019775’]:
numberOfPersons: 620
ageOfPersons: teenager
ageOfPersons: adult
ageOfPersons: elderly
ageRangeStart: 10
ageRangeEnd: 99
sexOfPersons: mixed
originOfPersons: native
dialectAccentOfPersons: Dialects from 132 places in Norway
geographicDistributionOfPersons: All over Norway
lingualityInfo [ComponentId=‘clarin.eu:cr1:c_1355150532313’]:
lingualityType: monolingual
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: no
languageName: Norwegian
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: nn
languageName: Norwegian Nynorsk
modalityInfo [ComponentId=‘clarin.eu:cr1:c_1447674760356’]:
modalityType: spokenLanguage
modalityTypeDetails: Two annotation modes: one phonetic (using the Norwegian alphabet) and one orthographic.
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 1 511 245
sizeUnit: tokens
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: morphosyntacticAnnotation-posTagging
annotatedElements: other
segmentationLevel: word
tagset: POS tagset created for the statistical LIA tagger, based on the tagset of the Oslo-Bergen Tagger.
tagsetLanguageId: nn
tagsetLanguageName: Norwegian Nynorsk
theoreticModel: MarMoT
annotationMode: automatic
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: speechAnnotation-orthographicTranscription
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: Orthographic transcription, cf. Nynorskordboka: https://ordbok.uib.no/
annotationManualStructured [ComponentId=‘clarin.eu:cr1:c_1361876010647’]:
role: annotationManual
documentInfo [ComponentId=‘clarin.eu:cr1:c_1353678848788’]:
documentType: manual
title [xml:lang=‘nn’]: Transkripsjonsrettleiing for LIA
author: Kristin Hagen and Live Håberg and Eirik Olsen and Åshild Søfteland
year: 2018
url: http://tekstlab.uio.no/LIA/pdf/transkripsjonsrettleiing_lia.pdf
annotationTool [ComponentId=‘clarin.eu:cr1:c_1355150532326’]:
targetResourceNameURI: https://www.hf.uio.no/iln/english/about/organization/text-laboratory/services/oslo-transliterator/index.html
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: speechGenre
genre: informal
unstandardisedGenre: conversations and informal interviews
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: speechGenre
genre: semi formal
unstandardisedGenre: interviews
timeCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760358’]:
timeCoverage: 1951 - 1995
geographicCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760357’]:
geographicCoverage: All over Norway
recordingInfo [ComponentId=‘clarin.eu:cr1:c_1426673949970’]:
recordingDeviceType: tapeVHS
recordingEnvironment: other