CMDI 1.1. Metadata
Header
MdCreator: Kristin Hagen
MdCreationDate: 2016-10-27
MdSelfLink:
MdProfile: clarin.eu:cr1:p_1407745711925
MdCollectionDisplayName: Clarino - Textlab
Resources
ResourceProxyList:
JournalFileProxyList:
ResourceRelationList:
IsPartOfList:
Components
corpusProfile:
resourceCommonInfo [ComponentId=‘clarin.eu:cr1:c_1396012485126’]:
resourceType: corpus
identificationInfo [ComponentId=‘clarin.eu:cr1:c_1396012485125’]:
resourceName [xml:lang=‘nb’]: SKRIV-korpuset
resourceName [xml:lang=‘en’]: The SKRIV Corpus
description [xml:lang=‘en’]: Texts written by students in upper secondary vocational education programs. The corpus is especially suitable for the analysis of texts written by students with Norwegian as their second language.
The corpus contains approximately 225 texts and 112 000 words. The texts differ in length, genre and type.

The corpus has three versions of each text: a scanned original in PDF format and two transcribed versions in TXT format, one original transcription with errors and one in which the errors are corrected.
All versions are linked, and both transcribed versions are searchable.
description [xml:lang=‘nb’]: SKRIV-korpuset (Skriving i videregående skole) består av tekster skrevet av elever i videregående opplæring på yrkesfaglige utdanningsprogrammer. Det er spesielt tilrettelagt for analyse av tekster skrevet av elever som har norsk som sitt andrespråk.

Materialet er autentiske elevtekster fra tentamener, skolearbeid og praksisuker. De er skrevet innenfor norskfaget og innenfor elevenes ulike programfag fra Bygg- og anleggsteknikk, Service og samferdsel, Elektrofag og Helse- og oppvekstfag.

Korpuset rommer rundt 225 tekster av ulik lengde og i ulike sjangere og teksttyper, ca 112 000 ord.

Tekstene er samlet inn ved tre ulike skoler - en storbyskole, en skole i en mindre by og en skole på et tettsted. Skriverne er både elever med norsk som førstespråk og minoritetsspråklige elever med norsk som sitt andrespråk, eller flerspråklige elever. Til tekstene er det knyttet opplysninger om elevenes morsmål og antall år i norsk skole.

De fleste tekstene finnes i tre utgaver: en skannet original i pdf-format og to transkriberte i txt-format, den ene versjonen med feil. I den andre versjonen er feilene rettet.
Versjonene er lenket til hverandre og det er mulig å søke i begge de transkriberte versjonene.
resourceShortName: SKRIV
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/prosjekter/skriv/
PID: http://hdl.handle.net/11538/0000-000B-C01F-A
distributionInfo [ComponentId=‘clarin.eu:cr1:c_1396012485124’]:
licenceInfo [ComponentId=‘clarin.eu:cr1:c_1396012485158’]:
userCategory: Academic
distributionAccessMedium: accessibleThroughInterface
executionLocation: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/prosjekter/skriv/
licence [ComponentId=‘clarin.eu:cr1:c_1447674760330’]:
licenceFamily: CLARIN
licenceName: CLARIN_ACA-NC-LOC-ND
licenceURL: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&NORED=1&ND=1
conditionsOfUse: BY
conditionsOfUse: ID
conditionsOfUse: LOC
conditionsOfUse: NC
conditionsOfUse: ND
conditionsOfUse: NORED
nonStandardConditionsOfUse: Due to agreements with the text contributors, the texts are only available through Glossa, a search and post-processing tool developed by the Text Laboratory.
licensor:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: elisabeth.selj@iln.uio.no
url: http://www.hf.uio.no/iln/personer/vit/eselj/index.html
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
distributionRightsHolder:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName [xml:lang=‘en’]: University of Oslo
organizationName [xml:lang=‘no’]: Universitetet i Oslo
organizationShortName [xml:lang=‘no’]: UiO
organizationShortName [xml:lang=‘en’]: UoO
departmentName [xml:lang=‘en’]: Department of Linguistics and Scandinavian Studies
departmentName [xml:lang=‘no’]: Institutt for lingvistiske og nordiske studier (ILN)
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/english/about/organization/text-laboratory/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
contact:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: organization
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/english/about/organization/text-laboratory/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Selj
givenName: Elisabeth
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: elisabeth.selj@iln.uio.no
metadataInfo [ComponentId=‘clarin.eu:cr1:c_1407745711922’]:
metadataCreationDate: 2017-03-21
metadataLastDateUpdated: 2017-08-14
metadataCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Hagen
givenName: Kristin
organizationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711883’]:
organizationName: The Text Laboratory
organizationShortName: Textlab
departmentName: Department of Linguistics and Scandinavian Studies, University of Oslo
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: kristin.hagen@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zipCode: 0317
city: OSLO
country: Norway
versionInfo [ComponentId=‘clarin.eu:cr1:c_1430905751648’]:
version: 2
lastDateUpdated: 2016-04-01
resourceDocumentationInfo [ComponentId=‘clarin.eu:cr1:c_1355150532301’]:
toolDocumentationType: online
documentationUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532302’]:
role: documentation
documentUnstructured: http://www.tekstlab.uio.no/nota/skriv/
resourceCreationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711921’]:
creationStartDate: 2013-01-01
creationEndDate: 2016-04-01
resourceCreator:
actorInfo [ComponentId=‘clarin.eu:cr1:c_1396012485194’]:
actorType: person
personInfo [ComponentId=‘clarin.eu:cr1:c_1396012485192’]:
surname: Selj
givenName: Elisabeth
sex: female
communicationInfo [ComponentId=‘clarin.eu:cr1:c_1352813745460’]:
email: elisabeth.selj@iln.uio.no
fundingProject:
projectInfo [ComponentId=‘clarin.eu:cr1:c_1430905751647’]:
projectName: SKRIV
fundingType: ownFunds
funder: Department of Linguistics and Scandinavian Studies, University of Oslo
corpusInfo [ComponentId=‘clarin.eu:cr1:c_1407745711878’]:
corpusType: Written Corpus
corpusPartInfo [ComponentId=‘clarin.eu:cr1:c_1407745711885’]:
mediaType: text
corpusTextInfo [ComponentId=‘clarin.eu:cr1:c_1396012485188’]:
textFormatInfo [ComponentId=‘clarin.eu:cr1:c_1427452477072’]:
mimeType: txt
characterEncodingInfo [ComponentId=‘clarin.eu:cr1:c_1447674760355’]:
characterEncoding: utf-8
sizePerCharacterEncoding [ComponentId=‘clarin.eu:cr1:c_1447674760346’]:
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 112 000
sizeUnit: tokens
corpusPartGeneralInfo [ComponentId=‘clarin.eu:cr1:c_1407745711882’]:
sourceWorkInfo [ComponentId=‘clarin.eu:cr1:c_1407745712071’]:
workDescription: Texts written by students in upper secondary education programs. The texts differ in length, genre and type.
languageInfo [ComponentId=‘clarin.eu:cr1:c_1428388179423’]:
languageId: nb
languageName: Norwegian Bokmål
modalityInfo [ComponentId=‘clarin.eu:cr1:c_1447674760356’]:
modalityType: writtenLanguage
sizeInfo [ComponentId=‘clarin.eu:cr1:c_1353678848785’]:
size: 112 000
sizeUnit: tokens
annotationInfo [ComponentId=‘clarin.eu:cr1:c_1407745711924’]:
annotationType: lemmatization
annotationType: morphosyntacticAnnotation-posTagging
segmentationLevel: word
tagset: The Oslo-Bergen Tagger tagset: http://tekstlab.uio.no/obt-ny/english/index.html
tagsetLanguageId: nb
tagsetLanguageName: Norwegian Bokmål
theoreticModel: Constraint Grammar
annotationMode: automatic
annotationManualUnstructured [ComponentId=‘clarin.eu:cr1:c_1355150532325’]:
role: annotationManual
documentUnstructured: http://www.tekstlab.uio.no/obt-ny/english/index.html
annotationTool [ComponentId=‘clarin.eu:cr1:c_1355150532326’]:
targetResourceNameURI: The Oslo-Bergen Tagger: http://tekstlab.uio.no/obt-ny/english/index.html
classificationInfo [ComponentId=‘clarin.eu:cr1:c_1403588862809’]:
genreInfo [ComponentId=‘clarin.eu:cr1:c_1407745711877’]:
genreType: textGenre
genre: unstandardised
unstandardisedGenre: Texts written by students in upper secondary education programs.

The texts are available in three versions: a scanned original in PDF format and two transcribed versions in TXT format, one original transcription with errors and one in which the errors are corrected.
All versions are linked, and both transcribed versions are searchable.
timeCoverageInfo [ComponentId=‘clarin.eu:cr1:c_1447674760358’]:
timeCoverage: The texts were mostly written in 2012