N.B. Codes <s> (for -suk/-sük) and <z> (for
-szuk/-szük) are obligatorily followed by an explanation. Codes
<suk> and <zuk> unambiguously stand for hypercorrect uses of
-suk/-sük and -szuk/-szük respectively.
The shortening of phonologically long l, t, d is usually not transcribed, i.e. kelett is recorded in its standard form kellett, ntem as nttem. However, if the shortening results in a form that belongs to another lexeme, it is recorded in the shortened form and is followed by an explanation, e.g. halom <= hallom>.
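The shortening rule above can be sketched as a small decision function. This is only an illustration: the function name `record_shortened` and the `lexicon` set of existing word forms are hypothetical, and a real transcriber would decide by ear and dictionary rather than by lookup.

```python
def record_shortened(heard: str, standard: str, lexicon: set[str]) -> str:
    """Return the form to write down for a shortened l, t, d pronunciation.

    `lexicon` is a hypothetical set of forms that exist as other lexemes.
    """
    if heard != standard and heard in lexicon:
        # The shortened form collides with another lexeme:
        # keep it and add the explanation.
        return f"{heard} <={standard}>"
    # Otherwise the shortening is not transcribed; use the standard form.
    return standard
```

With a lexicon containing halom, `record_shortened("halom", "hallom", {"halom"})` keeps the shortened form with its explanation, while kelett is simply recorded as kellett.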
Overlapping speech is transcribed within asterisks. The speech of
the speaker who was speaking when the overlap began is transcribed until
the end of the overlap. The beginning and the end of the overlap are each
marked with an asterisk.
Underneath follows the overlapping speech of the intervening speaker, also
bounded by asterisks. If the second speaker takes over, his/her speech is
transcribed continuously after the asterisk terminating the overlap. If the
overlap is followed by the speech of the first speaker, then a new line is
opened with the code of the speaker (a or t), followed by
: if the first speaker paused, or by > if s/he carried on
without a pause, e.g.
a: jók vo<:><l>tak a do<:><l>gozatok. 25#25 Sza<l>
ezr<t>,
a *ezr<t> volt*
t: *Igen*.
a> nála k<l>nsen *furcsa az, hogy*
t: *Igen, 25#25 igen*.
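As a rough illustration of the asterisk convention, the following sketch pulls the overlapped stretches out of a single transcript line and checks that every opening asterisk has a closing one. The function names are hypothetical and are not part of any BSI tool.

```python
import re

# A stretch of overlapping speech is bounded by asterisks, e.g. "*Igen*".
OVERLAP_RE = re.compile(r"\*([^*]*)\*")


def overlap_segments(line: str) -> list[str]:
    """Return the asterisk-bounded stretches found in one transcript line."""
    return OVERLAP_RE.findall(line)


def has_balanced_asterisks(line: str) -> bool:
    """Each overlap needs an opening and a closing asterisk (even count)."""
    return line.count("*") % 2 == 0
```

A check like `has_balanced_asterisks` could flag lines where the closing asterisk was forgotten before the transcript is processed further.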
The * can be used word-internally as well. Inside a word it is to be
placed at a syllable boundary, e.g.
t: 25#25 s ezzz 25#25 nem volt megfelel? 25#25 Rosszul esett,
25#25vagy
t 25#25 *nem tartotta megfelelnek*?
a: *Ez most 25#25 a munkám*mal kapcsolatosan van, ugye?
If a word is broken up because of overlapping speech and is then continued, both the end of the first fragment and the beginning of the second are marked with =, e.g.
t: 25#25 *nem tartotta megfelel*=
a: *Hát nem csak az*
t> =nek?
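The = convention makes it possible to rejoin a word that was split by an overlap. A minimal sketch, assuming one speaker's fragments have already been collected in order (the function name `join_broken_words` is hypothetical):

```python
def join_broken_words(fragments: list[str]) -> str:
    """Join one speaker's consecutive fragments into running text.

    A fragment ending in '=' followed by one starting with '=' marks a
    word split by overlapping speech; the two halves are glued together.
    Overlap asterisks are dropped along the way.
    """
    out = ""
    for frag in fragments:
        frag = frag.replace("*", "")  # remove overlap markers
        if out.endswith("=") and frag.startswith("="):
            out = out[:-1] + frag[1:]  # glue the two halves of the word
        elif out:
            out = out + " " + frag
        else:
            out = frag
    return out
```

Applied to the two fragments of speaker t in the example above, this yields the unbroken word again.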
They are standardized and not coded.
The following codes can occur inside words: 25#25, (), *
At the beginning and the end of long pauses and noises, the tape counter setting must be recorded in [ ]; see A.1.18.
Exceptions:
- special words (see dictionary)
- compensatory lengthening (see A.1.20)
- e/ö variants, e.g. fel - föl
- the trtnetibe type
The following phenomena are standardized:
- shortening (see A.1.12)
- deletion (except l, t, d deletion, see A.1.12)
- lengthening (except ss, see A.1.20)
Compensatory lengthening following vowel shortening (e.g. szll, htt) is not recorded.
Distinctly dialectal features (such as diphthongisation) should be recorded in the general profile of the informant. BSI transcripts only monitor e/ö usage.
25#2525#2525#25 is recorded as many times as the informant utters it
but continuous hesitation is transcribed as (see A.1.5).
Deletion of one syllable is phonetically transcribed and then explained
e.g. szveki <=szvetkezeti>
but: szvetkeeti 32#32 szvetkezeti (standardized and not
explained).
BSI version 3 transcripts will transcribe not only the deletion of a whole syllable but also vowel deletion (including the concomitant deletion of neighbouring consonant(s), if any), e.g. tulankppen <=tulajdonkppen>.
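Because explained forms always follow the pattern "transcribed form <=standard form>", they can be collected mechanically. The sketch below is one possible way to do this; the regular expression and the function name `explained_forms` are assumptions, not part of the BSI specification.

```python
import re

# A transcribed form followed by its explanation, e.g. "szveki <=szvetkezeti>".
# Optional spaces around "<=" are tolerated, as in "halom <= hallom>".
EXPLANATION_RE = re.compile(r"(\S+)\s*<=\s*([^<>]+?)\s*>")


def explained_forms(text: str) -> list[tuple[str, str]]:
    """Return (transcribed, standard) pairs for every explanation in `text`."""
    return [(m.group(1), m.group(2)) for m in EXPLANATION_RE.finditer(text)]
```

Codes such as <l> or <:> contain no "=" and are therefore ignored by this pattern.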
keret25#25tet, but: keret -tet (pauses can be marked inside words; hesitation must be marked separately). See A.1.12 for how silence should be recorded.
- ovoda, blcsde, krút, pósta, ntde,
- mit tom n, asszem-asziszem, aszondja
- szal-szoal-sza-szoval
- kommonista, Ejrópa, inekció, Sofiane-Sofian
- spr, sztressz
- gyn
- viszonlag
- m
- oszt <=aztán>
- mert, <=mirt> and derived forms (mer, me, mir, mi).
Each conversation module forms a separate unit of text. Each unit has an identifier and a tape counter setting.
The identifier is made up of 8 characters: the first five are the ID of the informant and the remaining three are the letter code of the conversation module, e.g. B7307bio.
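The structure of the identifier can be illustrated with a short sketch that splits it into its two parts. The names `UnitId` and `parse_unit_id` are hypothetical; only the 5 + 3 character layout comes from the text above.

```python
from typing import NamedTuple


class UnitId(NamedTuple):
    informant: str  # first five characters, e.g. "B7307"
    module: str     # last three characters, e.g. "bio"


def parse_unit_id(identifier: str) -> UnitId:
    """Split an 8-character unit identifier into informant ID and module code."""
    if len(identifier) != 8:
        raise ValueError(f"expected 8 characters, got {len(identifier)}")
    return UnitId(informant=identifier[:5], module=identifier[5:])
```

For B7307bio this gives the informant ID B7307 and the module code bio.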
Important formal conventions:
Each line has 80 characters, divided into the following fixed format:
Figure A.1 illustrates the above conventions. Transcribers were instructed to carefully observe the following points:
The body of transcribed text occupies character positions 17 - 72. The program breaks the lines automatically, so <ENTER> should only be used to insert empty lines to set off text units from each other.
Character position 16 is only indicated at the beginning of each turn. If the turn extends over several lines this position remains empty meaning there was no change of speaker.
Turns must not be separated with empty lines.
Transcribers only need to fill in the speaker and the continuity positions on the left margin. The identifier and the line numbers are supplied automatically. The tape counter setting should be recorded at roughly 2-minute intervals.
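The fixed-width layout lends itself to simple slicing. The sketch below reads only the two fields whose positions are stated above: character position 16 and the text body in positions 17 - 72 (1-based). The function name `split_line` is hypothetical, and the remaining columns are deliberately left uninterpreted because their layout is not given here.

```python
LINE_WIDTH = 80


def split_line(line: str) -> tuple[str, str]:
    """Return (position-16 character, text body) of one transcript line.

    Positions are 1-based in the manual, so position 16 is index 15 and
    the body (positions 17 - 72) is the slice [16:72].
    """
    padded = line.ljust(LINE_WIDTH)  # tolerate lines with trimmed trailing blanks
    marker = padded[15]              # blank when the turn simply continues
    body = padded[16:72].rstrip()
    return marker, body
```

A blank marker then signals a continuation line, i.e. no change of speaker.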