[Note Title]

Research Goal

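Refine a rule-based generation pipeline that produces the AI audience's evaluative backchannel. The pipeline converts presentation speech into STT-based text chunks, infers the current presentation stage, evaluates the utterance content through that stage's appraisal questions, and then selects the function group, pattern, and channel specification in sequence.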

Research Question

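RQ1. What processing steps and intermediate representations should a content-based audience response generation engine consist of?
RQ2. How should the presentation-stage prior and the appraisal vector be converted into function-group weights?
RQ3. Under what constraints should the primary function group, the secondary function group, and the self-regulation interrupt be selected and combined?
RQ4. Through what channel structure and modulation rules should the selected pattern be realized?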

Current Ambiguity

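So far, the appraisal objective for each presentation-structure stage and the basic framework of audience responses have been worked out, but it is still not sufficiently clear which cue should directly trigger which response at generation time. In particular, operational criteria are still needed for three points: which cue to interpret first when presentation-content cues and utterance cues are present at the same time, whether the same cue should be read with a different evaluative meaning depending on the presentation-structure stage, and how much a single strong transient signal should change the default sustained state.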

Backchannel Selection Framework

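This note lays out a rule-based generation pipeline for AI audience backchannel generation.
First, presentation speech is converted into STT-based text chunks and accumulated context.
Second, the current presentation stage and that stage's appraisal objective prior are inferred.
Third, the appraisal vector is converted into function-group weights, and the primary and secondary function groups are then selected.
Fourth, the pattern inside the selected function group is realized as a channel specification, and a self-regulation interrupt is inserted as a separate event when needed.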

Working Criteria

Basic Constraints

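The rules below are the minimum constraints that the audience response generation engine must satisfy at each time point. Function groups, patterns, channels, and interrupt events are selected and combined only within these constraints.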

Rule Type	Constraint	Operational Consequence
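Primary function group	Exactly one primary function group at each time point.	Prevents multiple central function groups from competing at the same time point.
Secondary function group	At most one secondary function group.	Prevents the response from becoming overly composite.
Primary pattern	Exactly one pattern is selected from the primary function group.	The base audience response at that time point is fixed to a single pattern.
Secondary pattern	The secondary function group also contributes only one pattern, attached weakly.	The secondary function group acts only as weak modulation, not as a full replacement.
Overt action	Overt actions are used only in patterns that allow them.	Keeps action episodes from being overused across response time points.
Self-regulation	Self-regulation is evaluated in a separate interrupt layer.	It can be inserted as an independent event, separate from the base response pattern.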
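
The constraints above can be checked mechanically before a response is emitted. The following is a minimal sketch, not the engine's actual data model: `ResponsePlan`, `validate_plan`, and the contents of `ACTION_ALLOWED_PATTERNS` are hypothetical names introduced for illustration. The "one primary, at most one secondary" rules are enforced by the shape of the record itself (single fields), so only the remaining rules need explicit checks. The check that the secondary group differs from the primary group is an added assumption, not a rule stated in the table.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical: the note does not enumerate which patterns allow overt actions.
ACTION_ALLOWED_PATTERNS = {"PATTERN_WITH_ACTION"}


@dataclass
class ResponsePlan:
    """One response at one time point; single fields enforce the 'only one' rules."""
    primary_group: str
    primary_pattern: str
    secondary_group: Optional[str] = None
    secondary_pattern: Optional[str] = None
    overt_action: Optional[str] = None


def validate_plan(plan: ResponsePlan) -> List[str]:
    """Return the violated constraints; an empty list means the plan is valid."""
    errors = []
    # Assumption: a secondary group identical to the primary adds nothing.
    if plan.secondary_group is not None and plan.secondary_group == plan.primary_group:
        errors.append("secondary group duplicates primary group")
    # A secondary pattern only makes sense together with a secondary group.
    if (plan.secondary_group is None) != (plan.secondary_pattern is None):
        errors.append("secondary group and secondary pattern must be set together")
    # Overt actions are restricted to patterns that allow them.
    if plan.overt_action is not None and plan.primary_pattern not in ACTION_ALLOWED_PATTERNS:
        errors.append("overt action used in a pattern that does not allow it")
    return errors
```

Self-regulation is deliberately absent from the record, since it is evaluated in a separate interrupt layer rather than as part of the base response.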

Engine Pipeline Overview

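The content-based audience response generation engine takes STT text, accumulated context, slide information, and presentation order information as input. It generates a response in the following order: stage inference, appraisal scoring, function-group selection, pattern selection, and channel filling.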

Step	Operation	Input	Output
1	STT chunking	presentation audio	current utterance text chunk, chunk timestamp, accumulated context
2	Stage inference	current text + accumulated context + slide / order cue	predicted stage, stage confidence, optional secondary stage
3	Stage prior loading	predicted stage	stage-specific appraisal objective prior
4	Appraisal scoring	current utterance + context + appraisal questions	appraisal vector (0–1)
5	Functional weighting	appraisal vector + stage prior	function-group weight distribution
6	Function-group selection	function-group weights + compatibility constraints	primary function group + optional secondary function group
7	Primary pattern selection	stage + appraisal result + content feature + recent response history	primary pattern
8	Channel realization	selected pattern	body, gaze, core head, facial state, optional head / action / facial detail
9	Secondary modulation	secondary function group	partial channel modulation without full pattern replacement
10	Interrupt insertion	self-regulation evaluation	independent self-regulation event if needed
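
One way to make the intermediate representations in the table concrete is a single state record that each step reads and extends. This is a sketch under assumptions: `EngineState`, `run_pipeline`, and the stub step are hypothetical names, and the field types (for example, strings for channel values) are placeholders rather than decisions made in this note.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class EngineState:
    """Intermediate representation threaded through the ten steps."""
    chunk_text: str                                   # step 1
    timestamp: float                                  # step 1
    context: List[str] = field(default_factory=list)  # step 1, accumulated context
    stage: Optional[str] = None                       # step 2
    stage_confidence: float = 0.0                     # step 2
    appraisal: Dict[str, float] = field(default_factory=dict)      # step 4, values in 0-1
    group_weights: Dict[str, float] = field(default_factory=dict)  # step 5
    primary_group: Optional[str] = None               # step 6
    secondary_group: Optional[str] = None             # step 6
    primary_pattern: Optional[str] = None             # step 7
    channels: Dict[str, str] = field(default_factory=dict)         # steps 8-9
    interrupt: Optional[str] = None                   # step 10
    trace: List[str] = field(default_factory=list)    # names of steps already run


Step = Callable[[EngineState], EngineState]


def run_pipeline(state: EngineState, steps: List[Tuple[str, Step]]) -> EngineState:
    """Apply the steps in order and record each step name in the trace."""
    for name, step in steps:
        state = step(state)
        state.trace.append(name)
    return state


def stub_stage_inference(state: EngineState) -> EngineState:
    """Placeholder for step 2; a real implementation would use text and slide cues."""
    state.stage, state.stage_confidence = "Methods", 0.5
    return state


if __name__ == "__main__":
    start = EngineState(chunk_text="We recruited twenty participants.", timestamp=42.0)
    result = run_pipeline(start, [("stage_inference", stub_stage_inference)])
    print(result.stage, result.trace)
```

Keeping every step as a function from state to state makes it straightforward to swap a rule-based step for another implementation without touching the others.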

Function Group Selection Rules

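Function-group selection is based on weights computed by combining the appraisal vector with the stage prior. Exactly one primary function group is allowed, and at most one secondary function group.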

Selection Target	Rule	Threshold / Constraint	Operational Consequence
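Primary function group	Select the function group with the highest weight.	argmax after compatibility filtering	Determines the central response direction at the current time point.
Secondary function group	Select only one function group that is at or above the threshold and compatible with the primary function group.	weight ≥ secondary threshold	The secondary function group acts only as a weak modulation source.
Incompatible group pair	Incompatible function groups are never selected together.	compatibility table reference	Prevents conflicting response attitudes from being output at the same time.
No valid secondary	If no secondary function group satisfies the conditions, keep only the primary function group.	no compatible candidate above threshold	Skips unnecessary modulation.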
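
The selection rules in the table can be sketched as a small function. Several details are assumptions: the name `select_function_groups`, the default threshold of 0.3, the representation of the compatibility table as a set of unordered pairs, and the `eligible` parameter, which stands in for the "compatibility filtering" applied before the primary argmax (the note does not define that filter).

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple


def select_function_groups(
    weights: Dict[str, float],
    incompatible: Set[FrozenSet[str]],
    secondary_threshold: float = 0.3,   # assumed value, not fixed by the note
    eligible: Optional[Set[str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (primary, secondary); secondary is None when no candidate qualifies."""
    pool = {g: w for g, w in weights.items() if eligible is None or g in eligible}
    if not pool:
        raise ValueError("no eligible function group")

    # Primary: highest weight among eligible groups.
    primary = max(pool, key=lambda g: pool[g])

    # Secondary: at or above the threshold and compatible with the primary.
    candidates = [
        g for g, w in pool.items()
        if g != primary
        and w >= secondary_threshold
        and frozenset((primary, g)) not in incompatible
    ]
    secondary = max(candidates, key=lambda g: pool[g]) if candidates else None
    return primary, secondary


if __name__ == "__main__":
    weights = {
        "attentive_listening": 0.50,
        "evaluative_monitoring": 0.35,
        "affective_reaction": 0.32,
        "uptake_following": 0.10,
    }
    # Hypothetical incompatible pair, for illustration only.
    table = {frozenset(("attentive_listening", "evaluative_monitoring"))}
    print(select_function_groups(weights, table))
```

In the usage example the second-highest group is excluded by the (hypothetical) compatibility table, so the next group above the threshold becomes the secondary.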

Pattern Realization Schema

The selected pattern is realized at the channel level. Every pattern must include the required channels; optional channels are added by condition or by probability.

Channel Type	Channel	Status	Generation Rule
Required	body	mandatory	Every pattern must include a basic posture / orientation specification.
Required	gaze	mandatory	Every pattern must include a primary gaze target and dwell behavior.
Required	core head	mandatory	Every pattern must include a core head configuration or head movement.
Required	facial state	mandatory	The facial state constitutes the core expression of the pattern.
Optional	optional head	conditional	Added only when local head modulation beyond the core head is needed.
Optional	optional action	restricted	An overt action may be added only in patterns that allow it.
Optional	optional facial detail	probabilistic	Facial detail is added probabilistically as a fine-grained facial variation.
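
The schema above distinguishes three kinds of optional channel: conditional, restricted, and probabilistic. The sketch below shows one way these could be handled at realization time. `PatternSpec`, `realize_pattern`, the channel keys, and the example values are all hypothetical. The random generator is passed in explicitly so that the probabilistic facial detail can be made reproducible.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional

REQUIRED_CHANNELS = ("body", "gaze", "core_head", "facial_state")


@dataclass
class PatternSpec:
    name: str
    required: Dict[str, str]               # must cover every required channel
    optional_head: Optional[str] = None    # conditional
    optional_action: Optional[str] = None  # restricted
    action_allowed: bool = False           # whether this pattern permits overt action
    facial_detail: Optional[str] = None    # probabilistic
    facial_detail_prob: float = 0.0


def realize_pattern(spec: PatternSpec, needs_head_modulation: bool,
                    rng: random.Random) -> Dict[str, str]:
    """Fill the channel specification for one pattern."""
    missing = [c for c in REQUIRED_CHANNELS if c not in spec.required]
    if missing:
        raise ValueError(f"pattern {spec.name} lacks required channels: {missing}")

    channels = dict(spec.required)
    # Conditional: only when extra local head modulation is needed.
    if needs_head_modulation and spec.optional_head is not None:
        channels["optional_head"] = spec.optional_head
    # Restricted: only in patterns that allow overt action.
    if spec.action_allowed and spec.optional_action is not None:
        channels["optional_action"] = spec.optional_action
    # Probabilistic: added with the pattern's own probability.
    if spec.facial_detail is not None and rng.random() < spec.facial_detail_prob:
        channels["optional_facial_detail"] = spec.facial_detail
    return channels


if __name__ == "__main__":
    spec = PatternSpec(
        name="example_listening",  # hypothetical pattern
        required={"body": "upright", "gaze": "speaker",
                  "core_head": "still", "facial_state": "neutral"},
        optional_head="slight_tilt",
        facial_detail="brow_raise",
        facial_detail_prob=0.3,
    )
    print(realize_pattern(spec, needs_head_modulation=True, rng=random.Random(0)))
```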

Secondary Modulation Rules

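The secondary function group does not generate an independent full pattern. Instead, it modulates only a limited subset of channels on top of the pattern selected by the primary function group.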

Secondary Function Role	Allowed Modulation	Not Allowed	Operational Consequence
Affective reaction	facial intensity, gaze energy, slight head responsiveness	full posture replacement	Keeps the base pattern and only slightly raises affective responsiveness.
Attentive listening	gaze stabilization, posture openness increase	overt action insertion	Adds a weak listening bias on top of the base pattern.
Evaluative monitoring	brow tension, gaze fixation, reduced positive detail	independent transient override	Strengthens evaluative gaze without replacing the overall pattern.
Uptake-following	nod readiness, forward attentional bias	fatigue-like shift	Only slightly reinforces the tendency to track the content.
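
The "allowed modulation" column can be read as a whitelist of parameters that each secondary function group may touch. The sketch below assumes that channel parameters are numeric values in the range 0 to 1 and that secondary influence is scaled by a small gain. The parameter names, the `gain` default, and the function name `apply_secondary_modulation` are illustrative assumptions derived from the table, not definitions from this note.

```python
from typing import Dict

# Parameter names are paraphrased from the "Allowed Modulation" column.
ALLOWED_MODULATION = {
    "affective_reaction": {"facial_intensity", "gaze_energy", "head_responsiveness"},
    "attentive_listening": {"gaze_stability", "posture_openness"},
    "evaluative_monitoring": {"brow_tension", "gaze_fixation", "positive_detail"},
    "uptake_following": {"nod_readiness", "forward_bias"},
}


def apply_secondary_modulation(
    base: Dict[str, float],
    secondary_group: str,
    deltas: Dict[str, float],
    gain: float = 0.2,  # assumed: keeps the secondary influence weak
) -> Dict[str, float]:
    """Return a modulated copy of the base parameters; the base pattern is never replaced."""
    allowed = ALLOWED_MODULATION[secondary_group]
    out = dict(base)
    for name, delta in deltas.items():
        if name not in allowed:
            continue  # silently drop modulation outside the whitelist
        value = out.get(name, 0.0) + gain * delta
        out[name] = min(1.0, max(0.0, value))  # clamp to the assumed 0-1 range
    return out


if __name__ == "__main__":
    base = {"facial_intensity": 0.5, "posture_openness": 0.5}
    requested = {"facial_intensity": 1.0, "posture_openness": 1.0}
    # Only facial_intensity changes, since posture is outside the
    # affective-reaction whitelist.
    print(apply_secondary_modulation(base, "affective_reaction", requested))
```

Dropping disallowed deltas rather than raising an error is a design choice of this sketch. A stricter engine might treat them as rule violations instead.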

Self-regulation Interrupt Rules

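Self-regulation is evaluated in an interrupt layer that is separate from the base response pattern. It is inserted as an independent event only when it exceeds the threshold, and after it ends the engine returns to the previous sustained pattern or to a strained pattern.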

Condition	Interrupt Event	Insertion Rule	Recovery Rule
processing burden spike	ACT_briefSelfTouch	Inserted when the self-regulation threshold is exceeded	After it ends, return to the previous pattern or a strained pattern
postural discomfort accumulation	ACT_seatAdjustment	Inserted when a sustained discomfort signal accumulates	After it ends, return to the basic listening pattern
low-level tension / restlessness	ACT_fidgeting	Inserted when intermittent micro self-regulation is needed	After it ends, keep the existing sustained state
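
The interrupt table can be sketched as a rule list evaluated independently of the base pattern. The event identifiers come from the table. The signal names, the threshold values, the priority order (first matching rule wins), and the recovery labels are assumptions made for illustration. The sketch also assumes that accumulation of the discomfort signal happens upstream, so each signal arrives as a single value in the range 0 to 1.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class InterruptRule:
    event: str        # event identifier from the table
    signal: str       # hypothetical name of the monitored signal
    threshold: float  # assumed value; the note does not fix thresholds
    recovery: str     # label for the state to return to afterwards


# Order encodes an assumed priority: the first rule whose threshold is exceeded wins.
INTERRUPT_RULES: List[InterruptRule] = [
    InterruptRule("ACT_briefSelfTouch", "processing_burden", 0.8, "previous_or_strained"),
    InterruptRule("ACT_seatAdjustment", "postural_discomfort", 0.7, "basic_listening"),
    InterruptRule("ACT_fidgeting", "tension", 0.6, "keep_sustained"),
]


def evaluate_interrupt(signals: Dict[str, float]) -> Optional[InterruptRule]:
    """Return the interrupt to insert, or None when no threshold is exceeded."""
    for rule in INTERRUPT_RULES:
        if signals.get(rule.signal, 0.0) > rule.threshold:
            return rule
    return None


if __name__ == "__main__":
    hit = evaluate_interrupt({"processing_burden": 0.9, "tension": 0.65})
    print(hit.event if hit else "no interrupt")
```

Because the function only reads the self-regulation signals, it can run at every time point without affecting function-group or pattern selection.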

Stage Prior Table

Given the stage inference result, the presentation-stage prior acts as a prior bias on appraisal scoring and function-group weighting. The stage categories in this table are adapted from the rhetorical move structure of 3MT presentations proposed by Hu and Liu (2018).

Stage	Primary Appraisal Objective	Secondary Appraisal Objective	Primary Function Prior	Secondary Function Prior
Orientation	Relevance	Implication	Attentive Listening	Affective Reaction
Rationale	Relevance	Implication	Attentive Listening	Evaluative Monitoring
Framework	Normative Significance	Implication	Evaluative Monitoring	Attentive Listening
Purpose	Relevance	Implication	Attentive Listening	Evaluative Monitoring
Methods	Coping Potential	Normative Significance / Implication	Evaluative Monitoring	Uptake-Following
Results	Implication	Relevance	Uptake-Following	Evaluative Monitoring
Implication	Implication	Normative Significance	Evaluative Monitoring	Affective Reaction
Termination	Normative Significance	Relevance	Evaluative Monitoring	Affective Reaction
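
How the stage prior should enter the function-group weights is still open in this note (RQ2), so the following is only one possible sketch: an additive bonus for the stage's primary and secondary function priors, followed by normalization. The function-prior pairs are copied from the table. The bonus values, the function name `weight_function_groups`, and the assumption that appraisal scoring has already been reduced to a per-group evidence score are all hypothetical.

```python
from typing import Dict

GROUPS = ("attentive_listening", "affective_reaction",
          "evaluative_monitoring", "uptake_following")

# (primary function prior, secondary function prior) per stage, from the table.
STAGE_FUNCTION_PRIOR = {
    "Orientation": ("attentive_listening", "affective_reaction"),
    "Rationale": ("attentive_listening", "evaluative_monitoring"),
    "Framework": ("evaluative_monitoring", "attentive_listening"),
    "Purpose": ("attentive_listening", "evaluative_monitoring"),
    "Methods": ("evaluative_monitoring", "uptake_following"),
    "Results": ("uptake_following", "evaluative_monitoring"),
    "Implication": ("evaluative_monitoring", "affective_reaction"),
    "Termination": ("evaluative_monitoring", "affective_reaction"),
}


def weight_function_groups(
    stage: str,
    evidence: Dict[str, float],
    primary_bonus: float = 0.3,     # assumed value
    secondary_bonus: float = 0.15,  # assumed value
) -> Dict[str, float]:
    """Combine per-group evidence with the stage prior into a normalized distribution."""
    primary, secondary = STAGE_FUNCTION_PRIOR[stage]
    raw = {g: max(evidence.get(g, 0.0), 0.0) for g in GROUPS}
    raw[primary] += primary_bonus
    raw[secondary] += secondary_bonus
    total = sum(raw.values())
    if total == 0.0:
        # Degenerate case (no evidence, zero bonuses): fall back to uniform weights.
        return {g: 1.0 / len(GROUPS) for g in GROUPS}
    return {g: v / total for g, v in raw.items()}


if __name__ == "__main__":
    print(weight_function_groups("Results", {"affective_reaction": 0.2}))
```

With an additive prior, strong evidence for a non-prior group can still outweigh the stage bias. A multiplicative prior would behave differently, and choosing between the two is part of what RQ2 asks.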

Open Questions

-

Next Step

-

References

Hu, G., & Liu, Y. (2018). Three minute thesis presentations as an academic genre: A cross-disciplinary study. Journal of English for Academic Purposes, 35, 16–30. https://doi.org/10.1016/j.jeap.2018.06.004
Scherer, K. R., Schorr, A., & Johnstone, T. (2023). Appraisal processes in emotion: Theory, methods, research (1st ed.).