Questions of Syntax and Discourse in Pronoun Resolution: Can corpus studies further the understanding of co-reference?
San Diego State University

Sophia called Olivia when she returned from her vacation. Who returned from her vacation? Sophia? Olivia? Alexander had lunch with John after he finished his reports. Who finished his reports? Alexander? John? These two sentences illustrate the problem of ambiguous anaphora resolution, and of co-reference in general. How does one decide to whom the pronoun in the dependent clause refers? What might make one of the names a more likely referent than the other? In this paper, I begin by summarizing the syntactic and discourse-related factors identified in anaphora resolution across a variety of languages, and then discuss the contributions corpus studies could make to a better understanding of co-reference.

Most linguists agree that co-reference proceeds in several stages, usually presented as the following three steps (de la Fuente, Hemforth, Colonna, and Schimke, 2016). First, when a reader or listener encounters an anaphor, they gather all of the possible referents available within the current discourse context, without yet prioritizing them in any way. Second, these potential candidates are sorted and eliminated through a series of “morphosyntactic constraints, such as number, gender, person, binding, etc.” (de la Fuente et al., 2016). Lastly, if more than one potential antecedent remains for that anaphor, a selection is made “based on some combination of soft constraints or heuristics (e.g. syntactic function, parallelism, thematic role, etc.).” Although there are various more specific theories that aim to explain these steps, such as binding theory (Chomsky, 1981), the memory focus model (Garrod and Sanford, 1983), and centering theory (Gordon et al., 1993), they all involve those three steps. However, ambiguous pronouns, such as the ones in the sentences above, provide ground for studies that attempt to better understand the more nuanced factors at play in the third step, where a referent is chosen.

Syntactic Considerations in Anaphora Resolution

First, it must be noted that the following observations concern a variety of languages whose syntactic systems may differ. Also, as mentioned in the second step of the co-reference process above, morphosyntactic constraints, such as matching gender, number, and person, are always taken into consideration. But what about other aspects of syntax? In English, syntactic position affects the likelihood of a word being chosen as a referent; specifically, a word in subject position is more in focus for a reader than an object or oblique (Gordon and Scearce, 1995). In the examples from the introduction, Sophia and Alexander might therefore be the preferred antecedents because they are the subjects of their main clauses (although, as we will see in the next section on discourse, other factors may be prioritized over syntactic position). However, there are other syntactic structures that can draw focus to a particular word even more prominently than the subject role. Consider the following sentences:

(1) It was John who stole the money.
(2) There was a banker who stole a lot of money.
In several studies, Almor (1999) and Morris and Folk (1998) found that when speakers use it-clefts (as in sentence 1) or there-insertions (as in sentence 2), listeners resolve a subsequent anaphor to John or a banker more quickly than if either had been in the subject position of a typical subject-verb-object sentence. Typically, the most likely or easiest referent to choose is the first-mentioned subject in a conventional declarative English sentence, but the emphasis created by these syntactic constructions changes the focus of the hearer or reader.

In pro-drop or null subject languages, the idea of the subject being the most common referent for an ambiguous pronoun is further complicated. Carminati (2002) found that in Italian there is a complementary division of preferences for null and overt pronouns. In her study, participants were presented with sentences that contained ambiguous intrasentential anaphora, half with null pronouns (as in 3 below) and the other half with overt pronouns (as in 4 below).

(3) Marta scriveva frequentemente a Piera quando era negli Stati Uniti.
    Marta wrote frequently to Piera when ∅ was in the United States.
(4) Marta scriveva frequentemente a Piera quando lei era negli Stati Uniti.
    Marta wrote frequently to Piera when she was in the United States.

These items were followed by multiple-choice comprehension questions (such as “Who was in the United States?”). Carminati found that when the ambiguous pronoun was null, the subject was chosen 81% of the time, whereas the object was chosen 19% of the time. Conversely, when the pronoun was overt, the object antecedent was chosen 83% of the time, and the subject 17% of the time (Carminati, 2002). What does this tell us about the process of co-reference?
Carminati built her hypothesis and study on earlier observations by Calabrese (1986), who argues that “the null and overt pronoun are in complementary distribution, where ∅ prefers an ‘expected referent’ and the pronoun a referent which is ‘unexpected’.” However, Carminati’s findings show, as she anticipated, that the syntax of a sentence and the structural basis of pronoun-antecedent attachment, rather than semantic “expectations” alone, play a key role in resolution. In null subject languages, resolution therefore cannot be reduced to the assumption that the processor will prefer whatever is in the subject position of a typical subject-verb-object sentence. Yet subsequent studies of Carminati’s hypothesis, the Position of Antecedent Hypothesis (PAH), found that further manipulations of sentence syntax yielded varying results.

In sentences 3 and 4 above from Carminati’s study, the independent or main clause of the sentence is followed by a dependent clause containing the pronoun (whether null or overt), resulting in forward anaphora. But what if the order of clauses, and consequently the type of anaphora, were reversed? Would there still be a complementary distribution of antecedent preferences for null and overt pronouns? Sorace and Filiaci (2006) investigated this question using data from two groups of participants: native speakers of Italian and adults who had learned Italian to a high (near-native) degree of proficiency. They used a Picture Verification Task designed to examine the interpretation of null and overt subjects in contexts of both backward and forward anaphora. The task consisted of 20 experimental items, half with forward anaphora and half with backward anaphora; within each type, half of the items presented an overt pronominal subject in the subordinate clause and the other half a null subject, as seen in the translation of (5) below.
(5a) “While she/pro is wearing her coat, the mother kisses her daughter.” (backward)
(5b) “The mother kisses her daughter, while she/pro is wearing her coat.” (forward)

The results of the study supported the prediction that clause order (and consequently the type of anaphora, whether forward or backward) would alter the preference for subject or object as antecedent. They also added to Carminati’s theory, in that pronoun attachment seems to differ between forward and backward anaphora sentences: the subject was preferred for a null pronoun only in backward anaphora sentences, whereas in forward anaphora sentences preferences were more evenly divided between the subject and the object of the main clause.

In summary, then, the syntactic effects on pronoun resolution seen thus far are the significant promotion of an antecedent that is in subject position or that follows an it-cleft or there-insertion in English, the effect of pronoun type in null subject languages, and the point of subordination for intrasentential anaphora, whether it results in forward or backward anaphora. Yet, as Sorace and Filiaci found, syntactic effects cannot be considered in isolation from their relationship with elements of discourse.

Discourse Effects in Anaphora Resolution

The less strictly defined “soft constraints” that play a role in the third step of co-reference, selecting the best antecedent from potentially multiple candidates, are often semantic, but broader discourse context must be considered as well. For example, as Van den Hoven and Ferstl (2018) found, lexical semantics tip the balance in favor of different antecedents in the following two sentences:

(6a) Sally amazed Mary because she…
(6b) Sally loved Mary because she…

The implicit causality of the verbs makes Sally the natural antecedent for “she” in (6a), whereas “she” in (6b) seems to refer to Mary (Van den Hoven and Ferstl, 2018).
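As a rough illustration of how such a verb bias might be encoded, the sketch below stores, for each verb, whether its implicit causality favors the first or the second noun phrase in a “NP1 verb NP2 because pronoun…” frame. The two entries simply mirror (6a) and (6b); the table `IMPLICIT_CAUSALITY_BIAS` and the function `default_antecedent` are hypothetical names introduced here and are not taken from Van den Hoven and Ferstl (2018).

```python
# Hypothetical mini-lexicon: which noun phrase a verb's implicit causality
# favors in a "NP1 VERB NP2 because PRONOUN ..." frame.
# The two entries mirror examples (6a) and (6b); real norms would be graded.
IMPLICIT_CAUSALITY_BIAS = {
    "amazed": "NP1",  # (6a) Sally amazed Mary because she... -> Sally
    "loved": "NP2",   # (6b) Sally loved Mary because she...  -> Mary
}


def default_antecedent(np1, verb, np2):
    """Return the antecedent favored by the verb's bias, or None if unknown.

    This is only a default preference, not a hard rule.
    """
    bias = IMPLICIT_CAUSALITY_BIAS.get(verb)
    if bias == "NP1":
        return np1
    if bias == "NP2":
        return np2
    return None  # verb not in the toy lexicon: no prediction


print(default_antecedent("Sally", "amazed", "Mary"))
print(default_antecedent("Sally", "loved", "Mary"))
```

The lookup returns a preferred candidate rather than a final answer, which matches the status of implicit causality as a soft constraint.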
And yet, studies with discourse manipulations have shown that a broader context for these sentences can overcome the bias, or implicit causality, indicated by lexical semantics. For example, a story about Mary’s low standards could end with (6a), concluding that “Sally amazed Mary because she was very easily impressed,” in which case, against the expectations one might have about the verb and its semantic roles, “she” would refer to Mary (Van den Hoven and Ferstl, 2018).

So what aspects of discourse are critical for pronoun resolution? In Segmented Discourse Representation Theory (SDRT), a model proposed by Asher and Vieu (2005), discourse structure is hierarchical, and this hierarchy arises from relations of coordination and subordination (in discourse function, not in the syntactic structures mentioned above). SDRT explains these connections in terms of the discourse function of a constituent with respect to its predecessors. The first type, Narration, combines two or more constituents in a series to indicate connected events, whereas the second type, Elaboration, occurs when a constituent provides supporting or further detail for the ideas in a previous constituent. Asher and Vieu (2005) use the following examples:

7a. John had a great evening last night.
7b. He had a great meal. [Elaboration of (7a)]
7c. He ate salmon. [Elaboration of (7b)]
7d. He devoured lots of cheese. [Narration of (7c); elaboration of (7b)]
7e. He then won a dancing competition. [Narration of (7b); elaboration of (7a)]
7f. # It was a beautiful pink. [Incoherent continuation]

If a constituent is dependent on a preceding clause, as in the cases of Elaboration in 7b, 7c, 7d, and 7e above, then it is considered to be in a subordinate relationship with the preceding one (whether or not that is also syntactically true).
On the other hand, if the sentence or clause is a case of Narration, such as 7d and 7e, then it is in a relationship of coordination with the preceding one and is therefore on the same level of the discourse hierarchy.

So what does this hierarchy do for pronoun resolution? Asher (1993) found that discourse structure restricts pronoun resolution. Specifically, in English, “the potential referents are found in the discourse constituents to its left and in those to which it is inferior” (Asher, 1993). The first part of a series of experiments conducted by Jegerski et al. (2011) illustrates this hierarchical effect of discourse structure on intrasentential pronoun resolution. In their test, forty-three students were presented with 20 experimental sentences with ambiguous anaphora appearing in either subordinate or coordinate discourse (as defined by SDRT, not syntactically subordinate or coordinate). Half of the sentences consisted of a pair of clauses linked with the word while, describing parallel events and therefore a coordinate discourse relationship, as in (8). The other half consisted of a pair of clauses linked with the words after or until, denoting a subordinate discourse relationship, as in (9).

(8) Jeffrey saw Ricky while he was hunting for coins in the fountain. (coordination)
(9) Anita talked to her sister after she had the baby. (subordination)

The results of the experiment corroborated the previous research by Asher and Vieu (2005), showing that understanding discourse structure in terms of binary relations of coordination or subordination is critical for properly understanding anaphora resolution and co-reference generally. Specifically, in a coordinate structure, such as Narration, the topic of the second clause is assumed to be the same as that of the first, meaning that in a sentence like (8), “he” would be taken to refer to Jeffrey. A subordinate structure, such as Elaboration, instead indicates a shift of topic, so that in (9), “she” would refer to the sister.
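The three-step view of co-reference, combined with the discourse preference just described, can be put together in a small sketch. It is a toy heuristic under stated assumptions, not the model or materials of Jegerski et al. (2011) or of SDRT: the names `Candidate`, `Pronoun`, `CONNECTIVE_RELATION`, and `resolve` are hypothetical, the hard constraints are limited to gender and number, and the soft constraint is reduced to a single rule (coordination favors the subject, subordination favors the object, and the subject is the default when the connective is unknown).

```python
from dataclasses import dataclass

# Connectives from examples (8) and (9), mapped to the discourse relation the
# text associates with them. Illustrative assumption, not an SDRT parser.
CONNECTIVE_RELATION = {
    "while": "coordination",
    "after": "subordination",
    "until": "subordination",
}


@dataclass(frozen=True)
class Candidate:
    text: str
    role: str    # "subject" or "object" in the main clause
    gender: str  # "f" or "m"
    number: str  # "sg" or "pl"


@dataclass(frozen=True)
class Pronoun:
    text: str
    gender: str
    number: str


def resolve(pronoun, candidates, connective):
    """Toy three-step resolution; returns a Candidate or None."""
    # Step 1: gather all candidates without ranking them.
    pool = list(candidates)

    # Step 2: hard morphosyntactic constraints (gender and number only).
    pool = [c for c in pool
            if c.gender == pronoun.gender and c.number == pronoun.number]
    if not pool:
        return None
    if len(pool) == 1:
        return pool[0]

    # Step 3: soft constraint from discourse structure. Coordination keeps
    # the topic (subject); subordination shifts it (object). An unknown
    # connective falls back to the subject preference.
    relation = CONNECTIVE_RELATION.get(connective)
    preferred_role = "object" if relation == "subordination" else "subject"
    for candidate in pool:
        if candidate.role == preferred_role:
            return candidate
    return pool[0]


jeffrey = Candidate("Jeffrey", "subject", "m", "sg")
ricky = Candidate("Ricky", "object", "m", "sg")
print(resolve(Pronoun("he", "m", "sg"), [jeffrey, ricky], "while").text)

anita = Candidate("Anita", "subject", "f", "sg")
sister = Candidate("her sister", "object", "f", "sg")
print(resolve(Pronoun("she", "f", "sg"), [anita, sister], "after").text)
```

Keeping the hard filter separate from the soft preference reflects the distinction drawn above: the second step can only remove candidates, while the third step only ranks whatever survives.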
Overall, it is clear that discourse structure (as well as broader discourse context in cases of verb bias) must be taken into account when analyzing pronoun resolution. And yet, syntax and discourse do not operate as separate, individual factors; rather, as Sorace and Filiaci (2006) observed in native and near-native speakers of Italian, the two interface with one another, further complicating the way co-reference can be studied and even achieved by second language learners.

Turning to Corpora

Thus far, the experiments behind the results on syntax and discourse effects in pronoun resolution mentioned above have used a variety of methods: eye-tracking, picture verification tasks, and offline comprehension tasks, among others. But with the increasing progress of computational linguistics, is it possible that corpus studies could further our understanding, even given the complicated “soft constraints” that are involved in co-reference?

Ruslan Mitkov (2013) identifies the need for corpora annotated for anaphoric or co-referential links in order for corpus linguistics to truly inform the study of co-reference, particularly anaphora resolution. While he acknowledges that such a task will be time-consuming, a properly annotated and broad corpus would be “indispensable…since the data they provide are critical to the development, optimization, and evaluation of new approaches” (Mitkov, 2013). Anaphorically annotated resources already exist, such as the Lancaster Anaphoric Treebank (100,000 words), the MUC coreference task (65,000 words), a corpus by the University of Wolverhampton (60,000 words), and an ongoing project by the University of Stendhal, Grenoble, and Xerox Research Centre Europe (expected to reach a million words, but limited to pairs of an anaphor and its closest antecedent rather than full anaphoric chains).
However, these resources are annotated under differing schemes, that is, particular methodologies prescribing how linguistic features are encoded in text. In his book Anaphora Resolution, Mitkov summarizes many of these schemes, such as the most notable UCREL anaphora annotation scheme applied to newswire texts and the SGML-based annotation scheme used in the MUC co-reference task mentioned above. The UCREL scheme, developed originally by Leech and Black and revised further since, allows for the marking of a variety of features: the addition of special symbols to anaphors or antecedents to indicate direction of reference (e.g. forward or backward, which would be important for analyses such as Sorace and Filiaci’s), the type of relationship, and semantic features of both anaphors and antecedents. For example, (10) Anything (108 Kurt Thomas 108) does,
