Churches and Aeroplanes

I just came across an interesting image, the only color picture in the January 1910 issue of Scribner's Magazine at the Modernist Journals Project, and wanted to document it here for a later expansion. It seems relevant to the passages having to do with airplanes, angels, and of course churches. It depicts an early biplane and monoplane with the profile of Rheims cathedral in the background. Rheims of course became a focal point for Allied propaganda after it was shelled by the Germans.

The poster refers to an aviation event at Rheims in September 1909. Perhaps Proust attended it. Some biographical searching about his experiences with early flight and where they took place could be very useful. Also the caption centers on the idea of point of view, a crucial focus of the Recherche. Will have to read the story to which the image corresponds, "The Point of View" on p. 124.

Proust

Just found out about this website called Proust that facilitates asking life-defining questions of your loved ones and having deep conversations with them. It works like a questionnaire. Not really sure how this is different from StoryCorps, or why you wouldn't want to just ask your loved ones in person.


Foray Into Topic Modeling

Topic modeling suggests new avenues for Proust studies. Applications like Mallet and PhiloMine compute the statistical relationships among tokens (as one-, two-, or three-word phrases) appearing within specified spans of text such as paragraphs or groups of, say, fifty words. Since the Recherche runs to more than one million words, topic modeling can be used to highlight features of the text that are not perceptible during the act of serial reading. I ran the first tome, which contains Du côté de chez Swann part I, “Combray,” and part II, “Un Amour de Swann,” through Mallet to generate token clusters for ten topics, which reveal some interesting patterns. The command line output shows the nineteen most heavily weighted words for each of the ten topics (recurring patterns) in the text.

  1. chose moment pouvait jamais puis rien esprit pourtant visage savait voulait dire savoir mal trouvait première devait autres instant
  2. dit bien dire air jamais beaucoup tête toujours princesse ami docteur reste choses sais enfin regard répondit jeune entendu
  3. vie amour plaisir souvent celle ainsi gilberte pu pensée besoin donnait tant sorte milieu cause femmes étais connaître joie
  4. après temps jusqu heure pendant allait presque chambre longtemps près seul passer heures penser jour tard souvenir chercher toute
  5. combray côté déjà rue soleil semblait fleurs saint bois place eau ciel petits vers jardin matin champs dessus autour
  6. faisait toutes petite peine seule beau toute sourire donner phrase quelques trouver parfois contraire nature suite musique croire corps
  7. swann odette chez verdurin monde disait gens femme forcheville homme soir effet amis connaissait demander personne cœur cottard
  8. voir faire aller autres jours jour toujours maison venait venir désir grande contre dès autant paris rien lequel bien
  9. grand tante mère père françoise faire bien fille disait parents maman voix partie personne bonne petit mort famille laisser
  10. devant guermantes yeux nom air petit surtout or doute mieux église image fit vue dame tant aussitôt figure lesquelles
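As a rough illustration of the "groups of, say, fifty words" mentioned above, here is a minimal Python sketch that cuts a text into consecutive fifty-word spans before any modeling takes place. It is not Mallet's own import step; the function name `split_into_spans` and the sample sentence are assumptions made for the example.

```python
import re

def split_into_spans(text, span_size=50):
    """Split text into consecutive spans of span_size word tokens."""
    # \w+ matches runs of letters and digits, accented French letters included.
    tokens = re.findall(r"\w+", text.lower())
    return [tokens[i:i + span_size] for i in range(0, len(tokens), span_size)]

# Hypothetical sample: one sentence repeated to give enough words to split.
sample = "Longtemps, je me suis couché de bonne heure. " * 10
spans = split_into_spans(sample, span_size=50)
print(len(spans), [len(span) for span in spans])
```

Each span can then be treated as one "document" by a topic modeling tool; the last span is simply shorter when the word count is not a multiple of fifty.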

Some of the results are unsurprising, such as topic 7, which clearly derives from the many evening scenes at the Verdurins (soir, chez, maison) where Swann courted Odette among their coterie (forcheville, cottard), often becoming jealously heartbroken (cœur, désir) wondering whether she was seeing other admirers on the sly (demander, connaissait, amis). Other topics reveal interesting patterns that fit with scenes across the entire narrative, such as number 10. It emphasizes the use and observation of the eyes (yeux, vue) in connection with the Duc and Duchesse de Guermantes, whose mysterious airs and glances are described in several Combray church passages, as well as their association with art and symbolism of France (image, figure). But what also emerges is the consistency of the preposition before (devant), emphasizing the narrator's location not only in front of their paintings and of their glances, but also in front of a church in connection to a woman (dame), a recurrence that we can tease out by reading the database passages from the English translation.

Using a PHP script and MySQL database (graciously provided by Elijah Meeks), we can extract the tokens, word counts, and their connections from the Mallet topic model files into a graph file that generates edges and nodes, allowing us to view the ten topics as a network model in Gephi.
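The PHP and MySQL workflow itself is not reproduced here. As a stand-in, the following Python sketch shows the general idea of turning topic word lists into an edge list: each topic becomes a node linked to each of its words, and the result is written as a CSV with `Source` and `Target` columns, a layout Gephi can import. The node naming (`topic_7`) and the truncation to five words per topic are assumptions for the example; word counts are left out because they are not given above.

```python
import csv
import io

# Top words for two of the ten topics, copied from the Mallet output above
# (truncated to five words each to keep the sketch short).
topics = {
    7: ["swann", "odette", "chez", "verdurin", "monde"],
    9: ["grand", "tante", "mère", "père", "françoise"],
}

def topic_edges(topics):
    """Return (source, target) pairs linking each topic node to its words."""
    return [
        (f"topic_{topic_id}", word)
        for topic_id, words in sorted(topics.items())
        for word in words
    ]

def write_edge_csv(edges, handle):
    """Write the edges as CSV under a Source,Target header row."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)

# Write to an in-memory buffer; a real run would open a .csv file instead.
buffer = io.StringIO()
write_edge_csv(topic_edges(topics), buffer)
print(buffer.getvalue())
```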

This entirely computer-generated model of associative networks in tome 1 of the Recherche is markedly different from the static model created by my particular reading of the church motif above, though the two share some consistencies as well as interesting disparities.

For instance, when we drill down and filter to look more closely at the terms that join the different topics, we see that the word for nothing (rien) is the one that most frequently connects topics 6 and 9, which respectively center on themes of beautiful bodily gestures in music and family/home relationships, while time (temps) joins topic 6 with 3, which is focused on positive terms for love of Gilberte.
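A simplified version of this kind of filtering can be sketched in Python by looking for words that appear in more than one topic's list. The sketch uses only the nineteen top words printed above for topics 1, 2, and 8, so it will not reproduce the connections just described, which draw on the full model files; the function name `shared_words` is an assumption for the example.

```python
from itertools import combinations

# Top words for three topics, copied from the Mallet output above.
topics = {
    1: ("chose moment pouvait jamais puis rien esprit pourtant visage savait "
        "voulait dire savoir mal trouvait première devait autres instant").split(),
    2: ("dit bien dire air jamais beaucoup tête toujours princesse ami docteur "
        "reste choses sais enfin regard répondit jeune entendu").split(),
    8: ("voir faire aller autres jours jour toujours maison venait venir désir "
        "grande contre dès autant paris rien lequel bien").split(),
}

def shared_words(topics):
    """Map each pair of topic ids to the sorted words both lists contain."""
    bridges = {}
    for (a, words_a), (b, words_b) in combinations(sorted(topics.items()), 2):
        common = sorted(set(words_a) & set(words_b))
        if common:
            bridges[(a, b)] = common
    return bridges

for pair, words in shared_words(topics).items():
    print(pair, words)
```

In a graph view these shared words are the nodes that sit between two topic clusters, which is what makes them worth filtering for.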

According to the statistical features of the text, then, the first two parts of Du côté de chez Swann associate the expression of romantic love primarily with time, while the memory of familial love is associated primarily with absence. This perhaps comes as no shock to most readers of Proust, but if we compare this model with a search for the term nothing in the church motif database, we receive a number of passages associated predominantly with romantic love. These two fields of data, then, suggest a reading of the church motif as concerned with concepts of absence in romantic love, somewhat against the grain of the rest of the novel. There is not enough space here to deal with the problematics of translation/tutor text comparisons or the relation of computation algorithms to critical interpretation. But it is clear that domain expertise is just as necessary in digital scholarship as it is in print, as shown by the (illuminating) disparities between a human reading and a machine reading of the text.

French Stop Words / Mots d'arrêt français

I've searched for French stop word lists for use in text mining and synthesized my findings here. It may not be definitive, but could be useful for those looking for stop lists.

J'ai cherché des listes de mots d'arrêt français pour l'exploration de texte et j'ai synthétisé mes résultats ici. Cette liste n'est peut-être pas définitive, mais elle pourrait être utile à ceux qui recherchent des listes d'arrêt.

à
ah
ai
aie
aient
aies
ait
alors
as
au
aucuns
aurai
auraient
aurais
aurait
auras
auriez
aurions
aussi
autre
aux
avaient
avais
avait
avant
avec
avez
aviez
avions
avoir
avons
ayant
ayez
ayons
bon
car
ce
ceci
cela
celà
celles
celui
ces
cet
cette
ceux
chaque
ci
comme
comment
dans
de
des
du
dedans
dehors
depuis
deux
devrait
doit
donc
dont
dos
droite
début
elle
elles
en
encore
es
essai
est
et
eu
eue
eues
eusse
eusses
eûmes
eurent
eus
eussions
eussiez
eut
eût
eûtes
eux
fait
faites
fois
font
force
fûmes
furent
fus
fusse
fussent
fusses
fussions
fussiez
fut
fût
fûtes
haut
hors
ici
il
ils
je
jusqu’à
juste
la
laquelle
lequel
le
les
leur
leurs
lui

ma
maintenant
mais
me
mes
mine
moi
moins
mon
mot
même
ne
ni
nommés
non
nos
notre
nous
nouveaux
on
ont
ou

par
parce
parole
pas
personnes
peut
peut-être
peu
pièce
plupart
plus
pour
pourquoi
quand
qu
que
quel
quelle
quelles
quelque
quels
qui
sa
sans
se
sera
serai
seraient
serais
serait
seras
serez
seriez
serions
serons
seront
ses
seulement
si
sien
soi
soient
sois
soit
somme
sommes
son
sont
sous
soyez
soyons
suis
sujet
sur
ta
tandis
te
tellement
tels
tes
toi
ton
tous
tout
trop
très
tu
un
une
valeur
voie
voient
vont
vos
votre
vous
vu
ça
étaient
était
étant
état
étions
été
étée
étées
étés
êtes
être

mme
mlle

a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
1
2
3
4
5
6
7
8
9
10
~
`
!
@
#
$
%
^
&
*
(
)
_
-
=
+
{
}
[
]
"
'
;
:
,
.
/
<
>
?
«
»
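For anyone wondering how a list like this gets used, here is a minimal Python sketch that drops stop words from a tokenized sentence. Only a handful of entries from the list are included, and the function name `remove_stop_words` is an assumption for the example; in practice the whole list would be read from a file, one entry per line.

```python
import re

# A few entries taken from the list above.
STOP_WORDS = {"à", "de", "la", "le", "les", "et", "je", "me", "suis", "que"}

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stop_words]

print(remove_stop_words("Longtemps, je me suis couché de bonne heure."))
```

Note that this particular tokenizer discards punctuation and splits on apostrophes and hyphens, so entries such as jusqu’à or peut-être would never match as written; a tool that tokenizes differently may need them.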

A Little Close Reading with Network Analysis Software

I thought I would do some closer "distant-reading" of the Recherche. When using ORA to look at a metanetwork, the visualization can be manipulated in real time to highlight the links among nodes and their related concepts or passages. What this means for the study of Proust is that we can think of the novel (and the novel genre) as a network of nodes consisting of concepts, characters, narrative elements, and any other unit of meaning that might enhance exploration of its text.

For instance, isolating the network constellated around the note “Finding common element of formative impressions” shows that the narrator's activity of reflecting upon his formative impressions is primarily connected to the associations of Memory, Music, and Literature.

The Memory association in turn connects with “Memory at Grandmother's deathbed”; Music connects with “End of Mass at Combray church, return home” and “Vinteuil's sonata played at Swann's”; and Literature connects with “Charlus berating Marcel,” “Epiphany at Guermantes' Party,” and “Describing morning routine back in Paris, after second Balbec visit.” When the nodes above are moused over to show the passages they represent, we see that most of them, from various sequences involving musical performance or literary discussion from all over the novel, refer to the twin steeples at Martinville that formed the subject of the narrator's first piece of writing as a youth (I.253-257). Thus, the steeples form an orientation point for that part of the narrator's writerly vocation that pertains to analysis of impressions, filtered primarily through memories of music and literature. These connections are not apparent by searching the database at the Ecclesiastical Proust Archive because it does not provide the simultaneous view of layered networks afforded by ORA. It seems on the surface, then, that this particular network within the Recherche forms a theory of impressionism based on the structural commonalities between music and literature.
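The layered structure just described (a note, its associations, and the passages attached to each association) can be sketched as a small graph in plain Python. This is not ORA's data model or API, only an illustration built from the node labels named above; the helper names `build_undirected` and `within_hops` are assumptions for the example.

```python
from collections import deque

# Links named in the text: the central note, its three associations,
# and the passages each association connects with.
LINKS = {
    "Finding common element of formative impressions": ["Memory", "Music", "Literature"],
    "Memory": ["Memory at Grandmother's deathbed"],
    "Music": [
        "End of Mass at Combray church, return home",
        "Vinteuil's sonata played at Swann's",
    ],
    "Literature": [
        "Charlus berating Marcel",
        "Epiphany at Guermantes' Party",
        "Describing morning routine back in Paris, after second Balbec visit",
    ],
}

def build_undirected(links):
    """Turn the one-way link table into an undirected adjacency map."""
    graph = {}
    for node, neighbours in links.items():
        for other in neighbours:
            graph.setdefault(node, set()).add(other)
            graph.setdefault(other, set()).add(node)
    return graph

def within_hops(graph, start, max_hops):
    """Breadth-first search returning {node: hop distance} up to max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the requested radius
        for other in graph.get(node, ()):
            if other not in distances:
                distances[other] = distances[node] + 1
                queue.append(other)
    return distances

graph = build_undirected(LINKS)
reach = within_hops(graph, "Finding common element of formative impressions", 2)
for node, hops in sorted(reach.items(), key=lambda item: (item[1], item[0])):
    print(hops, node)
```

Isolating a constellation then amounts to choosing a starting node and a radius: one hop gives the associations, two hops add the passages.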

The network for the Time association similarly connects various types of recollection to provide insight into the narrator's artistic development.

We find Time at the center, ringed by “Contemplation sparked by conversation with M. de Cambremer, at Guermantes party,” “Imagining Florence and Venice (before visit),” “Contemplating experience of Vinteuil's sonata while jealous of Mlle Vinteuil and Albertine,” “Contemplating women and past,” “Observations at Guermantes party,” and “First visit to Balbec.” The last in turn connects with Narthex and Carqueville. In other words, the primary function of Time as a backwards-looking concept is associated with jealousy over women, while the forward-looking passages imagine the reddish domes of Florence and the frescos of Venice. This suggests a deepening of the structure that became apparent in the database searches, where the thought of meeting a future lover in early passages, though not explicitly concerned with the nature of time, took place on the porch of a gothic cathedral. These nodes presented by ORA show that the church passages that consciously deal with the nature of time happen after the narrator has had experience being in love with women. And correspondingly, the architectural element of this ring is the narthex, which is the entrance area just indoors or on the threshold to the porch. The narthex was not considered part of the church proper, but was placed close enough so that those not worthy of entry, such as the unbaptized or unconfessed, could still receive instruction from services. Hence, the experience of love has brought the narrator past the porch but, because he is lost through jealousy, he still remains an outsider.

The Truth network presents a very clear view of the novel's main thematic chains and character developments.

With Truth at the center of the middle network, the first ring comprises “Riding in Dr. Percepied's carriage” (the moment at which he observed in motion the twin steeples of Martinville), “Reflections on getting the truth about Albertine from Andrée” (in which he had final confirmation of his lack of knowledge about Albertine's lesbianism, the root of his obsessive jealousy), “Reading Bergotte” (the writer who most influenced his literary sensibility, and who figures so prominently in his appreciation for churches), and “Reflection on Charlus' perversion” (the unmasking of homosexuality as a major recurring element of the novel's concern with epistemology). What we also see in the picture above are two micro networks that are not directly connected, yet were placed close to the Truth network because of their conceptual affinity. If we take all three networks into consideration, the second ring around the Truth association comprises Motion, Laws, the Archaic, Beauty, and Knowledge, which further connect with three passages about household habits and the Great War. Taken together, Truth in Proust's novel can ultimately be understood as a rather stable essence based on the epistemic laws of motion and observation, as well as the aesthetic laws of beauty as evident in old objects. These, too, are a function of Time. While this visualization might not provide much insight that is new in Proust studies, the interface at least allows the reader instantly to access the passages that contribute to a given part of the network.

Tome I Word Cloud

Generated by Wordle. Wordle recognized the French and stripped out the common words, but many of them, like comme, quand, et, si, etc., still crept in. Interesting, though, that in this visualization of absolute word frequency, the words Swann and Odette are weighted as heavily as, or more heavily than, many prepositions and conjunctions. Given that this tome covers the "Combray" and "Swann in Love" sections, it accounts for the narrator's obsession with the pair in his early childhood, and likewise Swann's obsession with Odette in the years before the narrator's birth. I would have expected words like église or fenêtre or mère to weigh heavier. Interesting too that the other meaningful words that make it into this cloud are Verdurin (emphasizing the salon and the coterie culture), yeux (where the narrator reads the souls of others), Françoise (who is mentioned -- and valued? -- more than his mother), and tante (Léonie, the relative in residence at Combray).

Upcoming Lecture -- Digital Methods for Literary Criticism: Proust, Illustration, and the Archive

I'm giving a lecture on some of my recent digital research on Proust. The talk will cover methods in text annotation and visualization, with a view toward their theoretical implications for literary criticism. Along the way it will describe some of my experiments with text mining and social network analysis for generating and representing associative paths.

  • Wednesday November 17, 5:00-7:00 pm
  • Lucy Ellis Lounge, 1st floor Foreign Language Building
  • University of Illinois, Urbana-Champaign

