Layer Composition of the Bezan Text using Multivariate Clustering

This page publishes details of the layers extracted by the multivariate clustering procedure described in my earlier post Extracting Layers in Codex Bezae. This procedure was applied to Bezae’s readings in John 4:1-42.

Input Data

The data are prepared in a tab-delimited format. Each observation is a reading, i.e. there is one observation per reading in Bezae. (Note that only one reading per variant is observed, namely the reading found in Bezae. We can only partition readings found in the document!) Each variable is a witness.

The tab-delimited data have 43 columns for witnesses and 73 rows for readings (see the article or input data for witnesses and readings). In addition, three label columns identify the readings (short, medium, long) and one header row identifies the witnesses. One column, “Level”, holds my own ad hoc assessment of the level of Bezae’s reading in the local genealogical stemma for that variant. Another column, “Layer”, holds my own ad hoc categorization of Bezae’s reading according to Holmes’ approach (A = Alexandrian, B = Byzantine, G = Greek minority neither Alexandrian nor Byzantine, L = Latin).
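The layout described above can be read with a few lines of code. The following is a minimal Python sketch, not the R script offered for download. The column names "Short", "Medium" and "Long", the witness names W1 to W3, the 0/1 coding of agreement with Bezae, and the two sample rows are all assumptions made up for illustration; the real file may differ.

```python
import csv
import io

# Hypothetical header names; the real file's headers may differ.
ID_COLUMNS = ("Short", "Medium", "Long")
META_COLUMNS = ("Level", "Layer")


def load_readings(tsv_text):
    """Split tab-delimited text into labels, witness names, metadata, and a 0/1 matrix."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Every column that is neither an identifier nor metadata is treated as a witness.
    witnesses = [name for name in reader.fieldnames
                 if name not in ID_COLUMNS + META_COLUMNS]
    labels, meta, matrix = [], [], []
    for row in reader:
        labels.append(row["Short"])
        meta.append({"Level": int(row["Level"]), "Layer": row["Layer"]})
        # Assumed coding: 1 = witness agrees with Bezae's reading, 0 = it does not.
        matrix.append([int(row[w]) for w in witnesses])
    return labels, witnesses, meta, matrix


# Two invented rows with three invented witnesses, for illustration only.
SAMPLE = "\n".join([
    "Short\tMedium\tLong\tLevel\tLayer\tW1\tW2\tW3",
    "r1\tmedium label 1\tlong label 1\t1\tA\t1\t1\t0",
    "r2\tmedium label 2\tlong label 2\t2\tL\t0\t0\t1",
])

labels, witnesses, meta, matrix = load_readings(SAMPLE)
print(witnesses)
print(matrix)
```

With the real file, the matrix would have 73 rows (readings) and 43 columns (witnesses), and the "Level" and "Layer" values would be kept aside for coding the plots rather than fed to the clustering.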

Download Input Data

Download R Code

The statistical characteristics of the data are worth a separate discussion.

Partitioning Bezae’s Readings

In PAM (partitioning around medoids), the number of clusters is fixed in advance, and the algorithm seeks the best partition of the observations into that number of clusters. This allows us to experiment with different numbers of clusters to find an intuitive fit. I found that six clusters partitioned the data in the manner that best distinguished readings represented by the Byzantine and Old Latin traditions, which emerged as the two most recognizable features in the distribution.
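To make the idea concrete, here is a minimal Python sketch of a medoid-swapping partition. It is a simplified illustration and not the author's R code: it starts from the first k rows instead of using PAM's usual BUILD step, and the Hamming distance on 0/1 agreement vectors is an assumption, since the distance measure used for the published clusters is not stated here. The toy rows are invented.

```python
from itertools import product


def hamming(a, b):
    """Number of positions (witnesses) in which two 0/1 rows differ."""
    return sum(x != y for x, y in zip(a, b))


def total_cost(rows, medoids, dist):
    """Sum over all rows of the distance to the nearest medoid."""
    return sum(min(dist(r, rows[m]) for m in medoids) for r in rows)


def pam(rows, k, dist=hamming):
    """Swap medoids for non-medoids until no single swap lowers the total cost."""
    medoids = list(range(k))  # naive start: the first k rows
    best = total_cost(rows, medoids, dist)
    improved = True
    while improved:
        improved = False
        for i, cand in product(range(k), range(len(rows))):
            if cand in medoids:
                continue
            trial = medoids[:i] + [cand] + medoids[i + 1:]
            cost = total_cost(rows, trial, dist)
            if cost < best:  # accept the first swap that improves the fit
                medoids, best, improved = trial, cost, True
    # Assign every row to the position (0..k-1) of its nearest medoid.
    labels = [min(range(k), key=lambda j: dist(r, rows[medoids[j]]))
              for r in rows]
    return medoids, labels, best


# Invented toy data: six readings, four witnesses.
toy = [[1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 0, 0],
       [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
medoids, labels, cost = pam(toy, k=2)
print(medoids, labels, cost)
```

Because k is an input, the same function can be rerun for several values of k and the resulting partitions compared, which is the kind of experimentation described above.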

For a statistical criterion, a good choice is the number of clusters at which the within-cluster sum of squares stops decreasing rapidly (Figure 1). This change of slope marks the point at which adding further clusters yields diminishing returns in fitting observations to clusters.

Figure 1: Within groups sum of squares by number of clusters
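The criterion can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's R code: the within-cluster sum of squares is computed against each cluster's mean vector, the "elbow" is taken to be the k with the largest second difference, and the numbers in the example are invented rather than read off Figure 1.

```python
def within_ss(rows, labels):
    """Sum of squared distances from each row to the mean vector of its cluster."""
    total = 0.0
    for c in set(labels):
        members = [r for r, l in zip(rows, labels) if l == c]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum((x - m) ** 2
                     for r in members for x, m in zip(r, centroid))
    return total


def elbow(wss):
    """wss[i] is the value for k = i + 1; return the k where the curve bends most."""
    # Second difference: (drop going into k) minus (drop going out of k).
    bends = {k: (wss[k - 2] - wss[k - 1]) - (wss[k - 1] - wss[k])
             for k in range(2, len(wss))}
    return max(bends, key=bends.get)


# Invented values for k = 1..5, for illustration only.
example_wss = [100.0, 60.0, 30.0, 25.0, 22.0]
print(elbow(example_wss))
```

The second difference is only one way to locate a change of slope. In practice the plot is usually also inspected by eye, as in Figure 1.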

After running PAM for six clusters, the readings are partitioned as shown in the following page:

View Clusters

The partition contents can also be downloaded:

Download Clusters

Rendering the Results

Various plots show the distribution of Bezae’s readings by cluster. The points in the plots represent individual readings of Bezae in John 4:1-42.

Figure 2 shows the readings coded according to the four categories proposed by Holmes (x = minority Greek, neither Alexandrian nor Byzantine; o = Alexandrian; ▲ = Old Latin; ♦ = Byzantine), superimposed over the six clusters produced using PAM. The figure suggests some correspondence between the two methods, especially in clusters 2, 5, and 6. Cluster 2 (far right) represents Bezae’s stratum shared with the Byzantine tradition, clusters 5 and 6 (lower left) represent the Old Latin version, and clusters 1 and 4 (center) and cluster 3 (upper left) represent various minority combinations of Greek witnesses.

Figure 2: Holmes’ categories superimposed over six clusters (x = minority Greek, neither Alexandrian nor Byzantine; o = Alexandrian; ▲ = Old Latin; ♦ = Byzantine)
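The correspondence between the two categorizations can also be checked numerically with a cross-tabulation of cluster against “Layer”. The Python sketch below is illustrative only: the cluster and layer assignments are invented toy values, not the published partition.

```python
from collections import Counter


def crosstab(clusters, layers):
    """Count readings for every (cluster, Holmes category) pair."""
    return Counter(zip(clusters, layers))


# Invented toy assignments, NOT the published partition.
clusters = [2, 2, 5, 5, 6, 1]
layers = ["B", "B", "L", "L", "L", "G"]

table = crosstab(clusters, layers)
for cluster in sorted(set(clusters)):
    # A Counter returns 0 for pairs that never occur.
    row = {layer: table[(cluster, layer)] for layer in "ABGL"}
    print(cluster, row)
```

A cluster whose row is dominated by a single category corresponds closely to that category. A cluster whose counts are spread across several categories does not.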

Figure 3 shows the same clusters with readings coded by level in the local genealogical stemma (1 = initial reading; 2 or 3 = secondary reading). Readings at level 1 group in clusters 1, 2, and 4, which include Alexandrian and Byzantine elements. Readings at level 2 are found throughout, but concentrate especially in clusters 3, 5, and 6, which include Old Latin readings.

Figure 3: Clustered readings coded by level in the local stemma

Figure 4 plots Bezae’s agreements with Codex Sinaiticus in a solid color. Note Sinaiticus’ agreement with Bezae in clusters 5 and 6, where Bezae’s Old Latin readings are concentrated.

Figure 4: Clustered readings shaded (solid) by agreement with Codex Sinaiticus

Figure 5 plots Bezae’s agreements with each of the 31 cited Greek witnesses. Sinaiticus agrees with Bezae in clusters 5 and 6 more than any other cited Greek witness.

Figure 5: Agreements with cited Greek witnesses
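A comparison of this kind comes down to counting, for each witness, the share of readings in each cluster where it agrees with Bezae. The Python sketch below shows one way to do that. It assumes a 0/1 agreement matrix with one row per reading and one column per witness; the witness names and values are invented for illustration and are not taken from the data.

```python
def agreement_by_cluster(matrix, clusters, witnesses):
    """For each witness, the fraction of readings in each cluster where it agrees with Bezae."""
    table = {}
    for w_idx, name in enumerate(witnesses):
        counts = {}  # cluster -> (agreements, readings in cluster)
        for row, c in zip(matrix, clusters):
            agree, total = counts.get(c, (0, 0))
            counts[c] = (agree + row[w_idx], total + 1)
        table[name] = {c: a / t for c, (a, t) in counts.items()}
    return table


# Invented toy data: four readings, two hypothetical witnesses.
matrix = [[1, 0], [1, 1], [0, 1], [0, 1]]
clusters = [5, 5, 6, 6]
witnesses = ["W1", "W2"]

rates = agreement_by_cluster(matrix, clusters, witnesses)
print(rates)
```

Ranking the witnesses by their rates in clusters 5 and 6 would give a numerical counterpart to the visual comparison in the figure.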

Figure 6 plots agreements of Bezae’s Greek column with each of the 8 cited Old Latin witnesses, including its own Latin column. As expected, the Latin column agrees with the Greek in most readings. Unlike the Greek witnesses (Sinaiticus excepted), the Old Latin witnesses consistently appear in clusters 5 and 6. Cluster 5 is represented in all Old Latin witnesses, but readings supported by the “African” Old Latin witnesses (e and c) are more strongly represented in cluster 5 than in cluster 6. Cluster 6 is better represented in the “European” Old Latin witnesses.

Figure 6: Agreements with cited Latin witnesses

What do you think?