YouGov Brexit ranking data of June 12-13 2017

In a July weblog entry, I reported on a rather important YouGov poll. YouGov.com and Anthony Wells were so kind as to provide the underlying poll data. Earlier I estimated some rankings, but thanks to this kindness we now have certainty about the poll data, so that the only uncertainty that remains is due to polling itself. It also appeared that what I had categorized as a hard (H) Brexit is better rephrased as the No Deal (N) case. I will maintain the label on the Tariff (T) option, which some would call hard.

The UK general election was on June 8 and the poll was taken on June 12-13 so that the persons polled will have had vivid recollections. For this reason, these polling data can be considered quite important.

The poll generated data about confusions in the British electorate. It is useful to belabour the point, for Brexit is a key event and would have quite some impact for the coming decades. I would respect the UK decision to leave the EU but have my doubts when it is not based upon Proportional Representation (PR). A referendum gives proportions but referenda tend to be silly and dangerous, as they are an instrument of populism rather than of representative democracy. Indeed, it appears that the Brexit referendum question was flawed in design. The YouGov poll helps us to observe how confused a major section of the UK electorate is. Let us dig a bit deeper.

The following copies my weblog text of July 11, but now replacing the estimate by the real data.

Representation of preferences via a ranking matrix

Let voters consider the options R = Remain, S = European Economic Area (EEA) a.k.a. Single Market a.k.a. Soft, T = Tariffs a.k.a. Hard, N = No Deal, World Trade Organisation (WTO). A consistent Remainer would tend to have the ranking R > S > T > N, and a consistent Leaver would tend to have this in reverse.

The YouGov poll presents the data in a ranking matrix, with the first preferences in the first row, then the second preferences, and so on. For the Brexit referendum outcome of 48% Remain and 52% Leave, for example, we might have the following setup. It is a guess, since the particular ways of Leaving were not included in the referendum question (and neither were those of Remaining). This example however is the result that you would expect if Remainers and Leavers had the mentioned consistent orderings.

Observe that each voting weight (take e.g. 48) for a preference order list is put in precisely one place per row and per column, i.e. it does not occur more than once in any single row or column. This explains why the border sums add up to 100.
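The construction can be sketched in a few lines of Python. This is only a minimal illustration, assuming the hypothetical 48 / 52 split with fully consistent orderings; the function name ranking_matrix and the data layout are my own choices for this sketch, not YouGov's format.

```python
OPTIONS = ["R", "S", "T", "N"]

def ranking_matrix(profile):
    """Build a ranking matrix from weighted preference orderings.

    profile maps an ordering such as "RSTN" (best first) to its weight.
    Row i of the result gives, per option, the weight that puts it at place i + 1.
    """
    matrix = [{opt: 0 for opt in OPTIONS} for _ in OPTIONS]
    for ordering, weight in profile.items():
        for place, opt in enumerate(ordering):
            matrix[place][opt] += weight
    return matrix

# Hypothetical profile: only consistent Remainers and consistent Leavers.
profile = {"RSTN": 48, "NTSR": 52}
matrix = ranking_matrix(profile)

for place, row in enumerate(matrix, start=1):
    print(place, [row[opt] for opt in OPTIONS], "row sum:", sum(row.values()))
print("column sums:", [sum(row[opt] for row in matrix) for opt in OPTIONS])
```

The first row comes out as 48 for R and 52 for N, the second row as 48 for S and 52 for T, and so on in mirror image, with every row and column summing to 100.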

The YouGov poll of June 12-13 2017

The YouGov data that I have been referring to contain the results of a poll of 1651 adults in Great Britain, i.e. the UK excluding Northern Ireland. From pages 13-16 we can collect these data for the whole of Great Britain for 2017. YouGov states that the sample has been weighted for socio-economic and political indicators. It is not clear to me how the “Don’t know”s are being handled for this particular issue. See also this discussion by Anthony Wells.

We can observe:

These are percentages, and both the row sums and the column sums should be 100, except for rounding errors.
35% have Remain in the first position and 47% have it in the last position, so that 9 + 8 ≈ 17% (1% is missing due to rounding) have a confused position, in which Remain is sandwiched between some options for Leaving. We would wonder how such people would vote in a referendum when they are presented with only two options, R or L. One cannot say that the referendum was only about the first positions in the rankings, for voters would tend to develop an expectation about what would be the likely kind of Brexit and vote accordingly. Some of these 17% might have voted Remain because they disliked the otherwise expected version of Leave. This might indicate that the outcome for Remain was overstated. Yet we have no information on subdivisions of Remain, which might cause an opposite effect. Some might be okay with Remain as it is but vote for Leave because they fear that the UK might otherwise also join the Eurozone or some United States of Europe. The reason why the Brexit referendum question was flawed in design is that it left too much to guess here.
Remarkably, the split between R and L now in June 2017 would be 35% versus 65% instead of 48% versus 52% in 2016. In one single year Great Britain switched from fairly divided to a seemingly clear preference for Brexit (though divided upon how)? I very much doubt this distribution, see this discussion on populism and District Representation (DR). The electoral data still suggest more than 50% for Remain. In the July weblog entry it is discussed that some 26% of the electorate say that they voted for Remain but accept the loss at the referendum, so that they “play along” with the winning side, focusing on what would be the best option for Leave. This seems loyal to some notion of democracy, but it would also be a misplaced loyalty to the flawed Brexit referendum question. (One can respect such loyalty, but it still makes sense to discuss it.)

Using techniques of apportionment we estimated the number of people per cell in the poll. However, we now have the actual data (rounded to one digit from multi-digit percentages times 1651):
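For readers unfamiliar with apportionment, the following is a minimal sketch of one common method, largest remainder (Hamilton). It is not necessarily the technique used for the earlier estimate, and the input percentages are the hypothetical 48 / 52 split from the example above rather than the poll's own cells. The function name largest_remainder is my own.

```python
import math

def largest_remainder(percentages, total):
    """Turn percentages into whole counts that sum exactly to `total`."""
    # Normalise by the sum, so rounded percentages need not add up to 100.
    quotas = [p * total / sum(percentages) for p in percentages]
    counts = [math.floor(q) for q in quotas]
    leftover = total - sum(counts)
    # Give the remaining persons to the cells with the largest fractional parts.
    by_fraction = sorted(range(len(quotas)),
                         key=lambda i: quotas[i] - counts[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts

counts = largest_remainder([48, 52], 1651)
print(counts, sum(counts))
```

With a sample of 1651, the quotas are 792.48 and 858.52, so the one leftover person goes to the second cell. With the actual counts available, no such estimation step is needed any more.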

Possible permutations of rankings

With 4 options there are 4 possibilities for a first place, 3 remaining for the second place, 2 remaining for the third place, and then the final one follows. Thus there are 4 x 3 x 2 x 1 = 24 permutations for possible rankings. We already saw two of these: R > S > T > N and its reverse. The ranking matrix above is actually based upon these 24 possibilities.
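The count can be checked by simply listing the orderings. This is a small sketch using Python's standard itertools module; the numbering by position in this list is an assumption on my part, although it does agree with the two case numbers cited in this text (Case 5 and Case 7).

```python
from itertools import permutations

OPTIONS = "RSTN"

# All possible strict rankings of the four options, best first.
orderings = [" > ".join(p) for p in permutations(OPTIONS)]

print(len(orderings))
for case, ordering in enumerate(orderings, start=1):
    print(case, ordering)
```

The list starts with R > S > T > N and ends with its reverse, N > T > S > R.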

Some of these 24 possibilities will be rather curious. It is not clear what to think about R > N > S > T for example (Case 5 below). This would be a Remainer who would rather prefer No Deal to the EEA or some agreement not to have a trade war on tariffs. A tentative explanation is that this voter has a somewhat binary position, as Remain versus No Deal At All, while the other options are neglected.

Policy options can also be sorted in logical order. This gives rise to the theory of Single Peakedness. For the topics of R, S, T and N there is a logical scale from left to right. An example of single-peakedness is Case 7 below, with a ranking S > R > T > N. See the graph below. The 1st rank gets utility level 4, the 2nd rank gets utility level 3, the 3rd rank gets utility level 2, and the 4th rank gets utility level 1. The utility levels are just the reverse of the ranks, but then the case must be reordered to the logical order.
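The reordering and the test for a single peak can be sketched as follows. This is an illustration under the assumption that R, S, T, N is the logical axis; the function names utilities and is_single_peaked are mine.

```python
AXIS = "RSTN"  # assumed logical left-to-right order of the options

def utilities(ordering):
    """Utility per option in axis order: 1st rank gets 4, ..., 4th rank gets 1."""
    n = len(ordering)
    return [n - ordering.index(opt) for opt in AXIS]

def is_single_peaked(ordering):
    """True if utility rises to one peak and then only falls along the axis."""
    u = utilities(ordering)
    peak = u.index(max(u))
    rising = all(u[i] < u[i + 1] for i in range(peak))
    falling = all(u[i] > u[i + 1] for i in range(peak, len(u) - 1))
    return rising and falling

print(utilities("SRTN"), is_single_peaked("SRTN"))  # Case 7: S > R > T > N
print(utilities("RNST"), is_single_peaked("RNST"))  # Case 5: R > N > S > T
```

Case 7 gives the utilities 3, 4, 2, 1 along the axis, with one peak at S. Case 5 gives 4, 2, 1, 3, with a peak at R and a second rise at N, so it is not single peaked.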

Voting theory has a core that assumes that voters are both autonomous and rational, so that any preference would have some logic. The logical order R, S, T and N might seem arbitrary to some voters who may think otherwise. We do not impose that order but invite voters who think otherwise to explain why they choose a different order. Potentially each voter has his or her own criteria so that the best is on top, and all other options follow in proper order. Voters with multiple peaks in their preferences would have more to explain to us to understand them than voters with a single peak. Without a good explanation, we cannot reject the possibility that there is some confusion.

Presentation of preferences via preference orderings

The following are the YouGov data for the preference orderings that underlie the above YouGov results on percentages. See the Excel sheet in the Appendix. This table shows only the percentages and not the numbers of people in the poll (that add up to the above table), since the percentages are the main finding. Single dots are zeros. The ConR / L and LabR / L subdivisions concern the voters in the poll who voted R or L in the 2016 Brexit referendum and who voted Con or Lab in 2017. They form only a part of the sample, so their sum doesn’t add up to the total on the left.

Discussion on GB

Some observations are:

The YouGov summary ranking matrix already showed a rather even split on S, T and N, but the data give a landscape with even more diversity in opinions.
Only 24.8% have the preference R > S > T > N and only 15.0% its reverse, so that 60.1% have some mixture.

The above results for GB can be split up by peaks and sandwich. The combinations give the following percentages:

The mentioned 60.1% splits up again into 33.3% who are single peaked and 26.8% who have multiple peaks.
The sandwich of 17.3% splits up into 8.5% with a single peak and 8.8% with multiple peaks.
Of the 26.8% with multiple peaks there are 10.5% who can join the Remainers with a first preference and there are 7.4% who can join the Leavers with Remain in the last position (but with various ways to Leave).

The 8.8% would be a relevant section of the vote. They all voted Leave, but divided on S, T and N. Potentially the outcome of the 2016 Brexit referendum has been decided by the 8.8% GB voters who have Remain in neither the first nor the last position, and who do not follow the standard logical order on the options.
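The two criteria used here, single versus multiple peaks and the position of Remain, can be applied to all 24 possible orderings. The following sketch only counts orderings, not voters: the poll percentages above attach a weight to each ordering. The axis R, S, T, N and the helper names are assumptions for this illustration.

```python
from collections import Counter
from itertools import permutations

AXIS = "RSTN"  # assumed logical order of the options

def is_single_peaked(ordering):
    """True if utility (4 for 1st rank, ..., 1 for 4th) has one peak on the axis."""
    u = [len(ordering) - ordering.index(opt) for opt in AXIS]
    peak = u.index(max(u))
    return (all(u[i] < u[i + 1] for i in range(peak))
            and all(u[i] > u[i + 1] for i in range(peak, len(u) - 1)))

def remain_position(ordering):
    """Classify Remain as first, last, or sandwiched between Leave options."""
    place = ordering.index("R")
    if place == 0:
        return "first"
    if place == len(ordering) - 1:
        return "last"
    return "sandwich"

table = Counter()
for p in permutations(AXIS):
    ordering = "".join(p)
    peaks = "single" if is_single_peaked(ordering) else "multiple"
    table[(peaks, remain_position(ordering))] += 1

for key in sorted(table):
    print(key, table[key])
```

Of the 24 orderings, 8 are single peaked and 12 have Remain sandwiched. The group discussed here, sandwiched with multiple peaks, covers 9 of the 24 possible orderings.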

Discussion on ConR / L and LabR / L

The division of ConR / L and LabR / L is losing its relevance because these are dwindling groups, they are changing loyalties, and their 2016 votes are becoming history while there are new issues. Yet, the 2016 referendum question was flawed, and it is relevant to see how sizeable parts of the UK electorate deal with the logical conundrum that they took part in.

The 17.3% of the votes with Remain sandwiched can be found in the subdivisions in similar proportions.
28.6% of ConR voters and 55.2% of LabR voters are united on the preference R > S > T > N. Presumably this was also the case in 2016, or there must be factors that increased or reduced consistency or confusion.
30.8% of ConL and 22.4% of LabL are united on the preference N > T > S > R. Presumably this was also the case in 2016, or there must be factors that increased or reduced consistency or confusion.
One might expect that ConR / L and LabR / L voters of 2016 would have the benefit of a party preference and thus show more consistency, yet the distribution of views is quite as wide, and the sandwich with multiple peaks is quite present.
The 2016 Conservative Remainers are 45.2% loyal to the old point of view, but still vote for a Conservative party that is set on Leave. Part will be the misplaced loyalty to the flawed referendum. Alternatively, they voted for a minority in this party that still tries to bring balance? (A good poll requires a focus group.) (And there is more in the world than just Brexit.)
The 2016 Labour Remainers are 76.1% loyal to the old point of view. Yet Labour leader Corbyn also prefers a Brexit. It might be the peculiarities of the British system of District Representation (DR) that caused these voters not to switch to LibDem. (But the LibDem also have a liberal policy that many voters for Labour dislike. The system of DR doesn’t favour the entry of new political competitors.)
The 2016 Leavers have a high loyalty to the old view, ConL 88.2% and LabL 73.3%. Yet this doesn’t diminish the diversity of opinion about how to Leave.

Conclusions

The ranking matrix is a fine way to summarize results, yet the preference orderings are more accurate on the underlying and relevant orders. The ranking matrix is merely a matter of presentation by the statistical reporter. A person in a poll who can answer on a ranking matrix in fact gives the personal preference ordering. The statistician can compound these data while not losing information on the permutations. From the permutations it is always possible to create a ranking matrix, yet the reverse requires estimation techniques which generate needless uncertainty.
Asking for voter preference orderings in a poll is a useful exercise. It is not intended to propose this for general elections. For general elections it suffices that voters exercise a single vote for a party of choice. The condition however is Proportional Representation, otherwise there are serious distortions, see the earlier discussion on this weblog.
The information on the rankings and implied preference orderings suggests a rather large state of confusion in the electorate of Great Britain. The notion of single-peakedness appears to be quite useful in highlighting the issue of the preference order. Perhaps we cannot quite call this “confusion” since voters might have their own logic to order the four options. Until there is more clarity on what strikes one as illogical, the term “confusion” seems apt though.
It must be greatly appreciated that YouGov and Anthony Wells made these data available, since they provide a key insight into the state of opinion in Great Britain close to the general election of June 8 2017.
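The first conclusion, that a ranking matrix cannot be turned back into preference orderings without estimation, can be illustrated with a small sketch. The two profiles below are hypothetical and chosen only to make the point; they are not poll data.

```python
OPTIONS = "RSTN"

def ranking_matrix(profile):
    """Rows are places 1 to 4, columns are R, S, T, N; cells hold summed weights."""
    matrix = [[0] * len(OPTIONS) for _ in OPTIONS]
    for ordering, weight in profile.items():
        for place, opt in enumerate(ordering):
            matrix[place][OPTIONS.index(opt)] += weight
    return matrix

# Two different hypothetical electorates ...
profile_a = {"RSTN": 50, "NTSR": 50}
profile_b = {"RTSN": 50, "NSTR": 50}

# ... that produce one and the same ranking matrix.
print(ranking_matrix(profile_a))
print(ranking_matrix(profile_a) == ranking_matrix(profile_b))
```

Profile A consists only of single-peaked orderings, while profile B consists only of orderings with multiple peaks, yet the ranking matrix cannot tell them apart. This is why the underlying orderings are the more informative data.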

Appendix September 18 2017

The Excel workbook with the full YouGov data and the earlier estimate is: 2017-09-18-YouGov-Rankings-full-data
