A first look at TREASURE ethno data

At the time of writing, the TREASURE corpus consists of 3,149 posts in 71 topics, for a total of almost 100,000 words. There are 111 participants. This corpus was enriched with 5,036 annotations, which use 285 codes. The 285 codes are connected by 11,468 co-occurrence edges, many of them parallel (which means that the same two codes co-occur multiple times). There are 3,999 unique co-occurrences. The edge between Perception of recycling and Modification is both the deepest (d = 98) and the broadest (b = 54, really a large consensus!). The shape of this graph is the usual one for CCNs, though even more densely connected than most. We need network reduction to make sense of it.
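To make d and b concrete, here is a minimal sketch of how the two measures could be derived from raw annotations. It assumes that two codes "co-occur" when they annotate the same post, that depth d counts those co-occurrences, and that breadth b counts the distinct informants behind them. The toy annotations and the function name `depth_and_breadth` are hypothetical, not taken from the actual pipeline.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy annotations: (post_id, author, code)
annotations = [
    (1, "anna", "Modification"),
    (1, "anna", "Perception of Recycling"),
    (2, "ben", "Modification"),
    (2, "ben", "Perception of Recycling"),
    (2, "ben", "resources"),
    (3, "anna", "Modification"),
    (3, "anna", "Perception of Recycling"),
]

def depth_and_breadth(annotations):
    """Return (depth, breadth) dicts keyed by sorted code pairs."""
    codes_by_post = defaultdict(set)
    author_of = {}
    for post_id, author, code in annotations:
        codes_by_post[post_id].add(code)
        author_of[post_id] = author

    depth = defaultdict(int)        # number of co-occurrences per pair
    informants = defaultdict(set)   # distinct authors per pair
    for post_id, codes in codes_by_post.items():
        for pair in combinations(sorted(codes), 2):
            depth[pair] += 1
            informants[pair].add(author_of[post_id])

    breadth = {pair: len(people) for pair, people in informants.items()}
    return dict(depth), breadth

depth, breadth = depth_and_breadth(annotations)
pair = ("Modification", "Perception of Recycling")
print(depth[pair], breadth[pair])  # 3 co-occurrences, 2 distinct informants
```

Each co-occurrence is one parallel edge in the stacked graph; collapsing the parallel edges gives the unique co-occurrences, each carrying its own d and b.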

Highest core value nodes

The highest-k k-core in this network consists of 71 codes, each connected to at least 28 others in the core (k = 28).

codes in the highest-k core

Aesthetics
Agency
Attention
Automotive Politics
Awareness of Inequality
challenges
Choosing new elements
Choosing used elements
Comparison
Concept of Circular Economy
Concerns about Environmental Pollution
Conditionality
Consciousness
Contested Battery (Production)
Contested new technicality
Contested sustainability
Cost-Effectiveness
Data Share
development
Diesel Car
Different Era
Differentiation between electronic and combustion engine cars
Disfunctionality
Durability
Effort to recycle
Elder Automotives Models
Electronics in Cars
Emissions
Environmental Pollution
Environmental Preservation
Environmental System
Expertise and Knowledge
Expertise and Service
Feeling of external Control
Future Mobility
Global Inequality
Global Politics/Economics
Gradual Functionality
High Standard
Ideal
Indifference
Indifference through non-influence
Innovative Automotives
Innovative Electronics
Junk
Market Strategies
Materiality
Modification
Monetarian Availability
Necessity-driven
Negativity
New Cars
New Reprocess
Perception of Recycling
Planned Obsolescence
Political Measurements
politics
Practicability
Production
Relevant Features
Replacement
resources
Self-Reflexion
Self-Service Ability
Social Change
Specific Car Labels
Specific Electronic Feature
Status quo
Sustainability
Sustainable Behaviour
Wasting Behaviour
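For readers unfamiliar with k-cores, the following is a rough sketch of the standard peeling procedure: repeatedly drop every node with fewer than k neighbours until none is left to drop. It is written in plain Python on a hypothetical toy graph, and is not the script used for the analysis above.

```python
def k_core(adjacency, k):
    """Nodes of the k-core of an undirected graph given as {node: set(neighbours)}."""
    alive = {node: set(nbrs) for node, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        too_small = [node for node, nbrs in alive.items() if len(nbrs) < k]
        for node in too_small:
            for nbr in alive.pop(node):
                if nbr in alive:
                    alive[nbr].discard(node)
            changed = True
    return set(alive)

def highest_core(adjacency):
    """Return (k, nodes) for the largest k whose k-core is not empty."""
    k, core = 0, set(adjacency)
    while True:
        candidate = k_core(adjacency, k + 1)
        if not candidate:
            return k, core
        k, core = k + 1, candidate

# Toy graph (hypothetical): a triangle a-b-c with a pendant node d attached to c
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(highest_core(toy))
```

On the toy graph the pendant node is peeled off first, and the triangle survives as the 2-core.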

Association depth

This network reduces in a way that I have not encountered so far: the density goes down faster than the number of nodes as d increases. So, we get networks that are not so dense, but still have a lot of codes. Provisionally, I chose this one, with d >= 10. It has 97 codes and 191 edges. Bluer edges represent higher values of d.

image
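The reduction itself is a simple threshold filter. Here is a minimal sketch, assuming the unique edges are stored in a dict keyed by code pairs with d as value; the names and toy values are hypothetical. It also reports density, so that nodes and density can be tracked side by side as the threshold goes up.

```python
def reduce_by_weight(edge_weights, threshold):
    """Keep edges with weight >= threshold; drop codes left without edges."""
    kept = {pair: w for pair, w in edge_weights.items() if w >= threshold}
    nodes = {code for pair in kept for code in pair}
    n = len(nodes)
    # Density of an undirected simple graph: edges / possible edges
    density = 2 * len(kept) / (n * (n - 1)) if n > 1 else 0.0
    return kept, nodes, density

# Hypothetical toy depths
toy_depth = {("A", "B"): 12, ("A", "C"): 10, ("B", "C"): 3, ("C", "D"): 1}

for d in (1, 3, 10):
    kept, nodes, density = reduce_by_weight(toy_depth, d)
    print(d, len(nodes), len(kept), round(density, 3))
```

By the same formula, the reduced network with 97 codes and 191 edges has a density of 2 × 191 / (97 × 96), about 0.04.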

Association breadth

The reduction with association breadth seems to work better, and it encodes similar information, because the correlation between d and b is very high. This is a reduction with b >= 7. Greener edges mean higher values of b, so more informants endorsing the same association.

image

Simmelian backbone

Here I need a little more time, because my script has a glitch. I need to review it, and possibly ask a question to Bruno.

Other than that, happy to do more tweaking. Ping @Nica, @siri and @ivan.


Ok, managed to extract the backbone. But the result is quite strange: this network is very resistant to the breaking down into cliques that we experienced in other datasets. I would be more comfortable if I could check my results with @bpinaud, but he’s on holiday for another couple of weeks.

At 100 codes, the network is still connected (there are no “islands”), and very dense at 1,171 edges:

image

Reducing even further (50 codes, 188 edges), separate components begin to appear.

image

How would you interpret it?

Okay, some initial thoughts on association depth:

We are seeing the concept of circular economy kind of sitting in the middle between a cluster of codes around resources, modification, recycling – so practical /how-to codes that are more corporeal/material in essence (it would be interesting to see where the actual “materiality” code appears – I don’t see it at this level of reduction) – and then the more conceptual cluster of sustainability and its relatedness to more social/global issues like politics and political measurements, and codes of senses of agency and responsibility, which branches off into sustainable behavior. To me all those codes in the “southwest” corner have to do with self-discipline and perhaps cultivating what Arun Agrawal calls “environmentality.”

From that perspective, the concept of circular economy, while not a “native term” of informants (which we already knew from Jos, and which would account for its smaller size), could be seen as something linking concrete material resources, and how they are literally handled, with sustainable praxis and the political space of sustainability.

I am also noting, and I think it’s interesting, that all those concept clusters seem separate from economic considerations (about cost-effectiveness etc.).

There is also a separate branch in the “Northeast” pertaining to data sharing. If I recall, we hypothesized that data sharing might be a barrier to circular economy because of privacy concerns. But while some concerns emerge, data sharing is not particularly central/prominent, and it does not seem highly salient, so that is an actionable insight for the clients, I think.

Thoughts? @alberto ?

I have some thoughts about association breadth, but just checking: is the “concept of circular economy” code not showing up at this level of reduction?

In terms of the Simmelian backbone, I wonder if the fact that our first event was specifically antique/old car lovers might be influencing the visualization? So, I wonder if the branch down from “elder automotive models” in the “Southwest” and onto expertise, etc. is aggregating codes that are from Ulf / Technorama…

There is not much community structure. Nodes that are part of many triangles are connected with nodes that are part of slightly fewer triangles, which in turn are connected to nodes that are part of even slightly fewer triangles, and so on. As you delete the edges which are part of fewer and fewer triangles, you delete “from the outside in”, and the giant component never breaks down.

Or maybe there is an error in my code.
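The "outside in" behaviour can be illustrated with a deliberately simplified filter: count, for each edge, how many triangles it belongs to (the common neighbours of its endpoints), drop edges below a threshold, and check how many components remain. This is only a triangle-count filter on a hypothetical toy graph, not the full Simmelian backbone procedure and not the script mentioned above.

```python
def build_adjacency(edges):
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def triangle_counts(adjacency):
    """Number of triangles each edge takes part in (shared neighbours)."""
    return {
        (u, v): len(nbrs & adjacency[v])
        for u, nbrs in adjacency.items()
        for v in nbrs
        if u < v
    }

def filter_by_triangles(adjacency, min_triangles):
    """Keep only edges embedded in at least min_triangles triangles."""
    kept = [pair for pair, t in triangle_counts(adjacency).items() if t >= min_triangles]
    return build_adjacency(kept)

def connected_components(adjacency):
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Toy "onion" (hypothetical): a 4-clique a,b,c,d plus an outer node e tied to a and b
onion = build_adjacency([
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("e", "a"), ("e", "b"),
])
reduced = filter_by_triangles(onion, 2)
print(sorted(reduced), len(connected_components(reduced)))
```

In the toy graph the outer node falls away and the dense core stays in one piece, which is the pattern described above: the filter shrinks the graph without splitting it into separate cliques.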

Yes:


A look at some key codes.

I am going to use only association depth to a first approximation. In this corpus, association depth and breadth are exceptionally well correlated: their (Pearson) correlation coefficient is 0.96, statistically significant with, to all practical purposes, certainty.
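For reference, a Pearson coefficient over the edges can be computed as in the sketch below, with one (d, b) pair per unique edge. Only the first pair (98, 54) comes from the corpus; the other values are hypothetical placeholders, so the printed coefficient is not the 0.96 reported here.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (non-constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# One entry per unique edge; all but the first pair are hypothetical
depths = [98, 40, 12, 10, 3]
breadths = [54, 20, 9, 7, 2]
print(round(pearson(depths, breadths), 3))
```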

Concept of circular economy. At b >= 7, it connects to the hub that is Sustainability, and to Modification, Perception of recycling and Resources.

To get a bit more granularity, I visualized the graph for b >= 4 (126 codes, 442 edges). Here more “political” codes appear.

list of neighbors of `Concept of circular economy` at b >= 4
Automotive Politics
Buzzword
Comparison
Concept of Circular Economy
Contested Battery (Production)
Contested new technicality
Contested sustainability
Different Era
Durability
Effort to recycle
Innovative Automotives
Materiality
Modification
New Reprocess
Perception of Recycling
Replacement
resources
Resources needed to recycle
Sustainability
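Neighbour lists like the one above can be read off the thresholded graph. A small sketch, assuming the unique edges are stored with their breadth b; the edge values are hypothetical, chosen only so that Buzzword appears when the threshold drops to 4. The lists in this post include the code itself, so the sketch does the same.

```python
def ego_network(edge_weights, code, threshold):
    """The code plus its direct neighbours over edges with weight >= threshold."""
    members = {code}
    for (u, v), weight in edge_weights.items():
        if weight >= threshold and code in (u, v):
            members.update((u, v))
    # Case-insensitive alphabetical order, as in the lists above
    return sorted(members, key=str.lower)

# Hypothetical breadth values
toy_breadth = {
    ("Concept of Circular Economy", "Sustainability"): 7,
    ("Concept of Circular Economy", "Modification"): 8,
    ("Buzzword", "Concept of Circular Economy"): 4,
    ("Modification", "resources"): 9,
}

print(ego_network(toy_breadth, "Concept of Circular Economy", 7))
print(ego_network(toy_breadth, "Concept of Circular Economy", 4))
```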

Sustainable behaviour. This code is already a hub at b >= 7. It is the center of a cluster that connects to the rest of the graph via sustainability, indifference and Electronics in cars.

list of neighbours of `Sustainable behaviour` at b >= 7
Agency
Being independent
bike
Concerns about Environmental Pollution
Consciousness
development
Garbage Separation
Less Car Use
Political Measurements
public transportation
Responsability
Self-Reflexion
Sustainability
Sustainable Behaviour
Wasting Behaviour
Wrapping

Agency. At b >= 7, this code is in the same general cluster as sustainable behaviour and sustainability, and directly connected to both. Other direct neighbours at this (high) level of b are Being independent, consciousness and responsibility.

If I lower my threshold for b to 5 or 4, only one more code appears as a direct neighbor: self-reflection. Going down to b >= 3 (154 codes, 721 edges), I get a few more, but they don’t seem very illuminating:

direct neighbours of `agency` at b >= 3
Agency
Being independent
Comparison
Consciousness
development
Environmental Pollution
Market Strategies
Responsability
Self-Reflexion
Self-Service Ability
Sustainability
Sustainable Behaviour
Wasting Behaviour

Feeling of external control. At b >= 7, this code is present but only connected to Data share. This is also true if I lower the threshold to 5. At b >= 4 a second edge appears, to conditionality. At b >= 3 there are two more: to Global Surveillance and Critical about information/Knowledge dissemination.

Perception of recycling. This is another hub, even at b >= 7. It connects to the rest of the graph via Sustainability, Concept of circular economy, and Conditionality.

Its direct neighbours are:

direct neighbours of `Perception of recycling` at b >= 7
Comparison
Concept of Circular Economy
Conditionality
Effort to recycle
Junk
Knowledge of transportation of useless electronics
Materiality
Modification
New Reprocess
Perception of Recycling
resources
Specific recycling material product
Sustainability

Lowering my threshold to 5 only adds a few codes: Automotive politics, Replacement, Resources needed to recycle.

Modularity

I also ran some modularity analysis. The stacked graph is not very modular, but not quite random either (Q = 0.37, using b as a measure of edge strength; the results do not change significantly using d). The reduced graph for b >= 7, however, is much more modular (Q = 0.72), and you can even see the modularity just by looking at it. The result, with nodes colored by the community they belong to (in network slang, by their class in the maximal modularity partition), is visualized below:

Here’s a list of the codes in each of the six communities. They are listed in alphabetical order: the star * indicates the highest-connected code in that community.

codes in the "pros and cons of electronics in cars" (grey) community
Aesthetics
Appearance
Disfunctionality
Electronics as Comfort
Electronics as Entity
Electronics as regulatory Mechanism
Electronics in Cars
Ideal *
Necessity-driven
New Cars
Reasons of Space
Relevant Features
Specific Car Labels
Specific Electronic Feature

codes in the "materiality and circular economy" (red) community
Awareness of Inequality
Concept of Circular Economy
Effort to recycle
Global Inequality
Junk
Knowledge of transportation of useless electronics
Lack of Knowledge about recycling processes
Materiality
Modification
New Reprocess
Perception of Recycling *
resources
Specific recycling material product

codes in the "personal expertise networks" (yellow) community
(Interpersonal) Trust
Careful Car Maintenance
Expertise and Knowledge
Expertise and Service
Garage *
kinship
Self-Service Ability
Specific Service from Producer

codes in the "deliberation/consideration by drivers" (green) community
Choosing new elements
Choosing used elements
Conditionality *
Cost-Effectiveness
Durability
Elder Automotives Models
Gradual Functionality
Guarantee of Quality
safety

codes in the "data sharing and surveillance" (purple) community
Comparison
Data Protection through Reset
Data Share *
Different Era
Feeling of external Control
Global Surveillance
GPS System
Indifference
Indifference through non-influence
Mobile Phone
privacy
Reluctance
Transparency

codes in the "systemic consideration of sustainability" (blue) community
Agency
Attention
Automotive Politics
Being independent
bike
climate change
Concerns about Environmental Pollution
Consciousness
Contested Battery (Production)
Contested new technicality
Contested sustainability
development
Emissions
Environmental Pollution
Garbage Separation
Innovative Automotives
Innovative Electronics
Less Car Use
Market Strategies
Negativity
Political Measurements
politics
public transportation
Responsability
Self-Reflexion
Status quo
Sustainability *
Sustainable Behaviour
Wasting Behaviour
Wrapping
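For reference, the Q score of a given assignment of codes to communities can be computed as in the sketch below. It uses the standard weighted modularity formula, with b (or d) as edge weight. Finding the partition that maximizes Q is a separate step that is not shown, and the toy graph is hypothetical.

```python
from collections import defaultdict

def modularity(edge_weights, community_of):
    """Weighted modularity Q of a partition of an undirected graph.

    Q = sum over communities of (internal weight / m) - (total strength / 2m)^2,
    where m is the total edge weight.
    """
    m = sum(edge_weights.values())
    internal = defaultdict(float)   # weight of edges inside each community
    strength = defaultdict(float)   # summed weighted degree of each community
    for (u, v), w in edge_weights.items():
        strength[community_of[u]] += w
        strength[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    return sum(internal[c] / m - (strength[c] / (2 * m)) ** 2 for c in strength)

# Toy graph (hypothetical): two triangles joined by a single edge, unit weights
toy_edges = {
    ("a", "b"): 1, ("a", "c"): 1, ("b", "c"): 1,
    ("x", "y"): 1, ("x", "z"): 1, ("y", "z"): 1,
    ("c", "x"): 1,
}
two_groups = {"a": 0, "b": 0, "c": 0, "x": 1, "y": 1, "z": 1}
one_group = {node: 0 for node in two_groups}

print(round(modularity(toy_edges, two_groups), 3))  # 5/14, about 0.357
print(modularity(toy_edges, one_group))             # 0.0: a single community
```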

@Nica and @siri, a small follow-up to modularity analysis. Here is how the different communities of codes “talk” to each other:

image

The edges are size-coded for association breadth. Notice that:

  1. The grey and the red community have no connecting edges, though the algorithm placed them close together.
  2. The purple community is connected to most other communities, but this is an artifact of the algorithm assigning comparison to it. comparison is on the border between different communities, and the algorithm could just as well have assigned it to another community. In that case, purple would only connect to blue.
  3. Same for green: conditionality connects to codes in all communities. In fact, these non-semantic codes have a potential to confuse this kind of analysis somewhat.
  4. The strongest inter-community edges are between yellow and green and between green and grey, in decreasing order.
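A picture like the one above can be backed by a small table that sums association breadth over all edges running between two different communities. A minimal sketch follows; the community labels match the colors used in this thread, while the edges and b values are hypothetical.

```python
from collections import defaultdict

def inter_community_weights(edge_weights, community_of):
    """Total edge weight between each pair of distinct communities."""
    totals = defaultdict(int)
    for (u, v), weight in edge_weights.items():
        cu, cv = community_of[u], community_of[v]
        if cu != cv:
            totals[tuple(sorted((cu, cv)))] += weight
    return dict(totals)

community_of = {
    "Comparison": "purple",
    "Data Share": "purple",
    "Perception of Recycling": "red",
    "Sustainability": "blue",
    "Conditionality": "green",
}
# Hypothetical edges and breadth values
toy_breadth = {
    ("Comparison", "Perception of Recycling"): 7,
    ("Comparison", "Data Share"): 8,
    ("Data Share", "Sustainability"): 7,
    ("Conditionality", "Perception of Recycling"): 9,
}

print(inter_community_weights(toy_breadth, community_of))
```

Re-running the function after moving a border code such as comparison to a different community is a quick way to check how much of the picture depends on that single assignment.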