This is just a quick-and-dirty round of impressions. It is not made any easier by the fact that I cannot do much without programming. But basically:
- K-core reductions have a big problem: they tend to generate graphs that are highly connected and become difficult to read, even for low numbers of nodes. I can see the codes, but I cannot really keep track visually of which code is connected to which. This is true of all types of K-core reduction.
- Giatsidis has the opposite behaviour: graphs reduced to a small number of codes tend also to have few edges, and they break down into many connected components. A clear giant component is still visible at ~150 codes, but in smaller graphs (~50 codes) the giant component breaks down completely. This is not necessarily a problem. Additionally, it tends to produce trees. This may be a problem, as clearly trees cannot be interpreted as hierarchies of codes, and then you do not really know what you are seeing. Triangles and clusters of triangles help see deep, multi-way relationships.
- Clique is like Giatsidis, but even more extreme.
- I have formed an opinion about the qualitative side of the different reduction methods (do they highlight different things?), but I will not share it here, as I do not want to influence Amelia unduly.
So, here is a suggestion for representing this result in one figure.
- Consider a smaller domain (0-200 nodes, for example).
- Put all your curves onto one single plot.
- Use the same scale for both axes; if this is impractical, draw a reference 45° line, the locus where the number of nodes is equal to the number of edges.
This will show us how different techniques reduce the graph in the relevant domain. Your own work, @melancon, tells us that networks where number of edges > 4 x number of nodes become unreadable. Ideally, we still want a giant component to emerge, as that will represent the core of the study. This means that the optimal zone is somewhere between 1.1 and 4 edges per node.
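To make the "optimal zone" concrete, here is a minimal sketch in plain Python. The function names, the zone labels and the example node/edge counts are all my own illustrative assumptions, not measurements from our reductions; only the 1.1 and 4 thresholds come from the reasoning above.

```python
def edges_per_node(n_nodes, n_edges):
    """Return the ratio of edges to nodes for a reduced graph."""
    if n_nodes <= 0:
        raise ValueError("the graph needs at least one node")
    return n_edges / n_nodes


def readability_zone(n_nodes, n_edges, low=1.1, high=4.0):
    """Classify a reduced graph by its edges-per-node ratio.

    Thresholds are the ones discussed in this post: more than `high`
    edges per node is treated as unreadable, fewer than `low` as too
    sparse for a giant component to be likely.
    """
    ratio = edges_per_node(n_nodes, n_edges)
    if ratio > high:
        return "too dense"
    if ratio < low:
        return "too sparse"
    return "readable"


# Hypothetical (nodes, edges) pairs, made up for illustration only.
examples = {
    "reduction A": (50, 400),
    "reduction B": (100, 250),
    "reduction C": (50, 40),
}
for name, (n_nodes, n_edges) in examples.items():
    print(name, round(edges_per_node(n_nodes, n_edges), 2),
          readability_zone(n_nodes, n_edges))
```

Running each point of each curve through a check like this would tell us, for every method, over which range of node counts the reduced graph stays inside the zone.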
It would be simple to also make curves depicting the number of connected components vs. the number of nodes for each method. I am not sure what that would teach us, because I am unsure as to how @amelia would interpret a graph divided into many components.
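For what it is worth, counting components does not need any special tooling. Below is a rough sketch in plain Python, assuming the reduced graph is available as a list of node names plus a list of undirected edges; the function names and the toy graph are hypothetical and only meant to show the idea.

```python
from collections import defaultdict


def component_sizes(nodes, edges):
    """Return the sizes of all connected components, largest first."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    sizes = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal of the component containing `start`.
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        sizes.append(size)
    return sorted(sizes, reverse=True)


def giant_component_share(nodes, edges):
    """Fraction of all nodes that sit in the largest component."""
    sizes = component_sizes(nodes, edges)
    return sizes[0] / sum(sizes) if sizes else 0.0


# Toy graph, made up for illustration: one triple, one pair, one isolate.
toy_nodes = ["a", "b", "c", "d", "e", "f"]
toy_edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(len(component_sizes(toy_nodes, toy_edges)))   # number of components
print(giant_component_share(toy_nodes, toy_edges))  # share of largest one
```

The share of nodes in the largest component may be easier to interpret than the raw component count, since it says directly whether a giant component is still there.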
Final issue: I think @amelia needs a bit of help in arriving at graphs that she can attempt to interpret. At the very least, she needs encouragement to install Tulip, and a little help to start engaging with the files.